Link charts show the magnitude and direction of relationships between two or more categorical variables. They're used in link analysis for identifying relationships between nodes that are not easy to see from the raw data.
Link charts can answer questions about your data, such as the following:
- How is it related?
- In which direction does the information flow?
A GIS analyst is studying patterns of migration in the United States. A link chart can be used to visualize the rate of migration between individual states. The link chart can be configured to show the direction of migration.
Create a link chart
To create a link chart, complete the following steps:
- Select one of the following combinations of data:
- Two string fields
- Two string fields plus a number or rate/ratio field
If you do not select a number or rate/ratio field, your data will be aggregated and a count will be displayed.
You can search for fields using the search bar in the data pane.
- Create the link chart using the following steps:
- Drag the selected fields to a new card.
- Hover over the Chart drop zone.
- Drop the selected fields on Link Chart.
You can also create charts using the Chart menu above the data pane or the Visualization type button on an existing card. For the Chart menu, only charts that are compatible with your data selection will be enabled. For the Visualization type menu, only compatible visualizations (including maps, charts, or tables) will be displayed.
Link charts can also be created using View Link Chart, which is accessed from the Action button under Find answers > How is it related?
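When only two string fields are selected, the data is aggregated and a count is displayed. The following is a minimal sketch of that kind of aggregation, not the product's actual implementation. The rows and the aggregate_links function are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical records with two string fields: (origin_state, destination_state).
# No number or rate/ratio field is present, so each record counts once.
rows = [
    ("Ohio", "Texas"),
    ("Ohio", "Texas"),
    ("Texas", "Ohio"),
    ("Ohio", "Florida"),
]


def aggregate_links(rows):
    """Count how many records share each (source, target) pair."""
    return Counter(rows)


links = aggregate_links(rows)
for (source, target), count in sorted(links.items()):
    print(f"{source} -> {target}: {count}")
```

Each distinct pair becomes one link, and the count plays the role that a number or rate/ratio field would otherwise play as the link weight.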
Click a node to display the Hide leaf nodes button, the Set as root node or Set as central node button, and the Edit button. Hide leaf nodes collapses any nodes that are connected only to the selected node. The nodes can be unhidden using the Show leaf nodes button. Set as root node and Set as central node change the root or central node from the node with the highest centrality to the selected node. Set as root node is only available for charts using a hierarchical layout, and Set as central node is only available for charts using a radial layout. Edit can be used to change the style of the selected node's symbol. Symbol styles that are changed using the Edit button are saved in the workbook and on the page, but not in the model.
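As a rough illustration of which nodes Hide leaf nodes would collapse, the sketch below finds the nodes whose only connection is to the selected node. It assumes links are treated as undirected for this purpose. The leaf_nodes_of function and the sample node names are hypothetical.

```python
def leaf_nodes_of(links, selected):
    """Return the nodes whose only neighbor is `selected` (links treated as undirected)."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return sorted(
        node
        for node in neighbors.get(selected, set())
        if neighbors[node] == {selected}
    )


# Hypothetical chart: "Hub" connects to four nodes, two of which also link to each other.
sample_links = [
    ("Hub", "N1"),
    ("Hub", "N2"),
    ("Hub", "N3"),
    ("Hub", "N4"),
    ("N3", "N4"),
]
hidden = leaf_nodes_of(sample_links, "Hub")
print(hidden)
```

In this sample, N1 and N2 are leaf nodes of Hub, while N3 and N4 are not because they are also connected to each other.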
The Layer options button can be used to change the style of the symbols. Select a node or link to change the style options in the Layer options pane. The style options include changing the size and color of nodes, changing the node symbol to an image, changing the pattern and thickness of links, and applying classification types to both links and nodes.
The nodes can be sized using the following centrality methods:
- Degree—The number of direct neighbors of the node. If the chart is directed, the degree can be measured as either indegree (the number of direct neighbors with connections directed toward the node) or outdegree (the number of direct neighbors with connections directed away from the node).
- Betweenness—The extent to which a node lies on the shortest path between other nodes in the network.
- Closeness—The average of the shortest distance paths to all other nodes.
- Eigenvector—The measure of the influence of a node in a network based on its proximity to other important nodes.
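The degree and closeness definitions above can be sketched in a few lines. This is a simplified, unweighted illustration under stated assumptions, not the calculation the product performs: the closeness sketch treats links as undirected and returns the plain average shortest-path length, whereas a tool may report a transformed or normalized value. The function names and sample links are hypothetical.

```python
from collections import deque

# Hypothetical directed links as (source, target) pairs.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def degree_counts(links):
    """Return {node: (indegree, outdegree)}, counting distinct direct neighbors."""
    incoming, outgoing = {}, {}
    for source, target in links:
        outgoing.setdefault(source, set()).add(target)
        incoming.setdefault(target, set()).add(source)
        # Make sure every node appears in both dictionaries.
        outgoing.setdefault(target, set())
        incoming.setdefault(source, set())
    return {node: (len(incoming[node]), len(outgoing[node])) for node in incoming}


def average_distance(links, start):
    """Average number of hops from `start` to every other reachable node.

    Uses a breadth-first search and ignores link direction and weights.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, set()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    others = [d for node, d in distance.items() if node != start]
    return sum(others) / len(others) if others else 0.0


degrees = degree_counts(links)
print(degrees["C"])                 # (indegree, outdegree) for node C
print(average_distance(links, "C"))  # node C is one hop from every other node
```

Betweenness and eigenvector centrality need more involved algorithms (shortest-path counting and an iterative eigenvector computation, respectively) and are left out of this sketch.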
The Symbology tab and Appearance tab will display different options based on the selections you make in the Layer options pane. The following options are available for link charts:
The Directed parameter can be used to change the links to arrows from one node to the other.
The centrality method can be set from the Size node using parameter.
The Edge weight parameter is used to calculate weighted centrality values. By default, the Edge weight parameter is set to Uniform, meaning the centrality calculation is unweighted. A field can be chosen to apply weights to the calculation. Edge weight is available for betweenness, closeness, and eigenvector centralities.
The Normalized parameter can be used to normalize the node centralities by dividing by another field to create a ratio or proportion. The Normalized parameter is enabled by default but can be disabled for nodes using betweenness and closeness centrality.
The Natural Breaks, Equal Interval, and Unclassed classifications can be chosen in the Classification type parameter. If Natural Breaks or Equal Interval is chosen, the number of classes can also be edited.
Click View centralities to create a reference table showing the centrality values for each node. The table includes columns for entity (field name), node (feature), and centrality.
Drag a string field to the Layer options pane and drop it on the link to style the links by unique values.
Use the Choose node field parameter to switch the selected node to a different string field.
Change the Node style options, including the following options:
Add an image file or URL to symbolize the nodes using Custom from the Symbol shape menu.
Use the Add button and Delete button to add new node fields or delete existing node fields. New node fields will be connected to the selected node field. You must have three or more node fields to delete a node field.
Drag a string field to the Layer options pane and drop it on the Add button or on an existing node to add additional node fields.
Use Ctrl+click to select multiple nodes. The following options are available:
The Weight parameter can be used to change or remove the number or rate/ratio field being used to apply weight to the links.
The Type parameter can be used to change or remove the string field being used to style the links by unique category.
Change the Link style options parameter, including the following options:
The Legend tab is enabled if a Weight field or Type field is added. The Legend can be used to view the classification values or unique categories for the links and to make selections on the chart.
If the arrows are pointing in the wrong direction, use the Flip button to change the direction of the flow.
If the chart includes three or more node fields, the Delete button can be used to remove a link from the chart. Deleting a link also removes any node field that has become disconnected from the rest of the chart.
Drag a number or rate/ratio field to the Layer options pane and drop it on the selected link to change the Weight parameter. Use a string field to change the Type parameter.
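The Equal Interval classification type, which can be applied to both nodes and links, divides the range of values into classes of equal width. The sketch below shows that general technique. It is not taken from the product, and the function names and sample weights are hypothetical.

```python
def equal_interval_breaks(values, classes):
    """Return the upper bound of each class when [min, max] is split into equal widths."""
    if classes < 1:
        raise ValueError("classes must be at least 1")
    low, high = min(values), max(values)
    width = (high - low) / classes
    return [low + width * i for i in range(1, classes + 1)]


def classify(value, breaks):
    """Return the 0-based index of the first class whose upper bound covers `value`."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    # Values above the last bound fall into the last class.
    return len(breaks) - 1


# Hypothetical link weights classified into four classes.
weights = [0, 10, 20, 40]
breaks = equal_interval_breaks(weights, 4)
print(breaks)
print([classify(w, breaks) for w in weights])
```

Natural Breaks instead chooses class boundaries that group similar values together, so its classes are generally not equal in width.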
How link charts work
A force directed layout displays the relationships between nodes in an organization that balances performance and drawing quality, including minimizing edge crossing, optimizing space, creating an even distribution of nodes, and displaying the graph symmetrically. A force directed layout is especially useful in analyses where the relationships are not hierarchical, so the organization is based on optimizing the clarity of the graph. Force directed is the default layout, and is used in the example above.
A hierarchical layout organizes a link chart so that the most important node (by default, this will be the node with the highest centrality) is located at the top, with links directed downward, similar to a family tree. A hierarchical layout is especially useful in analyses where the hierarchy is inherent in the dataset (for example, a workplace with an employer, managers, and employees).
A police department has been tracking communication between members of a criminal organization. A link chart can be used to create connections between the different members of the organization. A hierarchical layout provides the police department with information about the internal organization, including who is the boss, and which lower level members are working together.
A radial layout functions similarly to the hierarchical layout, but its organization is circular rather than linear from top to bottom. In a radial layout, the most important node (by default, the node with the highest centrality) is located in the center, with links directed outward in an orbital pattern. A radial layout tends to use space more efficiently than a hierarchical layout, which makes it useful for large datasets. However, the change in layout has trade-offs; for example, the hierarchical structure may be less obvious in a radial layout. A radial layout is therefore most useful when aspects such as groups of related nodes are more important than the hierarchical relationship.
In the previous example, a police department was tracking communication between members of a criminal organization. Rather than using a link chart to understand the internal hierarchy of the organization, this time the link chart can be used to look more specifically at direct relationships. By switching the chart to a radial layout, the focus is switched from Peter (the leader of the organization) to Carmen (the second-in-command). This change in focus is caused by Carmen's role as a go-between for the top level and the lower levels, whereas Peter only has contact with a small number of lower-level members. The radial organization puts more emphasis on how those levels are grouped, rather than who is commanding whom.
The number of connections that can be displayed is limited by the maximum query limit for the dataset. If the number of connections is greater than the limit, the error message There's too much data to complete this operation is displayed. The maximum query limit is 16,000 for point features and 8,000 for line and area features.
For example, a dataset of flights throughout Europe contains hundreds of thousands of flight numbers for 126 airports. Every airport has at least one direct flight to every other airport. Therefore, the number of connections is:
126 origins * 126 destinations = 15,876 connections
The number of flights does not affect the query limit, but the number of airports does. If one extra airport is added to the dataset with direct flights to all other airports, the number of connections increases to 16,129 (127 * 127), which is over the query limit. However, if there is not a connection between every unique value, the number of unique values can be higher. If some of the airports do not have direct flights between each other, the number of airports that can be displayed can increase until the number of connections surpasses the query limit.
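The arithmetic in the airport example can be checked with a short sketch. The limits are the values stated above. The function names and the sample flight rows are hypothetical, and the fully connected case is counted as origins multiplied by destinations, as in the example.

```python
POINT_QUERY_LIMIT = 16_000      # point features, as stated above
LINE_AREA_QUERY_LIMIT = 8_000   # line and area features, as stated above


def fully_connected_count(unique_values):
    """Connections when every origin pairs with every destination (origins * destinations)."""
    return unique_values * unique_values


def count_connections(flights):
    """Count distinct (origin, destination) pairs; repeated flights add nothing."""
    return len(set(flights))


def exceeds_limit(connections, limit=POINT_QUERY_LIMIT):
    """Return True when the chart would go over the query limit."""
    return connections > limit


for airports in (126, 127):
    connections = fully_connected_count(airports)
    print(airports, connections, exceeds_limit(connections))

# Hypothetical flight records: three flights but only two distinct connections.
flights = [("LHR", "CDG"), ("LHR", "CDG"), ("CDG", "LHR")]
print(count_connections(flights))
```

Because only distinct origin and destination pairs are counted, adding more flights on an existing route leaves the connection count unchanged, while adding a new airport can add many connections at once.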