Create relationships to join datasets

Insights in ArcGIS Online
Insights in ArcGIS Enterprise
Insights desktop

Relationships are used to join attributes from two or more datasets based on common fields or locations. Relationships can be used to add attributes from a reference table to another dataset, to combine attributes from two datasets to enable different types of analysis, and to enable location in nonspatial tables.

To create a relationship in ArcGIS Insights, you must add two or more supported datasets with common fields to your page.

About relationships

Insights allows you to join data from two or more datasets using a common field or location in the Create Relationships window. For example, you can associate a dataset of sales transactions with a dataset containing store information by identifying a common field (for example, store number). The common field identifies which rows from each dataset will be joined to each other, and the relationship type determines which rows are kept in the joined dataset.

The four types of relationships you can use to join your datasets are as follows:

Note:

Spatial joins (joins made using location fields) will be created using Intersects as the spatial relationship. However, you can still choose Inner, All, Left, or Right as the relationship type.

The joined dataset is displayed as a result dataset Results in the data pane. The joined dataset allows you to expand your analysis by using the data from all of the joined datasets as a single layer.

In the context of a joined dataset, cardinality refers to the relationships between datasets. In the Create Relationships window, take the time to consider the cardinality between the datasets, which can be one-to-one, many-to-one, or many-to-many. Cardinality does not affect the type of relationship you create, but it does have implications for the fields you might want to aggregate in the joined dataset, or the field you choose to show on your map.

Insights also suggests a common field when you choose the datasets you want to join. If no field is suggested, you can select the fields you want to use for the relationship.

Supported data

The following data types can be used to create relationships:

  • Hosted feature layers
  • Files added to your workbook
  • Datasets from a database connection (available in Insights in ArcGIS Enterprise and Insights desktop)
  • Result datasets from spatial analysis, including results from unsupported data types

Definition queries in feature services are not supported for creating relationships. Data in your workbook that is not supported for relationships will be disabled in the Create Relationships window.

Note:

Relationships cannot be created with remote feature layers, such as ArcGIS Living Atlas layers.

Database datasets must be from the same database and instance to be compatible for joining. Other datasets must be stored in the same data store to be joined.

If your datasets are not compatible with joining or are stored in separate data stores, you can copy the datasets to your workbook and use the copied datasets to create the relationship.

Create a relationship

To learn how to create a relationship to join datasets, you can complete the steps below, or follow a sample use case to walk you through the process, including descriptions and results for each type of relationship.

  1. Click the Create Relationships button Create relationships above the data pane.

    The Create Relationships window appears and displays a list of datasets for your current page under My Data.

  2. Under My Data, click one or more datasets you want to join.
    Note:
    If you plan to use the Left or Right relationship types, make sure the dataset that should be on the left or the right is in the desired place. The dataset you choose first will appear on the left after you add it to the page.
    Tip:

    It's best practice to place the dataset that has the finest level of detail (the most granular) on the left. For the sample use case described below, the StoreTransactions dataset should be on the left.

    When you select the first dataset, the view updates to show only those datasets that are eligible for a relationship.

    Insights analyzes the datasets you added and suggests a field to use for the join based on the type of data the field contains and the name of the field. If a field is suggested, thumbnails that represent each dataset are joined using a line, and the suggested field appears on each thumbnail.

    Selected datasets showing suggested field

  3. Do one of the following:
    • If the suggested field is acceptable, and you want to use the default relationship type (Inner), proceed to the final step.
    • If you want to use a different field, add another field to join, or choose a different type of relationship, proceed to the next step.
    • If a common field could not be detected, a line appears between your dataset thumbnails, and the Create Relationship window opens where you can choose the common fields. Proceed to step 5.
  4. To edit relationship properties, such as choosing a different field, adding other fields to join, or changing the relationship type, click the Edit relationship button Edit relationship on the relationship line that connects the dataset thumbnails.
  5. Choose a different field in one or more datasets, or choose a different relationship type and close or click outside of the relationship properties dialog box.
  6. Optional: If the relationship requires a composite key, you can add additional fields using the Add fields button under Choose Fields.

    For example, if you are joining CityName to CityName, you might also have to specify county, state, and country because values in CityName are duplicated in the dataset.

  7. Do one of the following:
    • Click Finish (at the lower right) to proceed with the relationship.
    • Click Cancel to undo the current relationship.

    When you click Finish, the Create Relationships window closes. A new result dataset Results with the joined data is added to the data pane.

Sample use case

This section describes a sample use case that shows how you can create a relationship between two datasets to answer specific questions, and how each relationship type affects the data in the joined dataset.

The questions are as follows:

  • What is sales revenue by store?
  • Where is my highest grossing store located in the city?

There are two datasets on the workbook page that can help answer this question. The first is the following Excel table showing store transactions from three stores:

OrderIDStoreIDTotalAmountPaymentMethod

1204

1

$91.48

Credit

1295

1

$103.86

Credit

1316

3

$97.81

Debit

1406

2

$103.24

Credit

1421

2

$44.95

Cash

1426

2

$153.71

Debit

1454

3

$168.95

Credit

The StoreTransactions dataset alone could provide a summary of TotalAmount by StoreID (if you change StoreID to a string field, and create a bar chart), but a report showing TotalAmount by a store name is preferred for the sake of clarity when it’s time to share the analysis results with teams across the organization.

In addition, the StoreTransactions dataset cannot show the store locations on a map.

The second is a feature layer showing the following six store locations:

StoreIDStoreName

1

Baskets and Cases

2

Toys! Toys! Toys!

3

Robert's Plants

4

Aisha's Bookstore and Coffee Shop

5

Gadgets and Gizmos

6

Sally's Hardware and Construction

These datasets track data at two different levels of detail. The first dataset tracks data at the transaction level, which is a more detailed level than the second table, which tracks data at the store level.

As a result, you see duplicate records (rows) in the StoreID column in StoreTransactions, while StoreLocations has only unique records in its StoreID column. There are many transactions per store. This is what’s referred to as a many-to-one relationship, a particular type of cardinality between the datasets.

Looking at the data, it's apparent that StoreID is the best field to use to create a relationship between these two datasets.

To answer the questions for this use case, you need to do the following:

  1. Consider the effect that each relationship type has on the resulting joined dataset.
  2. Follow the instructions in Create a relationship (above) to create the relationship that works best for the input datasets.
  3. Visualize the joined data to answer the questions.
    1. Using the StoreName and TotalAmount fields from the joined dataset, create a bar chart or a summary table to aggregate the fields. Sales are aggregated by each store name.
    2. Create a graduated symbol map using the TotalAmount field.

Relationship types

The relationship type you use, along with the common field you choose, determines the records that you keep in the joined dataset. The following subsections provide a look at what the joined dataset looks like by relationship type.

Inner relationship type

If you used the default relationship type while following the steps in Create a relationship (above), the joined dataset uses the Inner relationship type.

Inner keeps only the records that match both datasets. Null records are not added.

OrderIDStoreIDTotalAmountPaymentMethodStoreIDStoreName

1204

1

$91.48

Credit

1

Baskets and Cases

1295

1

$103.86

Credit

1

Baskets and Cases

1316

3

$97.81

Debit

3

Robert's Plants

1406

2

$103.24

Credit

2

Toys! Toys! Toys!

1421

2

$44.95

Cash

2

Toys! Toys! Toys!

1426

2

$153.71

Debit

2

Toys! Toys! Toys!

1454

3

$168.95

Credit

3

Robert's Plants

The records that were excluded were records for StoreID 4 through 6 from the StoreLocations dataset.

Using the default relationship type is a good choice to help answer the sample use case questions above. The Inner relationship type provides data that will not show null values when you visualize your data.

All relationship type

The All relation type (full outer join) keeps all records from both datasets and treats records containing no matches as null values.

OrderIDStoreIDTotalAmountPaymentMethodStoreIDStoreName

1204

1

$91.48

Credit

1

Baskets and Cases

1295

1

$103.86

Credit

1

Baskets and Cases

1316

3

$97.81

Debit

3

Robert's Plants

1406

2

$103.24

Credit

2

Toys! Toys! Toys!

1421

2

$44.95

Cash

2

Toys! Toys! Toys!

1426

2

$153.71

Debit

2

Toys! Toys! Toys!

1454

3

$168.95

Credit

3

Robert's Plants

5

Gadgets and Gizmos

6

Sally's Hardware and Construction

4

Aisha's Bookstore and Coffee Shop

The resulting dataset contains null values in the columns originating from the left dataset (StoreTransactions) for records corresponding to StoreIDs 4, 5, and 6 from the right dataset (StoreLocations). This is because there are no matching records for those store IDs in the left dataset.

Left relationship type

When using the Left relationship type (left outer join), you must ensure that the dataset you want to retain the records from is on the left side of the Create Relationships window by selecting this dataset first. See Create a relationship for more information.

Records that match those from the dataset on the left (StoreTransactions) are retained, along with matching records from the dataset on the right (StoreLocations). If no match on the right is found, the right side contains null values.

OrderIDStoreIDTotalAmountPaymentMethodStoreIDStoreName

1204

1

$91.48

Credit

1

Baskets and Cases

1295

1

$103.86

Credit

1

Baskets and Cases

1316

3

$97.81

Debit

3

Robert's Plants

1406

2

$103.24

Credit

2

Toys! Toys! Toys!

1421

2

$44.95

Cash

2

Toys! Toys! Toys!

1426

2

$153.71

Debit

2

Toys! Toys! Toys!

1454

3

$168.95

Credit

3

Robert's Plants

This joined dataset looks exactly like the Inner relationship type. To answer the sample use case questions, you can use either the Left or Inner relationship type to create a relationship between the two tables, and then create a bar chart, which won't show null values.

If the left side (StoreTransactions) contained a record for StoreID 7, which is not in the StoreLocations dataset on the right, a Left join would have null records on the right.

OrderIDStoreIDTotalAmountPaymentMethodStoreIDStoreName

1204

1

$91.48

Credit

1

Baskets and Cases

1295

1

$103.86

Credit

1

Baskets and Cases

1316

3

$97.81

Debit

3

Robert's Plants

1406

2

$103.24

Credit

2

Toys! Toys! Toys!

1421

2

$44.95

Cash

2

Toys! Toys! Toys!

1426

2

$153.71

Debit

2

Toys! Toys! Toys!

1454

3

$168.95

Credit

3

Robert's Plants

1502

7

$112.65

Cash

Right relationship type

When using the Right relationship type (right outer join), you must ensure that the dataset you want to retain the records from is on the right side of the Create Relationships window by selecting the left dataset first, and the right dataset next. See Create a relationship for more information.

Records that match those from the dataset on the right (StoreLocations) are retained, along with matching records from the dataset on the left (StoreTransactions). If no match on the left is found, the left side contains null values.

OrderIDStoreIDTotalAmountPaymentMethodStoreIDStoreName

1204

1

$91.48

Credit

1

Baskets and Cases

1295

1

$103.86

Credit

1

Baskets and Cases

1316

3

$97.81

Debit

3

Robert's Plants

1406

2

$103.24

Credit

2

Toys! Toys! Toys!

1421

2

$44.95

Cash

2

Toys! Toys! Toys!

1426

2

$153.71

Debit

2

Toys! Toys! Toys!

1454

3

$168.95

Credit

3

Robert's Plants

5

Gadgets and Gizmos

6

Sally's Hardware and Construction

4

Aisha's Bookstore and Coffee Shop

The resulting bar chart from the above right-joined dataset would contain null values.