Join—ArcGIS Data Pipelines

The Join tool joins datasets based on the specified relationships. Datasets can be joined using matching attributes, spatial relationships, temporal relationships, or any combination of the three.

Examples

The Join tool can be used in scenarios such as the following:

Join county polygons with county census data based on matching countyID field values.
Join records that occur at a relative time and place using both a temporal relationship and a spatial relationship.

Parameters

The following table outlines the parameters used in the Join tool:


Parameter	Description
Target dataset	The dataset that will have records joined to it.
Join dataset	The dataset that will be joined to the target dataset.
Join operation	Specifies whether all the matching records will be joined (one to many) or whether all the matching records will be summarized together (one to one).
Join type	Specifies whether target records that do not have a matching join record are returned in the output.
Summary fields	Determines which join dataset fields will have statistics calculated.
Use matching fields	Specifies whether an attribute relationship will be used.
Target field	The field from the target dataset that will be used to match records in the join dataset.
Join field	The field from the join dataset that will be used to match records in the target dataset.
Use spatial relationship	Specifies whether a spatial relationship will be used .
Spatial relationship	Specifies the spatial relationship type that will be used to define the join condition between the target and join datasets.
Near distance	The linear distance threshold that will be used to determine locations that are near each other. This parameter is only available for spatial relationships of type Near planar or Near geodesic.
Near distance unit	The unit of the linear distance threshold. This parameter is only available for spatial relationships of type Near planar or Near geodesic.
Target geometry	The geometry field from the target dataset that will be used to identify spatial relationships with the join dataset.
Join geometry	The geometry field from the join dataset that will be used to identify spatial relationships with the target dataset.
Use temporal relationship	Specifies whether a temporal relationship will be used.
Temporal relationship	Specifies the temporal relationship type that will be used to define the join condition between the target and join datasets.
Near distance	The temporal distance threshold that will be used to determine records that are near each other in time. This parameter is only available for temporal relationships of type Near, Near before, or Near after.
Near distance unit	The unit of the temporal distance threshold. This parameter is only available for temporal relationships of type Near, Near before, or Near after.
Target start time	The datetime field from the target dataset that will be used to specify the start time of a record.
Target end time	The datetime field from the target dataset that will be used to specify the end time of a record. This parameter is required if you want to join using times of type interval. See Temporal relationships in Data Pipelines for details.
Join start time	The datetime field from the join dataset that will be used to specify the start time of a record.
Join end time	The datetime field from the join dataset that will be used to specify the end time of a record. This parameter is required if you want to join using times of type interval. See Temporal relationships in Data Pipelines for details.

Usage notes

Use the Target dataset parameter to identify the dataset to which records will be joined. Use the Join dataset parameter to identify the dataset that contains the records that will be joined to the target records.

If multiple join records match the same target record, you can specify whether all the matching records will be joined (Join one to many option) or all the matching records will be summarized together (Join one to one option) as follows:

Join one to many—All the matching records in the join dataset will be joined to the target dataset. The output dataset will contain multiple records of the target record.
Join one to one—All of the matching join records will be summarized with each record in the target dataset. Only the records that have a match will be included in the summary and output. The count of joined records will be added. By default, only those target records that have the specified relationships will be maintained in the output record class (Inner join). If Left join is specified, all input target records will be written to the output record class (this is known as a left outer join).

Examples of joining one to many and one to one — In this example, the one-to-one join only includes the count.

The Summary fields parameter determines which fields from the join dataset will be summarized. The Summary fields parameter is available if the join operation is Join one to many. The following summary statistics are available:

Sum—The sum of numeric values in a field. The sum of [null, 1, 3] is 4.
Min—The minimum value of a numeric or date field. The minimum of [0, 2, null] is 0.
Max—The maximum value of a numeric or date field. The maximum of [0, 2, null] is 2.
Mean—The mean of numeric values. The mean of [0, 2, null] is 1.
Standard deviation—The standard deviation of a numeric field. The standard deviation of [null, 1, 1, 1] is 0.
Count—The number of nonnull values. It can be used on a field of any type. The count of [null, "hello", "world"] is 2.
Any—A sample value from a field of any type.

You can join records based on a spatial relationship, a temporal relationship, an attribute relationship, or a combination of the three. A spatial join matches records (rows) from the input records (the Target dataset and Join dataset parameter values) based on their spatial locations. A temporal join matches records from the input records based on their temporal relationships. An attribute join matches records based on field values. The following table describes each type of join relationship:


Relationship	Description
Attribute	The attribute relationship that will determine whether records are joined to each other. Records are matched when the field values in the join dataset are equal to the field values in the target dataset. It is recommended to use fields of the same type.
Spatial	The spatial relationship that will determine whether records are joined to each other. The available relationships depend on the geometry type (point, line, or area) of the datasets being joined. Available spatial relationships are as follows: Intersects Equals Near planar—Uses planar distances Near geodesic—Uses geodesic distances Contains Within Touches Crosses Overlaps
Temporal	The temporal relationship that will determine whether records are joined to each other. The available relationships depend on the time type (instant or interval) of the datasets being joined. Available temporal relationships are as follows: Meets Met by Overlaps Overlapped by During Contains Equals Finishes Finished by Starts Started by Intersects Near Near before Near after

If target and join geometries are in different coordinate systems, the coordinate system of the target dataset will be used in analyzing spatial relationships.

When the Spatial relationship parameter value is Near planar, the Target geometry parameter value must be in a projected coordinate system.

Outputs

When running a one-to-many join, all fields from the target dataset and the joined dataset will be included in the output. If the same field name exists in both datasets, the joined field will start with the word join appended (for example, join_fieldname).

When running a one-to-one join, in addition to all fields (including geometry) from the target dataset, the following fields will be included in the output:


Field name	Description
COUNT	The count of records that have been joined. This is always included in the output.
STATISTIC_fieldname	Specified statistics will each create an attribute field, named in the following format: STATISTIC_fieldname. For example, the maximum of the id field is MAX_id.
START_DATETIME	This field represents the start time and is only included when using temporal joins.
END_DATETIME	This field represents the end time and is only included when using temporal joins with interval time values.

Licensing requirements

The following licensing and configurations are required:

Creator or Professional user type
Publisher, Facilitator, or Administrator role, or an equivalent custom role

To learn more about Data Pipelines requirements, see Requirements.

Examples

Parameters

Usage notes

Outputs

Licensing requirements

Related topics

In this topic