Skip To Content

Pivot

The Pivot tool converts a long dataset to a wide dataset by using distinct values from an existing field to create new fields.

Examples

The Pivot tool can be used in scenarios such as the following:

  • You have a long dataset with many records and few fields, and want to format it to have few records and many fields.
  • You have a dataset containing records from field observations for multiple years, and want to summarize values by year.

Parameters

The following table outlines the parameters in the Pivot tool:

ParameterDescription

Input dataset

The dataset to pivot.

Input fields

One or more fields used to define the unique records in the output dataset. Records with matching values will be aggregated using the aggregate functions.

Pivot field

The field containing the values to transform into new fields in the output dataset.

Value field

The field used to determine the values of the new records.

Aggregate functions

The function or functions used to calculate the new values.

Usage notes

Use the Input dataset parameter to identify the dataset that will be pivoted.

Use the Input fields parameter to identify one or more fields containing values to group by. The unique values from the specified field or fields will become the records in the output dataset. The same field cannot be specified for both Input fields and Pivot field or Value field.

Use the Pivot field parameter to identify the field containing the values that will be used to create the new field names in the output dataset.

The Aggregate functions parameter is used to determine how the values from the grouped records will be calculated. The following aggregate functions are available:

  • Sum—The sum of numeric values in a field. The sum of [null, 1, 3] is 4.
  • Min—The minimum value of a numeric or date field. The minimum of [0, 2, null] is 0.
  • Max—The maximum value of a numeric or date field. The maximum of [0, 2, null] is 2.
  • Mean—The mean of numeric values. The mean of [0, 2, null] is 1.
  • Standard deviation—The standard deviation of a numeric field. The standard deviation of [null, 1, 1, 1] is 0.
  • Count—The number of nonnull values. It can be used on a field of any type. The count of [null, "hello", "world"] is 2.
  • Any—A sample value from a field of any type.

Outputs

The tool output is a dataset containing the following fields and values:

  • All fields specified using the Input fields parameter. The output will contain one record per unique value found in the field or combination of fields.
  • New fields for each unique value in the field specified using the Pivot field parameter. These fields will contain aggregated values from the field specified using the Value field parameter. For example, if the pivot field contains two unique values, California and Washington, and you calculate aggregate functions of Mean and Any, the result will contain four new fields: MEAN_California, ANY_California, MEAN_Washington, and ANY_Washington.

If the Pivot field contains null values, a field called null will be added. If the Pivot field contains empty strings, a field called _ will be added.

An example of using Pivot

Limitations

The following are the known limitations of the Pivot tool:

  • Field names are not case-sensitive.If the Pivot field parameter contains matching text values with differing casing, the output fields will be renamed to avoid duplicates.
  • Field names that contain invalid characters or start with a number are renamed. If the Pivot field parameter contains multiple values that would conflict when renamed, they will be renamed to avoid duplicates.
  • A maximum of 1,000 distinct values are supported in the Pivot field parameter. If more than 1,000 distinct values are found, an error will be raised.
  • Pivot requires a scan of the entire input dataset to determine the new field values. If performance of downstream tools becomes an issue, consider limiting the size of the input dataset.

Licensing requirements

The following licensing and configurations are required:

  • Creator or Professional user type
  • Publisher, Facilitator, or Administrator role, or an equivalent custom role

To learn more about Data Pipelines requirements, see Requirements.