Introduction to ArcGIS Data Pipelines

ArcGIS Data Pipelines provides data integration with ArcGIS. With Data Pipelines, you can connect to and read data from where it is stored, perform data preparation operations, and write the data out to a feature layer that is available in ArcGIS. You can use the Data Pipelines interface to construct, run, and reproduce data preparation workflows. To automate your workflows, you can schedule data pipelines to run at regular intervals.

Data Pipelines works with vector data (for example, points, lines, and polygons) and tabular data (for example, data represented as a table). You can connect to a variety of data sources, including Amazon S3, Google BigQuery, Snowflake, feature layers, and others. Once connected, you can use tools to blend, build, and integrate datasets for use in your workflows.

Data Pipelines tools are organized into toolsets by capability, such as clean, construct, integrate, and format. For example, the following workflows are supported by Data Pipelines tools:

  • Manipulate dataset schemas by updating field names or types.
  • Select a subset of fields to extract targeted information.
  • Filter by attribute or geometry values to clean the data.
  • Combine datasets using join or merge functionality.
  • Calculate fields using ArcGIS Arcade functions.
  • Create geometry or time fields for use in spatial or temporal analysis.
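Data Pipelines workflows are built in its interface rather than in code, but the operations above correspond to familiar table transformations. As a rough illustration only, the following pandas sketch walks through several of the same steps (rename a field, filter by attribute, join on a key, calculate a field); the datasets and field names are hypothetical, and this is not the Data Pipelines API:

```python
import pandas as pd

# Hypothetical input tables standing in for datasets read from a
# source such as Amazon S3 or a feature layer.
sensors = pd.DataFrame({
    "SensorID": [1, 2, 3],
    "tmp_c": [21.5, None, 19.0],
})
sites = pd.DataFrame({
    "SensorID": [1, 2, 3],
    "site_name": ["Downtown", "Harbor", "Mesa"],
})

# Manipulate the dataset schema by updating a field name.
sensors = sensors.rename(columns={"tmp_c": "temperature_c"})

# Filter by attribute values to clean the data (drop missing readings).
sensors = sensors[sensors["temperature_c"].notna()]

# Combine datasets using join functionality on a common key field.
joined = sensors.merge(sites, on="SensorID", how="inner")

# Calculate a new field (in Data Pipelines this would be an
# ArcGIS Arcade expression).
joined["temperature_f"] = joined["temperature_c"] * 9 / 5 + 32

print(joined[["SensorID", "site_name", "temperature_f"]])
```

In a data pipeline, each of these steps is a configured tool whose output feeds the next tool, rather than a line of code.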

When building a data pipeline and configuring tools, you can preview results, inspecting and refining the data in preparation for writing the final result. Once you’ve completed the data pipeline, you can run it to create or update an ArcGIS feature layer that will be available in your content. You can configure geometry and time properties for the output feature layer so it’s ready for use in additional workflows such as spatial or temporal analysis, dashboards, or web maps.
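Geometry and time properties are configured in the Data Pipelines interface, but the underlying idea is deriving a geometry field and a date field from existing columns. The sketch below (column names are assumptions, and point geometry is shown as WKT text purely for illustration) conveys what that derivation looks like on a plain table:

```python
import pandas as pd

# Hypothetical table with x/y columns and a text timestamp column.
rows = pd.DataFrame({
    "lon": [-118.24, -122.42],
    "lat": [34.05, 37.77],
    "observed": ["2024-01-15 08:30", "2024-01-15 09:00"],
})

# Create a point geometry field from x/y columns (WKT text here;
# a data pipeline would produce true point geometry).
rows["geometry_wkt"] = rows.apply(
    lambda r: f"POINT ({r.lon} {r.lat})", axis=1
)

# Create a time field by parsing a text column into datetimes,
# enabling temporal analysis on the output layer.
rows["observed_at"] = pd.to_datetime(rows["observed"])

print(rows[["geometry_wkt", "observed_at"]])
```

With geometry and time fields in place, the output feature layer can be used directly for mapping and time-enabled visualization.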