FAQ

Find answers to frequently asked questions about ArcGIS Data Pipelines.

What is ArcGIS Data Pipelines?

Data Pipelines is an ArcGIS Online application that allows you to connect to, process, and integrate data from various sources. You can perform data preparation and save the results to your Web GIS to complete your organization's workflows. All of this is completed using an intuitive interface where you can construct, run, save, share, and reproduce your data preparation workflows.

Does Data Pipelines charge credits?

Yes. Credit consumption is based on compute resource usage time. See Compute resources for more information.

Credits are consumed when a compute resource is active. Compute resources are active in the following scenarios:

  • Interactive editing—While authoring or editing data pipelines in the editor, credits are consumed while the connection status is Connected. The credit rate is 50 credits per hour, calculated per minute, with a 10-minute minimum.
  • Jobs—Jobs are run for scheduled data pipeline tasks, when you run a data pipeline using ArcGIS API for Python, or when you use the run option from the Data Pipelines gallery page. Jobs only consume credits while the data pipeline is running. Credits are charged per run for the time it takes to complete at a rate of 70 credits per hour, calculated per minute. There is no minimum charge for jobs.
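The rates above can be turned into a quick cost estimate. The helper below is not part of any Esri API; it is a plain-Python sketch of the arithmetic described in this FAQ (50 credits per hour for interactive editing, calculated per minute with a 10-minute minimum; 70 credits per hour for jobs, calculated per minute with no minimum):

```python
def interactive_credits(minutes: float) -> float:
    """Credits for an interactive editing session of the given length."""
    billable = max(minutes, 10)   # 10-minute minimum applies
    return billable * 50 / 60     # 50 credits per hour, calculated per minute

def job_credits(minutes: float) -> float:
    """Credits for a job run of the given length (no minimum)."""
    return minutes * 70 / 60      # 70 credits per hour, calculated per minute

# A 30-minute editing session costs 25 credits; a 12-minute job run costs 14.
print(interactive_credits(30))  # 25.0
print(job_credits(12))          # 14.0
```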

Credits cease to be charged when a compute resource is stopped. Compute resources are stopped in the following scenarios:

  • After you use the Disconnect all button on the connection details dialog box. This disconnects all connected editors, and credits are not consumed until at least one editor reconnects.
  • After all browser tabs with connected editors have been closed for at least 10 minutes. Credits are not consumed for those 10 minutes.
  • After 30 minutes of inactivity in all editor browser tabs. The status will be Disconnected.
  • When a scheduled data pipeline task run is complete.
  • When a data pipeline run using ArcGIS API for Python is complete.
To learn more about credits in ArcGIS Online, see Understand credits.

Is Data Pipelines available in ArcGIS Enterprise?

No. Data Pipelines is only available in ArcGIS Online. Data Pipelines may be available in ArcGIS Enterprise in a future release, but it is not guaranteed.

How do I access Data Pipelines?

You can access Data Pipelines by using the app launcher and choosing Data Pipelines.

To access Data Pipelines, your user account must have the required privileges. See Requirements to learn more about the privileges and requirements to access Data Pipelines.

If you are unsure whether you or your organization meets the requirements above, contact your organization administrator.

How can I get started with Data Pipelines?

To get started with Data Pipelines, see Tutorial: Create a data pipeline. The tutorial outlines the key components for using Data Pipelines, including connecting to and processing data, running a data pipeline, and more.

For more resources to get started, see the Data Pipelines Community blog posts.

What data can I use in Data Pipelines?

Data Pipelines supports vector and tabular input data from a variety of sources, including feature layers, uploaded local files, cloud storage such as Amazon S3 and Microsoft Azure Storage, cloud data warehouses such as Google BigQuery and Snowflake, and data read directly from URLs.

See the input data type documentation to learn more about supported file types and how to connect to an input dataset.

Can I use ArcGIS Living Atlas layers as input to my data pipeline?

Yes. You can use ArcGIS Living Atlas feature layers as input. To add a layer to a diagram, see Feature layer. By default, the feature layer browse dialog box opens to My content. To search for an ArcGIS Living Atlas layer, switch to Living Atlas on the dialog box.

Can I connect to my datasets on Google Cloud Platform?

No, not yet. The following additional types of external data sources may be supported in future releases:

  • Google Cloud Platform
  • Microsoft Azure Cosmos DB for PostgreSQL
  • Data returned from API requests

The data sources in this list are not guaranteed for a specific release, and data sources that are not listed here may be added. If you have suggestions for data sources that will improve your workflows, leave a comment in the Data Pipelines Community forums.

My data was updated in its source location. How do I sync my dataset in my data pipeline?

If the data is regularly updated in the source location and you want your data pipeline to use the latest data, it is recommended that you do not enable the Use caching parameter for inputs. When caching is disabled, Data Pipelines reads the latest data every time you request a preview or run the data pipeline. When caching is enabled, only the data available at the time the cache was created is used.

If you created an output feature layer and need to update it with the latest data, use the Replace or Add and update options in the Feature layer tool, and run the data pipeline again. You can automate rerunning a data pipeline by scheduling a task for the data pipeline item. To learn more about automating data pipeline workflows, see Schedule a data pipeline task.
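If you trigger reruns from a script rather than a scheduled task, a common pattern is to start the run and then poll until it finishes. The polling helper below is plain Python; the run and status calls shown in the usage comment are placeholders, not confirmed API names (consult the ArcGIS API for Python reference for the actual Data Pipelines interface):

```python
import time

def wait_until(check, timeout_s=600, interval_s=5,
               _sleep=time.sleep, _now=time.monotonic):
    """Poll check() until it returns a truthy value or timeout_s elapses.

    check: a zero-argument callable that returns a truthy value when done.
    """
    deadline = _now() + timeout_s
    while True:
        result = check()
        if result:
            return result
        if _now() >= deadline:
            raise TimeoutError("timed out waiting for the run to finish")
        _sleep(interval_s)

# Hypothetical usage with the ArcGIS API for Python (the run/status helpers
# are assumptions; only GIS and content.get are standard API calls):
#
#   from arcgis.gis import GIS
#   gis = GIS("https://www.arcgis.com", "user", "password")
#   item = gis.content.get("<data pipeline item id>")  # placeholder id
#   run = start_run(item)                              # placeholder call
#   wait_until(lambda: run_is_finished(run))           # placeholder check
```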

Where can I store my Data Pipelines results? Can I store them in Amazon S3?

No. The only output format currently supported by Data Pipelines is a feature layer. You cannot write results to other formats or storage containers, including Amazon S3. Data Pipelines can only read from your S3 bucket.

Learn more about output feature layers in Data Pipelines

How many features can I write to a feature layer or table using Data Pipelines?

Writing to feature layers is most performant up to hundreds of thousands of records. You can try to write more, but performance varies depending on a variety of factors including the complexity of geometries or number of fields.
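If you prepare very large inputs yourself before loading them, processing records in fixed-size batches keeps memory use predictable. This is a generic plain-Python sketch, not a Data Pipelines feature:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def batched(records: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# 250,000 records in batches of 100,000 yields batch sizes 100k, 100k, 50k.
sizes = [len(b) for b in batched(range(250_000), 100_000)]
print(sizes)  # [100000, 100000, 50000]
```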

Can I geocode addresses using Data Pipelines?

No, not yet. This capability is planned for a future release.

What tools are coming in future releases?

The following tools may be included in future releases:

  • Find and replace—Search fields for specific values and replace them with a new value.
  • Geocode addresses—Use string addresses from a table or file to return the geocoded results.

The tools in this list are not guaranteed for any release, and tools that are not listed here may be added. If you have suggestions for tools that will improve your workflows, leave a comment in the Data Pipelines Community forums.

Can I share a data pipeline?

Yes. You can share data pipeline items with groups in your organization or with the public. By default, only the owner of a data pipeline item can edit it. To allow everyone in a group to edit and save a data pipeline, share it with a shared update group. If a data pipeline is shared with a group that does not have shared update capabilities, you can save an editable copy to your own content using the Save As option on the editor toolbar.

Is there a way to undo or redo an action in the Data Pipelines editor?

No, not yet. Undo and redo are not currently supported in the editor. These actions are planned for a future release.

Is there a way to copy and paste elements in a diagram?

Yes. You can use command keys to cut (Ctrl+X), copy (Ctrl+C), paste (Ctrl+V), and delete (Delete) elements. Select the elements, and use the command keys to complete the actions.

Can I schedule a data pipeline run?

Yes. You can create tasks for data pipeline items to run your workflows on a schedule. To learn more about creating data pipeline tasks, see Schedule a data pipeline task.

How is Data Pipelines different from ArcGIS Velocity?

Data Pipelines and Velocity are similar in that both ArcGIS Online applications allow you to connect to external data sources and import the data into ArcGIS Online for use across the ArcGIS system. However, they serve distinct purposes. Velocity is specifically designed for real-time and big data processing, efficiently handling high-speed data streams from sensors and similar sources, and focuses on analytics such as device tracking, incident detection, and pattern analysis. Data Pipelines is primarily a data integration application focused on data engineering tasks, particularly for non-sensor-based data. Use Velocity to handle real-time data; use Data Pipelines to manage and prepare data that is updated less frequently.

How is Data Pipelines different from ArcGIS Data Interoperability?

Both are no-code ETL tools for ArcGIS that support data integration, transformation, and cleaning. However, they differ significantly: Data Pipelines is a web-based application available immediately in ArcGIS Online, while Data Interoperability is an extension for ArcGIS Pro that requires a separate license and installation. Data Pipelines focuses on data integration for ArcGIS Online, writing results to a hosted feature layer, while Data Interoperability supports a larger set of inputs and file types and can write results back to the source.

How is Data Pipelines different from ModelBuilder in Map Viewer?

ModelBuilder in Map Viewer and Data Pipelines are similar in that they both provide a low-code, drag-and-drop experience for authoring repeatable workflows on the web. However, there are some key differences:

  • ModelBuilder can be used to automate analysis workflows, leveraging the analysis tools available in Map Viewer; Data Pipelines can be used to automate data integration and preparation workflows, and includes tools focused on cleaning, formatting, and preparing data for visualization and downstream analysis.
  • ModelBuilder supports feature layers and tables. Data Pipelines, on the other hand, supports vector and tabular data from a variety of sources including Amazon S3, Microsoft Azure Storage, Google BigQuery, Snowflake, feature layers, uploaded local files, and data read directly from URLs.
  • ModelBuilder is a capability included in ArcGIS Online Map Viewer and integrated into Map Viewer analysis; Data Pipelines is an application that is used independently of Map Viewer.

I am a user in a new organization and I cannot access Data Pipelines. How do I resolve this?

The Data Pipelines service is not set up in your organization until at least one feature layer has been published. This is the same behavior as other services, such as spatial analysis and notebooks. To resolve the issue, navigate to your content and create a feature layer. This only needs to be done once per organization, not once per user. If the issue persists after a feature layer is published, contact Esri Technical Support.