Tutorial: Create a data pipeline

Learn how to create a workflow to prepare and integrate data from various sources into a dataset that is available to your GIS environment.

Open the ArcGIS Data Pipelines app and create a data pipeline

To open the Data Pipelines app and begin creating a data pipeline, complete the following steps:

  1. Sign in with an ArcGIS account and access the Data Pipelines app by using the app launcher.

    The Data Pipelines gallery page appears.

  2. Click Create data pipeline.

    The data pipeline editor opens.

    Data Pipelines editor

Add a data source

A data source loads data into the data pipeline for preparation. To add a data source to the diagram, complete the following steps:

  1. Click Inputs on the editor toolbar.

    The Inputs panel appears.

  2. Click File.

    The Select a file modal appears.

  3. Choose Browse existing, then click Next.

    The item browser opens.

  4. In the item browser, choose ArcGIS Online from the choice list next to the search bar.
  5. Search for Coastal Ferry Routes - Create Your First Data Pipeline, click the matching item, and click Add.

    You are returned to the File panel and the Format parameter is set to GeoJSON for the dataset.

  6. Click Preview.

    The preview loads.

  7. Explore the input dataset by doing any of the following:
    • Click the Table preview tab to view a tabular representation of the dataset.
    • Click the Map preview tab to view the locations of the dataset on a map. In map preview, you can pan, zoom, and inspect attributes.
    • Click the Schema tab to verify the schema of the dataset.
    • Click the Messages tab to review messages returned from the preview action.
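
The checks made in the Schema and Table preview tabs can also be approximated outside the app. The following is a minimal Python sketch, assuming the dataset is a GeoJSON FeatureCollection. The field names come from this tutorial, but the two sample features and all of their values are made up for illustration and are not taken from the actual item. The helper names summarize_schema and geometry_types are hypothetical.

```python
import json

# Hypothetical sample standing in for the ferry routes GeoJSON.
# Field names follow the tutorial; every value here is invented.
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "LineString",
                   "coordinates": [[-123.1, 49.3], [-123.9, 49.2]]},
      "properties": {"FERRY_ROUTE_ID": 1, "ROUTE_NAME": "Route A",
                     "FREQUENCY_OF_USE_IND": "High"}
    },
    {
      "type": "Feature",
      "geometry": {"type": "LineString",
                   "coordinates": [[-124.0, 49.5], [-124.4, 49.7]]},
      "properties": {"FERRY_ROUTE_ID": 2, "ROUTE_NAME": "Route B",
                     "FREQUENCY_OF_USE_IND": "Low"}
    }
  ]
}
"""


def summarize_schema(collection):
    """Map each property name to the sorted Python type names seen for it."""
    schema = {}
    for feature in collection["features"]:
        for name, value in feature["properties"].items():
            schema.setdefault(name, set()).add(type(value).__name__)
    return {name: sorted(types) for name, types in schema.items()}


def geometry_types(collection):
    """Return the sorted set of geometry types present in the collection."""
    return sorted(
        {f["geometry"]["type"] for f in collection["features"] if f.get("geometry")}
    )


collection = json.loads(SAMPLE_GEOJSON)
print(summarize_schema(collection))
print(geometry_types(collection))
```

A field that reports more than one type in this summary is worth checking in the app's Schema tab before continuing.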

Prepare the data

Data Pipelines includes tools that clean and transform data. This tutorial uses two of them: the Filter by attribute tool, to keep only the most frequently used routes, and the Select fields tool, to keep only the specified fields in the final output.

To prepare the data using the Filter by attribute and Select fields tools, complete the following steps:

  1. Click the close button at the top of the preview pane.
  2. Click Tools on the editor toolbar, and click Filter by attribute.

    The Filter by attribute element is added to the canvas.

  3. Configure Filter by attribute to use the file dataset as input by doing one of the following:
    • Drag the pointer from the output port of the File element to the input port of the Filter by attribute element.
    • In the Filter by attribute panel, choose the file dataset using the Input dataset parameter.
  4. Click the Build new query button in the tool panel.

    The Query builder dialog box appears.

  5. Click Expression, then click Next.
  6. In the field picker, choose FREQUENCY_OF_USE_IND. Enter a value of High in the text box.

    Query builder inputs

  7. Click Add on the Query builder dialog box.
  8. Click the Tools button on the editor toolbar, and click Select fields.

    The Select fields element is added to the canvas.

  9. Connect the output port of the Filter by attribute element to the input port of the Select fields element using one of the options in step 3 above.
  10. In the Select fields panel, click the +Field button, and choose geometry, FERRY_ROUTE_ID, ROUTE_NAME, and MANIFEST_TYPE. Click Done to save the field selection.

    You can use the search text box to quickly find the fields.

    Select fields

  11. Click Preview and review the result.
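
To make the effect of the two tools concrete, the following Python sketch applies the same logic to a small list of GeoJSON-style features. It is an illustration only, not how Data Pipelines implements the tools. It assumes the query is an equality test on FREQUENCY_OF_USE_IND, and the sample features, their values, and the function names filter_by_attribute and select_fields are hypothetical.

```python
def filter_by_attribute(features, field, value):
    """Keep only the features whose property `field` equals `value`."""
    return [f for f in features if f["properties"].get(field) == value]


def select_fields(features, fields):
    """Keep the geometry plus only the named properties of each feature."""
    selected = []
    for f in features:
        props = {name: f["properties"].get(name) for name in fields}
        selected.append(
            {"type": "Feature", "geometry": f["geometry"], "properties": props}
        )
    return selected


# Invented sample data; only the field names come from the tutorial.
features = [
    {
        "type": "Feature",
        "geometry": {"type": "LineString",
                     "coordinates": [[-123.1, 49.3], [-123.9, 49.2]]},
        "properties": {"FERRY_ROUTE_ID": 1, "ROUTE_NAME": "Route A",
                       "MANIFEST_TYPE": "Type 1",
                       "FREQUENCY_OF_USE_IND": "High"},
    },
    {
        "type": "Feature",
        "geometry": {"type": "LineString",
                     "coordinates": [[-124.0, 49.5], [-124.4, 49.7]]},
        "properties": {"FERRY_ROUTE_ID": 2, "ROUTE_NAME": "Route B",
                       "MANIFEST_TYPE": "Type 2",
                       "FREQUENCY_OF_USE_IND": "Low"},
    },
]

# Step 1: filter to high-use routes. Step 2: keep the fields of interest.
# In GeoJSON the geometry is stored beside the properties, so it is carried
# over separately instead of being listed as a field name.
high_use = filter_by_attribute(features, "FREQUENCY_OF_USE_IND", "High")
result = select_fields(high_use, ["FERRY_ROUTE_ID", "ROUTE_NAME", "MANIFEST_TYPE"])
print(result)
```

The order matters here for the same reason it does in the pipeline: the filter runs first because FREQUENCY_OF_USE_IND is not among the fields kept by the selection.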

Export the data to ArcGIS Online

The Coastal Ferry Routes input dataset now contains only the routes with a high frequency of use and only the fields of interest. To export this dataset to a feature layer in ArcGIS Online, complete the following steps:

  1. Click Outputs on the editor toolbar, and click Feature layer.

    The Feature layer element is added to the canvas.

  2. Connect the output port of the Select fields element to the input port of the Feature layer element.
  3. Verify that the Geometry field is populated with the geometry value.
  4. For Output name, provide a unique title for the output feature layer.

    Output feature layer

  5. On the action bar at the top of the canvas, click Run.

    The data pipeline is now running and the Latest run details console appears. Once the process is complete, the feature layer is shown on the Output results tab.

    Complete data pipeline

  6. Optionally, click the feature layer to open its item page in ArcGIS Online.
  7. Optionally, click Save and open on the editor toolbar and choose Save as to save the data pipeline as a new item in your content.