Schedule a data pipeline task—ArcGIS Data Pipelines

Use tasks to automate running a data pipeline. Tasks can be scheduled using a range of intervals, from every 15 minutes to monthly. Results of previous runs can be viewed and include the output layers and messages returned from the data pipeline run. You can manage data pipeline tasks in the following ways:

Use the scheduled tasks page to manage tasks for existing data pipelines.
Use the editor to manage tasks for the data pipeline you are currently editing.

Create tasks

Use one of the workflows described below to create a task.

Create a task on the scheduled tasks page

To create a task on the scheduled tasks page for an existing data pipeline, complete the following steps:

Sign in with an ArcGIS account and access the Data Pipelines app using the app launcher.
The Data Pipelines gallery page appears.
Click Manage scheduling.
The scheduled tasks page appears.
Click Create task.
The create task dialog box appears.
Choose a data pipeline to schedule, and click Next.
Provide a title for the task, configure the schedule details, and click Save.
The task is created and the data pipeline is scheduled to run.

Create a task in the editor

To create a task in the editor for the open data pipeline, complete the following steps:

Sign in with an ArcGIS account and access the Data Pipelines app using the app launcher.
The Data Pipelines gallery page appears.
Open an existing data pipeline or create a data pipeline that you want to run on a schedule.
The editor opens.
Click Schedule on the editor toolbar.
The Schedule button is enabled only if the open data pipeline has been saved.
The scheduled tasks pane appears.
Click Create task.
The create task dialog box appears.
Provide a title for the task, configure the schedule details, and click Save.
The task is created and the data pipeline is scheduled to run.

Work with existing tasks

The list of all your data pipeline tasks can be viewed through the scheduled tasks page, which is accessible from the Manage scheduling button in the data pipelines gallery page. In the editor, the Schedule button opens a panel that displays the tasks for the open data pipeline.

Both the schedule panel in the editor and the scheduled tasks page contain information about each task, including its title, next run time, and status. Values for Next run time are as follows:

A date—The date and time when the next run is scheduled to begin.
Completed—The end condition for the task has been met and the last run completed.
Failed—The end condition for the task has been met and the last run failed.
Paused—The task has been paused.

In addition to viewing tasks, the following options are available for managing tasks:

Pause or Resume—Pause an active task, or resume a paused task. Paused tasks will not run at the scheduled date and time until resumed.
Edit—Edit the parameters set for a task. You can edit the title and the schedule settings.
Delete—Delete a task. Deleted tasks cannot be restored.
Restart—Restart a task that is completed.

Task runs and run details

When you click a task, the task run history is displayed. Click an individual run to see the run details, including the output feature layers and any messages. Task runs that are in progress do not show any information until the run is complete. Use the refresh button in the tasks list to get the latest status.

Considerations

Consider the following when authoring data pipelines that will be scheduled:

Using the Create output method with the Overwrite if layer already exists output parameter enabled is not recommended for scheduled or automated runs. Unlike Replace and Add and update, the overwrite parameter can alter the schema, geometry, and records, which may break downstream workflows such as pop-ups or filters. Additionally, overwrite operations do not roll back when a failure occurs during the write, which may result in the loss of the layer until the data pipeline runs successfully. Replace and Add and update roll back when a failure occurs, which means the original data is maintained.
If a scheduled data pipeline has an output feature layer that uses the Create output method without the overwrite parameter enabled, the task will fail after the first run since the feature layer already exists. You can create the feature layer before scheduling a task, and set the output method to Replace or Add and update (recommended), or enable the Overwrite if layer already exists parameter (not recommended).
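
The failure described above can be illustrated with a small toy model. This is only a sketch, not the Data Pipelines API: the `write_output` function, the `LayerExistsError` exception, and the dictionary standing in for hosted feature layers are all hypothetical names introduced for illustration, and the rule that Replace needs an existing layer is an assumption drawn from the advice to create the layer before scheduling.

```python
class LayerExistsError(Exception):
    """Hypothetical error: Create was used but the layer already exists."""


def write_output(layers, name, records, method="create", overwrite=False):
    """Toy model of the Create and Replace output methods (illustrative only)."""
    if method == "create":
        # Without the overwrite parameter, Create fails once the layer exists.
        if name in layers and not overwrite:
            raise LayerExistsError(name)
        layers[name] = list(records)
    elif method == "replace":
        # Assumption: Replace targets a layer that was created beforehand.
        if name not in layers:
            raise KeyError(name)
        layers[name] = list(records)
    else:
        raise ValueError(f"unsupported method: {method}")


layers = {}  # stands in for the feature layers in your content

# First scheduled run: the layer does not exist yet, so Create succeeds.
write_output(layers, "pipeline_output", [1, 2, 3])

# Second scheduled run: the same Create step now fails.
try:
    write_output(layers, "pipeline_output", [4, 5, 6])
    second_run_failed = False
except LayerExistsError:
    second_run_failed = True

# Switching the output method to Replace lets later runs succeed.
write_output(layers, "pipeline_output", [4, 5, 6], method="replace")
```

In this model the second run fails only because the output method was left as Create, which is why the layer is better created once up front and then written with Replace or Add and update.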

Consider the following when scheduling data pipeline tasks:

The highest frequency at which a task can be scheduled to run is once every 15 minutes.
Use the advanced Maximum run duration parameter to specify the maximum time a task can run before it is automatically canceled.
You can only create tasks for data pipeline items that you own. This applies to all user types including administrators. Administrators can view and edit tasks for all data pipeline items in the organization.
If a task run takes longer than the interval between scheduled runs, the subsequent task run will be skipped.
A maximum of 30 task runs per task are stored in the task runs list. After 30 runs, each new run is added to the list and the oldest run is no longer accessible.
Each user account can have a maximum of 10 active tasks. This limit includes any active tasks of other types, such as scheduled notebooks or reports. In Data Pipelines, active tasks show a next run time. There is no limit to inactive tasks. If you have 10 active tasks, pause or delete an active task to create a new one.
If a data pipeline with scheduled tasks is deleted or moved to the recycle bin, the tasks will be deleted permanently and cannot be restored.
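
Two of the considerations above, skipped runs and the 30-run history, can be sketched in a few lines of Python. This is a simplified model rather than the scheduler's actual implementation: `simulate_schedule` and `MAX_STORED_RUNS` are hypothetical names, and the sketch assumes that a scheduled start is skipped whenever the previous run is still in progress, and that a run finishing exactly at the next scheduled time does not block it.

```python
from collections import deque
from datetime import datetime, timedelta

MAX_STORED_RUNS = 30  # retention limit for the task runs list


def simulate_schedule(start, interval, durations):
    """Return (started, skipped) scheduled times for a simple model.

    Assumption: a scheduled start is skipped when the previous run
    has not finished by that time.
    """
    started, skipped = [], []
    busy_until = start  # time at which the current run finishes
    remaining = list(durations)
    tick = start
    while remaining:
        if tick >= busy_until:
            started.append(tick)
            busy_until = tick + remaining.pop(0)
        else:
            skipped.append(tick)
        tick += interval
    return started, skipped


# A task scheduled every 15 minutes whose second run takes 20 minutes.
start = datetime(2024, 1, 1, 0, 0)
started, skipped = simulate_schedule(
    start,
    timedelta(minutes=15),
    [timedelta(minutes=10), timedelta(minutes=20), timedelta(minutes=5)],
)
# The 00:30 start is skipped because the 00:15 run is still in progress.

# A bounded deque drops the oldest entry once 30 runs are stored.
history = deque(maxlen=MAX_STORED_RUNS)
for run_number in range(1, 36):
    history.append(run_number)
# After 35 runs, only runs 6 through 35 remain in the list.
```

If runs are regularly skipped in practice, a longer interval or a shorter Maximum run duration may be worth considering.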
