Schedule a data pipeline task

Use tasks to automate running a data pipeline. Tasks can be scheduled at a range of intervals, from every 15 minutes to monthly. You can view the results of previous runs, including the output layers and any messages returned from the data pipeline run. Additionally, you can choose to receive email notifications when a task run fails. The email notification feature is in beta.

You can manage data pipeline tasks in the following ways:

  • Use the scheduled tasks page to manage tasks for existing data pipelines.
  • Use the editor to manage tasks for the data pipeline you are currently editing.

Create tasks

Use one of the workflows described below to create a task.

Create a task on the scheduled tasks page

To create a task on the scheduled tasks page for an existing data pipeline, complete the following steps:

  1. Sign in with an ArcGIS account and access the Data Pipelines app using the app launcher.

    The Data Pipelines gallery page appears.

  2. Click Manage scheduling.

    The scheduled tasks page appears.

  3. Click Create task.

    The create task dialog box appears.

  4. Choose a data pipeline to schedule, and click Next.
  5. Provide a title for the task and configure the schedule details.
  6. Optionally, choose to receive emails when a data pipeline task run fails using the Enable email notifications on failures (Beta) option. If your organization has blocked beta apps and capabilities, you will not see the option to enable email notifications. See the Notifications section below for more information.
  7. Click Save.

    The task is created and the data pipeline is scheduled to run.

Create a task in the editor

To create a task in the editor for the open data pipeline, complete the following steps:

  1. Sign in with an ArcGIS account and access the Data Pipelines app using the app launcher.

    The Data Pipelines gallery page appears.

  2. Open an existing data pipeline or create a data pipeline that you want to run on a schedule.

    The editor opens.

  3. Click Schedule on the editor toolbar.

    This is only enabled if the open data pipeline has been saved.

    The scheduled tasks pane appears.

  4. Click Create task.

    The create task dialog box appears.

  5. Provide a title for the task and configure the schedule details.
  6. Optionally, choose to receive emails when a data pipeline task run fails using the Enable email notifications on failures (Beta) option. If your organization has blocked beta apps and capabilities, you will not see the option to enable email notifications. See the Notifications section below for more information.
  7. Click Save.

    The task is created and the data pipeline is scheduled to run.

Notifications

Data Pipelines supports sending email notifications to data pipeline owners when a task run fails. This feature is currently in beta. If your organization has blocked beta apps and capabilities, you will not see the option to enable email notifications.

When creating or editing a scheduled task, use the Enable email notifications on failure (Beta) parameter to receive emails when a data pipeline task run fails. You can opt out of emails at any time by editing the task and disabling this parameter. You can edit a task on the scheduled tasks page or in the editor. To learn more about editing tasks, see the Work with existing tasks section below.

Emails are sent to the owner of the data pipeline item when the task run returns a status of failed. A failed status means all outputs in the data pipeline failed to be created or updated.

If the task run returns a status other than failed, you will not receive an email. Notably, you will not receive emails for task runs that return completedWithWarnings, which is returned when at least one output feature layer was created successfully but other outputs failed. Additionally, if the submit job for the task run fails, the task will not run and you will not receive an email.
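The rules above can be summarized as a small decision function. This is only an illustrative sketch, not part of Data Pipelines: the function name and its parameters are hypothetical, and the status strings are assumed to match the values named in the text (failed and completedWithWarnings).

```python
def should_send_failure_email(run_status, notifications_enabled=True, job_submitted=True):
    """Return True only when a failure email would be expected.

    Hypothetical helper: the names and status strings are assumptions
    made for illustration, based on the rules described in this topic.
    """
    # No email if the option is disabled, or if the job was never
    # submitted (in that case the task does not run at all).
    if not notifications_enabled or not job_submitted:
        return False
    # Only a status of "failed" (all outputs failed) triggers an email.
    # "completedWithWarnings" and every other status do not.
    return run_status == "failed"


print(should_send_failure_email("failed"))                 # True
print(should_send_failure_email("completedWithWarnings"))  # False
```
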

You will receive a maximum of one email per task every six hours. However, if a task fails five consecutive times within the six-hour time span, the task will automatically be set to a failed state and will not run again. In this scenario, you will only receive an email for the first failure.
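To see how the email limit and the five-failure rule interact, the following sketch simulates a series of consecutive failed runs. It is a simplified model under stated assumptions, not the service's actual logic: the function name is hypothetical, and the six-hour window is assumed to be measured from the most recently sent email.

```python
from datetime import datetime, timedelta

EMAIL_WINDOW = timedelta(hours=6)   # at most one email per task in this window
MAX_CONSECUTIVE_FAILURES = 5        # the task stops running after this many

def simulate_failures(failure_times):
    """Return (email_times, disabled_at) for consecutive failed runs.

    Hypothetical model: assumes every run in failure_times failed and
    that the email window starts at the last email that was sent.
    """
    emails = []
    last_email = None
    disabled_at = None
    for count, run_time in enumerate(failure_times, start=1):
        # Send an email only if none was sent in the last six hours.
        if last_email is None or run_time - last_email >= EMAIL_WINDOW:
            emails.append(run_time)
            last_email = run_time
        # The fifth consecutive failure puts the task in a failed state.
        if count >= MAX_CONSECUTIVE_FAILURES:
            disabled_at = run_time
            break
    return emails, disabled_at


start = datetime(2024, 1, 1, 0, 0)
hourly = [start + timedelta(hours=h) for h in range(8)]
emails, disabled_at = simulate_failures(hourly)
print(len(emails), disabled_at)
```

With hourly failures, the model sends one email for the first failure and stops the task at the fifth failure, four hours later, which matches the scenario described above.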

Note:
This feature is currently in beta. Share your experience, request enhancements, and seek support through the Beta Features Feedback forum in the Data Pipelines Community.

Work with existing tasks

You can view the list of all your data pipeline tasks on the scheduled tasks page, which is accessible from the Manage scheduling button on the Data Pipelines gallery page. In the editor, the Schedule button opens a panel that displays the tasks for the open data pipeline.

Both the scheduled tasks page and the schedule panel in the editor contain information about the task including its title, the next run time, and task status. Values for Next run time are as follows:

  • A date—The date and time when the next run is scheduled to begin.
  • Completed—The end condition for the task has been met and the last run completed.
  • Failed—The end condition for the task has been met and the last run failed.
  • Paused—The task has been paused.

In addition to viewing tasks, the following options are available for managing tasks:

  • Pause or Resume—Pause an active task, or resume a paused task. Paused tasks will not run at the scheduled date and time until resumed.
  • Edit—Edit the parameters set for a task. You can edit the title, the schedule, and notification settings.
  • Delete—Delete a task. Deleted tasks cannot be restored.
  • Restart—Restart a task that is completed.

Task runs and run details

When you click a task, the task run history is displayed. Click an individual run to see the run details, including the output feature layers and any messages. Task runs that are in progress do not show any information until the run is complete. Use the refresh button in the tasks list to get the latest status.

Considerations

Consider the following when authoring data pipelines that will be scheduled:

  • Using the Create output method with the Overwrite if layer already exists output parameter enabled is not recommended for scheduled or automated runs. Unlike Replace and Add and update, the overwrite parameter can alter the schema, geometry, and records, which may break downstream workflows such as pop-ups or filters. Additionally, overwrite operations do not roll back when a failure occurs during write, which may result in the loss of the layer until the data pipeline is run successfully. Replace and Add and update roll back when a failure occurs, which means the original data is maintained.
  • If a scheduled data pipeline has an output feature layer that uses the Create output method without the overwrite parameter enabled, the task will fail after the first run since the feature layer already exists. You can create the feature layer before scheduling a task, and set the output method to Replace or Add and update (recommended), or enable the Overwrite if layer already exists parameter (not recommended).
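The two considerations above amount to a simple lookup from output settings to expected behavior on a schedule. The sketch below is illustrative only: the function name and the returned labels are hypothetical, and only the output method names come from this topic.

```python
def scheduling_advice(output_method, overwrite_enabled=False):
    """Classify an output configuration for scheduled runs.

    Hypothetical helper summarizing the considerations in this topic;
    it does not call any Data Pipelines API.
    """
    method = output_method.strip().lower()
    if method in ("replace", "add and update"):
        # These methods roll back on failure, so original data is kept.
        return "recommended"
    if method == "create":
        if overwrite_enabled:
            # Overwrite can change the schema and does not roll back.
            return "not recommended"
        # The layer already exists after the first run, so later runs fail.
        return "fails after first run"
    raise ValueError(f"Unknown output method: {output_method!r}")


print(scheduling_advice("Add and update"))   # recommended
print(scheduling_advice("Create"))           # fails after first run
```
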

Consider the following when scheduling data pipeline tasks:

  • The shortest interval at which a task can be scheduled to run is 15 minutes.
  • Use the advanced Maximum run duration parameter to specify the maximum time a task can run before it is automatically canceled.
  • You can only create tasks for data pipeline items that you own. This applies to all user types including administrators. Administrators can view and edit tasks for all data pipeline items in the organization.
  • If a task run takes longer than the interval between scheduled tasks, the subsequent task run will be skipped.
  • Any tasks that fail 5 consecutive times will automatically be switched to a failed state and will no longer run. To ensure that the tasks continue to run, the owner of the task must identify and rectify the failure and change the task to the active state.
  • A maximum of 30 task runs per task is stored in the task runs list. After 30 runs, each new run is added to the list and the oldest run is no longer accessible.
  • Each user account can have a maximum of 10 active Data Pipelines tasks. Each organization can have a maximum of 50 total active Data Pipelines tasks across all users. In Data Pipelines, active tasks show a next run time. There is no limit to inactive tasks. If you have 10 active tasks, pause or delete an active task to create a new one.
  • If a data pipeline with scheduled tasks is deleted or moved to the recycle bin, the tasks will be deleted permanently and cannot be restored.
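Two of the limits above, the active task quotas and the 30-run history, can be modeled in a few lines. This is a minimal sketch for reasoning about the limits, not how the service implements them: the constant and function names are hypothetical, and the limit values are the ones stated in this topic.

```python
from collections import deque

MAX_ACTIVE_PER_USER = 10   # active tasks allowed per user account
MAX_ACTIVE_PER_ORG = 50    # active tasks allowed per organization
MAX_STORED_RUNS = 30       # task runs kept in the task runs list

def can_activate_task(user_active_count, org_active_count):
    """Return True if one more active task fits under both limits."""
    return (user_active_count < MAX_ACTIVE_PER_USER
            and org_active_count < MAX_ACTIVE_PER_ORG)

# A bounded deque drops the oldest entry once it holds 30 runs,
# which mirrors the behavior described for the task runs list.
history = deque(maxlen=MAX_STORED_RUNS)
for run_id in range(1, 36):   # record 35 runs, numbered 1 to 35
    history.append(run_id)

print(can_activate_task(9, 49))              # True
print(can_activate_task(10, 20))             # False
print(len(history), history[0], history[-1]) # 30 6 35
```

In this model, a user with 10 active tasks must pause or delete one before another task can become active, and after 35 runs only runs 6 through 35 remain in the history.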