URL

Use records from a URL or API as input to ArcGIS Data Pipelines.

Usage notes

Keep the following in mind when working with URLs:

  • Use the URL parameter to specify the dataset to use as input to your data pipeline. Only HTTP and HTTPS URLs are supported.
  • ArcGIS feature layers and tables are not recommended for use as a URL input. ArcGIS feature layers should be added to your content and then used as a Feature layer input. To learn how to add a feature layer to your content, see Add a service or document from a URL.
  • The This URL requires authentication (Beta) parameter determines whether the URL requires authentication to access the data (enabled), or if the data is publicly accessible (disabled). This parameter is currently in beta.
  • To load data from a URL that requires authentication, you must first create a service connection item. Service connection items securely store the credentials and secrets that will be included in the request to the URL. To create a service connection, follow the steps in the Connect to URLs requiring authentication section below. This feature is currently in beta.
  • To change the service connection item you configured, use the Service connection (Beta) parameter to remove the currently selected item, and choose one of the following options:
    • Add connection—Create a service connection item.
    • Select item—Browse your content to select an existing service connection item.
    This parameter is currently in beta.
  • Use the Custom headers (Beta) parameter to specify the names and values of the headers to send in the URL request. For example, an API may require a Content-type header set to a certain value. Specifying credentials or secrets as headers is not recommended; store them in service connection items instead. A request sketch illustrating custom headers appears after this list. This parameter is currently in beta.
  • Use the Response format parameter to specify the format of the data that is returned from the URL. The following format options are available:
    • CSV or delimited (for example, .csv, .tsv, and .txt)
    • JSON (.json)
    • GeoJSON (.geojson)
    • Parquet (.parquet)
    • GeoParquet (.parquet)
  • If the CSV or delimited format option is specified, the following dataset definition parameters are available:
    • Delimiter—The delimiter used to split field (or column) and record (or row) values. You can choose from the following options or enter your own value:
      • Comma (,)—Field and record values are separated by commas (,). This is the default.
      • Tab (\t)—Field and record values are separated by tabs (\t).
      • Pipe (|)—Field and record values are separated by pipes (|).
      • Semicolon (;)—Field and record values are separated by semicolons (;).
      • Space ( )—Field and record values are separated by spaces ( ).
      If you enter your own value, it must be one or two characters in length, including spaces. Delimiters longer than two characters are not supported.
    • Has header row—Specifies whether the dataset contains a header row. The default is true. If set to false, the first row of the dataset will be considered a record.
    • Has multiline data—Specifies whether the dataset has records that contain new line characters. The default is false. If set to true, records containing new line characters are read and formatted correctly.
    • Character encoding—Specifies the encoding type used to read the specified dataset. The default is UTF-8. You can choose from the available encoding options or specify an encoding type. Spaces are not supported in encoding values; for example, ISO 8859-8 is invalid and must be specified as ISO-8859-8. A parsing sketch illustrating these parameters appears after this list.
  • Fields is available to configure field names and types when the data format value is CSV or delimited. The Configure schema button opens a dialog box containing the dataset fields with the following options:
    • Include or drop fields—You can remove fields by unchecking the check box next to the field. By default, all fields are included.
    • Field name—The name of the field as it will be used in Data Pipelines. This value can be edited. By default, this value will be the same as the field in the source dataset unless the source name contains invalid characters or is a reserved word. Invalid characters will be replaced with an underscore (_), and reserved words will be prefixed with an underscore (_).
    • Field type—The field type as it will be used in Data Pipelines.

    Removing or modifying fields in Data Pipelines will not modify the source data.

    The following field types are available:

      • String—String fields support a string of text characters.
      • Small integer—Small integer fields support whole numbers between -32768 and 32767.
      • Integer—Integer fields support whole numbers between -2147483648 and 2147483647.
      • Big integer—Big integer fields support whole numbers between -9223372036854775808 and 9223372036854775807.
      • Float—Float fields support fractional numbers between approximately -3.4E38 and 3.4E38.
      • Double—Double fields support fractional numbers between approximately -2.2E308 and 1.8E308.
      • Date—Date fields support values in the format yyyy-MM-dd HH:mm:ss; for example, a valid value is 2025-12-31 13:30:30. If the date values are stored in a different format, use the Create date time tool to calculate a date field.
      • Date only—Date only fields support values in the format yyyy-MM-dd; for example, a valid value is 2025-12-31. If the date only values are stored in a different format, use the values as input to the Calculate field tool to calculate a date only field.
      • Boolean—Boolean fields support values of True and False. If a field contains integer representations of Boolean values (0 and 1), use the Update fields tool to cast the integers to Boolean values instead.

  • If the JSON format option is specified, the Root property parameter is available. You can use this parameter to specify a property in the JSON to read data from. Reference nested properties by separating each property name with a period (.), for example, property.subProperty. By default, the full JSON file is read. A sketch showing how a root property resolves appears after this list.
  • If the GeoJSON format option is specified, the optional Geometry type parameter is available. By default, the geometry type in the GeoJSON file is used. If the file contains more than one geometry type, you must specify a value for this parameter; mixed geometry types are not supported, and only the specified type is used. The options are Point, Multipoint, Polyline, and Polygon. A geometry field containing the locations of the GeoJSON data is automatically calculated and added to the input dataset. The geometry field can be used as input to spatial operations or to enable geometry on the output result. A sketch that lists the geometry types in a file appears after this list.
  • To improve the performance of reading input datasets, consider the following options:
    • Use the Use caching parameter to store a copy of the dataset. The cached copy is maintained only while at least one browser tab with the editor open remains connected. Caching may make it faster to access the data during processing. If the source data has been updated since it was cached, uncheck this parameter and preview or run the tool again.
    • After configuring an input dataset, configure any of the following tools that limit the amount of data being processed:
      • Filter by attribute—Maintain a subset of records that contain certain attribute values.
      • Filter by extent—Maintain a subset of records within a certain spatial extent.
      • Select fields—Maintain only the fields of interest.
      • Clip—Maintain a subset of records that intersect with specific geometries.
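
The following sketch illustrates the custom headers note above. It uses the Python requests library and a hypothetical endpoint to confirm, outside of Data Pipelines, that a URL responds successfully when the headers an API requires are sent:

    # A minimal sketch, assuming the requests library and a hypothetical
    # endpoint. It verifies that the URL is reachable and returns the content
    # type you plan to select in the Response format parameter.
    import requests

    URL = "https://example.com/api/records"          # hypothetical endpoint
    headers = {"Content-type": "application/json"}   # non-secret headers only

    response = requests.get(URL, headers=headers, timeout=30)
    response.raise_for_status()  # raises if the URL is unresponsive or returns an error
    print(response.headers.get("Content-Type"))      # for example, application/json

Do not place credentials or secrets in these headers; store them in a service connection item as described below.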
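
The CSV or delimited dataset definition parameters map closely to options in common CSV readers. The following sketch, assuming the pandas library and a hypothetical file, illustrates how Delimiter, Has header row, and Character encoding affect parsing; it is not a reproduction of the Data Pipelines reader:

    # A minimal sketch using pandas; the file name is hypothetical.
    import pandas as pd

    df = pd.read_csv(
        "records.csv",          # hypothetical file
        sep=";",                # Delimiter: semicolon (;)
        header=0,               # Has header row: true (use header=None when false)
        encoding="ISO-8859-8",  # Character encoding: no spaces, for example, ISO-8859-8
    )
    print(df.dtypes)  # compare with the field types configured under Fields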
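
To illustrate the Root property parameter, the following sketch resolves a nested property against a hypothetical JSON structure, walking one property name per period-separated segment:

    # A minimal sketch with a hypothetical JSON structure.
    import json

    raw = '{"results": {"records": [{"id": 1}, {"id": 2}]}}'
    data = json.loads(raw)

    root_property = "results.records"     # hypothetical Root property value
    for key in root_property.split("."):  # walk one nested property per segment
        data = data[key]

    print(data)  # [{'id': 1}, {'id': 2}]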
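
Because mixed geometry types are not supported, it can help to check which geometry types a GeoJSON file contains before setting the Geometry type parameter. The following sketch, using plain Python and a hypothetical file, lists the types present:

    # A minimal sketch; the file name is hypothetical.
    import json

    with open("data.geojson", encoding="utf-8") as f:
        collection = json.load(f)

    types = {feature["geometry"]["type"] for feature in collection["features"]}
    print(types)  # more than one value means Geometry type must be specified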

Connect to URLs requiring authentication (Beta)

To connect to a URL that requires authentication, complete the following steps to create a service connection item in the Data Pipelines editor:

  1. On the Data Pipelines editor toolbar, click Inputs and choose URL.

    The Add a URL dialog box appears.

  2. In the URL parameter, provide the URL to the dataset, including the leading https://.
  3. Use the Response format parameter to specify the format of the dataset as it is returned from the URL.
  4. Enable the This URL requires authentication (Beta) option.
  5. Choose Add new service connection.
  6. Click Next.

    The Add a service connection dialog box appears.

  7. In the Base URL parameter, provide the domain name that the service connection will send credentials or secrets to.
  8. Choose one of the following from the Authentication type drop-down menu:
    • API key—Requires an API key that will be used as a header value or a query parameter.
    • Basic—Requires a username and password.
  9. Provide the values for the authentication parameters. Use the preview at the bottom of the dialog box to confirm the format matches the URL's requirements; a request sketch illustrating both formats appears after these steps.

    If you specified API key in the previous step, provide the following authentication parameters:

    • Parameter location—Specifies whether the API key is sent in a header or a query parameter.
    • Parameter name—Specifies the name of the header or query parameter.
    • API key—Specifies the API key.
    • API key prefix—Specifies a value to prepend to the API key, for example, "Bearer". This parameter is optional.

  10. Click Next.

    The item details pane appears.

  11. Provide a title for the new service connection item.

    This title will appear in your content. You can also store the item in a specific folder and provide item tags or a summary.

  12. Click Save to create the service connection item.

    A URL element is added to the canvas.
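
The Parameter location, Parameter name, and API key prefix values determine where and how the key appears in the request. The following sketch, assuming the Python requests library and hypothetical names and values, shows the two resulting request shapes:

    # A minimal sketch of the two API key placements; all names and values
    # are hypothetical. In Data Pipelines, the key itself is stored securely
    # in the service connection item.
    import requests

    BASE_URL = "https://example.com/api/records"   # hypothetical Base URL
    API_KEY = "my-key"                             # hypothetical key

    # Parameter location: header; Parameter name: X-Api-Key; prefix: Bearer
    r1 = requests.get(BASE_URL, headers={"X-Api-Key": f"Bearer {API_KEY}"}, timeout=30)

    # Parameter location: query parameter; Parameter name: api_key; no prefix
    r2 = requests.get(BASE_URL, params={"api_key": API_KEY}, timeout=30)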

Limitations

The following are known limitations:

  • If your organization has blocked beta apps and capabilities, you cannot access the following parameters:
    • This URL requires authentication (Beta)
    • Service connection (Beta)
    • Custom headers (Beta)
    These features are currently in beta. If you are using these parameters, share your experience and seek support through the Beta Features Feedback forum in the Data Pipelines Community.
  • If the specified URL uses an invalid HTTPS certificate or is unresponsive, you cannot use the dataset as input to Data Pipelines.
  • If the specified URL cannot be read in Data Pipelines, but you can download the data from it directly, try using the downloaded data as input to the File tool instead.
  • A custom IP address cannot be used for the URL. Only domain names are supported.
  • URLs using the arcgis.com domain are not supported. Instead of using ArcGIS URLs as input, use the File or Feature layer input tools.
  • Custom ports are not supported.
  • Some header values are not supported as input to the Custom headers (Beta) parameter, including the Authorization header. To learn how to securely store secrets, refer to the Connect to URLs requiring authentication section.
  • URLs that redirect may not be supported.

    Learn more about diagnosing URL redirects in the Data Pipelines Community

  • Esri JSON files (.esrijson) are not supported.
  • If the dataset includes field names with spaces or invalid characters, the names are automatically updated to use underscores. For example, a field named Population 2022 is renamed Population_2022, and a field named %Employed is renamed _Employed. A sketch approximating this renaming appears after this list.
  • To use a service connection item to connect to URLs that require authentication, you must be the owner of the item. Service connection items are private and cannot be shared.
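
The field renaming behavior noted above can be approximated as follows. This sketch is an illustration only; the reserved word list is hypothetical, and the exact character rules may differ from what Data Pipelines applies:

    # A minimal sketch approximating the field renaming described above.
    import re

    RESERVED = {"date", "type"}  # hypothetical reserved words

    def sanitize(name):
        clean = re.sub(r"[^A-Za-z0-9_]", "_", name)  # invalid characters become _
        if clean.lower() in RESERVED:
            clean = "_" + clean                      # reserved words are prefixed with _
        return clean

    print(sanitize("Population 2022"))  # Population_2022
    print(sanitize("%Employed"))        # _Employed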

Related topics

See Dataset configuration for more information.