Use records from a Google BigQuery table as input to ArcGIS Data Pipelines.
Usage notes
Keep the following in mind when working with Google BigQuery:
- To use a dataset from Google BigQuery, you must first create a data store item. Data store items securely store credentials and connection information so the data can be read by Data Pipelines. To create a data store, follow the steps in the Connect to Google BigQuery section below.
- To change the data store item you configured, use the Data store item parameter to remove the currently selected item, and choose one of the following options:
- Add data store—Create a new data store item.
- Select item—Browse your content to select an existing data store item.
- Use the Dataset parameter to specify the dataset, and the Table parameter to specify the table containing the data you want to use.
- To improve the performance of reading input datasets, consider the following options:
- Use the Use caching parameter to store a copy of the dataset. The cached copy is maintained only while at least one browser tab with the editor open remains connected. Caching may make it faster to access the data during processing. If the source data has been updated since it was cached, uncheck this parameter and preview or run the tool again.
- After configuring an input dataset, configure any of the following tools to limit the amount of data being processed (see the sketch after this list for a way to preview the reduced subset directly in BigQuery):
- Filter by attribute—Maintain a subset of records that contain certain attribute values.
- Select fields—Maintain only the fields of interest.
- Filter by extent—Maintain a subset of records within a certain spatial extent.
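
The filter tools above run inside the Data Pipelines editor, but it can help to gauge data volumes before building the pipeline. The following is a minimal sketch, separate from Data Pipelines, that uses the google-cloud-bigquery Python client to check the size of the source table and count how many records an attribute filter would keep; the key file path and the project, dataset, table, and column names are hypothetical placeholders.

```python
# Minimal sketch: estimate how much data the Data Pipelines filter tools
# would leave by running an equivalent query directly in BigQuery.
# Assumes the google-cloud-bigquery package is installed; all names below
# are hypothetical placeholders.
from google.cloud import bigquery

# Authenticate with the same service account key file used for the data store item.
client = bigquery.Client.from_service_account_json(
    "service-account-key.json", project="my-project-id"
)

# Check the size of the source table (Dataset and Table parameter values).
table = client.get_table("my-project-id.my_dataset.my_table")
print(f"Total rows in source table: {table.num_rows}")

# Rough analogue of Filter by attribute: count records matching an attribute value.
sql = """
    SELECT COUNT(*) AS matching_rows
    FROM `my-project-id.my_dataset.my_table`
    WHERE status = 'active'  -- hypothetical attribute filter
"""
for row in client.query(sql).result():
    print(f"Rows matching the attribute filter: {row.matching_rows}")
```
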
Connect to Google BigQuery
To use data stored in Google BigQuery, complete the following steps to create a data store item in the Data Pipelines editor:
- On the Data Pipelines editor toolbar, click Inputs and choose Google BigQuery.
The Select a data store connection dialog box appears.
- Choose Add a new data store.
- Click Next.
The Add a connection to a data store dialog box appears.
- Provide the service account key file that contains the credentials used to authenticate the connection.
- Provide the ID of the project that contains the data you want to connect to. (A sketch for verifying these values before creating the item follows these steps.)
- Click Next.
The item details pane appears.
- Provide a title for the new data store item.
This title will appear in your portal content. You can also store the item in a specific folder and provide item tags or a summary.
- Click Create connection to create the data store item.
A Google BigQuery element that you can configure for a specific dataset is added to the canvas.
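
Before creating the data store item, it can be useful to confirm that the key file and project ID grant read access to the data you plan to use. The following is a minimal sketch, separate from Data Pipelines, using the google-cloud-bigquery Python client; the key file path and the project, dataset, and table names are hypothetical placeholders.

```python
# Minimal sketch: confirm the service account key file and project ID from
# the steps above can read the target data before creating the data store
# item. Assumes the google-cloud-bigquery package is installed; all names
# below are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client.from_service_account_json(
    "service-account-key.json",  # key file provided in the dialog box
    project="my-project-id",     # project ID provided in the dialog box
)

# List the datasets the service account can see in the project.
for dataset in client.list_datasets():
    print(dataset.dataset_id)

# Confirm the specific table is readable and inspect its schema.
table = client.get_table("my-project-id.my_dataset.my_table")
for field in table.schema:
    print(field.name, field.field_type)
```
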
Limitations
The following are known limitations:
- Views of Google BigQuery datasets are not supported as input to Data Pipelines.
- Google BigQuery external tables are not supported as input to Data Pipelines.
- Refresh tokens are not supported for connecting to Google BigQuery.
- To use a data store item to connect to external data sources, you must be the owner of the data store item. Data store items that are shared with you are not supported as input.
Licensing requirements
The following licensing and configurations are required:
- Creator or Professional user type
- Publisher, Facilitator, or Administrator role, or an equivalent custom role
To learn more about Data Pipelines requirements, see Requirements.