The Amazon S3 source reads records from files stored in an Amazon S3 bucket and performs analysis in ArcGIS Velocity.
Examples
The following are example uses of the Amazon S3 source:
- A researcher wants to load hundreds of delimited text files from an Amazon S3 bucket into Velocity to perform analysis.
- A GIS department stores commonly used boundary shapefiles in an Amazon S3 bucket and wants to load the county boundary shapefile into Velocity as an aggregation boundary.
Usage notes
Keep the following in mind when working with the Amazon S3 source:
- All files identified in the Amazon S3 bucket by the naming pattern specified in the Dataset parameter must have the same schema and geometry type. If a folder name is specified for the Dataset parameter, all files in that folder and its subfolders must have the same file type and schema.
- The secret access key is encrypted the first time the analytic is saved and is stored in an encrypted state.
- When specifying the folder path, use forward slashes (/).
- After configuring source connection properties, see Configure input data to learn how to define the schema and the key properties.
- When using Public access mode to connect Velocity to a public Amazon S3 bucket, the bucket must have the List action granted to Everyone (public access) in the bucket access control list.
- Certain Amazon S3 actions must be granted to the user policy associated with the provided Amazon access key for Velocity to connect to the Amazon S3 bucket and read the data in the specified bucket and folder path:
  - The s3:ListBucket action is required on the specified bucket.
  - The s3:GetObject action is required on the specified folder path and subresources (arn:aws:s3:::yourBucketName/*) for an Amazon S3 source to read data.
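The two required actions above can be expressed in an IAM policy document along the lines of the following sketch; yourBucketName is a placeholder, and an actual policy may contain additional statements your organization requires.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowVelocityListBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::yourBucketName"
    },
    {
      "Sid": "AllowVelocityReadObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::yourBucketName/*"
    }
  ]
}
```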
Parameters
Parameter | Description | Data type |
---|---|---|
Access key | The Amazon access key ID for the S3 bucket, for example, AKIAIOSFODNN7EXAMPLE. Velocity uses the access key to load specified data sources into the app. For details on Amazon access keys, see Accessing AWS using your AWS credentials in the AWS documentation. | String |
Secret key | The Amazon secret access key for the S3 bucket, for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY. Velocity uses the secret access key to load specified data sources into the app. The secret access key is encrypted the first time the analytic is saved and is stored in an encrypted state. For details on Amazon secret access keys, see Accessing AWS using your AWS credentials in the AWS documentation. | String |
S3 bucket name | The name of the Amazon S3 bucket containing the files to read. | String |
Folder path | The path of the folder containing the files to load into Velocity. | String |
Dataset | The name of the file to read if loading a single file, or a pattern indicating a set of files, followed by the file type extension. To build a pattern indicating a set of files, use an asterisk (*) as a wildcard, either on its own or combined with a partial file name. All files identified by the naming pattern must have the same schema and geometry type. Alternatively, if loading multiple files or nested folders, you can specify the containing folder name as the dataset name instead of a file name with extension. When a containing folder name is specified as the dataset, wildcards cannot be used and file types cannot be restricted; all files in the specified folder are ingested and should have the same file type. | String |
Load recent files only | Specifies whether the Amazon S3 source loads all files or only the files created or modified since the last run of the analytic. This parameter can only be set to true for scheduled big data analytics. On the first run of a scheduled big data analytic with the parameter set to true, no files are loaded and the analytic run completes. Subsequent runs load files with a last modified date after the previous scheduled run. | Boolean |
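As a sketch of how a wildcard in the Dataset parameter selects a set of files, the following Python snippet applies fnmatch-style matching to some hypothetical object names; it illustrates the pattern concept only and is not Velocity's own matching implementation.

```python
from fnmatch import fnmatch

# Hypothetical object keys under the configured folder path.
keys = [
    "sensordata/vehicles_2023-01.csv",
    "sensordata/vehicles_2023-02.csv",
    "sensordata/boundaries.shp",
]

# A Dataset value such as "vehicles_*.csv" matches every file whose
# name fits the pattern; "*" on its own would match all file names.
pattern = "vehicles_*.csv"
matched = [k for k in keys if fnmatch(k.rsplit("/", 1)[-1], pattern)]
# matched contains only the two CSV files
```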
Considerations and limitations
There are several considerations to keep in mind when using the Amazon S3 source:
- All files identified in the Amazon S3 bucket by the naming pattern in the dataset property must have the same schema and geometry type.
- Ingesting JSON with an array of objects referenced by a root node is currently not supported for Amazon S3 or Azure Blob Storage.
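The Load recent files only behavior described in the Parameters section amounts to a last-modified filter, which the following Python sketch illustrates; the function name, file list, and timestamps are hypothetical and do not reflect Velocity's internal implementation.

```python
from datetime import datetime, timezone

def files_to_load(files, last_run):
    """Return the names of files modified after the previous run.

    files: list of (name, last_modified) tuples.
    last_run: datetime of the previous scheduled run, or None for the
    first run, in which case no files are loaded.
    """
    if last_run is None:
        return []  # first run with the parameter enabled loads nothing
    return [name for name, modified in files if modified > last_run]

# Illustrative example: only b.csv was modified after the previous run.
files = [
    ("a.csv", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    ("b.csv", datetime(2024, 1, 3, tzinfo=timezone.utc)),
]
previous_run = datetime(2024, 1, 2, tzinfo=timezone.utc)
```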