Catalog feeds facilitate interoperability between data catalogs and can increase discovery through search engines and third-party catalogs. Editors of public sites can use the feeds available for an ArcGIS Hub site catalog to federate the site’s public content with external catalogs.
Federating means that your datasets can be discovered on other open data platforms that ingest your catalog feed and link to your data source. This allows users of other platforms to access the original data from your Hub catalog. Managers and visitors can subscribe to feeds to monitor changes to content catalogs.
Use catalog feeds
Use Hub catalog feeds to track changes to a site catalog, such as the addition of new content. For every public Hub site with public items in its catalog, an Explore feeds button appears in the site footer and search interface. Feeds are turned on by default for all public sites, but site editors can turn them off. Select the Explore feeds button to display the site's feed URLs and API explorer pages. Hub supports the following feeds:
- RSS (<siteURL>/api/feed/rss/2.0)
- DCAT US 1.1 (<siteURL>/api/feed/dcat-us/1.1.json)
- DCAT AP 2.1.1 (<siteURL>/api/feed/dcat-ap/2.1.1.json)
- DCAT AP 3.0.0 (<siteURL>/api/feed/dcat-ap/3.0.0.json)
- OGC API – Records (<siteURL>/api/search/definition)
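As a minimal sketch, the feed URLs listed above can be assembled from a site URL as follows. The function name, the dictionary labels, and the example host are assumptions made for illustration; only the paths come from the list above.

```python
from urllib.parse import urlsplit

# Feed paths as listed above; the dictionary keys are labels chosen for this sketch.
FEED_PATHS = {
    "rss-2.0": "/api/feed/rss/2.0",
    "dcat-us-1.1": "/api/feed/dcat-us/1.1.json",
    "dcat-ap-2.1.1": "/api/feed/dcat-ap/2.1.1.json",
    "dcat-ap-3.0.0": "/api/feed/dcat-ap/3.0.0.json",
    "ogc-api-records": "/api/search/definition",
}


def feed_urls(site_url: str) -> dict:
    """Return a label -> URL mapping for every feed path of a site."""
    parts = urlsplit(site_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {site_url!r}")
    base = site_url.rstrip("/")  # avoid a double slash when joining
    return {label: base + path for label, path in FEED_PATHS.items()}


if __name__ == "__main__":
    # "example-hub.example.com" is a placeholder host, not a real Hub site.
    for label, url in feed_urls("https://example-hub.example.com/").items():
        print(f"{label}: {url}")
```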
RSS feeds
RSS (Really Simple Syndication) is a web feed format used to publish frequently updated content in a standardized way, allowing users to automatically subscribe and receive updates. RSS feeds contain high-level metadata such as title, description, and publication date for all publicly available content shared to the site. Users can stay up to date on changes to a search catalog or pull the feed into an RSS reader (aggregator) to showcase the content on a different site. Hub conforms to the GeoRSS specification.
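The following sketch shows one way a subscriber might read the high-level metadata from an RSS 2.0 document using only the Python standard library. The sample XML is invented for illustration and is not actual Hub output. A real feed may carry additional elements (for example, GeoRSS elements) that this sketch ignores.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Invented sample document; a real feed would be downloaded from the site's RSS URL.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Hub site</title>
    <item>
      <title>Street Trees</title>
      <description>Inventory of street trees.</description>
      <pubDate>Mon, 04 Jan 2021 12:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Bike Lanes</title>
      <description>Existing and planned bike lanes.</description>
      <pubDate>Fri, 08 Jan 2021 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return title, description, and publication date per item, newest first."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            # Assumes every item has a pubDate in the RFC 822 style used by RSS 2.0.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return sorted(items, key=lambda entry: entry["published"], reverse=True)


if __name__ == "__main__":
    for entry in read_items(SAMPLE_RSS):
        print(entry["published"].date(), entry["title"])
```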
DCAT feeds
Each site has a content catalog containing items that you want to share through the site. Using DCAT to describe Hub catalog content can increase discoverability and allow applications to access metadata from multiple catalogs. It also enables federated dataset search across catalogs.
Federate your Hub catalog by sharing a public feed output URL that is automatically generated for every public Hub site. Hub supports the following DCAT feeds:
- DCAT US 1.1
- DCAT AP 2.1.1
- DCAT AP 3.0.0
Tip:
Only publicly shared data items and their layers populate the <DCAT type>.json catalog. Private content within your organization cannot currently be shared or federated through the DCAT catalog method.
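To illustrate what a consuming catalog works with, the sketch below summarizes the datasets in a DCAT US 1.1 style catalog document. The sample catalog is invented and abbreviated. The "dataset" and "distribution" arrays follow the DCAT US 1.1 (Project Open Data) schema, but the exact set of fields in a given Hub feed depends on the site's feed configuration.

```python
import json

# Invented, abbreviated catalog; a real one would be downloaded from the feed URL.
SAMPLE_CATALOG = json.loads("""
{
  "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
  "dataset": [
    {
      "title": "Street Trees",
      "identifier": "example-id-1",
      "distribution": [
        {"format": "GeoJSON", "accessURL": "https://example.com/trees.geojson"},
        {"format": "CSV", "accessURL": "https://example.com/trees.csv"}
      ]
    },
    {
      "title": "Bike Lanes",
      "identifier": "example-id-2"
    }
  ]
}
""")


def summarize(catalog):
    """Return one row per dataset with its title, identifier, and formats."""
    rows = []
    for dataset in catalog.get("dataset", []):
        formats = sorted({
            dist["format"]
            for dist in dataset.get("distribution", [])
            if dist.get("format")
        })
        rows.append({
            "title": dataset.get("title", ""),
            "identifier": dataset.get("identifier", ""),
            "formats": formats,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE_CATALOG):
        print(row["title"], row["formats"])
```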
OGC API - Records
Use the OGC API - Records to discover geospatial resources through standardized collections and resources metadata. You can programmatically query, filter, and search a Hub site's catalog, including public and private items (if you have a valid token for private items). Use the OGC API - Records explorer to test API endpoints and search a catalog without needing to use the site's client search interface. Common uses include rendering features on a map in other tools including ArcGIS Online Map Viewer, GIS desktop applications, OWSLib, and more.
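As a rough sketch of programmatic querying, the function below builds a search URL against an items endpoint. The base URL is the one used in the example later on this page. The q, bbox, and limit parameters come from the OGC API - Records standard; whether a particular Hub deployment honors each of them is an assumption here and is best confirmed in the OGC API - Records explorer. Token handling for private items is not covered.

```python
from urllib.parse import urlencode

# Base URL taken from the example on this page.
BASE_URL = "https://hub.arcgis.com/api/search/v1"


def records_search_url(base=BASE_URL, collection="all", q=None, bbox=None, limit=10):
    """Build an items search URL; the function itself is hypothetical."""
    params = {"limit": limit}
    if q:
        params["q"] = q  # free-text search term
    if bbox:
        # Bounding box as min x, min y, max x, max y
        params["bbox"] = ",".join(str(value) for value in bbox)
    return f"{base.rstrip('/')}/collections/{collection}/items?{urlencode(params)}"


if __name__ == "__main__":
    print(records_search_url(q="weather warnings"))
    print(records_search_url(bbox=(-125, 24, -66, 50), limit=5))
```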
Configure and manage feeds
Feeds are available for public sites that have public content in their catalog. Site managers can choose which attributes and values are applied to a site’s output feed for DCAT US 1.1, DCAT AP 2.1.1, and RSS. You can keep Hub's default configuration, or you can configure certain fields/attributes yourself. In the feed editor, you must supply valid keys corresponding to a dataset’s metadata. Feeds are available by default but can be turned off in the site workspace.
To configure and manage feeds, complete the following steps:
- Open a site in edit mode.
- Select the Manage site button to open the site workspace.
- Select Settings and select the Feed pane.
- To enable feeds, select Enable feeds. For Feeds, select a feed type.
- In the Configuration editor, copy and paste your code anywhere after a comma and before the last bracket.
Note:
Some attributes cannot be edited. It is not currently possible to preview the feed or edit the template for DCAT AP 3.0.0.
Tip:
The RSS feed template has a different structure that does not allow you to add a top-level key.
- Select Save.
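Because pasted key/value pairs must keep the template valid JSON, it can help to check an edited template locally before saving it. The sketch below is one way to do that with the Python standard library. The function name is hypothetical, and the check covers JSON syntax only, not conformance to a DCAT standard.

```python
import json


def check_template(text):
    """Return "ok" or a short message describing the first JSON syntax error."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(parsed, dict):
        return "template must be a JSON object"
    return "ok"


if __name__ == "__main__":
    good = '{"title": "{{name}}", "platform": "ArcGIS Hub"}'
    missing_comma = '{"title": "{{name}}" "platform": "ArcGIS Hub"}'
    print(check_template(good))
    print(check_template(missing_comma))
```

A missing comma between pairs and a trailing comma before the last bracket are both reported as syntax errors by this check.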
RSS feed configuration
Hub site editors can choose which metadata to display. They can override default metadata values and provide the values that matter most for user updates. For instance, a site editor could configure metadata values to highlight when content was updated, a brief description, and the associated geographic location.
DCAT default schema example
Hub uses a schema written in JSON to determine which metadata properties appear for each record in the corresponding feed. Below is the default DCAT US 1.1 schema. It contains key/value pairs such as "title": "{{name}}" and "description": "{{description}}". For each record in the feed, you will see the key (“title”) and the templated value (“<item’s metadata title>”). The schema’s design is based on the most straightforward mapping between ArcGIS item metadata and the DCAT US 1.1 standard.
DCAT default US 1.1 schema
{
"title": "{{name}}",
"description": "{{description}}",
"keyword": "{{tags}}",
"issued": "{{created:toISO}}",
"modified": "{{modified:toISO}}",
"publisher": {
"name": "{{source}}"
},
"contactPoint": {
"fn": "{{owner}}",
"hasEmail": "{{orgContactEmail}}"
},
"spatial": "{{extent}}"
}
You can edit the "spatial" attribute of DCAT US and DCAT AP feeds. By default, new templates use the item extent. For items with no extent value, the spatial attribute is removed from the record. To supply a fallback value instead, update the default template to use "spatial": "{{extent || 'SPATIAL_FALLBACK'}}".
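To make the template behavior concrete, the sketch below imitates how a schema like the one above could be resolved against one item's metadata. It is not Hub's implementation. The render and resolve functions, the handling of the toISO transform, and the exact ISO output format are assumptions made for illustration. The sketch covers plain keys, a toISO transform on a millisecond timestamp, the || fallback, and removal of keys whose value is missing.

```python
import re
from datetime import datetime, timezone

TOKEN = re.compile(r"\{\{\s*(.*?)\s*\}\}")


def lookup(record, path):
    """Follow a dotted path through nested dictionaries; None if absent."""
    value = record
    for part in path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def resolve(expression, record):
    """Resolve one expression such as "created:toISO" or "extent || 'X'"."""
    source, _, fallback = expression.partition("||")
    path, _, transform = source.strip().partition(":")
    value = lookup(record, path.strip())
    if value in (None, "", []):
        fallback = fallback.strip()
        return fallback.strip("'\"") if fallback else None
    if transform.strip() == "toISO":
        # Assumes the value is a timestamp in milliseconds since the epoch.
        value = datetime.fromtimestamp(value / 1000, tz=timezone.utc).isoformat()
    return value


def render(template, record):
    """Render a template; keys that resolve to nothing are dropped."""
    if isinstance(template, dict):
        rendered = {key: render(value, record) for key, value in template.items()}
        return {key: value for key, value in rendered.items() if value is not None}
    if isinstance(template, list):
        return [render(value, record) for value in template]
    if isinstance(template, str):
        match = TOKEN.fullmatch(template)
        return resolve(match.group(1), record) if match else template
    return template


if __name__ == "__main__":
    item = {"name": "Street Trees", "created": 1610151009000}  # invented item
    schema = {
        "title": "{{name}}",
        "issued": "{{created:toISO}}",
        "spatial": "{{extent || 'SPATIAL_FALLBACK'}}",
        "platform": "ArcGIS Hub",
    }
    print(render(schema, item))
```

In this sketch, the item has no extent, so "spatial" resolves to the fallback string. Without the fallback, the "spatial" key would be left out of the rendered record.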
DCAT custom schema examples
You can customize the schema by adding, updating, or removing key/value pairs. Some keys cannot be edited, depending on the type of feed. Below is a custom DCAT US 1.1 schema example with several modifications including the following:
- Adding a key/value pair
- Updating a key/value pair
- Adding a fallback for a key/value pair
Tip:
Maintaining valid feeds is essential for interoperability and discoverability of your public Hub content. Consult appropriate resources before adding or modifying fields, as this can cause the feed to be invalid (Hub does not validate feeds). Invalid feeds can cause issues with federating catalog contents in sites such as data.gov, data.europa.eu and other places where feeds are consumed. Validate your feeds before and after customizing them using the DCAT US 1.1 validator and the DCAT AP validator.
DCAT custom US 1.1 schema
{
"title": "{{name}}",
"description": "{{description}}",
"keyword": "{{tags}}",
"issued": "{{created:toISO}}",
"modified": "{{modified:toISO}}",
"publisher": {
"name": "{{source}}"
},
"contactPoint": {
"fn": "{{owner}}",
"hasEmail": "{{orgContactEmail}}"
},
"culture": "{{culture}}",
"summary": "{{snippet}}",
"platform": "ArcGIS Hub",
"bureauCode": [
"010:86",
"010:04"
],
"programCode": [
"015:001",
"015:002"
]
}
Note:
The custom DCAT US 1.1 schema adds five new keys: “culture”, “summary”, “platform”, “bureauCode”, and “programCode”. The keys “culture” and “summary” have template values that pull from the OGC API - Records, the latest version of the Hub API. The keys “platform”, “bureauCode”, and “programCode” have literal values: a string for “platform” and arrays of strings for “bureauCode” and “programCode”.
Custom value examples
To match an organization's metadata standards, site managers may want to adjust the metadata that appears in a feed. A key can be any literal string, such as “title”, but keys should generally conform to a target metadata standard. The corresponding value can be a literal string or a template that pulls a key from the OGC API - Records. For templates, you can supply any key returned from the OGC API - Records, either top-level or nested.
For example, on the Hub feeds example site at dc.esri.com, there is a public layer titled “USA Weather Watches and Warnings.” You can view the JSON metadata for that dataset by requesting the layer’s ID (c7a223914778420db8bf000b4eb6ec6f) through the OGC API - Records, using either https://hub.arcgis.com/api/search/v1/collections/all/items/c7a223914778420db8bf000b4eb6ec6f or https://hub.arcgis.com/api/search/v1/collections/all/items?id=c7a223914778420db8bf000b4eb6ec6f.
When you open either URL, the JSON response should begin like the following:
Custom value example DCAT US 1.1
"data":
{
"id": "c7a223914778420db8bf000b4eb6ec6f",
"type": "dataset",
"attributes": {
"errors": [],
"access": "public",
"additionalResources": [],
…
}
If you scroll down, you’ll see more keys that you can use as template values in the editor, such as “created”, which represents the date that the content was created. To use a value from the OGC API - Records, add a template value in the feed editor for any OGC API - Records key underneath “attributes”. For example, suppose that you want your feed records to include “created”, as in the following output:
Custom value example DCAT US 1.1
{
…
"bureauCode": ["010:86","010:04"],
"programCode": ["015:001","015:002"],
"created": 1610151009000,
…
}
To produce that output, add the “created” key with a template value in the feed editor:
{
…
"bureauCode": [
"010:86",
"010:04"
],
"programCode": [
"015:001",
"015:002"
],
"created": "{{item.created}}"
…
}
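When looking for keys to use in a template, it can be easier to list the key paths under “attributes” than to scroll through the raw response. The sketch below does that for a response shaped like the excerpt above. The sample response is abbreviated, and the "hypotheticalGroup" entry is invented to show how a nested key is reported. How a path is written inside a template should follow the examples in this section.

```python
def key_paths(obj, prefix=""):
    """Return dotted paths for every key, descending into nested objects."""
    paths = []
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            paths.extend(key_paths(value, path))
        else:
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Abbreviated, partly invented response; a real one has many more attributes.
    response = {
        "data": {
            "id": "c7a223914778420db8bf000b4eb6ec6f",
            "type": "dataset",
            "attributes": {
                "errors": [],
                "access": "public",
                "additionalResources": [],
                "created": 1610151009000,
                "hypotheticalGroup": {"nestedKey": "value"},
            },
        }
    }
    for path in key_paths(response["data"]["attributes"]):
        print(path)
```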
Content managers can configure a feed to include additional custom distributions. These are appended to the existing distributions that Hub automatically generates for a content item's downloadable resources.
Federate catalogs through Hub feeds
Hub feed editors allow site managers to standardize how they describe their data. Site managers can choose which metadata values are displayed for each dataset of the feed before it’s harvested.
Federate with Data.gov
In the United States, you can modify your output to interact specifically with large data clearinghouses, such as the national Data.gov catalog. This type of interoperability means that you can point these third-party aggregators to the multiple distributions in which a dataset is available. Distributions are formats offered for use as a web service, download, or API.
Site editors can choose which attributes and values are applied to a site’s DCAT US 1.1 output feed. In the feed editor, you must supply valid keys corresponding to a dataset’s metadata.
Federate with CKAN
If your organization uses cataloging software such as CKAN, or works with other organizations that do, you can federate your hub site's data catalog. Your CKAN instance must be properly configured to support data harvesting. First, install and configure two extensions that are developed and maintained by the CKAN team and used by Data.gov and others to harvest datasets: the CKAN Harvesting extension and the CKAN DCAT extension.
Confirm that these extensions are installed and then ensure that you have the Harvester Gather_Consumer and Fetch_Consumer services running as background services. Consult CKAN documentation for details.
Harvest the ArcGIS Hub catalog
To harvest the catalog, complete the following steps:
- Go to your CKAN harvest administration page and sign in at http://yourCKANinstance/harvest.
- Select Add harvest source and provide information about your hub site:
  - Fill in the URL with <siteURL>/api/feed/dcat-us/1.1.json.
  - Give the harvest source a title similar to the name of your site. Optionally, fill in the description box.
  - Select DCAT JSON harvester as the source type.
  - For update frequency, select Manual.
- Select Save.
- Select Admin and select Reharvest.
- Run harvest jobs on your CKAN instance.
CKAN processes your .json file and includes all your datasets. The harvest source shows what is harvested. Item and layer metadata including descriptions, tags, and download format distributions from Hub are accessible from the CKAN instance.
Note:
Catalog feeds automatically use Downloads API v1 (the previously used downloads API is deprecated).