Federate data and manage feeds

Site managers can use the feeds available for an ArcGIS Hub site to federate the site’s public content with external catalogs and increase discovery through search engines and third-party catalogs. Managers and visitors can use feeds to stay up to date on changes to the site catalog.

Use catalog feeds

You can use ArcGIS Hub feeds to stay aware of changes to the site catalog, such as the addition of new content. For all public Hub sites with public items in their catalog, an Explore feeds button appears in the site footer and search interface. Selecting this opens a display of the feeds and code needed to add each one. Hub supports the following feeds:

  • DCAT US 1.1 (<siteURL>/api/feed/dcat-us/1.1.json)
  • DCAT AP 2.1.1 (<siteURL>/api/feed/dcat-ap/2.1.1.json)
  • OGC API - Records (<siteURL>/api/search/definition)
  • RSS (<siteURL>/api/feed/rss/2.0)
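
As a rough illustration, the feed addresses listed above can be derived from a site URL. The sketch below is Python; the feed_urls helper and the example hostname are hypothetical, and only the path suffixes come from the list above.

```python
# Path suffixes copied from the list of supported feeds.
FEED_PATHS = {
    "dcat-us": "/api/feed/dcat-us/1.1.json",
    "dcat-ap": "/api/feed/dcat-ap/2.1.1.json",
    "ogc-records": "/api/search/definition",
    "rss": "/api/feed/rss/2.0",
}


def feed_urls(site_url: str) -> dict[str, str]:
    """Return the URL of each supported feed for a Hub site (hypothetical helper)."""
    base = site_url.rstrip("/")  # avoid a double slash when joining
    return {name: base + path for name, path in FEED_PATHS.items()}


if __name__ == "__main__":
    # "https://www.yourhubsite.gov" is a placeholder, not a real site.
    for name, url in feed_urls("https://www.yourhubsite.gov/").items():
        print(f"{name}: {url}")
```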

DCAT feeds

Each site has a catalog (content library) containing content that you want to share through the site. To federate your site’s catalog, you can share a public feed output URL that is automatically generated for every public Hub site. Hub supports two DCAT feeds: DCAT US 1.1 and DCAT AP 2.1.1. For example, the catalog feed at www.yourhubsite.gov/api/feed/dcat-us/1.1.json conforms to DCAT US 1.1. You can also edit the content of your site's catalog using the DCAT configuration editor in Hub.

Caution:

Only data items that are shared publicly populate the <DCAT type>.json catalog. Private content within your organization cannot currently be shared or federated through the DCAT catalog method.

Note:

Every site's data catalog generates a public feed output URL that conforms to DCAT US 1.1 at <siteURL>/data.json. In early 2022, ArcGIS Hub officially migrated to a new endpoint at <siteURL>/api/feed/dcat-us/1.1.json. Read the Changes to DCAT configurations on ArcGIS Hub sites blog to learn more.
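
To give a sense of what a consumer of the DCAT US 1.1 feed works with, here is a minimal Python sketch that reads record titles from a catalog document. The sample catalog is hypothetical and heavily trimmed; it assumes the DCAT US 1.1 layout, in which records are listed under a top-level "dataset" key. The dataset_titles helper is an illustrative name, not part of Hub.

```python
import json

# Hypothetical, trimmed catalog in the DCAT US 1.1 layout.
SAMPLE_FEED = """
{
  "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
  "dataset": [
    {"title": "Parks", "modified": "2023-01-05T00:00:00.000Z"},
    {"title": "Bus Stops", "modified": "2023-03-17T00:00:00.000Z"}
  ]
}
"""


def dataset_titles(feed_text: str) -> list[str]:
    """Return the title of every record in a DCAT US 1.1 catalog."""
    catalog = json.loads(feed_text)
    return [record.get("title", "") for record in catalog.get("dataset", [])]


if __name__ == "__main__":
    print(dataset_titles(SAMPLE_FEED))
```

In practice the text would come from the site's feed URL rather than an inline string.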

OGC API - Records

You can use the OGC API - Records feed to discover geospatial resources through standardized collection and resource metadata. With this feed, you can programmatically query, filter, and search a Hub site's catalog. Use this explorer to test API endpoints and search a catalog without needing to use the site's client search interface. Common use cases include rendering features on a map in other tools, such as ArcGIS Online Map Viewer, GIS desktop applications, and OWSLib.
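
A programmatic query might be assembled along the following lines. This Python sketch only builds a URL and does not call the service. The "/api/search/v1/collections/<id>/items" path and the "dataset" collection ID are assumptions modeled on the general OGC API - Records pattern, so confirm both against the definition published at <siteURL>/api/search/definition. The records_query_url helper is hypothetical.

```python
from urllib.parse import urlencode


def records_query_url(site_url, collection="dataset", q=None, bbox=None, limit=10):
    """Build an items query URL in the OGC API - Records style.

    ASSUMPTION: the path and the default collection id are illustrative
    and are not taken from the Hub documentation.
    """
    params = {"limit": limit}
    if q:
        params["q"] = q  # free-text search term
    if bbox:
        # bbox order is minx, miny, maxx, maxy
        params["bbox"] = ",".join(str(v) for v in bbox)
    base = site_url.rstrip("/")
    return f"{base}/api/search/v1/collections/{collection}/items?{urlencode(params)}"


if __name__ == "__main__":
    print(records_query_url("https://www.yourhubsite.gov", q="parks"))
```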

RSS feeds

RSS is a format for web content syndication and a standard way to disseminate metadata about catalog entries, especially for frequently updated or appended catalogs. RSS feeds contain high-level metadata such as title, description, and publication date for all publicly available content shared to the site. Hub site editors can choose which metadata to display; that is, they can override default metadata values and provide the values that matter most for user updates. For instance, a site editor could configure metadata values to highlight when content was updated, a brief description, and the associated geographic location. Users can stay up to date on changes to a search catalog or pull the feed into an RSS reader (aggregator) to showcase the content on a different site.
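
For readers who want to process the feed themselves rather than use an aggregator, the following Python sketch extracts the title, description, and publication date from an RSS 2.0 document using the standard library. The sample feed is hypothetical, and the actual elements present in a Hub feed depend on how the site editor has configured it.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document with a single entry.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Hub site</title>
    <item>
      <title>Parks</title>
      <description>City park boundaries</description>
      <pubDate>Thu, 05 Jan 2023 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def rss_items(rss_text):
    """Return title, description, and pubDate for each <item> in the feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iterfind("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in rss_items(SAMPLE_RSS):
        print(entry["pubDate"], "-", entry["title"])
```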

Configure and manage feeds

Site managers can choose which attributes and values are applied to a site’s output feed for DCAT US 1.1, DCAT AP 2.1.1, and RSS. You can keep ArcGIS Hub's default configuration, or you can configure certain fields/attributes yourself. In the feed editor, you must supply valid keys corresponding to a dataset’s metadata.

  1. Select the edit button to open the site in edit mode.
  2. Open the site menu in the top navigation bar and choose Content library.
  3. Select the More actions button and choose Configure feeds.
  4. Select a feed to configure. In the Configuration editor, copy and paste your code anywhere after a comma and before the last bracket.
  5. Select Save.
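
Because pasted code must sit after a comma and before the final closing bracket, a missing or trailing comma is an easy mistake. One way to catch it before saving is to check that the edited text still parses as a JSON object. This is a minimal Python sketch; the check_schema_text helper is hypothetical, and it verifies JSON syntax only, not whether Hub accepts the keys.

```python
import json


def check_schema_text(text):
    """Return (True, "") if text parses as a JSON object, else (False, reason)."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(parsed, dict):
        return False, "top level must be a JSON object"
    return True, ""


if __name__ == "__main__":
    good = '{"title": "{{name}}", "platform": "ArcGIS Hub"}'
    bad = '{"title": "{{name}}" "platform": "ArcGIS Hub"}'  # comma missing
    print(check_schema_text(good))
    print(check_schema_text(bad))
```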

Default schema example

ArcGIS Hub uses a schema written in JSON to determine which metadata properties appear for each record in the corresponding feed. Below is the default DCAT US 1.1 schema. It contains key/value pairs such as "title": "{{name}}" and "description": "{{description}}". For each record in the feed, you will see the key ("title") and the resolved template value ("<item’s metadata title>"). The schema’s design is based on the most straightforward mapping between ArcGIS item metadata and the DCAT US 1.1 standard.

Default DCAT US 1.1 schema

{
	"title": "{{name}}",
	"description": "{{description}}",
	"keyword": "{{tags}}",
	"issued": "{{created:toISO}}",
	"modified": "{{modified:toISO}}",
	"publisher": {
		"name": "{{source}}"
	},
	"contactPoint": {
		"fn": "{{owner}}",
		"hasEmail": "{{orgContactEmail}}"
	},
	"spatial": "{{extent}}"
}

You can edit the "spatial" attribute of DCAT US and DCAT AP feeds. Hub will use item extent (by default) in new templates. For items with no extent value, the spatial attribute is removed. You can override the "spatial" value with an alternative: "spatial": "{{extent || 'SPATIAL_FALLBACK'}}" and update the default template.
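
The way templates and fallbacks resolve can be pictured with a small Python sketch. This is not Hub's implementation: the render helper is hypothetical, it handles only flat schemas with {{key}} and {{key || 'fallback'}} values, and it passes any other template syntax (such as the ":toISO" suffix) through unchanged. Dropping a key that has neither a value nor a fallback is an assumption modeled on the "spatial" behavior described above.

```python
import re

# Matches {{key}} and {{key || 'fallback'}} only.
TEMPLATE = re.compile(r"^\{\{\s*([\w.]+)\s*(?:\|\|\s*'([^']*)'\s*)?\}\}$")


def render(schema, item):
    """Resolve templated values in a flat schema against one item's metadata."""
    record = {}
    for key, value in schema.items():
        match = TEMPLATE.match(value) if isinstance(value, str) else None
        if match is None:
            record[key] = value  # literal or unhandled template: copy as-is
            continue
        name, fallback = match.groups()
        resolved = item.get(name)
        if resolved is None:
            resolved = fallback  # may still be None when no fallback is given
        if resolved is not None:
            record[key] = resolved
        # ASSUMPTION: with no value and no fallback, the key is omitted.
    return record


if __name__ == "__main__":
    schema = {
        "title": "{{name}}",
        "spatial": "{{extent || 'SPATIAL_FALLBACK'}}",
        "platform": "ArcGIS Hub",
    }
    print(render(schema, {"name": "Parks"}))
```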

Custom schema examples

You can customize the schema by adding, updating, or removing key/value pairs. Some keys cannot be edited, depending on the type of feed. Below is a custom DCAT US 1.1 schema example with several modifications including the following:

  • Adding a key/value pair
  • Updating a key/value pair
  • Adding a fallback for a key/value pair

Custom DCAT US 1.1 schema

{
	"title": "{{name}}",
	"description": "{{description}}",
	"keyword": "{{tags}}",
	"issued": "{{created:toISO}}",
	"modified": "{{modified:toISO}}",
	"publisher": {
		"name": "{{source}}"
	},
	"contactPoint": {
		"fn": "{{owner}}",
		"hasEmail": "{{orgContactEmail}}"
	},
	"culture": "{{culture}}",
	"summary": "{{snippet}}",
	"platform": "ArcGIS Hub",
	"bureauCode": [
		"010:86",
		"010:04"
	],
	"programCode": [
		"015:001",
		"015:002"
	]
}

Note:

The custom DCAT US 1.1 schema includes the addition of five new keys: “culture”, “summary”, “platform”, “bureauCode”, and “programCode”. The keys “culture” and “summary” have template values that pull from the Hub V3 API, the latest version of the Hub API. The key “platform” has a string literal value, and the keys “bureauCode” and “programCode” have arrays of string literals.
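
When a schema has been edited several times, it can help to compare its top-level keys against the default. The Python sketch below does this for the two schemas shown in this section; the schema_key_diff helper is hypothetical and the key sets are transcribed from the examples above. Run against those examples, it reports the five added keys and also shows that the custom example omits "spatial".

```python
# Top-level keys transcribed from the default and custom schema examples.
DEFAULT_KEYS = {
    "title", "description", "keyword", "issued", "modified",
    "publisher", "contactPoint", "spatial",
}
CUSTOM_KEYS = {
    "title", "description", "keyword", "issued", "modified",
    "publisher", "contactPoint", "culture", "summary", "platform",
    "bureauCode", "programCode",
}


def schema_key_diff(default_keys, custom_keys):
    """Return (added, removed) top-level keys, each sorted alphabetically."""
    added = sorted(set(custom_keys) - set(default_keys))
    removed = sorted(set(default_keys) - set(custom_keys))
    return added, removed


if __name__ == "__main__":
    added, removed = schema_key_diff(DEFAULT_KEYS, CUSTOM_KEYS)
    print("added:", added)
    print("removed:", removed)
```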

Custom value examples

To match an organization's metadata standards, many site managers will want to adjust the metadata that appears in a feed. A key can be any literal string, such as “title”, but keys should generally conform to a target metadata standard. The corresponding value can be a string literal or a template that pulls a key from the Hub V3 API. For templates, you can supply any key returned from the V3 API, either top-level or nested.

For example, on the ArcGIS Hub feeds example site at dc.esri.com, there is a public layer titled “USA Weather Watches and Warnings.” You can see JSON metadata for that dataset by accessing the layer’s ID: c7a223914778420db8bf000b4eb6ec6f using the Hub V3 API (https://hub.arcgis.com/api/v3/datasets/c7a223914778420db8bf000b4eb6ec6f). If the item has multiple layers, add the layer number to the end of the item ID (`<item ID>_<layer number>`).
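
The ID convention described above can be expressed as a small Python helper. The base URL and the <item ID>_<layer number> pattern come from the paragraph above; the dataset_url function name is hypothetical.

```python
# Base URL taken from the Hub V3 API example above.
HUB_V3_DATASETS = "https://hub.arcgis.com/api/v3/datasets"


def dataset_url(item_id, layer=None):
    """Return the Hub V3 API URL for an item, or for one layer of an item."""
    dataset_id = item_id if layer is None else f"{item_id}_{layer}"
    return f"{HUB_V3_DATASETS}/{dataset_id}"


if __name__ == "__main__":
    item = "c7a223914778420db8bf000b4eb6ec6f"
    print(dataset_url(item))
    print(dataset_url(item, layer=0))  # layer number appended after "_"
```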

When accessing the example API response above, you should see a JSON response starting like the following:

Custom value example DCAT US 1.1

"data": 
{
    "id": "c7a223914778420db8bf000b4eb6ec6f",
    "type": "dataset",
    "attributes": {
        "errors": [],
        "access": "public",
        "additionalResources": [],
  …
}
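
Instead of scrolling through the response, you could list the available keys in code. The Python sketch below works on a hypothetical, trimmed copy of the response; the real response contains many more attributes. It also converts the "created" value to an ISO 8601 string, on the assumption that the value is a Unix timestamp in milliseconds. Both helper names are illustrative.

```python
from datetime import datetime, timezone

# Hypothetical, trimmed copy of a Hub V3 API dataset response.
SAMPLE_RESPONSE = {
    "data": {
        "id": "c7a223914778420db8bf000b4eb6ec6f",
        "type": "dataset",
        "attributes": {
            "errors": [],
            "access": "public",
            "additionalResources": [],
            "created": 1610151009000,
        },
    }
}


def attribute_keys(response):
    """Return the keys under data.attributes, sorted alphabetically."""
    return sorted(response["data"]["attributes"])


def epoch_ms_to_iso(value):
    """Convert a millisecond Unix timestamp to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    print(attribute_keys(SAMPLE_RESPONSE))
    created = SAMPLE_RESPONSE["data"]["attributes"]["created"]
    print(epoch_ms_to_iso(created))
```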

If you scroll down, you’ll see more keys to choose from and use as template values in the editor, such as “created”, which represents the date that the content was created. To use a value from the Hub V3 API, add a template value in the feed editor for any Hub V3 API key underneath “attributes”. For example, suppose you want to include “created” in your feed records, as in the following:

Custom value example DCAT US 1.1

{
…
 	"bureauCode": ["010:86","010:04"],
 	"programCode": ["015:001","015:002"],
 	"created": 1610151009000,
…
}

In this same example, you would add the following lines to the custom DCAT US 1.1 schema:

{
…
	"bureauCode": [
		"010:86",
		"010:04"
	],
	"programCode": [
		"015:001",
		"015:002"
	],
	"created": "{{item.created}}"
…
}

Content managers can configure a feed to include additional custom distributions. These are appended to the existing distributions that Hub automatically generates for a content item's downloadable resources.

Federate catalogs through Hub feeds

Hub feed editors allow site managers to standardize how they describe their data. Site managers can choose which metadata values are displayed for each dataset in the feed before it’s harvested.

Federate with Data.gov

In the United States, you can modify the output to work specifically with large data clearinghouses such as the national Data.gov catalog. This type of interoperability means that you can point these third-party aggregators to the multiple formats (distributions) in which a dataset is available. Distributions are formats offered for use as a web service, download, or API.
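
To illustrate what an aggregator sees, the Python sketch below lists the formats offered by one catalog record. The sample record is hypothetical; it assumes the DCAT US 1.1 convention of a "distribution" array whose entries carry fields such as "format", "mediaType", and "accessURL". The distribution_formats helper is an illustrative name.

```python
# Hypothetical record with two distributions in the DCAT US 1.1 layout.
SAMPLE_RECORD = {
    "title": "Parks",
    "distribution": [
        {
            "@type": "dcat:Distribution",
            "title": "CSV",
            "format": "CSV",
            "mediaType": "text/csv",
            "accessURL": "https://example.com/parks.csv",
        },
        {
            "@type": "dcat:Distribution",
            "title": "GeoJSON",
            "format": "GeoJSON",
            "mediaType": "application/geo+json",
            "accessURL": "https://example.com/parks.geojson",
        },
    ],
}


def distribution_formats(record):
    """Return the format label of each distribution in a record."""
    return [d.get("format", "unknown") for d in record.get("distribution", [])]


if __name__ == "__main__":
    print(distribution_formats(SAMPLE_RECORD))
```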

Site managers can choose which attributes and values are applied to a site’s DCAT US 1.1 output feed. In the feed editor, you must supply valid keys corresponding to a dataset’s metadata.

Federate with CKAN

If your organization uses cataloging software such as CKAN, or works with other organizations that do, you can federate your hub site's data catalog. Your CKAN instance must be properly configured to support data harvesting. First, install and configure two extensions that are developed and maintained by the CKAN team and used by Data.gov and others to harvest datasets: the CKAN Harvesting extension and the CKAN DCAT extension.

After confirming that these extensions are installed, ensure that you have the Harvester Gather_Consumer and Fetch_Consumer services running as background services. Consult CKAN documentation for details.

Harvest the ArcGIS Hub catalog

To harvest the catalog, complete the following steps:

  1. Go to your CKAN harvest administration page at http://yourCKANinstance/harvest and sign in.
  2. Select Add harvest source and provide information about your hub site:
    • Fill in the URL with http://yourOpenDataSite/data.json.
    • Give the harvest source a title similar to the title of your site.
    • Optionally, fill in the description box.
    • Select DCAT JSON harvester as the source type.
    • For update frequency, select Manual.
    • Select Save when finished.
  3. Select Admin and select Reharvest.
  4. Run harvest jobs on your CKAN instance.

    CKAN processes your data.json file and includes all of your datasets. You can see what is harvested by viewing the harvest source. All of your descriptions, tags, and dataset distributions from Hub are accessible from the CKAN instance.
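
Before running a harvest, it can be useful to check the catalog for records that lack fields the target standard expects. The Python sketch below is one possible pre-check, not part of CKAN or Hub. The field list reflects the dataset fields that DCAT US 1.1 marks as always required, to the best of my understanding; whether a given harvester rejects records without them is not something this sketch determines. The missing_required helper and the sample catalog are hypothetical.

```python
# Dataset fields treated here as required by DCAT US 1.1 (assumption to verify
# against the standard before relying on this list).
REQUIRED_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)


def missing_required(catalog):
    """Map each incomplete record's title to the list of fields it lacks."""
    problems = {}
    for index, record in enumerate(catalog.get("dataset", [])):
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            label = record.get("title", f"record {index}")
            problems[label] = missing
    return problems


if __name__ == "__main__":
    # Hypothetical catalog with one deliberately incomplete record.
    catalog = {"dataset": [{"title": "Parks", "description": "City parks"}]}
    print(missing_required(catalog))
```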

Note:

You may experience some delays the first time you preview a .csv or .json file because Hub is generating a cache of the data, and CKAN cannot determine how to handle the file while the data is processing. The delay does not occur the next time you preview the file.