Federating with CKAN

ArcGIS Hub makes it easy to share your data with any CKAN site by publishing its data catalog in the DCAT format. This interoperability lets users follow familiar workflows to share data from ArcGIS and make it available on a CKAN platform in multiple download formats (SHP, KML, CSV) and through multiple APIs (Geoservices, WMS, GeoJSON).
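
To make the catalog format more concrete, the following Python sketch parses a small DCAT-style data.json document and lists each dataset with its distribution formats. The sample catalog, its field values, and the summarize_catalog helper are illustrative assumptions, not output from a real Hub site; a real catalog typically contains more fields per dataset.

```python
import json

# Hypothetical, trimmed-down data.json catalog (assumed for illustration only).
SAMPLE_CATALOG = """
{
  "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
  "dataset": [
    {
      "title": "City Parks",
      "description": "Park boundaries.",
      "keyword": ["parks", "recreation"],
      "distribution": [
        {"format": "CSV", "accessURL": "https://example.com/parks.csv"},
        {"format": "GeoJSON", "accessURL": "https://example.com/parks.geojson"}
      ]
    }
  ]
}
"""


def summarize_catalog(catalog):
    """Return a list of (title, [formats]) tuples, one per dataset."""
    summary = []
    for dataset in catalog.get("dataset", []):
        formats = [
            dist.get("format", "unknown")
            for dist in dataset.get("distribution", [])
        ]
        summary.append((dataset.get("title", ""), formats))
    return summary


catalog = json.loads(SAMPLE_CATALOG)
for title, formats in summarize_catalog(catalog):
    print(f"{title}: {', '.join(formats)}")
```

Running a check like this against your own data.json before harvesting can help confirm that the datasets and distributions you expect are present.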

Required Information

Before you begin, your CKAN instance must be properly configured to support harvesting. This involves installing and configuring two extensions that are developed and maintained by the CKAN team and are used by Data.gov and others to harvest datasets.

  1. If you do not already have the CKAN Harvesting extension installed, follow the instructions at https://github.com/ckan/ckanext-harvest/blob/master/README.rst.
  2. If you do not already have the CKAN DCAT extension installed, follow the instructions at https://github.com/ckan/ckanext-dcat/blob/master/README.md.
  3. Make sure the harvester gather_consumer and fetch_consumer processes are running as background services.
    • Activate your CKAN Python virtual environment: . /usr/lib/ckan/default/bin/activate
    • Start the gather process: paster --plugin=ckanext-harvest harvester gather_consumer --config='/path/to/your config.ini'
    • Start the fetch process: paster --plugin=ckanext-harvest harvester fetch_consumer --config='/path/to/your config.ini'
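
If you start the consumer processes from a script rather than by hand, building each command as an argument list avoids shell quoting problems when the configuration path contains spaces. The sketch below is one possible approach; harvester_command is a hypothetical helper and the configuration path is a placeholder, and it assumes paster is on the PATH (for example, because the virtual environment is already active).

```python
import shlex


def harvester_command(subcommand, config_path):
    """Build the argument list for one ckanext-harvest paster command."""
    return [
        "paster",
        "--plugin=ckanext-harvest",
        "harvester",
        subcommand,
        f"--config={config_path}",
    ]


config = "/path/to/your config.ini"  # placeholder; use your own CKAN config file

for subcommand in ("gather_consumer", "fetch_consumer"):
    argv = harvester_command(subcommand, config)
    # shlex.join shows the safely quoted, shell-ready form of the command.
    print(shlex.join(argv))
```

Each argument list can be passed directly to subprocess.Popen to start the process. For long-running background services, a process manager is usually a better fit than a hand-rolled script.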

Harvest the ArcGIS Hub catalog

  1. Go to your CKAN harvest administration page and log in at http://yourCKANinstance/harvest.
  2. Select add harvest source and enter some information about your Hub site:
    • Fill in the URL with http://yourOpenDataSite/data.json
    • Give the harvest source a title similar to the title of your Hub site
    • (Optional) Fill in the description box
    • Select source-type: DCAT JSON Harvester
    • Select update frequency: manual
    • Click save when done
  3. Select admin, then select reharvest.
  4. Run the harvest jobs on your CKAN instance:
    • Activate your CKAN Python virtual environment: . /usr/lib/ckan/default/bin/activate
    • Enter the command: paster --plugin=ckanext-harvest harvester run --config='/path/to/your config.ini'
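
The harvest source described in step 2 can also be prepared programmatically. The following Python sketch builds, but does not send, a request to the CKAN action API with the same values as the form. The action name harvest_source_create, the source_type value dcat_json, and the frequency value MANUAL are assumptions based on the ckanext-harvest and ckanext-dcat extensions; verify them against your installed versions. The helper name, the API key, and the source name are hypothetical.

```python
import json
import urllib.request


def build_harvest_source_request(ckan_url, api_key, hub_url, name, title, notes=""):
    """Build (without sending) a POST request that creates a harvest source."""
    payload = {
        "name": name,                                # URL-safe identifier (hypothetical)
        "title": title,                              # similar to the Hub site title
        "notes": notes,                              # optional description
        "url": hub_url.rstrip("/") + "/data.json",   # Hub DCAT catalog
        "source_type": "dcat_json",                  # assumed ID of the DCAT JSON Harvester
        "frequency": "MANUAL",                       # assumed value for manual updates
    }
    return urllib.request.Request(
        ckan_url.rstrip("/") + "/api/3/action/harvest_source_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


request = build_harvest_source_request(
    ckan_url="http://yourCKANinstance",
    api_key="YOUR-API-KEY",  # placeholder
    hub_url="http://yourOpenDataSite",
    name="my-hub-site",
    title="My Hub Site",
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would submit it. Inspect the printed payload first, because a mistake here creates a misconfigured harvest source on a live CKAN instance.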

CKAN now starts processing your data.json file and brings in all of your datasets. You can see what has been harvested by viewing the harvest source. All of your descriptions, tags, and dataset distributions from ArcGIS Hub will be accessible from your CKAN instance.

Note: The first time you try to preview a CSV or JSON file in CKAN, you may notice some strange behavior. At that point Hub is still generating a cache of the data, and CKAN does not know how to handle data that is still processing. This does not occur the next time you preview the file.