ArcGIS Hub communicates with any number of servers through the open GeoServices REST Specification. These servers run different versions of ArcGIS Server, from 10.1 to 10.9.1, on infrastructure managed by authoritative agencies or hosted in the cloud in ArcGIS Online. To handle the varying performance of these servers, we recommend considering the following details when publishing data.
Service properties
Below are some best practices for configuring service properties:
Max record count should be less than 5,000
When publishing a service, a default max record count of 1,000 or 2,000 is set and intended to provide optimal performance from your server to the client. The max record count determines the maximum number of features that can be returned in a single request. When a service has this set too high, a client can try to request all the data in a single request that is slow to generate and too large to send across the Internet.
It may make sense to increase the value over the default range if you have just over 2,000 or around 4,000 features total in your layer or table.
For services with a max record count above 5,000, the administrator will receive a warning message in ArcGIS Hub suggesting it be lowered. While it may mean fewer features can appear on the map, it also means that a client will not have to wait for extended periods for the map to draw. Users will see gridded visualizations when the map cannot display all features at one time.
Services with a max record count above 10,000 will not be indexed by ArcGIS Hub, and an error will be reported to the administrator. Again, this is for performance reasons, as gathering all the data each time it’s viewed is taxing on a server and is slow for the user viewing the data.
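The thresholds above can be expressed as a small check against a layer's metadata. This is a minimal sketch, not ArcGIS Hub's implementation: `maxRecordCount` is the property reported in a layer's JSON description, while the function name and return labels are hypothetical. A value of exactly 5,000 is treated as acceptable here because the warning applies to values above 5,000.

```python
# Thresholds taken from the guidance above; names and labels are hypothetical.
WARN_ABOVE = 5_000
NOT_INDEXED_ABOVE = 10_000


def classify_max_record_count(layer_info: dict) -> str:
    """Return 'ok', 'warning', or 'not_indexed' for a layer's metadata JSON."""
    max_count = int(layer_info.get("maxRecordCount", 0))
    if max_count > NOT_INDEXED_ABOVE:
        return "not_indexed"  # the dataset would not be indexed
    if max_count > WARN_ABOVE:
        return "warning"  # the administrator would see a warning
    return "ok"
```

Running this over the layer descriptions of a service before sharing it can surface layers that would trigger a warning or be skipped.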
It’s important to note that no matter the max record count, ArcGIS Hub will query out all the data 1,000 records at a time and aggregate it to support file downloads into CSV, KML, SHP, or GeoJSON.
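Paging through a layer 1,000 records at a time can be sketched as follows. This illustrates the general technique, not ArcGIS Hub's actual code. It assumes the layer supports pagination through the `resultOffset` and `resultRecordCount` query parameters, which older servers may not. The `fetch_page` callable is a hypothetical stand-in for whatever HTTP client sends the query and returns the parsed JSON response.

```python
from typing import Callable

PAGE_SIZE = 1000  # page size stated in the text above


def page_params(offset: int, page_size: int = PAGE_SIZE) -> dict:
    """Query parameters for one page of features."""
    return {
        "where": "1=1",
        "outFields": "*",
        "resultOffset": offset,
        "resultRecordCount": page_size,
        "f": "json",
    }


def fetch_all(fetch_page: Callable[[dict], dict], page_size: int = PAGE_SIZE) -> list:
    """Collect every feature by requesting successive pages.

    fetch_page is a hypothetical callable that sends the query
    and returns the decoded JSON response as a dict.
    """
    features = []
    offset = 0
    while True:
        page = fetch_page(page_params(offset, page_size))
        batch = page.get("features", [])
        features.extend(batch)
        # Continue while the server reports more data or the page came back full.
        more = page.get("exceededTransferLimit", False) or len(batch) == page_size
        if not batch or not more:
            break
        offset += len(batch)
    return features
```

Because each request is bounded by the page size, the server never has to assemble the whole dataset in one response, whatever the max record count is set to.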
OGC links
WMS, WFS, and WCS links are added to the API section of the item view on each site when a service publisher has enabled them on the specific service. If the publisher did not enable these capabilities at the time of publishing, they can edit the service and turn them on. Once they do so, the administrator should reindex the specific dataset or the whole site if it affects many datasets. Please note that OGC links will only appear on services from ArcGIS Server 10.2 or higher.
Feature Access is not necessary
ArcGIS Hub queries features out of the Map Service the same way it queries features from a Feature Service. Unless you have another need to have Feature Access enabled, it is best to leave it off.
Scale dependencies do not matter
ArcGIS Hub works by sending requests to your Map or Feature Services and querying out the data regardless of the extent. Cartographically, you may not want to show address points at a global scale, but ArcGIS Hub will still query out all the data and provide high-level visualizations of it: either a summary of locations or a gridded visualization that can be filtered and shows the density of features. These visualizations are not customizable.
Organizing services
Below are some best practices for organizing services:
Services need to be publicly accessible
ArcGIS Hub contains a process that queries statistical information from the server to show a summary of the data, and that also requests the data 1,000 records at a time to build up a cache that supports the download file types. This automatic ETL (extract, transform, load) process runs when a consumer attempts to download the data. Once the data is cached, download requests are served from the cache to reduce the load on your server. Both the indexing process and the cache-building process need to communicate with the server through the firewall.
Extend your infrastructure to the cloud
Servers will reach capacity at some point, or an organization may not expose its ArcGIS Server to the public for security reasons. In both cases, administrators have the option to publish the data layers to ArcGIS Online and leverage the hosted architecture that provides 99.9 percent uptime. Hosting data in ArcGIS Online does cost credits, but it is an optional way to deliver your data to the public.
As an organization, you may choose to host particularly large or popular datasets in ArcGIS Online in an effort to divert web traffic from your servers to the cloud. This hybrid approach is quite common for open data, and providers are also seeing the advantages of cloud hosting in the ArcGIS platform, as the data can be accessed, maintained, and edited, as well as used in many COTS applications.
Raster data is supported as image services
Layers from Image Services will be indexed by ArcGIS Hub and can be downloaded in supported export formats, such as JPEG, PNG, or georeferenced TIFF files. The size of the image able to be downloaded is configured at the service level. If the raster layers are part of a map service, the administrator will be notified that there is unsupported data in the service.
Large services will time out
Organizing your data in multiple services will be faster than having all your data in one service. Although a single service is technically possible, it creates a performance bottleneck because all the data is only available through a single end point. Some providers have used a single service so they can control its minimum and maximum instances and thereby limit the usage available for downloads on a website. If you do not want to share your existing services as open data, publish multiple services that are organized by category of data. You should have no more than 20 layers per service.
With a reasonable number of layers per service, queries to the server stay responsive and provide a good user experience for your consumers. If a query to your server to get the record count takes longer than 90 seconds, the dataset will not be indexed and an error will be reported in the admin application.
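The two limits in this section, 20 layers per service and 90 seconds for the record count, can be checked ahead of time. The sketch below builds a count-only query URL using the `returnCountOnly` parameter and checks the `layers` array of a service description. The helper names and the example URL are hypothetical, and this is not how ArcGIS Hub itself performs the check.

```python
from urllib.parse import urlencode

MAX_LAYERS_PER_SERVICE = 20  # recommendation from the text above
COUNT_TIMEOUT_SECONDS = 90   # indexing limit from the text above


def count_query_url(layer_url: str) -> str:
    """Build a query URL that asks only for the number of records."""
    params = {"where": "1=1", "returnCountOnly": "true", "f": "json"}
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


def too_many_layers(service_info: dict) -> bool:
    """True when a service description lists more layers than recommended."""
    return len(service_info.get("layers", [])) > MAX_LAYERS_PER_SERVICE


# Hypothetical layer URL, for illustration only.
url = count_query_url("https://example.com/arcgis/rest/services/Parcels/MapServer/0")
```

Timing a request to this URL, for example with `urllib.request.urlopen(url, timeout=COUNT_TIMEOUT_SECONDS)`, gives a rough indication of whether the layer would respond within the limit.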
Managing the data
The following best practices will help your data display in a consistent and user-friendly format.
Note:
It is recommended that you enable editor tracking so that users always receive the most up-to-date data.
Use field aliases or have user-friendly field names
When creating the data, cryptic names can end up being used for the attribute columns, and these are of little use to the consumer when surfaced in Hub. The need for these cryptic names may come from supporting other business applications, so instead of changing the name of the column, you can apply a field alias. The field alias set in ArcMap before publishing is what will be used by the server (on-premises or hosted). You can update it by changing the alias and publishing again with the option to overwrite the existing service.
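To see which names a consumer is likely to encounter, you can inspect the `fields` array in a layer's JSON description, where each entry carries a `name` and an `alias`. The following is a minimal sketch with hypothetical helper names and sample field values.

```python
def display_names(fields: list) -> dict:
    """Map each field name to its alias, falling back to the name itself."""
    return {f["name"]: (f.get("alias") or f["name"]) for f in fields}


def fields_without_alias(fields: list) -> list:
    """List fields whose alias is missing or identical to the raw name."""
    return [
        f["name"]
        for f in fields
        if not f.get("alias") or f["alias"] == f["name"]
    ]


# Hypothetical sample of a layer's "fields" array.
sample_fields = [
    {"name": "PRCL_ID", "alias": "Parcel ID"},
    {"name": "LU_CD", "alias": "LU_CD"},
]
```

Fields returned by `fields_without_alias` are candidates for an alias before the service is published.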
Turn off unimportant fields
Often the data you create and manage supports internal applications or comes from other systems, and it includes fields that hold key values used to link to other data. These extra fields can confuse data consumers and should be hidden in the map document before you publish the layer. You can update the visible fields later by changing the map document and publishing again with the option to overwrite the existing service.
Topologies are not supported
The data provided as open data is intended to be delivered in open, machine-readable formats. The automatic built-in ETL process within ArcGIS Hub provides the machine-readable formats as CSV, KML, Shapefile, and GeoJSON. These data formats are unaware of geodatabase behavior that is understood at the ArcObjects level of processing within ArcGIS. Therefore, behaviors such as topologies (network datasets, parcel fabric, and geometric networks) and relationships maintained in the database as relationship classes are not supported. The data in the feature class can be processed into these formats, but the additional capabilities will not carry over.
If you are working with relationship classes or other forms of related data, you can share the individual tables out as data in addition to the spatial feature class. It will be important to leverage the description of the data to state what is related as well as using proper tags and organizing your services. The nonspatial tables as well as the feature class can come from the same service and will show as related in ArcGIS Hub.
Coded value domains are supported
Unlike the behaviors described in the previous section, one geodatabase behavior is supported when generating the open, machine-readable formats. Coded value domains are honored when viewing data in ArcGIS Hub, and when the data is downloaded, the raw codes are replaced by the descriptive values defined in the domain.
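The substitution can be illustrated with the domain information that a layer's JSON description exposes on each field: a `domain` object of type `codedValue` with a `codedValues` list of `code` and `name` pairs. This is a sketch of the idea only, with a hypothetical function name and sample data, and not the ETL process ArcGIS Hub runs.

```python
def decode_attributes(attributes: dict, fields: list) -> dict:
    """Replace raw codes with domain descriptions where a coded value domain exists."""
    lookups = {}
    for field in fields:
        domain = field.get("domain") or {}
        if domain.get("type") == "codedValue":
            lookups[field["name"]] = {
                cv["code"]: cv["name"] for cv in domain.get("codedValues", [])
            }
    # Values without a matching code, or fields without a domain, pass through unchanged.
    return {
        name: lookups.get(name, {}).get(value, value)
        for name, value in attributes.items()
    }


# Hypothetical field definitions for illustration.
zoning_fields = [
    {
        "name": "ZONE",
        "domain": {
            "type": "codedValue",
            "codedValues": [
                {"name": "Residential", "code": "R"},
                {"name": "Commercial", "code": "C"},
            ],
        },
    },
    {"name": "AREA", "domain": None},
]
```

Applied to a record such as `{"ZONE": "R", "AREA": 12.5}`, the function leaves `AREA` untouched and swaps the code in `ZONE` for its description.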
If the CSV is larger than 5 MB and shared to an open data group, then ArcGIS Hub will provide a link to the download for the data. To make these larger CSV files dynamic, choose the option to publish as a service when uploading to ArcGIS Online. Once the service is created, you can share this into any open data group and ArcGIS Hub will allow users to query, filter, and chart the data, as well as provide API end points for developers.
Data must support statistics
When publishing data to ArcGIS Server 10.03 or later, a large majority of datasets will natively support statistics. This allows the application to provide a summary of the data to the consumer so they get a quick view of the values being stored in the data. If the administrator notices an error related to the dataset not supporting statistics, we encourage you to check your server logs and contact technical support if needed.
To stay efficient for the consumer and light on queries to your server, the application only builds statistics for the first 20 columns in a dataset.
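A statistics request against a layer uses the `outStatistics` query parameter, and a layer's JSON description reports whether statistics are supported. The sketch below checks that flag and builds a min/max request limited to the first 20 fields. Which statistics ArcGIS Hub actually requests is not stated in this document, so the choice of `min` and `max`, along with the function names, is an assumption made for illustration.

```python
import json

MAX_STAT_FIELDS = 20  # limit stated in the text above


def supports_statistics(layer_info: dict) -> bool:
    """Check the statistics flags in a layer's metadata JSON."""
    advanced = layer_info.get("advancedQueryCapabilities") or {}
    return bool(layer_info.get("supportsStatistics") or advanced.get("supportsStatistics"))


def statistics_params(field_names: list) -> dict:
    """Build query parameters asking for min and max of the first 20 fields.

    The choice of 'min' and 'max' is an assumption for this sketch.
    """
    stats = []
    for name in field_names[:MAX_STAT_FIELDS]:
        for stat_type in ("min", "max"):
            stats.append(
                {
                    "statisticType": stat_type,
                    "onStatisticField": name,
                    "outStatisticFieldName": f"{name}_{stat_type}",
                }
            )
    return {"where": "1=1", "outStatistics": json.dumps(stats), "f": "json"}
```

If `supports_statistics` returns `False` for a layer, the summary described above cannot be built, which is the situation that produces the error mentioned in this section.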
CSV files under 5 MB are dynamic
When uploading CSV files to ArcGIS Online, you have the option to geocode the data onto the map, providing spatially enabled data. If the data has no location component, you can still upload the CSV to ArcGIS Online. If the CSV is under 5 MB and shared into an open data group, ArcGIS Hub will provide an interactive experience with the data as well as an API end point for developers.