Creating mosaic datasets

For most elevation applications, the fundamental data is a bare earth digital elevation model (DEM). When this is the case, you'll (1) create a source mosaic dataset for each elevation dataset, (2) create a derived mosaic dataset to manage the full collection, and possibly (3) generate referenced mosaic datasets for end users as needed. The data management issues will center on merging different geographic areas and resolutions.

However, there are applications where multiple surface models may need to be defined. Examples include the following:

  • A digital surface model (DSM) showing tree canopies, buildings, and so on
  • A hydrologically enforced DEM
  • Ellipsoidal height
  • Subsurface bathymetry in oceans and major seas

If multiple surface models are required, simplify data management by creating separate derived mosaic datasets for each. In most cases, the same source mosaic datasets will be used, but different functions and properties will be defined to create specific derived mosaic datasets for each surface type.

See "What should I do if I have multiple elevation surfaces?" below for more information.

Source mosaic datasets

Each elevation collection should be organized using a separate source mosaic dataset.

What should the mosaic dataset properties be?

Most properties do not need to be set for the source mosaic datasets, as they are generally used only for quality assurance purposes.

What coordinate system should I use for my source mosaic dataset?

To simplify quality assurance/quality control testing, set the coordinate system for each source mosaic dataset to be the same as the source data.

What parameters should I use when adding rasters?

The geoprocessing tool will automate the process of adding rasters, but for more information see documentation on adding rasters to mosaic datasets. Use the following parameters when adding elevation rasters.

Parameter | Recommended Settings

Raster type

Most elevation sources can be ingested using the Raster Dataset raster type (the exception is when using LIDAR data)

Note:

For organizations that have very well-defined metadata standards for their elevation data, a specialized raster type that ingests the metadata automatically might be preferable.

Update Cell Size Ranges flag

On (Ensures that pixel sizes are computed, important especially if the source data already contains preexisting overviews)

Update Overviews flag

Off

How should I populate metadata?

First, add appropriate fields to the mosaic dataset attribute table. Recommended names and data types for elevation metadata are listed here. Record these attributes for each source mosaic dataset; the metadata will then be copied into any derived datasets. Users can also add other metadata fields to mosaic datasets as needed.

Field Name | Data Type | Description

LE90

Float

Vertical accuracy in meters (linear error at 90 percent confidence)

CE90

Float

Horizontal accuracy in meters (circular error at 90 percent confidence)

Date_Start

Date

In some cases, elevation data collections are compiled from multiple sources over an extended time. This would be the earliest date represented in the data. For new elevation datasets generated via electronic sensors (LIDAR, IFSAR, radar, for example), store the acquisition date in this field.

Date_End

Date

This is generally the date the elevation data was published, or if available, the date of the most recent data collection.

DEM_Type

Short

Elevation data may represent a number of different types, so this field is recommended for querying records of each type. Values used by Esri are the following:

  • Undefined = 0
  • DSM (digital surface model; for example, LIDAR first return) = 1
  • DEM (bare earth) = 2
  • Bathymetry = 3
  • Ice (ETOPO, for example) = 4

Best

Float

A custom field that defines which image appears on top within a mosaic. This is done by setting the default Mosaic Method property to By Attribute and using the attribute Best.

Note:

Mosaic datasets of elevation data often contain multiple, overlapping datasets. One important consideration for mosaic dataset design is how to control the default ordering of the data so that it is dependent on both scale and suitability. At small scales, the lower-resolution datasets should be accessed, but at larger scales the most appropriate elevation should be used. This is best handled by creating a Best field and defining it as an attribute that indicates the priority of the dataset. Typically, this is a value computed based on metadata values such as cell size, accuracy, or time.

When using the By Attribute mosaic method and setting the Ordering field to Best with a base value of 0, the data with the highest priority (Best Value Closest to 0) is displayed on top. It is also important to take into consideration the MinPS and MaxPS values for each dataset so that the higher priority datasets are not necessarily used when more suitable small-scale datasets are available. However, even if overviews are created for source mosaic datasets, the MaxPS value should not be set too high, or these higher priority datasets will be used even at global scales, which could affect performance.

Users of the mosaic dataset or image services can also change the display order by changing the By Attribute method field and value used, using the Lock Raster mosaic method, and using WHERE clauses to restrict the system to use specific datasets.

VerticalDatum

Text (16)

The name of the vertical datum of the dataset (e.g. NAVD88, EGM96, NGVD29)

Dataset_ID

Text (16)

A unique text identifier for your source mosaic dataset. When following the source/derived/referenced model, all records from each collection (for example, NED 1/3 arcsecond versus NED 1 arcsecond) will be combined in the derived mosaic dataset. Adding and populating this field in each source mosaic dataset will allow a simple query within the derived mosaic dataset to identify records from each collection.

Source_URL

Text (64)

Link to the original source of the data

Link2Metadata

Text (100)

Path and file name to detailed metadata files that can be downloaded
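The ordering behavior described in the note on the Best field can be illustrated with a small sketch. It is a simplified model, not the ArcGIS implementation: it assumes a record is a candidate when the requested pixel size falls between its MinPS and MaxPS values, and that candidates are ranked by how close their Best value is to the base value. The `Record` class, the `visible_order` function, and the sample records are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for one row of a mosaic dataset attribute table."""
    name: str
    best: float    # priority value; closest to the base value is drawn on top
    min_ps: float  # smallest request pixel size at which the record is used
    max_ps: float  # largest request pixel size at which the record is used


def visible_order(records, request_ps, base_value=0.0):
    """Return the records usable at request_ps, highest priority first."""
    candidates = [r for r in records if r.min_ps <= request_ps <= r.max_ps]
    return sorted(candidates, key=lambda r: abs(r.best - base_value))


# Illustrative values only; they do not describe any real collection.
records = [
    Record("srtm_90m", best=900.0, min_ps=0.0, max_ps=5000.0),
    Record("lidar_1m", best=10.0, min_ps=0.0, max_ps=20.0),
    Record("ned_10m", best=100.0, min_ps=0.0, max_ps=200.0),
]

large_scale = [r.name for r in visible_order(records, request_ps=5.0)]
small_scale = [r.name for r in visible_order(records, request_ps=500.0)]
```

In this sketch the 1 m record leads at a 5 m request, while a 500 m request falls outside its MaxPS and only the low-resolution record remains. That is the reason the note warns against setting MaxPS too high.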

How should I deal with NoData values and footprints in my imagery?

NoData values should be defined for each data source. When NoData values are defined, by default, the system will utilize data values from lower-priority rasters in cases where NoData exists (though users can still lock to a specific raster if required).

The properties of the mosaic dataset are typically set so that the system will utilize NoData values for a raster and not clip elevation data to footprints. However, there are some cases where a mask (as a separate raster or footprint) may exist to define NoData values. In such cases, once the source mosaic dataset has been created, a mask function can be applied to the rasters or the footprint used to clip the data.

However, defining footprints is still important, because they (1) are used to optimize the search for suitable elevation datasets covering any area and (2) are returned as geometries in metadata queries. In some cases, there are very large extents of NoData pixels; the footprint can be used to exclude these areas. This is typical for elevation data along transportation corridors, rivers, or power lines.

Note:

Usually, when computing the footprint boundary, the boundary of the service is also updated to be the intersection of the footprints. However, for source mosaic datasets of elevation data, this can result in a very complex boundary polygon that's not very useful. When using the workflow tools, the boundary is not updated with the footprint; instead, the boundary is built using the Envelope option.

To create footprints: Run the Build Footprint tool with the Radiometry option set to refine the footprints from the default envelope. This will change the footprint to better approximate the data extents. Typically, the number of vertices should be kept lower than 300 to avoid slow performance.

In cases where more accurate footprints exist to define the extent of the data, these can be imported using the Import Footprint or Boundary tool.

Note:

With Bathymetry and LIDAR-based projects, the source data may contain unnecessary data, and you may need to clip away these areas using a footprint. In that case, (1) use the footprint to define the required areas and (2) set the mosaic dataset to clip based on footprints. Because NoData is already being handled, the extents of these footprints don't need to be exact. Avoid using too many vertices (preferably fewer than 300) or performance may suffer.

Do I need to adjust height values?

Generally, all data in a single source mosaic dataset will have the same height unit of measure and vertical datum, so heights won't need to be adjusted.

However, sometimes the vertical datum (or units) can be incorrectly defined, so it is a good idea to perform some quality assurance steps to check the height values against some known control.
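One way to approach such a check is sketched below. It assumes you have already sampled the mosaic dataset at a set of control points and hold both sets of heights as plain lists. The function name `diagnose_heights`, the 2 percent tolerance, and the verdict strings are hypothetical choices for illustration. A median ratio near 3.28 or 0.3048 hints at a feet/meters mix-up, while a ratio near 1 with a steady nonzero offset may point to a vertical datum difference.

```python
from statistics import median

METERS_PER_FOOT = 0.3048


def diagnose_heights(dem_heights, control_heights, rel_tol=0.02):
    """Compare sampled DEM heights with control heights.

    Returns (median ratio, median offset, verdict string).
    """
    pairs = [(d, c) for d, c in zip(dem_heights, control_heights) if c != 0]
    if not pairs:
        raise ValueError("need at least one control point with a nonzero height")

    ratio = median(d / c for d, c in pairs)
    offset = median(d - c for d, c in pairs)

    feet_per_meter = 1 / METERS_PER_FOOT
    if abs(ratio - feet_per_meter) <= rel_tol * feet_per_meter:
        verdict = "dem_looks_like_feet"
    elif abs(ratio - METERS_PER_FOOT) <= rel_tol * METERS_PER_FOOT:
        verdict = "dem_looks_like_meters_control_in_feet"
    else:
        verdict = "units_look_consistent"
    return ratio, offset, verdict


# Illustrative control heights in meters, and a DEM mistakenly stored in feet.
control = [100.0, 250.0, 400.0]
dem_in_feet = [c / METERS_PER_FOOT for c in control]
ratio, offset, verdict = diagnose_heights(dem_in_feet, control)
```

The ratio test is only meaningful where control heights are well away from zero; near sea level, the offset is the more reliable indicator.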

Do I need to create seamlines?

In a source mosaic dataset, all the data sources are from the same collection, so in most cases no seamline blending is required. If you need to define better blending between different datasets within a source mosaic dataset, you can use the process described for derived mosaic datasets.

Do I need to create overviews?

Ideally, you will use existing lower-resolution data (for example, contiguous datasets such as GMTED and SRTM) as low-resolution views in derived mosaic datasets. If this is the case, you will create separate source mosaic datasets for each lower-resolution dataset.

Many elevation datasets will also have pyramids generated, in which case you don't need to generate overviews for a source mosaic dataset. Typically, higher-resolution elevation data should not be used at very small scales, so omitting overviews has the advantage of automatically turning that data off at small scales.

However, for datasets that are cut into edge-joined tiles, you may want to build overviews. When the original data source has been clipped into edge-joined tiles and the columns/rows are not a power of two, there can be issues with pyramids that result in gaps at some smaller scales. In such cases, it is helpful to generate overviews, even if pyramids exist. Overviews also enable users to use a WHERE clause to display data with a specific Dataset_ID at all scales.

When creating overviews for elevation data, use the following guidelines:

  • Generate overviews with a sampling factor of 3 with nearest neighbor sampling so that no pixel shifts occur.
  • When overviews have been created for a source mosaic dataset, it is necessary to repopulate suitable metadata (for example, Dataset_ID) for the newly created datasets. Not all metadata fields may be appropriate for the overviews.
  • Determine at which scales the data from a source mosaic dataset is no longer valid and ensure that the MaxPS values for all records are below this value. The following approximations for conversion between scale and pixel size can be used for this:
    • PixelSize in m = Scale factor * 0.0254/96
    • PixelSize in dd = Scale factor * 0.0254/96/111111
    • PixelSize in ft = Scale factor * 0.0254/96/0.3048
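
The approximations above can be written as small helper functions. This is a minimal sketch: the function names are hypothetical, the constants come straight from the formulas (96 dots per inch, 0.0254 meters per inch, roughly 111111 meters per degree, 0.3048 meters per foot), and `max_ps_within_limit` is an illustrative check that assumes MaxPS values expressed in meters.

```python
METERS_PER_INCH = 0.0254
SCREEN_DPI = 96             # display resolution assumed by the approximations
METERS_PER_DEGREE = 111111  # rough length of one degree
METERS_PER_FOOT = 0.3048


def pixel_size_m(scale_factor):
    """Approximate pixel size in meters for a map scale of 1:scale_factor."""
    return scale_factor * METERS_PER_INCH / SCREEN_DPI


def pixel_size_dd(scale_factor):
    """Approximate pixel size in decimal degrees."""
    return pixel_size_m(scale_factor) / METERS_PER_DEGREE


def pixel_size_ft(scale_factor):
    """Approximate pixel size in feet."""
    return pixel_size_m(scale_factor) / METERS_PER_FOOT


def max_ps_within_limit(max_ps_values, limit_scale_factor):
    """True if every MaxPS value (in meters) is below the pixel size at the limit scale."""
    limit = pixel_size_m(limit_scale_factor)
    return all(value < limit for value in max_ps_values)


# A scale of 1:100,000 corresponds to a pixel size of about 26.46 m.
limit_ps = pixel_size_m(100000)
```

For example, if data from a source mosaic dataset is no longer valid beyond 1:100,000, its MaxPS values in meters should stay below about 26.46.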

Derived mosaic datasets

If you are working with multiple data collections (i.e., more than one source mosaic dataset), you may want to merge your source mosaic datasets into a single derived mosaic dataset to streamline data management. With all data collections managed in a single derived mosaic dataset, user applications can easily search the attribute table to access metadata, and then select data by date or any other attribute.

How should I set up my mosaic dataset?

The Derived Mosaic Dataset workflow tool, along with the appropriate XML configuration file, automatically sets the appropriate mosaic dataset properties (or prompts the user to do so). The recommended settings are listed here (any properties not mentioned should use the default setting). To set these properties manually, run the Set Mosaic Dataset Properties tool.

When creating your derived mosaic dataset, define it according to the following guidelines:

  • Define the number of bands and pixel depth for the derived mosaic dataset based on the maximum number of spectral bands, and the maximum pixel depth, of any source dataset.
  • Set the spatial reference for the derived mosaic dataset to accommodate the maximum extents expected for your project. Web Mercator Auxiliary Sphere is typically used.

Parameter | Setting

Data type

1-band

Pixel type

32_BIT_FLOAT

Note:

If this isn't set and the source is, for example, SRTM, which is 16-bit integers, then computations will be performed as integers. This will create artifacts in products such as hillshade and slope.

Spatial Reference

The spatial reference for the derived mosaic dataset must cover the full extent of all project data, current and future. For compatibility with web maps, WGS 1984 Web Mercator (Auxiliary Sphere) is generally recommended. Additionally, the computation of aspect is not affected by the Web Mercator projection, and ArcGIS includes a specific adjustment for the latitude-based scale factor in Web Mercator to ensure slopes are computed correctly at higher latitudes. However, Web Mercator is not suitable for users working primarily in polar latitudes, in which case a more suitable projection should be used.

Note:

When a client application makes a request to an image service, the client application can define the projection to be used and the system will transform the pixels to the required projection prior to performing any functions. This enables applications to make requests that are pixel-aligned with the source and not perform any reprojection. Additionally, when working in maps that use a projection such as UTM, the computations will be performed in the appropriate units.

For the derived mosaic dataset, the recommended properties settings are:

Property | Setting

(Transmission compression setting) Default method

LZ77 (Or JPEG for mosaic datasets used for visualizations)

(Transmission compression setting) Recommended quality

(For JPEG only) 80

Default resampling method

Bilinear Interpolation

Allowed mosaic methods

Lock Raster, By Attribute, and None (disable other methods, which were designed for optical imagery and aren't applicable for elevation data)

Default mosaic method

By Attribute

Order Field

Best

Maximum size of requests

The recommended maximum is 4000 x 4000 to restrict users from capturing a local copy of your full data collection. Default is 15000 x 4100.

Note:

This will restrict users from downloading larger datasets. However, you may want to set smaller or larger limits.

Download

0

Note:

This will disable download, which is recommended unless your users require the ability to download the original source data, and your network bandwidth is adequate to support large data transfers.

Blend Width

0

Note:

If blending is turned on, it will only be applied to seamlines that have explicit blend widths.

Always Clip the Raster to its Footprint

No

Note:

Footprints for elevation data are typically only approximate, so clipping to them could remove valid elevation data. However, sometimes when working with LIDAR or bathymetry data, you may want to use footprints to explicitly clip out pixels outside the footprints. In that case, this should be set to Yes.

Footprints May Contain NoData

Yes

Note:

The system will look for other datasets underneath NoData values.

Always Clip the Mosaic Dataset to its Boundary

No

Statistics

Don't calculate.

Note:

Nominal values for the statistics covering the entire project area should be stored in a configuration file and imported into the derived mosaic dataset using the Set Raster Properties function.

(Raster Information) Source Type

Elevation

How do I add my source mosaic datasets to my derived mosaic dataset?

All data added to the derived mosaic dataset should be contained within source mosaic datasets using the parameters below (no individual rasters should be added to the derived mosaic dataset).

Parameter | Setting

Raster type

TABLE

Calculate Cell Sizes

Off

Update Boundary

On

Note:

Any new source mosaic datasets outside the current boundary will update the boundary, usually approximated to an envelope

How should I populate metadata?

After adding the source mosaics, ensure Dataset_ID and other key metadata fields have been copied into the derived mosaic dataset for all records.

How should I deal with footprints and NoData values in my imagery?

When the source mosaic datasets are combined in the derived mosaic dataset, any NoData pixels will be filled with valid data from a lower priority dataset (typically lower resolution).

Note:

This assumes that (1) NoData values in the source mosaic datasets have been dealt with according to the guidelines and (2) Footprints May Contain NoData has been set to Yes in the derived mosaic dataset properties.

This feature can also be used to create different surface types.

Desired Outcome | Methodology

DTMs that show 0 for oceans

If the derived mosaic dataset should only represent land areas (no bathymetric data included), oceans should be represented with an elevation value of 0. Typically, datasets such as SRTM have 0 for sea areas. If you're using datasets that represent oceans as NoData, it is best to add a small raster (called ZeroElevation, for example) to the derived mosaic dataset that covers the complete data extent with only 0 values. Give it a low priority (i.e. a large Best value), so NoData values such as sea will always default to 0. Areas that are not sea would typically be covered by datasets such as SRTM.

DTMs that show NoData for oceans (or other NoData areas)

In this case, do not include a ZeroElevation raster. For datasets such as SRTM, set 0 to be NoData by adding a mask function to each raster.
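
The two outcomes above can be illustrated with a conceptual sketch of how a single pixel is resolved. This is not how ArcGIS is implemented; it only models the documented behavior that NoData in a higher-priority raster lets the value from the next raster show through. `NODATA`, `first_valid`, and `mask_zero` are hypothetical names, and the pixel values are made up.

```python
NODATA = None  # stand-in for a raster's NoData value in this sketch


def first_valid(values_by_priority):
    """Return the first value that is not NoData, scanning from highest to lowest priority."""
    for value in values_by_priority:
        if value is not NODATA:
            return value
    return NODATA


def mask_zero(value):
    """Treat 0 as NoData, mimicking a mask function applied to a dataset such as SRTM."""
    return NODATA if value == 0 else value


ZERO_ELEVATION = 0.0  # value supplied by the lowest-priority ZeroElevation raster

# Outcome 1: oceans show 0 because the ZeroElevation raster sits at the bottom.
land_pixel = first_valid([NODATA, 152.4, ZERO_ELEVATION])
ocean_pixel = first_valid([NODATA, NODATA, ZERO_ELEVATION])

# Outcome 2: no ZeroElevation raster, and sea-level zeros are masked to NoData.
masked_ocean_pixel = first_valid([mask_zero(v) for v in [NODATA, 0]])
```

Note that masking every 0 as NoData also removes any land pixel whose elevation is exactly 0, which may or may not be acceptable for a given dataset.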

Do I need to create seamlines?

Generally no, unless you want edges blended.

Mosaic datasets of elevation data typically contain sources of multiple resolutions, and a Best field is used to order the dataset. The question, then, is what happens at the edges between datasets. By default, ArcGIS merges the datasets with no blend so that the best pixel is on top. This can result in artifacts in the form of jumps appearing along such edges. In many cases, such artifacts are not incorrect and should be left as-is.

There are cases where you may want to have the edges blended, and this is where seamlines can be used. The ArcGIS functions for seamline generation are written primarily for blending optical imagery and are not really applicable for elevation data. For elevation data, higher-resolution (or accuracy) data should be used to the fullest extent possible, with a blend to remove only sudden changes. To do this, create a seamline that falls along the extent of the higher-resolution dataset, then define an inside blend over a specified distance or number of pixels. This process involves the following:

  1. For the higher-resolution rasters you want to blend, edit the footprints to make them fit closer with the NoData extent. Use the Build Footprint tools and increase the number of allowed vertices.
    Note:

    For highly irregular extents, it may be better to shrink the footprints a bit to ensure there are no gaps of NoData.

  2. Run the Generate Seamlines function (using Copy Footprint) to copy the footprint geometry into the seamline.
    Note:

    To reset the footprints, re-run the Build Footprint tools using previous parameters, or use Import Footprints using a backup of the mosaic dataset before the footprints were refined.

  3. Edit the Seamline geometry attribute table:

    Blend Width: an appropriate blend distance in the units of the source mosaic dataset (for example, 5 to 10 pixels)

    Blend Type: 2

The seamline blending will be seen when the following parameters are set:

Parameter | Setting

Mosaic method

Seamline

Mosaic operator

Blend

Blend distance (in the Mosaic Dataset properties)

0 (ensures that blending does not take place for rasters that do not have seamlines)

Note:

This method does not work when the higher-resolution datasets have been tiled, since blending will also take place along the tile edges. To address this, merge the tiles.

Do I need to create overviews?

Generally, overviews are not required. A source mosaic dataset that includes lower-resolution data covering the entire area is often used instead. Typically, overviews are created for these source mosaic datasets and included in the derived mosaic dataset, removing the requirement for overview generation.

If such source overviews do not exist, then you can build overviews for the derived mosaic datasets following the source mosaic dataset guidelines.

Note:

Examples of low-resolution data that can be used include the following:

  • SRTM: resolution of 3 arcseconds or approximately 90 meters
  • GMTED: resolution of 7.5 arcseconds or approximately 250 meters

How can I use metadata to determine the default display order?

In the metadata, the Best value used for ordering elevation datasets needs to be computed. Example formulas for computing the Best value include the following:

Best value formula | Explanation

Best = 10*[LowPS]

Simplest option; higher-resolution data appears on top of lower-resolution data

Note:

Depending on the types and validity of the metadata you have, you could also use [CE90] instead of [LowPS].

Best = 10*[LowPS] - ([YEAR]-2000)/100

Ensures that elevation data that is later is displayed on top

There is a special case in which the type of source may also be included (for example, the creation of services that represent DSMs, which may not be available for all areas and all scales). For such areas, there are two options:

Option 1: Define these areas as NoData. Representations in these areas will be blank.

Method 1: Exclude DTMs from the creation of the DSM-derived mosaic datasets.

Option 2: Approximate the DSM using the DTM. Representations such as slope and aspect will appear the same as the DTM where there is no DSM, and the surface height (computed as DSM minus DTM) will return 0.

Method 2: Include the DTM in the DSM, but change the Best computation to include the DEM_Type.

Best value formula | Explanation

Best = 10*[LowPS] + [DEM_Type]

Ensures that in the case of a data collection with both DSM and DEM datasets at the same resolution, the DSM will have a lower value, and thus higher priority, than the DEM.
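
The three example formulas can be compared side by side with a short sketch. The function names are hypothetical, and the second formula is read as 10*LowPS minus (YEAR - 2000)/100. DEM_Type uses the codes listed earlier (DSM = 1, DEM = 2). With a base value of 0, a smaller Best value means a higher priority.

```python
def best_by_resolution(low_ps):
    """Best = 10*[LowPS]: higher-resolution data gets a smaller Best value."""
    return 10 * low_ps


def best_by_resolution_and_year(low_ps, year):
    """Best = 10*[LowPS] - ([YEAR]-2000)/100: later data wins at equal resolution."""
    return 10 * low_ps - (year - 2000) / 100


def best_by_resolution_and_type(low_ps, dem_type):
    """Best = 10*[LowPS] + [DEM_Type]: a DSM (1) outranks a DEM (2) at equal resolution."""
    return 10 * low_ps + dem_type


# Two 10 m collections from different years: the 2015 data gets the smaller value.
older = best_by_resolution_and_year(10, 2005)
newer = best_by_resolution_and_year(10, 2015)

# A 1 m DSM and a 1 m DEM: the DSM gets the smaller value.
dsm_best = best_by_resolution_and_type(1, 1)
dem_best = best_by_resolution_and_type(1, 2)
```

Because the year term is divided by 100, it only acts as a tie-breaker between datasets of similar resolution; it does not let newer coarse data outrank older fine data.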

What should I do if I have multiple elevation surfaces?

The fundamental layer for an elevation service is usually a bare earth digital elevation model (DEM). However, sometimes your data management system needs to support other surface models, such as a first return digital surface model (DSM) (including tree canopy, buildings, and so on) or a hydrologically enforced DEM. In that case, the design for your mosaic datasets is somewhat more complicated.

Generally, you should manage fundamentally different surfaces in separate derived mosaic datasets. You might create, for example, DEM for bare earth, DSM for top surface, hydro DEM, and bathymetry. After the multiple derived mosaic datasets have been created, they may reference each other if composite products are desired (for example, calculation of SurfaceHeight = DSM minus DTM).

Managing areas without DSM data. If your system includes a derived mosaic dataset representing the DSM, areas lacking unique DSM data may be managed in two ways:

  • To eliminate NoData areas in the DSM, include all records from the DTM. Do this by ingesting the DTM into the DSM using the TABLE raster type to fill in DSM NoData areas with elevation from the underlying DTM. Use the DEM_Type field to identify DSM versus DTM records.
  • Alternatively, if usage of the DSM explicitly accommodates NoData areas, the DSM can include only pixels with unique DSM data, with large areas of NoData. In this case, the DSM NoData areas should be bounded by footprints, and where NoData holes are required, fill with a specific pixel value (for example, -9999). Use LZW compression to minimize data storage.
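
The effect of the first option on a composite product such as SurfaceHeight = DSM minus DTM can be sketched on a single row of pixels. This is a conceptual illustration only: `fill_dsm` and `surface_height` are hypothetical names, `None` stands in for NoData, and the values are made up.

```python
def fill_dsm(dsm_row, dtm_row):
    """Use the DTM value wherever the DSM has NoData (None)."""
    return [d if d is not None else t for d, t in zip(dsm_row, dtm_row)]


def surface_height(dsm_row, dtm_row):
    """SurfaceHeight = DSM minus DTM; NoData in either input stays NoData."""
    return [
        None if d is None or t is None else d - t
        for d, t in zip(dsm_row, dtm_row)
    ]


dsm = [12.0, None, 30.5]   # the middle pixel has no DSM coverage
dtm = [10.0, 20.0, 25.0]

filled = fill_dsm(dsm, dtm)
heights_filled = surface_height(filled, dtm)  # 0 where the DTM filled the gap
heights_sparse = surface_height(dsm, dtm)     # NoData where the DSM is missing
```

With the filled DSM, the gap reports a surface height of 0; with the sparse DSM, the same pixel stays NoData, so downstream applications need to handle it explicitly.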

Calculating ellipsoidal heights. If you need to calculate ellipsoidal heights (for example, for orthorectification of satellite imagery), you can use a simple arithmetic function without creating a new mosaic dataset. Methods for accomplishing this include the following:

  • Create a referenced mosaic dataset based on the DTM, then insert an arithmetic function and add the appropriate geoids.
  • Include the arithmetic function as a server raster function (*.rft.xml file) applied on the DTM image service.

Refer to "Referenced mosaic datasets" below for a discussion of advantages of these two different methods.

If only one global geoid is required to support ellipsoidal height (EGM2008), it may be simplest to use this as a single worldwide image. However, if the user organization requires the use of multiple geoids, a source mosaic dataset is recommended to manage different geoid corrections in different regions.
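
The arithmetic itself is a per-pixel addition. The sketch below assumes that the DTM stores orthometric heights (relative to the geoid) and that the geoid raster stores the geoid height above the ellipsoid in the same units, so that ellipsoidal height = DTM + geoid. The function name and sample values are hypothetical, and `None` stands in for NoData.

```python
def ellipsoidal_height(orthometric_height, geoid_height):
    """Add the geoid height to an orthometric height; NoData (None) propagates."""
    if orthometric_height is None or geoid_height is None:
        return None
    return orthometric_height + geoid_height


# Illustrative row of DTM pixels and the matching geoid heights, both in meters.
dtm_row = [120.0, 95.5, None]
geoid_row = [-30.0, -30.5, -31.0]

ellipsoidal_row = [ellipsoidal_height(h, n) for h, n in zip(dtm_row, geoid_row)]
```

In a mosaic dataset, the same addition would be expressed with an arithmetic function rather than in script code; the sketch only shows what that function computes.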

Representing large water bodies. Different methods are available for representing large water bodies. Oceans and seas can be represented as NoData, as elevation=0, or with bathymetric data. The proper choice will depend on the applications the data must support.

For most applications, it is best to represent any sea level regions with 0. If oceans are defined as NoData, then some processes (for example, orthorectification) that access NoData areas will fail. If datasets exist with oceans represented with NoData, a simple method to fill in with 0 values is to add a very low-resolution dummy image with height = 0 everywhere and extents to cover the whole earth, then set this dummy image as lowest priority for display. If any areas of NoData remain, 0 will appear from the dummy image.

Note:

If the system will include bathymetric data, elevation values in the oceans will be negative, but visualizations of the ocean floor can be generated.

Referenced mosaic datasets

Users often want a variety of representations of elevation data. With mosaic datasets, you can perform on-the-fly processing to produce these various products.

Elevation-based services for quantitative values that can be generated on-the-fly include the following:

Service | Description

Slope

Slope of the terrain, in degrees or percent rise

Aspect

Orientation of the terrain, from 0 to 360 degrees

Ellipsoidal Height

Calculated by adding the elevation-derived mosaic dataset to an appropriate geoid to shift elevation values to ellipsoidal height. This is useful for orthorectification of satellite imagery.

For common visualizations, these additional services are recommended:

Service | Description

Hillshade

Single-band grayscale, which may be JPEG compressed, so this view requires the least bandwidth to deliver to the user

Elevation-tinted Hillshade

Hillshade with color map applied

Slope Map

Slope with color map applied

Aspect Map

Aspect with color map applied

Sample raster function templates (rft.xml files) are included in the …/Elevation/Parameter/RasterFunctionTemplates/ directory in the downloadable workflow files.

Services using on-the-fly processing can be achieved two ways:

  1. Referenced mosaic datasets. A referenced mosaic dataset is a new mosaic dataset that references an existing mosaic dataset but redefines properties and/or includes additional functions. When published, each has its own service endpoint.

    Benefits: When added directly to an application, these services already apply the required functions, so they can also be served as WMS and WCS.

    Drawbacks: Each published referenced mosaic dataset takes up resources on the server.

  2. Server raster functions. Defined as .rft.xml files, these functions are applied on demand.

    Benefits: These are accessible to ArcGIS Desktop and web applications as selectable functions. They don't take up additional server resources, and additional functions can be added without hosting additional image services.

Maintaining your managed elevation data

What should I do if an existing data collection is updated?

  • If an existing data collection is updated (for example, USGS releases a new version of the NED 1/3 arcsecond data) and the new source data files exactly match the existing file names, you can simply replace the prior version with the new source files. Since the mosaic dataset references files on disk, data content will be read from the new files.

  • If a new data collection is added to the system, do as follows:
    1. Create a new source mosaic dataset as defined above.
    2. Complete any appropriate quality control (QC) procedures.
    3. Do one of the following:

      Add the new source mosaic dataset into an existing derived mosaic dataset using the Table raster type.

      If automated processing scripts have been implemented, add process commands for the new source mosaic dataset to the automated build process, and rebuild the entire derived mosaic dataset.

    4. If a new data collection is added to the system that is outside the current boundary of any existing datasets, run the Build Boundary tool.
    5. Overviews may need to be rebuilt for the source mosaic dataset. In general, no other overviews will need to be rebuilt.
Note:

Referenced mosaic datasets are automatically updated, so they will not need to be rebuilt.

Check out the next section to learn more about publishing your managed elevation data.