Configure a mosaic dataset

The mosaic dataset is the recommended data model for managing imagery and raster data in ArcGIS.

The recommended workflow and configuration for a mosaic dataset is dependent on the type of data. This section provides an overview of the general recommendations, but you may find other recommendations for specific types of data. The ArcGIS Imagery Workflows group on ArcGIS Online has scripts customized for specific types of data that automate the processes and require only a limited set of variables to be defined.

When managing multiple collections of imagery, it is recommended that you use source, derived, and referenced mosaic datasets—more detail on this model for image management is provided in the following section.

The following are typical workflow steps. They are discussed further below.

  1. Create a geodatabase.
  2. Create a new mosaic dataset.
  3. Add the raster data.
  4. Define additional metadata.
  5. Refine the mosaic dataset properties.
  6. Compute cell sizes.
  7. Refine geometry.
  8. Refine footprints and NoData.
  9. Refine radiometry.
  10. Generate seamlines.
  11. Generate overviews.
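
The first three steps can be scripted. The following is a minimal sketch, not a definitive implementation: it describes the steps as plain data and only calls ArcGIS when run_plan is invoked on a machine with arcpy installed. The folder, geodatabase, and mosaic dataset names are hypothetical placeholders, and the keyword names are assumed to match the Create File Geodatabase, Create Mosaic Dataset, and Add Rasters To Mosaic Dataset tools in arcpy.management.

```python
import os


def build_plan(folder, gdb_name, mosaic_name, wkid, image_folder):
    """Return the first three workflow steps as (tool name, keyword arguments)."""
    gdb_path = os.path.join(folder, gdb_name)
    mosaic_path = os.path.join(gdb_path, mosaic_name)
    return [
        ("CreateFileGDB", {"out_folder_path": folder, "out_name": gdb_name}),
        ("CreateMosaicDataset", {
            "in_workspace": gdb_path,
            "in_mosaicdataset_name": mosaic_name,
            "coordinate_system": wkid,       # converted to a SpatialReference in run_plan
            "num_bands": 3,                  # assumption: 3-band imagery
            "pixel_type": "8_BIT_UNSIGNED",  # assumption: 8-bit imagery
        }),
        ("AddRastersToMosaicDataset", {
            "in_mosaic_dataset": mosaic_path,
            "raster_type": "Raster Dataset",
            "input_path": image_folder,
        }),
    ]


def run_plan(plan):
    """Execute the plan. Requires an ArcGIS installation that provides arcpy."""
    import arcpy  # imported lazily so the plan can be built and reviewed without ArcGIS

    for tool_name, kwargs in plan:
        kwargs = dict(kwargs)
        if "coordinate_system" in kwargs:
            kwargs["coordinate_system"] = arcpy.SpatialReference(kwargs["coordinate_system"])
        getattr(arcpy.management, tool_name)(**kwargs)


# Hypothetical paths, for illustration only.
plan = build_plan("C:/data", "imagery.gdb", "ortho", 3857, "C:/data/tiles")
```

Keeping the plan as data makes it easy to review or log the parameters before anything is written to disk.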

Create a geodatabase

Mosaic datasets are stored in a geodatabase. In most cases, file geodatabases are used because scaling is simple. Geodatabases can be created by right-clicking a folder in the Catalog pane and selecting New File Geodatabase, or using the Create File Geodatabase or Create Enterprise Geodatabase tool. For file geodatabases, only the name and location are required.

For enterprise geodatabases, you must create the database connection and provide the server name and connection details. Learn more about creating an enterprise geodatabase.

Create a mosaic dataset

To create a mosaic dataset, you can either right-click a geodatabase in the Catalog pane, or use the Create Mosaic Dataset tool. Specify the spatial reference system, number of bands, and bit depth that are appropriate for the data being added.

If all the source data will be in a single spatial reference system, that is the system used for the mosaic dataset. However, the data in a mosaic dataset can be from multiple spatial references and in many cases, it is recommended that you create the mosaic dataset in the spatial reference system that will encompass all the data that could be added.

For global datasets, the Web Mercator Auxiliary Sphere projection is often used because it provides near-global coverage (excluding the poles) and results in pixel sizes that are in meters and locally Cartesian (X scale = Y scale). This is often easier to understand and more appropriate than the decimal degrees used in the geographic coordinate system WGS84, which is not Cartesian. Note that a meter in Web Mercator is not a real meter: the ground distance it represents decreases as one moves toward the poles. For example, in Anchorage (about 61 degrees north), a meter in Web Mercator represents only about 50 centimeters on the ground. For mosaic datasets, the choice of spatial reference is often not critical, as it is used only for defining the footprints. All transforms are applied from the spatial reference that the pixels are stored in to the spatial reference system of the client application. Having a mosaic dataset in a different spatial reference system from the source does not result in additional resampling and has a negligible effect on performance.
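
To illustrate how a Web Mercator meter shrinks on the ground with latitude, the following sketch uses the spherical approximation that the ground distance is the map distance multiplied by the cosine of the latitude. The function name and the latitude of roughly 61.2 degrees north for Anchorage are assumptions made for the example.

```python
import math


def ground_meters_per_map_meter(latitude_deg):
    """Approximate ground length of one Web Mercator meter (spherical approximation)."""
    return math.cos(math.radians(latitude_deg))


equator = ground_meters_per_map_meter(0.0)      # 1.0: map and ground meters agree
anchorage = ground_meters_per_map_meter(61.2)   # a little under 0.5
```

At 60 degrees latitude the factor is exactly one half, so a 1 meter pixel in Web Mercator covers about 0.5 meters on the ground.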

Where possible, ensure that the curvature of the projection is similar to the source. For example, UTM and Web Mercator are both Mercator-based projections (transverse Mercator and Mercator, respectively), so the change in curvature is small. If there is a change in curvature (for example, to Lambert Conformal), additional vertices may be added to the footprints so that they represent the extent accurately. If too many vertices (more than 1,000) are added, this can increase the time required to clip the image. It is recommended that you reduce the number where possible.

If the number of bands and bit depth are not defined, they are taken from the first dataset that is added to the mosaic dataset. In data-specific workflow scripts, the bands and bit depth are defined when the mosaic dataset is created to ensure that they are set correctly.

The product definition customizes the mosaic dataset to define specific bands and wavelengths, controls how it displays by default, and aids in some processing. This is more important for multispectral imagery in which correctly defining the bands simplifies the integration of other analysis and display tools later.

Learn more about creating a mosaic dataset.

Add raster data

The Add Rasters To Mosaic Dataset tool is used to add rasters. This tool prompts the users to set the appropriate raster type and suitable parameters. A raster type can be considered a crawler that reads through a source to locate all rasters of a specific type and then extracts available meta information to create the appropriate records in the mosaic dataset. If the source is a georeferenced raster dataset, the Raster Dataset raster type can be used. There are many raster types to account for all common sensors, and these can also be extended using custom Python raster types.

Most of the raster types allow for additional process functions to be added. These functions may define stretches to enhance the imagery, or define parameters for orthorectifying satellite or aerial imagery. The outcome of this step is a mosaic dataset that allows you to start visualizing and analyzing the data. The accuracy will be dependent on the accuracy of the parameters and functions being applied.

In some cases, the georeferencing is only approximately defined. For example, imagery may be orthorectified initially using approximate navigation-grade coordinates from an aircraft but be refined at a later stage when the GPS/IMU data is fully processed.

Learn more about adding rasters to a mosaic dataset.

Define additional metadata

Metadata not only provides additional information about the rasters but also is used in queries and to define the order for image display.

Typically, metadata is ingested as part of the raster type when rasters are added. In some cases, metadata cannot be extracted from the rasters or STAC items, so it is added to the mosaic dataset as additional fields in the attribute table. You can do this by opening the table, adding new fields, and using the Field Calculator to set the values, or you can use the Add Field tool and Calculate Field tool to set the values. Alternatively, if metadata is available in other tables, the tables can be joined to the mosaic dataset and the fields copied across. In the data-specific workflow scripts, methods help you create and set such metadata values.

Additionally, if auxiliary metadata is stored as part of the aux.xml file associated with a raster, the raster types will extract and add this data if a field exists in the footprint table that has the same name as the metadata. If this is the case, add a field with the appropriate name prior to running Add Rasters. Many of the raster types will also identify if there are STAC JSONs associated with the files and ingest metadata from them.

Refine the mosaic dataset properties

Mosaic datasets have a large range of properties that must be set, depending on the source data and expected usage.

Allowed Compression Methods

After the image has been mosaicked from the different sources, the resulting image is then optionally compressed before being transmitted to the client application. When the client is a web browser, this defaults to JPEG compression if there are no NoData areas, else the default is PNG compression. When the client is ArcGIS Desktop, this parameter controls both the allowed compression methods and the default. By default, this is set to None, meaning that the data will not be compressed. If the mosaic dataset will be published, it is recommended that you set the compression to JPEG Quality 80 for most imagery. If the mosaic dataset is more than 8-bit, you can use LZ77.

Maximum Number of Rasters per Mosaic

This defines the maximum number of rasters that can contribute to the output image. The default value is 20, meaning that no more than 20 rasters will contribute to a single request. Twenty is generally suitable, but if you're using small images, such as browse images or small tiles, a larger value may be required, especially if making larger requests to the services. In many cases, a maximum of 20 rasters is sufficient for a screen full of imagery, but if a user makes a large export, the number of rasters required may be larger.

If the value is too low, there may be gaps in the output image. This is one reason that exported or cached imagery may differ from what is seen when viewing the imagery (for example, missing images). If the value is too high, some requests may take a long time when there are very large numbers of slightly overlapping images. This only matters if you're concerned about requests slowing your server or being slow to return to the user.

Allowed and Default Mosaic Methods

The mosaic method is the primary method of controlling the ordering of overlapping imagery. Set this appropriately to ensure that users connecting to services see the correct imagery and that they are allowed to modify the mosaic method to suit their needs.

The most common mosaic method for workflows is the By Attribute method, which enables a metadata field to be defined to sort the data and a base value that defines the highest priority value. A common default requirement is to see the latest imagery on top. This can be done, for example, by setting the By Attribute field to a date field (such as AcquisitionDate) and a base value to a date far in the future (such as 1-1-2050).
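
The ordering idea behind By Attribute can be sketched in plain Python: items are ranked by how close their attribute value is to the base value. This is an illustration of the concept only, not how ArcGIS implements it, and it ignores the other ordering rules such as pixel size. The item names and dates are made up.

```python
from datetime import date


def order_by_attribute(items, field, base_value):
    """Sort items so that the value closest to base_value comes first."""
    return sorted(items, key=lambda item: abs(item[field] - base_value))


# Hypothetical footprint records with an AcquisitionDate attribute.
items = [
    {"Name": "scene_a", "AcquisitionDate": date(2019, 6, 1)},
    {"Name": "scene_b", "AcquisitionDate": date(2023, 8, 15)},
    {"Name": "scene_c", "AcquisitionDate": date(2021, 3, 10)},
]

# A base value far in the future puts the latest imagery on top.
ordered = order_by_attribute(items, "AcquisitionDate", date(2050, 1, 1))
names = [item["Name"] for item in ordered]
```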

In some cases, the optimum ordering cannot be defined by a single field. For example, it may be better to reduce the display priority of a scene with high cloud content or low nadir angles, so some workflows add a Best field and compute the Best value using a mathematical operation based on a number of different attributes.
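
As a sketch of how such a Best value might be computed, the following combines cloud cover and off-nadir angle into a single score where lower is better. The formula, the weights, and the attribute names are hypothetical and chosen only for illustration; the data-specific workflows define their own expressions.

```python
def best_value(cloud_cover_pct, off_nadir_deg, cloud_weight=1.0, nadir_weight=2.0):
    """Hypothetical score: lower values mean a scene should be displayed with higher priority."""
    return cloud_weight * cloud_cover_pct + nadir_weight * off_nadir_deg


clear_scene = best_value(cloud_cover_pct=5, off_nadir_deg=3)     # low score
cloudy_scene = best_value(cloud_cover_pct=60, off_nadir_deg=3)   # high score
```

With a score like this, setting the By Attribute field to Best and the base value to 0 would place the clearest, most vertical scenes on top.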

If the By Attribute method is allowed, users may set the order to any of the Allowed fields. It is therefore important to review what allowed fields are visible. Additionally, the system will need to order records based on the selected attributes. For optimization, it is advantageous to index the default By Attribute field and any fields that users are likely to use to order the imagery.

Some workflows set different mosaic methods by default. For example, when using aerial imagery, the mosaic method may be set to Closest To Nadir so that the imagery that is most vertical under the camera is displayed by default. Note that there are many rules that control the ordering of the imagery and that methods such as By Attribute and Closest to Center are affected also by the LoPS values. As a result, you may need to set the LoPS value of imagery to a constant to stop the images being ordered based on their pixel size.

See more about other mosaic methods in mosaic dataset properties.

Allowed Fields

This property defines all the fields that are transmitted to the client application and also which fields can be used in the By Attribute mosaic method. By default, all the metadata fields are included. In most cases, some of the base fields, such as MinPS and MaxPS, can be turned off. For mosaic datasets with large numbers of metadata fields, it is recommended that you reduce the number of fields included, as this will optimize the speed of some metadata and table display functionality.

Clipping Rasters

When a request is made by an application for imagery covering a specified extent, the system identifies the possible imagery based on a spatial extent, pixel size, and attribute queries and sorts them based on the Z order, pixel size and mosaic method. The system then attempts to read only those images that are required. The method used depends on the setting of the Always Clip the Raster to its Footprint and Footprints May Contain NoData properties. If rasters contain NoData values, the system needs to read through the appropriate parts of each raster until all pixels required to cover the output extent are accessed. If there are many overlapping rasters containing NoData values, this process can be time-consuming. This can be optimized by turning NoData off and clipping the rasters based on the footprints.

If the Always Clip the Raster to its Footprint property is set to Yes and Footprints May Contain NoData is set to No, the system does an overlay analysis of the footprints to determine which of the rasters need to be accessed. This analysis does not require reading any of the pixels, so it can be very efficient. In many cases, this analysis will result in only a few rasters being accessed. When using imagery that does not contain NoData (or where NoData is clipped), this setting is optimum. Clipping images based on the footprints is not always recommended, for example, when using rasters that are edge-joined. This is because the footprints must be densified during reprojection of the rasters, which can result in either a large number of vertices or potential gap slivers. In this case, it is also recommended that you use Merge Raster. Note that if rasters have sufficient overlap, you can recompute the footprints with a suitable shrink to clip off the outermost pixels and set Always Clip the Raster to its Footprint to Yes.

There can be cases where there is a very large number of overlapping rasters that are partially misaligned, causing sliver images. Sliver images are rasters that only contribute a very small amount to the output image. Such sliver images are considered valid by the system even if they only contribute 1 pixel to the output. To remove sliver images, the system has a Minimum Pixel Contribution parameter, which is 1 by default. This can be set to larger values. For example, a value of 1,000 would mean that images that contribute less than the equivalent of 100x10 pixels will not be included.

Geotransform

When client applications request imagery, they define the required projection. The mosaic dataset knows the projection of the mosaic dataset (used for searching the rasters) as well as the projection of the source rasters. The pixels are transformed with a single projection—from source to destination. The required projection transformations are explicitly defined, but in cases where the source and destination have different datums, different datum transforms may exist.

To ensure that the appropriate datum transform is used, a datum transform table is created within the mosaic dataset. When rasters are added to a mosaic dataset, the appropriate default datum transforms are added so that a suitable datum transform is used. If a specific datum transform should be used, it is necessary to explicitly define it. This can be done by defining the required datum transform in the Geographic Coordinate System Transformations parameter in the mosaic dataset's properties.

Note that when creating a derived mosaic dataset, the datum transform tables from the source mosaic datasets are not copied across, so it is important when using imagery in different datums to review this table and add as necessary.

Cell size

By default, the cell size of a mosaic dataset is set to the smallest pixel size (LowPS) that is found as data is added. The cell size of a mosaic dataset is exposed as the cell size of the image service when a mosaic dataset is published. Some client applications use this value as a base value and assume that imagery can be accessed only at a power of two from this value, and they may not make a request smaller than this value. Similar to the extent of a service, this value cannot be changed without restarting the service, so it is sometimes necessary to explicitly set it to a suitable value.

Source Type

The Source Type property has an effect on how some client applications interpret the pixels being returned. It is recommended that you set it appropriately. Refer to Raster dataset properties.

Statistics

All rasters and mosaic datasets have statistics about the data value of the rasters. These are used by some client applications to automatically enhance the imagery or control parameters, such as categorical display. By default, the system will use the statistics of the contributing rasters or the result of the Calculate Statistics tool. Using this tool with default values can take a long time to run, because all the data in the mosaic dataset may need to be read. Since the statistics are generally an approximation, this process can be sped up by using a suitable skip factor. An approximation of a skip factor can be calculated based on the Width of Mosaic Dataset/(5000*CellSize). For some mosaic datasets, the expected statistics of the dataset are known and can be explicitly set if required using the Set Raster Properties tool. If the data has a well defined set of statistics such as surface reflectance, it is better to set the values instead of allowing the system to compute them.
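
The skip factor rule of thumb above can be written as a small helper. This is a sketch under the assumption that the mosaic dataset width and the cell size are in the same map units; rounding to a whole number and never going below 1 are additions made here, not part of the rule itself.

```python
def estimate_skip_factor(mosaic_width, cell_size, target_columns=5000):
    """Approximate skip factor: width / (5000 * cell size), at least 1."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    return max(1, round(mosaic_width / (target_columns * cell_size)))


# A 100 km wide mosaic dataset with 0.5 m cells.
skip = estimate_skip_factor(mosaic_width=100_000, cell_size=0.5)
```

The resulting value can then be supplied as the X and Y skip factors when computing statistics.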

Processing Templates and Functions

During the Add Raster process, functions to transform the pixels are typically added to the items of the mosaic dataset. For example, a function might be used to convert an image into top-of-atmosphere radiance, or to apply orthorectification.

The mosaic dataset also allows the definition of additional functions, including custom Python raster functions, which will be applied after the images have been mosaicked. If a function is defined here, it will be appended to all data requests. An example use of these service functions would be to append a watermark to all images.

A processing template allows a list (or chain) of raster functions to be defined for the mosaic dataset and image service. Clients to the mosaic datasets and image services can see a list of these processing templates and select the appropriate one, which will then be applied to all requests.

A typical use case for these processing templates might be working with elevation data, where a processing template is applied to define different renderings, like hillshade, slope, and aspect.

See more about managing processing templates.

Compute cell sizes

The four cell sizes (LoPS, HiPS, MinPS, and MaxPS) in a mosaic dataset are used to define the available pixel sizes of a raster and the display scale range. LoPS and HiPS correspond to the LowPS and HighPS fields in the attribute table. LoPS is generally set to the resolution of the pixels on the ground and is computed when the imagery is added.

When you use imagery that is not rectified or from a different source, each image may have a slightly different pixel size. In some workflows, it is recommended that you reset the LoPS value to a fixed value or the average of all similar rasters. A specific example of this is using the ByAttribute or ClosestToCenter mosaic methods. These methods sort the images for display based on an attribute or the image location, but the ordering is overridden by the LoPS value. You may need to reset the LoPS value to a constant for all images that are affected by the ordering rules.

The HiPS value for imagery is used primarily to determine the scales at which overviews should be computed. In most workflows, the pixel size at which overviews are created (if required) is explicitly set. The most appropriate value for HiPS depends on the availability of pyramids and the width of the imagery. If automatically set, images with the same pixel size but different widths may have different HiPS values. Set the HiPS value appropriate to the data source depending on the typical number of pyramid levels that are useful. An approximate value for this would equal image width/1,000.

In the generic Calculate Cell Size Ranges workflow, the MinPS and MaxPS cell sizes are automatically determined and computed when the data is added and can be reset by using the tool.

The Calculate Cell Size Ranges tool computes cell sizes by determining the overlap between images and assumes that images with a larger pixel size should be displayed at the smaller scales. In some workflows, you may need to set the MinPS value for all images to 0 and the MaxPS value dependent on the pyramids available and the scales to which the images should be visible. Therefore, in many of the workflows, the MinPS and MaxPS values are explicitly set.

Some tools, such as Add Rasters, run Calculate Cell Size Ranges on all the imagery by default. Optionally, turn this off when adding additional rasters.

Learn more about calculating cell size ranges in a mosaic dataset.

Refine geometry

Some workflows involve a refinement in the geometric accuracy of the imagery. The Refine geometry section of the workflows defines how this can best be achieved. Typically, when orthorectifying imagery, you can add the imagery using approximate orientation data and elevation models, and refine these values later. Similarly, when adding satellite scenes, it may be necessary to refine the geometry to make it more accurate. The ortho mapping workflows typically implement such geometric refinement. If the geometry of the imagery is refined, it is often necessary to also redefine the footprints.

Refine footprints and NoData

When rasters are added to a mosaic dataset, the footprints are computed based on the available metadata, which is typically the envelope boundary of the rasters. The correct definition of footprints in a mosaic dataset is important, as they are not only used for identifying the extent of the imagery to be read but can also be used to clip out parts of an image that are not to be displayed.

If the geometry of a raster has been refined after it has been added, or if the footprint computed when the raster was added is not suitable, recompute the footprint.

For mosaic datasets created from preprocessed rectangular images, using only the envelope of the raster is sufficient. In such cases, the footprints are only used to enable the system to quickly find the appropriate rasters. If there are NoData values, they can be defined as a property of the raster or a NoData mask can be defined for each raster. When the system identifies NoData pixels covering a request extent, the system will look for the next most suitable raster for additional pixels. As long as there are not too many overlapping rasters, this works well.

When you have many overlapping rasters, the pixel-based NoData can become a bottleneck, and it is recommended that you turn off the NoData pixels and clip rasters by their footprints. Using a footprint to define NoData is more efficient, because the system can quickly perform an analysis of the overlapping footprint geometries to determine what rasters are required. There are properties of a mosaic dataset that define how the system handles NoData and clipping. These are defined in more detail in the Mosaic dataset properties step.

Footprints generally need to be recomputed for images that are not premosaicked into tiles. Such imagery will generally have a border of NoData values within the envelope of the rasters, and the Build Footprints tool can be used to refine the footprints. This tool has different modes. The radiometry mode creates a mask based on ranges of NoData values and generates a contour around the mask. There is a range of parameters that control how such masks are created, and often a very precise footprint cannot be obtained. In such cases, it is common to shrink the footprints a bit to exclude the edge pixels. For most sensors, these edge pixels are of little value and clipping them out is recommended. For satellite, aerial and drone images that are complete frames with no NoData pixels and are orthorectified by the system, the footprint can be computed based on the geometry mode. This performs an image-to-ground transform for the corners and edges of the frames.

Learn more about calculating footprints radiometrically.

Footprints also must be computed for rasters that contain large NoData areas. In some workflows, you may need to manually edit footprints or import footprints from different vector sources, for example, to clip imagery to specified extents such as county boundaries or exclude large expanses of water or clouds.

Similar to footprints, each mosaic dataset has a boundary that defines the extent. By default, the system computes this by performing a union of all the footprints. This can result in very complex boundaries, so workflows may recompute the boundary to be only an extent or simplify it.

Each of the detailed workflows will provide information on how best to refine the footprints and set NoData values.

Learn more about building footprints.

Refine radiometry

Unless the rasters are categorical, elevation, or already enhanced, enhance the radiometry of the imagery to increase the interpretability or make the image more suitable for visual interpretation. Typically, appropriate stretch functions are added as part of a source mosaic dataset, or in some workflows, the parameters for stretching the imagery are computed as part of the source mosaic dataset and applied when creating the derived mosaic dataset. Each workflow will describe the most suitable method for defining such enhancements.

You may need to define statistics to set the stretch parameters. These statistics may have been computed as a preprocess and stored with the rasters. Alternatively, statistics can be computed on the mosaic dataset items. This is typically performed if the item contains functions such as pan-sharpening that would change the statistics (from their original raw data format). Computing statistics on the raster items in the mosaic dataset has the advantage that clipping of the footprints is also applied, which, in some workflows, ensures that non-required border areas are removed from the statistics computations. Most workflows try to avoid computing statistics and setting stretches based on them, as statistics are detrimentally influenced by large objects such as clouds and sea. In many cases, the statistics are set to ensure a suitable dynamic range, and then the DRA (dynamic range adjustment) based stretch function is applied so that the image is enhanced based on the extent of the image being displayed.

For optical imagery, you can set the functions of the process chains such that the DN values represent either top of atmosphere radiance or surface reflectance, which are float values. In some datasets, such as Sentinel and Landsat, these values are stored as 16-bit integers with appropriate scale and offset values defined in the metadata. It is recommended that the raster type apply the appropriate scaling to bring the values back to floats. This also helps in cases where imagery from different sensors is to be merged into a single derived mosaic dataset. Depending on the data source, the data-specific workflow scripts will apply the appropriate scaling.
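
The scaling described above is a linear transform of the stored integer. The following sketch shows the arithmetic only; the default scale of 0.0000275 and offset of -0.2 are the values commonly published for Landsat Collection 2 Level-2 surface reflectance and are used here as an assumed example. Always take the actual values from the metadata of the product being used.

```python
def dn_to_float(dn, scale=0.0000275, offset=-0.2):
    """Convert a stored integer DN to a float value: dn * scale + offset."""
    return dn * scale + offset


reflectance = dn_to_float(10000)  # about 0.075 with the assumed scale and offset
```

Applying the same conversion to every source puts imagery from different sensors on a comparable float scale before it is merged.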

In some optical imagery cases, the expected output is a seamless color-balanced mosaic. In such workflows, color correction is included as part of refining radiometry. Color correction requires statistics to be computed on the mosaic dataset. Statistics on the original rasters are of limited use, as the imagery is often enhanced. If you intend to use color correction, run the Build Pyramids And Statistics tool with the mosaic dataset as the input data and exclude the build pyramids process. In the Statistics options, set the skip factors to 8 to speed up the statistics generation process.

Learn more about statistics in a mosaic dataset.

Generate seamlines

Typically, seamlines are computed for satellite and aerial imagery if you are required to create a seamless mosaic. The processing is based on determining suitable areas for blending overlapping imagery. Note that if the imagery has a different resolution, the seamline generation will group rasters together based on the Cell Size Tolerance Factor value, which is set in the mosaic dataset properties. Prior to generating seamlines, set this parameter and run the Calculate Cell Size Ranges tool. When running this tool, turn off the Compute Minimum Cell Sizes and Compute Maximum Cell Sizes parameters so that the pixel sizes are not recomputed.

The seamline generation uses the full overlapping extent of imagery, and you may need to prioritize imagery or compute the seamlines only for a smaller subset of the overlapping areas. In these cases, the workflows may temporarily reset the footprints to the required extents before seamline generation. For scanned maps, seamlines are typically set to the map data extent by importing the sheet cutlines of the original maps (to remove any map marginalia along the outside of the map).

Learn more about seamlines.

Generate overviews

Overviews are often created to enable faster viewing of mosaic datasets at smaller scales. In some workflows, the overviews are not created and smaller-scale images may be used instead. Although overviews speed up display at smaller scales, they do hinder the use of the mosaic methods at these scales, which limits the user's ability to change the order of the imagery. Therefore, the scale at which they are created must be chosen carefully.

Overviews should be computed after the mosaic dataset's properties are set so that the appropriate default mosaic method rules are used in the overview creation.

By default, ArcGIS determines the pixel size for generating the overviews by identifying the HiPS values of the underlying images. This can result in the pixel sizes varying in different areas. Therefore, most workflows set a specific pixel size at which the overviews should be created. The optimum size to create the overviews is often dependent on the width and size of the source data and if the source data contains pyramids.

If the source data does not contain pyramids, the following rule can be used:

Base pixel size for overviews = Average(LoPS) * 2.5

If the source does contain pyramids (typically recommended), the following rule can be used:

Base pixel size for overviews = Width of typical image / 1500

It is also recommended that the value not be defined to more than one decimal place (for example, use 1.2 versus 1.234567).
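
The two rules above, together with the one-decimal-place recommendation, can be sketched as follows. The function names are hypothetical, and the width of a typical image is assumed to be its ground width in the units of the mosaic dataset's spatial reference.

```python
from statistics import mean


def base_size_without_pyramids(low_ps_values):
    """Average(LoPS) * 2.5, rounded to one decimal place."""
    return round(mean(low_ps_values) * 2.5, 1)


def base_size_with_pyramids(typical_image_width):
    """Width of a typical image / 1500, rounded to one decimal place."""
    return round(typical_image_width / 1500, 1)


no_pyramids = base_size_without_pyramids([0.4, 0.4, 0.4])
with_pyramids = base_size_with_pyramids(18000)
```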

In either case, if the service is to be used in web maps on ArcGIS Online using the standard Web Mercator Auxiliary Sphere basemaps, use one of the following values, depending on the units of the mosaic dataset's spatial reference:

Meters    Decimal Degrees    Feet
1200      0.01024            4000
600       0.00512            2000
300       0.00256            1000
150       0.00128            500
75        6.40E-04           250
38        3.20E-04           125
19        1.60E-04           62.5
9         8.00E-05           31.25
4         4.00E-05           15.5
2         2.00E-05           7.8
1         1.00E-05           3.9
0.5       5.00E-06           1.95
0.25      2.50E-06           0.97
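
One way to pick from the meter, decimal degree, and feet values listed above is to take the value closest to the computed base pixel size. Choosing the nearest value by absolute difference is an assumption made for this sketch; the text only says to use one of the listed values.

```python
# Values copied from the list above, grouped by unit.
OVERVIEW_SIZES = {
    "meters": [1200, 600, 300, 150, 75, 38, 19, 9, 4, 2, 1, 0.5, 0.25],
    "decimal_degrees": [0.01024, 0.00512, 0.00256, 0.00128, 6.40e-04, 3.20e-04,
                        1.60e-04, 8.00e-05, 4.00e-05, 2.00e-05, 1.00e-05,
                        5.00e-06, 2.50e-06],
    "feet": [4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.5, 7.8, 3.9,
             1.95, 0.97],
}


def snap_overview_size(pixel_size, units="meters"):
    """Return the listed value nearest to pixel_size for the given units."""
    return min(OVERVIEW_SIZES[units], key=lambda value: abs(value - pixel_size))


snapped = snap_overview_size(12.0)  # nearest listed meter value to 12.0
```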

Note that instead of creating overviews, you can use existing tile caches or map services. This is only applicable if all the imagery in an image service is 3-band (RGB) 8-bit imagery. In such workflows, no overviews are created for the mosaic dataset. Instead, add a cache by using the raster type. Alternatively, a map service can be added using the Map Service raster type. The LoPS of this cache is then set appropriately so that it is turned off at the larger scales. The Category of the cache is also typically reset to be Overview, so that the tools select this as a primary raster.

Learn more about overviews.