Initial configuration—Imagery Workflows

The mosaic dataset is the optimal data model for managing imagery and raster data in ArcGIS.

The best workflow and configuration for a mosaic dataset is dependent on the type of data. This section provides an overview of the general recommendations, but you may find other recommendations for specific types of data. The ArcGIS Imagery Workflows group on ArcGIS Online has scripts customized for specific types of data that automate the processes and require only a limited set of variables to be defined.

When managing multiple collections of imagery, it may be beneficial to use source, derived, and referenced mosaic datasets—more detail on this model for image management is provided in the following section.

The process of authoring mosaic datasets is the same in ArcGIS Pro and ArcMap. The following are typical steps that are performed (discussed further below):

Create a geodatabase
Create a new mosaic dataset
Add the raster data
Define additional metadata
Refine the mosaic dataset properties
Compute cell sizes
Refine geometry
Refine footprints and NoData
Refine radiometry
Generate seamlines
Generate overviews

Create a geodatabase

Mosaic datasets are stored in a geodatabase. In most cases, file geodatabases are used, which scale well and are simple. Geodatabases can be created by right-clicking on a folder in the Catalog and selecting New File Geodatabase, or using the Create File Geodatabase or Create Enterprise Geodatabase tool. For file geodatabases, only the name and location are required.

For enterprise geodatabases, you'll need to create a database connection and provide the server name and connection details. Learn more about creating an enterprise geodatabase.

Create a new mosaic dataset

To create a new mosaic dataset, you can either right-click a geodatabase in the Catalog or use the Create Mosaic Dataset tool. The spatial reference system, number of bands, and bit depth that are appropriate for the data being added should be specified.

If all the source data will be in a single spatial reference system, then that is often used for the mosaic dataset. However, the data in a mosaic dataset can be from multiple spatial references and in many cases it is more appropriate to create the mosaic dataset in the spatial reference system that will encompass all the data that could be added.

For global datasets, the Web Mercator Auxiliary Sphere projection is often used because it provides near-global (excluding the poles) coverage and results in pixel sizes that are in meters, which is often easier to understand and more appropriate than the decimal degrees used in the geographic projection WGS84. Note that all pixels are transformed directly to the spatial reference system of the client application, so having a mosaic dataset in a different spatial reference system from the source does not result in additional resampling and has a negligible effect on performance.

The curvature of the projection should, where possible, be similar to the source. For example, UTM (a transverse Mercator projection) and Web Mercator are both cylindrical projections, so there is little change in curvature. If there is a change in curvature (for example, to Lambert Conformal), additional vertices may be added to the footprints so that they represent the extent accurately. If too many vertices (more than 1,000) are added, the time required to clip the image can increase, so it is generally advantageous to reduce the number where possible.

If the number of bands and bit depth are not defined, they are taken from the first dataset that is added to the mosaic dataset. In data-specific workflow scripts, the bands and bit depth are defined when the mosaic dataset is created to ensure that they are set correctly.
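As a rough sketch of how a script might pin the band count and bit depth at creation time, the helper below assembles keyword arguments for the Create Mosaic Dataset tool. The geodatabase path, dataset name, and spatial reference code are hypothetical placeholders. arcpy is imported only inside the function that runs the tool, so the argument-building part can be checked without an ArcGIS installation.

```python
def mosaic_dataset_args(gdb_path, name, num_bands, pixel_type, wkid=3857):
    """Collect keyword arguments for arcpy.management.CreateMosaicDataset.

    wkid 3857 is Web Mercator Auxiliary Sphere; substitute the spatial
    reference that encompasses all the data you expect to add.
    """
    if num_bands < 1:
        raise ValueError("num_bands must be at least 1")
    return {
        "in_workspace": gdb_path,
        "in_mosaicdataset_name": name,
        "coordinate_system": wkid,  # converted to a SpatialReference below
        "num_bands": num_bands,
        "pixel_type": pixel_type,   # for example "8_BIT_UNSIGNED"
    }


def create_mosaic_dataset(args):
    """Run the tool; this part requires an ArcGIS (arcpy) environment."""
    import arcpy  # imported lazily so the helper above works without ArcGIS

    tool_args = dict(args)
    tool_args["coordinate_system"] = arcpy.SpatialReference(args["coordinate_system"])
    return arcpy.management.CreateMosaicDataset(**tool_args)


# Hypothetical path and name, for illustration only.
args = mosaic_dataset_args(r"C:\data\imagery.gdb", "Ortho2023", 3, "8_BIT_UNSIGNED")
```

Defining the bands and pixel type explicitly, rather than relying on the first raster added, keeps the result independent of the order in which data is loaded.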

The product definition customizes the mosaic dataset to define specific bands and wavelengths, controls how it is displayed by default, and aids in some processing. This is more important for multispectral imagery, where correctly defining the bands simplifies the integration of other analysis and display tools later.

Learn more about creating a mosaic dataset.

Add raster data

The Add Rasters To Mosaic Dataset tool is used to add rasters. This tool prompts the user for the appropriate raster type and suitable parameters. A raster type can be considered a crawler that reads through a source to locate all rasters of a specific type, then extracts available meta information to create the appropriate records in the mosaic dataset. If the source is a georeferenced raster dataset, then the Raster Dataset raster type can be used. There are many different raster types to account for all common sensors, and these can also be extended using custom Python raster types.

Most of the raster types allow for additional process functions to be added. These functions may define stretches to enhance the imagery, or define parameters for orthorectifying satellite or aerial imagery.

In some cases, the georeferencing is only approximately defined. For example, imagery may be orthorectified initially using approximate navigation-grade coordinates from an aircraft, but be refined at a later stage when the GPS/IMU data is fully processed.

The outcome of this step should be a mosaic dataset that enables you to start visualizing the data.

Learn more about adding rasters to a mosaic dataset.

Define additional metadata

Metadata is important not only because it gives users additional information about the rasters, but also because it is used in queries and to define the order for image display.

Typically, metadata is ingested as part of the raster type when rasters are added. In some cases, metadata cannot be extracted from the rasters, so it is added to the mosaic dataset as additional fields in the attribute table. You can do this by opening the table, adding new fields, then using the Field Calculator to set the values, or you can use the Add Field tool and Calculate Field tool to set the values. Alternatively, if metadata is available in other tables, the tables can be joined to the mosaic dataset and the fields copied across. In the data-specific workflow scripts, methods are often provided to create and set such metadata values.

Additionally, if auxiliary metadata is stored as part of the aux.xml associated with a raster, then the raster types will extract and add this data if a field exists in the footprint table that has the same name as the metadata. If this is the case, add a field with the appropriate name prior to running Add Rasters.

Refine the mosaic dataset properties

Mosaic datasets have a large range of properties that need to be set depending on the source data and expected usage.

Allowed Compression Methods

After the image has been mosaicked from the different sources, the resulting image is then optionally compressed before being transmitted to the client application. When the client is a web browser, this defaults to JPEG compression if there are no NoData areas, else the default is PNG compression. When the client is ArcGIS Desktop, this parameter controls both the allowed compression methods and the default. By default, this is set to None, meaning that the data will not be compressed. If the mosaic dataset will be published, it is highly advisable to set the compression to JPEG Quality 80 for most imagery. If the mosaic dataset is more than 8-bit, you can use LZ77.

Maximum Number of Rasters per Mosaic

This defines the maximum number of rasters that can contribute to the output image. The default value is 20, meaning that no more than 20 rasters will contribute to a single request. Twenty is generally suitable, but if you're using small images, such as browse images or small tiles, then a larger value may be required. This is more important if users can make larger requests to the services. In many cases, a maximum of 20 rasters is sufficient for a screen full of imagery, but if a user makes a large export then the number of rasters required may be larger.

The effect of having a value too low is that there may be some gaps in the output image. This is one of the reasons that you may get good imagery when viewing the imagery, but get unexpected results when exporting or caching (such as missing images). Setting a value too high may result in some requests taking a long time if there are very large numbers of slightly overlapping images. This only matters if you're concerned about the request slowing your server or being slow to return to the user.

Allowed and Default Mosaic Methods

The mosaic method is the primary method of controlling the ordering of overlapping imagery. It is important that this is set appropriately to ensure that users connecting to services see the correct imagery and that they are allowed to modify the mosaic method to suit their needs.

The most common mosaic method for workflows is the By Attribute method, which enables a metadata field to be defined to sort the data and a base value which defines the highest priority value. A common default requirement is to see the latest imagery on top. This can be done, for example, by setting the By Attribute field to a date field (such as AcquisitionDate) and a base value to a date far in the future (such as 1-1-2050).
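The ordering idea can be sketched in plain Python, assuming the By Attribute method ranks items by how close their attribute value is to the base value, with the closest on top. The scene names and dates are made up, and this illustrates the concept only; it is not the ArcGIS implementation.

```python
from datetime import date


def order_by_attribute(items, base_value):
    """Return items sorted so the value closest to base_value comes first."""
    return sorted(items, key=lambda item: abs((item[1] - base_value).days))


# Hypothetical (name, AcquisitionDate) pairs.
scenes = [
    ("scene_a", date(2015, 6, 1)),
    ("scene_b", date(2021, 9, 15)),
    ("scene_c", date(2018, 3, 30)),
]

# A base value far in the future puts the most recent scene on top.
latest_first = order_by_attribute(scenes, date(2050, 1, 1))

# A base value inside the date range favors scenes acquired near that date.
near_2016 = order_by_attribute(scenes, date(2016, 1, 1))
```

Under this assumption, choosing a base value later than any acquisition date is what turns "closest to the base value" into "latest on top".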

In some cases, the optimum ordering cannot be defined by a single field. For example, it may be better to reduce the display priority of a scene with high cloud content or low nadir angles, so some workflows add a Best field, then compute the Best value using a mathematical operation based on a number of different attributes.
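One way such a Best value might be computed is sketched below. The formula, the weight, and the input names (cloud cover as a percentage, viewing angle as degrees away from vertical) are all hypothetical choices for illustration; each workflow defines its own calculation.

```python
def best_value(cloud_cover_pct, off_nadir_deg, nadir_weight=2.0):
    """Hypothetical score: lower is better, 0 is a cloud-free vertical view."""
    return cloud_cover_pct + nadir_weight * off_nadir_deg


# Made-up scenes scored with the hypothetical formula.
scores = {
    "clear_oblique": best_value(cloud_cover_pct=0, off_nadir_deg=25),
    "hazy_vertical": best_value(cloud_cover_pct=10, off_nadir_deg=2),
    "cloudy_vertical": best_value(cloud_cover_pct=80, off_nadir_deg=1),
}

# With a By Attribute base value of 0, the lowest score would display on top.
ranked = sorted(scores, key=scores.get)
```

Combining attributes into a single numeric field keeps the By Attribute method usable even when no one attribute captures image quality.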

If the By Attribute method is allowed, then users may set the order to any of the Allowed fields. It is therefore important to review what allowed fields are visible. Additionally, the system will need to order records based on the selected attributes. For optimization, it is advantageous to index the default By Attribute field and any fields that users are likely to use to order the imagery.

Some workflows set different mosaic methods by default. For example, when using aerial imagery, the mosaic method may be set to Closest To Nadir so that the imagery that is most vertical under the camera is displayed by default. Note that there are many rules that control the ordering of the imagery and that methods such as By Attribute and Closest To Center are also affected by the LoPS values. As a result, it is sometimes necessary to set the LoPS value of imagery to a constant to stop the images being ordered based on their pixel size.

See more about other mosaic methods in mosaic dataset properties.

Allowed Fields

This property defines all the fields that are transmitted to the client application and also which fields can be used in the By Attribute mosaic method. By default, all the metadata fields are included. In most cases, some of the base fields, such as MinPS and MaxPS, can be turned off. For mosaic datasets with large numbers of metadata fields, it is advantageous to reduce the number of fields included, as this will optimize the speed of some metadata and table display functionality.

Clipping Rasters

When a request is made by an application for imagery covering a specified extent, the system identifies the possible imagery based on a spatial extent, pixel size, and attribute queries and sorts them based on the Z order, pixel size and mosaic method. The system then attempts to read only those images that are required. The method used depends on the setting of the Always Clip the Raster to its Footprint and Footprints May Contain NoData properties. If rasters contain NoData values, then the system needs to read through the appropriate parts of each raster until all pixels required to cover the output extent are accessed. If there are many overlapping rasters containing NoData values, this process can be time-consuming. This can be optimized by turning NoData off and clipping the rasters based on the footprints.

If the Always Clip the Raster to its Footprint property is set to Yes and Footprints May Contain NoData is set to No, then the system does an overlay analysis of the footprints to determine which of the rasters need to be accessed. This analysis does not require reading any of the pixels, so it can be very efficient. In many cases, this analysis will result in only a few rasters being accessed. When using imagery that does not contain NoData (or where NoData is clipped), this setting is optimum. Clipping images based on the footprints is not always recommended, for example, when using rasters that are edge-joined. This is because during reprojection of the rasters the footprints need to be densified, which can result in either a lot of vertices or potential slivers. It is also recommended to use Merge Raster in this case. Note that if rasters have sufficient overlap, it is often better to recompute the footprints with a suitable shrink to clip off the outermost pixels and set Always Clip the Raster to its Footprint to Yes.

There can be cases where there is a very large number of overlapping rasters that are partially misaligned, which can cause slivers. Slivers are rasters that only contribute a very small amount to the output image. Such slivers are considered valid by the system even if they only contribute 1 pixel to the output. To remove slivers, the system has a Minimum Pixel Contribution parameter, which by default is 1. This can be set to larger values. For example, a value of 1,000 would mean that images that contribute less than the equivalent of 100x10 pixels will not be included.
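The effect of the threshold can be illustrated with a small plain-Python sketch. The raster names and pixel counts are invented, and the function only mimics the described behavior (rasters contributing fewer pixels than the threshold are left out); it is not how the system is implemented.

```python
def drop_slivers(contributions, min_pixel_contribution=1):
    """Keep rasters contributing at least min_pixel_contribution pixels."""
    return {
        name: pixels
        for name, pixels in contributions.items()
        if pixels >= min_pixel_contribution
    }


# Hypothetical pixel counts contributed to a single request.
contributions = {"tile_01": 250_000, "tile_02": 640, "tile_03": 1}

kept_default = drop_slivers(contributions)  # default of 1 keeps everything
kept_strict = drop_slivers(contributions, min_pixel_contribution=1_000)
```

With the stricter value, only tile_01 remains, so the two sliver rasters would no longer need to be opened for this request.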

Geotransform

When client applications request imagery, they define the required projection. The mosaic dataset knows the projection of the mosaic dataset (used for searching the rasters) as well as the projection of the source rasters. The pixels are transformed with a single projection—from source to destination. The required projection transformations are explicitly defined, but in cases where the source and destination have different datums, different datum transforms may exist.

To ensure that the appropriate datum transform is used, a datum transform table is created within the mosaic dataset. When rasters are added to a mosaic dataset, the appropriate default datum transforms are added so that a suitable datum transform is used. If a specific datum transform should be used, then it is necessary to explicitly define it. This can be done by defining the required datum transform in the Geographic Coordinate System Transformations parameter in the mosaic dataset's properties.

Note that when creating a derived mosaic dataset, the datum transform tables from the source mosaic datasets are not copied across, so it is important when using imagery in different datums to review this table and add as necessary.

Cell size

By default, the cell size of a mosaic dataset is set to the smallest pixel size (LoPS) that is found as data is added. The cell size of a mosaic dataset is exposed as the cell size of the image service when a mosaic dataset is published. Some client applications use this value as a base value and assume that imagery can be accessed only at a power of two from this value, and they may not make a request smaller than this value. Similar to the extent of a service, this value cannot be changed without restarting the service, so it is sometimes necessary to explicitly set it to a suitable value.

Source Type

The Source Type property has an effect on how some client applications interpret the pixels being returned.

Statistics

All rasters and mosaic datasets have statistics. These are used by some client applications to automatically enhance the imagery or control parameters, such as categorical display. By default, the system will use the statistics of the contributing rasters or the result of the Calculate Statistics tool. Using this tool with default values can take a long time to run, because all the data in the mosaic dataset may need to be read. Since the statistics are generally an approximation, this process can be sped up by using a suitable skip factor. A good approximation of a skip factor can be calculated based on the Width of Mosaic Dataset/(5000*CellSize). For some mosaic datasets, the expected statistics of the dataset are known and can be explicitly set if required using the Set Raster Properties tool.
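The skip factor approximation above can be worked through as follows. This is a minimal sketch: the function name and the example dimensions are hypothetical, and it assumes the mosaic width and cell size are expressed in the same units.

```python
def skip_factor(width_map_units, cell_size, target_columns=5000):
    """Approximate skip factor: mosaic width / (5000 * cell size), at least 1."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    return max(1, round(width_map_units / (target_columns * cell_size)))


# Hypothetical mosaic: 100 km wide with 0.5 m pixels.
factor = skip_factor(100_000, 0.5)

# A small mosaic never gets a skip factor below 1 (every pixel is read).
small = skip_factor(1_000, 1.0)
```

The resulting value could then be supplied as both the X and Y skip factors when running the Calculate Statistics tool.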

Processing Templates and Functions

During the Add Raster process, functions to transform the pixels are typically added to the items of the mosaic dataset. For example, a function might be used to convert an image into top-of-atmosphere radiance, or to apply orthorectification.

The mosaic dataset also allows the definition of additional functions, including custom Python raster functions, which will be applied after the images have been mosaicked. If a function is defined here, it will be appended to all data requests. An example use of these service functions would be to append a watermark to all images.

A processing template allows a list (or chain) of raster functions to be defined for the mosaic dataset and image service. Clients to the mosaic datasets and image services can see a list of these processing templates and select the appropriate one, which will then be applied to all requests.

A typical use case for these processing templates might be working with elevation data, where a processing template is applied to define different renderings, like hillshade, slope, and aspect.

See more about managing processing templates.

Compute cell sizes

The four cell sizes (LoPS, HiPS, MinPS, and MaxPS) in a mosaic dataset are used to define the available pixel sizes of a raster and the display scale range. LoPS is generally set to the resolution of the pixels on the ground and is computed when the imagery is added.

When you use imagery that is not rectified or is from a different source, each image may have a slightly different pixel size. In some workflows it is advisable to reset the LoPS value to a fixed value or the average of all similar rasters. A specific example of this is when you want to use the By Attribute or Closest To Center mosaic methods. These methods sort the images for display based on an attribute or the image location, but the ordering is overridden by the LoPS value. It is therefore sometimes necessary to reset the LoPS value to a constant for all images that should be affected by the ordering rules.
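A minimal sketch of computing an average LoPS per group of similar rasters is shown below. The grouping key (a sensor name) and the sample values are hypothetical; in practice, the rows would come from the mosaic dataset's attribute table and the result would be written back to the LoPS field.

```python
from collections import defaultdict
from statistics import mean


def averaged_lops(rows):
    """Map each group to the mean LoPS of its rasters.

    rows is an iterable of (group, lops) pairs; the grouping key
    (for example a sensor name) is a hypothetical choice.
    """
    groups = defaultdict(list)
    for group, lops in rows:
        groups[group].append(lops)
    return {group: mean(values) for group, values in groups.items()}


# Made-up pixel sizes for rasters from two sources.
rows = [("sensor_a", 0.48), ("sensor_a", 0.52), ("sensor_b", 2.0)]
constant_lops = averaged_lops(rows)
```

Assigning the same LoPS to every raster in a group removes pixel size as a tiebreaker, so the chosen mosaic method alone determines the order within that group.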

The HiPS value for imagery is used primarily to determine the scales at which overviews should be computed. In most workflows, the pixel size at which overviews are created (if required) is explicitly set; therefore, this value is not very important. The most appropriate value for HiPS depends on the availability of pyramids and the width of the imagery. If automatically set, it is possible that images with the same pixel size, but different widths, will end up having different HiPS values. Therefore, it is sometimes advantageous to set the HiPS value appropriate to the data source depending on the typical number of pyramid levels that are useful. An approximate value for this would equal image width/1,000.

In the generic workflow, the MinPS and MaxPS cell sizes are automatically determined and computed when the data is added and can be reset by using the Calculate Cell Size Ranges tool.

The Calculate Cell Size Ranges tool computes cell sizes by determining the overlap between images and assumes that images with a larger pixel size should be displayed at the smaller scales. In many workflows these rules do not hold, and often it is more appropriate to set the MinPS value for all images to 0 and the MaxPS value dependent on the pyramids available and the scales to which the images should be visible. Therefore, in many of the workflows the MinPS and MaxPS values are explicitly set.

Some tools such as Add Rasters run Calculate Cell Size Ranges on all the imagery by default. It is therefore common to turn this off when adding additional rasters.

Learn more about cell size ranges in a mosaic dataset.

Refine geometry

Some workflows involve a refinement in the geometric accuracy of the imagery. The Refine geometry section of the workflows defines how this can best be achieved. Typically, when orthorectifying imagery, the imagery can be added using approximate orientation data and elevation models, and later these values are refined. Similarly, when adding satellite scenes it may be necessary to refine the geometry to make it more accurate. The ortho mapping workflows typically implement such geometric refinement. If the geometry of the imagery is refined it is often necessary to also redefine the footprints.

Refine footprints and NoData

When rasters are added to a mosaic dataset, the footprints are computed based on the available metadata, which is typically the envelope boundary of the rasters. The correct definition of footprints in a mosaic dataset is important, as they are not only used for identifying the extent of the imagery to be read, but can also be used to clip out parts of an image that are not to be displayed.

If the geometry of a raster has been refined after it was added, or if the footprint computed when the raster was added is not suitable, it is necessary to recompute the footprint.

For mosaic datasets created from preprocessed rectangular images, using only the envelope of the raster is sufficient. In such cases, the footprints are only used to enable the system to quickly find the appropriate rasters. If there are NoData values, they can be defined as a property of the raster or a NoData mask can be defined for each raster. When the system identifies NoData pixels covering a request extent, the system will look for the next most suitable raster for additional pixels. As long as there are not too many overlapping rasters, this works well.

In cases where there are a lot of overlapping rasters, the pixel-based NoData can become a bottleneck, and it is better to turn off the NoData pixels and clip rasters by their footprints. Using a footprint to define NoData is more efficient, because the system can quickly perform an analysis of the overlapping footprint geometries to determine what rasters are required. There are properties of a mosaic dataset that define how the system handles NoData and clipping. These are defined in more detail in the Mosaic dataset properties step.

Footprints generally need to be recomputed for images that are not premosaicked into tiles. Such imagery will generally have a border of NoData values within the envelope of the rasters, and the Build Footprints tool can be used to refine the footprints. This tool has different modes. The radiometry mode creates a mask based on ranges of NoData values and generates a contour around the mask. There is a range of parameters that control how such masks are created, and often a very precise footprint cannot be obtained. In such cases, it is common to shrink the footprints a bit to exclude the edge pixels. For most sensors these edge pixels are of little value and clipping them out is advantageous. For satellite, aerial and drone images that are complete frames with no NoData pixels and are orthorectified by the system, the footprint can be computed based on the geometry mode. This performs an image-to-ground transform for the corners and edges of the frames.

Learn more about calculating footprints radiometrically.

Footprints also need to be computed for rasters that contain large NoData areas. In some workflows it is necessary to manually edit footprints or import footprints from different vector sources, for example, to clip imagery to specified extents such as county boundaries or exclude large expanses of water or clouds.

Similar to footprints, each mosaic dataset has a boundary that defines the extent. By default, the system computes this by performing a union of all the footprints. This can result in very complex boundaries, so workflows may recompute the boundary to be only an extent or simplify it.

Each of the detailed workflows will provide information on how best to refine the footprints and set NoData values.

Learn more about building footprints.

Refine radiometry

Unless the rasters are categorical, elevation, or already enhanced, it is often necessary to enhance the radiometry of the imagery to increase the interpretability or make the image more suitable for visual interpretation. Typically, appropriate stretch functions are added as part of a source mosaic dataset, or in some workflows the parameters for stretching the imagery are computed as part of the source mosaic dataset and applied when creating the derived mosaic dataset. Each workflow will describe the most suitable method for defining such enhancements.

In some workflows, statistics need to be defined to set the stretch parameters. These statistics may have been computed as a preprocess and stored with the rasters. Alternatively, statistics can be computed on the mosaic dataset items. This is typically performed if the item contains functions such as pan-sharpening that would change the statistics (from their original raw data format). Computing statistics on the raster items in the mosaic dataset has the advantage that clipping of the footprints is also applied, which in some workflows ensures that non-required border areas are removed from the statistics computations. Most workflows try to avoid computing statistics and setting stretches based on them, because statistics are detrimentally influenced by large objects such as clouds and sea. In many cases, the statistics are set to ensure a suitable dynamic range, and then the DRA (Dynamic Range Adjustment) based stretch function is applied so that the image is enhanced based on the extent of the image being displayed.

For optical imagery, it is becoming more common to set the functions of the process chains such that the DN values represent either top of atmosphere radiance or surface reflectance scaled by 10,000. This also helps in cases where imagery from different sensors is to be merged into a single derived mosaic dataset. Depending on the data source, the data-specific workflow scripts will apply the appropriate scaling.

In some optical imagery cases, the expected output is a seamless color-balanced mosaic. In such workflows, color correction will be included as part of refining radiometry. Color correction requires statistics to be computed on the mosaic dataset. Statistics on the original rasters are of limited use, as the imagery is often enhanced. If you intend to use color correction, run the Build Pyramids And Statistics tool with the mosaic dataset as the input data and exclude building pyramids. In the statistics options, it is advisable to set the skip factors to 8 to significantly speed up the statistics generation process.

Learn more about statistics in a mosaic dataset.

Generate seamlines

Typically, seamlines are computed for satellite and aerial imagery if it is required to create a seamless mosaic. The processing is based on determining suitable areas for blending overlapping imagery. Note that if the imagery has different resolutions, the seamline generation will group rasters together based on the Cell Size Tolerance Factor, which is set in the mosaic dataset properties. Prior to generating seamlines, it is often necessary to set this parameter and run the Calculate Cell Size Ranges tool. When running this tool, you should turn off the Compute Minimum and Maximum Cell Sizes parameters so that the pixel sizes are not recomputed.

The seamline generation makes use of the full overlapping extent of imagery, and there are cases where imagery needs to be prioritized or the seamlines should be computed only for a smaller subset of the overlapping areas. In these cases the workflows may temporarily reset the footprints to the required extents before seamline generation. For scanned maps, seamlines are typically set to the map data extent by importing the sheet cutlines of the original maps (to remove any map marginalia along the outside of the map).

Learn more about seamlines.

Generate overviews

Overviews are often created to enable faster viewing of mosaic datasets at smaller scales. In some workflows, the overviews are not created and smaller-scale images may be used instead. Although overviews speed up display at smaller scales, they do hinder the use of the mosaic methods at these scales, which limits the user's ability to change the order of the imagery. Therefore, the scale at which they are created needs to be chosen carefully.

Overviews should be computed after the mosaic dataset properties are set so that the appropriate default mosaic method rules are used in the overview creation.

By default, ArcGIS determines the pixel size for generating the overviews by identifying the HiPS values of the underlying images. This can result in the pixel sizes varying in different areas. Therefore, most workflows set a specific pixel size at which the overviews should be created. The optimum size to create the overviews is often dependent on the width and size of the source data and if the source data contains pyramids.

If the source data does not contain pyramids, then the following rule of thumb can be used:

Base pixel size for overviews = Average(LoPS) * 2.5

If the source does contain pyramids (typically recommended), then the following rule of thumb can be used:

Base pixel size for overviews = Width of typical image / 1500

It is also recommended that the value not be defined to too many decimal places (for example, use 1.2 vs 1.234567).

In either case, if the service is to be used in web maps on ArcGIS Online using the standard Web Mercator Auxiliary Sphere basemaps, then there is a slight additional advantage in using one of the following values, depending on the projection (and hence the units) of the mosaic dataset:


Meters	Decimal Degrees	Feet
1200	0.01024	4000
600	0.00512	2000
300	0.00256	1000
150	0.00128	500
75	6.40E-04	250
38	3.20E-04	125
19	1.60E-04	62.5
9	8.00E-05	31.25
4	4.00E-05	15.6
2	2.00E-05	7.8
1	1.00E-05	3.9
0.5	5.00E-06	1.95
0.25	2.50E-06	0.97
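The two rules of thumb and the table can be combined in a short sketch. The function names are hypothetical, the width of a typical image is assumed to be expressed in ground units (so that the result is a pixel size), and the list holds the Meters column of the table above.

```python
# Suggested pixel sizes in meters, taken from the Meters column of the table.
WEB_MERCATOR_METERS = [1200, 600, 300, 150, 75, 38, 19, 9, 4, 2, 1, 0.5, 0.25]


def overview_base_pixel_size(avg_lops, typical_width, has_pyramids):
    """Apply the rule of thumb that matches the source data.

    typical_width is assumed to be in the same ground units as avg_lops.
    """
    size = typical_width / 1500 if has_pyramids else avg_lops * 2.5
    # Keep two significant figures to avoid too many decimal places.
    return float(f"{size:.2g}")


def snap_to_table(size, table=WEB_MERCATOR_METERS):
    """Pick the table value closest to size."""
    return min(table, key=lambda value: abs(value - size))


# Hypothetical source: 0.4 m pixels, images about 15 km wide.
without_pyramids = overview_base_pixel_size(0.4, 15_000, has_pyramids=False)
with_pyramids = overview_base_pixel_size(0.4, 15_000, has_pyramids=True)
snapped = snap_to_table(with_pyramids)
```

Snapping is optional: it only applies when the service will be displayed over the standard basemaps, and the same approach could be used with the Decimal Degrees or Feet column.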

Note that instead of creating overviews, it is possible to use an existing tile cache or even a map service. This is only applicable if all the imagery in an image service is 3-band (RGB) 8-bit imagery. In such workflows, no overviews are created for the mosaic dataset. Instead, a cache is added by using a suitable raster type. Alternatively, a map service can be added using the Map Service raster type. The LoPS of this cache is then set appropriately so that it is turned off at the larger scales. The Category is also typically reset to Overview, so that the tools by default do not select this as a primary raster.

Learn more about overviews.
