Derived mosaic datasets—Standard Workflow_Creating Mosaic Datasets

Derived mosaic datasets are used to merge multiple source mosaic datasets, primarily to enable different collections of imagery to be accessed via one source. The steps to create a derived mosaic dataset are similar to those of a source mosaic dataset.

Create a derived mosaic dataset
Add the source mosaic dataset data
Compute cell sizes
Refine footprints and define NoData
Refine the mosaic dataset properties
Generate overviews

Create a new mosaic dataset

Derived mosaic datasets are created using the spatial reference system, bands, and bit depth appropriate for the final service. Organizations that work on local datasets and have standardized on one spatial reference system typically use that system. For global datasets, the Web Mercator Auxiliary Sphere projection is often used. The spatial reference system of the derived mosaic dataset does not need to match that of the source. However, when the footprints of the source mosaic dataset are transformed to the spatial reference system of the derived mosaic dataset, they are densified if the curvature of the two projections differs. This densification can add a large number of vertices to a footprint, which can affect performance.
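To illustrate how densification inflates the vertex count, the following is a minimal sketch in plain Python. It assumes a fixed maximum segment length, which is a simplification: the actual densification depends on the projections involved, and the function name and values here are hypothetical.

```python
import math

def densify(ring, max_segment):
    """Insert vertices so that no edge of a closed ring exceeds max_segment."""
    out = []
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        # Number of equal pieces needed to keep each piece within max_segment.
        pieces = max(1, math.ceil(math.hypot(x1 - x0, y1 - y0) / max_segment))
        for i in range(pieces):
            t = i / pieces
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    out.append(ring[-1])  # close the ring
    return out

# Hypothetical square footprint, 1000 units on a side (5 vertices, closed).
footprint = [(0, 0), (1000, 0), (1000, 1000), (0, 1000), (0, 0)]
dense = densify(footprint, max_segment=50)
print(len(footprint), "->", len(dense))
```

Even this simple rectangle grows from 5 to 81 vertices, which suggests why a derived mosaic dataset holding many thousands of densified footprints can become slower to query.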

Add rasters

Typically rasters are added to derived mosaic datasets by using the Table raster type and selecting the source mosaic datasets. This raster type ensures that every item in the source mosaic dataset is duplicated in the derived mosaic datasets. Although this may result in a large number of records in the derived mosaic dataset, it is more scalable than adding the source mosaic datasets as a single item. This is because when using a derived mosaic dataset, all the records and associated raster item properties are quickly accessible, but if the mosaic dataset references other mosaic datasets instead, the system may need to open multiple tables to access different rasters in multiple source mosaic datasets. Although some database connections are cached, the system does not cache the mosaic dataset, so it is not a scalable approach.

There are cases where rasters are directly added to a derived mosaic dataset. For example, a service may use an image, image service, or map service as a background when there is no other imagery to display. This can be achieved by adding the selected image or service as a raster dataset and then setting the ZOrder field to a large positive value, which puts it at a low display priority. Setting a negative ZOrder value will force the imagery to be displayed at a higher priority than the other rasters.
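The effect of ZOrder described above can be sketched in plain Python. This sketch assumes ZOrder is the only sort key and ignores the mosaic method; the item names and values are hypothetical.

```python
def display_priority(items):
    """Return item names from highest to lowest display priority.

    Lower ZOrder values are given higher priority; sorted() is stable, so
    items that share a ZOrder keep their incoming order.
    """
    return [name for name, zorder in sorted(items, key=lambda item: item[1])]

# Hypothetical items: (name, ZOrder)
items = [
    ("background_service", 1000000),  # large positive value: shown only as a fallback
    ("ortho_2019", 0),
    ("ortho_2021", 0),
    ("priority_patch", -10),          # negative value: forced above the others
]
order = display_priority(items)
print(order)
```

The background item ends up last, so it contributes pixels only where no other imagery is available.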

When adding rasters to the derived mosaic dataset, it is important to turn off the Update Cell Size Ranges parameter. If it's not turned off, every cell size will be recomputed, which can potentially break the ordering that is defined in each source mosaic dataset.

Cell sizes

Cell sizes are copied from the source mosaic dataset, so there is no requirement to recompute them. Do not run the Calculate Cell Size Ranges tool with its default settings, because it recomputes the cell sizes based on the standard overlap rules. This is rarely required and changes the imported values, which are difficult to reset. In cases where additional rasters have been added individually, their MinPS and MaxPS values should be set manually.

The Calculate Cell Size Ranges tool not only computes the MinPS and MaxPS cell size values for each raster item, but also computes values for a levels table. This table is used to determine how to group images together based on their scale ranges so that functionality such as seamline generation can correctly create lines around images of similar pixel sizes. The grouping is determined based on the mosaic dataset's Cell Size Tolerance Factor property. Therefore, it may be necessary to set this value and run the Calculate Cell Size Ranges tool, with the Compute Minimum and Maximum Cell Sizes parameters turned off (unchecked).

Footprints and NoData

In most cases there is no need to refine footprints or change NoData values in the derived mosaic datasets. There are cases where the boundary may need to be recomputed. Instead of computing the boundary when the source mosaic datasets are added, the boundary is usually computed once after all the sources are added using the Build Boundary tool. In many cases where the boundary geometry becomes unnecessarily complex, the boundary is set to the envelope of the footprints using the Build Boundary tool with the simplification method set to Envelope.

You need to consider whether the imagery should be clipped by the boundary. The mosaic dataset's Always Clip the mosaic dataset to its Boundary property can be set to either clip or not clip the imagery to the boundary. Typically this is set to clip only when the boundary is to be used to restrict access to imagery outside the boundary. Otherwise, it is better not to clip to the boundary, which avoids the additional clip processing.

The extent of an image service is set when the service is published, based on the boundary. This cannot be changed while the service is running. In applications where new imagery is added to the service after it has been published, you need to ensure that the extent (envelope) of the service is sufficient to cover all new imagery. Therefore, it is sometimes necessary to redefine the boundary of a service as a rectangle covering the complete extent of all imagery that may be added. This can be done using the standard feature editing tools to modify the boundary feature.

Mosaic dataset properties

The properties of the derived mosaic dataset are more important than those of the source mosaic datasets. Determining the most suitable properties can be one of the challenges for different workflows. This section of the workflow provides information on how best to set the parameters and explains how to choose appropriate mosaic method parameters.

Some of the more important settings include the following:

Allowed Compression Methods—This is the default compression method for transmission. By default, it is set to None, but if the mosaic dataset will be published, it is highly advisable to set the compression to JPEG Quality 80 for most imagery. This setting has the most effect on users connecting to the services using ArcGIS Desktop, whereas most web applications request JPEG by default.

Maximum Number of Rasters per Mosaic—The default value is 20, meaning that no more than 20 rasters will contribute to a single request. Twenty is generally suitable, but if using small images, such as browse images or small tiles, then a larger value is required. This is more important if users can make larger requests to the services. In many cases a maximum of 20 rasters is sufficient for a screen full of imagery, but if a user makes a large export then the number of rasters required may be larger. This is one of the reasons that you may get good imagery when viewing the imagery, but get unexpected results when exporting or caching (such as missing images). Setting a value too high may result in some requests taking a long time if a situation arises where there are very large numbers of slightly overlapping images. This only matters if you're concerned about the request slowing your server or being slow to return to the user.
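As a rough worked example of why exports can exceed the limit, the sketch below counts how many tiles a request touches. It assumes a regular grid of equally sized, non-overlapping tiles and a request aligned with that grid; the function name and all sizes are hypothetical.

```python
import math

def rasters_needed(request_width, request_height, tile_width, tile_height):
    """Count the tiles touched by a request aligned with a regular tile grid.

    All sizes are in the same ground units (for example, meters).
    """
    return math.ceil(request_width / tile_width) * math.ceil(request_height / tile_height)

MAX_RASTERS_PER_MOSAIC = 20  # the default value described above

# A 1920 x 1080 pixel screen at 0.5 m per pixel covers 960 m x 540 m.
screen = rasters_needed(960, 540, 250, 250)
# A 10,000 x 10,000 pixel export at 0.5 m per pixel covers 5000 m x 5000 m.
export = rasters_needed(5000, 5000, 250, 250)

print(screen, screen <= MAX_RASTERS_PER_MOSAIC)
print(export, export <= MAX_RASTERS_PER_MOSAIC)
```

Under these assumptions the screen request needs 12 rasters and fits within the default, while the export needs 400 and would be returned with missing images unless the limit is raised.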

Allowed and Default Mosaic Methods—The mosaic method is the primary method of controlling the ordering of overlapping imagery. It is important that this is set appropriately to ensure that users connecting to services see the correct imagery and that they are allowed to modify the mosaic method to suit their needs. The most common mosaic method for workflows is the By Attribute method, which sorts the data on a chosen metadata field relative to a base value that defines the highest priority. A common default requirement is to see the latest imagery on top. This can be done, for example, by setting the By Attribute field to a date field (such as AcquisitionDate) and the base value to a date far in the future (such as 1-1-2050). In some cases the optimum ordering cannot be defined by a single field. For example, it may be better to reduce the display priority of a scene with high cloud content or low nadir angles, so some workflows add a Best field and compute the Best value using a mathematical operation based on a number of different attributes.

If the By Attribute method is allowed then users may set the order to any of the Allowed fields. It is therefore important to review what allowed fields are visible. Additionally, the system will need to order records based on the selected attributes. For optimization it is advantageous to index the default By Attribute field and any fields that users are likely to use to order the imagery.
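The By Attribute ordering can be sketched in plain Python. This sketch assumes that priority is determined by the absolute difference between the attribute value and the base value; the Best formula, its weights, and the sample records are hypothetical and only illustrate combining several attributes into one field.

```python
from datetime import date

def by_attribute_order(items, field, base_value):
    """Sort items so that values closest to base_value come first."""
    return sorted(items, key=lambda item: abs(item[field] - base_value))

items = [
    {"Name": "A", "AcquisitionDate": date(2021, 6, 1), "CloudCover": 60},
    {"Name": "B", "AcquisitionDate": date(2020, 6, 1), "CloudCover": 0},
    {"Name": "C", "AcquisitionDate": date(2018, 6, 1), "CloudCover": 5},
]

# Latest imagery on top: base value far in the future.
latest_first = [i["Name"] for i in by_attribute_order(items, "AcquisitionDate", date(2050, 1, 1))]

# Hypothetical Best field: age in days plus a penalty of 10 per percent of cloud.
reference = date(2022, 1, 1)
for item in items:
    item["Best"] = (reference - item["AcquisitionDate"]).days + 10 * item["CloudCover"]

best_first = [i["Name"] for i in by_attribute_order(items, "Best", 0)]

print(latest_first)
print(best_first)
```

With these sample values the date ordering puts the cloudy scene A on top, while the Best ordering promotes the older but cloud-free scene B.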

Allowed Fields—This property defines all the fields that are transmitted to the client application and also which fields can be used in the By Attribute mosaic method. By default, all the metadata fields are included. In most cases, some of the base fields, such as MinPS and MaxPS, can be turned off. For mosaic datasets with large numbers of metadata fields, it is advantageous to reduce the number of fields included, as this will optimize the speed of some metadata and table display functionality.

Clipping Rasters—When a request is made by an application for imagery covering a specified extent, the system identifies the possible imagery based on a spatial extent, pixel size, and attribute queries and sorts them based on the mosaic method. The system then attempts to read only those images that are required. The method used depends on the setting of the Always Clip the Raster to its Footprint and Footprints May Contain NoData properties. If rasters contain NoData values, then the system needs to read through the appropriate parts of each raster until all pixels required to cover the output extent are accessed. If there are many overlapping rasters containing NoData values, this process can be time-consuming. This can be optimized by turning NoData off and clipping the rasters based on the footprints.

If the Always Clip the Raster to its Footprint property is set to Yes and Footprints May Contain NoData is set to No, then the system does an overlay analysis of the footprints to determine which of the rasters need to be accessed. This analysis does not require reading any of the pixels, so it can be very efficient. In many cases, this analysis will result in only a few rasters being accessed. When using imagery that does not contain NoData (or where NoData is clipped) this setting is optimum. Clipping images based on the footprints is not always recommended, for example, when using rasters that are edge-joined and cover a large extent. This is because during reprojection of the rasters the footprints need to be densified, which can result in either a lot of vertices or potential slivers.

There can be cases where there is a very large number of overlapping rasters that are partially misaligned, which can cause slivers. Slivers are rasters that only contribute a very small amount to the output image. Such slivers are considered valid by the system even if they only contribute 1 pixel to the output. So to remove slivers the system has a Minimum Pixel Contribution parameter, which by default is 1. This can be set to larger values. For example, a value of 1,000 would mean that images that take less than the equivalent of 100x10 pixels will not be included.
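The threshold can be illustrated with a short sketch in plain Python. It assumes the pixel contribution of each raster to the output is already known; the function name and counts are hypothetical.

```python
def drop_slivers(contributions, minimum_pixel_contribution=1):
    """Keep only rasters that contribute at least the minimum number of pixels."""
    return {
        name: pixels
        for name, pixels in contributions.items()
        if pixels >= minimum_pixel_contribution
    }

# Hypothetical contributions (in output pixels) of three overlapping rasters.
contributions = {"scene_1": 250000, "scene_2": 999, "scene_3": 1}

kept_default = drop_slivers(contributions)        # default of 1 keeps everything
kept_strict = drop_slivers(contributions, 1000)   # drops anything under 100x10 pixels

print(sorted(kept_default))
print(sorted(kept_strict))
```

With the default of 1, even the single-pixel sliver is read; with a value of 1,000, only scene_1 is accessed.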

Geotransform—When client applications request imagery, they define the required projection. The mosaic dataset knows the projection of the mosaic dataset (used for searching the rasters) as well as the projection of the source rasters. The pixels are transformed in a single step, directly from source to destination. The required projection transformations are explicitly defined, but in cases where the source and destination have different datums, different datum transforms may exist. To ensure that the appropriate datum transform is used, a datum transform table is created within the mosaic dataset. When rasters are added to a mosaic dataset, the appropriate default datum transforms are added so that a suitable datum transform is used. If a specific datum transform should be used, then it is necessary to explicitly define it. This can be done by defining the required datum transform in the Geographic Coordinate System Transformations parameter in the mosaic dataset's properties.

Note that when creating a derived mosaic dataset, the datum transform table from the source mosaic datasets is not copied across, so it is important when using imagery in different datums to review this table and add transforms as necessary.

Cell Size—By default, the cell size of a mosaic dataset is set to the smallest pixel size (LowPS) that is found as data is added. The cell size of a mosaic dataset is exposed as the cell size of the image service when a mosaic dataset is published. Some client applications use this value as a base value and assume that imagery can be accessed only at power-of-two multiples of this value, and they may not make a request smaller than this value. Similar to the extent of a service, this value cannot be changed without restarting the service, so it is sometimes necessary to explicitly set it to a suitable value.
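To show what the power-of-two assumption implies for a client, the sketch below lists the cell sizes such a client would consider. The function name and the base value are hypothetical, and real clients may behave differently.

```python
def request_levels(base_cell_size, count):
    """List the cell sizes a power-of-two client would request, finest first."""
    return [base_cell_size * 2 ** level for level in range(count)]

# Hypothetical service published with a cell size of 0.5 m.
levels = request_levels(0.5, 5)
print(levels)
```

Such a client would never request 0.25 m or 0.3 m data from this service, so if finer imagery may be added later, the cell size should be set small enough before publishing.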

Source Type—The source type property has an effect on how some client applications interpret the pixels being returned. Workflows will often set this value appropriately to ensure the appropriate user experience.

Statistics—All rasters and mosaic datasets have statistics. These are used by some client applications to automatically enhance the imagery or control parameters such as categorical display. By default, the system will determine statistics from the statistics of the contributing rasters or when the Calculate Statistics tool is used. Using this tool with default values can take a long time to run, because all the data in the mosaic dataset may need to be read. Since the statistics are generally an approximation, this process can be sped up by using a suitable skip factor. A good approximation of a skip factor can be calculated with Width of Mosaic Dataset/(5000*CellSize). For some mosaic datasets, the expected statistics of the dataset are known and can be explicitly set if required.
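The skip factor approximation can be worked through in plain Python. The formula is the one given above; rounding to a whole number and enforcing a minimum of 1 are assumptions added for this sketch, and the function name and sample sizes are hypothetical.

```python
def approximate_skip_factor(mosaic_width, cell_size, target_columns=5000):
    """Approximate a skip factor as width / (5000 * cell size).

    mosaic_width and cell_size must be in the same ground units.
    """
    return max(1, round(mosaic_width / (target_columns * cell_size)))

# Hypothetical mosaic dataset 100 km wide with 0.5 m pixels.
large = approximate_skip_factor(100000, 0.5)
# Hypothetical small mosaic dataset 1 km wide with 0.5 m pixels.
small = approximate_skip_factor(1000, 0.5)

print(large, small)
```

For the large dataset, a skip factor of 40 means the statistics are sampled from roughly 5,000 columns instead of 200,000; for the small dataset no skipping is needed.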

Functions

During the Add Raster process, functions to transform the pixels are typically added to the items of the mosaic dataset. For example, a function might be used to convert an image into top-of-atmosphere radiance, or to apply orthorectification.

The Properties page of the mosaic dataset also allows the definition of a set of additional functions that will be applied to all datasets. If a function is defined here, it will be appended to all data requests. An example use of these service functions would be to append a watermark to all images. For derived mosaic datasets, processing templates are typically used instead.

Processing Templates

A processing template allows a list (or chain) of raster functions to be defined for the mosaic dataset and image service. Clients of the mosaic datasets and image services can see a list of these processing templates and select the appropriate one, which will then be applied to all requests.

A typical use case for these processing templates might be working with elevation data, where a processing template is applied to define different renderings, like hillshade, slope, and aspect.

Many of the workflows define a set of processing templates to create a set of derived products without the need to create additional image services.

Overviews

In many cases the overviews in the source mosaic datasets are used in the derived mosaic datasets. As long as suitable attributes are defined for the overviews, they can be used in some queries. For example, a derived mosaic dataset of high-resolution satellite imagery created from source mosaic datasets from different sensors may have overviews attributed as QuickBird or GeoEye1. Note that when overviews are imported using a table raster type, the Category field is set back to primary.

It is often advantageous to create a separate overview from the derived mosaic dataset for use at the very small scales. When a user zooms to the extent of a mosaic dataset (which often occurs), it is advantageous if the system only needs to read a single raster. To enable this, it is best to define and build overviews for the smallest scales. Typically, the pixel size for these overviews may be set to about 1/5000 of the width. As with creating overviews for source mosaic datasets, it is best to build these overviews after the appropriate default mosaic method has been defined.
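As a worked example of the 1/5000 guideline, the sketch below derives an overview pixel size from the width of the mosaic dataset. The function name and the sample width are hypothetical.

```python
def overview_pixel_size(mosaic_width, columns=5000):
    """Pixel size that renders the full width of the mosaic in about 5000 columns."""
    return mosaic_width / columns

# Hypothetical derived mosaic dataset 100 km wide (in meters).
top_level = overview_pixel_size(100000)
print(top_level)
```

Under these assumptions a 100 km wide mosaic dataset would get a top-level overview with 20 m pixels, so a zoom to the full extent reads a single raster about 5,000 columns wide.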

For some imagery (such as 3-band 8-bit) it is often also possible to create a cache instead of overviews, then use the cache in place of overviews. This can be advantageous in cases where on-demand caching is used for the larger scales. This is discussed in more detail in the caching section.
