Creating mosaic datasets

If you have a single data collection (e.g. data from 2010):

  • Create a single source mosaic dataset, which you can then share directly.

If your data has multiple collections (e.g. data from 2005, 2010, and 2015):

  • Create a source mosaic dataset for each data collection, then combine them into one derived mosaic dataset that can be shared.

Source mosaic datasets

Each collection of browse imagery should be organized using a separate source mosaic dataset.

During incorporation, the data files can be selected from the existing metadata table using a query. The query string may be a file name filter (for example, abc*.tif) or an attribute query (for example, sensor='ETM' and year='2004'). Doing this allows the data combination process to be carefully controlled and enables parameters to be defined prior to compiling the source mosaic datasets into the derived mosaic dataset.
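
As a minimal sketch of the attribute-query style, the helper below assembles a query string from field/value pairs. The function name build_where_clause is hypothetical, the field names come from the example above, and the exact SQL syntax accepted depends on the database that holds the metadata table.

```python
def build_where_clause(criteria):
    """Join field/value pairs into a SQL-style query string.

    Values are quoted as text, matching the example sensor='ETM'.
    """
    parts = []
    for field, value in criteria.items():
        escaped = str(value).replace("'", "''")  # escape embedded quotes
        parts.append(f"{field} = '{escaped}'")
    return " AND ".join(parts)


query = build_where_clause({"sensor": "ETM", "year": "2004"})
print(query)
```

Building the string from a dictionary keeps the selection criteria for each source mosaic dataset in one place, which makes them easier to review before any data is added.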

Catalogs of browse images can be very large and may contain millions of images. The Table raster type in ArcGIS does not utilize parallel processing, so one way to speed up the combination of very large databases is to split the database physically (or virtually, using a query) into subsets and add rasters for each subset in parallel, using separate source mosaic datasets.
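
The virtual split described above can be sketched as follows, assuming the catalog has a year field to partition on. The names plan_subsets and run_in_parallel are hypothetical, and add_rasters stands in for whatever routine adds one subset to its own source mosaic dataset. Threads are used only to keep the sketch short; whether the underlying geoprocessing call can safely run in threads is not established here, and separate processes may be needed.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_subsets(years, prefix="browse"):
    """Return (source mosaic name, query) pairs, one per subset."""
    return [(f"{prefix}_{year}", f"year = '{year}'") for year in years]


def run_in_parallel(subsets, add_rasters, max_workers=4):
    """Call add_rasters(name, query) for every subset concurrently.

    Results are returned in the same order as the subsets.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(add_rasters, name, query)
                   for name, query in subsets]
        return [future.result() for future in futures]


subsets = plan_subsets([2005, 2010, 2015])
for name, query in subsets:
    print(name, "<-", query)
```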

What coordinate system should I use for my source mosaic dataset?

It is typically recommended to set the coordinate system for each source mosaic dataset to WGS_1984_Web_Mercator_Auxiliary_Sphere. This is because the source data is often stored in individual local coordinate systems (for example, multiple UTM zones), and the coordinate system for the mosaic dataset should support the maximum extent expected for all imagery.

The mosaic dataset performs projection on the fly, so the projection in which the mosaic dataset is managed does not need to be the same as that of the original data. Reprojection will be performed as a single step, from original source data to desired output projection, minimizing any resampling.

What parameters should I use when adding rasters?

See documentation on adding rasters to mosaic datasets. Use the following parameters when adding browse imagery.

This workflow assumes a feature table is available and that the Table raster type will be used to combine browse imagery into the source mosaic dataset. The advantage of using the Table raster type is that the metadata is immediately imported into the attribute table.

The sensor raster types (for example, Landsat TM, Quickbird, and so on) should not be used, since they are designed for adding the full-resolution datasets, not the browse imagery.

Turn off the Set cell size ranges option. If it is on, the system attempts to determine the appropriate pixel sizes and potential overlap between images. It is much faster to turn this off and use the Calculate Field tool to set the MinPS and MaxPS. Set the MinPS to 0 and the MaxPS to a value that corresponds to the average image width divided by 300. This results in the image being displayed until a scale is reached where it covers less than one-third of the screen display.
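
A minimal sketch of the MinPS/MaxPS calculation follows. It assumes that the MaxPS is the average image width in ground units divided by 300, so that an image stops drawing once it would span fewer than about 300 screen pixels; that reading is an interpretation of the guidance above. The function names are hypothetical, and mgmt is expected to be arcpy.management, passed in as a parameter so the logic can be exercised without an ArcGIS installation.

```python
def pixel_size_range(average_image_width, screen_pixels=300):
    """Return (MinPS, MaxPS) for browse images of the given ground width."""
    return 0, average_image_width / screen_pixels


def set_pixel_size_range(mgmt, mosaic_dataset, average_image_width):
    """Write MinPS and MaxPS with the Calculate Field tool.

    mgmt is assumed to be arcpy.management.
    """
    min_ps, max_ps = pixel_size_range(average_image_width)
    mgmt.CalculateField(mosaic_dataset, "MinPS", str(min_ps), "PYTHON3")
    mgmt.CalculateField(mosaic_dataset, "MaxPS", str(max_ps), "PYTHON3")
    return min_ps, max_ps


# An image about 180 km wide (180,000 m) gives a MaxPS of 600 m.
print(pixel_size_range(180000))
```

With ArcGIS available, the call would look like set_pixel_size_range(arcpy.management, path_to_mosaic, 180000), where the path and width are placeholders.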

How should I populate metadata?

Metadata should be populated when adding data into the source mosaic dataset. When using the Table raster type, all metadata from the input table will be added into the attribute table of the mosaic dataset. Any required field names and types will be automatically created. Ensure non-required fields are removed first.

You may need to add additional metadata separately after the source mosaic dataset has been populated with data. For example, if multiple source mosaic datasets are required to accommodate different data collections, a field named Dataset_ID is recommended to identify all records according to their source mosaic dataset. When the source mosaic datasets are later merged into the derived mosaic dataset, this field will be necessary to allow queries to identify the original collection.

Custom metadata fields must be added to the mosaic dataset attribute table before attempting to populate those fields with metadata.
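
The order of operations (add the field first, then populate it) can be sketched as below for the Dataset_ID example. The function name tag_dataset_id, the text field length of 50, and the sample identifier are assumptions; mgmt is expected to be arcpy.management and is passed in so the sketch can run without ArcGIS.

```python
def tag_dataset_id(mgmt, mosaic_dataset, dataset_id, field_name="Dataset_ID"):
    """Add a text field, then write the same identifier to every record.

    mgmt is assumed to be arcpy.management.
    """
    # The field must exist before it can be populated.
    mgmt.AddField(mosaic_dataset, field_name, "TEXT", field_length=50)
    # repr() turns the identifier into a quoted Python string literal,
    # which is what the PYTHON3 expression type expects.
    mgmt.CalculateField(mosaic_dataset, field_name, repr(dataset_id), "PYTHON3")


# Example call with ArcGIS installed (paths are placeholders):
# import arcpy
# tag_dataset_id(arcpy.management, r"C:\data\browse.gdb\src_2010", "Landsat_2010")
```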

What should the mosaic dataset properties be?

The default values do not need to be changed for the properties of the source mosaic datasets, as they are generally used only for quality assurance purposes. Properties will be addressed for the derived mosaic dataset.

The MDCS Python script, using an appropriate XML configuration file, automatically sets the mosaic dataset properties.

How should I deal with footprints and NoData values in my imagery?

For the browse imagery workflow, the input metadata table should define the footprint (valid image extents) of each image. If it does, the footprint is defined automatically when images are added using the Table raster type.

Note that if the images are rectified with projections that are different from the source mosaic dataset, it will be necessary to reproject these footprints so they are all in the correct SRS (spatial reference system). If you're reprojecting the data, care should be taken to review the density of vertices in the resulting footprints. Typically, a minimal number of vertices should be used or the resulting feature class can become unnecessarily large.

If an accurate footprint does not exist, then by default, the system will use the envelope (full extent) of the image file as the footprint. In the case of rectified imagery noted above, if the browse image is stored in a projected coordinate system, this is unsuitable, and it will be necessary to run the Build Footprints tool. If NoData exists within the image files (that is, the images have black or white borders), use the Radiometry computation method, taking care to exclude data values that may be similar to the NoData value. For example, assuming the browse images are compressed as JPEGs, and 0 is used for NoData, set the minimum data value to 3 to eliminate compression artifacts that have altered the NoData values. To speed up the process, reduce the Request size to a minimum value (for example, the average number of columns should be 1,000). Also, set the number of vertices to a lower number, such as eight.

If the images are non-rectified and will be georeferenced using an aux.xml file, the footprints should be correct by default when set to the full extent of the image file. If not, footprints can be quickly computed by running the Build Footprints tool, using the Geometry computation method. In this case, set the number of vertices to four.

In both cases, it is best to test this on sample imagery that includes an image at the boundaries of the expected extent.

Note: Since the number of browse images is typically very large, running the Build Footprints tool can take a substantial amount of time. It is strongly recommended that the image extents be known at the time of adding data and that the pre-built extents are used to define the footprints.
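
For cases where Build Footprints does have to be run, the two sets of settings described above can be sketched as follows. The function names are hypothetical, the keyword names are those of the Build Footprints geoprocessing tool as the author of this sketch understands them and should be checked against the tool reference, and mgmt is expected to be arcpy.management.

```python
def footprint_settings(has_nodata_borders):
    """Return Build Footprints settings for the two cases in the text."""
    if has_nodata_borders:
        # Rectified images with black or white borders, JPEG compressed.
        return {
            "reset_footprint": "RADIOMETRY",
            "min_data_value": 3,       # skip compression artifacts near 0
            "approx_num_vertices": 8,
            "request_size": 1000,
        }
    # Non-rectified images: the full image extent is the footprint.
    return {"reset_footprint": "GEOMETRY", "approx_num_vertices": 4}


def build_footprints(mgmt, mosaic_dataset, has_nodata_borders):
    """Run Build Footprints; mgmt is assumed to be arcpy.management."""
    settings = footprint_settings(has_nodata_borders)
    mgmt.BuildFootprints(mosaic_dataset, **settings)
    return settings
```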

Do I need to build a boundary?

Yes. After adding the data to the source mosaic dataset, run the Build Boundary tool with the method set to Envelope. This will ensure that the envelope of the image footprint is used to create the boundary. By using this method, the boundary is uncomplicated and will require the least number of vertices. This is important when the volume of data is large.
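
A minimal sketch of this step, assuming mgmt is arcpy.management and that the Envelope method corresponds to the tool's simplification_method keyword; the function name is hypothetical.

```python
def build_envelope_boundary(mgmt, mosaic_dataset):
    """Build the boundary as the envelope of the footprints.

    mgmt is assumed to be arcpy.management.
    """
    mgmt.BuildBoundary(mosaic_dataset, simplification_method="ENVELOPE")


# Example call with ArcGIS installed (path is a placeholder):
# import arcpy
# build_envelope_boundary(arcpy.management, r"C:\data\browse.gdb\src_2010")
```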

Do I need to create overviews?

It's not recommended to use Update Overviews when creating source mosaic datasets.

Derived mosaic datasets

If you are working with multiple collections of browse imagery (i.e. more than one mosaic dataset), you may want to merge your source mosaic datasets into a single derived mosaic dataset to streamline data management.

With all data collections managed in a single derived mosaic dataset, user applications can easily search the attribute table to access image metadata, and then select imagery by date or any other attribute.

How should I set up my mosaic dataset?

The MDCS Python script, using an appropriate XML configuration file, automatically sets the mosaic dataset properties.

The recommended settings are listed here (any properties not mentioned should use the default setting). To set these properties manually, run the Set Mosaic Dataset Properties tool.

When creating your derived mosaic dataset, define it according to the following guidelines:

  • Define the number of bands and pixel depth for the derived mosaic dataset based on the maximum number of spectral bands, and the maximum pixel depth, of any source dataset. For browse imagery, this will likely be three-band, 8-bit data.
  • Set the spatial reference for the derived mosaic dataset to accommodate the maximum extents expected for your project. Web Mercator Auxiliary Sphere is typically used for compatibility with web maps.
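
The two guidelines above can be sketched as a single creation call for the typical browse-imagery case (three bands, 8-bit, Web Mercator Auxiliary Sphere, whose well-known ID is 3857). The function name is hypothetical, and arcpy_module is expected to be the arcpy package, passed in so the sketch can run without an ArcGIS installation.

```python
WEB_MERCATOR_WKID = 3857  # WGS 1984 Web Mercator (Auxiliary Sphere)


def create_derived_mosaic(arcpy_module, workspace, name):
    """Create a three-band, 8-bit mosaic dataset in Web Mercator.

    arcpy_module is assumed to be the arcpy package.
    """
    spatial_reference = arcpy_module.SpatialReference(WEB_MERCATOR_WKID)
    arcpy_module.management.CreateMosaicDataset(
        workspace,
        name,
        spatial_reference,
        num_bands=3,
        pixel_type="8_BIT_UNSIGNED",
    )


# Example call with ArcGIS installed (paths are placeholders):
# import arcpy
# create_derived_mosaic(arcpy, r"C:\data\browse.gdb", "derived_browse")
```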

For the derived mosaic dataset, the recommended properties settings are:

| Attribute | Value | Comments |
|---|---|---|
| max_num_per_mosaic | 100 | To allow larger number of overlapping browse imagery to be displayed. |
| rows_maximum_imagesize | 4000 | Sufficient for largest screens, but not for large plots. |
| columns_maximum_imagesize | 4000 | Sufficient for largest screens, but not for large plots. |
| default_compression_type | JPEG | |
| JPEG_quality | 75 | |
| resampling_type | Bilinear | |
| clip_to_footprints | Yes | Important due to potential for large areas of NoData. |
| clip_to_boundary | No | Boundary should be the full extent of all data. |
| footprints_may_contain_nodata | No | Clip_to_footprints=yes will remove NoData. |
| allowed_mosaic_methods | ByAttribute, LockRaster, Closest to Center | |
| default_mosaic_method | ByAttribute | |
| order_field | AcquisitionDate | |
| order_base | 2050/1/1 | Set to a date in the future. |
| blend_width | 0 | Not required (blending is intended to create a seamless mosaic). |
| cell_size_tolerance | 10 | Such filtering is not required. |
| minimum_pixel_contribution | 100 | To remove insignificant contributors. |
| transmission_fields | TBD | Select those that are required. |
| use_time | Yes | Time is a key part of such services. |
| start_time_field, end_time_field | AcquisitionDate | |

Value and description of the properties set for a derived mosaic dataset
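
For setting these properties from a script rather than through MDCS, a minimal sketch follows. The keyword spellings for the Yes/No and method values (for example CLIP, NOT_CLIP, ENABLED, Center) are the sketch author's understanding of the Set Mosaic Dataset Properties tool and should be checked against the tool reference. transmission_fields is left out because the table lists it as TBD. The function name is hypothetical, and mgmt is expected to be arcpy.management.

```python
DERIVED_PROPERTIES = {
    "max_num_per_mosaic": 100,
    "rows_maximum_imagesize": 4000,
    "columns_maximum_imagesize": 4000,
    "default_compression_type": "JPEG",
    "JPEG_quality": 75,
    "resampling_type": "BILINEAR",
    "clip_to_footprints": "CLIP",
    "clip_to_boundary": "NOT_CLIP",
    "footprints_may_contain_nodata": "FOOTPRINTS_DO_NOT_CONTAIN_NODATA",
    "allowed_mosaic_methods": "ByAttribute;LockRaster;Center",
    "default_mosaic_method": "ByAttribute",
    "order_field": "AcquisitionDate",
    "order_base": "2050/1/1",  # a date in the future
    "blend_width": 0,
    "cell_size_tolerance": 10,
    "minimum_pixel_contribution": 100,
    "use_time": "ENABLED",
    "start_time_field": "AcquisitionDate",
    "end_time_field": "AcquisitionDate",
}


def apply_derived_properties(mgmt, mosaic_dataset, properties=None):
    """Apply the recommended settings; mgmt is assumed to be arcpy.management."""
    settings = dict(DERIVED_PROPERTIES if properties is None else properties)
    mgmt.SetMosaicDatasetProperties(mosaic_dataset, **settings)
    return settings
```

Keeping the settings in one dictionary makes it straightforward to compare them with the table and to override a single value for testing.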

How do I add my source mosaic datasets to my derived mosaic dataset?

All data added to the derived mosaic dataset should be contained within source mosaic datasets, and added to the derived mosaic dataset using the parameters below (no individual rasters should be added to the derived mosaic dataset).

| Parameter | Setting | Notes |
|---|---|---|
| Raster type | Table | |
| Calculate cell sizes | Off | Cell sizes should be dealt with at the source mosaic level |
| Update Boundary | Off | If On, the system attempts to regenerate the boundary based on potentially millions of footprints. Typically, the boundary will be approximated to an envelope. |
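
These parameters can be sketched as a call to the Add Rasters To Mosaic Dataset tool. The function name is hypothetical, the keyword values (NO_CELL_SIZES, NO_BOUNDARY, NO_OVERVIEWS) are the sketch author's understanding of the tool and should be checked against its reference, and mgmt is expected to be arcpy.management. Overviews are switched off as well, in line with the guidance that overviews are not recommended for browse imagery.

```python
def add_source_to_derived(mgmt, derived_mosaic, source_mosaic):
    """Add one source mosaic dataset to the derived mosaic dataset.

    mgmt is assumed to be arcpy.management.
    """
    mgmt.AddRastersToMosaicDataset(
        derived_mosaic,
        "Table",                                 # Raster type
        source_mosaic,
        update_cellsize_ranges="NO_CELL_SIZES",  # Calculate cell sizes: Off
        update_boundary="NO_BOUNDARY",           # Update Boundary: Off
        update_overviews="NO_OVERVIEWS",
    )


def add_all_sources(mgmt, derived_mosaic, source_mosaics):
    """Add every source mosaic dataset in turn and return how many were added."""
    for source in source_mosaics:
        add_source_to_derived(mgmt, derived_mosaic, source)
    return len(source_mosaics)
```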

How should I populate metadata?

After adding the source mosaics, ensure Dataset_ID and other key metadata fields have been copied into the derived mosaic dataset for all records.

Do I need to index fields?

Yes. Indexing a field in a database table presorts a key field that is commonly used for searching. Once indexes are built for the key fields, the database can quickly search records based on any of those fields.

Because browse mosaic datasets contain large numbers of images, it is important to optimize search capabilities by indexing commonly searched fields, such as AcquisitionDate or CloudCover.

To do this, run the Add Attribute Index geoprocessing tool for all fields likely to be searched.
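
A minimal sketch of indexing several fields in one pass follows. The function name and the idx_ naming convention are assumptions, and mgmt is expected to be arcpy.management, whose AddIndex function corresponds to the Add Attribute Index tool.

```python
def index_search_fields(mgmt, mosaic_dataset, fields):
    """Create one attribute index per field and return the index names.

    mgmt is assumed to be arcpy.management.
    """
    index_names = []
    for field in fields:
        index_name = f"idx_{field}"  # hypothetical naming convention
        mgmt.AddIndex(mosaic_dataset, field, index_name)
        index_names.append(index_name)
    return index_names


# Example call with ArcGIS installed (path is a placeholder):
# import arcpy
# index_search_fields(arcpy.management, r"C:\data\browse.gdb\derived_browse",
#                     ["AcquisitionDate", "CloudCover"])
```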

How should I deal with footprints and NoData values in my imagery?

The source mosaic datasets should have footprints already defined. Derived mosaic dataset properties should be set as follows:

  • Clip the raster to its footprint: YES
  • NoData value: do not define

Do I need to create overviews?

No. Due to the dynamic nature of the mosaic dataset supporting browse imagery, generating overviews is not recommended.

If reduced-resolution views are desired to provide users with geographic context, use a contiguous and consistent dataset (for example, the Multispectral Landsat data) in place of overviews. Users can also reference a basemap layer (accessed via ArcGIS Online) to provide geographic context while browsing.

Ensure such added overview images have their category set to 3 (overview) in the attribute table, which will exclude them from most searches.