Imagery formats and performance—Imagery Workflows

Generally, it is best to leave imagery in its original form. When imagery is processed so that the pixels are sampled (for example, to change the projection), this leads to degradation in quality, possible artifacts, creation of NoData areas, and issues managing additional files. With imagery used for analysis or high-quality interpretation, it is preferable (and sometimes necessary) to ensure that the pixel values do not change.

In some cases, it is advisable to change the format of the imagery to make it faster to access. This does not involve sampling the imagery, but it does result in data duplication. The original data is often then archived. Including lossy or lossless compression in the format conversion is optional.

When a new imagery or raster product is created or persisted from another data source (such as a digital terrain model generated from stereo imagery) then the projection, pixel alignment, pixel depth, format and compression should be carefully chosen to ensure appropriate dynamic range, precision, and fast access.

Many traditional image workstation applications first read the complete image into memory and then allow a user to make changes to the image before saving it. This is not a scalable approach. To enable scalability for large numbers of large images, ArcGIS reads on demand only the required imagery from storage. The performance of the storage system as well as the format of the imagery can have a significant effect on the performance of the imaging system.

File format suitability

The following six factors have the greatest influence on the suitability of a file format:

Internal tiling of the rasters
A raster may be stored as a simple, non-compressed array of pixels on disk. When an extent of an image needs to be read, the system determines the location on disk for the first part of that extent, skips to that part, and starts reading only the appropriate rows in the file. Because storage is managed as blocks on disk, the system must still read every block that stores any part of those rows, so the amount of data read is substantially more than just the section of the raster required.
With larger files (more than 3,000 columns), the read access can be enhanced by using a tiled image format that breaks the image internally into smaller tiles (typically 256x256 or 512x512 pixels). This makes it faster to access a group of pixels representing a rectangular extent. Most modern file formats are either tiled or include options for tiling. The TIFF format can be tiled or non-tiled. By default, ArcGIS writes a tiled TIFF. If the imagery is tiled, then it is necessary to also maintain an index to the different tiles.
Volume of data required to be read
For any group of pixels to be processed, those pixels must be read from storage. If the data volume to be read is reduced, this can improve performance. ArcGIS includes a lot of optimization to minimize the volume of imagery that is read, including internal caching and limiting the number of files that need to be read. The volume of data read can also be significantly reduced by using compression. Typically, a natural color image can be compressed 5-10 times with negligible difference in image quality. Such compression substantially reduces the volume of data read from disk and can have a positive influence on performance, especially on systems with slower storage.
Lossless compression algorithms, such as LZW/Deflate, can be effectively used to compress imagery that contains a large number of NoData values, but generally provide minimal compression of optical imagery. Lossy compression algorithms can substantially reduce data volumes, but may add unacceptable artifacts to imagery, especially if it is to be used for analysis. Compressing imagery also increases the compute load for reading the imagery. Some compression algorithms are much more compute-efficient than others.
JPEG is a very good lossy compression format due to its speed and the good compression ratio achieved. Both 8-bit and 12-bit JPEG compression are supported in ArcGIS. LERC compression is a very efficient lossless or controlled lossy compression that minimizes the compute load.
Amount of processing power required to decompress the image
Compression requires that the system decompress the imagery before any processing can be applied. Some compression formats, such as JPEG 2000, are compute-intensive to decompress. For workstation applications where a complete but relatively small image is first read into memory, such compressions are useful, but they are not recommended for use on servers that perform processing.
If imagery is suitable for lossy compression, then JPEG compression (different from the JPEG format), which is compute-inexpensive to decompress due to the relative simplicity of the algorithms, is recommended. The widespread use of JPEG compression also means the codecs are hardwired into many CPUs. Although JPEG compression does not provide quite as high compression as some of the wavelet compression algorithms (like JPEG 2000), the difference is relatively minor. Typically, an 8-bit natural color image can be compressed about 8 times using JPEG compression, which gives similar quality to a wavelet-compressed image with 12 times compression. In relation to the 8x compression, the approximate additional 30% compression provided by wavelet compression algorithms is often not warranted, since the compute load to read formats such as JPEG 2000 can be 4-10 times greater.
For lossless compressed imagery, LZW (or Deflate) compression provides good compression with low compute load. Other lossless compression algorithms can provide a bit more compression, but the additional compute load generally outweighs the small increase in compression. ArcGIS includes support for Limited Error Raster Compression (LERC), which provides high compression of rasters with control over the precision of the data and is very fast to both compress and decompress. LERC is most valuable when working with elevation data or higher bit-depth imagery. It is also very valuable when working with optical imagery, where it can be used as a lossless compression of 8-bit data providing about 25% compression. For higher bit-depth optical imagery that is typically stored as 16 bits per channel, LERC can reduce file size to about half losslessly, or further depending on the tolerance defined. LERC can be used for the transmission of data from the server to the client and is also used in the MRF and CRF file formats.
Existence of pyramids
Pyramids are reduced-resolution datasets used to read imagery at lower resolutions. Pyramids are recommended, especially for larger files. Pyramids can often be compressed even if the base data is not compressed. More details about creating pyramids are provided below.
Location and type of metadata
The way in which the metadata is stored with the file can influence the access speed. When a file is opened to read the pixels, it is often necessary to read the file's metadata to obtain georeferencing information and access properties such as spatial reference and number of rows and columns. Formats such as GeoTIFF store metadata in tags that can be quickly accessed. In some types of TIFF formats, these tags may be spread across the file, making them slow to access. ArcGIS uses tags in suitable file formats (such as TIFF or NITF), but can also write metadata to a small .aux.xml file stored next to the file to enable faster access while providing an extensible method for storing additional metadata.
When using mosaic datasets, the method by which metadata is stored in the source's files is less important, because the metadata is read when raster datasets are added, then embedded in the mosaic dataset for faster access when required.
Maintaining NoData values
Many raster datasets need to define NoData values for parts of the raster where there is no data. A typical example is the edge of a raster that has been projected. Such NoData areas are typically managed by defining a value as representing NoData. Maintaining NoData can be problematic, especially with lossy compression formats (like JPEG or JPEG 2000) where the pixel values are often changed. The MRF format has a way of maintaining NoData values for both lossless and lossy data. It is important to ensure that if NoData values are required, they are defined before data is converted or pyramids are created. Incorrectly defining NoData, or not defining it at all, can result in artifacts at the edges of full-resolution images or in the pyramid levels.
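The effect of internal tiling described above can be approximated with a small calculation. The following is a minimal sketch under a simplified model, not a description of how ArcGIS reads data: it assumes that an untiled raster requires every row touched by the requested window to be read in full, and that a tiled raster requires every intersecting tile to be read whole. The function names and the example raster dimensions are hypothetical.

```python
def untiled_pixels_read(raster_cols, win_rows):
    """Simplified model: each row touched by the window is read in full."""
    return raster_cols * win_rows


def tiled_pixels_read(win_col0, win_row0, win_cols, win_rows, tile=512):
    """Pixels read when every tile intersecting the window is fetched whole."""
    first_tile_col = win_col0 // tile
    last_tile_col = (win_col0 + win_cols - 1) // tile
    first_tile_row = win_row0 // tile
    last_tile_row = (win_row0 + win_rows - 1) // tile
    n_tiles = (last_tile_col - first_tile_col + 1) * (last_tile_row - first_tile_row + 1)
    return n_tiles * tile * tile


# Hypothetical request: a 1,000 x 1,000 pixel window starting at
# column 10,000, row 10,000 in a raster that is 40,000 columns wide.
untiled = untiled_pixels_read(raster_cols=40_000, win_rows=1_000)
tiled = tiled_pixels_read(10_000, 10_000, 1_000, 1_000, tile=512)

print(f"untiled: {untiled:,} pixels, tiled: {tiled:,} pixels")
print(f"untiled reads about {untiled / tiled:.0f}x more pixels than tiled")
```

Under these assumptions the tiled layout reads nine 512x512 tiles, while the untiled layout reads 1,000 full rows. The actual difference depends on the disk block size, the file layout, and caching.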

Recommended imagery formats

There are three recommended file formats to use in ArcGIS:

TIFF that is internally tiled with overviews and suitable JPEG or LZW/Deflate compression (i.e. tiled TIFF)
This is a generic standard format that is suitable for most applications. There is a flavor of TIFF referred to as COG (Cloud Optimized GeoTIFF) that is the same as tiled TIFF, except that the location of the index and overviews is slightly more efficient.
Meta Raster Format (MRF), using LERC or JPEG compression, with pyramids
MRF is optimized especially for cloud storage, but also works well for enterprise storage. MRF splits a raster into three separate files, which can be stored separately. The main file is very small, and includes key properties and references to identify the files used to store the index and data tiles. This structure optimizes data access. MRF can use multiple compressions including LERC, JPEG (8-bit and 12-bit), and Deflate. The combination of compression and efficiency can result in file access that is about 35% faster than COG for cloud storage.
Note:
To learn more about converting data to MRF, GeoTIFF, or COG, see the OptimizeRasters GitHub repo. OptimizeRasters is a tool that efficiently converts large collections of imagery to an optimal format, and can also be used to upload data to the web and perform other enhancements that optimize data access.
Cloud Raster Format (CRF)
CRF is primarily a format used as part of distributed raster analysis, but can also be used as a raster dataset. It is optimized for writing and reading very large rasters. Internally, the large rasters are broken down into smaller bundles of tiles, which allows multiple processes to write simultaneously to a single raster.

Another format that is often recommended is tile cache. Tile cache is similar to CRF, but limited to only three bands and 8-bit data. It is primarily used for serving basemap imagery.

If imagery is to be converted, follow these recommendations for your data type:


Data type	Recommendations
8-bit or 16-bit, 1-, 3-, or 4-band rasters where lossy compression is not suitable	Use MRF with LERC compression or TIFF with LZW/Deflate compression. These formats include tiling with tiles of size 512, 256, or 128. Smaller tile sizes work best for scientific data where access to temporal profiles is more common.
8-bit, 3-band natural color imagery already preprocessed by orthorectification, color balanced, mosaicked, and cut into tiles	This imagery is generally used as background imagery, and should be converted directly to a tile cache or stored as MRF or TIFF with JPEG YCbCr compression. Typically, a quality value of around 80 is used, which provides approximately 8-times compression. YCbCr-based JPEG compression internally converts the image to a different spectral domain, improving the compression.
16-bit or 32-bit, 1-band elevation data	Use MRF with LERC compression, or TIFF with LZW/Deflate compression, tiled at 128 or 256. For 16-bit elevation, be sure that JPEG is not used.
8- or 16-bit imagery where lossy compression is suitable	Use MRF or TIFF with JPEG (YCbCr) compression. The quality should be checked by testing on some sample imagery. In many cases, a quality factor of 90 is suitable. Note that ArcGIS supports a 12-bit version of JPEG. Therefore, when compressing 16-bit pan imagery using JPEG, only the first 12 bits of the imagery will be used. Many modern sensors have a sensitivity in the range of 11 - 14 bits, and using 12-bit compression maintains the majority of the image content but excludes the last (often noisy) bits.
8-bit or 16-bit, 3-band, non-natural-color imagery when lossy compression is suitable	Examples of this kind of imagery include false color imagery or scanned maps. Use MRF or TIFF, with JPEG (RGB) compression. In RGB JPEG compression, each band is compressed separately.
8-bit or 16-bit, 4-band RGB-IR	This is the format often captured by modern digital sensors. If the data has been orthorectified and enhanced, then some of the original image information has been lost, potentially limiting its use for some forms of analysis. For such imagery, lossy compression may be suitable, but care should be taken to quantify the effects on any intended future analysis. It is then recommended to convert such imagery into a 3-band RGB and 1-band NIR image and use the above recommendations for compressing each. Splitting into a separate RGB image enables better compression, and most users will likely access the RGB component more than the NIR. In ArcGIS, one can virtually merge the two files to create an RGB-IR image suitable for displaying as false color or computing NDVI. Typically for such imagery, the compression quality is set higher, to 90 or 95, so that compression does not add significant artifacts to NDVI. When using JPEG compression, the recommended quality values to use can range from 80 to 95. It is best to try different factors on sample images and review the differences to determine an optimal value.
8-bit or 16-bit, 5-band RGB-IR with a panchromatic band	Many sensors include 4-band multispectral imagery and 1-band higher-resolution panchromatic imagery. If you are maintaining the IR band, it is recommended to not pre-pan-sharpen such imagery. Pan-sharpening changes the multispectral properties of the bands, reducing the suitability of the imagery for analysis, and significantly increases the file sizes. Instead, maintain the 4-band multispectral and 1-band panchromatic as separate rasters, and use the capability of ArcGIS to pan-sharpen on the fly, which is performed very fast and ensures that the integrity of the spectral bands is not lost. If you need to compress the imagery to reduce size, the panchromatic band should use JPEG compression, as the panchromatic band is typically much larger than the multispectral image and is not used for spectral analysis. Limited JPEG compression (for example, Q90) has minimal effect on visual interpretation or computation of tie points or DSM generation. When using JPEG compression, the recommended quality values to use can range from 80 to 95. It is best to try different factors on sample images and review the differences to determine an optimal value. Another option is to pan-sharpen and store the 3-band RGB image with lossy JPEG compression. This image can be used for visual interpretation. Then store the lower-resolution multispectral red and IR bands separately for use in analysis.
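Some of the recommendations in the table can be expressed as GeoTIFF creation options. The following is a minimal sketch that assumes the open-source GDAL library is used for the conversion, which this topic does not prescribe (it names the Copy Raster tool and OptimizeRasters). The helper function and its parameters are hypothetical. The option names (TILED, BLOCKXSIZE, BLOCKYSIZE, COMPRESS, JPEG_QUALITY, PHOTOMETRIC) are creation options of the GDAL GTiff driver.

```python
def geotiff_creation_options(lossy, natural_color=False, quality=80, tile=512):
    """Build a list of GDAL GTiff creation options for a tiled TIFF.

    lossy         -- True to use JPEG compression, False to use Deflate
    natural_color -- True for 3-band natural color (adds YCbCr photometric)
    quality       -- JPEG quality; the table suggests values from 80 to 95
    tile          -- internal tile size in pixels (512, 256, or 128)
    """
    options = ["TILED=YES", f"BLOCKXSIZE={tile}", f"BLOCKYSIZE={tile}"]
    if lossy:
        options += ["COMPRESS=JPEG", f"JPEG_QUALITY={quality}"]
        if natural_color:
            # YCbCr applies only to 3-band natural color imagery.
            options.append("PHOTOMETRIC=YCBCR")
    else:
        # Lossless: suited to elevation or analysis data.
        options.append("COMPRESS=DEFLATE")
    return options


background = geotiff_creation_options(lossy=True, natural_color=True, quality=80)
elevation = geotiff_creation_options(lossy=False, tile=256)
print(background)
print(elevation)

# If GDAL's Python bindings are installed, the list could be passed on as:
#   from osgeo import gdal
#   gdal.Translate("out.tif", "in.tif", creationOptions=background)
```

The sketch only builds the option list, so it runs without GDAL installed. The quality value should still be verified on sample imagery as recommended above.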

Reformatting imagery

If imagery is not in an optimized format, reformatting the rasters is recommended if possible.

The following formats should be reformatted:


Format	Reformatting recommendations
.jpg	JPEG files larger than 3,000 columns are slower to read, because JPEG is not tiled; therefore, access to the last pixel of the file requires the complete file to be decompressed. When converting to TIFF with JPEG compression, try to use the same quality factor and type (YCBCR or RGB) as the original data.
.asc	ASCII text files, sometimes used to store elevation, are inherently slow to read as they are unnecessarily large and need to be interpreted.
.dem	Internally, some variations of the format store the numbers as ASCII.
.jp2	It is recommended to test the difference in performance after converting a sample file to MRF or TIFF. There are many variants of JPEG 2000. Some can be very costly to decompress or to access pyramids. In some cases, it is advisable to leave the format as JP2, but create an additional set of pyramids (optionally skipping the first pyramid level).
.ecw	This proprietary format has limitations on use in a server environment. The format is often used for preprocessed imagery, so a better format for storage and serving is a map cache. Conversion may result in the file size increasing by about 30-40 percent, but it will be in a web-optimized format. Since wavelet artifacts are different from JPEG artifacts, the conversion of highly compressed ECW to JPEG often results in unnecessary additional artifacts. Where possible, it is advantageous to obtain and compress the original imagery if available.
.flt	Sometimes used for elevation data. Conversion is recommended if the number of columns is greater than 2,000.
.tif	If untiled (for example, the raw format from most data providers), it is advisable to convert to tiled TIFF. A typical example would be imagery from most satellite vendors, which generally deliver TIFF files as RAW and untiled to be compatible with legacy systems that may not be able to handle tiled TIFF. Tiling these files will improve performance and reduce the load on disk storage systems.

It is generally not necessary to reformat the following formats:


Format	Reformatting recommendations
.nitf	Generally, this format can be read quickly, and extensive metadata support and complexity make other formats unsuitable. There are some formats of NITF that have forms of JPEG 2000 compression pyramids that are very slow to read, in which case consider creating additional pyramids. In some cases, to improve performance it is necessary to convert to a different compressed NITF format.
.sid (MrSID)	The MG2 and MG3 versions of the format are read relatively quickly and include pyramids. The MG4 format appears to be slower to read and it is advisable to test.

Reformatting imagery is not always necessary and may not be required at the start of a project, when time to load the imagery, additional data storage, or support by legacy systems is a concern. It is often advantageous to leave the imagery in its nonoptimized form initially, while making imagery accessibility faster, then later optimize the format of the imagery.

Optimizing the format can often be achieved without changing the mosaic datasets. It is generally possible to convert the format and just change the path of the raster in the mosaic dataset, as long as the other properties of the images do not change.

For individual rasters, the Copy Raster tool in ArcGIS can be used to convert images. When working with directories of imagery, it is recommended to use OptimizeRasters.

Pyramids

If images have more than about 2,000 rows and columns, it is advantageous to create pyramids. Pyramids are reduced-resolution versions of the imagery that enable faster access at a smaller scale.

Pyramids can be internal to the files or external in the form of .ovr files or .rrd files. The creation of external overviews has the advantage that the original files are not modified, and if necessary, they can be easily removed to reduce space.

The simplest method to create pyramids in ArcGIS on existing data is to use the Build Pyramids And Statistics tool. This creates pyramids for all rasters within a given workspace.

Pyramids are stored in a single file that generally resides next to the source rasters. They will be given the same name as the source, with an *.ovr extension. Internally, these are actually TIFF files created with multiple 2:1 downsampled resolutions. ArcGIS also supports TIFF files with internal pyramids and older .rrd pyramids.

In most cases, pyramids can be compressed even if the original data sources contain no compressed data, since analysis is typically not performed on the pyramids of files. If the data source and pyramids are not compressed, the pyramids will take one-third additional storage space. If the source is not compressed and the pyramids are compressed, the additional storage can be as small as a few percent of the original size. If the original data is compressed, then even if using the same compression as the source, the overviews may be about 40 percent of the source due to the higher frequency of image content typically in the overviews, which means that compression is less.
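The one-third figure follows from a geometric series: with 2:1 downsampling, each pyramid level holds one quarter of the pixels of the level above it, so the levels together approach 1/4 + 1/16 + 1/64 + ... = 1/3 of the source pixel count. The sketch below illustrates this arithmetic. The function name and the compression ratios used in the example are hypothetical.

```python
def pyramid_overhead(levels=12, factor=2, source_ratio=1.0, pyramid_ratio=1.0):
    """Pyramid storage as a fraction of the stored size of the source.

    factor        -- downsampling factor between levels (2 means 2:1)
    source_ratio  -- compression ratio of the source (1.0 = uncompressed)
    pyramid_ratio -- compression ratio of the pyramids (1.0 = uncompressed)
    """
    # Each level holds 1 / factor**2 of the pixels of the previous level.
    pixel_fraction = sum((1 / factor**2) ** k for k in range(1, levels + 1))
    return pixel_fraction * source_ratio / pyramid_ratio


# Source and pyramids both uncompressed: about one-third extra storage.
print(f"{pyramid_overhead():.1%}")

# Uncompressed source, pyramids compressed 8x (hypothetical ratio):
# a few percent of the original size.
print(f"{pyramid_overhead(pyramid_ratio=8):.1%}")
```

The 40 percent figure for compressed sources is an empirical observation about how well overviews compress, so it is not reproduced by this simple model.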

When creating pyramids, there are environmental variables that control how they are generated. These include the following:


Environmental variable	Notes
Compression method (pyramids)	In most cases it is suitable to compress the overviews. For natural color 3-band imagery, JPEG YCBCR is recommended. For panchromatic or other imagery, JPEG is recommended. For elevation or categorical data, LZW is recommended. Typically, the compression factor for JPEG can be set to 80. Note that even if the imagery is 16-bit (such as common satellite imagery), JPEG YCBCR or JPEG (RGB) can be used, since ArcGIS supports a 12-bit version of JPEG that is generally suitable for such imagery.
Sampling method	For optical imagery, it is advisable to use bilinear sampling, since this generally provides better-quality imagery when viewed at smaller scales. However, bilinear sampling can result in artifacts at the edges of images that include black (or white) pixels that may be used to define NoData. ArcGIS will correctly handle such NoData pixels and will not create the artifacts if the NoData pixels are correctly defined in the dataset. Therefore, for imagery that does have NoData values, it is recommended that the NoData values are defined prior to creating the pyramids. For categorical data, the nearest (or majority) sampling method should be used. For datasets such as elevation, more careful consideration needs to be taken as to what sampling method is used, but in most cases bilinear is still recommended. Note that using nearest sampling with a factor of 2 will result in a half-pixel shift at each overview level due to alignment of the image extents. If nearest neighbor sampling is required, it is generally better to set the sampling factor to 3 to avoid such shifts, although this can affect the performance at smaller scales by about 20 percent.

Statistics

Statistics are used by the system primarily to ensure suitable default display of the images. If statistics exist with a raster dataset, ArcGIS will apply a stretch to the imagery to make the imagery appear brighter. If statistics are not present, then when displaying a single image, the system will attempt to approximate statistics by reading the central part of the imagery. As a general rule, statistics should be created for satellite scenes and datasets such as elevation where the range of valid data values can be large. If using imagery that has been preprocessed, such as color-corrected imagery, then statistics are not required, since such imagery should not be additionally stretched. The creation of statistics should be done at the same time (and using the same tool) as pyramids.

Creating statistics can be time-consuming on large datasets, as it requires the complete image to be read. An environmental variable called skip factor can be used to control how statistics are created to reduce the reading time. To reduce the time to create statistics, the skip factor can be set so that not all the pixels are read. One way to identify a reasonable skip factor value is to divide the number of columns by 1,000 and use the quotient (integer) as the skip factor. Using such a skip factor only reduces the time taken if pyramids exist.
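The rule of thumb above is a single integer division. The following sketch shows it with a hypothetical helper function. The lower bound of 1 is an assumption added here so that small rasters still get a valid value, on the understanding that a skip factor of 1 reads every pixel.

```python
def suggested_skip_factor(columns):
    """Divide the column count by 1,000 and keep the integer quotient.

    The minimum of 1 is an assumption: it keeps the result valid for
    rasters with fewer than 1,000 columns.
    """
    return max(1, columns // 1000)


print(suggested_skip_factor(25_000))
print(suggested_skip_factor(800))
```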

Working with large mosaics

Many organizations receive large, pre-processed mosaics created from collections of images that are orthorectified, color corrected, and mosaicked to cover a large area. In some cases, such data is split into rectangular tiles, while in others the data is combined into a single large file. The creation and use of a single large file can be problematic and so typically the data is tiled.

Within ArcGIS, the optimum way of working with very large rasters is to use tile cache or CRF. Both formats are designed to handle very large datasets and internally split the virtual raster into multiple files.

If the data is not delivered as tile cache or CRF, it is recommended that the data be delivered as large TIFF tiles, where each file is about 1 GB. It is preferable that there is some overlap (around 50 pixels) between the tiles. Such overlaps can reduce the potential for artifacts (such as gaps between images) when the data is projected, especially at small scales.
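A tiling scheme with overlap can be sketched as follows. This is an illustration only: the function names are hypothetical, the tile edge is derived from an uncompressed size budget (a compressed 1 GB file would hold more pixels), and the mosaic dimensions in the example are made up.

```python
import math


def uncompressed_tile_edge(target_bytes, bands, bytes_per_sample):
    """Edge length in pixels of a square tile that fits an uncompressed size budget."""
    return math.isqrt(target_bytes // (bands * bytes_per_sample))


def tile_windows(width, height, tile, overlap=50):
    """Return (col_off, row_off, cols, rows) windows covering a raster.

    Neighboring windows share `overlap` pixels. Windows at the right and
    bottom edges are clipped to the raster extent.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than the tile size")
    step = tile - overlap
    windows = []
    # A window starting within `overlap` pixels of the far edge would be
    # fully covered by its neighbor, so the ranges stop before that point.
    for row in range(0, max(height - overlap, 1), step):
        for col in range(0, max(width - overlap, 1), step):
            windows.append((col, row, min(tile, width - col), min(tile, height - row)))
    return windows


# 8-bit, 3-band imagery with an uncompressed budget of about 1 GB per tile.
edge = uncompressed_tile_edge(1_000_000_000, bands=3, bytes_per_sample=1)

# Hypothetical 100,000 x 60,000 pixel mosaic.
windows = tile_windows(100_000, 60_000, tile=edge, overlap=50)
print(edge, len(windows))
```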

Storage system performance

In an imagery environment with large volumes of imagery, it is not possible to read the complete imagery into memory, so the system must read the imagery from the disk system on demand. Therefore, the performance of the disk system is important, and becomes more significant when using formats that are not tiled and not compressed, as they put more load on the disk. Many users simultaneously accessing the server also put additional load on it. When the imagery is stored in a different location from the server and connected via a network, the network can quickly become a significant bottleneck.

Many issues related to poor performance of an image server implementation are related to poor storage systems. The performance of disk subsystems varies considerably, and in most cases the performance is either very good or very bad. Unlike CPU and memory performance, where the difference is generally measured in percentages, the performance of different storage systems can often vary by a factor of 10. The price of the disk system is not a good measure of performance or suitability for image server tasks. The following are some general recommendations:



Smaller, dedicated systems	Direct Attached Storage (DAS) systems generally provide high performance at lowest cost, but provide limited scalability. In some cases, it can be advantageous to have a server configured specifically for imagery and have the imagery on a dedicated DAS. To scale by a factor of two and/or have 100 percent redundancy, duplicate the server and mirror the DAS. This method has limited scalability.
Larger systems	Using a NAS (or a SAN with a NAS head) is generally recommended. This enables simpler scaling in the number of servers. Often it is advisable to have a separate NAS dedicated to imagery. This enables the NAS to be connected to the servers using a dedicated switch and NICs or using a dedicated fiber channel, and also removes potential contention with other traffic on the network.
Cloud storage	Cloud infrastructure offers blob storage such as Amazon Web Services S3 or Azure Blob storage. Such storage is low cost and very reliable, but has worse latency than direct access storage or NAS. This can be mostly overcome by using appropriate data structures and some local caching. When managing imagery using cloud infrastructure, you should do the following: Convert the data to MRF format or COG as the data is copied to the cloud. Once the data is in cloud storage, create local raster proxy files that reference the rasters in the cloud. Both steps can be done using OptimizeRasters. Note: Raster proxies are small files that reference the source data in the cloud. They have the same extension as the original imagery, and are treated like any other raster file by ArcGIS, but they contain only key metadata. Raster proxies work best if the source data is an MRF file, but they can also reference other formats, such as TIFF. Raster proxies can also be used to define local caches stored on fast ephemeral disks. This ensures that tiles of data read from the cloud storage are locally cached for subsequent access. This can increase performance, especially for cloud storage or slower NAS storage.
