Preparing input data: Managing Browse Imagery

This section discusses best practices and specifications for preparing your data for the Managing Browse Imagery workflow.

Imagery requirements

What imagery specifications are required for this workflow?

The Managing Browse Imagery workflow is intended to work with specific types of data. The basic data requirements are listed in the table below.

Parameter	Requirement
Acceptable file formats	TIFF with JPEG compression (for panchromatic imagery); TIFF with JPEG_YCBCR compression (for natural color); JPEG format; PNG or other formats
Source	Reduced-resolution version of your organization's imagery
Bit depth	Typically, 8-bit
Band configuration	Typically, three-band

Note:

JPEG-format images are easier to access using a web browser. However, TIFF files with JPEG compression are easier to serve, since JPEG files need to be decompressed completely (from beginning to end) before any part of the image can be read.

If you are using larger browse imagery (more than about 2,000 columns or rows), converting your files to TIFF will improve performance.
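As a rough illustration of the size guideline above, the following sketch flags browse images that might benefit from conversion to TIFF. The function name and the constant are hypothetical; the cutoff is only the "about 2,000" columns or rows mentioned in the note, not an exact limit defined by the workflow.

```python
# Approximate cutoff taken from the note above ("about 2,000 columns or rows").
# This is an assumption for illustration, not an exact limit.
LARGE_BROWSE_THRESHOLD = 2000


def should_convert_to_tiff(columns: int, rows: int, file_format: str) -> bool:
    """Return True when a non-TIFF browse image exceeds the approximate size cutoff."""
    is_tiff = file_format.strip().lower() in {"tif", "tiff"}
    is_large = columns > LARGE_BROWSE_THRESHOLD or rows > LARGE_BROWSE_THRESHOLD
    return is_large and not is_tiff
```

For example, a 3,000 by 1,500 JPEG would be flagged, while a 1,000 by 1,000 JPEG or an image that is already a TIFF would not.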

Data structure recommendations

How should I structure my directories?

If your organization's existing imagery collections must be accessible to legacy software applications, maintaining existing data structures is probably best. Similarly, if your application requires using browse images from another organization, it is best to use the data in the structure in which it is delivered to streamline maintenance.

If you're able to refine the directory structure of the data, store each collection of image files in a separate directory. Define a folder hierarchy that makes sense for the data, and plan ahead to provide sufficient granularity later. For example, directories might be separated by year, with subdirectories for multiple dates in the same year. Alternatively, for very large datasets, directories might be organized by geographic region. To maximize performance, try to keep the number of files per directory under 1,000.
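The year/date layout and the 1,000-file guideline described above can be sketched with the Python standard library. The function names and the YYYYMMDD subdirectory naming are assumptions chosen for illustration; use whatever hierarchy suits your data.

```python
from datetime import date
from pathlib import Path
from typing import List

MAX_FILES_PER_DIRECTORY = 1000  # guideline from the text above


def collection_directory(root: Path, acquired: date) -> Path:
    """Build a <root>/<year>/<YYYYMMDD> path for an image acquired on the given date."""
    return root / f"{acquired.year:04d}" / acquired.strftime("%Y%m%d")


def overfull_directories(root: Path, limit: int = MAX_FILES_PER_DIRECTORY) -> List[Path]:
    """List directories at or below root that hold `limit` or more files."""
    crowded = []
    candidates = [root] + [p for p in root.rglob("*") if p.is_dir()]
    for directory in candidates:
        file_count = sum(1 for entry in directory.iterdir() if entry.is_file())
        if file_count >= limit:
            crowded.append(directory)
    return crowded
```

Running overfull_directories periodically on the collection root gives an early signal that a directory should be split further, for example by date or by region.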

How should I manage my files?

Store your imagery's metadata in the same location as the imagery files.

Maintain browse imagery and metadata on disk drives that allow fast access to maximize performance. Full-resolution imagery, which does not need to be accessed as quickly, can then utilize slower, cheaper storage.

Don't set the directory in which the files are stored as read-only. Many of the workflows result in additional pyramid, statistics, or metadata files being written along with the source files. If the directories are read-only during the authoring processes, these files will be stored in separate locations disconnected from the originals.
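A quick check before authoring can catch read-only directories early. The sketch below uses only the Python standard library; the function name is hypothetical. Note that os.access reports permissions for the current user, so it should be run under the account that performs the authoring.

```python
import os
from pathlib import Path
from typing import Iterable, List


def unwritable_directories(directories: Iterable[Path]) -> List[Path]:
    """Return the directories that are missing or not writable by the current user."""
    problems = []
    for directory in directories:
        path = Path(directory)
        # os.access returns False for paths that do not exist as well.
        if not path.is_dir() or not os.access(path, os.W_OK):
            problems.append(path)
    return problems
```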

How should I organize my mosaic datasets?

Store mosaic datasets in a file geodatabase (most cases) or enterprise geodatabase (when multiple users need to edit the mosaic dataset at the same time).

Typically, use one geodatabase for each mosaic dataset or for a small group of related mosaic datasets that define a project. This makes backup and restoring simpler.

Use a standardized naming convention. Imagery Workflows use the following prefixes:

S_xxx: Source mosaic dataset
D_xxx: Derived mosaic dataset
R_xxx: Referenced mosaic dataset
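A small helper can keep the prefixes consistent across scripts. This is a minimal sketch; the function name and the "source", "derived", and "referenced" keys are assumptions introduced here, while the S_, D_, and R_ prefixes come from the list above.

```python
# Prefixes from the naming convention above; the dictionary keys are illustrative.
PREFIXES = {"source": "S_", "derived": "D_", "referenced": "R_"}


def mosaic_dataset_name(kind: str, base_name: str) -> str:
    """Return base_name with the prefix for the given kind of mosaic dataset."""
    try:
        prefix = PREFIXES[kind.strip().lower()]
    except KeyError:
        valid = ", ".join(sorted(PREFIXES))
        raise ValueError(f"Unknown mosaic dataset kind {kind!r}; expected one of: {valid}") from None
    return prefix + base_name
```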

Preparing metadata

How should I organize the metadata that came with my imagery?

Browse images are typically created as a byproduct of either (1) an image acquisition postprocessing system for satellite imagery, or (2) navigation data from an aerial imagery campaign.

Such systems typically output both the browse image and a table that contains the metadata about the individual frames. Such tables may exist as CSV files, databases, or structured XML.

This workflow assumes that a metadata table exists and has been converted into a feature class compatible with ArcGIS. This metadata table will be used to populate a mosaic dataset to manage and serve the browse imagery. This workflow assumes that there is a one-to-one relationship between a record in the metadata table and a browse image. Typically, the footprint of each scene is available as part of the feature class; if not, the footprint must be computed.

Depending on your organization and the type of data to be managed, it is recommended to include additional metadata in the mosaic dataset (more than in the existing metadata table), especially where the browse imagery and its application support multiple sensors. New data fields can be added to the mosaic dataset as necessary to support key metadata defined by your organization.

The following is a list of suggested metadata, with recommended names for commonly used metadata fields. Since some search applications may search on multiple image services, this list is optimized for compatibility with these applications by using standardized field names and definitions. Any additional metadata available with the browse imagery can also be maintained. Some metadata fields may not be necessary for search or query, and if they significantly increase the size of the attribute table, it may be best to put them in an external file and reference these in a MetadataURL field. The data manager should decide which metadata fields will be maintained live with the browse imagery, versus any metadata that can be stored in external files for infrequent access.

If the metadata from the existing database does not include the high-priority fields listed below, it is recommended that those fields (where appropriate) be added to the table before ingestion into the mosaic dataset. It is also recommended that extraneous or redundant fields be removed from the attribute table and kept with the external metadata instead, since they increase the size of the table. Similarly, review the length of each field so that it is no larger than necessary. Reducing the size of the table can speed up web applications when they search the metadata.

FieldName	FieldType	Length	Value	Description
Name	Text	50		Unique name of scene, for example, Scene ID. Required field.
Raster	Text	120		Path to browse image file in server directory structure. Required when using raster type = Table.
SensorName	Text	20		Name of the sensor (not the satellite; see SatelliteName), for example, ETM+.
SatelliteName	Text	25		Name of the satellite, for example, Landsat7.
CloudCover	Double		0...100; <NULL>	Estimated percent cloud cover in the full scene, as an integer from 0 to 100.
SatElevation	Double		0...90; <NULL>	Angle of vector to satellite in degrees positive upward from the horizon as seen by an observer on the earth within the scene's footprint.
SatAzimuth	Double		0...360; <NULL>	Angle of vector to satellite in degrees positive clockwise from true north as seen by an observer on the earth within the scene's footprint.
SunElevation	Double		0...90; <NULL>	Angle of sun in degrees positive upward from the horizon as seen by an observer on the earth within the scene's footprint.
SunAzimuth	Double		0...360; <NULL>	Angle of sun in degrees positive clockwise from true north as seen by an observer on the earth within the scene's footprint.
AcquisitionDate	Date/Time			UTC date and time of the acquisition.
DayOfYear	Long		1...366; <NULL>	Integer day of the year (Jan 1 = 1, Feb 1 = 32, and so on). Can be calculated from AcquisitionDate. Used to search for seasonal data.
MetadataURL	Text	120		URL or link to external metadata file with detailed and/or original metadata, available for infrequent access.
GSD	Double			Approximate pixel size in meters of source data. Highest resolution in cases where scene has multiple sizes.
BrowseURL	Text	80		Internet URL to browse image file.
ProductName	Text	25		This is a standard field name in all mosaic datasets, but it is typically unused. For browse imagery, the product typically refers to a level of processing (for example, L1A, L2B) available. If this field is available in the metadata table, the data manager should decide if it should be ingested.
Download_URL	Text	80		Link to the full-resolution image (for download).

Metadata table to create a mosaic dataset to manage browse images
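The table notes that DayOfYear can be calculated from AcquisitionDate. A minimal sketch in standard-library Python follows; the function name is hypothetical, and the handling of time zones (converting aware values to UTC, treating naive values as already UTC) is an assumption based on the table's statement that AcquisitionDate is stored in UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def day_of_year(acquisition_date: Optional[datetime]) -> Optional[int]:
    """Return the day of year (1 to 366) for a UTC acquisition time, or None for <NULL>."""
    if acquisition_date is None:
        return None
    if acquisition_date.tzinfo is not None:
        # Normalize time-zone-aware values to UTC before taking the day number.
        acquisition_date = acquisition_date.astimezone(timezone.utc)
    return acquisition_date.timetuple().tm_yday
```

With this definition, February 1 yields 32, and December 31 of a leap year yields 366, matching the value range in the table.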

Preprocessing

Do I need to create pyramids or generate statistics?

No.

Do I need to preprocess metadata?

Maybe. If the metadata table is not already in a format that can be imported directly into a mosaic dataset, preprocessing the metadata table is required.
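As one illustration of what preprocessing might involve, the sketch below reads a CSV export, renames columns to the suggested field names, checks the CloudCover range, and derives DayOfYear. The source column names (scene_id, file_path, cloud_pct, acq_time) and the ISO 8601 date format are hypothetical; a real export will differ. Converting the result into a feature class is a separate step that is not shown here.

```python
import csv
import io
from datetime import datetime
from typing import Dict, List

# Hypothetical source column names mapped to the suggested field names.
FIELD_MAP = {
    "scene_id": "Name",
    "file_path": "Raster",
    "cloud_pct": "CloudCover",
    "acq_time": "AcquisitionDate",
}


def normalize_rows(csv_text: str) -> List[Dict[str, object]]:
    """Rename columns, convert types, and validate one metadata record per CSV row."""
    rows = []
    for source_row in csv.DictReader(io.StringIO(csv_text)):
        row: Dict[str, object] = {
            FIELD_MAP[key]: (value or "").strip()
            for key, value in source_row.items()
            if key in FIELD_MAP
        }
        if not row.get("Name"):
            raise ValueError("Name is a required field")
        # Empty strings become None, which stands in for <NULL>.
        cloud = float(row["CloudCover"]) if row.get("CloudCover") else None
        if cloud is not None and not 0 <= cloud <= 100:
            raise ValueError(f"CloudCover out of range for {row['Name']}: {cloud}")
        acquired = datetime.fromisoformat(row["AcquisitionDate"]) if row.get("AcquisitionDate") else None
        row["CloudCover"] = cloud
        row["AcquisitionDate"] = acquired
        row["DayOfYear"] = acquired.timetuple().tm_yday if acquired else None
        rows.append(row)
    return rows
```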

Georeferencing

Do I need to georeference my imagery?

Possibly. An important aspect of browse imagery is the extent of the image and the georeferencing of the image. There are two primary ways in which the browse images are georeferenced:

Rectified

Each image is pre-rectified (or orthorectified) based on the best available referencing information. The result is an image that is stored in a defined coordinate system and typically has a border of NoData pixels. Typically, such rectified images are generated in a suitable Universal Transverse Mercator (UTM) zone, in decimal degrees (WGS 84), or in the Web Mercator projection.

The disadvantage in using this method is that the black NoData pixels need to be clipped out. This can be done using footprints. It is optimal if the footprint can be provided by the system that rectifies the image; otherwise, processing time must be allocated for ArcGIS to calculate footprints.

Non-rectified

Browse images that are stored in non-rectified format can be georeferenced for visualization by applying a projective transformation defined by the four corner points of the image. In this case, the image is stored in simple image file coordinates and has the advantage of not including any NoData pixels. The projective transformation is best defined using an associated .aux.xml file that lists the image and ground coordinates of the four corners. Such files can be generated quickly from the four corner coordinates and the size of the image.
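To show the math behind a four-corner projective transformation, the sketch below fits the eight coefficients that map image (column, row) coordinates to ground (x, y) coordinates and then applies them. It is plain Python for illustration only: the function names and coefficient ordering are assumptions, and it does not write an .aux.xml file or reproduce how ArcGIS evaluates the transformation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear_system(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("Corner points are degenerate (for example, collinear).")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_projective(image_pts: Sequence[Point], ground_pts: Sequence[Point]) -> List[float]:
    """Return [a, b, c, d, e, f, g, h] such that
    x = (a*col + b*row + c) / (g*col + h*row + 1)
    y = (d*col + e*row + f) / (g*col + h*row + 1)
    """
    if len(image_pts) != 4 or len(ground_pts) != 4:
        raise ValueError("Exactly four image points and four ground points are required.")
    matrix, rhs = [], []
    for (col, row), (x, y) in zip(image_pts, ground_pts):
        # Each corner contributes two linear equations in the eight unknowns.
        matrix.append([col, row, 1.0, 0.0, 0.0, 0.0, -col * x, -row * x])
        rhs.append(x)
        matrix.append([0.0, 0.0, 0.0, col, row, 1.0, -col * y, -row * y])
        rhs.append(y)
    return solve_linear_system(matrix, rhs)


def image_to_ground(coeffs: Sequence[float], col: float, row: float) -> Point:
    """Apply the fitted projective transformation to one image coordinate."""
    a, b, c, d, e, f, g, h = coeffs
    w = g * col + h * row + 1.0
    return ((a * col + b * row + c) / w, (d * col + e * row + f) / w)
```

The image points would normally be the four corners implied by the image size, for example (0, 0), (columns, 0), (columns, rows), and (0, rows), paired in the same order with the ground coordinates from the metadata table.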
