Preparing input data—Managing Scanned Maps

This section covers best practices and specifications for preparing your data for the Managing Scanned Maps workflow.

Imagery requirements and types

What imagery specifications are required for this workflow?

Most users scan their own paper maps and store them as TIFF files, which requires some trial and error. Perform a test scan on a map that contains most of the feature types you plan to include in the dataset, and use it to determine the dpi, bit depth, and compression settings that give the best quality. To get started with scanned maps, see the table below for recommended values.

Parameter	Recommended value
Bit depth	Color maps: 24-bit RGB; panchromatic maps: 8-bit; engineering drawings: 1-bit
Compression	Color maps: JPEG; panchromatic maps: LZW; engineering drawings (1-bit): lossless
Recommended file format	TIFF
Reprojection or resampling?	Bilinear resampling. Do not rectify.
DPI	400–600
Collar	Keep the collar of the map being scanned. Do not clip to neatline/grid.
Table	Used to maintain metadata
Image file name	Use the name on the map as file name of scanned map
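
The recommended values in the table can be turned into a quick check for a test scan. The following is a minimal sketch, not part of the workflow tooling: the function name check_scan and the map type keys ("color", "panchromatic", "engineering") are hypothetical, and the bit depth and dpi are assumed to be read from your scanner settings or image properties.

```python
# Recommended values transcribed from the table above.
# The dictionary keys are hypothetical labels chosen for this sketch.
RECOMMENDED = {
    "color": {"bit_depth": 24, "compression": "JPEG"},
    "panchromatic": {"bit_depth": 8, "compression": "LZW"},
    "engineering": {"bit_depth": 1, "compression": "lossless"},
}
DPI_RANGE = (400, 600)  # recommended scanning resolution


def check_scan(map_type, bit_depth, dpi):
    """Return a list of warnings for settings outside the recommended values."""
    recommended = RECOMMENDED[map_type]
    warnings = []
    if bit_depth != recommended["bit_depth"]:
        warnings.append(
            f"bit depth {bit_depth} differs from recommended "
            f"{recommended['bit_depth']}"
        )
    if not DPI_RANGE[0] <= dpi <= DPI_RANGE[1]:
        warnings.append(
            f"dpi {dpi} is outside the recommended range "
            f"{DPI_RANGE[0]} to {DPI_RANGE[1]}"
        )
    return warnings
```

An empty list means the settings match the table; each warning names the value that differs.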

What should I expect with scanned map files?

Scanned maps are typically formatted as TIFF, SID, or JPEG (among others). Many users will scan their own hardcopy maps, but other common sources of scanned maps include USGS, Ordnance Survey (in the UK), municipalities, urban planning departments, and historical archives.

Data structure recommendations

If you are using pre-scanned maps, they may arrive in an established folder structure organized by map name, state name, scale, and so on. It is best to leave the data in the structure in which it was received. If you change the structure, update the data paths in the metadata table that accompanied the data.

How should I structure my directories?

Store each collection of image files in a separate directory.
Define a folder hierarchy that makes sense for the data, and plan ahead to provide sufficient granularity later.
- If the scanned maps come with an existing directory structure, it's best not to change it.
- Directories might be organized by map name, state name, scale, etc.
To maximize performance, try to keep the number of files per directory under 1,000.
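
The file-count guideline above can be checked with a short script. This is a sketch using only the Python standard library; the function name crowded_directories is hypothetical, and the limit of 1,000 comes from the recommendation above.

```python
import os

MAX_FILES = 1000  # guideline: keep each directory under this many files


def crowded_directories(root, limit=MAX_FILES):
    """Return (path, file_count) for every directory under root
    that holds `limit` or more files."""
    crowded = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) >= limit:
            crowded.append((dirpath, len(filenames)))
    return crowded
```

Running it on the top-level imagery folder lists the directories that may need to be split into finer subfolders.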

How should I manage my files?

File names are generally defined by the data provider; keep the original names, if possible. If you're scanning your own maps, keep the title from the original.
Store metadata that comes with your imagery in the same location as the imagery files.
Store main imagery files as read-only when possible. This helps ensure that the original files are not modified and, because they do not change, that they are not backed up multiple times.
Don't set the directory in which the files are stored as read-only. Many of the workflows result in additional pyramid, statistics, or metadata files being written along with the source files. If the directories are read-only during the authoring processes, these files will be stored in separate locations disconnected from the originals.
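
One way to apply both recommendations is to clear the write permission on the image files while leaving directory permissions untouched. The sketch below assumes the scans are TIFF files; the function name protect_files and the extension list are hypothetical. On Windows, os.chmod only toggles the read-only attribute, which is sufficient here.

```python
import os
import stat

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def protect_files(root, extensions=(".tif", ".tiff")):
    """Mark matching image files under root as read-only.

    Directories are never modified, so pyramid, statistics, and
    metadata files can still be written next to the images.
    Returns the number of files whose permissions were changed.
    """
    changed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            mode = stat.S_IMODE(os.stat(path).st_mode)
            read_only = mode & ~WRITE_BITS
            if read_only != mode:
                os.chmod(path, read_only)
                changed += 1
    return changed
```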

How should I organize my mosaic datasets?

Store mosaic datasets in a file geodatabase (most cases) or enterprise geodatabase (when multiple users need to edit the mosaic dataset at the same time).
Typically, use one geodatabase for each mosaic dataset or for a small group of related mosaic datasets that define a project. This makes backup and restoring simpler.
Use a standardized naming convention. Imagery Workflows use the following prefixes:
- S_xxx: Source mosaic dataset
- D_xxx: Derived mosaic dataset
- R_xxx: Referenced mosaic dataset
Because the Managing Scanned Maps workflow uses the Table raster type, users will likely only need to create a source mosaic dataset to manage scanned maps.
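
If mosaic datasets are created from scripts, a small helper can keep names consistent with the prefixes above. This is only an illustrative sketch; the function name mosaic_name and the base name used in the example are hypothetical.

```python
# Prefixes from the naming convention described above.
PREFIXES = {"source": "S_", "derived": "D_", "referenced": "R_"}


def mosaic_name(kind, base):
    """Build a mosaic dataset name such as S_ScannedMaps."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown mosaic dataset kind: {kind}")
    return PREFIXES[kind] + base
```

For example, mosaic_name("source", "ScannedMaps") gives S_ScannedMaps.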

Preparing metadata

How should I organize my metadata?

Metadata should be sourced from the scanned map itself, often from the map collar. The metadata should be added to a table with a unique identifier, typically the filename.

If the metadata is entered manually, the user should put it into a table format. If the metadata is extracted by software, it will typically be stored in individual files along with the image. These files should be created using a structured format that can later be converted to a table format programmatically.

All fields must be entered into the feature class table; they are then brought into the mosaic dataset table automatically. Important fields include the following:

Name
Scan_Id
Scan_File_Name
Map_Name
Map_Scale
Date_On_Map
Imprint_Year
Scan_Quality
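
When metadata is collected programmatically, it can first be gathered into a simple table keyed by file name. The sketch below writes the fields listed above to CSV using the Python standard library. Using CSV as an intermediate format is an assumption, as is the function name write_metadata_table; loading the table into the feature class is a separate step not shown here.

```python
import csv

# Field names taken from the list above.
FIELDS = [
    "Name", "Scan_Id", "Scan_File_Name", "Map_Name",
    "Map_Scale", "Date_On_Map", "Imprint_Year", "Scan_Quality",
]


def write_metadata_table(records, stream):
    """Write metadata records (dicts keyed by FIELDS) to stream as CSV.

    Scan_File_Name acts as the unique identifier, so a repeated
    file name raises ValueError. Missing fields are left blank.
    """
    seen = set()
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        key = record["Scan_File_Name"]
        if key in seen:
            raise ValueError(f"duplicate Scan_File_Name: {key}")
        seen.add(key)
        writer.writerow(record)
```

Pass any writable text stream, for example a file opened with open(path, "w", newline="").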

Preprocessing

Do I need to create pyramids or generate statistics?

Statistics are not required, and pyramids are optional, although creating pyramids makes the maps display faster. To create pyramids, use the following settings:

Variable	Description and notes
Pyramid/compression	Yes
Levels	1–2
Sampling method	Average/bilinear

Do I need to georeference my scanned maps?

Yes. Georeference scanned maps before you create a mosaic dataset, using either the ArcGIS Pro Georeferencing tools or the ArcMap Georeferencing toolbar.

When georeferencing, follow these guidelines:

Ensure the appropriate projection and datum are set
Set the TIFF as read-only
Use Update Georeferencing (not Rectify)
Set Transformation as Projective or 2nd Order Polynomial

Updating the georeferencing creates an associated AUX.XML file.
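
Before building the mosaic dataset, it can help to confirm that every scan has been georeferenced. The following sketch lists TIFF files that have no accompanying AUX.XML file. It assumes the sidecar is named by appending .aux.xml to the full image file name (for example, sheet.tif.aux.xml) and sits in the same folder; verify this against your own data. The function name missing_sidecars is hypothetical.

```python
import os


def missing_sidecars(root, suffix=".aux.xml"):
    """Return sorted paths of TIFF files under root that have no
    sidecar named <image file name> + suffix in the same folder."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        present = {name.lower() for name in filenames}
        for name in filenames:
            if not name.lower().endswith((".tif", ".tiff")):
                continue
            if (name + suffix).lower() not in present:
                missing.append(os.path.join(dirpath, name))
    return sorted(missing)
```

An empty result suggests that all scans in the folder tree have georeferencing information stored alongside them.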

Check out the next section to learn more about creating mosaic datasets to manage scanned maps.
