Classify Pixels Using Deep Learning (Image Analyst)

Available with Image Analyst license.

Summary

Runs a trained deep learning model on an input raster to produce a classified raster, with each valid pixel having an assigned class label.

This tool requires a model definition file containing trained model information. The model can be trained using the Train Deep Learning Model tool or with third-party training software such as TensorFlow, PyTorch, or Keras. The model definition file can be an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk). It must contain the path to the Python raster function that will be called to process each object and the path to the trained binary deep learning model file.

Usage

  • You must install the proper deep learning framework Python API (such as TensorFlow or PyTorch) in the ArcGIS AllSource Python environment; otherwise, an error will occur when you add the Esri model definition file to the tool. Obtain the appropriate framework information from the creator of the Esri model definition file.

    To set up your machine to use deep learning frameworks in ArcGIS AllSource, see Install deep learning frameworks for ArcGIS.

  • This tool calls a third-party deep learning Python API (such as TensorFlow, PyTorch, or Keras) and uses the specified Python raster function to process each object.

  • Sample use cases for this tool are available on the Esri Python raster function GitHub page. You can also write custom Python modules by following examples and instructions in the GitHub repository.

  • The Model Definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally.

  • For more information about deep learning, see Deep learning in ArcGIS AllSource.

  • The following is a sample Esri model definition file (.emd):

    {
        "Framework":"TensorFlow",
        "ModelConfiguration":"deeplab",
    
        "ModelFile":"\\Data\\ImgClassification\\TF\\froz_inf_graph.pb",
        "ModelType":"ImageClassification",
        "ExtractBands":[0,1,2],
        "ImageHeight":513,
        "ImageWidth":513,
    
        "Classes" : [
            {
                "Value":0,
                "Name":"Evergreen Forest",
                "Color":[0, 51, 0]
             },
             {
                "Value":1,
                "Name":"Grassland/Herbaceous",
                "Color":[241, 185, 137]
             },
             {
                "Value":2,
                "Name":"Bare Land",
                "Color":[236, 236, 0]
             },
             {
                "Value":3,
                "Name":"Open Water",
                "Color":[0, 0, 117]
             },
             {
                "Value":4,
                "Name":"Scrub/Shrub",
                "Color":[102, 102, 0]
             },
             {
                "Value":5,
                "Name":"Impervious Surface",
                "Color":[236, 236, 236]
             }
        ]
    }
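
As a rough illustration of the JSON string option, the sketch below parses a model definition with Python's standard json module, checks a few keys, and serializes it to a single-line string that could be pasted instead of a file path. The abbreviated definition, the helper name emd_to_json_string, and the choice of keys to check are assumptions made for this example. Only the key names themselves come from the sample above.

```python
import json

# Abbreviated copy of the sample definition above (two classes only, for brevity).
EMD_TEXT = r"""
{
    "Framework": "TensorFlow",
    "ModelConfiguration": "deeplab",
    "ModelFile": "\\Data\\ImgClassification\\TF\\froz_inf_graph.pb",
    "ModelType": "ImageClassification",
    "ExtractBands": [0, 1, 2],
    "ImageHeight": 513,
    "ImageWidth": 513,
    "Classes": [
        {"Value": 0, "Name": "Evergreen Forest", "Color": [0, 51, 0]},
        {"Value": 3, "Name": "Open Water", "Color": [0, 0, 117]}
    ]
}
"""


def emd_to_json_string(emd_text):
    """Parse .emd text and return (definition dict, compact JSON string)."""
    definition = json.loads(emd_text)
    # Hypothetical sanity check: these keys appear in the sample, but the
    # full set of required keys is not specified in this topic.
    for key in ("Framework", "ModelFile", "ModelType"):
        if key not in definition:
            raise KeyError(f"model definition is missing {key!r}")
    # Compact separators keep the definition on a single line.
    return definition, json.dumps(definition, separators=(",", ":"))


definition, json_string = emd_to_json_string(EMD_TEXT)

# Map each class value to its name, as listed under "Classes".
class_names = {c["Value"]: c["Name"] for c in definition["Classes"]}
print(class_names)
print(json_string)
```

In practice the text would be read from the .emd file on disk rather than embedded in the script.
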

  • The input raster can be a single raster, multiple rasters, or a feature class with image attachments. For more information about attachments, see Add or remove file attachments.

  • Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size. The batch_size value can be adjusted using the Arguments parameter.

  • Batch sizes are square numbers, such as 1, 4, 9, 16, 25, 64, and so on. If the input value is not a perfect square, the largest perfect square that does not exceed the value is used. For example, if a value of 6 is specified, the batch size is set to 4.

  • This tool supports and uses multiple GPUs, if available. To use a specific GPU, set the GPU ID environment. When the GPU ID is not set, the tool uses all available GPUs. This is the default.

  • For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.
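
The batch size rule described in this list (a value that is not a perfect square falls back to the largest perfect square that does not exceed it) can be sketched in a few lines of Python. The function name effective_batch_size is hypothetical, and the code only mirrors the documented behavior. It is not the tool's internal implementation.

```python
import math


def effective_batch_size(requested):
    """Return the largest perfect square that does not exceed `requested`."""
    if requested < 1:
        raise ValueError("batch size must be at least 1")
    root = math.isqrt(requested)  # integer floor of the square root
    return root * root


for value in (1, 4, 6, 16, 30):
    print(value, "->", effective_batch_size(value))
```

A requested value of 6 maps to 4, which matches the example given in the list.
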

Parameters

Label | Explanation | Data Type
Input Raster

The input raster dataset that will be classified.

The input can be a single raster, multiple rasters in a mosaic dataset, an image service, a folder of images, or a feature class with image attachments.

Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
Model Definition

The Model Definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally.

The model definition contains the path to the deep learning binary model file, the path to the Python raster function to be used, and other parameters such as preferred tile size or padding.

File; String
Arguments
(Optional)

The information from the Model Definition parameter will be used to populate this parameter. These arguments vary, depending on the model architecture. The following are supported model arguments for models trained in ArcGIS. ArcGIS pretrained models and custom deep learning models may have additional arguments that the tool supports.

  • batch_size—The number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. The argument is available for all model architectures.
  • direction—The image is translated from one domain to another. Options are AtoB and BtoA. The argument is only available for the CycleGAN architecture. For more information about this argument, see How CycleGAN works.
  • merge_policy—The policy for merging augmented predictions. Available options are mean, max, and min. This is only applicable when test time augmentation is used. The argument is available for the MultiTaskRoadExtractor and ConnectNet architectures. If IsEdgeDetection is present in the model's .emd file, the argument is also available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures.
  • n_timestep—The number of time steps that will be used. The default is 200. The value can be increased or decreased based on the quality of the generated results. The argument is only supported for the Super Resolution with SR3 backbone model.
  • padding—The number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. To smooth the output while reducing artifacts, increase the value. The maximum value of the padding can be half the tile size value. The argument is available for all model architectures.
  • predict_background—Specifies whether the background class will be classified. If true, the background class is also classified. The argument is available for UNET, PSPNET, DeepLab, MMSegmentation, and SAMLoRA.
  • return_probability_raster—Specifies whether the output will be a probability raster. If true, the output will be a probability raster. If false, the output will be a binary classified raster. The default is false. The argument is available for the MultiTaskRoadExtractor and ConnectNet architectures if ArcGISLearnVersion is 1.8.4 or later in the model's .emd file. If ArcGISLearnVersion is 1.8.4 or later and IsEdgeDetection is present in the model's .emd file, the argument is also available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures.
  • sampling_type—The type of sampling that will be used. Two types of sampling are available: ddim and ddpm. The default is ddim, which generates results in fewer time steps compared to ddpm. The argument is only supported for the Super Resolution with SR3 backbone model.
  • schedule—An optional string that sets the type of schedule. The default schedule is the same as the model it was trained on. The argument is only supported for the Super Resolution with SR3 backbone model.
  • test_time_augmentation—Performs test time augmentation while predicting. If true, predictions of flipped and rotated variants of the input image will be merged into the final output. The argument is available for UNET, PSPNET, DeepLab, HEDEdgeDetector, BDCNEdgeDetector, ConnectNet, MMSegmentation, MultiTaskRoadExtractor, and SAMLoRA.
  • tile_size—The width and height of image tiles into which the imagery will be split for prediction. The argument is only available for the CycleGAN architecture.
  • thinning—Specifies whether predicted edges will be thinned or skeletonized. Options are True and False. The argument is available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures if IsEdgeDetection is present in the model's .emd file.
  • threshold—The predictions that have a confidence score higher than this threshold are included in the result. The allowed values range from 0 to 1.0. The argument is available for the MultiTaskRoadExtractor and ConnectNet architectures if ArcGISLearnVersion is 1.8.4 or later in the model's .emd file. If ArcGISLearnVersion is 1.8.4 or later and IsEdgeDetection is present in the model's .emd file, the argument is also available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures.

Value Table
Processing Mode

Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service.

  • Process as mosaicked image—All raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default.
  • Process all raster items separately—All raster items in the mosaic dataset or image service will be processed as separate images.

String
Output Folder
(Optional)

The folder where the output classified rasters will be stored. A mosaic dataset will be generated using the classified rasters in this folder.

This parameter is required when the input raster is a folder of images or a mosaic dataset in which all items are to be processed separately. The default is a folder in the project folder.

Folder
Output Features
(Optional)

The feature class where the output classified rasters will be stored.

This parameter is required when the input raster is a feature class of images.

Feature Class
Overwrite attachments
(Optional)

Specifies whether existing image attachments will be overwritten.

  • Unchecked—Existing image attachments will not be overwritten and new image attachments will be stored in a new feature class. When this parameter is unchecked, the Output Features parameter will be available. This is the default.
  • Checked—The existing feature class will be overwritten with the new updated attachments.

This parameter is only available when the Input Raster parameter value is a feature class with image attachments.

Boolean
Use pixel space
(Optional)

Specifies whether inferencing will be performed on images in pixel space.

  • Unchecked—Inferencing will be performed in map space. This is the default.
  • Checked—Inferencing will be performed in image space, and the output will be transformed back to map space. This option is useful when using oblique imagery or Street View imagery, where the features may become distorted using map space.

Boolean
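
To make the test_time_augmentation and merge_policy arguments more concrete, here is a minimal, self-contained sketch of the general idea: predict on the original tile and on a flipped copy, undo the flip, and merge the aligned predictions with mean, max, or min. Everything in it is an assumption made for illustration. The toy_model function stands in for a real network, only a single left-to-right flip is used (the tool is described as also using rotated variants), and none of these names are part of the tool.

```python
def flip_lr(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def toy_model(grid):
    """Stand-in for a real model: each pixel's score is its own value plus
    half of its left neighbor's value, so flipping changes the prediction."""
    return [
        [row[i] + (0.5 * row[i - 1] if i > 0 else 0.0) for i in range(len(row))]
        for row in grid
    ]


# One merge function per documented merge_policy option.
MERGERS = {
    "mean": lambda values: sum(values) / len(values),
    "max": max,
    "min": min,
}


def predict_with_tta(grid, merge_policy="mean"):
    """Merge predictions from the original tile and a flipped copy."""
    merge = MERGERS[merge_policy]
    plain = toy_model(grid)
    # Predict on the flipped tile, then flip the result back so both
    # predictions line up pixel for pixel.
    unflipped = flip_lr(toy_model(flip_lr(grid)))
    return [
        [merge([a, b]) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(plain, unflipped)
    ]


tile = [[0.0, 1.0, 0.0],
        [1.0, 0.0, 0.0]]

for policy in ("mean", "max", "min"):
    print(policy, predict_with_tta(tile, policy))
```

The three policies give different results on the same tile, which is why merge_policy only matters when test time augmentation is enabled.
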

Return Value

Label | Explanation | Data Type
Output Raster Dataset

The name of the raster or mosaic dataset containing the result.

Raster Dataset

ClassifyPixelsUsingDeepLearning(in_raster, in_model_definition, {arguments}, processing_mode, {out_classified_folder}, {out_featureclass}, {overwrite_attachments}, {use_pixelspace})
Name | Explanation | Data Type
in_raster

The input raster dataset that will be classified.

The input can be a single raster, multiple rasters in a mosaic dataset, an image service, a folder of images, or a feature class with image attachments.

Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
in_model_definition

The in_model_definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally.

The model definition contains the path to the deep learning binary model file, the path to the Python raster function to be used, and other parameters such as preferred tile size or padding.

File; String
arguments
[arguments,...]
(Optional)

The information from the in_model_definition parameter will be used to set the default values for this parameter. These arguments vary, depending on the model architecture. The following are supported model arguments for models trained in ArcGIS. ArcGIS pretrained models and custom deep learning models may have additional arguments that the tool supports.

  • batch_size—The number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. The argument is available for all model architectures.
  • direction—The image is translated from one domain to another. Options are AtoB and BtoA. The argument is only available for the CycleGAN architecture. For more information about this argument, see How CycleGAN works.
  • merge_policy—The policy for merging augmented predictions. Available options are mean, max, and min. This is only applicable when test time augmentation is used. The argument is available for the MultiTaskRoadExtractor and ConnectNet architectures. If IsEdgeDetection is present in the model's .emd file, the argument is also available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures.
  • n_timestep—The number of time steps that will be used. The default is 200. The value can be increased or decreased based on the quality of the generated results. The argument is only supported for the Super Resolution with SR3 backbone model.
  • padding—The number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. To smooth the output while reducing artifacts, increase the value. The maximum value of the padding can be half the tile size value. The argument is available for all model architectures.
  • predict_background—Specifies whether the background class will be classified. If true, the background class is also classified. The argument is available for UNET, PSPNET, DeepLab, MMSegmentation, and SAMLoRA.
  • return_probability_raster—Specifies whether the output will be a probability raster. If true, the output will be a probability raster. If false, the output will be a binary classified raster. The default is false. The argument is available for the MultiTaskRoadExtractor and ConnectNet architectures if ArcGISLearnVersion is 1.8.4 or later in the model's .emd file. If ArcGISLearnVersion is 1.8.4 or later and IsEdgeDetection is present in the model's .emd file, the argument is also available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures.
  • sampling_type—The type of sampling that will be used. Two types of sampling are available: ddim and ddpm. The default is ddim, which generates results in fewer time steps compared to ddpm. The argument is only supported for the Super Resolution with SR3 backbone model.
  • schedule—An optional string that sets the type of schedule. The default schedule is the same as the model it was trained on. The argument is only supported for the Super Resolution with SR3 backbone model.
  • test_time_augmentation—Performs test time augmentation while predicting. If true, predictions of flipped and rotated variants of the input image will be merged into the final output. The argument is available for UNET, PSPNET, DeepLab, HEDEdgeDetector, BDCNEdgeDetector, ConnectNet, MMSegmentation, MultiTaskRoadExtractor, and SAMLoRA.
  • tile_size—The width and height of image tiles into which the imagery will be split for prediction. The argument is only available for the CycleGAN architecture.
  • thinning—Specifies whether predicted edges will be thinned or skeletonized. Options are True and False. The argument is available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures if IsEdgeDetection is present in the model's .emd file.
  • threshold—The predictions that have a confidence score higher than this threshold are included in the result. The allowed values range from 0 to 1.0. The argument is available for the MultiTaskRoadExtractor and ConnectNet architectures if ArcGISLearnVersion is 1.8.4 or later in the model's .emd file. If ArcGISLearnVersion is 1.8.4 or later and IsEdgeDetection is present in the model's .emd file, the argument is also available for the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures.

Value Table
processing_mode

Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service.

  • PROCESS_AS_MOSAICKED_IMAGE—All raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default.
  • PROCESS_ITEMS_SEPARATELY—All raster items in the mosaic dataset or image service will be processed as separate images.

String
out_classified_folder
(Optional)

The folder where the output classified rasters will be stored. A mosaic dataset will be generated using the classified rasters in this folder.

This parameter is required when the input raster is a folder of images or a mosaic dataset in which all items are to be processed separately. The default is a folder in the project folder.

Folder
out_featureclass
(Optional)

The feature class where the output classified rasters will be stored.

This parameter is required when the input raster is a feature class of images.

Feature Class
overwrite_attachments
(Optional)

Specifies whether existing image attachments will be overwritten.

  • NO_OVERWRITE—Existing image attachments will not be overwritten and new image attachments will be stored in a new feature class. When this option is specified, the out_featureclass parameter must be populated. This is the default.
  • OVERWRITE—The existing feature class will be overwritten with the new updated attachments.

This parameter is only valid when the in_raster parameter value is a feature class with image attachments.

Boolean
use_pixelspace
(Optional)

Specifies whether inferencing will be performed on images in pixel space.

  • NO_PIXELSPACE—Inferencing will be performed in map space. This is the default.
  • PIXELSPACE—Inferencing will be performed in image space, and the output will be transformed back to map space. This option is useful when using oblique imagery or Street View imagery, where the features may become distorted using map space.

Boolean
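
The arguments parameter is passed as name and value pairs. One possible way to assemble that string and apply the documented limits (padding no larger than half the tile size, threshold between 0 and 1.0) is sketched below. The helper build_arguments and its model_tile_size parameter are hypothetical. The output format follows the "padding 0; batch_size 16" string used in the stand-alone script sample, and the tile size of 513 is borrowed from the sample .emd file as an assumption.

```python
def build_arguments(model_tile_size=None, **arguments):
    """Format model arguments as 'name value' pairs separated by semicolons.

    `model_tile_size` is used only for the padding check and is not
    written to the output string.
    """
    padding = arguments.get("padding")
    if (padding is not None and model_tile_size is not None
            and padding > model_tile_size / 2):
        raise ValueError("padding cannot exceed half the tile size")

    threshold = arguments.get("threshold")
    if threshold is not None and not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1.0")

    # Keyword arguments keep their insertion order, so the pairs are
    # emitted in the order they were passed in.
    return "; ".join(f"{name} {value}" for name, value in arguments.items())


model_arguments = build_arguments(model_tile_size=513, padding=0, batch_size=16)
print(model_arguments)  # padding 0; batch_size 16
```

Checking the values before calling the tool surfaces a bad padding or threshold value early, before a long inference run starts.
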

Return Value

Name | Explanation | Data Type
out_classified_raster

The name of the raster or mosaic dataset containing the result.

Raster Dataset
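
What the output raster contains depends on the return_probability_raster and threshold arguments. The toy sketch below shows one way to picture the difference on a small grid of per-pixel confidence scores. The function name to_output, the sample scores, and the default threshold of 0.5 are assumptions for illustration, since no default threshold is stated in this topic.

```python
def to_output(probabilities, threshold=0.5, return_probability_raster=False):
    """Return the probabilities unchanged, or a binary grid in which pixels
    whose score is higher than the threshold are set to 1."""
    if return_probability_raster:
        return [row[:] for row in probabilities]
    return [[1 if p > threshold else 0 for p in row] for row in probabilities]


scores = [[0.2, 0.8],
          [0.5, 0.95]]

print(to_output(scores))                                  # binary result
print(to_output(scores, return_probability_raster=True))  # scores as-is
```

A score equal to the threshold is excluded here because the threshold description says "higher than".
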

Code sample

ClassifyPixelsUsingDeepLearning example 1 (Python window)

This example classifies a raster based on a custom pixel classification using the ClassifyPixelsUsingDeepLearning function.

# Import system modules
import arcpy
from arcpy.ia import *

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

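# Run the tool: the input raster comes first, then the model definition file
out_classified_raster = ClassifyPixelsUsingDeepLearning(
     "c:/classifydata/moncton_seg.tif", "c:/classifydata/moncton_sig.emd")
out_classified_raster.save("c:/classifydata/classified_moncton.tif")

ClassifyPixelsUsingDeepLearning example 2 (stand-alone script)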

This example classifies a raster based on a custom pixel classification using the ClassifyPixelsUsingDeepLearning function.

# Import system modules
import arcpy
from arcpy.ia import *


# Set local variables
in_raster = "c:\\classifydata\\moncton_seg.tif"
in_model_definition = "c:\\classifydata\\moncton_sig.emd"
model_arguments = "padding 0; batch_size 16"
processing_mode = "PROCESS_AS_MOSAICKED_IMAGE"

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

# Execute
out_classified_raster = ClassifyPixelsUsingDeepLearning(
    in_raster, in_model_definition, model_arguments, processing_mode)
out_classified_raster.save("c:\\classifydata\\classified_moncton.tif")
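
The parameter descriptions state when out_classified_folder and out_featureclass become required. As a rough aid for scripts that handle several input types, the sketch below encodes those rules in a small helper. The function required_outputs and the input_kind labels ("folder", "mosaic", "feature_class") are hypothetical. Treating out_featureclass as not required with OVERWRITE is an assumption inferred from the description of that option.

```python
def required_outputs(input_kind,
                     processing_mode="PROCESS_AS_MOSAICKED_IMAGE",
                     overwrite_attachments="NO_OVERWRITE"):
    """Return the optional output parameters that must be supplied."""
    required = []
    # A folder of images, or a mosaic dataset processed item by item,
    # needs an output folder for the classified rasters.
    if input_kind == "folder" or (
        input_kind == "mosaic"
        and processing_mode == "PROCESS_ITEMS_SEPARATELY"
    ):
        required.append("out_classified_folder")
    # A feature class with image attachments needs an output feature class
    # unless the existing attachments are overwritten (assumption).
    if input_kind == "feature_class" and overwrite_attachments == "NO_OVERWRITE":
        required.append("out_featureclass")
    return required


print(required_outputs("folder"))
print(required_outputs("mosaic", "PROCESS_ITEMS_SEPARATELY"))
print(required_outputs("feature_class"))
```
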
