Train Using AutoDL (Image Analyst)

Available with Image Analyst license.

Summary

Trains a deep learning model by building training pipelines and automating much of the training process. This includes data augmentation, model selection, hyperparameter tuning, and batch size deduction. Its outputs include performance metrics of the best model on the training data, as well as the trained deep learning model package (.dlpk file) that can be used as input for the Extract Features Using AI Models tool to predict on new imagery.

Learn more about how AutoDL works

Usage

  • You must install the proper deep learning framework for Python in ArcGIS AllSource.

    Learn how to install deep learning frameworks for ArcGIS

  • If you will be training models in a disconnected environment, see Additional Installation for Disconnected Environment for more information.

  • The time it takes the tool to produce the trained model depends on the following:

    • The amount of data provided during training
    • The AutoDL Mode parameter value
    • The Total Time Limit (Hours) parameter value

    By default, the time limit for both modes is 2 hours. The Basic mode trains the selected networks on the default backbone within the given time. The Advanced mode splits the total time in half: it evaluates the selected networks in the first half and then evaluates the top two performing models on other backbones in the second half. If the training dataset is large, not all of the selected models may be evaluated within 2 hours. In that case, the best performing model found within the time limit is treated as the optimum model, and you can either use that model or rerun the tool with a higher Total Time Limit (Hours) parameter value, as in the sketch that follows.
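
    The following is a minimal sketch (hypothetical paths) of rerunning the tool with a larger budget: with an 8-hour Total Time Limit in Advanced mode, roughly the first 4 hours go to evaluating the selected networks on their default backbones and the remaining 4 hours to evaluating the top two models on other backbones.

    import arcpy

    # Advanced mode with an 8-hour budget: ~4 hours of model evaluation,
    # ~4 hours of evaluating the top two models on other backbones
    arcpy.geoai.TrainUsingAutoDL(
        r"C:\data\training_chips",     # hypothetical training data folder
        r"C:\models\autodl_advanced",  # hypothetical output model
        None, 8, "ADVANCED",
        None,                          # no network list: use all supported networks (default)
        "SAVE_BEST_MODEL")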

  • This tool can also be used to fine-tune an existing trained model. For example, an existing model that has been trained for cars can be fine-tuned to train a model that identifies trucks.
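
    The following is a minimal sketch (hypothetical paths and network choice) of fine-tuning an existing model: a .dlpk trained for cars is supplied as the pretrained model, and the selected network must match the pretrained model's type and backbone.

    import arcpy

    # Fine-tune an existing car detector to detect trucks
    arcpy.geoai.TrainUsingAutoDL(
        r"C:\data\truck_chips",          # hypothetical chips labeled with trucks
        r"C:\models\truck_detector",     # hypothetical output model
        r"C:\models\car_detector.dlpk",  # hypothetical pretrained car model
        2, "BASIC",
        ["FasterRCNN"],                  # must match the pretrained model's type and backbone
        "SAVE_BEST_MODEL")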

  • To run this tool, a GPU-equipped machine is required. If you have more than one GPU, use the GPU ID environment.
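
    A minimal sketch of selecting a GPU through the geoprocessing environment before running the tool; the device index shown (0) is an assumption for illustration.

    import arcpy

    # Run deep learning tools on the GPU and pick the device by index
    arcpy.env.processorType = "GPU"
    arcpy.env.gpuId = 0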

  • The input training data for this tool must include the images and labels folders that are generated from the Export Training Data For Deep Learning tool.
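
    A minimal sketch (hypothetical paths, tile sizes, and label source) of exporting the required images and labels folders with the Export Training Data For Deep Learning tool before training:

    import arcpy

    # Export 256 x 256 image chips with PASCAL_VOC_rectangles labels; the
    # output folder becomes the Input Training Data for Train Using AutoDL
    arcpy.ia.ExportTrainingDataForDeepLearning(
        in_raster=r"C:\data\imagery.tif",               # hypothetical source imagery
        out_folder=r"C:\data\training_chips",           # hypothetical chip folder
        in_class_data=r"C:\data\labels.gdb\buildings",  # hypothetical labeled features
        image_chip_format="TIFF",
        tile_size_x=256, tile_size_y=256,
        stride_x=128, stride_y=128,
        metadata_format="PASCAL_VOC_rectangles")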

  • Potential use cases for the tool include training object detection and pixel classification models to extract features such as building footprints, pools, and solar panels, and to classify land cover.

  • For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.

Parameters

Label | Explanation | Data Type
Input Training Data

The folders containing the image chips, labels, and statistics required to train the model. This is the output from the Export Training Data For Deep Learning tool. The metadata format of the exported data must be Classified_Tiles, PASCAL_VOC_rectangles, or KITTI_rectangles.

Folder
Output Model

The output trained model that will be saved as a deep learning package (.dlpk file).

File
Pretrained Model
(Optional)

A pretrained model that will be used to fine-tune the new model. The input is an Esri model definition file (.emd) or a deep learning package file (.dlpk).

A pretrained model with similar classes can be fine-tuned to fit the new model. The pretrained model must have been trained with the same model type and backbone model that will be used to train the new model.

File
Total Time Limit (Hours)
(Optional)

The maximum time, in hours, that AutoDL model training will run. The default is 2 hours.

Double
AutoDL Mode
(Optional)

Specifies the AutoDL mode that will be used and how intensive the AutoDL search will be.

  • Basic—The basic mode will be used. This mode is used to train all selected networks without hyperparameter tuning.
  • Advanced—The advanced mode will be used. This mode is used to perform hyperparameter tuning on the top two performing models.
String
Neural Networks
(Optional)

Specifies the architectures that will be used to train the model.

By default, all the networks will be used.

  • SingleShotDetector—The SingleShotDetector architecture will be used to train the model. SingleShotDetector is used for object detection.
  • RetinaNet—The RetinaNet architecture will be used to train the model. RetinaNet is used for object detection.
  • FasterRCNN—The FasterRCNN architecture will be used to train the model. FasterRCNN is used for object detection.
  • YOLOv3—The YOLOv3 architecture will be used to train the model. YOLOv3 is used for object detection.
  • HRNet—The HRNet architecture will be used to train the model. HRNet is used for pixel classification.
  • ATSS—The ATSS architecture will be used to train the model. ATSS is used for object detection.
  • CARAFE—The CARAFE architecture will be used to train the model. CARAFE is used for object detection.
  • CascadeRCNN—The CascadeRCNN architecture will be used to train the model. CascadeRCNN is used for object detection.
  • CascadeRPN—The CascadeRPN architecture will be used to train the model. CascadeRPN is used for object detection.
  • DCN—The DCN architecture will be used to train the model. DCN is used for object detection.
  • DeepLab—The DeepLab architecture will be used to train the model. DeepLab is used for pixel classification.
  • UnetClassifier—The UnetClassifier architecture will be used to train the model. UnetClassifier is used for pixel classification.
  • DeepLabV3Plus—The DeepLabV3Plus architecture will be used to train the model. DeepLabV3Plus is used for pixel classification.
  • PSPNetClassifier—The PSPNetClassifier architecture will be used to train the model. PSPNetClassifier is used for pixel classification.
  • ANN—The ANN architecture will be used to train the model. ANN is used for pixel classification.
  • APCNet—The APCNet architecture will be used to train the model. APCNet is used for pixel classification.
  • CCNet—The CCNet architecture will be used to train the model. CCNet is used for pixel classification.
  • CGNet—The CGNet architecture will be used to train the model. CGNet is used for pixel classification.
  • DETReg—The DETReg architecture will be used to train the model. DETReg is used for object detection.
  • DynamicRCNN—The DynamicRCNN architecture will be used to train the model. DynamicRCNN is used for object detection.
  • EmpiricalAttention—The EmpiricalAttention architecture will be used to train the model. EmpiricalAttention is used for object detection.
  • FCOS—The FCOS architecture will be used to train the model. FCOS is used for object detection.
  • FoveaBox—The FoveaBox architecture will be used to train the model. FoveaBox is used for object detection.
  • FSAF—The FSAF architecture will be used to train the model. FSAF is used for object detection.
  • GHM—The GHM architecture will be used to train the model. GHM is used for object detection.
  • LibraRCNN—The LibraRCNN architecture will be used to train the model. LibraRCNN is used for object detection.
  • PaFPN—The PaFPN architecture will be used to train the model. PaFPN is used for object detection.
  • Res2Net—The Res2Net architecture will be used to train the model. Res2Net is used for object detection.
  • SABL—The SABL architecture will be used to train the model. SABL is used for object detection.
  • VFNet—The VFNet architecture will be used to train the model. VFNet is used for object detection.
  • DMNet—The DMNet architecture will be used to train the model. DMNet is used for pixel classification.
  • DNLNet—The DNLNet architecture will be used to train the model. DNLNet is used for pixel classification.
  • FastSCNN—The FastSCNN architecture will be used to train the model. FastSCNN is used for pixel classification.
  • FCN—The FCN architecture will be used to train the model. FCN is used for pixel classification.
  • GCNet—The GCNet architecture will be used to train the model. GCNet is used for pixel classification.
  • MobileNetV2—The MobileNetV2 architecture will be used to train the model. MobileNetV2 is used for pixel classification.
  • NonLocalNet—The NonLocalNet architecture will be used to train the model. NonLocalNet is used for pixel classification.
  • OCRNet—The OCRNet architecture will be used to train the model. OCRNet is used for pixel classification.
  • PSANet—The PSANet architecture will be used to train the model. PSANet is used for pixel classification.
  • SemFPN—The SemFPN architecture will be used to train the model. SemFPN is used for pixel classification.
  • UperNet—The UperNet architecture will be used to train the model. UperNet is used for pixel classification.
  • MaskRCNN—The MaskRCNN architecture will be used to train the model. MaskRCNN is used for object detection.
String
Save Evaluated Models
(Optional)

Specifies whether all evaluated models will be saved.

  • Checked—All evaluated models will be saved.
  • Unchecked—Only the best performing model will be saved. This is the default.
Boolean

Derived Output

Label | Explanation | Data Type
Output Model File

The output model file.

File

TrainUsingAutoDL(in_data, out_model, {pretrained_model}, {total_time_limit}, {autodl_mode}, {networks}, {save_evaluated_models})
Name | Explanation | Data Type
in_data

The folders containing the image chips, labels, and statistics required to train the model. This is the output from the Export Training Data For Deep Learning tool. The metadata format of the exported data must be Classified_Tiles, PASCAL_VOC_rectangles, or KITTI_rectangles.

Folder
out_model

The output trained model that will be saved as a deep learning package (.dlpk file).

File
pretrained_model
(Optional)

A pretrained model that will be used to fine-tune the new model. The input is an Esri model definition file (.emd) or a deep learning package file (.dlpk).

A pretrained model with similar classes can be fine-tuned to fit the new model. The pretrained model must have been trained with the same model type and backbone model that will be used to train the new model.

File
total_time_limit
(Optional)

The maximum time, in hours, that AutoDL model training will run. The default is 2 hours.

Double
autodl_mode
(Optional)

Specifies the AutoDL mode that will be used and how intensive the AutoDL search will be.

  • BASIC—The basic mode will be used. This mode is used to train all selected networks without hyperparameter tuning.
  • ADVANCED—The advanced mode will be used. This mode is used to perform hyperparameter tuning on the top two performing models.
String
networks
[networks,...]
(Optional)

Specifies the architectures that will be used to train the model.

  • SingleShotDetector—The SingleShotDetector architecture will be used to train the model. SingleShotDetector is used for object detection.
  • RetinaNet—The RetinaNet architecture will be used to train the model. RetinaNet is used for object detection.
  • FasterRCNN—The FasterRCNN architecture will be used to train the model. FasterRCNN is used for object detection.
  • YOLOv3—The YOLOv3 architecture will be used to train the model. YOLOv3 is used for object detection.
  • HRNet—The HRNet architecture will be used to train the model. HRNet is used for pixel classification.
  • ATSS—The ATSS architecture will be used to train the model. ATSS is used for object detection.
  • CARAFE—The CARAFE architecture will be used to train the model. CARAFE is used for object detection.
  • CascadeRCNN—The CascadeRCNN architecture will be used to train the model. CascadeRCNN is used for object detection.
  • CascadeRPN—The CascadeRPN architecture will be used to train the model. CascadeRPN is used for object detection.
  • DCN—The DCN architecture will be used to train the model. DCN is used for object detection.
  • DeepLab—The DeepLab architecture will be used to train the model. DeepLab is used for pixel classification.
  • UnetClassifier—The UnetClassifier architecture will be used to train the model. UnetClassifier is used for pixel classification.
  • DeepLabV3Plus—The DeepLabV3Plus architecture will be used to train the model. DeepLabV3Plus is used for pixel classification.
  • PSPNetClassifier—The PSPNetClassifier architecture will be used to train the model. PSPNetClassifier is used for pixel classification.
  • ANN—The ANN architecture will be used to train the model. ANN is used for pixel classification.
  • APCNet—The APCNet architecture will be used to train the model. APCNet is used for pixel classification.
  • CCNet—The CCNet architecture will be used to train the model. CCNet is used for pixel classification.
  • CGNet—The CGNet architecture will be used to train the model. CGNet is used for pixel classification.
  • DETReg—The DETReg architecture will be used to train the model. DETReg is used for object detection.
  • DynamicRCNN—The DynamicRCNN architecture will be used to train the model. DynamicRCNN is used for object detection.
  • EmpiricalAttention—The EmpiricalAttention architecture will be used to train the model. EmpiricalAttention is used for object detection.
  • FCOS—The FCOS architecture will be used to train the model. FCOS is used for object detection.
  • FoveaBox—The FoveaBox architecture will be used to train the model. FoveaBox is used for object detection.
  • FSAF—The FSAF architecture will be used to train the model. FSAF is used for object detection.
  • GHM—The GHM architecture will be used to train the model. GHM is used for object detection.
  • LibraRCNN—The LibraRCNN architecture will be used to train the model. LibraRCNN is used for object detection.
  • PaFPN—The PaFPN architecture will be used to train the model. PaFPN is used for object detection.
  • Res2Net—The Res2Net architecture will be used to train the model. Res2Net is used for object detection.
  • SABL—The SABL architecture will be used to train the model. SABL is used for object detection.
  • VFNet—The VFNet architecture will be used to train the model. VFNet is used for object detection.
  • DMNet—The DMNet architecture will be used to train the model. DMNet is used for pixel classification.
  • DNLNet—The DNLNet architecture will be used to train the model. DNLNet is used for pixel classification.
  • FastSCNN—The FastSCNN architecture will be used to train the model. FastSCNN is used for pixel classification.
  • FCN—The FCN architecture will be used to train the model. FCN is used for pixel classification.
  • GCNet—The GCNet architecture will be used to train the model. GCNet is used for pixel classification.
  • MobileNetV2—The MobileNetV2 architecture will be used to train the model. MobileNetV2 is used for pixel classification.
  • NonLocalNet—The NonLocalNet architecture will be used to train the model. NonLocalNet is used for pixel classification.
  • OCRNet—The OCRNet architecture will be used to train the model. OCRNet is used for pixel classification.
  • PSANet—The PSANet architecture will be used to train the model. PSANet is used for pixel classification.
  • SemFPN—The SemFPN architecture will be used to train the model. SemFPN is used for pixel classification.
  • UperNet—The UperNet architecture will be used to train the model. UperNet is used for pixel classification.
  • MaskRCNN—The MaskRCNN architecture will be used to train the model. MaskRCNN is used for object detection.

By default, all the networks will be used.

String
save_evaluated_models
(Optional)

Specifies whether all evaluated models will be saved.

  • SAVE_ALL_MODELS—All evaluated models will be saved.
  • SAVE_BEST_MODEL—Only the best performing model will be saved. This is the default.
Boolean

Derived Output

Name | Explanation | Data Type
output_model_file

The output model file.

File

Code sample

TrainUsingAutoDL (Python window)

This example shows how to use the TrainUsingAutoDL function.

# Name: TrainUsingAutoDL.py
# Description: Train a deep learning model on imagery data with
# automatic hyperparameter selection.
  
# Import system modules
import arcpy
import os

# Set local variables

datapath = "path_to_training_data" 
out_path = "path_to_trained_model"

out_model = os.path.join(out_path, "mymodel")

# Run Train Using AutoDL
arcpy.geoai.TrainUsingAutoDL(
    datapath, out_model, None, 2, "BASIC", 
    ["ATSS", "DCN", "FasterRCNN", "RetinaNet", "SingleShotDetector", "YOLOv3"], 
    "SAVE_BEST_MODEL")

Environments

Related topics