Use the model—ArcGIS pretrained models

You can use the HF Zero-Shot Classification pretrained model in the Classify Objects Using Deep Learning tool available in the Image Analyst toolbox in ArcGIS Pro.

Classify objects

Complete the following steps to classify objects in imagery using the HF Zero-Shot Classification model:

Download the HF Zero-Shot Classification model.
Click Add data to add an image to the Contents pane.
You'll run the prediction on this image.
Click the Analysis tab and browse to Tools.
In the Geoprocessing pane, click Toolboxes and expand Image Analyst Tools. Select the Classify Objects Using Deep Learning tool under Deep Learning.
On the Parameters tab, set the variables as follows:
1. Input Raster—Choose an input image from the drop-down menu or from a folder location.
2. Output Classified Objects Feature Class—Set the output feature layer that will contain the classification label and confidence score.
3. Model Definition—Select the pretrained model .dlpk file.
4. Arguments (optional)—Change the values of the arguments if required.
  - huggingface_id—The model ID of a pretrained Zero-Shot Image Classification model hosted on huggingface.co.
    To find candidate models, choose the Zero-Shot Image Classification tag in the Tasks list on the Hugging Face model hub.
    The model ID has the form {username}/{repository}, as displayed at the top of the model page.
    Only models that include both config.json and preprocessor_config.json are supported. You can verify that these files are present on the Files and versions tab of the model page.
  - padding—The number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. Increase the value to smooth the output and reduce edge artifacts. The padding can be at most half of the tile size.
  - classes—The categories into which you want to classify the image. Zero-shot classifiers are not trained on a predetermined set of classes, so you provide the class names here, separated by commas (,).
  - batch_size—The number of image tiles processed in each step of the model inference. The largest usable value depends on the memory of your graphics card.
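The constraints on these arguments can be checked before you run the tool. The following Python sketch is illustrative only: build_model_arguments and its tile_size parameter are hypothetical names, the model ID pattern is an approximation of the {username}/{repository} form, and the "name value;name value" layout of the returned string is an assumption about how the Arguments table is passed in a script, so verify it against your version of the tool.

```python
import re

# Approximate pattern for a model ID of the form {username}/{repository}.
_MODEL_ID = re.compile(r"^[\w.\-]+/[\w.\-]+$")


def build_model_arguments(huggingface_id, classes, padding, batch_size, tile_size):
    """Validate the tool arguments and join them into a single string.

    Hypothetical helper: the "name value;name value" layout is an
    assumption, not a documented guarantee.
    """
    if not _MODEL_ID.match(huggingface_id):
        raise ValueError("huggingface_id must look like {username}/{repository}")
    # The padding can be at most half of the tile size.
    if padding < 0 or padding * 2 > tile_size:
        raise ValueError("padding must be between 0 and half of the tile size")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Trim whitespace and drop empty entries before joining with commas.
    names = [name.strip() for name in classes if name.strip()]
    if not names:
        raise ValueError("provide at least one class name")
    return ";".join([
        f"huggingface_id {huggingface_id}",
        f"padding {padding}",
        f"classes {','.join(names)}",
        f"batch_size {batch_size}",
    ])
```

For example, build_model_arguments("openai/clip-vit-base-patch32", ["building", "road"], 56, 4, 224) passes validation, while a padding of 200 with the same tile size raises a ValueError. The model ID here is used only for illustration.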
On the Environments tab, set the variables as follows:
1. Processing Extent—Select the default extent or any other option from the drop-down menu.
2. Processor Type—Select CPU or GPU as needed.
  It is recommended that you select GPU, if available, and set GPU ID to the GPU to be used.
Click Run. The output layer is added to the map. The Label column of the attribute table contains the predicted class and confidence.
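The same workflow can also be scripted. The following is a minimal sketch, assuming ArcGIS Pro with the Image Analyst extension is installed and licensed; the helper names environment_settings and classify_objects are hypothetical. The environment_settings helper mirrors the Processor Type and GPU ID choices on the Environments tab, and classify_objects passes the same inputs that you set on the Parameters tab.

```python
def environment_settings(use_gpu, gpu_id=0):
    """Map the Environments tab choices to arcpy environment names."""
    settings = {"processorType": "GPU" if use_gpu else "CPU"}
    if use_gpu:
        settings["gpuId"] = gpu_id
    return settings


def classify_objects(in_raster, out_feature_class, model_path, model_arguments,
                     use_gpu=True, gpu_id=0):
    """Run Classify Objects Using Deep Learning from a script.

    Requires ArcGIS Pro; arcpy is imported here so that the helper
    above can be used on machines without ArcGIS.
    """
    import arcpy
    from arcpy.ia import ClassifyObjectsUsingDeepLearning

    arcpy.CheckOutExtension("ImageAnalyst")
    try:
        # Apply the processor settings only for the duration of this run.
        with arcpy.EnvManager(**environment_settings(use_gpu, gpu_id)):
            ClassifyObjectsUsingDeepLearning(
                in_raster=in_raster,
                out_feature_class=out_feature_class,
                in_model_definition=model_path,  # the downloaded .dlpk file
                model_arguments=model_arguments,
            )
    finally:
        arcpy.CheckInExtension("ImageAnalyst")
```

A call might look like classify_objects(r"C:\data\scene.tif", r"C:\data\results.gdb\classified", r"C:\models\model.dlpk", "padding 56;batch_size 4"), where every path is a placeholder and the argument string format should be checked against your version of the tool.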
