Use the model

You can use the CLIP Zero-Shot Classifier pretrained model with the Classify Objects Using Deep Learning tool, which is available in the Image Analyst toolbox in ArcGIS Pro.

Recommended imagery configuration

The recommended imagery configuration is as follows:

Resolution—The expected image resolution is 224x224 pixels to 800x800 pixels.

Classify Oriented Imagery

Complete the following steps to classify imagery with the CLIP Zero-Shot Classifier model:

Download the CLIP Zero-Shot Classifier model.
Click Add data to add an image to the Contents pane.
You'll run the prediction on this image.
Click the Analysis tab and browse to Tools.
In the Geoprocessing pane, click Toolboxes and expand Image Analyst Tools. Select the Classify Objects Using Deep Learning tool under Deep Learning.
On the Parameters tab, set the variables as follows:
1. Input Raster—Choose an input image from the drop-down menu or from a folder location.
2. Output Classified Objects Feature Class—Set the output feature layer that will contain the classification label and confidence score.
3. Model Definition—Select the pretrained model .dlpk file.
4. Arguments (optional)—Change the values of the arguments if required.
  - classes—Provide the labels for the image classification. Multiple classes can be provided here by using a comma as a class separator. If a single class is provided, a new class called Other is added by default.
  - threshold_binary_class—Applicable only for binary classification use cases. The class originally predicted by the model is maintained if its probability exceeds the provided threshold. If the class probability does not exceed the provided threshold, the predicted class is discarded and the other class (not originally predicted by the model) is assigned to the image.
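The behavior of the two arguments described above can be illustrated with a short sketch. This is only an illustration of the rules as stated in this topic, not the model's actual implementation. The function names resolve_classes and apply_binary_threshold are hypothetical, and the sketch assumes that "exceeds" means strictly greater than the threshold.

```python
def resolve_classes(classes_arg):
    """Split the comma-separated classes argument into a list of labels."""
    labels = [label.strip() for label in classes_arg.split(",") if label.strip()]
    if not labels:
        raise ValueError("at least one class label is required")
    # As described above: a single class gets a companion class named Other.
    if len(labels) == 1:
        labels.append("Other")
    return labels


def apply_binary_threshold(predicted, probability, labels, threshold):
    """Keep the predicted class only if its probability exceeds the threshold."""
    if len(labels) != 2:
        # The threshold applies only to binary classification.
        return predicted
    if probability > threshold:
        return predicted
    # Otherwise assign the class that the model did not originally predict.
    return labels[1] if predicted == labels[0] else labels[0]
```

For example, with classes set to damaged, the labels become damaged and Other. A prediction of damaged with probability 0.55 and a threshold of 0.7 would then be reassigned to Other, while the same prediction with probability 0.9 would be kept.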
On the Environments tab, set the variables as follows:
1. Processing Extent—Select the default extent or any other option from the drop-down menu.
2. Processor Type—Select CPU or GPU as needed.
  It is recommended that you select GPU, if available, and set GPU ID to the ID of the GPU to use.
Click Run. The output layer is added to the map. The Label column of the attribute table contains the predicted class and confidence.
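The steps above can also be scripted. The following is a minimal sketch, assuming an ArcGIS Pro Python environment with the Image Analyst extension licensed. The helper names build_model_arguments and classify_image are hypothetical. The sketch assumes the tool accepts model arguments as a semicolon-separated string of name and value pairs, as in other deep learning tool samples. How class labels that contain spaces are parsed has not been verified here, so single-word labels are safest.

```python
def build_model_arguments(classes, threshold_binary_class=None):
    """Join model arguments as 'name value' pairs separated by semicolons."""
    pairs = [("classes", ",".join(classes))]
    if threshold_binary_class is not None:
        pairs.append(("threshold_binary_class", str(threshold_binary_class)))
    return ";".join(f"{name} {value}" for name, value in pairs)


def classify_image(in_raster, out_feature_class, model_dlpk, classes,
                   threshold_binary_class=None, use_gpu=True, gpu_id=0):
    """Run Classify Objects Using Deep Learning on a single image."""
    # arcpy ships with ArcGIS Pro; it is imported here so that the helper
    # above can be used and tested without it.
    import arcpy

    arcpy.CheckOutExtension("ImageAnalyst")

    # Equivalent of the Environments tab: processor type and GPU ID.
    arcpy.env.processorType = "GPU" if use_gpu else "CPU"
    if use_gpu:
        arcpy.env.gpuId = gpu_id

    # Equivalent of the Parameters tab: input raster, output feature class,
    # model definition (.dlpk), and the optional model arguments.
    arcpy.ia.ClassifyObjectsUsingDeepLearning(
        in_raster,
        out_feature_class,
        model_dlpk,
        model_arguments=build_model_arguments(classes, threshold_binary_class),
    )
```

Calling classify_image with the path to an image, an output feature class, the downloaded .dlpk file, and a list such as ["dog", "cat"] would pass classes dog,cat to the tool. All paths are placeholders that you supply.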
