Use the model—ArcGIS pretrained models

You can use the Text SAM model with the Detect Objects Using Deep Learning tool, which is available in the Image Analyst toolbox in ArcGIS Pro.

Complete the following steps to use the Text SAM pretrained model:

Download the model and add the imagery layer in ArcGIS Pro.
Click the Analysis tab and browse to Tools.
In the Geoprocessing pane, click Toolboxes and expand Image Analyst Tools. Select the Detect Objects Using Deep Learning tool under Deep Learning.
On the Parameters tab, set the variables as follows:
1. Input Raster—Select the image.
2. Output Detected Objects—Set the output feature class that will contain the detected objects.
3. Model Definition—Select the pretrained model .dlpk file.
4. Arguments (optional)—Change the values of the arguments if required.
  - text_prompt—Text that describes the objects to be detected. The input can be multiple text prompts, separated by commas, allowing the detection of multiple classes.
  - padding—Number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. Increase the value to smooth the output and reduce edge artifacts. The padding can be at most half of the tile size.
  - batch_size—Number of image tiles processed in each step of the model inference. The appropriate value depends on the memory of your graphics card.
  - box_threshold—The confidence score used for selecting the detections to be included in the results. The allowed values range from 0 to 1.0.
  - text_threshold—The confidence score used for associating the detected objects with the provided text prompt. A higher value ensures strong association but potentially fewer matches. The allowed values range from 0 to 1.0.
  - tta_scales—Performs test-time augmentation while predicting by changing the scale of the image. Values from 0.5 to 1.5 are recommended. Multiple scale values separated by commas can also be provided, for example, 0.9, 1, 1.1.
  - box_nms_thresh—The box IoU cut-off used by non-maximum suppression to filter duplicate masks.
5. Non Maximum Suppression—Optionally, select the check box to remove the overlapping features with lower confidence.
  If checked, do the following:
  - Confidence Score Field—Use the default.
  - Class Value Field—Use the default.
  - Max Overlap Ratio—Set the max overlap ratio value to 0.1.
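When the tool is scripted rather than run from the Geoprocessing pane, the model arguments listed above are typically passed as a single string. The following is a minimal sketch, assuming the semicolon-delimited `name value` form that the ArcPy version of the tool accepts. `build_arguments` is a hypothetical helper, and the sample values and the tile size of 1024 are illustrative, not the model's defaults.

```python
def build_arguments(text_prompt, padding, batch_size, box_threshold,
                    text_threshold, tta_scales, box_nms_thresh, tile_size):
    """Validate Text SAM arguments and join them into one string.

    Hypothetical helper: the checks mirror the ranges described above.
    """
    # The two thresholds are confidence scores (0 to 1.0); box_nms_thresh
    # is an IoU ratio, which also lies between 0 and 1.
    for name, value in (("box_threshold", box_threshold),
                        ("text_threshold", text_threshold),
                        ("box_nms_thresh", box_nms_thresh)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    # Padding can be at most half of the tile size.
    if not 0 <= padding <= tile_size // 2:
        raise ValueError(f"padding must be between 0 and {tile_size // 2}")

    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Multiple prompts are separated by commas, one per class to detect.
    prompts = [p.strip() for p in text_prompt.split(",") if p.strip()]
    if not prompts:
        raise ValueError("text_prompt must name at least one object class")

    pairs = [
        ("text_prompt", ",".join(prompts)),
        ("padding", padding),
        ("batch_size", batch_size),
        ("box_threshold", box_threshold),
        ("text_threshold", text_threshold),
        ("tta_scales", ",".join(str(s) for s in tta_scales)),
        ("box_nms_thresh", box_nms_thresh),
    ]
    # Assumed format: "name value" pairs separated by semicolons.
    return ";".join(f"{name} {value}" for name, value in pairs)


# Illustrative values only.
arguments = build_arguments("car, tree", padding=256, batch_size=4,
                            box_threshold=0.2, text_threshold=0.2,
                            tta_scales=[0.9, 1, 1.1], box_nms_thresh=0.7,
                            tile_size=1024)
print(arguments)
```

Validating the values before the tool runs surfaces an out-of-range threshold or an oversized padding immediately, rather than partway through a long inference job.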
On the Environments tab, set the variables as follows:
1. Processing Extent—Select Default or any other option from the drop-down menu.
2. Cell Size—Set the value appropriately.
  Choose a cell size, in meters, that maximizes the visibility of the objects of interest throughout the chosen extent. Use a larger cell size to detect larger objects and a smaller cell size to detect smaller objects. For example, set the cell size to 10 meters for cloud detection and to 0.30 meters (30 centimeters) for car detection. For further information regarding cell size, refer to the provided resource.
3. Processor Type—Select CPU or GPU as needed.
  If a GPU is available, select GPU and set GPU ID to the GPU to be used.
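The same environment settings can also be expressed in a script. This is a sketch under stated assumptions: `environment_settings` and `apply_environment` are hypothetical helpers, the example cell sizes repeat the cloud and car figures above, and the property names `cellSize`, `processorType`, and `gpuId` are assumed to be the `arcpy.env` settings that correspond to the Environments tab options. `arcpy` is imported only inside `apply_environment`, so the rest runs without an ArcGIS Pro installation.

```python
# Example cell sizes in meters, taken from the guidance above.
EXAMPLE_CELL_SIZES = {"cloud": 10.0, "car": 0.30}


def environment_settings(cell_size, use_gpu=True, gpu_id=0):
    """Collect the environment values for a detection run in a dict."""
    if cell_size <= 0:
        raise ValueError("cell_size must be a positive number of meters")

    settings = {
        "cellSize": cell_size,
        "processorType": "GPU" if use_gpu else "CPU",
    }
    if use_gpu:
        # The GPU ID only matters when the GPU processor type is selected.
        settings["gpuId"] = gpu_id
    return settings


def apply_environment(settings):
    """Copy the collected values onto arcpy.env (requires ArcGIS Pro)."""
    import arcpy  # imported lazily; not available outside ArcGIS Pro

    for name, value in settings.items():
        setattr(arcpy.env, name, value)


# Settings for car detection on the first GPU.
car_settings = environment_settings(EXAMPLE_CELL_SIZES["car"])
print(car_settings)
```

The processing extent is omitted here, so it stays at its default, matching the Default option in the drop-down menu.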
Click Run.
Once processing is complete, the output layer is added to the map.
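For repeatable runs, the procedure above can be scripted. The sketch below assumes the ArcPy function `arcpy.ia.DetectObjectsUsingDeepLearning` and its keyword names as I understand them from the tool's syntax. All paths, the prompt, and the argument values are placeholders, and `tool_parameters` and `run_detection` are hypothetical helpers. `arcpy` is imported lazily, so only `run_detection` needs ArcGIS Pro with the Image Analyst extension.

```python
def tool_parameters(in_raster, out_features, model_dlpk, arguments,
                    use_nms=True, max_overlap_ratio=0.1):
    """Assemble keyword arguments for the detection tool."""
    params = {
        "in_raster": in_raster,
        "out_detected_objects": out_features,
        "in_model_definition": model_dlpk,
        "arguments": arguments,
        "run_nms": "NMS" if use_nms else "NO_NMS",
    }
    if use_nms:
        # The confidence and class fields are left at their defaults;
        # only the overlap ratio is set explicitly.
        params["max_overlap_ratio"] = max_overlap_ratio
    return params


def run_detection(params):
    """Run the tool with an Image Analyst license (requires ArcGIS Pro)."""
    import arcpy
    from arcpy.ia import DetectObjectsUsingDeepLearning

    if arcpy.CheckOutExtension("ImageAnalyst") != "CheckedOut":
        raise RuntimeError("Image Analyst extension is not available")
    try:
        DetectObjectsUsingDeepLearning(**params)
    finally:
        arcpy.CheckInExtension("ImageAnalyst")


# Placeholder paths and values, for illustration only.
params = tool_parameters(
    in_raster=r"C:\data\scene.tif",
    out_features=r"C:\data\results.gdb\detected_cars",
    model_dlpk=r"C:\models\TextSAM.dlpk",
    arguments="text_prompt car;padding 256;batch_size 4",
)
print(params["run_nms"], params["max_overlap_ratio"])
```

Separating parameter assembly from execution lets the settings be reviewed or logged before a potentially long GPU run starts.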
