Interactive object detection basics

Available with Advanced license.

Available with Image Analyst license.

Interactive object detection is used to find objects of interest in imagery displayed in a map or scene.

Object detection relies on a deep learning model that has been trained to detect specific objects in the displayed view, such as windows and doors on buildings in a scene. Detection results are saved to a point feature class with a confidence score, bounding-box dimensions, and the label name as attributes. You can also detect other objects interactively, such as parked aircraft or airport structures, by choosing a generic model and clicking in the view.

You must install Deep Learning Libraries to use object detection.

License:

The interactive object detection tool requires either an ArcGIS Pro Advanced license or the ArcGIS Image Analyst extension.

The Object Detection tool is on the Exploratory 3D Analysis drop-down menu in the Workflows group on the Analysis tab. After you select the Object Detection tool, the Exploratory Analysis pane appears.

Use the Exploratory Analysis pane to modify the object detection parameters and set which camera method to use for detection results. The first time the tool is run, the Esri Windows and Doors model is used. The model is loaded and the detections are calculated. Additional runs do not require reloading the model and take less time. If you change the model selection, the new model must be loaded. The Generic Object model does not require a model to be downloaded.

The images below illustrate object detection results with the two kinds of symbology available: a bounding box or an X marking the location center point.

Interactive object detection using box symbology

Interactive object detection using location point symbology

Detect objects in a 3D view

The Object Detection tool can work with any supported model that is trained to detect particular objects. It comes with a provided model specific to detecting windows and doors, as well as a generic model for detecting other objects interactively.

The Esri Windows and Doors deep learning model detects windows and doors as point features. The object detection parameters for using the Esri Windows and Doors model are described in the following table:

Option | Description

Model

The deep learning package (.dlpk) to use for detecting objects. The model types supported include FasterRCNN, YOLOv3, Single Shot Detector (SSD), and RetinaNet.

Expand the Model input drop-down arrow and click Download Model to access the pretrained Esri Windows and Doors model. Optionally, click Browse to choose a local deep learning package or download one from ArcGIS Online.

Classes

The list of real-world objects to detect. This list is populated from the .dlpk file. The default is set to All, but you can specifically set it to either only windows or only doors.

Minimum Confidence Level

The minimum detection score a detection must meet. Detections with scores lower than this confidence level are discarded. The default value is 0.5.
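The effect of this threshold can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the `filter_by_confidence` function and the `score` and `label` keys are hypothetical names chosen for the example.

```python
from typing import Dict, List


def filter_by_confidence(detections: List[Dict], min_confidence: float = 0.5) -> List[Dict]:
    """Keep detections whose score is not lower than the minimum confidence."""
    return [d for d in detections if d["score"] >= min_confidence]


# Hypothetical detections; the values are made up for illustration.
detections = [
    {"label": "window", "score": 0.91},
    {"label": "door", "score": 0.42},
    {"label": "window", "score": 0.50},
]

kept = filter_by_confidence(detections)
print([d["label"] for d in kept])  # ['window', 'window']
```

Raising the value returns fewer, more certain detections; lowering it returns more candidates, some of which may be false positives.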

Maximum Overlap Threshold

The intersection over union threshold with other detections. If detection results overlap, the one with the highest score is considered a true positive. The default value is 0.
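Intersection over union (IoU) is the area shared by two boxes divided by the area they cover together. The sketch below shows one common way such a threshold is applied, greedy non-maximum suppression. It is an assumption that the tool behaves this way; the function names and the `(xmin, ymin, xmax, ymax)` box layout are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(dets: List[Tuple[Box, float]],
                      max_overlap: float = 0.0) -> List[Tuple[Box, float]]:
    """Visit boxes from highest to lowest score and keep a box only if its
    IoU with every box already kept does not exceed max_overlap."""
    kept: List[Tuple[Box, float]] = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= max_overlap for k in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections: the first two boxes overlap, the third is separate.
dets = [((0, 0, 2, 2), 0.9), ((1, 1, 3, 3), 0.8), ((5, 5, 6, 6), 0.7)]
print([s for _, s in suppress_overlaps(dets)])       # [0.9, 0.7]
print([s for _, s in suppress_overlaps(dets, 0.5)])  # [0.9, 0.8, 0.7]
```

With a threshold of 0, any overlap at all removes the lower-scoring detection. A higher threshold tolerates partially overlapping results.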

Process using GPU

Use the graphics processing unit (GPU) instead of the central processing unit (CPU) for processing. This is recommended if you have a graphics card with at least 8 GB of dedicated GPU memory.
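To check whether a machine meets the 8 GB recommendation, you could query the GPU from Python. The sketch below assumes that PyTorch is available in the active Python environment, which this page does not confirm; the function names are hypothetical. If PyTorch or a CUDA-capable GPU is not found, the probe returns `None`.

```python
from typing import Optional


def dedicated_gpu_memory_gb(device_index: int = 0) -> Optional[float]:
    """Return the total memory of a CUDA device in GB, or None if unavailable."""
    try:
        import torch  # assumption: PyTorch is installed in this environment
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / (1024 ** 3)


def meets_gpu_recommendation(memory_gb: Optional[float], minimum_gb: float = 8.0) -> bool:
    """True when the reported GPU memory reaches the recommended minimum."""
    return memory_gb is not None and memory_gb >= minimum_gb


memory = dedicated_gpu_memory_gb()
print("GPU processing recommended:", meets_gpu_recommendation(memory))
```

The memory reported by the driver can be slightly lower than the nominal size of the card, so treat a result just under 8 as a borderline case rather than a hard failure.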

Feature Layer

The name of the output feature layer.

  • If the layer does not exist, a feature class is created in the project's default geodatabase and added to the current map or scene.
  • If the layer is already in the map or scene and has the required schema, newly detected objects are appended to the existing feature class.
  • If you rerun the tool when the layer is not in the current map or scene, a new uniquely named feature class is created in the default geodatabase and added to the map or scene.
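The three rules above can be summarized as a small decision function. This is a sketch under stated assumptions: `resolve_output` is a hypothetical name, the numeric suffix used to make a name unique is invented for the example, and the case of a layer that is in the view but lacks the required schema is not described on this page, so the fallback shown for it is a guess.

```python
from typing import Set, Tuple


def resolve_output(name: str,
                   layers_in_view: Set[str],
                   classes_in_gdb: Set[str],
                   schema_ok: bool = True) -> Tuple[str, str]:
    """Return (action, feature_class_name) for a requested output layer name."""
    # Layer is already in the map or scene with the required schema: append.
    if name in layers_in_view and schema_ok:
        return ("append", name)
    # No feature class with this name exists yet: create it.
    if name not in classes_in_gdb:
        return ("create", name)
    # The name is taken in the geodatabase but cannot be reused:
    # create a uniquely named feature class (suffix scheme is assumed).
    suffix = 1
    while f"{name}_{suffix}" in classes_in_gdb:
        suffix += 1
    return ("create", f"{name}_{suffix}")


print(resolve_output("Detected", set(), set()))                # ('create', 'Detected')
print(resolve_output("Detected", {"Detected"}, {"Detected"}))  # ('append', 'Detected')
print(resolve_output("Detected", set(), {"Detected"}))         # ('create', 'Detected_1')
```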

Description

The description to be included in the attribute table. Multiple detection results can be saved to the same feature layer and a description can be used to differentiate between these multiple detections.

Symbology

Set the returned shape of the output feature layer using the default color of electron gold. The following are the symbology choices:

  • Location Point—An X marking the center point of the feature. This is the default.
  • Vertical Bounding Box (3D only)—A vertical semitransparent filled bounding box. Use the vertical bounding box symbology in scenes for deep learning models that detect vertical objects, such as windows and doors.
  • Horizontal Bounding Box (3D only)—A horizontal semitransparent filled bounding box. Use the horizontal bounding box symbology in scenes for deep learning models that detect horizontal objects, such as swimming pools.

If the output layer is already in the map or scene and has custom symbology, its symbology is not changed when the tool is run.

Distance

Set the maximum distance from the camera within which results are retained. Detections beyond this distance are ignored.

Width

Set the minimum and maximum width values for the size of the expected returned result.

Height

Set the minimum and maximum height values for the size of the expected returned result.

Note:
Distance, Width, and Height parameters are in the Filter Results section, which you may need to expand to set these values.
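The Filter Results parameters act as a post-processing step on the detections. The sketch below illustrates the idea only: the `Detection` class, its field names, and the choice to treat the range limits as inclusive are assumptions, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str
    distance: float  # distance from the camera
    width: float
    height: float


def filter_results(dets: List[Detection],
                   max_distance: float,
                   width_range: Tuple[float, float],
                   height_range: Tuple[float, float]) -> List[Detection]:
    """Keep detections within the distance limit and the width and height ranges."""
    return [
        d for d in dets
        if d.distance <= max_distance
        and width_range[0] <= d.width <= width_range[1]
        and height_range[0] <= d.height <= height_range[1]
    ]


# Hypothetical detections.
dets = [
    Detection("window", distance=20.0, width=1.2, height=1.5),
    Detection("door", distance=80.0, width=1.0, height=2.1),    # too far away
    Detection("window", distance=15.0, width=4.0, height=1.5),  # too wide
]

kept = filter_results(dets, max_distance=50.0,
                      width_range=(0.5, 2.0), height_range=(0.5, 2.5))
print([(d.label, d.distance) for d in kept])  # [('window', 20.0)]
```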

The creation methods for object detection are described in the following table:

Method | Description

Current Camera

This is the default creation method. It uses the current camera position to detect objects in the view.

Reposition Camera (3D only)

Repositions the camera to a horizontal or vertical viewpoint before detecting objects. Set up the viewpoint for the area of interest first, and then use this method to fine-tune the alignment. This method is not recommended for bringing distant objects closer in the view.

Generic object detection

Use the Esri Generic Object deep learning model to interactively detect individual objects such as vehicles, structures, and people in a map or scene. Instead of using the camera, you can click directly in the view to detect results. Some detection options, such as classes, confidence level, overlap threshold, and GPU processing, are not available. Results are stored as point features using the symbology option set for the tool.

The parameters for object detection using the Esri Generic Object model are described in the following table:

Option | Description

Model

Expand the Model drop-down list and choose Esri Generic Object to define the object detection process.

Feature Layer

The name of the output feature layer.

  • If the layer does not exist, a feature class is created in the project's default geodatabase and added to the current map or scene.
  • If the layer is already in the map or scene and has the required schema, newly detected objects are appended to the existing feature class.
  • If you rerun the tool when the layer is not in the current map or scene, a new uniquely named feature class is created in the default geodatabase and added to the view.

Description

The description to be included in the attribute table as a field. Multiple detection results can be saved to the same feature layer and a description can be used to differentiate between these multiple detections.

Symbology

Set the returned shape of the output feature layer using the default color of electron gold. The following are the symbology choices:

  • Location Point—An X marking the center point of the feature. This is the default.
  • Vertical Bounding Box (3D only)—A vertical semitransparent filled bounding box.
  • Horizontal Bounding Box (3D only)—A horizontal semitransparent filled bounding box.

If the output layer is already in the map or scene and has custom symbology, its symbology is not changed when the tool is run.

Creation Method

Interactive Detection—Click in the view to detect an individual object at that location.

Update detection results

To change the output results, for example by using a different confidence value or choosing another area of interest, change those properties and run the Object Detection tool again. You can also expand the Filter Results section to specify size and distance values that fine-tune the returned results. Newly detected objects are appended to the same layer. Alternatively, provide a new name to create another output feature layer for comparison. Manually updating the attribute values of object detection results is not recommended.

Tip:

Before rerunning the tool, turn the layer visibility off for the previous detection results. Otherwise, those results may overlap objects being detected and could affect detection results.

Delete detection results

Detection results are added as point features. You can delete individual detected object features using the standard editing workflows. Alternatively, delete the entire feature class from the project's default geodatabase. Removing the layer from the Contents pane does not automatically delete your results, as they still exist in the geodatabase. If you rerun the tool when the layer is not in the current map or scene, a new uniquely named feature class is created in the default geodatabase and added to the map or scene.