You can use this model in the Detect Objects Using Deep Learning tool available in the Image Analyst toolbox in ArcGIS Pro. Follow the steps below to use the model for parsing text in images.
Supported imagery
This model can be used with high-resolution, 3-band street-level imagery that contains medium to large text, or with scanned documents.
Detect and recognize text
Complete the following steps to read text from images:
- Download the Optical Character Recognition model and add an image or street-level imagery containing text to ArcGIS Pro.
- Zoom to an area of interest.
- Browse to Tools on the Analysis tab.
- Click the Toolboxes tab in the Geoprocessing pane, select Image Analyst Tools, and browse to the Detect Objects Using Deep Learning tool under Deep Learning.
- Set the variables on the Parameters tab as follows:
- Input Raster—Select the image.
- Output Detected Objects—Specify the output feature class that will contain the text detection and recognition results.
- Model Definition—Select the pretrained model .dlpk file.
- Arguments—Change the values of the arguments if required.
- threshold—The detections with a confidence score higher than this threshold are included in the result. The allowed values range from 0 to 1.0.
- test_time_augmentation—Performs test time augmentation while predicting. If true, predictions of flipped and rotated variants of the input image will be merged into the final output.
- batch_size—Number of text detections to be processed by the text recognition model at once.
- Non-Maximum Suppression—Optionally, check or uncheck the check box as needed.
If checked, do the following:
- Set the Confidence Score Field value.
- Set the Class Value Field value.
- Set the Maximum Overlap Ratio value.
- Set the variables on the Environments tab as follows:
- Processing Extent—Select Default or any other option from the drop-down menu.
- Processor Type—Select CPU or GPU as needed.
Note:
If a GPU is available, it is recommended that you select GPU and set GPU ID to specify which GPU to use.
- Click Run.
The output layer is added to the map.
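The steps above can also be scripted. The following is a minimal sketch, not an official workflow, that assumes an ArcGIS Pro Python environment with the Image Analyst extension licensed. The helper names `build_arguments` and `detect_text`, the file paths, and the argument values shown are hypothetical placeholders; the formatting of the boolean `test_time_augmentation` value in the arguments string is also an assumption and may need adjusting to match what the tool dialog shows for your model.

```python
def build_arguments(threshold=0.5, test_time_augmentation=False, batch_size=4):
    """Format the model arguments as a 'name value; name value' string.

    The values used here are illustrative placeholders, not the model's
    documented defaults.
    """
    # The confidence threshold must lie in the allowed range of 0 to 1.0.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1.0")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return "; ".join([
        f"threshold {threshold}",
        # Assumption: booleans are written as True/False in the string.
        f"test_time_augmentation {test_time_augmentation}",
        f"batch_size {batch_size}",
    ])


def detect_text(in_raster, out_features, model_dlpk, use_gpu=True, gpu_id=0):
    """Run Detect Objects Using Deep Learning with the pretrained .dlpk."""
    # Imported here so the helper above works without ArcGIS Pro installed.
    import arcpy

    arcpy.CheckOutExtension("ImageAnalyst")

    # Equivalent of the Processor Type and GPU ID environment settings.
    arcpy.env.processorType = "GPU" if use_gpu else "CPU"
    if use_gpu:
        arcpy.env.gpuId = gpu_id

    arcpy.ia.DetectObjectsUsingDeepLearning(
        in_raster,          # Input Raster
        out_features,       # Output Detected Objects
        model_dlpk,         # Model Definition
        build_arguments(),  # Arguments
        "NO_NMS",           # Non-Maximum Suppression left unchecked
    )
```

A call might look like `detect_text(r"C:\data\street.tif", r"C:\data\ocr.gdb\detected_text", r"C:\models\ocr_model.dlpk")`, where all three paths are placeholders for your own data and downloaded model.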