You can use this model in the Transform Text Using Deep Learning tool available in the GeoAI toolbox in ArcGIS Pro.
Complete the following steps to standardize the addresses in a text field:
- Download the Address Standardization model from ArcGIS Living Atlas of the World.
- Browse to Tools on the Analysis tab.
- Click the Toolboxes tab in the Geoprocessing pane, select GeoAI Tools, and browse to the Transform Text Using Deep Learning tool under Text Analysis.
- Set the variables on the Parameters tab as follows:
  - Input Table—The input point, line, or polygon feature class or table containing the text to be transformed.
  - Text Field—The text field within the input feature class or table that contains the text to be transformed.
  - Input Model Definition File—Select the pretrained or fine-tuned model .dlpk file.
  - Result Field—The name of the field that will contain the transformed text in the output feature class or table. The default field name is Result.
  - Model Arguments (optional)—Change the values of the arguments if required.
    - Sequence_length—Maximum sequence length (at subword level after tokenization) of the training data to be considered for training the model. The default value is 512. This is applicable only for models with HuggingFace transformer backbones.
  - Use the Advanced options to make the results more accurate:
    - Batch Size—The number of rows to be processed at once. Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used.
    - Minimum Sequence Length—The minimum number of characters for the output text string. The recommended value is 10.
    - Maximum Sequence Length—The maximum number of characters for the output text string. The default value is 50.
- On the Environments tab, set Processor Type to CPU or GPU. Select GPU if one is available, and set GPU ID to the GPU to be used.
- Click Run.
The output layer or table is added to the map. Click Attribute Table to see the output.
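
The same steps can also be scripted. The following is a minimal sketch, not an official sample: it assumes the tool is exposed in Python as `arcpy.geoai.TransformTextUsingDeepLearning` and that its keyword names match the ones used below, which should be checked against the tool reference for your ArcGIS Pro version. The helper names `build_transform_args` and `run_transform`, and all table, field, and file names, are hypothetical.

```python
def build_transform_args(in_table, text_field, model_path, result_field="Result",
                         batch_size=None, sequence_length=None,
                         min_length=None, max_length=None):
    """Collect the tool parameters, leaving out any option that was not set."""
    args = {
        "in_table": in_table,                    # Input Table
        "text_field": text_field,                # Text Field
        "in_model_definition_file": model_path,  # Input Model Definition File (.dlpk)
        "result_field": result_field,            # Result Field
    }
    if batch_size is not None:
        args["batch_size"] = batch_size          # Batch Size
    if sequence_length is not None:
        # Model Arguments are passed as name/value pairs (assumed format).
        args["model_arguments"] = [["sequence_length", str(sequence_length)]]
    if min_length is not None:
        args["minimum_sequence_length"] = min_length  # Minimum Sequence Length
    if max_length is not None:
        args["maximum_sequence_length"] = max_length  # Maximum Sequence Length
    return args


def run_transform(args, use_gpu=False, gpu_id=0):
    """Set the Processor Type environment and run the tool (needs ArcGIS Pro)."""
    import arcpy  # imported here so the helper above works without ArcGIS Pro

    arcpy.env.processorType = "GPU" if use_gpu else "CPU"
    if use_gpu:
        arcpy.env.gpuId = gpu_id
    return arcpy.geoai.TransformTextUsingDeepLearning(**args)
```

A call might look like `run_transform(build_transform_args("addresses", "raw_address", "AddressStandardization.dlpk", batch_size=8), use_gpu=True)`, where the table, field, and model file names are placeholders for your own data. Keeping the parameter assembly separate from the tool call makes it possible to inspect the settings before starting a long run.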