You can use the HF Entity Recognition model in the Extract Entities Using Deep Learning tool available in the GeoAI toolbox in ArcGIS Pro. Follow the steps below to use the model for entity recognition.
Extract Entities
Complete the following steps to extract entities from text:
- Download the HF Entity Recognition pretrained model from ArcGIS Living Atlas of the World.
- Browse to Tools on the Analysis tab.
- Click the Toolboxes tab in the Geoprocessing pane, select GeoAI Tools, and browse to the Extract Entities Using Deep Learning tool under Text Analysis.
- Set the variables on the Parameters tab as follows:
- Input Folder or Table—The input point, line, or polygon feature class, or table containing the text to be processed. You can also use a .csv file here.
- Text Field—The text field within the input table that contains the text to be processed.
- Output Table—The table in which the output of entity extraction will be saved. This table will be saved in a geodatabase by default.
- Input Model Definition File—Select the model .dlpk file.
- Model Arguments—Change the values of the arguments if required.
- huggingface_id—The model ID of a pretrained entity recognition model hosted on huggingface.co.
To find entity recognition models on the Hugging Face model hub, select the Token Classification tag under the Tasks section within the Natural Language Processing category.
The model ID follows the format {username}/{repository} and is displayed at the top of the model's page.
Only models that include a config.json file are supported. You can verify that the file is present under the Files and versions tab of the model page.
- Batch Size—The number of rows to be processed at once. Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used.
- Set the variables on the Environments tab as follows:
- Processor Type—Select CPU or GPU.
It is recommended that you select GPU, if available, and set GPU ID to specify the GPU to be used.
- Click Run.
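Before running the tool, it can help to sanity-check the value you plan to enter for huggingface_id. The following is a minimal sketch, not part of the tool itself: the helper names `is_valid_model_id` and `config_url` are hypothetical, the allowed character set is an assumption, and the URL assumes the model repository's default branch is named `main`. The model ID `dslim/bert-base-NER` is used only as an example of the {username}/{repository} format.

```python
import re

# Assumed pattern for {username}/{repository}: each part starts with a letter
# or digit and may contain letters, digits, periods, underscores, and hyphens.
# This is an illustrative check, not an official Hugging Face rule.
_MODEL_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*")


def is_valid_model_id(model_id: str) -> bool:
    """Return True if model_id looks like {username}/{repository}."""
    return _MODEL_ID.fullmatch(model_id) is not None


def config_url(model_id: str) -> str:
    """Build the URL where the model's config.json would be expected.

    Assumes the repository's default branch is 'main'.
    """
    if not is_valid_model_id(model_id):
        raise ValueError(f"Expected {{username}}/{{repository}}, got: {model_id!r}")
    return f"https://huggingface.co/{model_id}/resolve/main/config.json"


# Example model ID, used here only to illustrate the format.
example_id = "dslim/bert-base-NER"
print(is_valid_model_id(example_id))
print(config_url(example_id))
```

Opening the printed URL in a browser is one way to confirm that config.json exists; the Files and versions tab on the model page shows the same information.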
When processing finishes, the output table is added to the map. To view the output, open the attribute table. The input_str column contains the text data processed by the model, while the remaining columns contain the corresponding entities extracted from the text.
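If the output table is exported to a .csv file, the layout described above (an input_str column plus one column per entity type) can be summarized with a few lines of Python. This is a rough sketch under stated assumptions: the function name `entities_by_row` is hypothetical, the entity column names `PER` and `LOC` and the sample rows are made up for illustration, and the actual columns depend on the model you select. System fields such as an object ID column would need to be listed in `skip_columns`.

```python
import csv
import io


def entities_by_row(rows, text_column="input_str", skip_columns=()):
    """Pair each input text with its non-empty entity columns."""
    results = []
    for row in rows:
        entities = {
            name: value
            for name, value in row.items()
            if name != text_column
            and name not in skip_columns
            and value not in (None, "")
        }
        results.append((row[text_column], entities))
    return results


# Made-up sample standing in for an exported output table.
sample_csv = io.StringIO(
    "input_str,PER,LOC\n"
    "Ada Lovelace was born in London.,Ada Lovelace,London\n"
    "No entities here.,,\n"
)

rows = list(csv.DictReader(sample_csv))
summary = entities_by_row(rows)

for text, entities in summary:
    print(text, entities)
```

Rows for which the model found no entities yield an empty dictionary, which makes them easy to filter out.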