Transform Text Using Deep Learning (GeoAI)

Summary

Runs a trained sequence-to-sequence model on a text field in a feature class or table and adds a new field to that feature class or table containing the converted, transformed, or translated text.

Learn more about how Text Transformation works

Usage

  • This tool requires deep learning frameworks be installed. To set up your machine to use deep learning frameworks in ArcGIS AllSource, see Install deep learning frameworks for ArcGIS.

  • This tool requires a model definition file containing trained model information. The model can be trained using the Train Text Transformation Model tool. The Input Model Definition File parameter value can be an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk). The model files can be stored locally or hosted on ArcGIS Living Atlas of the World.

  • This tool supports models trained using transformer-based backbones and the Mistral backbone. To install the Mistral backbone, see ArcGIS Mistral Backbone.

  • This tool supports the use of third-party language models created using the model extensibility feature. The model extensibility feature enables text transformation tasks using a custom deep learning model file (.dlpk) that is not created using the Train Text Transformation Model tool. To learn more about creating a custom deep learning (.dlpk) model file, see Use third party language models with ArcGIS.

  • This tool can run on CPU or GPU; however, deep learning is computationally intensive and a GPU is recommended. To run this tool using GPU, set the Processor Type environment to GPU. If you have more than one GPU, specify the GPU ID environment instead.

  • For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.
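The processor guidance above can be sketched in Python. This is a minimal sketch, not part of the tool: `configure_processor` is a hypothetical helper, and it assumes the Processor Type and GPU ID environments are exposed as the `processorType` and `gpuId` attributes of `arcpy.env`. A plain namespace object stands in for `arcpy.env` so the sketch runs without ArcGIS installed.

```python
from types import SimpleNamespace


def configure_processor(env, use_gpu=True, gpu_id=None):
    """Set processor environments on an arcpy.env-like object (hypothetical helper)."""
    if not use_gpu:
        # CPU works, but deep learning inference is slower without a GPU.
        env.processorType = "CPU"
    elif gpu_id is None:
        # Single GPU: set only the Processor Type environment.
        env.processorType = "GPU"
    else:
        # More than one GPU: per the usage note, specify the GPU ID instead.
        env.gpuId = gpu_id
    return env


# Stand-in for arcpy.env; in ArcGIS, pass arcpy.env itself.
demo_env = SimpleNamespace()
configure_processor(demo_env, use_gpu=True, gpu_id=1)
print(vars(demo_env))
```

In a real session the call would be `configure_processor(arcpy.env, gpu_id=1)` before running the tool.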

Parameters

Label | Explanation | Data Type
Input Table

The input point, line, or polygon feature class, or table containing the text that will be transformed.

Feature Layer; Table View
Text Field

A text field in the input feature class or table that contains the text that will be transformed.

Field
Input Model Definition File

The trained model that will be used for text transformation. The model definition file can be either an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk) that is stored locally or hosted on ArcGIS Living Atlas (.dlpk_remote).

To use a .dlpk file that was trained using the Mistral backbone, the backbone must be installed before the model is used. To install the Mistral backbone, see ArcGIS Mistral Backbone.

The .dlpk file can also be a third-party language model.

Caution:

A third-party language model .dlpk file can potentially contain harmful code. Use these models only if you trust their source.

File
Result Field
(Optional)

The name of the field that will contain the transformed text in the output feature class or table. The default field name is Result.

String
Model Arguments
(Optional)

Additional arguments that will be used by the model while performing inference. The supported model argument is sequence_length, which will be used to adjust the model's output.

Note:

When using a third-party language model, the model arguments will be updated according to the parameters specified in the .dlpk file. To learn more about defining model arguments, see the getParameterInfo section in Use third party language models with ArcGIS.

Value Table
Batch Size
(Optional)

The number of text samples that will be processed at one time during inference. The default value is 4.

Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size.

Double
Minimum Sequence Length
(Optional)

The minimum number of characters for the output text string. The default value is 20.

Double
Maximum Sequence Length
(Optional)

The maximum number of characters for the output text string. The default value is 50.

Double

Derived Output

Label | Explanation | Data Type
Updated Table

The output point, line, or polygon feature class, or table containing the transformed text derived from the input data.

Table View; Feature Layer
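When scripting the tool, it can help to check the Input Model Definition File value before starting a long run. The following is a minimal sketch under stated assumptions: `model_file_kind` is a hypothetical helper, it only inspects the file extension against the three types named above (.emd, .dlpk, .dlpk_remote), and it does not verify that the file exists or that it comes from a trusted source.

```python
from pathlib import PureWindowsPath

# Extensions named in the Input Model Definition File description.
SUPPORTED_MODEL_SUFFIXES = {".emd", ".dlpk", ".dlpk_remote"}


def model_file_kind(path):
    """Return the model file type ('emd', 'dlpk', or 'dlpk_remote'), or raise ValueError."""
    # PureWindowsPath parses both backslash and forward-slash separators on any OS.
    suffix = PureWindowsPath(path).suffix.lower()
    if suffix not in SUPPORTED_MODEL_SUFFIXES:
        raise ValueError(f"Unsupported model definition file: {path!r}")
    return suffix.lstrip(".")


print(model_file_kind("c:\\translatedata\\Seq2Seq.emd"))
```

An extension check says nothing about the contents of the file, so the caution about third-party .dlpk files still applies.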

arcpy.geoai.TransformTextUsingDeepLearning(in_table, text_field, in_model_definition_file, {result_field}, {model_arguments}, {batch_size}, {minimum_sequence_length}, {maximum_sequence_length})
Name | Explanation | Data Type
in_table

The input point, line, or polygon feature class, or table containing the text that will be transformed.

Feature Layer; Table View
text_field

A text field in the input feature class or table that contains the text that will be transformed.

Field
in_model_definition_file

The trained model that will be used for text transformation. The model definition file can be either an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk) that is stored locally or hosted on ArcGIS Living Atlas (.dlpk_remote).

To use a .dlpk file that was trained using the Mistral backbone, the backbone must be installed before the model is used. To install the Mistral backbone, see ArcGIS Mistral Backbone.

The .dlpk file can also be a third-party language model.

Caution:

A third-party language model .dlpk file can potentially contain harmful code. Use these models only if you trust their source.

File
result_field
(Optional)

The name of the field that will contain the transformed text in the output feature class or table. The default field name is Result.

String
model_arguments
[model_arguments,...]
(Optional)

Additional arguments that will be used by the model while performing inference. The supported model argument is sequence_length, which will be used to adjust the model's output.

Note:

When using a third-party language model, the model arguments will be updated according to the parameters specified in the .dlpk file. To learn more about defining model arguments, see the getParameterInfo section in Use third party language models with ArcGIS.

Value Table
batch_size
(Optional)

The number of text samples that will be processed at one time during inference. The default value is 4.

Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size.

Double
minimum_sequence_length
(Optional)

The minimum number of characters for the output text string. The default value is 20.

Double
maximum_sequence_length
(Optional)

The maximum number of characters for the output text string. The default value is 50.

Double

Derived Output

Name | Explanation | Data Type
updated_table

The output point, line, or polygon feature class, or table containing the transformed text derived from the input data.

Table View; Feature Layer
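The optional parameters above can be collected into keyword arguments before calling the function. This is a minimal sketch: `build_transform_kwargs` is a hypothetical helper, the keyword names are taken from the syntax line, and the defaults are the documented ones (Result, 4, 20, 50). Two points are assumptions rather than documented behavior: passing `model_arguments` as a list of `[name, value]` rows, and the sanity checks on batch size and sequence lengths.

```python
def build_transform_kwargs(in_table, text_field, in_model_definition_file,
                           result_field="Result", sequence_length=None,
                           batch_size=4, minimum_sequence_length=20,
                           maximum_sequence_length=50):
    """Assemble keyword arguments for TransformTextUsingDeepLearning (hypothetical helper)."""
    # Sanity checks added for this sketch; the tool's own validation is not documented here.
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if minimum_sequence_length > maximum_sequence_length:
        raise ValueError("minimum_sequence_length exceeds maximum_sequence_length")

    kwargs = {
        "in_table": in_table,
        "text_field": text_field,
        "in_model_definition_file": in_model_definition_file,
        "result_field": result_field,
        "batch_size": batch_size,
        "minimum_sequence_length": minimum_sequence_length,
        "maximum_sequence_length": maximum_sequence_length,
    }
    if sequence_length is not None:
        # Assumption: the value table is passed as rows of [argument name, value].
        kwargs["model_arguments"] = [["sequence_length", str(sequence_length)]]
    return kwargs


kwargs = build_transform_kwargs(
    "translationdata", "EnglishText", "c:\\translatedata\\Seq2Seq.emd",
    result_field="GermanText", sequence_length=512, maximum_sequence_length=100,
)
print(kwargs)
```

With ArcGIS available, the result would be passed as `arcpy.geoai.TransformTextUsingDeepLearning(**kwargs)`. The field name `GermanText` and the value 512 are illustrative only.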

Code sample

TransformTextUsingDeepLearning example (Python window)

The following Python window script demonstrates how to use the TransformTextUsingDeepLearning function.

# Name: TransformText.py
# Description: Translate text from English to German
#
# Requirements: ArcGIS Pro Advanced license

# Import system modules
import arcpy

# Set the workspace that contains the input table
arcpy.env.workspace = "C:/textanalysisexamples/data"

# Set local variables
# The table name is resolved relative to the workspace set above
in_table = "translationdata"
pretrained_model_path_emd = "c:\\translatedata\\Seq2Seq.emd"

# Run Transform Text Using Deep Learning
# The translated text is written to the default Result field
arcpy.geoai.TransformTextUsingDeepLearning(in_table, "EnglishText", pretrained_model_path_emd)
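The batch size guidance (use a smaller batch size if an out of memory error occurs) can be automated with a retry loop. This is a minimal sketch, not documented tool behavior: `run_with_batch_fallback` is a hypothetical helper, and how the tool reports an out of memory condition is not specified here, so the default check on the error message is an assumption that may need adjusting. A fake runner stands in for the tool so the sketch runs without ArcGIS.

```python
def run_with_batch_fallback(run, batch_size=4, is_out_of_memory=None):
    """Call run(batch_size), halving the batch size after each out-of-memory failure."""
    if is_out_of_memory is None:
        # Assumption: the error text mentions "out of memory"; adapt to the real message.
        is_out_of_memory = lambda exc: "out of memory" in str(exc).lower()
    while True:
        try:
            return run(batch_size), batch_size
        except Exception as exc:
            # Give up on unrelated errors, or when the batch cannot shrink further.
            if batch_size <= 1 or not is_out_of_memory(exc):
                raise
            batch_size = max(1, batch_size // 2)


def fake_tool(batch_size):
    """Stand-in for the geoprocessing call; fails until the batch size is 1."""
    if batch_size > 1:
        raise RuntimeError("CUDA out of memory")
    return "ok"


result, used_batch_size = run_with_batch_fallback(fake_tool, batch_size=4)
print(result, used_batch_size)
```

With ArcGIS available, `run` could be `lambda bs: arcpy.geoai.TransformTextUsingDeepLearning(in_table, "EnglishText", pretrained_model_path_emd, batch_size=bs)`.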

Environments