Smart assistants

Smart assistants augment field data collection workflows by turning the mobile device camera into a tool that can recognize objects relevant to the workflow at hand. This technology can protect people's privacy by helping users redact personally identifiable information. It can also make data collection more efficient and less error-prone. With smart assistants, users have the final say on the modifications made to images and the data that is submitted.

Smart assistants can be configured for image questions in surveys. You can use smart assistants in the Survey123 field app in the following three ways. Each assistant works with photos taken with the camera in the app and with photos selected in the app from the file system.

  • Smart attributes—Perform image classification or object detection and display a real-time preview of attributes during image capture. On capture, the attributes are stored in the image's EXIF metadata and can be extracted and used to populate other questions in the survey.
  • Smart annotation—Use object detection to generate annotation graphics on an image that a user can edit using annotation tools.
  • Smart redaction—Use object detection to generate bounding boxes around target objects; then apply effects to redact those regions.

Smart attributes

Smart attributes allow you to associate an image question with an object detection or image classification model and extract values based on the objects the model detects in the image. By using smart attributes to help analyze the image, you can automate the identification and categorization of its subjects and reduce the risk of errors or inconsistencies in the analysis.

For example, you can take a photo of a road and use smart attributes to identify the different types of manholes in it. You can then use the pulldata("@json") function to read the detection results from the image's EXIF metadata.

Detection results will vary depending on the type of model. Object detection models show all of the items identified with bounding boxes in the camera preview. Image classification models show the class identified at the bottom of the image preview. Values are written to the image EXIF metadata when the image is captured.
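To illustrate how detection results stored as JSON can be turned into values for other questions, the following is a minimal Python sketch. The payload shape (a `classes` list with `name` and `score` fields), the sample values, and the `summarize_detections` helper are assumptions made for illustration only; they are not the documented Survey123 metadata format. In a survey, the equivalent lookup is done with the pulldata("@json") function.

```python
import json
from collections import Counter

# Hypothetical payload: the metadata written by the field app may differ.
SAMPLE_METADATA = """
{"classes": [
  {"name": "round manhole", "score": 0.91},
  {"name": "square manhole", "score": 0.78},
  {"name": "round manhole", "score": 0.42}
]}
"""


def summarize_detections(metadata_json, min_score=0.5):
    """Count detections per class name, ignoring low-confidence results."""
    detections = json.loads(metadata_json).get("classes", [])
    names = [d["name"] for d in detections if d.get("score", 0.0) >= min_score]
    return dict(Counter(names))


if __name__ == "__main__":
    # Only detections scoring at least 0.5 are counted by default.
    print(summarize_detections(SAMPLE_METADATA))
```

A count per class like this could populate an integer question for each manhole type, while the score threshold keeps uncertain detections out of the submitted data.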

For more information, see Add smart attributes to a survey.

Smart annotation

Smart annotation augments the image annotation tools in Survey123 by automatically annotating objects detected in the image. Detection results are added to the annotation canvas after you take a photo or add an image from device storage. You can edit the bounding boxes and labels in the canvas and add further annotations. For more information about the annotation canvas, see Draw and annotate. You can also create custom annotation palettes to apply specific symbology for each class in an object detection model. For more information, see Draw and annotate palettes.

For example, you can use smart annotation on a street scene to label and annotate the vehicles in the image. This requires an object detection model trained to detect different types of vehicles. The annotated image can be useful for applications such as traffic analysis, parking management, and urban planning. By annotating the image automatically, smart annotation saves time and effort compared to manual annotation and reduces the risk of errors or inconsistencies in labeling.

For more information, see Add smart annotation to a survey.

Smart redaction

Redaction allows users to obscure sensitive information in images, such as people's faces. Survey123 supports manual redaction, in which users select regions of an image before the image is saved and submitted with the survey. Alternatively, you can use smart redaction, which uses object detection to generate bounding boxes around target objects and applies a redaction effect to those regions.

Redaction effects include blur, blockout, pixelate, and symbol.
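As a rough illustration of what applying an effect to a bounding box means, the sketch below implements simplified blockout and pixelate effects on a tiny grayscale image stored as a list of rows. The function names, the `(left, top, right, bottom)` box convention, and the block size are assumptions chosen for this example; this is not how the field app implements redaction.

```python
def blockout(image, box, fill=0):
    """Return a copy of the image with the box region set to one value."""
    left, top, right, bottom = box  # right and bottom are exclusive
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = fill
    return out


def pixelate(image, box, block=2):
    """Return a copy with the box region replaced by per-block averages."""
    left, top, right, bottom = box  # right and bottom are exclusive
    out = [row[:] for row in image]
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            # Clip each block so it never extends outside the box.
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            cells = [image[y][x] for y in ys for x in xs]
            mean = sum(cells) // len(cells)
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


if __name__ == "__main__":
    image = [
        [0, 10, 20, 30],
        [40, 50, 60, 70],
        [80, 90, 100, 110],
        [120, 130, 140, 150],
    ]
    face_box = (1, 1, 3, 3)  # hypothetical detection result
    for row in pixelate(image, face_box):
        print(row)
```

Both functions return a modified copy and leave the input untouched, which mirrors the idea that users review a redaction before the image is saved.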

For more information, see Add smart redaction to a survey.

Machine learning

Smart assistants in the Survey123 field app use machine learning models that are trained to detect patterns in images. Because models are downloaded with surveys or accessed through built-in APIs, smart assistants work when your device is online or offline, and all image processing happens on the device.


Survey123 allows you to use APIs built into the field app or your device's operating system that provide access to third-party object detection models trained using deep learning. You can also train your own models. You are responsible for the use of these models. When using Survey123, it is your responsibility to review outputs and, in the case of image redaction, manually correct any information that may be missed by the automatic redaction.

You can use this technology in Survey123 in the following ways:

  • Supply a TensorFlow Lite model in the survey's media folder. This method is supported on Android, iOS, and Windows for all smart assistants. You can create TensorFlow Lite models to detect object classes for your specific use case. Alternatively, download the Common Object Detection deep learning package to use as a starting point. For more information, see the Models section below.
  • For smart redaction only, you can use built-in APIs to redact faces in images. With this method, you don't need to supply a model file. Survey123 supports two built-in technologies:
    • Google ML Kit is built into the Survey123 field app and supported on Android and iOS. Google ML Kit provides the fastest and most accurate smart redaction experience in the field app. Users must enable enhanced camera features in the field app to use this technology. To enable enhanced camera features, tap Settings > Privacy and Security > Enable enhanced camera features.
    • For iOS, you can enable the Apple built-in Vision API for face detection by specifying the engine=vision property with the redaction parameter. This API is built into the iOS operating system.
  • You can use built-in APIs to increase the accuracy and performance of barcode scanning on Android and iOS. This applies to barcode questions in surveys and the barcode scanner in the survey gallery. For more information, see Barcodes.

Enhanced camera features use Google ML Kit. When you enable enhanced camera features in the field app, usage statistics may be sent to Google to measure performance, debug, maintain and improve products, and detect misuse or abuse. Image processing happens entirely on the device and no images are sent to Google servers. For more information, see Google ML Kit Terms & Privacy on the Google developers website.

On iOS, barcode scanning and surveys that include the engine=vision property for smart redaction automatically use Apple built-in Vision APIs. These APIs may send analytics data to Apple. Analytics data may include details about hardware and operating system specifications, performance statistics, and data about how you use your devices and applications. You can review this information in the privacy and security settings on your iOS device. This information is used to help Apple improve and develop its products and services. None of the collected information identifies you personally. Personal data is either not logged, subject to privacy-preserving techniques such as differential privacy, or removed from any reports before they are sent to Apple. For more information, see Device Analytics & Privacy and Data & Privacy on the Apple website.

For more information, see Prepare smart assistants.

Models

The Survey123 field app supports TensorFlow Lite models in .tflite files. Models must be accompanied by an .emd file or .txt file containing information about the model type and the object classes it is trained to detect, including the labels for each class. The Survey123 field app supports two types of machine learning models:

  • Object detection—An object detection model is trained to detect the presence and location of multiple classes of objects in an image, each with an associated label. For more information, see Object detection.
  • Image classification—An image classification model is trained to recognize various classes of images, each with an associated label. The output is a probability of the image representing one of the labels in the model. For more information, see Image classification. These models are best suited to applications in which there is one target object in each image.
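The difference between the two model types can be sketched in Python as follows. The raw output shapes used here (a list of scores for classification, and `(class_id, score, box)` tuples for detection) are simplifying assumptions; the tensors produced by a real TensorFlow Lite model vary by model. The `labels` list stands in for the class labels that accompany the model in its .emd or .txt file.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Image classification: one (label, probability) for the whole image."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def detect(raw_detections, labels, min_score=0.5):
    """Object detection: zero or more (label, score, box) results per image."""
    return [
        (labels[class_id], score, box)
        for class_id, score, box in raw_detections
        if score >= min_score
    ]


if __name__ == "__main__":
    labels = ["stop sign", "yield sign", "speed limit"]  # hypothetical classes
    print(classify([2.0, 0.5, 0.1], labels))
    print(detect([(0, 0.9, (10, 10, 50, 50)), (1, 0.3, (0, 0, 5, 5))], labels))
```

Classification returns a single answer for the image, which is why it suits photos with one target object, while detection returns a located result for each object that clears the score threshold.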

The Common Object Detection deep learning package in ArcGIS Living Atlas of the World is a TensorFlow Lite object detection model trained on the Common Objects in Context (COCO) dataset. It can detect 80 common objects, including people, animals, food items, vehicles, and household items. Although this model is not recommended for production surveys, it is useful for demonstrations and for getting started with smart assistants. For more information, see Introduction to the model.

Model creation

You can create image classification and object detection models to suit your requirements. Models are trained on a collection of images that are labeled with bounding boxes to identify the location of each object in the image. Training a model can be time- and resource-intensive. The accuracy and performance of a model depend on the number of images used to train it and the suitability of those images.

You can create image classification models using ArcGIS tools. Follow the steps in the Train a model to identify street signs tutorial to create an image classification model. The tutorial demonstrates how to use Survey123 to capture a representative collection of training images, train a model using ArcGIS Notebooks, and use the model in the Survey123 field app to classify new images.