Introduction to the model

The Vision Language Context-Based Classification pretrained model, available on ArcGIS Living Atlas of the World, is a deep learning model for classifying images.

This deep learning package (DLPK) acts as a bridge between ArcGIS Pro and OpenAI's vision language models. OpenAI's models are renowned for their advanced capabilities in natural language processing and understanding, as well as their ability to interpret and generate human-like text. The integration of these models into a DLPK enhances their utility by enabling them to process images and perform zero-shot classification of objects in imagery.

Use this deep learning package to apply OpenAI's large vision language models to object classification on images and rasters in ArcGIS Pro. The DLPK is not restricted to predefined classes; you specify custom class labels when you run the tool. This flexibility makes it easier for professionals in fields such as environmental science, urban planning, and remote sensing to extract meaningful insights from their visual datasets.
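To illustrate how run-time class labels can drive zero-shot classification, here is a minimal Python sketch. It is not the DLPK's actual implementation: the function names `build_prompt` and `match_label`, the prompt wording, and the example labels are all hypothetical, and no call to OpenAI or ArcGIS is made.

```python
from typing import Optional, Sequence


def build_prompt(class_labels: Sequence[str]) -> str:
    """Assemble a zero-shot instruction from user-supplied labels (hypothetical wording)."""
    labels = [label.strip() for label in class_labels if label.strip()]
    if not labels:
        raise ValueError("At least one class label is required.")
    options = ", ".join(labels)
    return (
        "Classify the image into exactly one of these classes: "
        f"{options}. Reply with the class name only."
    )


def match_label(reply: str, class_labels: Sequence[str]) -> Optional[str]:
    """Map a free-text reply back to one of the requested labels, or None if no match."""
    cleaned = reply.strip().strip(".").lower()
    for label in class_labels:
        if label.strip().lower() == cleaned:
            return label.strip()
    return None


# Hypothetical labels chosen by the user when running the tool.
labels = ["residential", "commercial", "industrial"]
print(build_prompt(labels))
print(match_label(" Commercial. ", labels))
```

Because the labels are plain text supplied at run time, changing the classification scheme means changing the list, not retraining a model.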

License requirements

This workflow has the following license requirements:

  • ArcGIS Desktop: ArcGIS Image Analyst extension for ArcGIS Pro
  • ArcGIS Enterprise: ArcGIS Image Server with raster analytics configured
  • ArcGIS Online: ArcGIS Pro or Professional Plus user type

Model overview

This model has the following characteristics:

  • Input—8-bit RGB imagery.
  • Output—Feature class with information about classification of the image.
  • Compute—This workflow can run on CPU or GPU.
  • Applicable geographies—This model is expected to work well globally.
  • Architecture—The implementation uses OpenAI's vision language models.
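The input requirement above can be checked before running the model. The following is a rough sketch in plain Python, assuming pixels have already been read into memory as tuples of integer band values; the helper name `is_8bit_rgb` is hypothetical and is not part of ArcGIS.

```python
from typing import Sequence

Pixel = Sequence[int]


def is_8bit_rgb(pixels: Sequence[Pixel]) -> bool:
    """Return True if every pixel has exactly three integer bands in the range 0-255."""
    return all(
        len(pixel) == 3
        and all(isinstance(value, int) and 0 <= value <= 255 for value in pixel)
        for pixel in pixels
    )


print(is_8bit_rgb([(12, 200, 255), (0, 0, 0)]))  # three bands, values within 8-bit range
print(is_8bit_rgb([(12, 200, 255, 90)]))         # four bands, so not plain RGB
print(is_8bit_rgb([(1023, 200, 40)]))            # a value exceeds the 8-bit range
```

Imagery that fails a check like this (for example, 16-bit or multispectral rasters) would likely need to be converted to 8-bit, three-band RGB first.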

Access and download the model

Download the Vision Language Context-Based Classification pretrained model from ArcGIS Living Atlas of the World. Alternatively, access the model directly from ArcGIS Pro, or consume it in ArcGIS Image for ArcGIS Online.

  1. Browse to ArcGIS Living Atlas of the World.
  2. Sign in with your ArcGIS Online credentials.
  3. Search for Vision Language Context-Based Classification and open the item page from the search results.
  4. Click the Download button to download the model.

    You can use the downloaded .dlpk file directly in ArcGIS Pro.
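As a quick sanity check on the download, the sketch below lists any Esri model definition (.emd) entries in the package. It assumes the .dlpk is a zip-compatible archive that contains an .emd file, which this page does not state; the function name `list_model_definitions` is hypothetical.

```python
import zipfile
from typing import List


def list_model_definitions(dlpk_path: str) -> List[str]:
    """Return the names of .emd entries inside a zip-compatible .dlpk archive.

    Assumption: the package can be read with Python's zipfile module.
    """
    if not zipfile.is_zipfile(dlpk_path):
        raise ValueError(f"{dlpk_path} is not a readable zip-compatible archive")
    with zipfile.ZipFile(dlpk_path) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(".emd")]
```

Calling `list_model_definitions` with the path to the downloaded file should return at least one entry if the download is intact and the assumption holds. An empty list or a `ValueError` suggests downloading the package again.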

Release notes

The following are the release notes:

Date            Description
December 2024   First release of Vision Language Context-Based Classification