Real-time analysis

Real-time analytics process data ingested through a feed, analyzing each message as it is received. Real-time analytics are used for transforming data, geofencing, and incident detection. Each analytic ends with one or more outputs, such as storing data in a feature layer or sending an email alert.

Examples of real-time analysis

The following are example uses of real-time analysis:

  • As an emergency operations manager, you can track and archive the current locations of your field crews in real time, send alerts if a crew is inside a restricted zone, and calculate the distance of the field crews from their assigned base of operations.
  • As a supply chain analyst at an oil and gas company, you can connect to an automatic identification system (AIS) data stream to monitor your vessels, calculate expected arrival information, and understand when vessels are either inside or outside areas of interest.
  • As a logistics analyst at a transportation company, you can monitor vehicle movements in real time to visualize and archive motion statistics such as speed, distance, and time spent idling. You can store all of these locations and attribute observations to a feature layer.
  • As an environmental scientist managing a large number of sensors, you can archive observations for later processing.

Components of a real-time analytic

There are four components of a real-time analytic: feeds, sources, tools, and outputs. These components are described below:

  • Feeds:
    • A feed is a real-time stream of data coming into ArcGIS Velocity. Feeds typically connect to external sources of observational data such as Internet of Things (IoT) platforms, message brokers, or third-party APIs. Feeds parse incoming tabular, point, polyline, or polygon data and expose it for analysis and visualization.
  • Sources:
  • Tools:
    • Tools process or analyze events coming in from feeds. A real-time analytic can include zero, one, or multiple tools, depending on the use case.
    • Tools can be chained so that the output of one tool becomes the input of another tool.
    • Not all tools available in big data analytics are available in real-time analytics. Some tools, such as Find Hot Spots, analyze an entire set of data at once. Real-time analytics, by contrast, operate on each incoming event as it is received.
  • Outputs:
    • An output type defines what should be done with each event as it is processed by a real-time analytic.
    • Output options include storing features to a new or existing feature layer, sending an email, sending messages to Kafka or RabbitMQ, and more. For more information, refer to the fundamentals of analytic outputs.
    • The events received from a tool or feed can be sent to multiple outputs.
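The relationship between feeds, tools, and outputs described above can be illustrated with a minimal Python sketch. This is not ArcGIS Velocity code or its API; `run_analytic`, `add_speed_kph`, `only_fast`, and the event fields are hypothetical names chosen for illustration. It only shows the shape of the flow: each event is handled as it arrives, passes through a chain of tools, and is then sent to every output.

```python
from typing import Callable, Iterable, Optional

Event = dict
# A tool takes one event and returns a (possibly modified) event,
# or None to drop it (as a filter-style tool would).
Tool = Callable[[Event], Optional[Event]]
Output = Callable[[Event], None]


def run_analytic(feed: Iterable[Event], tools: list[Tool], outputs: list[Output]) -> None:
    """Push each event from the feed through the tool chain, then to every output."""
    for event in feed:
        current: Optional[Event] = event
        for tool in tools:
            current = tool(current)
            if current is None:  # the event was filtered out
                break
        if current is not None:
            for output in outputs:
                output(current)


# Hypothetical tools: one calculates a field, one filters on it.
def add_speed_kph(event: Event) -> Event:
    return {**event, "speed_kph": event["speed_mps"] * 3.6}


def only_fast(event: Event) -> Optional[Event]:
    return event if event["speed_kph"] > 50 else None


archived: list[Event] = []  # stands in for a feature layer output
alerts: list[str] = []      # stands in for an email output

run_analytic(
    feed=[{"id": "truck-1", "speed_mps": 20.0}, {"id": "truck-2", "speed_mps": 5.0}],
    tools=[add_speed_kph, only_fast],
    outputs=[archived.append, lambda e: alerts.append(f"{e['id']} is speeding")],
)
```

In this sketch only `truck-1` reaches the outputs, and it reaches both of them. Passing an empty `tools` list sends every event straight from the feed to the outputs.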

Stateless vs. stateful processing

Most tools in real-time analytics function in a stateless manner, meaning they operate on each observation received and do not maintain in-memory records of any previous observations. However, several of the available tools function in a stateful manner, on tracks rather than on individual observations.

Stateful tools gather multiple consecutive observations per track to compare spatial conditions, attribute conditions, or both in each track and detect changes. When an observation is received for a track, it is added to a small cache of observations for that track. This is used, for example, to detect whether the track has entered or exited a geofence by comparing the most recent observation to the previous one.
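As a rough illustration of the per-track cache idea, the following Python sketch keeps the last two observations for each track and reports when an attribute value changes. The class name `TrackChangeDetector`, the `status` field, and the cache size are assumptions made for this example; they do not describe how any specific tool is implemented.

```python
from collections import defaultdict, deque


class TrackChangeDetector:
    """Keeps a small cache of observations per track and reports attribute changes."""

    def __init__(self, field: str, cache_size: int = 2):
        self.field = field
        # One bounded cache per track ID (hypothetical design).
        self._cache = defaultdict(lambda: deque(maxlen=cache_size))

    def observe(self, track_id: str, observation: dict):
        """Add an observation; return (old, new) if the field changed, else None."""
        cache = self._cache[track_id]
        previous = cache[-1] if cache else None
        cache.append(observation)
        if previous is not None and previous[self.field] != observation[self.field]:
            return previous[self.field], observation[self.field]
        return None


detector = TrackChangeDetector(field="status")
results = [
    detector.observe("crew-7", {"status": "en_route"}),  # first observation: nothing to compare
    detector.observe("crew-7", {"status": "en_route"}),  # unchanged
    detector.observe("crew-9", {"status": "on_site"}),   # different track, separate cache
    detector.observe("crew-7", {"status": "on_site"}),   # changed for crew-7
]
```

A stateless tool, by contrast, would need no `_cache` at all, because it never looks at a previous observation.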

The available stateful tools include the following:

Stateful tools cannot maintain an indefinite number of observations in memory, so to avoid over-consumption of memory resources, the cache for each track is periodically purged of observations that are older than a specified age.

Some of the stateful tools allow you to specify a purging duration using the Target Time Window parameter. When purging happens, observations older than the value specified in the Target Time Window parameter are purged from memory. Purging only affects observations in memory retained for stateful processing purposes. Purging does not affect any observations sent to outputs and does not delete the data.

The Target Time Window parameter should be set to a value equal to or greater than the longest anticipated period of time between observations for any single track. For example, if vehicles report their locations every five minutes and you are using the Filter by Geometry tool to detect when each vehicle enters a certain area, you would set the Target Time Window value on the filter to be slightly more than five minutes to ensure multiple observations are received before being purged. Setting it to less than five minutes results in a cache containing only one observation per track, eliminating the ability to determine that a vehicle's spatial relationship to the geofence has changed from outside to inside. The Calculate Motion Statistics, Detect Incidents, Filter by Geometry, and Join Features tools all have the Target Time Window parameter.
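The effect of the Target Time Window value in the five-minute example can be sketched in Python. This is a simplified model under stated assumptions: the class `TimeWindowCache` is hypothetical, and it purges on every arrival rather than periodically, which is enough to show why a window shorter than the reporting interval prevents an enter event from being detected.

```python
from collections import defaultdict


class TimeWindowCache:
    """Per-track cache that purges observations older than a target time window."""

    def __init__(self, target_time_window_s: float):
        self.window = target_time_window_s
        # track_id -> list of (timestamp_s, inside_geofence)
        self._tracks = defaultdict(list)

    def add(self, track_id: str, timestamp_s: float, inside: bool):
        """Purge stale observations, add the new one, return 'enter', 'exit', or None."""
        obs = self._tracks[track_id]
        # Simplification: purge on arrival, relative to the new observation's time.
        obs[:] = [(t, i) for (t, i) in obs if timestamp_s - t <= self.window]
        previous = obs[-1] if obs else None
        obs.append((timestamp_s, inside))
        if previous is None or previous[1] == inside:
            return None
        return "enter" if inside else "exit"


REPORT_INTERVAL_S = 300  # vehicles report every five minutes

# Window slightly longer than the reporting interval: the previous
# observation is still cached, so the change is detected.
long_window = TimeWindowCache(target_time_window_s=330)
long_window.add("vehicle-1", 0, inside=False)
detected = long_window.add("vehicle-1", REPORT_INTERVAL_S, inside=True)

# Window shorter than the reporting interval: the previous observation
# has been purged, so there is nothing to compare against.
short_window = TimeWindowCache(target_time_window_s=240)
short_window.add("vehicle-1", 0, inside=False)
missed = short_window.add("vehicle-1", REPORT_INTERVAL_S, inside=True)
```

Here `detected` is `"enter"` and `missed` is `None`. Purging only touches the in-memory lists in this model, which mirrors the point that purging does not affect observations already sent to outputs.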

Geofencing

Geofencing is a quintessential form of real-time spatial analysis in which features (often track points) are assessed against areas of interest (often polygon areas). Most commonly, point-based observations are analyzed to determine whether they have entered or exited a virtual perimeter.

In several real-time and big data analytic tools, geofencing can be performed to identify certain spatial relationships that can occur between features in a target feed or data source and a set of spatial join features, or geofences. The features used as geofences must be connected to the join port of the geofencing tool. Geofences can be points, lines, or polygons. The spatial relationships depend on the geometry type of the input target and join data.
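For the most common case, point observations tested against polygon geofences, the underlying spatial check can be sketched with a standard ray-casting point-in-polygon test. This is a generic illustration in planar coordinates, not the method used by any particular tool; the function names and the sample geofences are hypothetical, and points that fall exactly on a boundary are not treated specially.

```python
def point_in_polygon(x: float, y: float, polygon: list) -> bool:
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def geofences_containing(point: tuple, geofences: dict) -> list:
    """Return the names of all geofences (join features) that contain the point."""
    x, y = point
    return [name for name, poly in geofences.items() if point_in_polygon(x, y, poly)]


# Hypothetical join features (geofences) as simple squares.
geofences = {
    "restricted_zone": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "depot": [(20, 20), (30, 20), (30, 30), (20, 30)],
}

in_zone = geofences_containing((5, 5), geofences)
in_depot = geofences_containing((25, 25), geofences)
outside = geofences_containing((15, 5), geofences)
```

Other geometry combinations, such as lines against polygons, need different relationship tests, which is why the available spatial relationships depend on the geometry types of the target and join data.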

Real-time and big data analytic tools that support geofencing include the following:

For additional details and example use cases, refer to geofencing analysis.

Dynamic geofencing

In several real-time analytic tools, dynamic geofencing can be performed to identify spatial relationships between features in a target feed and a set of features in another join feed (the geofences), both of which are updating in real time or near real time. The tool performing the geofencing uses the most recent observation for each track ID in the join feed as the geofence for that track.

  • If a feed is connected to the join port, the join features (the geofences) are continuously refreshed based on the incoming features in the join feed. In this case, geofencing is performed dynamically based on the changing features in both the target and join feeds.
  • With dynamic geofencing, the Join Time Window parameter is required.
    • If the join feed does not have a field tagged END_TIME, and the last known observation for a join feature is older than the specified join time window, the tool purges the observations from its memory and does not include them in the analysis.
    • If the join feed has a field tagged END_TIME, the feature ages out of the geofence store according to the value in the field tagged as END_TIME or at the close of the join time window, whichever comes first.
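The aging rules above can be modeled with a small Python sketch. It assumes that the join time window is measured from the time of the most recent observation for each join feature, which is an interpretation made for this example; the class `DynamicGeofenceStore` and its methods are hypothetical and do not reflect the product's internals.

```python
class DynamicGeofenceStore:
    """Holds the most recent geofence per track ID and ages out stale entries."""

    def __init__(self, join_time_window_s: float):
        self.window = join_time_window_s
        # track_id -> (observed_at_s, end_time_s or None, geometry)
        self._latest = {}

    def update(self, track_id: str, observed_at_s: float, geometry, end_time_s=None):
        """A new observation replaces the previous geofence for that track ID."""
        self._latest[track_id] = (observed_at_s, end_time_s, geometry)

    def active(self, now_s: float) -> dict:
        """Purge expired geofences, then return the ones still usable for analysis."""
        expired = []
        for track_id, (observed_at_s, end_time_s, _) in self._latest.items():
            expires_at = observed_at_s + self.window  # close of the join time window
            if end_time_s is not None:
                # With an END_TIME value, whichever comes first applies.
                expires_at = min(expires_at, end_time_s)
            if now_s > expires_at:
                expired.append(track_id)
        for track_id in expired:
            del self._latest[track_id]
        return {tid: geom for tid, (_, _, geom) in self._latest.items()}


store = DynamicGeofenceStore(join_time_window_s=60)
store.update("storm-1", observed_at_s=0, geometry="storm cell polygon")
store.update("closure-2", observed_at_s=0, geometry="road closure polygon", end_time_s=30)

at_20 = sorted(store.active(now_s=20))  # both geofences still active
at_45 = sorted(store.active(now_s=45))  # closure-2 aged out at its END_TIME
at_90 = sorted(store.active(now_s=90))  # storm-1 aged out after the join time window
```

In this model a fresh observation for `storm-1` would restart its window, which corresponds to the join features being continuously refreshed by the join feed.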

Real-time analytic tools that support dynamic geofencing include the following:

Note:

Geofences used in real-time analytics cannot exceed 768 MB in size.

For additional details and example use cases, refer to geofencing analysis.