Find Hot Spots (GeoAnalytics Desktop)

Summary

Given a set of features, identifies statistically significant hot spots and cold spots using the Getis-Ord Gi* statistic.

Learn more about how Hot Spot Analysis (Getis-Ord Gi*) works

Illustration

Find Hot Spots tool illustration

Usage

  • This tool identifies statistically significant spatial clusters of many features (hot spots) and few features (cold spots). It creates an output feature class with a z-score, p-value, and confidence level bin (Gi_Bin) for each feature in the input.

  • During analysis, the input points (incidents) are aggregated into bins of a specified size, and they are then analyzed to determine hot spots. The aggregated bins must contain a variety of values (counts of points in a bin should be highly variable).

  • The z-scores and p-values are measures of statistical significance that tell you whether to reject the null hypothesis using aggregated bins. That is, they indicate whether the observed spatial clustering of high or low values is more pronounced than one would expect in a random distribution of those values. The z-score and p-value fields do not reflect any kind of False Discovery Rate (FDR) correction.

  • A high z-score and small p-value for a feature indicate an intense presence of point incidents. A low negative z-score and small p-value indicate an absence of point incidents. The higher (or lower) the z-score, the more intense the clustering. A z-score near zero indicates no apparent spatial clustering.

  • The z-score is based on the randomization null hypothesis computation. For more information on z-scores, see What is a z-score? What is a p-value?

  • Analysis with binning requires that the input is projected or that the output coordinate system is set to a projected coordinate system. If the data is not in a projected coordinate system and you do not set one, a projection will be used based on the extent of the data you are analyzing.

  • When input features are analyzed using time steps, each time step is analyzed independently of features outside the time step.

  • The Time Step Reference parameter can be a date and time value or solely a date value; it cannot be solely a time value.

  • This geoprocessing tool is powered by Spark. Analysis is completed on your desktop machine using multiple cores in parallel. See Considerations for GeoAnalytics Desktop tools to learn more about running analysis.

  • When running GeoAnalytics Desktop tools, the analysis is completed on your desktop machine. For optimal performance, data should be available on your desktop. If you are using a hosted feature layer, it is recommended that you use ArcGIS GeoAnalytics Server. If your data isn't local, it will take longer to run a tool. To use your ArcGIS GeoAnalytics Server to perform analysis, see GeoAnalytics Tools.

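The binning and scoring described in the usage notes of this section can be illustrated with a small pure-Python sketch. It is not the tool's implementation: it assumes square bins, binary neighborhood weights measured between bin centers, and a count of zero for empty bins inside the data extent, and it uses the standard Getis-Ord Gi* z-score formula. The function names and sample data are hypothetical.

```python
import math
from collections import Counter


def aggregate_to_bins(points, bin_size):
    """Count points per square bin, filling empty bins in the extent with 0."""
    counts = Counter(
        (math.floor(x / bin_size), math.floor(y / bin_size)) for x, y in points
    )
    cols = [c for c, _ in counts]
    rows = [r for _, r in counts]
    # Assumption: bins with no points inside the bounding extent count as 0.
    return {
        (c, r): counts.get((c, r), 0)
        for c in range(min(cols), max(cols) + 1)
        for r in range(min(rows), max(rows) + 1)
    }


def gi_star_z_scores(bin_counts, bin_size, neighborhood_size):
    """Gi* z-score per bin, using binary weights (1 inside the neighborhood)."""
    bins = list(bin_counts)
    n = len(bins)
    values = [bin_counts[b] for b in bins]
    mean = sum(values) / n
    std = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean * mean))
    if n < 2 or std == 0:
        # No variation in counts: nothing can be flagged as hot or cold.
        return {b: 0.0 for b in bins}
    scores = {}
    for ci, ri in bins:
        # Neighbors are bins whose centers lie within neighborhood_size,
        # including the bin itself (the "star" in Gi*).
        neighbors = [
            (cj, rj) for cj, rj in bins
            if math.hypot(ci - cj, ri - rj) * bin_size <= neighborhood_size
        ]
        w_sum = len(neighbors)  # with binary weights, sum(w) == sum(w^2)
        wx_sum = sum(bin_counts[b] for b in neighbors)
        denom = std * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores[(ci, ri)] = (wx_sum - mean * w_sum) / denom if denom > 0 else 0.0
    return scores


# Hypothetical sample: a dense cluster near the origin on a 5 x 5 grid of
# 100-unit bins, with a single point in every other bin.
sample_counts = {(c, r): 1 for c in range(5) for r in range(5)}
sample_counts.update({(0, 0): 20, (0, 1): 15, (1, 0): 15, (1, 1): 15})
points = [
    (c * 100 + 50, r * 100 + 50)
    for (c, r), k in sample_counts.items()
    for _ in range(k)
]
bin_counts = aggregate_to_bins(points, bin_size=100)
z_scores = gi_star_z_scores(bin_counts, bin_size=100, neighborhood_size=150)
print(round(z_scores[(0, 0)], 2), round(z_scores[(4, 4)], 2))
```

In this sample the bin at the cluster gets a large positive z-score and the far corner gets a negative one, which mirrors the hot spot and cold spot interpretation given above. The guard for zero variation also shows why the bin counts need to be variable: if every bin holds the same count, no bin can stand out.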

Parameters

Label | Explanation | Data Type
Point Layer

The point feature class for which hot spot analysis will be performed.

Feature Layer
Output Feature Class

The output feature class with the z-score and p-value results.

Feature Class
Bin Size

The distance interval that represents the bin size and units into which the Point Layer will be aggregated. The distance interval must be a linear unit.

Linear Unit
Neighborhood Size

The spatial extent of the analysis neighborhood. This value determines which features are analyzed together to assess local clustering.

Linear Unit
Time Step Interval
(Optional)

The interval that will be used for the time step. This parameter is only used if time is enabled for Point Layer.

Time Unit
Time Step Alignment
(Optional)

Specifies how time steps will be aligned. This parameter is only available if the input points are time enabled and represent an instant in time.

  • End time: Time steps will align to the last time event and aggregate back in time.
  • Start time: Time steps will align to the first time event and aggregate forward in time. This is the default.
  • Reference time: Time steps will align to a specified date or time. If all points in the input features have a time stamp larger than the specified reference time (or it falls exactly on the start time of the input features), the time-step interval will begin with that reference time and aggregate forward in time (as occurs with the Start time alignment). If all points in the input features have a time stamp smaller than the specified reference time (or it falls exactly on the end time of the input features), the time-step interval will end with that reference time and aggregate backward in time (as occurs with the End time alignment). If the specified reference time is in the middle of the time extent of the data, a time-step interval will be created ending with the reference time provided (as occurs with the End time alignment); additional intervals will be created both before and after the reference time until the full time extent of the data is covered.
String
Time Step Reference
(Optional)

The time that will be used to align the time steps and time intervals. This parameter is only used if time is enabled for Point Layer.

Date

arcpy.gapro.FindHotSpots(point_layer, out_feature_class, bin_size, neighborhood_size, {time_step_interval}, {time_step_alignment}, {time_step_reference})
Name | Explanation | Data Type
point_layer

The point feature class for which hot spot analysis will be performed.

Feature Layer
out_feature_class

The output feature class with the z-score and p-value results.

Feature Class
bin_size

The distance interval that represents the bin size and units into which the point_layer will be aggregated. The distance interval must be a linear unit.

Linear Unit
neighborhood_size

The spatial extent of the analysis neighborhood. This value determines which features are analyzed together to assess local clustering.

Linear Unit
time_step_interval
(Optional)

The interval that will be used for the time step. This parameter is only used if time is enabled for point_layer.

Time Unit
time_step_alignment
(Optional)

Specifies how time steps will be aligned. This parameter is only available if the input points are time enabled and represent an instant in time.

  • END_TIME: Time steps will align to the last time event and aggregate back in time.
  • START_TIME: Time steps will align to the first time event and aggregate forward in time. This is the default.
  • REFERENCE_TIME: Time steps will align to a specified date or time. If all points in the input features have a time stamp larger than the specified reference time (or it falls exactly on the start time of the input features), the time-step interval will begin with that reference time and aggregate forward in time (as occurs with the Start time alignment). If all points in the input features have a time stamp smaller than the specified reference time (or it falls exactly on the end time of the input features), the time-step interval will end with that reference time and aggregate backward in time (as occurs with the End time alignment). If the specified reference time is in the middle of the time extent of the data, a time-step interval will be created ending with the reference time provided (as occurs with the End time alignment); additional intervals will be created both before and after the reference time until the full time extent of the data is covered.
String
time_step_reference
(Optional)

The time that will be used to align the time steps and time intervals. This parameter is only used if time is enabled for point_layer.

Date
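The three time_step_alignment options can be easier to follow with a worked example. The sketch below is a plain-Python approximation of the behavior described in the parameter table, not the tool's own logic. It assumes a fixed-length interval expressed as a timedelta (calendar units such as months vary in length and are not modeled), and the function name time_step_intervals is hypothetical.

```python
from datetime import datetime, timedelta


def time_step_intervals(first, last, interval, alignment="START_TIME", reference=None):
    """Return (start, end) tuples covering [first, last] for the given alignment."""
    if not first < last:
        raise ValueError("first must be earlier than last")
    if alignment == "START_TIME":
        anchor = first
    elif alignment == "END_TIME":
        anchor = last
    elif alignment == "REFERENCE_TIME":
        if reference is None:
            raise ValueError("REFERENCE_TIME requires a reference")
        anchor = reference
    else:
        raise ValueError(f"unknown alignment: {alignment}")

    # Every step boundary sits a whole number of intervals from the anchor.
    lower = anchor
    while lower > first:   # extend backward until the first event is covered
        lower -= interval
    upper = anchor
    while upper < last:    # extend forward until the last event is covered
        upper += interval

    steps = []
    start = lower
    while start < upper:
        steps.append((start, start + interval))
        start += interval
    return steps


# Hypothetical data extent: events from 1 to 10 January 2021, 4-day steps.
first_event = datetime(2021, 1, 1)
last_event = datetime(2021, 1, 10)
four_days = timedelta(days=4)

from_start = time_step_intervals(first_event, last_event, four_days, "START_TIME")
from_end = time_step_intervals(first_event, last_event, four_days, "END_TIME")
from_reference = time_step_intervals(
    first_event, last_event, four_days, "REFERENCE_TIME", datetime(2021, 1, 4)
)
for name, steps in [("START_TIME", from_start), ("END_TIME", from_end),
                    ("REFERENCE_TIME", from_reference)]:
    print(name, [(s.date().isoformat(), e.date().isoformat()) for s, e in steps])
```

With a reference time inside the data extent, one step ends exactly at the reference and further steps are added on both sides until the extent is covered, as the REFERENCE_TIME description states.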

Code sample

FindHotSpots example (stand-alone script)

The following stand-alone script demonstrates how to use the FindHotSpots function.

# Name: FindHotSpots.py
# Description: Find Hot Spots of 311 calls for bins of 500 meters looking at
# neighbors within 1 kilometer. Complete the analysis for each month. 

# Import system modules
import arcpy

arcpy.env.workspace = "C:/data/Calls311.gdb"

# Enable time on the input features using an .lyrx file.
# To create the .lyrx file, add your layer to a map, open the layer properties 
# and enable time. Then right-click the layer and select Share As Layer File.
input_lyrx = r'C:\data\SanFrancisco_311calls.lyrx'

# MakeFeatureLayer creates a feature layer from the .lyrx file
SF311CallsInputLayer = arcpy.management.MakeFeatureLayer(input_lyrx, "SF_311Calls_layer")

# ApplySymbologyFromLayer sets the time using the .lyrx file definition
arcpy.management.ApplySymbologyFromLayer(SF311CallsInputLayer, input_lyrx)

# Set local variables
bins = "500 Meters"
neighborhood = "1 Kilometers"
timeStep = "1 Months"
out = "HotSpotsOf311Data"

# Run Find Hot Spots
arcpy.gapro.FindHotSpots(SF311CallsInputLayer, out, bins, neighborhood, timeStep)
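
Once the tool has run, the z-score of each output feature can be read along the lines described in the usage notes. The sketch below shows one way to turn a z-score into a two-sided p-value and a confidence bin. It assumes a standard normal reference distribution and the common Gi_Bin convention of +/-3, +/-2, and +/-1 for 99, 95, and 90 percent confidence, with 0 for not significant. No False Discovery Rate correction is applied, and the helper names are hypothetical rather than part of arcpy.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a z-score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def confidence_bin(z):
    """Map a z-score to a signed confidence bin (assumed Gi_Bin convention)."""
    p = two_sided_p_value(z)
    if p < 0.01:
        level = 3   # 99 percent confidence
    elif p < 0.05:
        level = 2   # 95 percent confidence
    elif p < 0.10:
        level = 1   # 90 percent confidence
    else:
        level = 0   # not statistically significant
    # Positive z-scores are hot spots, negative z-scores are cold spots.
    return level if z > 0 else -level


# Hypothetical z-scores, as they might appear in an output feature class.
for z in (3.0, 2.0, -1.7, 0.5):
    print(z, round(two_sided_p_value(z), 4), confidence_bin(z))
```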