Find Hot Spots

The Find Hot Spots tool identifies statistically significant hot spots and cold spots in the spatial pattern of your data using the Getis-Ord Gi* statistic.

Workflow diagram

[Figure: Find Hot Spots workflow diagram]

Examples

  • A police department is conducting an analysis to determine if there is a relationship between violent crimes and unemployment rates. An expanded summer job program will be implemented for high schools in areas where there is high violent crime and high unemployment. The Find Hot Spots tool can be used to find areas with statistically significant crime and unemployment hot spots.
  • A conservation officer is studying disease in trees to prioritize which areas of the forest should receive treatment and learn more about areas that are showing some resistance. The Find Hot Spots tool can be used to find clusters of diseased (hot spots) and healthy (cold spots) trees.

Usage notes

  • Input features must be points. The points are aggregated into a grid of square bins, and the analysis is performed on the bins.
  • The output layer will have additional fields containing information such as the statistical significance of each feature, the p-value, and the z-score.
  • During analysis, the input points are aggregated into bins of a specified size, and the bins are then analyzed to determine hot spots. The aggregated bins must contain a variety of values; that is, the counts of points per bin should be highly variable.
  • The z-scores and p-values are measures of statistical significance computed for each aggregated bin; they tell you whether to reject the null hypothesis. In effect, they indicate whether the observed spatial clustering of high or low values is more pronounced than one would expect in a random distribution of those same values. The z-score and p-value fields do not reflect any False Discovery Rate (FDR) correction.
  • A high z-score and small p-value for a feature indicate an intense presence of point incidents. A low negative z-score and small p-value indicate an absence of point incidents. The higher (or lower) the z-score, the more intense the clustering. A z-score near zero indicates no apparent spatial clustering.
  • The z-score is based on the randomization null hypothesis computation. For more information on z-scores, see "What is a z-score? What is a p-value?"
  • The Find Hot Spots tool allows you to optionally analyze using time steps. Each time step is analyzed independently of features outside of the time step. To use time stepping, your input data must be time enabled and represent an instant in time. When time stepping is applied, output features will be time intervals represented by the fields StartTime and EndTime.
  • The Time Step Reference parameter can be a date and time value or solely a date value; it cannot be solely a time value.
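The aggregation and scoring described in these notes can be sketched in Python. This is a minimal illustration under stated assumptions, not the tool's implementation: the function names `bin_counts` and `gi_star_z_scores` are hypothetical, empty bins inside the bounding rectangle of occupied bins are assumed to count as zeros, the neighborhood is assumed to use binary weights based on the distance between bin centers, and the z-score follows the standard Getis-Ord Gi* formula.

```python
import math
from collections import Counter

def bin_counts(points, bin_size):
    """Count points per square bin, keyed by (column, row) index.

    Assumption: empty bins inside the bounding rectangle of the
    occupied bins are included with a count of zero.
    """
    counts = Counter(
        (math.floor(x / bin_size), math.floor(y / bin_size)) for x, y in points
    )
    cols = [c for c, _ in counts]
    rows = [r for _, r in counts]
    return {
        (c, r): counts.get((c, r), 0)
        for c in range(min(cols), max(cols) + 1)
        for r in range(min(rows), max(rows) + 1)
    }

def gi_star_z_scores(grid, bin_size, neighborhood_size):
    """Return {bin: Gi* z-score} using binary neighborhood weights.

    Assumption: a bin is a neighbor when the distance between bin
    centers is at most neighborhood_size; each bin neighbors itself.
    """
    n = len(grid)
    values = list(grid.values())
    mean = sum(values) / n
    variance = max(0.0, sum(v * v for v in values) / n - mean ** 2)
    s = math.sqrt(variance)
    if n < 2 or s == 0:
        raise ValueError("bin counts must vary for hot spot analysis")

    z_scores = {}
    for (ci, ri) in grid:
        neighbors = [
            v for (cj, rj), v in grid.items()
            if math.hypot(ci - cj, ri - rj) * bin_size <= neighborhood_size
        ]
        w_sum = len(neighbors)  # binary weights: sum(w) == sum(w^2)
        numerator = sum(neighbors) - mean * w_sum
        denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        # A neighborhood covering every bin has no local contrast.
        z_scores[(ci, ri)] = numerator / denominator if denominator else 0.0
    return z_scores
```

The requirement that bin counts vary shows up here as the check on the standard deviation: if every bin holds the same count, the statistic is undefined.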

Parameters

Parameter | Description | Data Type
Input Layer | The point features for which hot spots will be calculated. | Features
Bin Type | The shape of the regular bins into which the input points are aggregated. SQUARE is the available option. | String
Bin Size | The distance interval that defines the size of the bins into which the input points are aggregated. | String
Neighborhood Size (optional) | The spatial extent of the analysis neighborhood. This value determines which features are analyzed together to assess local clustering. | String
Time Step Interval (optional) | The duration of each time step. This option is only used if the input points' schema has a field tagged as START_TIME. | String
Time Step Alignment (optional) | How the time steps are aligned. This option is only available if the input points are time enabled and represent an instant in time. The options are START TIME, END TIME, and REFERENCE TIME, described after the table. | String
Time Step Reference (optional) | The reference time with which time steps and time intervals align. This parameter only appears if REFERENCE TIME is selected for the Time Step Alignment parameter. | Date

Time Step Alignment options:

  • START TIME - Time steps align to the first time event and aggregate forward in time.
  • END TIME - Time steps align to the last time event and aggregate back in time.
  • REFERENCE TIME - Time steps align to a particular date/time you specify. If every input point has a timestamp later than the reference time (or the reference time falls exactly on the start time of the input features), time steps begin at the reference time and aggregate forward in time, as with START TIME alignment. If every input point has a timestamp earlier than the reference time (or the reference time falls exactly on the end time of the input features), time steps end at the reference time and aggregate backward in time, as with END TIME alignment. If the reference time falls in the middle of the time extent of your data, a time step is created ending at the reference time, as with END TIME alignment, and additional time steps are created both before and after the reference time until the full time extent of your data is covered.
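The three alignment behaviors can be illustrated with a short Python sketch. It is only an interpretation of the descriptions above, not the tool's code: the function name `time_steps` and its helpers are hypothetical, the sample timestamps are made up, and how the tool treats points that fall exactly on a step boundary is not specified here, so the sketch simply generates steps until the full time extent is covered.

```python
from datetime import datetime, timedelta

def _forward(start, interval, last):
    """Build consecutive steps from `start` forward until `last` is covered."""
    steps = []
    while True:
        end = start + interval
        steps.append((start, end))
        if end >= last:
            return steps
        start = end

def _backward(end, interval, first):
    """Build consecutive steps from `end` backward until `first` is covered."""
    steps = []
    while True:
        start = end - interval
        steps.append((start, end))
        if start <= first:
            return steps[::-1]  # oldest step first
        end = start

def time_steps(timestamps, interval, alignment="START TIME", reference=None):
    """Return a list of (StartTime, EndTime) pairs for the given alignment."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    first, last = min(timestamps), max(timestamps)
    if alignment == "START TIME":
        return _forward(first, interval, last)
    if alignment == "END TIME":
        return _backward(last, interval, first)
    if alignment == "REFERENCE TIME":
        if reference is None:
            raise ValueError("REFERENCE TIME alignment needs a reference")
        if reference <= first:
            return _forward(reference, interval, last)
        if reference >= last:
            return _backward(reference, interval, first)
        # Reference inside the data extent: one step ends at the
        # reference, and further steps extend in both directions.
        return (_backward(reference, interval, first)
                + _forward(reference, interval, last))
    raise ValueError(f"unknown alignment: {alignment}")

# Made-up sample timestamps for illustration only.
stamps = [datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 2, 30),
          datetime(2024, 1, 1, 5, 0)]
for start, end in time_steps(stamps, timedelta(hours=2), "END TIME"):
    print(start, end)
```

With END TIME alignment the last step ends on the latest timestamp, so the earliest step may begin before the first event; START TIME alignment mirrors this at the other end of the data.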

Output layer

The output layer will contain the following fields in place of the original fields.

Field Name | Description | Field Type
value | The number of features within that bin. | Float64
GiZScore | The z-score for that bin. | Float64
GiPValue | The p-value for that bin. | Float64
Gi_Bin | The confidence level used to identify statistically significant hot and cold spots. Features with a Gi_Bin value of +/-3 reflect statistical significance with a 99 percent confidence level; features with a Gi_Bin value of +/-2 reflect a 95 percent confidence level; features with a Gi_Bin value of +/-1 reflect a 90 percent confidence level; and the clustering for features with a Gi_Bin value of 0 is not statistically significant. | Float64
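The relationship between the z-score, p-value, and Gi_Bin fields can be sketched as follows. This is an illustration only: the function names are hypothetical, the p-value is assumed to be a two-sided value from the standard normal distribution, and the thresholds 0.01, 0.05, and 0.10 are the uncorrected cutoffs implied by the 99, 95, and 90 percent confidence levels. The tool's exact rules may differ.

```python
import math

def p_value_from_z(z_score):
    """Two-sided p-value for a z-score under the standard normal distribution."""
    return math.erfc(abs(z_score) / math.sqrt(2))

def classify_gi_bin(z_score, p_value):
    """Map a z-score and p-value to a Gi_Bin value between -3 and 3.

    Assumption: uncorrected thresholds of 0.01, 0.05, and 0.10.
    """
    if p_value <= 0.01:
        level = 3   # 99 percent confidence
    elif p_value <= 0.05:
        level = 2   # 95 percent confidence
    elif p_value <= 0.10:
        level = 1   # 90 percent confidence
    else:
        return 0    # not statistically significant
    # Positive z-scores are hot spots, negative z-scores are cold spots.
    return level if z_score > 0 else -level
```

The sign of Gi_Bin comes from the z-score, while its magnitude comes only from the p-value, which is why a strongly negative z-score yields a cold spot at the same confidence level as an equally strong positive one.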

Considerations and limitations

  • The input must be a point layer. The points are aggregated into bins of a specified size before analysis.