Find Hot Spots

Available in big data analytics.

The Find Hot Spots tool identifies statistically significant hot spots and cold spots in the spatial pattern of the data using the Getis-Ord Gi* statistic.

Workflow diagram

[Figure: Find Hot Spots workflow diagram]

Examples

  • A police department is conducting an analysis to determine if there is a relationship between violent crimes and unemployment rates. An expanded summer job program will be implemented for high schools in areas where there is high violent crime and high unemployment. The Find Hot Spots tool can be used to find areas with statistically significant crime and unemployment hot spots.
  • A conservation officer is studying disease in trees to prioritize which areas of the forest should receive treatment and to learn more about areas that are showing some resistance. The Find Hot Spots tool can be used to find clusters of diseased (hot spots) and healthy (cold spots) trees.

Usage notes

  • Input features must be points. The points are aggregated into the bins of a square grid, and the analysis is performed on those bins.
  • The output layer will have additional fields containing information such as the statistical significance of each feature, the p-value, and the z-score.
  • During analysis, the input points are aggregated into bins of a specified size. The bins are then analyzed to determine hot spots. The aggregated bins must contain a variety of values (counts of points in a bin should be highly variable).
  • The z-scores and p-values are measures of statistical significance that indicate whether the observed spatial clustering of high or low values is more pronounced than would be expected in a random distribution of those same values. Use them to decide, for each aggregated bin, whether to reject the null hypothesis. The z-score and p-value fields do not reflect any kind of False Discovery Rate (FDR) correction.
  • A high z-score and small p-value for a feature indicate an intense presence of point incidents. A low negative z-score and small p-value indicate an absence of point incidents. The higher (or lower) the z-score, the more intense the clustering. A z-score near zero indicates no apparent spatial clustering.
  • The z-score is based on the randomization null hypothesis computation. For more information on z-scores, see "What is a z-score? What is a p-value?"
  • The Find Hot Spots tool allows you to analyze using time steps. Each time step is analyzed independently of features outside of the time step. To use time stepping, the input data must be time enabled and represent an instant in time. When time stepping is applied, output features will be time intervals represented by the StartTime and EndTime fields.
  • The Time Step Reference parameter can be a date and time value or solely a date value; it cannot be solely a time value.
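
The aggregation and scoring described in the notes above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function names (bin_counts, gi_star), the fixed rectangular grid that includes empty bins, and the binary fixed-distance weights measured between bin centers are all assumptions made for illustration. The z-score uses the standard Getis-Ord Gi* formula.

```python
import math
from collections import Counter


def bin_counts(points, bin_size, cols, rows, origin=(0.0, 0.0)):
    """Count points per square bin (hypothetical helper).

    Returns {(col, row): count} for every bin in a cols x rows grid,
    including empty bins, which is an assumption of this sketch.
    """
    counts = Counter()
    for x, y in points:
        col = int((x - origin[0]) // bin_size)
        row = int((y - origin[1]) // bin_size)
        if 0 <= col < cols and 0 <= row < rows:
            counts[(col, row)] += 1
    return {(c, r): counts.get((c, r), 0)
            for c in range(cols) for r in range(rows)}


def gi_star(values, bin_size, neighborhood):
    """Getis-Ord Gi* z-score per bin with binary distance weights.

    A bin j is a neighbor of bin i (weight 1) when the distance between
    their centers is <= neighborhood; bin i is its own neighbor.
    """
    n = len(values)
    mean = sum(values.values()) / n
    variance = sum(v * v for v in values.values()) / n - mean * mean
    s = math.sqrt(max(variance, 0.0))
    if s == 0:
        # Matches the usage note: bins must contain a variety of values.
        raise ValueError("bin values have no variation")

    scores = {}
    for (ci, ri) in values:
        local_sum = 0.0
        w_sum = 0
        for (cj, rj), xj in values.items():
            if bin_size * math.hypot(ci - cj, ri - rj) <= neighborhood:
                local_sum += xj
                w_sum += 1
        # With binary weights, the sum of squared weights equals w_sum.
        denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        # denom is 0 only when every bin is a neighbor; the numerator is
        # then 0 as well, so the sketch reports no clustering.
        scores[(ci, ri)] = (local_sum - mean * w_sum) / denom if denom else 0.0
    return scores


# Ten incidents in one corner bin, three in each adjacent bin.
points = [(0.5, 0.5)] * 10 + [(0.5, 1.5)] * 3 + [(1.5, 0.5)] * 3
counts = bin_counts(points, bin_size=1.0, cols=5, rows=5)
z_scores = gi_star(counts, bin_size=1.0, neighborhood=1.0)
```

In this example the corner bin holding most of the points receives a large positive z-score, while bins far from any points receive negative z-scores.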

Parameters

Parameter | Description | Data type

Input Layer

The point features for which hot spots will be calculated.

Features

Bin Type

The bin shape that will be used to create the regular bins. The default is Square.

String

Bin Size

The distance interval that defines the size of the bins into which the input points will be aggregated.

String

Neighborhood Size (optional)

The spatial extent of the analysis neighborhood. This value determines which features are analyzed together to assess local clustering.

String

Time Step Interval (optional)

The interval for the time step. This parameter is only used if the input points' schema has a field tagged with the Start Time key field.

String

Time Step Alignment (optional)

Specifies how time steps will be aligned. This parameter is only available if the input points are time enabled and represent an instant in time.

  • Start Time—Time steps align to the first time event and aggregate forward in time.
  • End Time—Time steps align to the last time event and aggregate back in time.
  • Reference Time—Time steps align to a specified date and time. If all points in the input features have a time stamp larger than the reference time you provide (or it falls exactly on the start time of the input features), the time-step interval will begin with that reference time and aggregate forward in time (as occurs with a start time alignment). If all points in the input features have a time stamp smaller than the reference time you provide (or it falls exactly on the end time of the input features), the time-step interval will end with that reference time and aggregate backward in time (as occurs with an end time alignment). If the reference time you provide is in the middle of the time extent of the data, a time-step interval will be created ending with the reference time provided (as occurs with an end time alignment); additional intervals will be created both before and after the reference time until the full time extent of the data is covered.

String

Time Step Reference (optional)

The reference time for time steps and time intervals to align with. This parameter only appears if Reference Time is used for the Time Step Alignment parameter.

Date
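
The three Time Step Alignment options can be sketched in Python as follows. This is only an illustration of the descriptions above, not the tool's implementation: the function name time_steps, the alignment keywords, and the choice to return only the steps needed to cover the time extent of the data are assumptions. Each returned pair corresponds to the StartTime and EndTime of one step.

```python
import math
from datetime import datetime, timedelta


def time_steps(times, interval, alignment="start", reference=None):
    """Return (start, end) pairs covering the time extent of the data.

    alignment is "start", "end", or "reference" (hypothetical keywords).
    Step boundaries always fall on anchor + k * interval.
    """
    first, last = min(times), max(times)
    if alignment == "start":
        anchor = first          # align to the first event, step forward
    elif alignment == "end":
        anchor = last           # align to the last event, step backward
    elif alignment == "reference":
        if reference is None:
            raise ValueError("reference alignment needs a reference time")
        anchor = reference      # align to a user-supplied date and time
    else:
        raise ValueError(f"unknown alignment: {alignment}")

    # Index of the first and last boundary needed to cover the data.
    k_lo = math.floor((first - anchor) / interval)
    k_hi = math.ceil((last - anchor) / interval)
    if k_hi == k_lo:
        k_hi = k_lo + 1         # always produce at least one step

    return [(anchor + k * interval, anchor + (k + 1) * interval)
            for k in range(k_lo, k_hi)]


events = [datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 10),
          datetime(2024, 1, 2, 6)]
one_day = timedelta(days=1)
for name, ref in [("start", None), ("end", None),
                  ("reference", datetime(2024, 1, 1, 12))]:
    print(name, time_steps(events, one_day, name, ref))
```

With a reference time in the middle of the data, one step ends at the reference time and further steps are added before and after it until the data is covered.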

Output layer

The output layer will contain the following fields in place of the original fields:

Field name | Description | Field type

value

The number of features in that bin.

Float64

GiZScore

The z-score of features in that bin.

Float64

GiPValue

The p-value of features in that bin.

Float64

Gi_Bin

The confidence level used to identify statistically significant hot and cold spots. Features with a Gi_Bin value of +/-3 reflect statistical significance with a 99 percent confidence level; features with a Gi_Bin value of +/-2 reflect a 95 percent confidence level; features with a Gi_Bin value of +/-1 reflect a 90 percent confidence level; and the clustering for features with a Gi_Bin value of 0 is not statistically significant.

Float64
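
The relationship between the GiZScore, GiPValue, and Gi_Bin fields can be sketched in Python. This is a hedged illustration rather than the tool's code: the function names are hypothetical, the p-value is assumed to be the two-sided tail probability of a standard normal distribution, and the strict "less than" comparisons at the 0.01, 0.05, and 0.10 thresholds are an assumption. No FDR correction is applied, in line with the usage notes.

```python
import math


def p_value(z):
    """Two-sided p-value of a z-score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def gi_bin(z, p):
    """Map a z-score and p-value to a confidence bin from -3 to 3.

    Positive bins are hot spots and negative bins are cold spots.
    """
    if p < 0.01:
        level = 3   # 99 percent confidence
    elif p < 0.05:
        level = 2   # 95 percent confidence
    elif p < 0.10:
        level = 1   # 90 percent confidence
    else:
        level = 0   # not statistically significant
    return level if z > 0 else -level


for z in (2.7, -2.0, 1.7, 0.5):
    p = p_value(z)
    print(f"z={z:+.2f}  p={p:.4f}  Gi_Bin={gi_bin(z, p)}")
```

The sketch returns integers for readability; the output layer stores Gi_Bin as Float64.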

Considerations and limitations

The input must be a point layer. The points are aggregated into bins of a specified size before analysis.