The Find Hot Spots tool identifies any statistically significant hot spots and cold spots in the spatial pattern of your data using the Getis-Ord Gi* statistic.
- A police department is conducting an analysis to determine if there is a relationship between violent crimes and unemployment rates. An expanded summer job program will be implemented for high schools in areas where there is high violent crime and high unemployment. The Find Hot Spots tool can be used to find areas with statistically significant crime and unemployment hot spots.
- A conservation officer is studying disease in trees to prioritize which areas of the forest should receive treatment and learn more about areas that are showing some resistance. The Find Hot Spots tool can be used to find clusters of diseased (hot spots) and healthy (cold spots) trees.
- Input features must be points. The points are aggregated into a grid of square bins, and the analysis is run on those bins.
- The output layer will have additional fields containing information such as the statistical significance of each feature, the p-value, and the z-score.
- During analysis, the input points are aggregated into bins of a specified size, and the bins are then analyzed to determine hot spots. The aggregated bins must contain a variety of values; that is, the count of points per bin should be highly variable.
- The z-scores and p-values are computed on the aggregated bins. They are measures of statistical significance that tell you whether or not to reject the null hypothesis. In effect, they indicate whether the observed spatial clustering of high or low values is more pronounced than one would expect in a random distribution of those same values. The z-score and p-value fields do not reflect any kind of FDR (False Discovery Rate) correction.
- A high z-score and small p-value for a feature indicate an intense presence of point incidents. A low negative z-score and small p-value indicate an absence of point incidents. The higher (or lower) the z-score, the more intense the clustering. A z-score near zero indicates no apparent spatial clustering.
- The z-score is based on the randomization null hypothesis computation. For more information on z-scores, see What is a z-score? What is a p-value?
- The Find Hot Spots tool allows you to optionally analyze using time steps. Each time step is analyzed independently of features outside of the time step. To use time stepping, your input data must be time enabled and represent an instant in time. When time stepping is applied, output features will be time intervals represented by the fields StartTime and EndTime.
- The Time Step Reference parameter can be a date and time value or solely a date value; it cannot be solely a time value.
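The aggregation and scoring described in the usage notes above can be sketched in a few lines of Python. This is only an illustration of the general Getis-Ord Gi* formula, not the tool's implementation. It assumes binary weights (a bin is a neighbor if its center lies within the neighborhood distance, and each bin counts as its own neighbor) and it only considers bins that contain at least one point. The function names `bin_counts` and `gi_star` are hypothetical.

```python
import math
from collections import Counter


def bin_counts(points, bin_size):
    """Count points per square bin; bins are keyed by (column, row) index."""
    return Counter(
        (math.floor(x / bin_size), math.floor(y / bin_size)) for x, y in points
    )


def gi_star(counts, bin_size, neighborhood_size):
    """Return {bin: Gi* z-score} using binary distance-band weights.

    Assumption of this sketch: only bins that contain points are analyzed.
    """
    bins = list(counts)
    n = len(bins)
    values = [counts[b] for b in bins]
    mean = sum(values) / n
    variance = max(sum(v * v for v in values) / n - mean * mean, 0.0)
    s = math.sqrt(variance)
    if s == 0:
        # Matches the usage note: bin counts must vary for the analysis to work.
        raise ValueError("bin counts have no variation; Gi* is undefined")

    scores = {}
    for bi in bins:
        # Bin indices are scaled by bin_size to get center-to-center distance.
        neighbors = [
            counts[bj]
            for bj in bins
            if math.dist(bi, bj) * bin_size <= neighborhood_size
        ]
        w = len(neighbors)  # with binary weights, sum(w) == sum(w^2) == w
        numerator = sum(neighbors) - mean * w
        denominator = s * math.sqrt((n * w - w * w) / (n - 1))
        scores[bi] = numerator / denominator if denominator > 0 else 0.0
    return scores


# Ten points clustered in one bin, plus four isolated points.
points = [(0.1 * i + 0.05, 0.5) for i in range(10)]
points += [(5.5, 0.5), (0.5, 5.5), (5.5, 5.5), (9.5, 9.5)]
scores = gi_star(bin_counts(points, bin_size=1.0), 1.0, neighborhood_size=1.5)
print(round(scores[(0, 0)], 3))  # 2.0  (the dense bin scores high)
print(round(scores[(5, 5)], 3))  # -0.5 (a sparse bin scores below zero)
```

If every bin held the same number of points, the standard deviation would be zero and no score could be computed, which is why the bin counts need to vary.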
The point features for which hot spots will be calculated.
The bin shape that will be used to create the regular grid of bins. The available option is SQUARE.
The distance interval that defines the size of the bins into which the input points will be aggregated.
Neighborhood Size (optional)
The spatial extent of the analysis neighborhood. This value determines which features are analyzed together in order to assess local clustering.
Time Step Interval (optional)
The length of each time step. This option is only used if the input points' schema has a field tagged as START_TIME.
Time Step Alignment (optional)
How you want your time steps aligned. This option is only available if the input points are time enabled and represent an instant in time.
Time Step Reference (optional)
A specified reference time for time steps and time intervals to align with. This parameter only appears if the user selects ReferenceTime for the Time Step Alignment parameter.
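As a rough illustration of how a time step interval and a reference time could determine the StartTime and EndTime of the step that a point falls into, consider the sketch below. The function name `time_step` is hypothetical, and treating each step as a half-open interval anchored on the reference time is an assumption of this sketch rather than documented tool behavior.

```python
from datetime import datetime, timedelta


def time_step(timestamp, interval, reference):
    """Return (start, end) of the step containing timestamp.

    Steps are laid out as [reference + k * interval, reference + (k + 1) * interval)
    for any integer k, so timestamps before the reference are handled as well.
    """
    k = (timestamp - reference) // interval  # floor division yields an int
    start = reference + k * interval
    return start, start + interval


reference = datetime(2020, 1, 1)  # a date-only reference is midnight of that day
weekly = timedelta(days=7)
print(time_step(datetime(2020, 1, 10, 12, 0), weekly, reference))
# (datetime.datetime(2020, 1, 8, 0, 0), datetime.datetime(2020, 1, 15, 0, 0))
```

A reference needs a date component to anchor the steps on the timeline, which is consistent with the note that Time Step Reference cannot be solely a time value.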
The output layer will contain the following fields in place of the original fields.
|Field Name||Description||Field Type|
The number of features within that bin.
The Z-Score of features within that bin.
The P-Value of features within that bin.
The confidence level used to identify statistically significant hot and cold spots. Features with a Gi_Bin value of +/-3 reflect statistical significance with a 99 percent confidence level; features with a Gi_Bin value of +/-2 reflect a 95 percent confidence level; features with a Gi_Bin value of +/-1 reflect a 90 percent confidence level; and the clustering for features with a Gi_Bin value of 0 is not statistically significant.
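The relationship between the z-score, the p-value, and Gi_Bin can be sketched as follows. This is a minimal illustration under two assumptions: the p-value is the two-sided tail probability of the standard normal distribution, and the 99, 95, and 90 percent levels are applied directly to that uncorrected p-value with strict inequalities. The function names `p_value` and `gi_bin` are hypothetical.

```python
import math


def p_value(z):
    """Two-sided p-value for a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def gi_bin(z, p):
    """Map a z-score and p-value to a confidence bin in the range -3..3."""
    if p < 0.01:
        level = 3  # 99 percent confidence
    elif p < 0.05:
        level = 2  # 95 percent confidence
    elif p < 0.10:
        level = 1  # 90 percent confidence
    else:
        return 0  # not statistically significant
    # Positive z-scores are hot spots, negative z-scores are cold spots.
    return level if z > 0 else -level


for z in (2.7, 2.0, -1.7, 0.5):
    print(z, gi_bin(z, p_value(z)))
# 2.7 3
# 2.0 2
# -1.7 -1
# 0.5 0
```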
Considerations and limitations
- The input must be a point layer. The points are aggregated into bins of a specified size before analysis.