80-20 Analysis (Crime Analysis and Safety)

Summary

Conducts an 80/20 analysis of features and creates point clusters, lines, or polygons based on the number of associated incidents. The tool calculates a cumulative percentage field to identify the locations where incidents are disproportionately occurring.

Usage

  • The 80-20 rule is the theoretical concept that a large majority of incidents occur at a small minority of locations; for example, 80 percent of incidents occur at 20 percent of locations.

  • In the discipline of crime analysis, this tool can be used in many ways. Typical analysis workflows include the following:

    • Aggregate incidents into clusters—This type of analysis identifies the properties with the highest number of incidents for a specific period.
    • Aggregate incidents to street segments—This type of analysis is sometimes described as finding hot streets.
    • Aggregate incidents to specific neighborhood boundaries—This type of analysis is sometimes described as finding hot areas.
  • The Aggregation Method parameter determines how the 80-20 analysis is conducted and how the input point features will be aggregated. The two aggregation methods available for conducting analysis are Cluster and Closest Feature, which are described as follows:

    • Cluster—The input point features will be clustered based on the Defined distance (DBSCAN) clustering method used in the Density-based Clustering tool.
    • Closest Feature—The input point features will be associated with the closest input comparison line or polygon feature.
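
    As a rough illustration of the Closest Feature idea only, the following pure-Python sketch assigns each point to its nearest line segment and counts the points per segment. It is not the tool's implementation: the function names, the planar coordinates, and the representation of each street as a single two-vertex segment are assumptions made for this example.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b (all are (x, y) tuples)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both end points coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def count_by_closest_segment(points, segments):
    """Return {segment name: number of points whose nearest segment it is}."""
    counts = {name: 0 for name in segments}
    for p in points:
        nearest = min(
            segments,
            key=lambda name: point_segment_distance(p, *segments[name]),
        )
        counts[nearest] += 1
    return counts


# Hypothetical data: two parallel streets and four incident points.
segments = {
    "Main St": ((0.0, 0.0), (10.0, 0.0)),
    "Oak Ave": ((0.0, 5.0), (10.0, 5.0)),
}
points = [(1.0, 1.0), (2.0, 0.5), (3.0, 4.0), (9.0, 2.0)]
counts = count_by_closest_segment(points, segments)
print(counts)
```

    The per-segment counts play the role of the ICOUNT field described below. Real street data has multi-vertex lines and a projected coordinate system, which this sketch does not handle.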

  • The following fields will be added to the output when the Aggregation Method parameter is set to Cluster:

    • ICOUNT—The number of points found within the cluster tolerance for that cluster.
    • PERC—The percentage of the total number of points found within the cluster tolerance for that cluster.
    • CUMU_PERC—The cumulative percentage of the current cluster point and all other larger cluster points, calculated using the ICOUNT value.
    • CUMU_LPERC—The cumulative percentage of the current cluster point and all other larger cluster points, calculated using the total number of output point features.

    The CUMU_PERC and CUMU_LPERC values can be used to determine if a disproportionate number of cluster locations represent a larger proportion of crimes, for example, 20 percent of cluster locations contain 80 percent of total points.
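
    The field definitions above can be sketched in plain Python to show how the four values relate. This is an illustration under stated assumptions, not the tool's code: the function name and the sample counts are hypothetical, and records are assumed to be ranked from the largest count to the smallest.

```python
def eighty_twenty_fields(counts):
    """Compute ICOUNT, PERC, CUMU_PERC, and CUMU_LPERC for a list of counts.

    counts holds one incident count per location (cluster, line, or polygon).
    Rows are returned in descending order of ICOUNT.
    """
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    n = len(ordered)
    if total == 0:
        raise ValueError("at least one incident is required")

    rows = []
    running = 0
    for rank, icount in enumerate(ordered, start=1):
        running += icount
        rows.append({
            "ICOUNT": icount,
            # Share of all incidents at this location.
            "PERC": 100.0 * icount / total,
            # Share of all incidents at this and all larger locations.
            "CUMU_PERC": 100.0 * running / total,
            # Share of all locations considered so far.
            "CUMU_LPERC": 100.0 * rank / n,
        })
    return rows


# Hypothetical counts for ten locations (100 incidents in total).
rows = eighty_twenty_fields([40, 40, 5, 5, 2, 2, 2, 2, 1, 1])
for row in rows:
    print(row)
```

    With these sample counts, the second row has a CUMU_LPERC of 20 and a CUMU_PERC of 80, which is the textbook 80-20 pattern: the two largest of ten locations account for 80 of the 100 incidents.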

    The following fields will be added to the output when the Aggregation Method parameter is set to Closest Feature:

    • ICOUNT—The number of points found closest to the line or polygon features.
    • PERC—The percentage of the total number of points found near or within the line or polygon features.
    • CUMU_PERC—The cumulative percentage of the current feature and all other features with greater counts, calculated using the ICOUNT value.
    • CUMU_LPERC—The cumulative percentage of the current feature and all other features with greater counts, calculated using the total number of output line or polygon features.
    • INC_KM—The number of incident points per kilometer. This field is added to the output when the Input Comparison Features parameter value contains lines.
    • INC_MI—The number of incident points per mile. This field is added to the output when the Input Comparison Features parameter value contains lines.
    • INC_SQKM—The number of incident points per square kilometer. This field is added to the output when the Input Comparison Features parameter value contains polygons.
    • INC_SQMI—The number of incident points per square mile. This field is added to the output when the Input Comparison Features parameter value contains polygons.

    The CUMU_PERC and CUMU_LPERC values can be used to determine if a disproportionate number of line or polygon features represent a larger proportion of crimes, for example, 20 percent of line or polygon features contain 80 percent of total points.
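
    The rate fields in the list above normalize the incident count by the size of the comparison feature. The following sketch shows that arithmetic, assuming each rate is the ICOUNT value divided by the feature's length or area. The function names are hypothetical, and lengths and areas are assumed to be available in meters and square meters.

```python
METERS_PER_KM = 1000.0
METERS_PER_MILE = 1609.344  # international mile


def incidents_per_length(icount, length_m):
    """Return INC_KM and INC_MI for a line feature of the given length in meters."""
    return {
        "INC_KM": icount / (length_m / METERS_PER_KM),
        "INC_MI": icount / (length_m / METERS_PER_MILE),
    }


def incidents_per_area(icount, area_sq_m):
    """Return INC_SQKM and INC_SQMI for a polygon of the given area in square meters."""
    return {
        "INC_SQKM": icount / (area_sq_m / METERS_PER_KM ** 2),
        "INC_SQMI": icount / (area_sq_m / METERS_PER_MILE ** 2),
    }


# Hypothetical features: a 500 m street segment with 12 incidents
# and a 2 square kilometer neighborhood with 30 incidents.
street_rates = incidents_per_length(12, 500.0)
area_rates = incidents_per_area(30, 2_000_000.0)
print(street_rates)
print(area_rates)
```

    Rates like these make short, busy street segments comparable with long ones, which a raw count alone does not.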

    Records in the output are sorted based on generated ICOUNT (incident count), CUMU_PERC (cumulative percentage), PERC (incident percentage), and CUMU_LPERC (cumulative location percentage) field values.

  • The output feature class is symbolized by the ICOUNT field.

  • The output point feature class is symbolized by a graduated symbol layer based on the number of incidents occurring at each location.

Parameters

Label | Explanation | Data Type
Input Point Features

The input point features that will be used to create clusters, lines, or polygons.

Feature Layer
Output Feature Class

The output feature class.

When the Aggregation Method parameter is set to Cluster, the output will be a point feature class.

When the Aggregation Method parameter is set to Closest Feature, the geometry type of the output will be the same as the Input Comparison Features parameter value.

Feature Class
Cluster Tolerance
(Optional)

The maximum distance separating points at which they will be considered part of the same cluster.

If no cluster tolerance is specified, the tool will create a cluster where point features overlap.

This parameter is active when the Aggregation Method parameter is set to Cluster.

Linear Unit
Output Fields
(Optional)

The fields from the input features that will be transferred to the output.

Field
Aggregation Method
(Optional)

Specifies how the input point features will be aggregated.

  • Cluster: The input point features will be clustered. This is the default.
  • Closest Feature: The input point features will be aggregated to the closest comparison polygon or line feature.
String
Input Comparison Features
(Optional)

The comparison input polygon or line feature class by which the Input Point Features parameter value is aggregated.

This parameter is active when the Aggregation Method parameter is set to Closest Feature.

Feature Layer

arcpy.ca.EightyTwentyAnalysis(in_features, out_feature_class, {cluster_tolerance}, {out_fields}, {aggregation_method}, {in_comparison_features})
Name | Explanation | Data Type
in_features

The input point features that will be used to create clusters, lines, or polygons.

Feature Layer
out_feature_class

The output feature class.

When the aggregation_method parameter is set to POINT_CLUSTER, the output will be a point feature class.

When the aggregation_method parameter is set to CLOSEST_FEATURE, the geometry type of the output will be the same as the in_comparison_features parameter value.

Feature Class
cluster_tolerance
(Optional)

The maximum distance separating points at which they will be considered part of the same cluster.

If no cluster tolerance is specified, the tool will create a cluster where point features overlap.

This parameter is enabled when the aggregation_method parameter is set to POINT_CLUSTER.

Linear Unit
out_fields
[out_fields,...]
(Optional)

The fields from the input features that will be transferred to the output.

Field
aggregation_method
(Optional)

Specifies how the input point features will be aggregated.

  • POINT_CLUSTER: The input point features will be clustered. This is the default.
  • CLOSEST_FEATURE: The input point features will be aggregated to the closest comparison polygon or line feature.
String
in_comparison_features
(Optional)

The comparison input polygon or line feature class by which the in_features parameter value is aggregated.

This parameter is enabled when the aggregation_method parameter is set to CLOSEST_FEATURE.

Feature Layer

Code sample

EightyTwentyAnalysis example 1 (Python window)

The following Python window script demonstrates how to use the EightyTwentyAnalysis function in immediate mode.

import arcpy
arcpy.env.workspace = r"C:/data/city_pd.gdb"
arcpy.ca.EightyTwentyAnalysis("CallsForService", "80_20_clusters")

EightyTwentyAnalysis example 2 (stand-alone script)

The following Python script demonstrates how to use the EightyTwentyAnalysis function in a stand-alone script.


# Name: EightyTwentyAnalysis.py
# Description: Conducts an 80/20 analysis of 911 calls to determine clusters of calls within 50 meters of each other.

# Import system modules
import arcpy

# Set environment settings
arcpy.env.workspace = r"C:\data\city_pd.gdb"

# Set local variables
in_features = "CallsForService"
out_feature_class = "80_20_clusters"
cluster_tolerance = "50 Meters"
out_fields = ["FULLADDR", "RESCITY", "RESSTATE", "RESZIP5"]

# Run EightyTwentyAnalysis
arcpy.ca.EightyTwentyAnalysis(in_features,
                              out_feature_class,
                              cluster_tolerance,
                              out_fields)

EightyTwentyAnalysis example 3 (stand-alone script)

The following Python script demonstrates how to use the EightyTwentyAnalysis function in a stand-alone script to determine street segments with a disproportionate number of crimes nearby. In crime analysis, this type of analysis is sometimes described as finding hot streets.

# Name: EightyTwentyAnalysisStreets.py
# Description: Conducts an 80/20 analysis of calls for service to determine street segments with a disproportionate number of crimes nearby.

# Import system modules
import arcpy

# Set environment settings
arcpy.env.workspace = r"C:\data\city_pd.gdb"

# Set local variables
in_features = "CallsForService"
out_feature_class = "80_20_streets"
comp_features = "city_centerlines"
out_fields = ["STREET_NAME", "L_ADDR_NUM", "R_ADDR_NUM"]

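# Run EightyTwentyAnalysis
# aggregation_method must be CLOSEST_FEATURE for in_comparison_features to be used
arcpy.ca.EightyTwentyAnalysis(in_features,
                              out_feature_class,
                              out_fields=out_fields,
                              aggregation_method="CLOSEST_FEATURE",
                              in_comparison_features=comp_features)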