Summarize Within (GeoAnalytics Desktop)—ArcGIS AllSource

Summary

Overlays a polygon layer with another layer to summarize the number of points, length of the lines, or area of the polygons within each polygon and calculates attribute field statistics for those features within the polygons.

The following are example scenarios using Summarize Within:

- Given watershed boundaries and land-use boundaries by land-use type, calculate total acreage of land-use type for each watershed.
- Given county parcels and city boundaries, summarize the average value of vacant parcels within each city boundary.
- Given counties and roads, summarize the total mileage of roads by road type within each county.

Illustration

Examples of summarizing points within polygons (first row), lines within polygons (second row), and polygons within polygons (third row) are shown.

Usage

In basic terms, the Summarize Within tool takes two layers, the input polygons and the input summary features, and stacks them on top of each other. You can view down through the stack and count the number of input summary features that fall within the input polygons. You can also calculate simple statistics about the attributes of the input summary features, such as sum, mean, minimum, maximum, and so on.
Use Summarize Within to calculate standard statistics as well as geographically weighted statistics. Standard statistics summarize the statistical values without weighting. Weighted statistics calculate values using the geographically weighted attributes of lines within a polygon, or the attributes of polygons within a polygon. Weighted statistics do not apply to points within polygons.
Standard statistics and geographically weighted statistics can be calculated for attributes that represent either counts or rates. These are defined as follows:
- Counts—Attributes that represent a sum or quantity of an entity at a point location, along a line, or within a polygon. Examples of count-type attributes include the population of a country, the number of taxi pickups in a census block, and the number of dams along a river. For line and polygon features, counts are proportioned before calculating standard or weighted statistics.
- Rates—Attributes that represent a ratio or index at a point location, along a line, or within a polygon. Examples of rate-type attributes include the population density of a country, the speed limit of a road, or the walkability score of a neighborhood. Rates are never proportioned.
For count-type attributes, values are proportioned according to the amount of the line within a polygon or the amount of the polygon within another polygon prior to calculating statistics. Statistics are calculated the same way for count-type and rate-type attributes when the summary features are points.
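As a rough illustration of proportioning, the sketch below assumes that a count-type value scales with the fraction of a line or polygon that falls inside the summary polygon, while a rate-type value is used unchanged. The helper name and the numbers are hypothetical, and the tool's internal calculation may differ in detail.

```python
def proportioned_value(value, fraction_within, attribute_type):
    """Return the value used for statistics for one summarized feature.

    fraction_within is the share (0 to 1) of the feature's length or area
    that falls inside the summary polygon.
    Hypothetical helper for illustration; it is not part of arcpy.
    """
    if attribute_type == "COUNT":
        # Counts are proportioned by the amount of the feature within the polygon
        return value * fraction_within
    if attribute_type == "RATE":
        # Rates are never proportioned
        return value
    raise ValueError(f"Unknown attribute type: {attribute_type}")


# A river with 10 dams recorded along it, 40% of whose length lies in the polygon
dams_in_polygon = proportioned_value(10, 0.4, "COUNT")

# A road with a speed limit of 55, 40% of whose length lies in the polygon
speed_limit_in_polygon = proportioned_value(55, 0.4, "RATE")

print(dams_in_polygon, speed_limit_in_polygon)
```

For point features the fraction is effectively 1, which is why count-type and rate-type attributes give the same statistics for points.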

You can calculate the lengths and areas of the summarized layers within each polygon using the options in the table below. Options are based on the geometry of the summarized layer.


Input feature	Description	Option
Points	The count of summarized points within each polygon	None
Lines	The length of summarized lines within each polygon	Statute Miles, International Yards, International Feet, US Survey Miles, US Survey Yards, US Survey Feet, Kilometers, Meters
Areas	The area of summarized polygons within each polygon	Square Statute Miles, Square International Yards, Square International Feet, Square Kilometers, Square Meters, Hectares, International Acres, Square US Survey Miles, Square US Survey Yards, Square US Survey Feet, US Survey Acres

For lines and areas, all weighted statistics will be calculated. Both the standard summary field statistics and the weighted summary field statistics will be applied to data for the features in the Summarized Layer parameter value that intersect the Summary Polygons layer. The weighted summary field statistics use a weight based on the proportion of each feature in the Summarized Layer parameter value that falls within the Summary Polygons parameter value.
For standard statistics, there are eight options: count, sum, mean, minimum, maximum, range, standard deviation, and variance. There are two options for string statistics: count and any. There are three weighted statistics that are calculated on numeric fields in the layer to be summarized: mean, standard deviation, and variance.
Weighted statistics are not calculated for string data. Each time Field and Statistic values are specified, a row is added to the tool pane so more than one statistic can be calculated. You can view the summarized results in the result layer's table or pop-ups. By default, the count of features intersecting the Summary Polygons values is always calculated.
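One way to read the weighted mean is as a weighted average in which each feature's weight is the proportion of that feature inside the polygon. The sketch below assumes that formula. The exact computation the tool performs is not spelled out here, and the function and variable names are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average of values; returns None when all weights are zero.

    Assumes each weight is the fraction (0 to 1) of a summarized feature
    that falls within the summary polygon.
    """
    total_weight = sum(weights)
    if total_weight == 0:
        return None
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Three river segments with a Width attribute and the fraction of each
# segment that lies inside one summary polygon (illustrative numbers)
widths = [10.0, 20.0, 40.0]
fractions = [1.0, 0.5, 0.25]

standard_mean = sum(widths) / len(widths)
weighted = weighted_mean(widths, fractions)

print(standard_mean, weighted)
```

The segment that lies mostly outside the polygon contributes less to the weighted result, so the weighted mean here is lower than the standard mean.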
Analysis with binning requires that the input is projected or that the output coordinate system is set to a projected coordinate system. If the data is not in a projected coordinate system and you do not set one, a projection will be used based on the extent of the data you are analyzing.
You can provide a Group By Field value so statistics will be calculated separately for each unique attribute value. When a Group By Field value is provided, a summary table listing each feature and statistic by Group By Field value will be created.
The Add Minority and Majority Attributes and Add Group Percentages parameters are available when a Group By Field value is provided. The minority and majority will be the least and most dominant value from the Group By Field parameter, respectively, in which dominance is determined using the count of points, total length, or total area of each value.
When the Add Minority and Majority Attributes parameter is checked, two fields will be added to the result layer. The fields will list the values from the Group By Field parameter that are the minority and majority for each result feature.
The Add Group Percentages parameter is only available when Add Minority and Majority Attributes is checked. When the Add Group Percentages parameter is checked, two fields will be added to the result layer listing the percentage of the count of points, total length, or total area that belong to the minority and majority values for each feature. A percentage field will also be added to the result table listing the percentage of the count of points, total length, or total area that belong to all values from the Group By Field parameter for each feature.
The output feature layer is always a polygon layer. Only polygons that intersect a summarized layer will be returned. Other polygons will be completely removed from the result layer.
The input point and polygon features (the first image) and the resulting area features (the second image) are shown.

The following fields are included in the output polygon features:


Field name	Description
count	The count of summarized features that intersect each polygon.
sum_length_<linearunit>, or sum_area_<areaunit>	The total length of lines within the polygon or total area of summarized polygon within each polygon. These values are returned when Add shape summary attributes is checked and are returned in the specified unit.
<statistic>_<fieldname>	Specified statistics will each create an attribute field named in the following format: <statistic>_<fieldname>. For example, the maximum and standard deviation of the id field are MAX_id and SD_id.
p<statistic>_<fieldname>	Specified weighted statistics will each create an attribute field named in the following format: p<statistic>_<fieldname>. For example, the weighted mean of the id field is pMEAN_id.
minority_<fieldname>	This value is returned when you create a group-by table and Add Minority and Majority Attributes is checked. This represents the value of the specified field that is the minority in each polygon. For example, there are five points in a polygon with a field called color and values of red, blue, blue, green, green. If you create a group by the color field, the value for the minority_color field is red.
majority_<fieldname>	This value is returned when you create a group-by table and Add Minority and Majority Attributes is checked. This represents the value of the specified field that is the majority in each polygon. For example, there are five points in a polygon with a field called color and values of red, blue, blue, green, green. If you create a group by the color field, the value for the majority_color field is blue;green.
minority_<fieldname>_percent	This value is returned when you create a group-by table and Add Group Percentages is checked. This represents the percentage of the count that belongs to the minority value of the specified field in each polygon. For example, there are five points in a polygon with a field called color and values of red, blue, blue, green, green. If you create a group by the color field, the value for the minority_color_percent field is 20 (calculated as 1/5).
majority_<fieldname>_percent	This value is returned when you create a group-by table and Add Group Percentages is checked. This represents the percentage of the count that belongs to the majority value of the specified field in each polygon. For example, there are five points in a polygon with a field called color and values of red, blue, blue, green, green. If you create a group by the color field, the value for the majority_color_percent field is 40 (calculated as 2/5).
join_id	This value is returned when you create a group-by table. This is an ID to link features to the group-by table. Every join_id field corresponds to one or more rows in the group-by table.
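
The five-point color example in the table can be reproduced with a short sketch. It assumes that dominance is measured by point count, that tied values are joined with a semicolon, and that ties are listed alphabetically. The last two points are inferred from the blue;green example rather than documented behavior, and the function name is hypothetical.

```python
from collections import Counter


def minority_majority(values):
    """Return minority and majority values and their percentages by count.

    Hypothetical helper; tie handling (semicolon-joined, alphabetical)
    is an assumption based on the documented example.
    """
    counts = Counter(values)
    total = sum(counts.values())
    low = min(counts.values())
    high = max(counts.values())
    minority = ";".join(sorted(v for v, c in counts.items() if c == low))
    majority = ";".join(sorted(v for v, c in counts.items() if c == high))
    return {
        "minority": minority,
        "majority": majority,
        "minority_percent": 100 * low / total,
        "majority_percent": 100 * high / total,
    }


result = minority_majority(["red", "blue", "blue", "green", "green"])
print(result)
```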

The following fields are included in the output group-by table:


Field name	Description
join_id	This is an ID to link features to the polygon layer. Each polygon will have one or more features with the same ID that represent all of the group-by values. For example, there are five points in a polygon with a field called color and values of red, blue, blue, green, green. The group-by table will have three rows representing that polygon (same join ID), one for each color: red, blue, and green.
count	The count of the specified group within the joined polygon. For example, red is 1 for the selected polygon.
<statistic>_<fieldname>	Any specified statistic calculated for each group.
p<statistic>_<fieldname>	Any specified weighted statistic calculated for each group.
percentcount	The percentage each group contributes to the total count in the polygon. Using the example above, red contributes 1/5 = 20, blue contributes 2/5 = 40, and green contributes 2/5 = 40.
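
As a minimal sketch of how the group-by rows for one polygon relate to its join_id, the code below builds one row per color for the five-point example. The `group` key and the function name are hypothetical stand-ins. In the actual table, the group column takes its name from the Group By Field value.

```python
from collections import Counter


def group_by_rows(join_id, values):
    """Build one row per unique group value for a single polygon."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        {
            "join_id": join_id,
            "group": group,  # hypothetical column name
            "count": n,
            "percentcount": 100 * n / total,
        }
        for group, n in counts.items()
    ]


rows = group_by_rows(1, ["red", "blue", "blue", "green", "green"])
for row in rows:
    print(row)
```

All three rows share the same join_id, which is what links them back to the single polygon in the output feature class.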

You can improve the performance of the Summarize Within tool by doing one or more of the following:
- Set the extent environment so only data of interest will be analyzed.
- Larger bins will perform better than smaller bins. If you are using bins and are unsure which size to use, start with a larger bin to prototype.
- Use data that is local to where the analysis is being run.
This geoprocessing tool is powered by Spark. Analysis is completed on your desktop machine using multiple cores in parallel. See Considerations for GeoAnalytics Desktop tools to learn more about running analysis.
When running GeoAnalytics Desktop tools, the analysis is completed on your desktop machine. For optimal performance, data should be available on your desktop. If you are using a hosted feature layer, it is recommended that you use ArcGIS GeoAnalytics Server. If your data isn't local, it will take longer to run a tool. To use your ArcGIS GeoAnalytics Server to perform analysis, see GeoAnalytics Tools.
Similar analysis can also be completed using the Summarize Within tool in the Standard Feature Analysis toolbox in ArcGIS AllSource.

Parameters

Label	Explanation	Data Type
Summarized Layer	The point, line, or polygon features that will be summarized by either polygons or bins.	Feature Layer
Output Feature Class	The name of the output feature class that will contain the intersecting geometries and attributes.	Feature Class
Polygon or Bin	Specifies whether the Summarized Layer value will be summarized by polygons or bins. Polygon—The summarized layer will be aggregated into a polygon dataset. Bin—The summarized layer will be aggregated into square or hexagonal bins that are generated when the tool is run.	String
Bin Type	Specifies the bin shape that will be generated to summarize features. Square—The Bin Size value represents the height of a square. This is the default. Hexagon—The Bin Size value represents the height between two parallel sides.	String
Bin Size (Optional)	The distance interval that represents the bin size and units by which the input features will be summarized.	Linear Unit
Summary Polygons (Optional)	The polygons that will be used to summarize the features in the input summarized layer.	Feature Layer
Add Shape Summary Attributes	Specifies whether the length of lines or area of polygons within the summary layer (polygon or bin) will be calculated. The count of points, lines, and polygons intersecting the summary shape will always be included. Checked—Summary shape values will be calculated. This is the default. Unchecked—Summary shape values will not be calculated.	Boolean
Shape Units (Optional)	Specifies the unit of measurement that will be used to calculate shape summary attributes. If the input summary features are points, a shape unit is unnecessary, since only the count of points within each input polygon is added. If the input summary features are lines, specify a linear unit. If the input summary features are polygons, specify an areal unit. Meters—The shape units will be meters. Kilometers—The shape units will be kilometers. US Survey Feet—The shape units will be US survey feet. US Survey Yards—The shape units will be US survey yards. US Survey Miles—The shape units will be US survey miles. US Survey Nautical Miles—The shape units will be US survey nautical miles. International Feet—The shape units will be international feet. International Yards—The shape units will be international yards. Statute Miles—The shape units will be statute miles. International Nautical Miles—The shape units will be international nautical miles. International Acres—The shape units will be international acres. Hectares—The shape units will be hectares. Square Meters—The shape units will be square meters. Square Kilometers—The shape units will be square kilometers. Square International Feet—The shape units will be square international feet. Square International Yards—The shape units will be square international yards. Square Statute Miles—The shape units will be square statute miles. Square US Survey Feet—The shape units will be square US survey feet. Square US Survey Yards—The shape units will be square US survey yards. Square US Survey Miles—The shape units will be square US survey miles. US Survey Acres—The shape units will be US survey acres.	String
Standard Summary Fields (Optional)	The statistics that will be calculated on specified fields. Specifies whether a field represents a count or a rate. COUNT—For line and polygon layers, the summarized field values will be proportioned by the percentage of the summarized features that intersect the summary polygons prior to calculating statistics. Values will not be proportioned for point layers. RATE—The summarized field values will not be proportioned. The raw field values will be used to calculate statistics.	Value Table
Weighted Summary Fields (Optional)	Specifies the weighted statistics that will be calculated on specified fields. Mean—The weighted mean of each field will be calculated in which the weight applied is the proportion of the summarized layer within the polygons. Standard deviation—The weighted standard deviation of each field will be calculated in which the weight applied is the proportion of the summarized layer within the polygons. Variance—The weighted variance of each field will be calculated in which the weight applied is the proportion of the summarized layer within the polygons. Specifies whether a field represents a count or a rate. Count—The summarized field values will be proportioned by the percentage of the summarized features that intersect the summary polygons prior to calculating statistics. Rate—The summarized field values will not be proportioned. The raw field values will be used to calculate statistics.	Value Table
Group By Field (Optional)	A field from the input summary features that will be used to calculate statistics for each unique attribute value. For example, the input summary features contain point locations of businesses that store hazardous materials, and one of the fields is HazardClass, which contains codes that describe the type of hazardous material stored. To calculate summaries by each unique value of HazardClass, use it as the group-by field.	Field
Add Minority and Majority Attributes (Optional)	Specifies whether minority (least dominant) and majority (most dominant) attribute values for each group field within each boundary will be added. When this parameter is checked, two new fields will be added to the output layer prefixed with Minority_ and Majority_. This parameter only applies when a value is provided for the Group By Field parameter. Unchecked—Minority and majority fields will not be added. This is the default. Checked—Minority and majority fields will be added.	Boolean
Add Group Percentages (Optional)	Specifies whether percentage fields will be added. When this parameter is checked, the percentage of each unique group value will be calculated for each input polygon. This parameter only applies when a value is provided for the Group By Field parameter and a value is specified for the Add Minority and Majority Attributes parameter. Unchecked—Percentage fields will not be added. This is the default. Checked—Percentage fields will be added.	Boolean
Group By Summary Table (Optional)	The output table that will contain the group by summaries.	Table

arcpy.gapro.SummarizeWithin(summarized_layer, out_feature_class, polygon_or_bin, bin_type, {bin_size}, {summary_polygons}, sum_shape, {shape_units}, {standard_summary_fields}, {weighted_summary_fields}, {group_by_field}, {add_minority_majority}, {add_percentages}, {group_by_summary})

Name	Explanation	Data Type
summarized_layer	The point, line, or polygon features that will be summarized by either polygons or bins.	Feature Layer
out_feature_class	The name of the output feature class that will contain the intersecting geometries and attributes.	Feature Class
polygon_or_bin	Specifies whether the summarized_layer value will be summarized by polygons or bins. POLYGON—The summarized layer will be aggregated into a polygon dataset. BIN—The summarized layer will be aggregated into square or hexagonal bins.	String
bin_type	Specifies the bin shape that will be generated to summarize features. SQUARE—The bin_size value represents the height of a square. This is the default. HEXAGON—The bin_size value represents the height between two parallel sides.	String
bin_size (Optional)	The distance interval that represents the bin size and units by which the input features will be summarized.	Linear Unit
summary_polygons (Optional)	The polygons that will be used to summarize the features in the input summarized layer.	Feature Layer
sum_shape	Specifies whether the length of lines or area of polygons within the summary layer (polygon or bin) will be calculated. The count of points, lines, and polygons intersecting the summary shape will always be included. ADD_SUMMARY—Summary shape values will be calculated. This is the default. NO_SUMMARY—Summary shape values will not be calculated.	Boolean
shape_units (Optional)	Specifies the unit of measurement that will be used to calculate shape summary attributes. If the input summarized_layer value is points, no shape unit is necessary, since only the count of points within each input polygon is added. If the input summary features are lines, specify a linear unit. If the input summary features are polygons, specify an areal unit. METERS—The shape units will be meters. KILOMETERS—The shape units will be kilometers. FEET—The shape units will be US survey feet. YARDS—The shape units will be US survey yards. MILES—The shape units will be US survey miles. NAUTICAL_MILES—The shape units will be US survey nautical miles. FEET_INT—The shape units will be international feet. YARDS_INT—The shape units will be international yards. MILES_INT—The shape units will be statute miles. NAUTICAL_MILES_INT—The shape units will be international nautical miles. ACRES—The shape units will be international acres. HECTARES—The shape units will be hectares. SQUARE_METERS—The shape units will be square meters. SQUARE_KILOMETERS—The shape units will be square kilometers. SQUARE_FEET—The shape units will be square international feet. SQUARE_YARDS—The shape units will be square international yards. SQUARE_MILES—The shape units will be square statute miles. SQUARE_FEET_US—The shape units will be square US survey feet. SQUARE_YARDS_US—The shape units will be square US survey yards. SQUARE_MILES_US—The shape units will be square US survey miles. ACRES_US—The shape units will be US survey acres.	String
standard_summary_fields [standard_summary_fields,...] (Optional)	The statistics that will be calculated on specified fields. COUNT—The number of nonnull values. It can be used on numeric fields or strings. The count of [null, 0, 2] is 2. SUM—The sum of numeric values in a field. The sum of [null, null, 3] is 3. MEAN—The mean of numeric values. The mean of [0, 2, null] is 1. MIN—The minimum value of a numeric field. The minimum of [0, 2, null] is 0. MAX—The maximum value of a numeric field. The maximum value of [0, 2, null] is 2. STDDEV—The standard deviation of a numeric field. The standard deviation of [1] is null. The standard deviation of [null, 1, 1, 1] is null. VAR—The variance of a numeric field. The variance of [1] is null. The variance of [null, 1, 1, 1] is null. RANGE—The range of a numeric field. This is calculated as the minimum value subtracted from the maximum value. The range of [0, null, 1] is 1. The range of [null, 4] is 0. ANY—A sample string from a field of type string. Specifies whether a field represents a count or a rate. COUNT—For line and polygon layers, the summarized field values will be proportioned by the percentage of the summarized features that intersect the summary polygons prior to calculating statistics. Values will not be proportioned for point layers. RATE—The summarized field values will not be proportioned. The raw field values will be used to calculate statistics.	Value Table
weighted_summary_fields [weighted_summary_fields,...] (Optional)	Specifies the weighted statistics that will be calculated on specified fields. MEAN—The weighted mean of each field will be calculated in which the weight applied is the proportion of the summarized layer within the polygons. STDDEV—The weighted standard deviation of each field will be calculated in which the weight applied is the proportion of the summarized layer within the polygons. VAR—The weighted variance of each field will be calculated in which the weight applied is the proportion of the summarized layer within the polygons. Specifies whether a field represents a count or a rate. COUNT—The summarized field values will be proportioned by the percentage of the summarized features that intersect the summary polygons prior to calculating statistics. RATE—The summarized field values will not be proportioned. The raw field values will be used to calculate statistics.	Value Table
group_by_field (Optional)	A field from the input summary features that will be used to calculate statistics for each unique attribute value. For example, the input summary features contain point locations of businesses that store hazardous materials, and one of the fields is HazardClass, which contains codes that describe the type of hazardous material stored. To calculate summaries by each unique value of HazardClass, use it as the group-by field.	Field
add_minority_majority (Optional)	Specifies whether minority (least dominant) and majority (most dominant) attribute values for each group field within each boundary will be added. When this parameter value is ADD_MIN_MAJ, two new fields will be added to the output layer prefixed with Minority_ and Majority_. This parameter only applies when a value is provided for the group_by_field parameter. NO_MIN_MAJ—Minority and majority fields will not be added. This is the default. ADD_MIN_MAJ—Minority and majority fields will be added.	Boolean
add_percentages (Optional)	Specifies whether percentage fields will be added. When this parameter value is ADD_PERCENT, the percentage of each unique group value will be calculated for each input polygon. This parameter only applies when a value is provided for the group_by_field parameter and a value is specified for the add_minority_majority parameter. NO_PERCENT—Percentage fields will not be added. This is the default. ADD_PERCENT—Percentage fields will be added.	Boolean
group_by_summary (Optional)	The output table that will contain the group by summaries.	Table
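
The null-handling examples given for standard_summary_fields can be mirrored in plain Python, which may help when checking results by hand. This is only a sketch of the documented examples, not the tool's implementation. The function name is hypothetical, and the all-null case is an assumption, since the text does not cover it. STDDEV and VAR are left out.

```python
def summarize(values):
    """Compute COUNT, SUM, MEAN, MIN, MAX, and RANGE, ignoring nulls (None)."""
    present = [v for v in values if v is not None]
    if not present:
        # Assumption: with no nonnull values, only COUNT is defined
        return {"COUNT": 0, "SUM": None, "MEAN": None,
                "MIN": None, "MAX": None, "RANGE": None}
    return {
        "COUNT": len(present),
        "SUM": sum(present),
        "MEAN": sum(present) / len(present),
        "MIN": min(present),
        "MAX": max(present),
        # Minimum value subtracted from the maximum value
        "RANGE": max(present) - min(present),
    }


print(summarize([None, 0, 2]))
print(summarize([None, 4]))
```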

Code sample

SummarizeWithin example (Python window)

The following Python window script demonstrates how to use the SummarizeWithin function.

# Name: SummarizeWithin.py
# Description: Summarize river polylines by basins.

# Import system modules
import arcpy

arcpy.env.workspace = "C:/data/RedRiver_basin.gdb"

# Set local variables
summarizedLayer = "Rivers"
summaryPolys = "Basins"
summaryStatistics = [["Width", "MEAN", "RATE"]]
weightedSummaryStatistics = [["DOC", "STDDEV", "COUNT"]]
out = 'SummarizedRivers'


# Run SummarizeWithin
arcpy.gapro.SummarizeWithin(summarizedLayer, out, "POLYGON", None, 
                            None, summaryPolys, "ADD_SUMMARY", 
                            "KILOMETERS", summaryStatistics, 
                            weightedSummaryStatistics)
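
A second, hedged sketch shows how the same function might be called to summarize into hexagonal bins with a group-by field. The keyword names come from the syntax above. The output names, the RiverClass field, the bin size, and the choice of coordinate system are hypothetical. The call runs only where arcpy is installed, so the parameter dictionary can be inspected elsewhere.

```python
try:
    import arcpy  # available only in an ArcGIS Python environment
except ImportError:
    arcpy = None

# Hypothetical dataset and field names, used for illustration only
params = {
    "summarized_layer": "Rivers",
    "out_feature_class": "RiversByBin",
    "polygon_or_bin": "BIN",
    "bin_type": "HEXAGON",
    "bin_size": "10 Kilometers",
    "sum_shape": "ADD_SUMMARY",
    "shape_units": "KILOMETERS",
    "standard_summary_fields": [["Width", "MEAN", "RATE"]],
    "group_by_field": "RiverClass",
    "add_minority_majority": "ADD_MIN_MAJ",
    "add_percentages": "ADD_PERCENT",
    "group_by_summary": "RiversByBin_groups",
}

if arcpy is not None:
    arcpy.env.workspace = "C:/data/RedRiver_basin.gdb"
    # Binning requires a projected coordinate system; 3857 (Web Mercator)
    # is only an example choice and may not suit your study area
    arcpy.env.outputCoordinateSystem = arcpy.SpatialReference(3857)
    arcpy.gapro.SummarizeWithin(**params)
```

Because polygon_or_bin is BIN, summary_polygons is omitted. ADD_PERCENT is paired with ADD_MIN_MAJ because group percentages apply only when minority and majority attributes are requested.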

Environments

Output Coordinate System, Extent, Current Workspace
