Available in big data analytics.
The Summarize Within tool calculates statistics for the features of an input layer that fall within or overlap a boundary layer. The boundaries can come from an area layer or from hexagonal or square bins.
Workflow diagram
Examples
 A cable provider is starting a pilot program that provides low-cost internet access to low-income community college students. Summarize Within can be used with bins to determine the number of low-income students within square bins of a defined size, so the cable provider can choose an appropriate region for its pilot program.
 To complete routine maintenance projects efficiently, the city uses Summarize Within to count the street lights and to sum the miles of bike lanes within each maintenance assessment district. It can then estimate the material and staff needed to complete the work in each district.
Usage notes
 The input layer to be summarized can be a point, line, or polygon layer.
 The output layer is always a polygon area or bin layer, and only the area or bin features where summarized features occur are returned.
 You can think of Summarize Within as taking two layers, the area features and the input summary features, and stacking them on top of each other. After stacking these layers, you view down through the stack and count the number of input summary features that fall within the areas. In addition to the number of features, you can also calculate simple statistics about the attributes of the input summary features, such as sum, mean, minimum, maximum, and so on.
 You can use Summarize Within to calculate standard statistics and geographically weighted statistics. Standard statistics summarize the statistical values without weighting. Weighted statistics calculate values using the geographically weighted values of the proportion of lines within a polygon, or proportion of polygons within a polygon. Weighted statistics do not apply to points within polygons.
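The stacking idea described above can be sketched in plain Python. This is a conceptual illustration only, not the tool's implementation: the function names, the ray-casting point-in-polygon test, and the sample districts and street lights are all hypothetical.

```python
from statistics import mean

def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def summarize_points_within(areas, points, field):
    """Count points per area and compute simple statistics on one attribute."""
    results = {}
    for name, ring in areas.items():
        values = [p[field] for p in points
                  if point_in_polygon(p["x"], p["y"], ring)]
        if not values:
            continue  # areas with no summarized features are not returned
        results[name] = {
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
    return results

# Hypothetical data: three square districts and four street lights.
areas = {
    "District A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "District B": [(10, 0), (20, 0), (20, 10), (10, 10)],
    "District C": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
points = [
    {"x": 2, "y": 3, "watts": 100},
    {"x": 7, "y": 8, "watts": 150},
    {"x": 12, "y": 5, "watts": 200},
    {"x": 5, "y": 5, "watts": 50},
]
summary = summarize_points_within(areas, points, "watts")
print(summary)
```

District C contains no points, so it is absent from the result, mirroring the note that only areas or bins where summarized features occur are returned.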
How Summarize Within works
Equations
For summarized line and area features, weighted statistics incorporate Summary Area weights. None of the statistics for point features are weighted. The following table shows the equations used to calculate variance, the weighted mean, and the weighted standard deviation.
Statistic  Equation  Variables  Features 

Variance  Points  
Weighted Mean  Weights are calculated as the percentage of the feature within the summary area.  Lines and Areas  
Weighted Standard Deviation  Weights are calculated as the percentage of the feature within the summary area.  Lines and Areas 
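As a rough illustration of the weighted statistics named in the table, the sketch below computes a weighted mean and a weighted standard deviation. The weighted mean is the usual sum of weight times value divided by the sum of weights. The standard deviation shown divides by the sum of the weights, which is an assumption; the tool may use a different denominator. The function names and sample values are hypothetical.

```python
import math

def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    total_weight = sum(weights)
    return sum(w * x for x, w in zip(values, weights)) / total_weight

def weighted_std(values, weights):
    """Weighted standard deviation.

    ASSUMPTION: population-style form that divides by the sum of the
    weights; the tool's exact denominator is not confirmed here.
    """
    m = weighted_mean(values, weights)
    total_weight = sum(weights)
    squared = sum(w * (x - m) ** 2 for x, w in zip(values, weights))
    return math.sqrt(squared / total_weight)

# Hypothetical values with weights equal to the fraction of each
# feature that lies within the summary area.
values = [10.0, 20.0]
weights = [0.25, 0.75]
wm = weighted_mean(values, weights)
ws = weighted_std(values, weights)
print(wm, ws)
```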
Points
Point layers are summarized using only the point features that fall within the Summary Area. Weighted statistics cannot be applied when summarizing points.
The figure and table below explain the statistical calculations of a point Summarized Layer within hypothetical areas. The Population field was used to calculate the statistics (Count, Sum, Minimum, Maximum, Range, Mean, Standard Deviation, and Variance) for the layer.
Numeric statistic  Results District A 

Count  Count of:
Sum 
Minimum  Minimum of:
Maximum  Maximum of:
Range 
Mean 
Variance 
Standard Deviation 

String statistic  Results District A 

Count 
Any  = Secondary School 
Note:
The count statistic (for string and numeric fields) counts the number of non-null values. For example, the count of [0, 1, 10, 5, null, 6] is 5. The count of [Primary, Primary, Secondary, null] is 3.
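The two examples in the note can be reproduced with a few lines of Python, using None to stand in for null. The helper name is hypothetical.

```python
def count_non_null(values):
    """Count the entries that are not null (None)."""
    return sum(1 for v in values if v is not None)

numeric_count = count_non_null([0, 1, 10, 5, None, 6])
string_count = count_non_null(["Primary", "Primary", "Secondary", None])
print(numeric_count, string_count)  # prints: 5 3
```

Note that 0 is a valid non-null value and is counted.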
A real-life scenario in which this analysis could be used is determining the total number of students in each school district. Each point represents a school. The Type field gives the type of school (elementary, middle school, or secondary), and a student population field gives the number of students enrolled at each school. The calculations and results are given for District A in the table above. From the results, you can see that District A has 2,568 students. When running the Summarize Within tool, the results would also be given for District B.
Lines
For weighted statistics, line layers are summarized using only the proportions of line features that are within the Summary Area. Standard (non-weighted) statistics summarize any line intersecting the Summary Area. When summarizing lines using weighted statistics, use counts and amounts (rather than rates or indices) so proportional calculations make logical sense in your analysis.
The figure and table below explain the statistical calculations of a line Summarized Layer within a hypothetical Summary Area. The Volume field was used to calculate the statistics (Count, Sum, Minimum, Maximum, Range, Mean, Standard Deviation, and Variance) for the layer. The standard statistics are calculated using lines that intersect the boundary and the weighted statistics are calculated using the proportion of the lines that are within the Summary Area.
Numeric statistics  Standard statistics  Weighted statistics 

Calculating Weights  Not applicable  Weight of the brown line (value = 600); weight of the blue line (value = 1000)
Count  Count of:  Count of:
Sum 
Minimum  Minimum of:  Minimum of:
Maximum  Maximum of:  Maximum of:
Range 
Mean 
Variance 
Standard Deviation 

A real-life scenario in which this analysis could be used is determining the total volume of water in rivers within the boundaries of a state park. Each line represents a river that is partially located inside the park. From the results, you can see that there are 5 miles of rivers within the park and the total volume is 900 units.
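The difference between standard and weighted line statistics can be sketched as follows. The river lengths and volumes are hypothetical and are not the values from the figure; the weight of each line is taken to be the fraction of its length that lies inside the Summary Area.

```python
# Hypothetical rivers: total length, length inside the park, and volume.
rivers = [
    {"total_len": 4.0, "len_inside": 1.0, "volume": 800.0},
    {"total_len": 10.0, "len_inside": 5.0, "volume": 400.0},
]

# Weight = proportion of each line that falls within the summary area.
weights = [r["len_inside"] / r["total_len"] for r in rivers]

# Standard statistics use the full value of every intersecting line.
standard_sum = sum(r["volume"] for r in rivers)
standard_mean = standard_sum / len(rivers)

# Weighted statistics scale each value by its weight.
weighted_sum = sum(w * r["volume"] for w, r in zip(weights, rivers))
weighted_mean = weighted_sum / sum(weights)

# Shape summary: total length of line inside the summary area.
length_inside = sum(r["len_inside"] for r in rivers)

print(weights, standard_sum, weighted_sum, weighted_mean, length_inside)
```

With these inputs the standard sum is 1200, while the weighted sum is 400 because only a quarter of the first river and half of the second lie inside the park.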
Areas
For weighted statistics, area layers are summarized using only the proportions of the area features that are within the Summary Area. Standard (non-weighted) statistics summarize any area that intersects the Summary Area. When summarizing areas using weighted statistics, use counts or amounts (rather than rates or indices) so proportional calculations make logical sense in your analysis.
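A minimal sketch of proportional weighting for areas is shown below. To keep it self-contained, every feature is an axis-aligned rectangle given as (xmin, ymin, xmax, ymax); real polygons would need a geometry library. All names and values are hypothetical.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle; 0 if it is empty or inverted."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)

def overlap_fraction(feature, summary_area):
    """Fraction of the feature's area that lies within the summary area."""
    intersection = (
        max(feature[0], summary_area[0]),
        max(feature[1], summary_area[1]),
        min(feature[2], summary_area[2]),
        min(feature[3], summary_area[3]),
    )
    return rect_area(intersection) / rect_area(feature)

neighborhood = (0.0, 0.0, 10.0, 10.0)

# Hypothetical census blocks with a population count each.
blocks = [
    {"rect": (-5.0, 0.0, 5.0, 10.0), "population": 1000},   # half inside
    {"rect": (2.0, 2.0, 6.0, 6.0), "population": 300},      # fully inside
    {"rect": (8.0, 8.0, 12.0, 12.0), "population": 400},    # quarter inside
    {"rect": (20.0, 20.0, 30.0, 30.0), "population": 999},  # outside
]

weights = [overlap_fraction(b["rect"], neighborhood) for b in blocks]
intersecting = [(w, b) for w, b in zip(weights, blocks) if w > 0]

count = len(intersecting)
standard_population = sum(b["population"] for _, b in intersecting)
weighted_population = sum(w * b["population"] for w, b in intersecting)
print(count, standard_population, weighted_population)
```

The standard sum counts the full population of every intersecting block (1700), while the weighted sum apportions each block by the share of its area inside the neighborhood (900).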
The figure and table below explain the statistical calculations of an area layer within a hypothetical Summary Area. The Population field was used to calculate the statistics (Count, Sum, Minimum, Maximum, Range, Mean, Standard Deviation, and Variance) for the layer. The standard statistics are calculated using areas that intersect the Summary Area, and the weighted statistics are calculated using a proportional weight based on the portion of each summarized area that is contained within the Summary Area.
Numeric statistics  Standard statistics: Results Neighborhood 1  Weighted statistics: Results Neighborhood 1 

Calculating Weights  Weight of the yellow area (value = 3200); weight of the green area (value = 4700); weight of the pink area (value = 1000); weight of the blue area (value = 4500); weight of the orange area (value = 3600)
Count  Count of:  Count of:
Sum 
Minimum  Minimum of:  Minimum of:
Maximum  Maximum of:  Maximum of:
Range 
Mean 
Variance 
Standard Deviation 

Parameters
Parameter  Description  Data type 

Input Layer  The point, line, or polygon features that will be summarized within area features.  Features 
Bin Type  The bin shape that will be used to create the regular bins. Options are Square and Hexagon. If a polygon source is connected to the join port of this tool, this parameter will no longer appear or be required.  String 
Bin Size  The distance interval that represents the bin size into which the input points will be aggregated. For square bins, the bin size represents the height of a square. This is the default. For hexagonal bins, the bin size represents the height between two parallel sides. If a polygon source is connected to the join port of this tool, this parameter will no longer appear or be required.  String 
Summarize Shapes  Specifies whether shape information will be summarized as part of the analysis (length of lines or area of polygons). If the input summary features are points, there is no shape information to summarize. Only the count of points within each area feature is added.  Boolean 
Shape Units  The unit in which to calculate shape summary attributes. If the input summary features are lines, specify a linear unit. If the input summary features are polygons, specify an areal unit.  String 
Summary Fields  The statistics that will be calculated for specified fields. Different statistics are available depending on whether the specified field is a string, numeric, or date field.  String 
Weighted Statistics  The geographically weighted statistics that will be calculated for specified fields. Weighted statistics calculate values using the geographically weighted values of the proportion of lines within a polygon, or proportion of polygons within a polygon. Weighted statistics do not apply to points within polygons. Different statistics are available depending on whether the specified field is a string, numeric, or date field.  String 
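To illustrate how the Bin Type and Bin Size parameters relate to aggregation, the sketch below assigns points to square bins and counts the points in each bin. It is a conceptual example only: it assumes planar coordinates in the same unit as the bin size and a grid anchored at the origin, neither of which is confirmed for the tool, and hexagonal bins are omitted. The function name and sample points are hypothetical.

```python
import math
from collections import Counter

def square_bin_index(x, y, bin_size):
    """Return the (column, row) index of the square bin containing (x, y).

    ASSUMPTION: the grid is anchored at (0, 0) and bin_size is the side
    length (height) of each square, in the same unit as x and y.
    """
    return (math.floor(x / bin_size), math.floor(y / bin_size))

# Hypothetical point locations.
points = [(0.5, 0.5), (1.5, 0.2), (1.9, 0.9), (-0.1, 3.2)]
bin_size = 1.0

# Only bins that contain at least one point appear in the result.
counts = Counter(square_bin_index(x, y, bin_size) for x, y in points)
print(dict(counts))
```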
Output layer
The output layer will contain the following fields in place of the original fields. If you configured summary fields, those fields will also be calculated for the output layer.
Field name  Description  Field type 

COUNT  The number of features from the input layer that were summarized into this polygon bin.  Float64 
sum_length_<units>  If the input layer is a polyline feature, and the Summarize Shapes parameter is set to Yes, the output will generate this field that reports the total length of polyline features within each bin, in the units specified by the Shape Units parameter.  Float64 
sum_area_<units>  If the input layer is a polygon feature, and the Summarize Shapes parameter is set to Yes, the output will generate this field that reports the total area of polygon features within each bin, in the units specified by the Shape Units parameter.  Float64 
Considerations and limitations
Lines and areas are summarized using proportions; therefore, it is best to summarize absolute data (such as population) rather than relative data (such as average income) when lines or areas are being summarized.
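A short worked example shows why absolute data apportions sensibly and relative data does not. The values are hypothetical: a polygon with 25 percent of its area inside the summary area.

```python
weight = 0.25            # share of the polygon inside the summary area
population = 2000        # absolute data: a count
average_income = 52000   # relative data: an average

# Apportioning a count gives a plausible estimate of people inside the area.
apportioned_population = population * weight

# Apportioning an average shrinks it to a value that is not a meaningful
# income for anyone; the average inside the area is still about 52000.
apportioned_income = average_income * weight

print(apportioned_population, apportioned_income)  # prints: 500.0 13000.0
```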