Classification methods

When you apply styles using color or size to show numeric data, you can classify that data—that is, divide it into classes, or groups—and define the ranges and breaks for those classes. For example, you can group population by age data into up to 10 groups (population age 0-9, 10-19, 20-29, and so on), and visually identify those classes on a map.

Depending on the amount of data, you can have up to 10 classes. The more data, the more classes you can add. The way in which you define the class ranges and the class breaks—the high and low values that define each class—determines which features fall into each class and how the layer appears. By changing the classes using different classification methods, you change the appearance of the map. Generally, the goal is to ensure that features with similar values are in the same class.

Natural breaks

The natural breaks classification method (also known as Jenks' optimization) is based on natural groupings inherent in the data. The method identifies class breaks that best group similar values and maximize the differences between classes, for example, when mapping tree height in a national forest. The features are divided into classes whose boundaries are set where there are relatively large differences in the data values.

Because natural breaks classification places clustered values in the same class, this method is good for mapping data values that are not evenly distributed.
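The idea of grouping similar values while separating dissimilar ones can be sketched in Python. The example below is a brute-force illustration, not the Jenks algorithm itself and not the implementation used by any mapping product: it tries every possible way to split the sorted values and keeps the split with the smallest total squared deviation from each class mean. The function names and the tree heights are hypothetical, and the approach is only practical for small datasets.

```python
from itertools import combinations


def sum_squared_deviations(values):
    """Sum of squared differences between each value and the class mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, num_classes):
    """Return the classes (lists of sorted values) that minimize the
    total within-class squared deviation. Exhaustive search."""
    data = sorted(values)
    if not 1 <= num_classes <= len(data):
        raise ValueError("num_classes must be between 1 and len(values)")
    best_cost, best_classes = None, None
    # Each combination lists the indexes at which a new class starts.
    for cuts in combinations(range(1, len(data)), num_classes - 1):
        bounds = (0, *cuts, len(data))
        classes = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_squared_deviations(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes


# Hypothetical tree heights in meters, clustered around three sizes.
tree_heights = [12, 13, 14, 25, 26, 27, 40, 42]
print(natural_breaks(tree_heights, 3))
# [[12, 13, 14], [25, 26, 27], [40, 42]]
```

The breaks fall in the large gaps between 14 and 25 and between 27 and 40, which is where the data values differ most.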

Equal interval

The equal-interval classification method divides the range of attribute values into subranges of equal size. With this classification method, you specify the number of intervals (or subranges), and the data is divided automatically. For example, if you specify three classes for an attribute field with values ranging from 0 to 300, three classes with ranges of 0-100, 101-200, and 201-300 are created.
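The arithmetic behind this division can be sketched in Python. This is a minimal illustration with hypothetical function names, and it assumes that each upper break is inclusive, so that a value of exactly 100 falls in the first class, as in the 0-100, 101-200, 201-300 example.

```python
import math


def equal_interval_breaks(minimum, maximum, num_classes):
    """Return the upper break of each class for the range minimum..maximum."""
    if num_classes < 1 or maximum <= minimum:
        raise ValueError("need num_classes >= 1 and maximum > minimum")
    width = (maximum - minimum) / num_classes
    return [minimum + width * i for i in range(1, num_classes + 1)]


def equal_interval_class(value, minimum, maximum, num_classes):
    """Return the zero-based class index of value (upper breaks inclusive)."""
    width = (maximum - minimum) / num_classes
    index = math.ceil((value - minimum) / width) - 1
    # Clamp so the minimum value lands in the first class and
    # out-of-range values land in the nearest class.
    return min(max(index, 0), num_classes - 1)


print(equal_interval_breaks(0, 300, 3))      # [100.0, 200.0, 300.0]
print(equal_interval_class(100, 0, 300, 3))  # 0
print(equal_interval_class(101, 0, 300, 3))  # 1
```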

Equal interval is best applied to familiar data ranges, such as percentages and temperature. This method emphasizes the amount of an attribute value relative to other values. For example, it can show that a store is part of the group of stores that make up the top one-third of all sales.

Standard deviation

The standard deviation classification method shows how much a feature's attribute value varies from the mean. By emphasizing values above and below the mean, this method helps show which features are above or below average. Use this classification when it is important to know how values relate to the mean, such as when reviewing population density in a region or comparing foreclosure rates across the country. For greater detail in the map, you can change the class size from 1 standard deviation to 0.5 standard deviation.
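As a rough sketch of how such classes can be derived, the Python example below assigns each value a class index that counts how many class-size steps it lies from the mean. The function name is hypothetical, and the use of the population standard deviation is an assumption; a given mapping product may use the sample standard deviation or place its class boundaries differently.

```python
import math
import statistics


def standard_deviation_classes(values, class_size=1.0):
    """Return a class index per value. Index k means the value lies in
    [mean + k * step, mean + (k + 1) * step), where step is
    class_size multiplied by the standard deviation."""
    mean = statistics.mean(values)
    std_dev = statistics.pstdev(values)  # assumption: population std dev
    if std_dev == 0:
        raise ValueError("all values are identical")
    step = class_size * std_dev
    return [math.floor((v - mean) / step) for v in values]


values = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, standard deviation 2
print(standard_deviation_classes(values))
# [-2, -1, -1, -1, 0, 0, 1, 2]
print(standard_deviation_classes(values, class_size=0.5))
# [-3, -1, -1, -1, 0, 0, 2, 4]
```

Negative indexes mark values below the mean and non-negative indexes mark values at or above it. Halving the class size spreads the same values over more classes.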

Quantile

With the quantile classification method, each class contains an equal number of features—for example, 10 per class or 20 per class. There are no empty classes or classes with too few or too many values. Quantile classification is well suited to linearly (evenly) distributed data. If you need to have the same number of features or values in each class, use quantile classification.

Because features are grouped in equal numbers in each class, the resulting map can often be misleading. Similar features can be placed in adjacent classes, or features with widely different values can be put in the same class. You can minimize this distortion by increasing the number of classes.
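Both the equal-count grouping and the distortion it can cause are easy to see in a small Python sketch. The function name and the sample values are hypothetical, and when the number of values does not divide evenly, this sketch lets class sizes differ by one.

```python
def quantile_classes(values, num_classes):
    """Split the sorted values into num_classes groups of
    (as nearly as possible) equal size."""
    data = sorted(values)
    if not 1 <= num_classes <= len(data):
        raise ValueError("num_classes must be between 1 and len(values)")
    n = len(data)
    # Integer boundaries keep the group sizes within one of each other.
    bounds = [i * n // num_classes for i in range(num_classes + 1)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]


print(quantile_classes([1, 2, 3, 4, 5, 100], 3))
# [[1, 2], [3, 4], [5, 100]]
```

Each class holds two values, but the similar values 4 and 5 end up in different classes, while 5 and 100 share one.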

Manual interval

To define custom classes, you can manually add class breaks and set class ranges that are appropriate for the data. Alternatively, you can start with one of the standard classification methods and make adjustments as necessary. There may already be certain standards or guidelines for mapping the data—for example, an agency may use standard classes or breaks for all maps, such as the Fujita scale (F-scale) to classify tornado strength.
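Applying a fixed set of manual breaks amounts to a lookup against a sorted list of boundaries. The Python sketch below uses the commonly published Fujita scale wind-speed thresholds in miles per hour as an illustration; treat the numbers as an assumption to verify against an authoritative source. The function and constant names are hypothetical.

```python
from bisect import bisect_right

# Lower bound of each class in mph (assumed values; verify before use).
F_SCALE_LOWER_BOUNDS_MPH = [40, 73, 113, 158, 207, 261]
F_SCALE_LABELS = ["F0", "F1", "F2", "F3", "F4", "F5"]


def classify_manual(value, lower_bounds, labels):
    """Return the label of the class whose lower bound is the largest
    one not exceeding value, or None if value is below every class."""
    index = bisect_right(lower_bounds, value) - 1
    if index < 0:
        return None
    return labels[index]


print(classify_manual(100, F_SCALE_LOWER_BOUNDS_MPH, F_SCALE_LABELS))  # F1
print(classify_manual(30, F_SCALE_LOWER_BOUNDS_MPH, F_SCALE_LABELS))   # None
```

This sketch leaves the top class open ended. Add an explicit upper limit if values above the last class should be rejected.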

Additional resources

The article Better Breaks Define Your Thematic Map's Purpose illustrates the differences among the classification methods in a thematic map.