Label | Explanation | Data Type |
Input Study Area | The input study area where sample locations will be created. The study area must be polygons or an integer (categorical) raster. For rasters, cells with null values will not be included in the study area. | Feature Layer; Raster Layer |
Output Features | The output features representing the sample locations. For simple random and stratified sampling, the output features will be points. For cluster sampling, the output will be polygons. For systematic sampling, the output can be points or polygons. | Feature Class |
Sampling Method (Optional) | Specifies the sampling method that will be used to create the sample locations. | String |
Strata ID Field (Optional) | For stratified sampling by strata ID field, the strata ID field defining the strata. | Field |
Strata Sample Count Allocation Method (Optional) | For stratified sampling, specifies the method that will be used to determine the number of sample locations that will be created in each stratum. | String |
Bin Shape (Optional) | For systematic and cluster sampling, specifies the shape of the polygons that will be generated in the gridded tessellation. | String |
Bin Size [count or area] (Optional) | For systematic and cluster sampling, the size of each polygon in the tessellation. The value can be provided as a count (the total number of tessellated polygons created in the study area) or as an area (the area of each tessellated polygon). For count input, the default is 100. For area input, a value must be provided. If a count is provided, the tool will attempt to create the specified number of sample locations. If the exact number cannot be created, a warning will be returned. | Areal Unit; Long |
H3 Resolution (Optional) | For systematic or cluster sampling with H3 hexagon bins, specifies the H3 resolution of the hexagons. Each increase of one in the resolution value reduces the area of each hexagon to one-seventh of its previous size. | Long |
Number of Samples (Optional) | The number of sample locations that will be created. This parameter always applies to simple random and cluster sampling. For stratified sampling, this parameter applies when the sample count will be proportional to the stratum area or proportional to a population field. For simple random and stratified sampling, the default is 100. For cluster sampling, the default is 10. | Long |
Number of Samples Per Stratum (Optional) | For stratified sampling with an equal sample count in each stratum, the number of sample locations created within each stratum. The total number of samples will be this value multiplied by the number of strata. The default is 100. | Long |
Population Field (Optional) | The population field for stratified sampling when the sample count is equal to or proportional to a population field. | Field |
Output Geometry Type (Optional) | For systematic sampling, specifies whether the sample locations will be tessellated polygons or centroids (points) of the tessellated polygons. | String |
Minimum Distance Between Sample Points (Optional) | For simple random and stratified sampling, the smallest allowed distance between sample locations. For simple random sampling, all points will be at least this distance apart. For stratified sampling, points within the same stratum will be at least this distance apart, but points in neighboring strata may be closer than this distance. For large distances, fewer sample locations than were expected may be created to keep the locations sufficiently far apart. In this case, a warning message will be returned. | Linear Unit |
Spatial Relationship (Optional) | Specifies which polygons from a background tessellation will be included as sampling locations. This parameter applies to cluster sampling and to systematic sampling when the output geometry type is polygon. | String |
Summary
Creates sample locations within a continuous study area using simple random, stratified, systematic (gridded), or cluster sampling designs.
Sampling is the process of selecting individuals from a population to study them and make inferences about the entire population. Continuous spatial sampling treats the population as a continuous area from which any location or area can be sampled. For example, you can use this tool to create sample locations for trees within a dense forest or to collect soil moisture measurements in a crop field. This tool is not appropriate for sampling discrete populations such as households, animals, or cities.
Usage
The input study area must be a polygon feature class or an integer (categorical) raster. You can also draw the study area on a map using interactive feature input. For rasters, cells with null values will not be considered part of the study area.
Sample locations can be created for the following primary sampling designs:
- Simple random sampling—Create sample points randomly within the study area. Each location in the study area is equally likely to be selected as a sample location. The study area will be treated as a single area, and all boundaries between polygons or raster categories will be ignored (for example, a polygon feature class of all counties within a state will define the same study area as a single polygon of the entire state). Simple random sampling is useful when you want to investigate the entire study area, but no location is more important for sampling than any other location. To perform simple random sampling, specify the Simple random option for the Sampling Method parameter.
- Example application: If the study area is a dense forest where every location can be assumed to have a tree, simple random sampling can be used to randomly sample trees within the forest.
- Stratified random sampling—Create sample points by dividing the study area into distinct strata (such as soil class or land use type) and performing simple random sampling separately within each stratum. Stratified random sampling is useful when you want to ensure that all strata are represented in the sample. To perform stratified random sampling, specify one of the three stratification options for the Sampling Method parameter (see the next usage tip for information about each type of stratification).
- Example application: If a national park is divided into elevation classes, stratified random sampling can be used to collect soil samples separately for each elevation class. This ensures that there will be sufficient soil sampling across all elevations in the park.
- Systematic sampling—Create sample locations in a gridded, nonrandom pattern within the study area. The grid is created by a tessellation of regularly shaped polygons (such as hexagons, squares, or triangles). The sample locations can be returned as the tessellated polygons or as points (the centroids of the tessellated polygons). Systematic sampling is useful for ensuring that no sections of the study area are sampled more than others, which is often desirable when the goal is to create a map of the samples rather than to make inferences about the entire study area. To perform systematic sampling, specify the Systematic option for the Sampling Method parameter.
- Example application: To study the ocean floor in a marine area, you can create a hexagonal grid of sample locations to sample marine plant species.
- Cluster sampling—Create sample polygons by creating a systematic sample and randomly selecting some of the polygons from the tessellation. The resulting polygons are called clusters and, typically, the clusters are exhaustively studied, sampling as much as possible within each cluster. Cluster sampling is useful when you are interested in how samples interact with each other at short distances, and it is acceptable for large sections of the study area to have no samples. To perform cluster sampling, specify the Cluster option for the Sampling Method parameter.
- Example application: When sampling insect colonies, cluster sampling can be used to create small areas of a plot, and all insect colonies within the clusters will be sampled.
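The simple random design described above, in which every location in the study area is equally likely to be chosen, can be illustrated with a small rejection-sampling sketch. This is not the tool's implementation; it is a plain-Python illustration that assumes a single polygon given as a hypothetical list of vertices (`study_area`), and the helper names `point_in_polygon` and `simple_random_sample` are invented for this example.

```python
import random

def point_in_polygon(x, y, vertices):
    """Ray-casting test: True when (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Count the edges crossed by a ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def simple_random_sample(vertices, n, seed=0):
    """Draw n points uniformly inside the polygon by rejection sampling."""
    rng = random.Random(seed)
    xs = [vx for vx, _ in vertices]
    ys = [vy for _, vy in vertices]
    points = []
    while len(points) < n:
        # Propose a point in the bounding box; keep it only if it is inside.
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, vertices):
            points.append((x, y))
    return points

# Hypothetical L-shaped study area (assumption for illustration only).
study_area = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
samples = simple_random_sample(study_area, 50)
print(len(samples))
```

Because proposals are uniform over the bounding box and accepted only inside the polygon, accepted points are uniform over the polygon, which matches the "equally likely" property described for simple random sampling.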
For stratified sampling, you can define the strata in the three ways described below. Each is an option for the Sampling Method parameter.
- Stratify by individual polygon—Each record in a polygon feature class is a different stratum. For example, if the study area is a field with subplots stored as separate polygons, sample points will be created separately for each subplot. The input study area must be polygons.
- Stratify by contiguous raster region—Each region of an integer (categorical) raster will be a stratum. A raster region is a contiguous block of cells with the same value (from the Value field) that are connected by shared cell edges. If two regions have the same value but are disconnected from each other, they will be different strata. The input study area must be a raster.
- Stratify by strata ID field—All polygons or raster cells with the same strata ID value will be a stratum. The polygons or raster cells do not need to be contiguous to be in the same stratum. Provide the field containing the strata ID values in the Strata ID Field parameter. The field must be integer or text.
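The definition of a contiguous raster region (cells with the same value connected by shared edges) can be sketched as a flood fill with 4-connectivity. The code below is an illustration only, not the tool's algorithm; the `label_regions` function and the small `raster` grid are assumptions made for this example, with `None` standing in for null cells.

```python
from collections import deque

def label_regions(grid):
    """Label edge-connected regions of equal value; None marks null cells."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 = null or not yet labeled
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                # Only edge-sharing neighbors connect; diagonals do not.
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and grid[nr][nc] == grid[r][c]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
    return labels, next_label

# Hypothetical categorical raster (assumption for illustration only).
raster = [
    [1, 1, 2],
    [2, 1, 2],
    [2, None, 1],
]
labels, region_count = label_regions(raster)
print(region_count)
```

In this grid, the cell with value 1 in the lower right corner touches the other cells with value 1 only diagonally, so it becomes its own region and therefore its own stratum, while stratifying by strata ID field would place all cells with value 1 in one stratum.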
You can specify the number of samples that will be created in each stratum using one of the following options for the Strata Sample Count Allocation Method parameter:
- Equal count in each stratum—An equal number of samples will be created in each stratum. Provide the value in the Number of Samples Per Stratum parameter.
- Count proportional to stratum area—The number of samples in the strata will be proportional to the size of the strata. Provide the overall number of samples in the Number of Samples parameter, and the total count will be distributed to each stratum proportionally to its area.
- Count equal to population field—The number of samples in each stratum will be equal to the values of a population field. Provide the field in the Population Field parameter. The field cannot contain negative values and must be an integer type.
- Count proportional to population field—The number of samples in each stratum will be proportional to the values of a population field. Provide the field in the Population Field parameter and the overall number of samples in the Number of Samples parameter.
You can also use the tool to create the following advanced sampling designs that are not available as explicit options for the Sampling Method parameter:
- Two-stage cluster sampling—Create clusters of points throughout the study area by first creating a cluster sample and then creating points (simple random, stratified, or systematic) within each cluster. This sampling design is useful when a cluster sample is needed, but it is not feasible to exhaustively study each cluster polygon. It is also useful when you are primarily interested in how samples interact at short distances. To perform two-stage cluster sampling, first use the tool to create a cluster sample; then, use the cluster polygons as the input study area in a simple random, stratified, or systematic sampling design.
- Mixed (composite) sampling—Create separate sampling locations from different sampling designs; then, merge them into a single dataset. For example, combining a simple random sample and a two-stage cluster sample will produce sampling locations across the entire study area (simple random) and also include small patches with more points (two-stage cluster). This is useful because simple random sampling on its own can miss how samples interact at short distances, but two-stage cluster sampling leaves large areas of the study area with no sample locations. By combining the two, you can ensure that the entire study area is represented and still investigate the interaction between samples at short distances.
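The two-stage cluster workflow described above amounts to running the tool twice, with the cluster polygons from the first run used as the study area of the second. The following is a minimal sketch under stated assumptions: the paths, the cluster count, and the points-per-cluster value are hypothetical, and the helper names `two_stage_cluster_steps` and `run_two_stage` are invented for this example. The keyword names and option values are the ones shown in the syntax and code samples on this page.

```python
def two_stage_cluster_steps(study_area, clusters_fc, points_fc,
                            num_clusters=10, points_per_cluster=5):
    """Return keyword arguments for the two tool runs, in execution order."""
    stage_one = {
        "in_study_area": study_area,
        "out_features": clusters_fc,
        "sampling_method": "CLUSTER",
        "bin_shape": "HEXAGON",
        "num_samples": num_clusters,
    }
    stage_two = {
        # The cluster polygons from stage one become the study area.
        "in_study_area": clusters_fc,
        "out_features": points_fc,
        "sampling_method": "STRAT_POLY",
        "strata_count_method": "EQUAL",
        "num_samples_per_strata": points_per_cluster,
    }
    return [stage_one, stage_two]

def run_two_stage(steps):
    """Run each step; requires the ArcGIS Pro Python environment."""
    import arcpy
    for kwargs in steps:
        arcpy.management.CreateSpatialSamplingLocations(**kwargs)

# Hypothetical paths (assumptions for illustration only).
steps = two_stage_cluster_steps(
    "C:/samplingdata/inputs.gdb/study_area_polygons",
    "C:/samplingdata/outputs.gdb/stage1_clusters",
    "C:/samplingdata/outputs.gdb/stage2_points",
)
# run_two_stage(steps)  # Uncomment when arcpy is available.
```

Stratifying by individual polygon in the second stage is one way to guarantee points in every cluster; a simple random or systematic second stage would use the same cluster polygons as input. For a mixed design, the second-stage points could then be combined with a separate simple random sample, for example with the Merge tool.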
A warning will be returned if the specified number of sample locations cannot be created. This can occur in the following situations:
- The value of the Minimum Distance Between Sample Points parameter is so large that the specified number of sample locations cannot be created within the study area (or stratum) without some points being closer to each other than the minimum distance. In this case, fewer locations will be created than were specified.
- If the Bin Size parameter value is provided as a count, it is not always possible to create the specified number of sample locations in the study area. The tool will try various area values and use the area that creates a sample count closest to the specified value. The area (in the unit of the output coordinate system) and the resulting number of sample locations will be returned as geoprocessing messages.
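To see why a large minimum distance can leave the sample short of the requested count, consider the following sketch. It uses a simple accept-or-reject scheme in a unit square, which is an assumption made for illustration and not necessarily how the tool places points; `sample_with_min_distance` and its `max_attempts` cutoff are hypothetical.

```python
import math
import random

def sample_with_min_distance(n, min_dist, max_attempts=10_000, seed=0):
    """Draw up to n points in the unit square, all at least min_dist apart."""
    rng = random.Random(seed)
    points = []
    attempts = 0
    while len(points) < n and attempts < max_attempts:
        attempts += 1
        candidate = (rng.random(), rng.random())
        # Keep the candidate only if it is far enough from every kept point.
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points

requested = 20
roomy = sample_with_min_distance(requested, 0.05)
crowded = sample_with_min_distance(requested, 0.6)
for label, pts in (("0.05", roomy), ("0.6", crowded)):
    if len(pts) < requested:
        print(f"Warning: min distance {label} produced only {len(pts)} points.")
```

With a distance of 0.6 in a unit square, the geometry leaves room for only a handful of points, so the sketch reports a shortfall, analogous to the warning the tool returns.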
If the specified parameters do not create any sample locations (such as using an output extent that does not intersect the study area), an error will be returned.
For systematic and cluster sampling with any bin shape except H3 hexagons, the centroid of the first polygon of the tessellation is created at the lower left corner of the output extent. For H3 hexagons, the hexagons are at fixed locations. For all bin shapes, you can use the Spatial Relationship parameter to return the polygons that intersect, are completely within, or have centroids that are within the study area.
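As a rough illustration of the tessellation origin and the centroid-based spatial relationship, the sketch below builds the centroids of a square tessellation whose first centroid sits on the lower left corner of an extent, then keeps only the centroids that fall inside a study area. It assumes square bins and a rectangular study area for simplicity; the function names, extent, and rectangle are hypothetical and do not reflect the tool's internals.

```python
def square_grid_centroids(extent, cell_area):
    """Centroids of square bins; the first centroid is at (xmin, ymin)."""
    xmin, ymin, xmax, ymax = extent
    side = cell_area ** 0.5
    nx = int((xmax - xmin) // side) + 1
    ny = int((ymax - ymin) // side) + 1
    return [(xmin + i * side, ymin + j * side)
            for j in range(ny) for i in range(nx)]

def centroid_within(point, rect):
    """True when the centroid lies inside the rectangular study area."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax

# Hypothetical output extent and study area (assumptions for illustration).
output_extent = (0.0, 0.0, 10.0, 6.0)
study_rect = (1.0, 1.0, 9.0, 5.0)

grid = square_grid_centroids(output_extent, cell_area=4.0)
kept = [p for p in grid if centroid_within(p, study_rect)]
print(len(grid), len(kept))
```

Changing the output extent shifts the whole grid, because every centroid is offset from the lower left corner; this is why the same study area can yield different systematic samples under different extents.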
If you stratify by strata ID field and use a population field (equal or proportional), the population of each stratum will be the sum of the population field values of every polygon or raster category in the stratum.
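The summation described above can be sketched in a few lines. The records and the `stratum_populations` helper are hypothetical; each record pairs a strata ID with the population value of one polygon or raster category.

```python
from collections import defaultdict

def stratum_populations(records):
    """Sum population values for all records that share a strata ID."""
    totals = defaultdict(int)
    for strata_id, population in records:
        totals[strata_id] += population
    return dict(totals)

# Hypothetical (strata ID, population) pairs, for illustration only.
polygons = [
    ("forest", 40),
    ("meadow", 15),
    ("forest", 25),
    ("wetland", 10),
    ("meadow", 5),
]
totals = stratum_populations(polygons)
print(totals)
```

Here the two forest polygons contribute a combined population of 65, which is the value that would drive the sample count for the forest stratum.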
If you stratify by contiguous raster region, you cannot use a population field. This is because each population field value represents the total population of a raster category even if the category is composed of multiple disjoint regions. To use population fields while stratifying by contiguous raster region, use the Raster To Polygon tool to convert the raster to polygons and assign population values to each polygon (for example, by allocating the population of each category proportionally to the number of cells in each of its regions).
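The proportional allocation suggested in the example above is simple arithmetic: each region receives the share of its category's population that matches its share of the category's cells. The sketch below assumes hypothetical region names and cell counts, and the `split_population` helper is invented for this example. The results are left unrounded; a population field used with the equal-count option must be an integer type, so rounding would be a separate step.

```python
def split_population(category_population, region_cell_counts):
    """Split a category's population across its regions by cell count."""
    total_cells = sum(region_cell_counts.values())
    return {
        region: category_population * cells / total_cells
        for region, cells in region_cell_counts.items()
    }

# Hypothetical category with a population of 120 spread over three regions.
shares = split_population(120, {"region_a": 30, "region_b": 10, "region_c": 20})
print(shares)
```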
If you use the Extent environment with a polygon study area, any polygon that intersects the extent will be included in the study area, and sample locations will be created throughout the entire polygon even if they are outside the provided extent.
For stratified sampling with strata sample count proportional to area or a population field, the Largest Remainder Method is used to ensure that the overall sample count is not altered due to rounding.
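The Largest Remainder Method can be sketched as follows: give each stratum the whole-number part of its proportional quota, then hand the remaining samples, one each, to the strata with the largest fractional remainders. This is a generic illustration of the method, not the tool's code; the `largest_remainder` function, the example weights, and the handling of ties (which falls back to input order here) are assumptions.

```python
def largest_remainder(total, weights):
    """Split an integer total across strata in proportion to weights."""
    weight_sum = sum(weights)
    quotas = [total * w / weight_sum for w in weights]
    counts = [int(q) for q in quotas]  # whole-number part of each quota
    leftover = total - sum(counts)
    # Rank strata by fractional remainder, largest first.
    order = sorted(range(len(weights)),
                   key=lambda i: quotas[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

# Hypothetical stratum areas; quotas are 4.12, 3.05, and 2.83 for 10 samples.
areas = [412, 305, 283]
counts = largest_remainder(10, areas)
print(counts, sum(counts))
```

Rounding each quota independently would also give 4, 3, and 3 in this case, but in general independent rounding can add or drop a sample; assigning the leftover by remainder keeps the total at exactly the requested count.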
Parameters
arcpy.management.CreateSpatialSamplingLocations(in_study_area, out_features, {sampling_method}, {strata_id_field}, {strata_count_method}, {bin_shape}, {bin_size}, {h3_resolution}, {num_samples}, {num_samples_per_strata}, {population_field}, {geometry_type}, {min_distance}, {spatial_relationship})
Name | Explanation | Data Type |
in_study_area | The input study area where sample locations will be created. The study area must be polygons or an integer (categorical) raster. For rasters, cells with null values will not be included in the study area. | Feature Layer; Raster Layer |
out_features | The output features representing the sample locations. For simple random and stratified sampling, the output features will be points. For cluster sampling, the output will be polygons. For systematic sampling, the output can be points or polygons. | Feature Class |
sampling_method (Optional) | Specifies the sampling method that will be used to create the sample locations. | String |
strata_id_field (Optional) | For stratified sampling by strata ID field, the strata ID field defining the strata. | Field |
strata_count_method (Optional) | For stratified sampling, specifies the method that will be used to determine the number of sample locations that will be created in each stratum. | String |
bin_shape (Optional) | For systematic and cluster sampling, specifies the shape of the polygons that will be generated in the gridded tessellation. | String |
bin_size (Optional) | For systematic and cluster sampling, the size of each polygon in the tessellation. The value can be provided as a count (the total number of tessellated polygons created in the study area) or as an area (the area of each tessellated polygon). For count input, the default is 100. For area input, a value must be provided. If a count is provided, the tool will attempt to create the specified number of sample locations. If the exact number cannot be created, a warning will be returned. | Areal Unit; Long |
h3_resolution (Optional) | For systematic or cluster sampling with H3 hexagon bins, specifies the H3 resolution of the hexagons. Each increase of one in the resolution value reduces the area of each hexagon to one-seventh of its previous size. | Long |
num_samples (Optional) | The number of sample locations that will be created. This parameter always applies to simple random and cluster sampling. For stratified sampling, this parameter applies when the sample count will be proportional to the stratum area or proportional to a population field. For simple random and stratified sampling, the default is 100. For cluster sampling, the default is 10. | Long |
num_samples_per_strata (Optional) | For stratified sampling with an equal sample count in each stratum, the number of sample locations created within each stratum. The total number of samples will be this value multiplied by the number of strata. The default is 100. | Long |
population_field (Optional) | The population field for stratified sampling when the sample count is equal to or proportional to a population field. | Field |
geometry_type (Optional) | For systematic sampling, specifies whether the sample locations will be tessellated polygons or centroids (points) of the tessellated polygons. | String |
min_distance (Optional) | For simple random and stratified sampling, the smallest allowed distance between sample locations. For simple random sampling, all points will be at least this distance apart. For stratified sampling, points within the same stratum will be at least this distance apart, but points in neighboring strata may be closer than this distance. For large distances, fewer sample locations than were expected may be created to keep the locations sufficiently far apart. In this case, a warning message will be returned. | Linear Unit |
spatial_relationship (Optional) | Specifies which polygons from a background tessellation will be included as sampling locations. This parameter applies to cluster sampling and to systematic sampling when the output geometry type is polygon. | String |
Code sample
The following Python script demonstrates how to use the CreateSpatialSamplingLocations function.
# Create 50 sampling locations in the dissolved California counties.
import arcpy

arcpy.management.CreateSpatialSamplingLocations(
    in_study_area="CA_counties",
    out_features="outputSamplingLocations",
    sampling_method="RANDOM",
    strata_id_field=None,
    strata_count_method="EQUAL",
    bin_shape="HEXAGON",
    bin_size=None,
    h3_resolution=7,
    num_samples=50,
    num_samples_per_strata=100,
    population_field=None,
    geometry_type="POINT",
    min_distance="15 NauticalMilesInt",
    spatial_relationship="HAVE_THEIR_CENTER_IN"
)
The following Python script demonstrates how to use the CreateSpatialSamplingLocations function for simple random sampling.
# Simple random sampling
# Create 50 sample points in a polygon study area.
# Import system modules.
import arcpy
# Allow overwriting output.
arcpy.env.overwriteOutput = True
# Define study area and output features.
inputStudyArea = "C:/samplingdata/inputs.gdb/study_area_polygons"
outputFeatures = "C:/samplingdata/outputs.gdb/out_samples_SRS"
# Define the sampling method and number of samples.
samplingMethod = "RANDOM"
numSamples = 50
# Define the minimum distance between any two points.
minDistance = "15 NauticalMilesInt"
# Run tool.
try:
    arcpy.management.CreateSpatialSamplingLocations(inputStudyArea, outputFeatures,
        samplingMethod, "", "", "", "", "", numSamples, "", "", "",
        minDistance)
except arcpy.ExecuteError:
    # If an error occurred when running the tool, print the error message.
    print(arcpy.GetMessages())
The following Python script demonstrates how to use the CreateSpatialSamplingLocations function to stratify by individual polygon.
# Stratify by individual polygons
# Create 100 sample points in each polygon.
# Import system modules.
import arcpy
# Allow overwriting output.
arcpy.env.overwriteOutput = True
# Define the study area and output features.
inputStudyArea = "C:/samplingdata/inputs.gdb/study_area_polygons"
outputFeatures = "C:/samplingdata/outputs.gdb/out_samples_SBIP"
# Define the sampling method.
samplingMethod = "STRAT_POLY"
# Create 100 samples in each polygon.
strataCountMethod = "EQUAL"
numSamplesPerStrata = 100
# Define the minimum distance between any two points in the same polygon.
minDistance = "15 Meters"
# Run tool.
try:
    arcpy.management.CreateSpatialSamplingLocations(inputStudyArea, outputFeatures,
        samplingMethod, "", strataCountMethod, "", "", "", "",
        numSamplesPerStrata, "", "", minDistance)
except arcpy.ExecuteError:
    # If an error occurred when running the tool, print the error message.
    print(arcpy.GetMessages())
The following Python script demonstrates how to use the CreateSpatialSamplingLocations function to stratify by contiguous raster region.
# Stratify by contiguous raster region
# Create 100 points in a raster study area with number of samples in
# each region proportional to the area of the region.
# Import system modules.
import arcpy
# Allow overwriting output.
arcpy.env.overwriteOutput = True
# Define the study area and output features.
inputStudyArea = "C:/samplingdata/raster_study_area.tif"
outputFeatures = "C:/samplingdata/outputs.gdb/out_samples_SBCRR"
# Define the sampling method.
samplingMethod = "STRAT_RAST"
# Create 100 points and allocate proportionally to the area of the regions.
strataCountMethod = "PROP_AREA"
numSamples = 100
# Run tool.
try:
    arcpy.management.CreateSpatialSamplingLocations(inputStudyArea, outputFeatures,
        samplingMethod, "", strataCountMethod, "", "", "", numSamples)
except arcpy.ExecuteError:
    # If an error occurred when running the tool, print the error message.
    print(arcpy.GetMessages())
The following Python script demonstrates how to use the CreateSpatialSamplingLocations function to stratify by strata ID field.
# Stratify by strata ID field
# Create sample points in each land use category of a raster.
# Use a population field to define the number of samples in each category.
# Import system modules.
import arcpy
# Allow overwriting output.
arcpy.env.overwriteOutput = True
# Define the study area and output features.
inputStudyArea = "C:/samplingdata/land_use_raster.tif"
outputFeatures = "C:/samplingdata/outputs.gdb/out_samples_SBSIDF"
# Define the sampling method.
samplingMethod = "STRAT_ID"
# All raster cells with the same value are in the same stratum.
strataIDField = "LandUse"
# Define the number of samples using a population field.
strataCountMethod = "FIELD"
populationField = "Population"
# Run tool.
try:
    arcpy.management.CreateSpatialSamplingLocations(inputStudyArea, outputFeatures,
        samplingMethod, strataIDField, strataCountMethod, "", "", "",
        "", "", populationField)
except arcpy.ExecuteError:
    # If an error occurred when running the tool, print the error message.
    print(arcpy.GetMessages())
The following Python script demonstrates how to use the CreateSpatialSamplingLocations function for systematic sampling.
# Systematic sampling
# Create sample points in a hexagonal tessellation in a polygon study area.
# Import system modules.
import arcpy
# Allow overwriting output.
arcpy.env.overwriteOutput = True
# Define the study area and output features.
inputStudyArea = "C:/samplingdata/inputs.gdb/study_area_polygons"
outputFeatures = "C:/samplingdata/outputs.gdb/out_samples_SYS"
# Define the sampling method.
samplingMethod = "SYSTEMATIC"
# Create points in a hexagonal tessellation.
binShape = "HEXAGON"
binSize = "10000 SquareFeet"
outputGeometryType = "POINT"
# Run tool.
try:
    arcpy.management.CreateSpatialSamplingLocations(inputStudyArea, outputFeatures,
        samplingMethod, "", "", binShape, binSize, "", "", "", "",
        outputGeometryType)
except arcpy.ExecuteError:
    # If an error occurred when running the tool, print the error message.
    print(arcpy.GetMessages())
The following Python script demonstrates how to use the CreateSpatialSamplingLocations function for cluster sampling.
# Cluster sampling
# Create 100 cluster polygons that are diamond shaped.
# Import system modules.
import arcpy
# Allow overwriting output.
arcpy.env.overwriteOutput = True
# Define the study area and output features.
inputStudyArea = "C:/samplingdata/inputs.gdb/study_area_polygons"
outputFeatures = "C:/samplingdata/outputs.gdb/out_samples_CLUST"
# Define the sampling method.
samplingMethod = "CLUSTER"
# Create a diamond tessellation and randomly choose 100 polygons.
binShape = "DIAMOND"
binSize = "1000000 SquareFeet"
numSamples = 100
spatialRelationship = "INTERSECT"
# Run tool.
try:
    arcpy.management.CreateSpatialSamplingLocations(inputStudyArea, outputFeatures,
        samplingMethod, "", "", binShape, binSize, "", numSamples, "",
        "", "", "", spatialRelationship)
except arcpy.ExecuteError:
    # If an error occurred when running the tool, print the error message.
    print(arcpy.GetMessages())