Principal Components (Spatial Analyst)

Available with Spatial Analyst license.

Summary

Performs Principal Component Analysis (PCA) on a set of raster bands and generates a single multiband raster as output.

Learn more about how Principal Components works

Usage

  • The value specified for the number of principal components determines the number of principal component bands in the output multiband raster. The number must not be larger than the total number of raster bands in the input.

  • When a multiband raster is specified as one of the Input raster bands (in_raster_bands in Python), all the bands will be used.

    To process a selection of bands from a multiband raster, you can first create a new raster dataset composed of those particular bands with the Composite Bands tool, and use the result in the list of the Input raster bands (in_raster_bands in Python).

  • The raster bands must have a common intersection. If there is none, an error occurs and no output is created.

  • The percent variance identifies the share of the total variance that each eigenvalue captures, which can help when interpreting the results of PCA. If a few eigenvalues (each corresponding to a band in the output raster) capture the majority of the variance, it may be adequate to use only those bands in a subsequent analysis, since they may capture most of the interactions within the original multiband dataset.

  • The percent variance each eigenvalue captures is calculated as (eigenvalue * 100)/Sum, where Sum is the sum of all the eigenvalues. The first eigenvalue (and its associated band) captures the greatest variance, and each subsequent eigenvalue captures less. The accumulative percent of variance is the running sum of the percent variance each eigenvalue captures.

  • See Analysis environments and Spatial Analyst for additional details on the geoprocessing environments that apply to this tool.
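
The percent variance formula described above can be sketched in plain Python. This is a minimal illustration, not the tool's implementation: the function names and the eigenvalues in the usage note are hypothetical, and in practice the eigenvalues would be read from the output data file.

```python
def percent_variance(eigenvalues):
    """Return (percent, accumulative) lists for a sequence of eigenvalues.

    Applies (eigenvalue * 100) / Sum, where Sum is the sum of all eigenvalues.
    """
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("The sum of the eigenvalues must be positive.")

    percents = [value * 100.0 / total for value in eigenvalues]

    # The accumulative percent is the running sum of the individual percents.
    accumulative = []
    running = 0.0
    for percent in percents:
        running += percent
        accumulative.append(running)
    return percents, accumulative


def components_needed(eigenvalues, threshold_percent):
    """Return how many leading components reach the given accumulative percent."""
    _, accumulative = percent_variance(eigenvalues)
    for count, value in enumerate(accumulative, start=1):
        if value >= threshold_percent:
            return count
    return len(accumulative)
```

For example, with the hypothetical eigenvalues 6.0, 3.0, and 1.0, the first two components together capture 90 percent of the variance, so `components_needed([6.0, 3.0, 1.0], 90)` returns 2.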

Parameters

Label | Explanation | Data Type
Input raster bands

The input raster bands.

They can be integer or floating point type.

Raster Layer
Number of Principal components
(Optional)

Number of principal components.

The number must be greater than zero and less than or equal to the total number of input raster bands.

The default is the total number of rasters in the input.

Long
Output data file
(Optional)

Output ASCII data file storing principal component parameters.

The output data file records the correlation and covariance matrices, the eigenvalues and eigenvectors, the percent variance each eigenvalue captures, and the accumulative variance described by the eigenvalues.

The extension for the output file can be .txt or .asc.

File

Return Value

Label | Explanation | Data Type
Output multiband raster

The output multiband raster dataset.

If all of the input bands are integer type, the output raster bands will be integer. If any of the input bands are floating point, the output will be floating point.

If the output is an Esri Grid raster, the name must have fewer than 10 characters.

Raster

PrincipalComponents(in_raster_bands, {number_components}, {out_data_file})
Name | Explanation | Data Type
in_raster_bands
[in_raster_band,...]

The input raster bands.

They can be integer or floating point type.

Raster Layer
number_components
(Optional)

Number of principal components.

The number must be greater than zero and less than or equal to the total number of input raster bands.

The default is the total number of rasters in the input.

Long
out_data_file
(Optional)

Output ASCII data file storing principal component parameters.

The output data file records the correlation and covariance matrices, the eigenvalues and eigenvectors, the percent variance each eigenvalue captures, and the accumulative variance described by the eigenvalues.

The extension for the output file can be .txt or .asc.

File

Return Value

Name | Explanation | Data Type
out_multiband_raster

The output multiband raster dataset.

If all of the input bands are integer type, the output raster bands will be integer. If any of the input bands are floating point, the output will be floating point.

If the output is an Esri Grid raster, the name must have fewer than 10 characters.

Raster
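
The rules stated above for number_components and for the output pixel type can be checked before the tool is called. The sketch below is a hedged illustration in plain Python; both helper names are hypothetical and are not part of arcpy.

```python
def resolve_number_components(band_count, number_components=None):
    """Return the number of components to request for a given band count.

    Mirrors the documented rules: the value must be greater than zero and
    less than or equal to the number of input bands, and it defaults to
    the number of input bands when omitted.
    """
    if band_count < 1:
        raise ValueError("At least one input band is required.")
    if number_components is None:
        return band_count
    if not 0 < number_components <= band_count:
        raise ValueError(
            "number_components must be between 1 and {}.".format(band_count)
        )
    return number_components


def output_is_integer(band_is_integer):
    """Return True only if every input band is integer type.

    band_is_integer is a list of booleans, one per input band. A single
    floating-point band makes the output floating point.
    """
    return all(band_is_integer)
```

Validating the value up front gives a clearer message than a failed tool run, for example when a script requests 4 components from a raster that has only 3 bands.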

Code sample

PrincipalComponents example 1 (Python window)

This example performs Principal Component Analysis (PCA) on an input multiband raster and generates a multiband raster output.

import arcpy
from arcpy import env
from arcpy.sa import *
env.workspace = "C:/sapyexamples/data"
outPrincipalComp = PrincipalComponents(["redlands"], 4, "pcdata.txt")
outPrincipalComp.save("C:/sapyexamples/output/outpc01")

PrincipalComponents example 2 (stand-alone script)

This example performs Principal Component Analysis (PCA) on two individual bands of a multiband raster and generates a multiband raster output.

# Name: PrincipalComponents_Ex_02.py
# Description: Performs principal components analysis on a set of raster bands.
# Requirements: Spatial Analyst Extension

# Import system modules
import arcpy
from arcpy import env
from arcpy.sa import *

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

# Set environment settings
env.workspace = "C:/sapyexamples/data"

# Set local variables
inRasterBand1 = "redlands/redlandsc1"
inRasterBand2 = "redlands/redlandsc3"
numberComponents = 2
outDataFile = "C:/sapyexamples/output/pcdatafile.txt"

# Execute PrincipalComponents
outPrincipalComp = PrincipalComponents([inRasterBand1, inRasterBand2],
                                       numberComponents, outDataFile)

# Save the output 
outPrincipalComp.save("C:/sapyexamples/output/outpc01")
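
The Usage section suggests building a new raster with the Composite Bands tool when only some bands of a multiband raster should be analyzed. The following is a minimal sketch of that workflow, not an official sample. It assumes the bands can be addressed with the "raster/band" paths used in example 2. The function names, band names, and output paths are hypothetical.

```python
def band_paths(raster, band_names):
    """Build 'raster/band' paths for selected bands of a multiband raster."""
    return ["{}/{}".format(raster, name) for name in band_names]


def pca_on_selected_bands(workspace, raster, band_names, out_composite, out_pca):
    """Composite the selected bands, then run PrincipalComponents on the result."""
    # arcpy is imported here so band_paths can be used without ArcGIS installed.
    import arcpy
    from arcpy.sa import PrincipalComponents

    # Check out the ArcGIS Spatial Analyst extension license
    arcpy.CheckOutExtension("Spatial")
    arcpy.env.workspace = workspace

    # Composite Bands writes a new raster that contains only the selected bands.
    arcpy.management.CompositeBands(band_paths(raster, band_names), out_composite)

    # number_components is omitted, so it defaults to the number of input bands.
    out_principal_comp = PrincipalComponents([out_composite])
    out_principal_comp.save(out_pca)
```

A call might look like `pca_on_selected_bands("C:/sapyexamples/data", "redlands", ["redlandsc1", "redlandsc3"], "C:/sapyexamples/output/redsel.tif", "C:/sapyexamples/output/outpc02")`, where both output paths are placeholders.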
