Multidimensional Principal Components (Image Analyst)

Available with Image Analyst license.

Summary

Transforms multidimensional rasters into their principal components, loadings, and eigenvalues. The tool transforms the data into a reduced number of components that account for the variance of the data, so that spatial and temporal patterns can be readily identified.

Usage

  • Use the eigenvalues and cumulative percentages of variance in the Output Eigenvalues table to determine the number of principal components needed to describe the data without losing essential information.

    [Figure: Eigenvalue table]

    In the example eigenvalue table, the first component accounts for 72.51 percent of the variance. To reach 95 percent of the variance, choose the first five components.

  • The Mode parameter's Dimension Reduction option analyzes the data as a set of images. It transforms and reduces the data into a set of images that capture the dominant features and patterns. The principal components are a set of rasters stored as a multiband dataset.

  • The Mode parameter's Spatial Reduction option analyzes the data as a set of pixel time series. It finds dominant temporal patterns and associated spatial locations of these temporal patterns. The principal components are a set of one-dimensional arrays stored in a table.

  • Charts are automatically created on the output layers to analyze and understand the loadings, principal components, and eigenvalues.

  • The Number of Principal Components parameter specifies the number of bands in the output. To avoid producing an unnecessarily large raster, specify an appropriate percentage or number of components. Typically, the first few components account for most of the variance in the data.
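
The selection rule described above (keep adding components until the cumulative percentage of variance reaches a target) can be sketched in a few lines of Python. This is a minimal illustration, not part of the tool: the function name `components_for_variance` and the eigenvalues are hypothetical, and the layout of the real Output Eigenvalues table is not assumed here.

```python
def components_for_variance(eigenvalues, target_percent):
    """Return the smallest number of leading components whose cumulative
    share of the total variance reaches target_percent."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    cumulative = 0.0
    # Process eigenvalues from largest to smallest, matching component order.
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if 100.0 * cumulative / total >= target_percent:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues for illustration only (not from a real run).
example_eigenvalues = [50.0, 20.0, 15.0, 10.0, 5.0]
print(components_for_variance(example_eigenvalues, 80))
```

With these made-up values, the cumulative percentages are 50, 70, 85, 95, and 100, so a target of 80 percent is first met by three components.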

Parameters

Label | Explanation | Data Type
Input Multidimensional Raster

The input multidimensional raster.

The tool processes data along one dimension, such as a time series raster or a data cube defined by a nontime dimension [X, Y, Z]. If an input variable includes multiple dimensions, such as depth and time, the first dimension value will be used by default.

You can use the Make Multidimensional Raster Layer tool or Subset Multidimensional Raster tool to redefine the multidimensional data as needed, such as configuring multidimensional data into a dataset with one dimension.

Raster Dataset; Mosaic Dataset; Raster Layer; Mosaic Layer; Image Service; File
Mode

Specifies the method that will be used to perform principal component analysis.

  • Dimension Reduction: The input time series data will be treated as a set of images. Principal components that extract prevalent patterns over time will be computed. This is the default.
  • Spatial Reduction: The input time series data will be treated as a set of pixels. Principal components that extract prevalent patterns and locations over time will be computed as a set of one-dimensional arrays stored in a table.
String
Dimension

The dimension name used to process the principal components.

String
Output Principal Components

The name of the output raster dataset.

When the Mode parameter is specified as Dimension Reduction, the output will be a multiband raster with the components as bands. The first band is the first principal component with the largest eigenvalue, the second band has the principal component with the second largest eigenvalue, and so on. The output is in CRF file format (.crf), which maintains the multidimensional information.

When the Mode parameter is specified as Spatial Reduction, the output is a table containing a set of time series data representing the principal components.

Raster Dataset; Table
Output Loadings

The output loadings data contributing to the principal components.

When the Mode parameter is specified as Dimension Reduction, the output will be a table containing the weights that each input raster contributed to the principal components. These weights define the correlations of the input data and the output principal components. Use the .csv file extension to output the loadings as a comma-separated values file.

When the Mode parameter is specified as Spatial Reduction, the output is a raster where pixel values are the weights contributing to the principal components. Pixels with larger values are more correlated with the principal components. This output may have a larger cell size than the input raster because a random reprojection is applied to reduce the computational complexity.

Table; Raster Dataset
Output Eigenvalues
(Optional)

The output eigenvalues table. The eigenvalues indicate the percentage of variance explained by each component and help you determine the number of principal components needed to represent the dataset.

Table
Variable
(Optional)

The variable of the input multidimensional raster used in computation. If the input raster is multidimensional and no variable is specified, only the first variable will be analyzed, by default.

For example, to find the years in which temperature values were highest, specify temperature as the variable to be analyzed. If you do not specify any variables and you have both temperature and precipitation variables, both variables will be analyzed, and the output multidimensional raster will include both variables.

String
Number of Principal Components
(Optional)

The number of principal components to compute, usually fewer than the number of input rasters.

This parameter also takes the form of a percentage (%). For example, a value of 90% means the number of components that can explain 90 percent of variance in the data will be computed.

String
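
The difference between the two Mode options is mainly a question of which axis of the data is treated as the variables. The following is a conceptual sketch in plain Python, under the assumption that each time slice is flattened to a list of pixel values; it is not the tool's implementation. All names (`first_component`, `cube`, and so on) are hypothetical, and power iteration is used only to keep the example dependency-free.

```python
def center_columns(rows):
    """Subtract each column's mean so every variable has zero mean."""
    count = len(rows)
    means = [sum(column) / count for column in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


def covariance(centered_rows):
    """Covariance matrix of the columns (variables) of centered data."""
    count = len(centered_rows)
    columns = list(zip(*centered_rows))
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (count - 1) for cj in columns]
        for ci in columns
    ]


def first_eigenvector(matrix, iterations=500):
    """Approximate the dominant eigenvector with power iteration."""
    vector = [1.0] * len(matrix)
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = sum(x * x for x in product) ** 0.5
        if norm == 0.0:
            break
        vector = [x / norm for x in product]
    return vector


def first_component(rows):
    """Return (weights, scores) for the first principal component.

    Rows are observations and columns are variables.
    """
    centered = center_columns(rows)
    weights = first_eigenvector(covariance(centered))
    scores = [sum(w * v for w, v in zip(weights, row)) for row in centered]
    return weights, scores


# Hypothetical cube: 3 time steps, each image flattened to 4 pixels.
cube = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 9.0],
    [3.0, 5.0, 9.0, 12.0],
]

# Dimension reduction: time slices are the variables, pixels the observations.
pixels_by_time = [list(column) for column in zip(*cube)]
dr_loadings, dr_component = first_component(pixels_by_time)

# Spatial reduction: pixels are the variables, time steps the observations.
sr_loadings, sr_component = first_component(cube)

print(len(dr_loadings), len(dr_component))
print(len(sr_loadings), len(sr_component))
```

In the dimension reduction case the loadings hold one weight per input time slice and the component has one value per pixel (an image). In the spatial reduction case the roles swap: the loadings hold one weight per pixel (a raster) and the component has one value per time step (a time series).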

MultidimensionalPrincipalComponents(in_multidimensional_raster, mode, dimension, out_pc, out_loadings, {out_eigenvalues}, {variable}, {number_of_pc})
Name | Explanation | Data Type
in_multidimensional_raster

The input multidimensional raster.

The tool processes data along one dimension, such as a time series raster or a data cube defined by a nontime dimension [X, Y, Z]. If an input variable includes multiple dimensions, such as depth and time, the first dimension value will be used by default.

You can use the Make Multidimensional Raster Layer tool or Subset Multidimensional Raster tool to redefine the multidimensional data as needed, such as configuring multidimensional data into a dataset with one dimension.

Raster Dataset; Mosaic Dataset; Raster Layer; Mosaic Layer; Image Service; File
mode

Specifies the method that will be used to perform principal component analysis.

  • DIMENSION_REDUCTION: The input time series data will be treated as a set of images. Principal components that extract prevalent patterns over time will be computed. This is the default.
  • SPATIAL_REDUCTION: The input time series data will be treated as a set of pixels. Principal components that extract prevalent patterns and locations over time will be computed as a set of one-dimensional arrays stored in a table.
String
dimension

The dimension name used to process the principal components.

String
out_pc

The name of the output raster dataset.

When the mode parameter is specified as DIMENSION_REDUCTION, the output will be a multiband raster with the components as bands. The first band is the first principal component with the largest eigenvalue, the second band has the principal component with the second largest eigenvalue, and so on. The output is in CRF file format (.crf), which maintains the multidimensional information.

When the mode parameter is specified as SPATIAL_REDUCTION, the output is a table containing a set of time series data representing the principal components.

Raster Dataset; Table
out_loadings

The output loadings data contributing to the principal components.

When the mode parameter is specified as DIMENSION_REDUCTION, the output will be a table containing the weights that each input raster contributed to the principal components. These weights define the correlations of the input data and the output principal components. Use the .csv file extension to output the loadings as a comma-separated values file.

When the mode parameter is specified as SPATIAL_REDUCTION, the output is a raster where pixel values are the weights contributing to the principal components. Pixels with larger values are more correlated with the principal components. This output may have a larger cell size than the input raster because a random reprojection is applied to reduce the computational complexity.

Table; Raster Dataset
out_eigenvalues
(Optional)

The output eigenvalues table. The eigenvalues indicate the percentage of variance explained by each component and help you determine the number of principal components needed to represent the dataset.

Table
variable
(Optional)

The variable of the input multidimensional raster used in computation. If the input raster is multidimensional and no variable is specified, only the first variable will be analyzed, by default.

For example, to find the years in which temperature values were highest, specify temperature as the variable to be analyzed. If you do not specify any variables and you have both temperature and precipitation variables, both variables will be analyzed, and the output multidimensional raster will include both variables.

String
number_of_pc
(Optional)

The number of principal components to compute, usually fewer than the number of input rasters.

This parameter also takes the form of a percentage (%). For example, a value of 90% means the number of components that can explain 90 percent of variance in the data will be computed.

String
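
Because the data types of out_pc and out_loadings swap depending on mode, it can help to keep a small lookup alongside a script. The helper below is a hypothetical convenience based only on the parameter descriptions above; `describe_outputs` is not part of arcpy.

```python
def describe_outputs(mode):
    """Summarize the data type of each required output for a given mode."""
    if mode == "DIMENSION_REDUCTION":
        return {"out_pc": "multiband raster (.crf)", "out_loadings": "table"}
    if mode == "SPATIAL_REDUCTION":
        return {"out_pc": "table", "out_loadings": "raster"}
    raise ValueError(f"Unknown mode: {mode!r}")


for mode in ("DIMENSION_REDUCTION", "SPATIAL_REDUCTION"):
    print(mode, describe_outputs(mode))
```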

Code sample

MultidimensionalPrincipalComponents example 1 (Python window)

This example computes three principal components from an NDVI time series raster.

# Import system modules
import arcpy
from arcpy.ia import *

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

# Compute three principal components along the StdTime dimension
arcpy.env.workspace = r"c:\data"
arcpy.ia.MultidimensionalPrincipalComponents("ndviData.crf", "DIMENSION_REDUCTION", "StdTime", "ndviData_PC.crf", "ndviData_loadings.csv", "ndviData_eigenvalues.csv", None, 3)

MultidimensionalPrincipalComponents example 2 (stand-alone script)

This example computes four principal components from an NDVI time series raster in Dimension Reduction mode.

# Import system modules
import arcpy
from arcpy.ia import *

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

# Define input parameters
inputFile = r"c:\data\ndviData.crf"
mode = "DIMENSION_REDUCTION"
dimension = "StdTime"
# In Dimension Reduction mode, the principal components are written in CRF format
out_pc = r"c:\data\ndviData_pc.crf"
out_loadings = r"c:\data\ndviData_loadings.csv"
out_eigenvalues = r"c:\data\ndviData_eigenvalues.csv"
variable = "ndvi"
pc_number = 4

# Execute
arcpy.ia.MultidimensionalPrincipalComponents(inputFile, mode, dimension, out_pc, out_loadings, out_eigenvalues, variable, pc_number)

MultidimensionalPrincipalComponents example 3 (Python window)

This example computes three principal components from a time series raster in Spatial Reduction mode.

# Import system modules
import arcpy
from arcpy.ia import *

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

# In Spatial Reduction mode, the components are a table and the loadings are a raster
arcpy.env.workspace = r"c:\data"
arcpy.ia.MultidimensionalPrincipalComponents("sstData.crf", "SPATIAL_REDUCTION", "StdTime", "sstData_temporal_PC.csv", "sstData_loading_raster.crf", "sstData_eigenvalues.csv", None, 3)
