Label | Explanation | Data Type |
Input Time Series Data | The netCDF cube containing the variable that will be used to forecast to future time steps. This file must have an .nc file extension and must have been created using the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, or Create Space Time Cube From Multidimensional Raster Layer tool. | File |
Output Model | The output folder location that will store the trained model. The trained model will be saved as a deep learning package file (.dlpk). | Folder |
Analysis Variable | The numeric variable in the dataset that will be forecasted to future time steps. | String |
Sequence Length | The number of previous time steps that will be used when training the model. If the data contains seasonality (repeating cycles), provide the length corresponding to one season. | Long |
Explanatory Training Variables (Optional) | Independent variables from the data that will be used to train the model. Check the Categorical check box for any variables that represent classes or categories. | Value Table |
Max Epochs (Optional) | The maximum number of epochs for which the model will be trained. The default is 20. | Long |
Number Of Time Steps to Exclude for Validation (Optional) | The number of time steps that will be excluded for validation. For example, if a value of 14 is specified, the last 14 time steps will be used as validation data. The default is 10 percent of the total number of time steps. Ideally, the value should not be less than 5 percent of the total number of time steps in the input space-time cube. | Long |
Model Type (Optional) | Specifies the model architecture that will be used for training the model. | String |
Batch Size (Optional) | The number of samples that will be processed at one time. The default is 64. Depending on the computer's GPU, this number can be changed to 8, 16, 32, 64, and so on. | Long |
Model Arguments (Optional) | Additional model arguments that will be used specific to each model. These arguments can be used to adjust the model complexity and size. See How Time Series forecasting models work to understand the model architecture, the supported model arguments, and their default values. | Value Table |
Stop training when model no longer improves (Optional) | Specifies whether model training will stop when the validation loss has not improved for five consecutive epochs. | Boolean |
Output Feature Class (Optional) | The output feature class of all locations in the space-time cube with forecasted values stored as fields. The feature class will be created using prediction of the trained model on the validation dataset. The output displays the forecast for the final time step and contains pop-up charts showing the time series forecast on the validation set. | Feature Class |
Output Cube (Optional) | An output space-time cube (.nc file) containing the values of the input space-time cube, with the values at the validation time steps replaced by the forecasted values. | File |
Multi-Step (Optional) | Specifies whether a one-step or multistep approach will be used for training the multivariate time series forecasting model. | Boolean |
Summary
Trains a deep learning-based time series forecasting model using time series data from a space-time cube. The trained model can be used for forecasting the values of each location of a space-time cube using the Forecast Using Time Series Model tool.
Time series data can follow various trends and have multiple levels of seasonality. Traditional time series forecasting models based on statistical approaches perform differently depending on the trend and patterns of seasonality in the data. Deep learning-based models have a high capacity to learn and can provide results across different kinds of time series, provided there is enough training data.
This tool trains time series forecasting models using various deep learning-based models, such as Fully Connected Network (FCN), Long Short-Term Memory (LSTM), InceptionTime, ResNet, and ResCNN. These models support multivariate time series, in which the model learns from more than one time dependent variable to forecast future values. The trained model is saved as a deep learning package file (.dlpk) and can be used for forecasting future values using the Forecast Using Time Series Model tool.
Usage
You must install the proper deep learning framework for Python in ArcGIS AllSource.
This tool accepts netCDF data created by the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, Create Space Time Cube From Multidimensional Raster Layer, and Subset Space Time Cube tools.
Compared to other forecasting tools in the Time Series Forecasting toolset, this tool uses deep learning-based time series forecasting models. Deep learning models have a high capacity to learn and are appropriate for time series that follow complex trends and are difficult to model with simple mathematical functions. However, they require a larger volume of training data to learn such complex trends and use more computational resources for training and inference. A GPU is recommended for using this tool.
To run this tool using a GPU, set the Processor Type environment to GPU. If you have more than one GPU, specify the GPU ID environment instead.
This tool can be used to model both univariate and multivariate time series. If the space-time cube has other variables that are related to the variable being forecast, they can be included as explanatory variables to improve the forecast.
Univariate time series forecasting uses only the one-step method, which is also the default method.
Multivariate time series forecasting supports two approaches: one-step forecasting and multistep forecasting. The Multi-Step parameter becomes active when multiple explanatory training variables are selected.
With the one-step method, the model can be updated with new data at each time step, making it suitable for real-time applications. However, since the model is updated at each time step, errors in predictions can accumulate over time, leading to less accurate long-term forecasts. With multistep forecasting, the model predicts multiple future data points beyond the current time step. For example, if the goal is to forecast the next 20 time steps, the model will generate 20 consecutive predictions at once. Multistep forecasting allows the model to consider a broader view of the time series, capturing long-term trends and patterns more effectively. Since the model predicts multiple time steps ahead, the potential for error accumulation is reduced, leading to more accurate long-term forecasts. However, because the model predicts multiple steps at once, it may adapt less quickly to real-time changes in the data. The choice between these two approaches depends on the specific requirements and characteristics of the time series forecasting task.
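The difference between the two approaches can be pictured with a minimal sketch. The two "models" below are toy trend extrapolators, not the networks this tool trains, and rolling the one-step model forward by feeding each prediction back in is an assumption made for illustration. All function names are hypothetical.

```python
def one_step_model(window):
    """Toy stand-in: extend the last observed change by one step."""
    return window[-1] + (window[-1] - window[-2])


def multi_step_model(window, horizon):
    """Toy stand-in: return `horizon` values from a single call."""
    slope = (window[-1] - window[0]) / (len(window) - 1)
    return [window[-1] + slope * (k + 1) for k in range(horizon)]


def forecast_one_step(window, horizon):
    """Roll a one-step model forward, reusing each prediction as input."""
    history = list(window)
    forecasts = []
    for _ in range(horizon):
        next_value = one_step_model(history[-len(window):])
        forecasts.append(next_value)
        # Any error in next_value is carried into every later prediction.
        history.append(next_value)
    return forecasts


window = [2.0, 4.0, 6.0, 8.0]
recursive = forecast_one_step(window, horizon=3)
direct = multi_step_model(window, horizon=3)
print(recursive, direct)
```

On this perfectly linear series both approaches agree. On noisy data, the loop in `forecast_one_step` is where an early error would propagate, whereas `multi_step_model` produces every step from observed values only.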
The Sequence Length parameter impacts the outcome of a time series forecasting model and can be defined as the number of past time steps to use as input to predict the next time step. If the sequence length is n, the model will take the last n time steps as input to forecast the next time step. The parameter value cannot be larger than the total number of input time steps that remain after excluding validation time steps.
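As a rough illustration of what a sequence length of n means, the sketch below pairs each run of n consecutive values with the value that follows it. The helper name `make_windows` and the sample series are hypothetical, and the sketch requires at least one step left over to serve as a target, which is slightly stricter than the limit stated above.

```python
def make_windows(series, sequence_length):
    """Pair each run of `sequence_length` values with the value after it."""
    if sequence_length < 1 or sequence_length >= len(series):
        raise ValueError("sequence_length must be between 1 and len(series) - 1")
    windows = []
    for start in range(len(series) - sequence_length):
        inputs = series[start:start + sequence_length]
        target = series[start + sequence_length]
        windows.append((inputs, target))
    return windows


# Hypothetical training series, with validation time steps already removed
training_series = [10, 12, 14, 16, 18, 20]
pairs = make_windows(training_series, sequence_length=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

A longer sequence length gives the model more context per sample but leaves fewer samples, which is one reason to match it to a single season rather than choosing the largest allowed value.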
Rather than building an independent forecast model at each location of the space-time cube, this tool trains a single global forecast model that uses training data from each location. This global model will be used to forecast future values at every location using the Forecast Using Time Series Model tool.
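One way to picture a single global model is that the training samples from every location are pooled into one set before training. The sketch below shows only that pooling step under this assumption; the helper names and the sample data are hypothetical, and how the tool actually assembles its training data is not described here.

```python
def make_windows(series, sequence_length):
    """Pair each run of `sequence_length` values with the value after it."""
    return [
        (series[i:i + sequence_length], series[i + sequence_length])
        for i in range(len(series) - sequence_length)
    ]


def pool_training_pairs(series_by_location, sequence_length):
    """Collect the windows from every location into one training set."""
    pooled = []
    for location_id, series in series_by_location.items():
        for inputs, target in make_windows(series, sequence_length):
            pooled.append((location_id, inputs, target))
    return pooled


# Hypothetical time series for two locations of a space-time cube
series_by_location = {
    "location_a": [5, 6, 7, 8, 9],
    "location_b": [50, 40, 30, 20, 10],
}
training_pairs = pool_training_pairs(series_by_location, sequence_length=3)
print(len(training_pairs))
```

Pooling means the number of training samples grows with the number of locations, which helps meet the larger data requirement of deep learning models.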
The Output Feature Class parameter value will be added to the Contents pane with rendering based on the final forecasted time step.
Example use cases for this tool include training a model to predict demand for retail products based on historical sales data, training a model to predict the spread of diseases, or training a model to predict generation of wind power based on historical production and weather data.
Deciding how many time steps to exclude for validation is important. The more time steps that are excluded, the fewer time steps remain to train the model. If too few time steps are excluded, the validation RMSE will be estimated from a small amount of data and may be misleading. Exclude as many time steps as possible while keeping enough time steps to train the model. If the space-time cube has enough time steps, withhold at least as many time steps for validation as you intend to forecast.
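The guidance above can be turned into a small planning helper. This is a sketch only: the function names are hypothetical, rounding the 10 percent default down is an assumption, and the suggestion simply takes the larger of that default and the intended forecast horizon.

```python
def suggest_validation_steps(total_steps, forecast_horizon):
    """Take the larger of 10 percent of the series and the forecast horizon."""
    ten_percent = max(1, total_steps // 10)  # assumption: rounded down
    return max(ten_percent, forecast_horizon)


def split_for_validation(series, validation_steps):
    """Hold out the last `validation_steps` values for validation."""
    if not 0 < validation_steps < len(series):
        raise ValueError("validation_steps must leave at least one training step")
    return series[:-validation_steps], series[-validation_steps:]


# Hypothetical series of 120 time steps, with 14 steps to forecast later
series = list(range(120))
validation_steps = suggest_validation_steps(len(series), forecast_horizon=14)
training, validation = split_for_validation(series, validation_steps)
print(validation_steps, len(training), len(validation))
```

The remaining training length is also the upper bound for the sequence length, so a large validation set and a long sequence length compete for the same time steps.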
For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.
Parameters
arcpy.geoai.TrainTimeSeriesForecastingModel(in_cube, out_model, analysis_variable, sequence_length, {explanatory_variables}, {max_epochs}, {validation_timesteps}, {model_type}, {batch_size}, {arguments}, {early_stopping}, {out_features}, {out_cube}, {multistep})
Name | Explanation | Data Type |
in_cube | The netCDF cube containing the variable that will be used to forecast to future time steps. This file must have an .nc file extension and must have been created using the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, or Create Space Time Cube From Multidimensional Raster Layer tool. | File |
out_model | The output folder location that will store the trained model. The trained model will be saved as a deep learning package file (.dlpk). | Folder |
analysis_variable | The numeric variable in the dataset that will be forecasted to future time steps. | String |
sequence_length | The number of previous time steps that will be used when training the model. If the data contains seasonality (repeating cycles), provide the length corresponding to one season. | Long |
explanatory_variables [explanatory_variables,...] (Optional) | Independent variables from the data that will be used to train the model. Use a True value after any variables that represent classes or categories. | Value Table |
max_epochs (Optional) | The maximum number of epochs for which the model will be trained. The default is 20. | Long |
validation_timesteps (Optional) | The number of time steps that will be excluded for validation. For example, if a value of 14 is specified, the last 14 time steps will be used as validation data. The default is 10 percent of the total number of time steps. Ideally, the value should not be less than 5 percent of the total number of time steps in the input space-time cube. | Long |
model_type (Optional) | Specifies the model architecture that will be used for training the model. | String |
batch_size (Optional) | The number of samples that will be processed at one time. The default is 64. Depending on the computer's GPU, this number can be changed to 8, 16, 32, 64, and so on. | Long |
arguments [arguments,...] (Optional) | Additional model arguments that will be used specific to each model. These arguments can be used to adjust the model complexity and size. See How Time Series forecasting models work to understand the model architecture, the supported model arguments, and their default values. | Value Table |
early_stopping (Optional) | Specifies whether model training will stop when the validation loss has not improved for five consecutive epochs. | Boolean |
out_features (Optional) | The output feature class of all locations in the space-time cube with forecasted values stored as fields. The feature class will be created using prediction of the trained model on the validation dataset. The output displays the forecast for the final time step and contains pop-up charts showing the time series forecast on the validation set. | Feature Class |
out_cube (Optional) | An output space-time cube (.nc file) containing the values of the input space-time cube, with the values at the validation time steps replaced by the forecasted values. | File |
multistep (Optional) | Specifies whether a one-step or multistep approach will be used for training the multivariate time series forecasting model. | Boolean |
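The interaction between max_epochs and early_stopping in the table above can be sketched as a patience counter. This is an illustration, not the tool's training loop: the function name and the loss values are hypothetical, and treating "improvement" as a loss strictly lower than the best seen so far is an assumption.

```python
def epochs_run(validation_losses, patience=5):
    """Return the number of epochs completed before training stops."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses, start=1):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # stop early
    return len(validation_losses)  # ran for the full epoch budget


# Hypothetical validation loss per epoch, with a budget of 10 epochs
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.63, 0.64, 0.50, 0.40]
stopped_at = epochs_run(losses)
print(stopped_at)
```

In this example training stops before the later, lower losses are reached, which shows the trade-off: early stopping saves time but can end training during a temporary plateau.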
Code sample
This example shows how to use the TrainTimeSeriesForecastingModel function.
# Name: TrainTimeSeriesForecastingModel.py
# Description: Train a time series model on space-time cube data with
# different AI models.

# Import system modules
import arcpy
import os

# Set local variables
datapath = "path_to_data_for_forecasting"
out_path = "path_to_gdb_for_forecasting"

model_path = os.path.join(out_path, "model")
# The input cube must be a netCDF file with an .nc extension
in_cube = os.path.join(datapath, "test_data.nc")
out_features = os.path.join(out_path, "forecasted_feature.gdb", "forecasted")

# Run TrainTimeSeriesForecastingModel
arcpy.geoai.TrainTimeSeriesForecastingModel(
    in_cube,          # in_cube
    model_path,       # out_model
    "CONSUMPTION",    # analysis_variable
    12,               # sequence_length
    None,             # explanatory_variables
    20,               # max_epochs
    2,                # validation_timesteps
    "InceptionTime",  # model_type
    64,               # batch_size
    None,             # arguments
    True,             # early_stopping
    out_features,     # out_features
)
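Before starting a long training run, it can help to check the inputs against the constraints described in this topic. The sketch below is a hypothetical pre-flight helper, not part of arcpy: it checks the .nc extension and the sequence length limit, then returns keyword arguments named after the parameters in the syntax above. The total number of time steps is assumed to be known by the caller, and passing the result as keyword arguments is assumed to work as it does for other geoprocessing functions.

```python
def build_training_kwargs(in_cube, out_model, analysis_variable,
                          sequence_length, total_time_steps,
                          validation_timesteps):
    """Validate the inputs and return keyword arguments for the tool."""
    if not in_cube.lower().endswith(".nc"):
        raise ValueError("in_cube must have an .nc file extension")
    remaining = total_time_steps - validation_timesteps
    if sequence_length > remaining:
        raise ValueError(
            "sequence_length cannot exceed the time steps left after "
            "excluding validation time steps"
        )
    return {
        "in_cube": in_cube,
        "out_model": out_model,
        "analysis_variable": analysis_variable,
        "sequence_length": sequence_length,
        "validation_timesteps": validation_timesteps,
    }


# Hypothetical cube with 60 time steps, 6 of them held out for validation
training_kwargs = build_training_kwargs(
    "test_data.nc", "model", "CONSUMPTION",
    sequence_length=12, total_time_steps=60, validation_timesteps=6,
)
print(training_kwargs)

# With ArcGIS and the deep learning frameworks installed, the call would be:
# arcpy.geoai.TrainTimeSeriesForecastingModel(**training_kwargs)
```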