Big data analytics process data from a variety of sources to perform procedures or analyses. This processing generates output datasets and informational products that may need to be kept up to date so that those who depend on the results can rely on their accuracy.
As data from input data sources changes over time, with new observations made and new features or values stored, a big data analytic must be run again to generate results for the latest data. These results can either replace prior outputs or be appended to existing outputs to build a record of the analysis over time.
By scheduling a big data analytic to run periodically or at a recurring time, you can ensure it runs at the desired frequency or interval and generates up-to-date outputs and informational products for use in your organization.
Consider the following examples:
- A transportation organization would like to generate a daily or weekly email report that indicates the total mileage driven by each of its vehicles or employees over that period.
- An environmental group would like to calculate statistics once a week on one or more attributes from sensor readings across a region to understand how environmental patterns change over time or vary with conditions.
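
To make the first example concrete, the following Python sketch shows the logic a recurring run might perform: total the mileage per vehicle for one period, then either replace the prior output or append to it. It is an illustration only and does not use the product's own scheduling or analytic tools. The function names (`total_mileage`, `store_results`), the record layout, and the sample trips are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def total_mileage(trips, start, end):
    """Sum miles per vehicle for trips dated within [start, end)."""
    totals = defaultdict(float)
    for vehicle_id, trip_date, miles in trips:
        if start <= trip_date < end:
            totals[vehicle_id] += miles
    return dict(totals)


def store_results(existing, run_date, totals, mode="append"):
    """Return the new output: replace prior results or append a dated record."""
    record = {"run_date": run_date, "totals": totals}
    if mode == "replace":
        return [record]
    return existing + [record]


# Hypothetical input data: (vehicle id, trip date, miles driven).
trips = [
    ("truck-1", date(2024, 1, 1), 120.0),
    ("truck-2", date(2024, 1, 2), 80.5),
    ("truck-1", date(2024, 1, 8), 60.0),
]

# Two simulated weekly runs, each covering the preceding seven days.
history = []
week1 = total_mileage(trips, date(2024, 1, 1), date(2024, 1, 8))
history = store_results(history, date(2024, 1, 8), week1)
week2 = total_mileage(trips, date(2024, 1, 8), date(2024, 1, 15))
history = store_results(history, date(2024, 1, 15), week2)
```

With `mode="append"`, `history` holds one record per run, which preserves the analysis over time. With `mode="replace"`, only the latest record is kept.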