The FORECAST Procedure
This section explains the forecasting methods used by PROC FORECAST.
The STEPAR method first fits a trend model to the series and then fits an autoregressive process to the residuals of the trend model, using a backwards-stepping method to select the autoregressive parameters. Because the trend and autoregressive parameters are fit in sequence rather than simultaneously, the parameter estimates are not optimal in a statistical sense; however, the estimates are usually close to optimal, and the method is computationally inexpensive.
Missing values are tolerated in the series; the autocorrelations are estimated from the available data and tapered if necessary.
This method requires at least three passes through the data: two passes to fit the model and a third pass to initialize the autoregressive process and write to the output data set.
If the INTERVAL= option is specified, the default NLAGS= value includes lags for up to three years plus one, subject to the maximum of 13 lags or one third of the number of observations in your data set, whichever is less. If the number of observations in the input data set cannot be determined, the maximum NLAGS= default value is 13. If the INTERVAL= option is not specified, the default is NLAGS=13 or one-third the number of input observations, whichever is less.
If the Toeplitz matrix formed by the autocovariance matrix at a given step is not positive definite, the maximal number of autoregressive lags is reduced.
For example, for INTERVAL=QTR, the default is NLAGS=13 (that is, 4×3+1) provided that there are at least 39 observations. The NLAGS= option default is always at least 3.
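The default rule can be sketched in Python as follows. This is only an illustration of the rule as described above, not PROC FORECAST code: the function name and the `periods_per_year` argument are hypothetical, and rounding one-third of the observation count down to an integer is an assumption.

```python
def default_nlags(periods_per_year=None, nobs=None):
    """Sketch of the default NLAGS= rule.

    periods_per_year: observations per year implied by INTERVAL=
        (None when INTERVAL= is not specified).
    nobs: number of input observations (None when it cannot be determined).
    """
    # Upper limit: 13 lags or one-third of the observations, whichever is less.
    # Assumption: one-third is rounded down.
    cap = 13 if nobs is None else min(13, nobs // 3)
    if periods_per_year is None:
        lags = cap
    else:
        # Lags for up to three years plus one, subject to the cap.
        lags = min(3 * periods_per_year + 1, cap)
    # The default is always at least 3.
    return max(lags, 3)
```

Under these assumptions, `default_nlags(4, 39)` gives 13 for quarterly data, while a shorter quarterly series of 30 observations gives 10.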
The EXPO method fits a trend model such that the most recent data are weighted more heavily than data in the early part of the series. The weight of an observation is a geometric (exponential) function of the number of periods that the observation extends into the past relative to the current period. The weight function is

$$w_\tau = \omega (1-\omega)^{t-\tau}$$

where $\tau$ is the observation number of the past observation, $t$ is the current observation number, and $\omega$ is the weighting constant specified with the WEIGHT= option.
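As a small numerical illustration, the following Python sketch lists geometric weights, assuming that an observation k periods in the past receives the weight ω(1−ω)^k. The function name is hypothetical.

```python
def expo_weights(omega, n_past):
    """Weights omega * (1 - omega)**k for k = 0 .. n_past-1 periods into the past."""
    return [omega * (1.0 - omega) ** k for k in range(n_past)]

# With WEIGHT=0.2 the current observation gets weight 0.2, the previous one
# 0.16, and so on; the weights decay geometrically and sum toward 1.
weights = expo_weights(0.2, 50)
```

A larger weighting constant concentrates the weight on recent observations; a smaller one spreads it further into the past.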
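You specify the model with the TREND= option as follows:

- TREND=1 specifies single exponential smoothing (a constant model).
- TREND=2 specifies double exponential smoothing (a linear trend model).
- TREND=3 specifies triple exponential smoothing (a quadratic trend model).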
The single exponential smoothing operation is expressed by the formula

$$S_t = \omega x_t + (1-\omega) S_{t-1}$$

where $S_t$ is the smoothed value at the current period, $t$ is the time index of the current period, and $x_t$ is the current actual value of the series. The smoothed value $S_t$ is the forecast of $x_{t+1}$ and is calculated as the smoothing constant $\omega$ times the value of the series, $x_t$, in the current period plus $(1-\omega)$ times the previous smoothed value $S_{t-1}$, which is the forecast of $x_t$ computed at time $t-1$.
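Double and triple exponential smoothing are derived by applying exponential smoothing to the smoothed series, obtaining smoothed values as follows:

$$S_t^{[2]} = \omega S_t + (1-\omega) S_{t-1}^{[2]}$$

$$S_t^{[3]} = \omega S_t^{[2]} + (1-\omega) S_{t-1}^{[3]}$$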
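The repeated smoothing can be sketched in Python as follows, assuming each stage applies the same recursion (new value = ω × input + (1−ω) × previous smoothed value) with a single weight ω. The function name and the starting values passed in are hypothetical; this sketch does not handle missing values.

```python
def smooth(series, omega, s0, s0_2, s0_3):
    """Run single, double, and triple exponential smoothing over a series.

    s0, s0_2, s0_3 are illustrative starting smoothed values.
    Returns the final (S, S[2], S[3]) after the last observation.
    """
    s1, s2, s3 = s0, s0_2, s0_3
    for x in series:
        s1 = omega * x + (1.0 - omega) * s1    # smooth the data
        s2 = omega * s1 + (1.0 - omega) * s2   # smooth the smoothed series
        s3 = omega * s2 + (1.0 - omega) * s3   # smooth once more
    return s1, s2, s3
```

Each stage lags the one before it, which is what allows the differences among the three smoothed values to carry information about trend.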
Missing values after the start of the series are replaced with one-step-ahead predicted values, and the predicted value is then applied to the smoothing equations.
The polynomial time trend parameters CONSTANT, LINEAR, and QUAD in the OUTEST= data set are computed from $S_T$, $S_T^{[2]}$, and $S_T^{[3]}$, the final smoothed values at observation $T$, the last observation used to fit the model. In the OUTEST= data set, the values of $S_T$, $S_T^{[2]}$, and $S_T^{[3]}$ are identified by _TYPE_=S1, _TYPE_=S2, and _TYPE_=S3, respectively.
More detailed descriptions of the forecast computations can be found in Montgomery and Johnson (1976) and Brown (1962).
The standard exponential smoothing model is, in fact, a special case of an ARIMA model (McKenzie 1984). Single exponential smoothing corresponds to an ARIMA(0,1,1) model; double exponential smoothing corresponds to an ARIMA(0,2,2) model; and triple exponential smoothing corresponds to an ARIMA(0,3,3) model.
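The single smoothing case can be checked numerically. The Python sketch below assumes an ARIMA(0,1,1) model with no constant term, written as x[t] − x[t−1] = e[t] − θ·e[t−1], and a moving average parameter θ = 1 − ω. Both function names are hypothetical, and both recursions start from the same illustrative first forecast.

```python
def ses_forecasts(series, omega, first_forecast):
    """One-step-ahead forecasts from single exponential smoothing."""
    forecasts = [first_forecast]
    for x in series:
        forecasts.append(omega * x + (1.0 - omega) * forecasts[-1])
    return forecasts

def arima011_forecasts(series, theta, first_forecast):
    """One-step-ahead forecasts from x[t] - x[t-1] = e[t] - theta * e[t-1]."""
    forecasts = [first_forecast]
    for x in series:
        error = x - forecasts[-1]            # one-step forecast error e[t]
        forecasts.append(x - theta * error)  # forecast of the next value
    return forecasts

data = [12.0, 15.0, 11.0, 14.0, 18.0]
ses = ses_forecasts(data, 0.3, 12.0)
arima = arima011_forecasts(data, 0.7, 12.0)  # theta = 1 - omega
```

The two lists agree up to floating-point rounding, because x − θ(x − f) = (1−θ)x + θf, which is the smoothing recursion with ω = 1 − θ.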
The traditional exponential smoothing calculations can be viewed as a simple and computationally inexpensive method of forecasting the equivalent ARIMA model. The exponential smoothing technique was developed in the 1960s before computers were widely available and before ARIMA modeling methods were developed.
If you use exponential smoothing as a forecasting method, you might consider using the ARIMA procedure to forecast the equivalent ARIMA model as an alternative to the traditional version of exponential smoothing used by PROC FORECAST. One advantage of the ARIMA form is that optimal smoothing weights can be computed as part of fitting the model.
See Chapter 7, "The ARIMA Procedure," for information on forecasting with ARIMA models.
The Time Series Forecasting System provides for exponential smoothing models and allows you to either specify or optimize the smoothing weights. See Chapter 23, "Getting Started with Time Series Forecasting," for details.
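The WINTERS method uses updating equations similar to exponential smoothing to fit parameters for the model

$$x_t = (a + bt)\,s(t) + \epsilon_t$$

where $a$ and $b$ are the trend parameters, and the function $s(t)$ selects the seasonal parameter for the season corresponding to time $t$.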
The WINTERS method assumes that the series values are positive. If negative or zero values are found in the series, a warning is printed and the values are treated as missing.
The preceding standard WINTERS model uses a linear trend. However, PROC FORECAST can also fit a version of the WINTERS method that uses a quadratic trend. When TREND=3 is specified for METHOD=WINTERS, PROC FORECAST fits the following model:

$$x_t = (a + bt + ct^2)\,s(t) + \epsilon_t$$
The quadratic trend version of the Winters method is often unstable, and its use is not recommended.
When TREND=1 is specified, the following constant trend version is fit:

$$x_t = a\,s(t) + \epsilon_t$$
The default for the WINTERS method is TREND=2, which produces the standard linear trend model.
When there are multiple seasons specified, s(t) is the product of the parameters for the seasons. For example, if SEASONS=(MONTH DAY), then s(t) is the product of the seasonal parameter for the month corresponding to the period t, and the seasonal parameter for the day of the week corresponding to period t. When the SEASONS= option is not specified, the seasonal factors s(t) are not included in the model. See the section "Specifying Seasonality" later in this chapter for more information on specifying multiple seasonal factors.
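The estimates of the constant, linear, and quadratic trend parameters are updated using the following equations, where $\omega_1$ is the smoothing weight for the constant parameter, $\omega_2$ is the smoothing weight for the linear and quadratic parameters, and $s_{t-1}(t)$ is the seasonal parameter for the season of time $t$ as estimated at time $t-1$.

For TREND=3,

$$a_t = \omega_1 \frac{x_t}{s_{t-1}(t)} + (1-\omega_1)(a_{t-1} + b_{t-1} + c_{t-1})$$

$$b_t = \omega_2 (a_t - a_{t-1} + c_{t-1}) + (1-\omega_2)(b_{t-1} + 2c_{t-1})$$

$$c_t = \omega_2 \tfrac{1}{2}(b_t - b_{t-1}) + (1-\omega_2)\,c_{t-1}$$

For TREND=2,

$$a_t = \omega_1 \frac{x_t}{s_{t-1}(t)} + (1-\omega_1)(a_{t-1} + b_{t-1})$$

$$b_t = \omega_2 (a_t - a_{t-1}) + (1-\omega_2)\,b_{t-1}$$

For TREND=1,

$$a_t = \omega_1 \frac{x_t}{s_{t-1}(t)} + (1-\omega_1)\,a_{t-1}$$

In this updating system, the trend polynomial is always centered at the current period so that the intercept parameter of the trend polynomial for predicted values at times after $t$ is always the updated intercept parameter $a_t$. For the linear trend model, the predicted value for $\tau$ periods ahead is

$$\hat{x}_{t+\tau} = (a_t + b_t \tau)\,s_t(t+\tau)$$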
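A minimal Python sketch of one linear-trend (TREND=2) update and forecast follows. It assumes the usual Holt-Winters form, in which the new level blends the deseasonalized observation with the previous level plus slope, and the new slope blends the change in level with the previous slope. The function and argument names are illustrative and are not part of PROC FORECAST.

```python
def winters_update(x, a_prev, b_prev, season_factor, w1, w2):
    """One TREND=2 Winters update; returns the new (a, b).

    season_factor is the current estimate of the seasonal factor for the
    season of the new observation x.
    """
    a_new = w1 * x / season_factor + (1.0 - w1) * (a_prev + b_prev)
    b_new = w2 * (a_new - a_prev) + (1.0 - w2) * b_prev
    return a_new, b_new

def winters_forecast(a, b, steps_ahead, season_factor):
    """Forecast steps_ahead periods past the current period."""
    return (a + b * steps_ahead) * season_factor

# Illustrative numbers: level 10, slope 1, seasonal factor 1.2, new value 13.2.
a, b = winters_update(x=13.2, a_prev=10.0, b_prev=1.0,
                      season_factor=1.2, w1=0.5, w2=0.5)
forecast = winters_forecast(a, b, steps_ahead=2, season_factor=0.9)
```

Because the trend is re-centered at each period, the forecast uses only the latest level and slope together with the seasonal factor of the target period.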
The seasonal parameters are updated when the season changes in the data, using the mean of the ratios of the actual to the predicted values for the season. For example, if SEASONS=MONTH and INTERVAL=DAY, then, when the observation for the first of February is encountered, the seasonal parameter for January is updated using the formula

$$s_t(t-1) = \omega_3 \frac{1}{31} \sum_{i=t-31}^{t-1} \frac{x_i}{a_i} + (1-\omega_3)\,s_{t-1}(t-1)$$

where $t$ is February 1 of the current year, $\omega_3$ is the smoothing weight for the seasonal factors, and $s_t(t-1)$ is the seasonal parameter for January updated with the data available at time $t$.
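The seasonal update can be sketched in Python as a blend of the old factor with the mean ratio over the season that just ended. This sketch assumes each ratio is the actual value divided by the trend level estimated for the same period; the function and argument names are hypothetical.

```python
def update_seasonal(actuals, levels, s_prev, w3):
    """Blend the mean actual/level ratio for the finished season with s_prev.

    actuals and levels hold one value per period of the season just ended.
    """
    ratios = [x / a for x, a in zip(actuals, levels)]
    mean_ratio = sum(ratios) / len(ratios)
    return w3 * mean_ratio + (1.0 - w3) * s_prev

# Two periods with ratios 1.2 and 1.1, old factor 1.0, seasonal weight 0.25.
new_factor = update_seasonal([12.0, 11.0], [10.0, 10.0], s_prev=1.0, w3=0.25)
```

For daily data with monthly seasons, the lists would hold the 31 January values when the update is triggered by the first February observation.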
When multiple seasons are used, $s_t(t)$ is a product of seasonal factors. For example, if SEASONS=(MONTH DAY) then $s_t(t)$ is the product of the seasonal factors for the month and for the day of the week: $s_t(t) = sm_t(t)\,sd_t(t)$.

The factor $sm_t(t)$ is updated at the start of each month using the preceding formula, and the factor $sd_t(t)$ is updated at the start of each week using the following formula, where $\omega_3$ is the smoothing weight for the seasonal factors:

$$sd_t(t-1) = \omega_3 \frac{1}{7} \sum_{i=t-7}^{t-1} \frac{x_i}{a_i} + (1-\omega_3)\,sd_{t-1}(t-1)$$
Missing values after the start of the series are replaced with one-step-ahead predicted values, and the predicted value is substituted for $x_i$ and applied to the updating equations.
If the WEIGHT= option is not used, then the first smoothing weight, $\omega_1$, defaults to $1 - 0.8^{1/\mathit{trend}}$, where trend is the value of the TREND= option. This produces defaults of WEIGHT=0.2 for TREND=1, WEIGHT=0.10557 for TREND=2, and WEIGHT=0.07168 for TREND=3.
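The three default values quoted above can be reproduced with a short Python check, reading the default as 1 − 0.8^(1/trend). The function name is hypothetical.

```python
def default_weight(trend):
    """Default first smoothing weight for a given TREND= value (1, 2, or 3)."""
    return 1.0 - 0.8 ** (1.0 / trend)

# Round to five decimals to compare with the documented defaults.
defaults = {trend: round(default_weight(trend), 5) for trend in (1, 2, 3)}
```

The default weight shrinks as the trend order grows, so higher-order trend models react more slowly to each new observation.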
The Time Series Forecasting System provides for generating forecast models using Winters Method and allows you to specify or optimize the weights. See Chapter 23, "Getting Started with Time Series Forecasting," for details.
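The Winters confidence limits assume that the seasonal factors vary only slightly about a set of fixed seasonal factors and that the remaining variation of the series is small relative to its mean level. The equations are written

$$s_t(t) = I(t)(1+\delta_t)$$

$$x_t = \mu I(t)(1+\gamma_t)$$

$$a_t = \xi(1+\alpha_t)$$

where $\mu$ is the mean level and $I(t)$ are the fixed seasonal factors. Assuming that $\alpha_t$ and $\delta_t$ are small, the forecast equations can be linearized and only first-order terms in $\delta_t$ and $\alpha_t$ kept. In terms of forecasts for $\gamma_t$, this linearized system is equivalent to a seasonal ARIMA model. Confidence limits for $\gamma_t$ are based on this ARIMA model and converted into confidence limits for $x_t$ using $s_t(t)$ as estimates of $I(t)$.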
The exponential smoothing confidence limits are based on an approximation to a weighted regression model, whereas the preceding Winters confidence limits are based on an approximation to an ARIMA model. You can use METHOD=WINTERS without the SEASONS= option to do exponential smoothing and get confidence limits for the EXPO forecasts based on the ARIMA model approximation. These are generally more pessimistic than the weighted regression confidence limits produced by METHOD=EXPO.
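The ADDWINTERS method is like the WINTERS method except that the seasonal parameters are added to the trend instead of multiplied with it. The default TREND=2 model is

$$x_t = a + bt + s(t) + \epsilon_t$$

The updating equations and confidence limit calculations described for the WINTERS method in the preceding section are modified accordingly for the additive version.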
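If the SEASONS= option is not specified, the WINTERS method has no seasonal factors and reduces to the Holt two-parameter version of double exponential smoothing. Double exponential smoothing with weight ω can be duplicated with METHOD=WINTERS by setting the two weights to α = ω(2−ω) and β = ω/(2−ω). The following statements then produce the same forecasts:

   proc forecast method=expo trend=2 weight=ω ... ;

   proc forecast method=winters trend=2
                 weight=(α,β) ... ;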
Although the forecasts are the same, the confidence limits are computed differently.
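The agreement of the forecasts can be checked numerically. The Python sketch below assumes that double exponential smoothing with weight ω corresponds to two-parameter (Holt) smoothing with level weight α = ω(2−ω) and trend weight β = ω/(2−ω). It uses the standard textbook level and slope formulas for double exponential smoothing, which are not taken from this section, and hypothetical starting values chosen so that both recursions begin from the same level and slope.

```python
def brown_double(series, omega, a0, b0):
    """Double exponential smoothing; returns one-step-ahead forecasts."""
    theta = 1.0 - omega
    s1 = a0 - theta * b0 / omega          # starting S consistent with a0, b0
    s2 = a0 - 2.0 * theta * b0 / omega    # starting S[2] consistent with a0, b0
    forecasts = []
    for x in series:
        s1 = omega * x + theta * s1
        s2 = omega * s1 + theta * s2
        level = 2.0 * s1 - s2
        slope = omega / theta * (s1 - s2)
        forecasts.append(level + slope)
    return forecasts

def holt(series, alpha, beta, a0, b0):
    """Two-parameter (Holt) smoothing; returns one-step-ahead forecasts."""
    a, b = a0, b0
    forecasts = []
    for x in series:
        a_new = alpha * x + (1.0 - alpha) * (a + b)
        b = beta * (a_new - a) + (1.0 - beta) * b
        a = a_new
        forecasts.append(a + b)
    return forecasts

omega = 0.3
alpha = omega * (2.0 - omega)   # level weight
beta = omega / (2.0 - omega)    # trend weight
data = [20.0, 23.0, 21.5, 26.0, 27.5, 31.0]
brown_fc = brown_double(data, omega, a0=20.0, b0=1.0)
holt_fc = holt(data, alpha, beta, a0=20.0, b0=1.0)
```

The two forecast lists match up to floating-point rounding. The sketch covers only the point forecasts and says nothing about the confidence limits.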
The noisier the data, the lower the weight that should be given to the most recent observation. Another factor to consider is how quickly the mean of the time series is changing. If the mean of the series is changing rapidly, relatively more weight should be given to the most recent observation. The more stable the series over time, the lower the weight that should be given to the most recent observation.
Note that the smoothing weights should be set separately for each series; weights that produce good results for one series may be poor for another series. Since PROC FORECAST does not have a feature to use different weights for different series, when forecasting multiple series with the EXPO, WINTERS, or ADDWINTERS method it may be desirable to use different PROC FORECAST steps with different WEIGHT= options.
For the Winters method, many combinations of weight values may produce unstable noninvertible models, even though all three weights are between 0 and 1. When the model is noninvertible, the forecasts depend strongly on values in the distant past, and predictions are determined largely by the starting values. Unstable models usually produce poor forecasts. The Winters model may be unstable even if the weights are optimally chosen to minimize the in-sample MSE. Refer to Archibald (1990) for a detailed discussion of the unstable region of the parameter space of the Winters model.
Optimal weights and forecasts for exponential smoothing models can be computed using the ARIMA procedure. For more information, see "Exponential Smoothing as an ARIMA Model" earlier in this chapter.
The ARIMA procedure can also be used to compute optimal weights and forecasts for seasonal ARIMA models similar to the Winters type methods. In particular, an ARIMA$(0,1,1)\times(0,1,1)_S$ model may be a good alternative to the additive version of the Winters method. The ARIMA$(0,1,1)\times(0,1,1)_S$ model fit to the logarithms of the series may be a good alternative to the multiplicative Winters method. See Chapter 7, "The ARIMA Procedure," for information on forecasting with ARIMA models.
The Time Series Forecasting System can be used to automatically select an appropriate smoothing method as well as to optimize the smoothing weights. See Chapter 23, "Getting Started with Time Series Forecasting," for more information.
By default, starting values for the trend parameters are computed by a time-trend regression over the first few observations for the series. Alternatively, you can specify the starting value for the trend parameters with the ASTART=, BSTART=, and CSTART= options.
The number of observations used in the time-trend regression for starting values depends on the NSTART= option. For METHOD=EXPO, NSTART= beginning values of the series are used, and the coefficients of the time-trend regression are then used to form the initial smoothed values $S_0$, $S_0^{[2]}$, and $S_0^{[3]}$.
For METHOD=WINTERS or METHOD=ADDWINTERS, n complete seasonal cycles are used to compute starting values for the trend parameter, where n is the value of the NSTART= option. For example, for monthly data the seasonal cycle is one year, so NSTART=2 specifies that the first 24 observations at the beginning of each series are used for the time trend regression used to calculate starting values.
The starting values for the seasonal factors for the WINTERS and ADDWINTERS methods are computed from seasonal averages over the first few complete seasonal cycles at the beginning of the series. The number of seasonal cycles averaged to compute starting seasonal factors is controlled by the NSSTART= option. For example, for monthly data with SEASONS=12 or SEASONS=MONTH, the first n January values are averaged to get the starting value for the January seasonal parameter, where n is the value of the NSSTART= option.
The $s_0(i)$ seasonal parameters are set to the ratio (for WINTERS) or difference (for ADDWINTERS) of the mean for the season to the overall mean for the observations used to compute seasonal starting values.
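A minimal Python sketch of this starting-value calculation for a single seasonal cycle follows. It assumes the series begins at the first season of a cycle and has no missing values; the function and argument names are hypothetical.

```python
def starting_seasonal_factors(values, cycle_length, n_cycles, additive=False):
    """Starting factor per season from the first n_cycles complete cycles."""
    used = values[:cycle_length * n_cycles]
    overall_mean = sum(used) / len(used)
    factors = []
    for season in range(cycle_length):
        season_values = used[season::cycle_length]   # same season in each cycle
        season_mean = sum(season_values) / len(season_values)
        if additive:
            factors.append(season_mean - overall_mean)   # ADDWINTERS: difference
        else:
            factors.append(season_mean / overall_mean)   # WINTERS: ratio
    return factors

# Two cycles of four seasons; the overall mean is 12.
series = [6.0, 14.0, 10.0, 10.0, 10.0, 18.0, 14.0, 14.0]
ratio_factors = starting_seasonal_factors(series, cycle_length=4, n_cycles=2)
diff_factors = starting_seasonal_factors(series, 4, 2, additive=True)
```

Here the season means are 8, 16, 12, and 12, so the ratio factors are 2/3, 4/3, 1, and 1, and the additive factors are −4, 4, 0, and 0.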
For example, if METHOD=WINTERS, INTERVAL=DAY, SEASONS=(MONTH DAY), and NSSTART=2 (the default), the initial seasonal parameter for January is the ratio of the mean value over days in the first two Januarys after the start of the series (that is, after the first nonmissing value), to the mean value for all days read for initialization of the seasonal factors. Likewise, the initial factor for Sundays is the ratio of the mean value for Sundays to the mean of all days read.
For the ASTART=, BSTART=, and CSTART= options, the values specified are associated with the variables in the VAR statement in the order in which the variables are listed (the first value with the first variable, the second value with the second variable, and so on). If there are fewer values than variables, default starting values are used for the later variables. If there are more values than variables, the extra values are ignored.
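The positional pairing can be illustrated with a short Python sketch. The function name is hypothetical, and `None` stands in for "use the default starting value".

```python
def assign_start_values(variables, values, default=None):
    """Pair starting values with VAR variables by position.

    Variables without a value get the default; extra values are ignored.
    """
    assigned = {}
    for position, name in enumerate(variables):
        assigned[name] = values[position] if position < len(values) else default
    return assigned

# Three variables but only two values: the third falls back to the default.
pairs = assign_start_values(["sales", "cost", "units"], [100.0, 20.0])
```

Because the pairing is purely positional, reordering the variables in the VAR statement changes which starting value each one receives.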
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.