IDENTIFY Statement
- IDENTIFY VAR=variable options;
The IDENTIFY statement specifies the time series to be modeled,
differences the series if desired,
and computes statistics to help identify models to fit.
Use an IDENTIFY statement for each time series that you want to model.
If other time series are to be used as inputs in a subsequent
ESTIMATE statement, they must be listed in a CROSSCORR= list in the
IDENTIFY statement.
The following options are used in the IDENTIFY statement.
The VAR= option is required.
- ALPHA= significance-level
-
The ALPHA= option specifies the significance level for tests
in the IDENTIFY statement. The default is 0.05.
- CENTER
-
centers each time series by subtracting its sample mean.
The analysis is done on the centered data.
Later, when forecasts are generated, the mean is added back.
Note that centering is done after differencing.
The CENTER option is normally used in conjunction with the
NOCONSTANT option of the ESTIMATE statement.
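As a minimal sketch of how CENTER and NOCONSTANT are typically paired, the following code differences the series, centers it, and then fits a model without a constant term. The data set test and variable x follow the other examples in this section; the P=1 order is an arbitrary choice for illustration only.

```sas
proc arima data=test;
   /* Difference x at lag 1, then subtract the sample mean of the differenced series */
   identify var=x(1) center;
   /* The mean is already removed, so fit the model without a constant term */
   /* p=1 is an illustrative order, not a recommendation */
   estimate p=1 noconstant;
run;
```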
- CLEAR
-
deletes all old models.
This option is useful when you want to delete old models so that the
input variables are not prewhitened.
(See the section "Prewhitening" later in this chapter for more information.)
- CROSSCORR= variable (d11, d12, ..., d1k)
- CROSSCORR= (variable (d11, d12, ..., d1k) ... variable (d21, d22, ..., d2k))
-
names the variables cross correlated with the response variable given by the
VAR= specification.
Each variable name can be followed by a list of differencing lags
in parentheses, the same as for the VAR= specification.
If differencing is specified for a variable in the CROSSCORR= list,
the differenced series is cross correlated with the VAR= option series,
and the differenced series is used when the ESTIMATE statement
INPUT= option refers to the variable.
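The following sketch illustrates how a CROSSCORR= list with differencing feeds a later ESTIMATE statement. The variable names y, x1, and x2 are hypothetical and are assumed to exist in the data set test.

```sas
proc arima data=test;
   /* y is the response series; x1 and x2 are hypothetical input series */
   /* Each series is differenced at lag 1 before cross correlations are computed */
   identify var=y(1) crosscorr=(x1(1) x2(1));
   /* INPUT= refers to x1 and x2; the differenced versions are used */
   estimate input=(x1 x2);
run;
```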
- DATA= SAS-data-set
-
specifies the input SAS data set containing the time series.
If the DATA= option is omitted,
the DATA= data set specified in the PROC ARIMA statement is
used; if the DATA= option is omitted from the PROC ARIMA statement as well,
the most recently created data set is used.
- ESACF
-
computes the extended sample autocorrelation
function and uses these estimates to tentatively identify the
autoregressive and moving average orders of mixed models.
The ESACF option generates two tables. The first table displays
extended sample autocorrelation estimates, and the second table
displays probability values that can be used to test
the significance of these estimates.
The P=(pmin: pmax) and
Q=(qmin: qmax)
options determine the size of the table.
The autoregressive and moving average orders are tentatively identified
by finding a triangular pattern in which all values are
insignificant. The ARIMA procedure finds these patterns based on the
IDENTIFY statement ALPHA= option and displays possible
recommendations for the orders.
The following code generates an ESACF table with
dimensions of p=(0:7) and q=(0:8).
proc arima data=test;
identify var=x esacf p=(0:7) q=(0:8);
run;
See the section "The ESACF Method" for more information.
- MINIC
-
uses information criteria or penalty
functions to provide tentative ARMA order identification.
The MINIC option generates a table containing the computed
information criterion associated with various ARMA model orders.
The PERROR= option determines
the range of the autoregressive model orders used to estimate
the error series.
The P=(pmin: pmax) and
Q=(qmin: qmax)
options determine the size of the table.
The ARMA orders are tentatively identified by those orders that
minimize the information criterion.
The following code generates a MINIC table with
default dimensions of p=(0:5) and q=(0:5) and with the error series
estimated by an autoregressive model whose order is chosen
to minimize the AIC in the range from 8 to 11.
proc arima data=test;
identify var=x minic perror=(8:11);
run;
See the section "The MINIC Method" for more information.
- NLAG= number
-
indicates the number of lags to consider in computing the
autocorrelations and cross correlations.
To obtain preliminary estimates of an ARIMA(p,d,q) model,
the NLAG= value must be at least p+q+d.
The number of observations must be greater than or equal to the NLAG= value.
The default value for NLAG= is 24 or one-fourth the number of
observations, whichever is less. Even when you specify an NLAG= value,
the procedure can adjust it to suit the data set.
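As a brief illustration of the rule that NLAG= must be at least p+q+d, the sketch below requests 12 lags. The choice of 12 and the ARIMA(2,1,1) model mentioned in the comment are assumptions made for this example.

```sas
proc arima data=test;
   /* 12 lags is enough for preliminary estimates of, say, an ARIMA(2,1,1) model, */
   /* because p+q+d = 2+1+1 = 4 does not exceed 12 */
   identify var=x(1) nlag=12;
run;
```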
- NOMISS
-
uses only the first continuous sequence of data with no missing values.
By default, all observations are used.
- NOPRINT
-
suppresses the normal printout
(including the correlation plots)
generated by the IDENTIFY statement.
- OUTCOV= SAS-data-set
-
writes the autocovariances, autocorrelations,
inverse autocorrelations, partial autocorrelations, and
cross covariances to an output SAS data set.
If the OUTCOV= option is not specified,
no covariance output data set is created.
See the section "OUTCOV= Data Set" later in this chapter for more information.
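A minimal sketch of saving the correlation statistics to a data set follows. The output data set name covout is hypothetical, and combining OUTCOV= with NOPRINT is a choice made for this example rather than a requirement.

```sas
proc arima data=test;
   /* Write the autocovariances and related statistics to covout; */
   /* NOPRINT suppresses the usual IDENTIFY printout */
   identify var=x outcov=covout noprint;
run;
quit;

/* Inspect the first few observations of the output data set */
proc print data=covout(obs=10);
run;
```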
- P= (pmin: pmax)
-
see the ESACF, MINIC, and SCAN options for details.
- PERROR= ()
-
see the ESACF, MINIC, and SCAN options for details.
- Q= (qmin: qmax)
-
see the ESACF, MINIC, and SCAN options for details.
- SCAN
-
computes estimates of the squared canonical
correlations and uses these estimates to tentatively identify the
autoregressive and moving average orders of mixed models.
The SCAN option generates two tables. The first table
displays squared canonical correlation estimates, and the second
table displays probability values that can be used to test
the significance of these estimates.
The P=(pmin: pmax) and Q=(qmin: qmax)
options determine the size of each table.
The autoregressive and moving average orders are tentatively identified
by finding a rectangular pattern in which all values are
insignificant. The ARIMA procedure finds these patterns based on the
IDENTIFY statement ALPHA= option and displays possible recommendations
for the orders.
The following code generates a SCAN table with
default dimensions of p=(0:5) and q=(0:5).
The recommended orders are based on a significance
level of 0.1.
proc arima data=test;
identify var=x scan alpha=0.1;
run;
See the section "The SCAN Method" for more information.
- STATIONARITY=
-
performs stationarity tests. Stationarity tests can be used
to determine whether differencing terms should be included in
the model specification. In each stationarity test, the autoregressive
orders can be specified by a range, test=ar_max, or
as a list of values, test=(ar_1, ..., ar_n), where test
is ADF, PP, or RW.
The default is (0,1,2).
See the "Stationarity Tests" section for more information.
- STATIONARITY=(ADF= AR orders DLAG= s)
- STATIONARITY=(DICKEY= AR orders DLAG= s)
-
performs augmented Dickey-Fuller tests. If the DLAG=s
option is specified with s greater than one,
seasonal Dickey-Fuller tests are performed. The maximum
allowable value of s is 12. The default value of s is one.
The following code performs augmented Dickey-Fuller
tests with autoregressive orders 2 and 5.
proc arima data=test;
identify var=x stationarity=(adf=(2,5));
run;
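A seasonal variant of the test might look like the following sketch. It assumes monthly data, so that DLAG=12 is a sensible seasonal lag; the autoregressive orders are carried over from the example above.

```sas
proc arima data=test;
   /* DLAG=12 requests seasonal Dickey-Fuller tests (assumes monthly data); */
   /* 12 is the maximum value allowed for DLAG= */
   identify var=x stationarity=(adf=(2,5) dlag=12);
run;
```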
- STATIONARITY=(PP= AR orders)
- STATIONARITY=(PHILLIPS= AR orders)
-
performs Phillips-Perron tests.
The following code performs Phillips-Perron
tests with autoregressive orders ranging from 0 to 6.
proc arima data=test;
identify var=x stationarity=(pp=6);
run;
- STATIONARITY=(RW= AR orders)
- STATIONARITY=(RANDOMWALK= AR orders)
-
performs random-walk with drift tests.
The following code performs random-walk with drift
tests with autoregressive orders ranging from 0 to 2.
proc arima data=test;
identify var=x stationarity=(rw);
run;
- VAR= variable
- VAR= variable ( d1, d2, ..., dk )
-
names the variable containing the time series to analyze.
The VAR= option is required.
A list of differencing lags can be placed in parentheses
after the variable name to request that the series be differenced
at these lags.
For example, VAR=X(1) takes the first differences of X.
VAR=X(1,1) requests that X be differenced twice,
both times with lag 1, producing a second difference series, which
is $(X_t - X_{t-1}) - (X_{t-1} - X_{t-2}) = X_t - 2X_{t-1} + X_{t-2}$.
VAR=X(2) differences X once at lag two, giving $X_t - X_{t-2}$.
If differencing is specified, it is the differenced series
that is processed by any subsequent ESTIMATE statement.
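To illustrate how differencing carries through to estimation, the sketch below differences x twice at lag 1 and then fits a model to the resulting series. The P=1 order is an arbitrary assumption for the example.

```sas
proc arima data=test;
   /* Second difference of x: both differences are taken at lag 1 */
   identify var=x(1,1);
   /* This ESTIMATE statement is applied to the twice-differenced series */
   /* p=1 is an illustrative order only */
   estimate p=1;
run;
```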
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.