PROC DATASOURCE Statement
- PROC DATASOURCE options;
The following options can be used in the PROC DATASOURCE statement:
- ALIGN= option
-
controls the alignment of SAS dates used to identify output observations.
The ALIGN= option allows the following values: BEGINNING|BEG|B,
MIDDLE|MID|M, and ENDING|END|E. BEGINNING is the default.
- ASCII
-
specifies that the incoming data are in ASCII. This option is needed when
the native character set of your host machine is EBCDIC.
- DBNAME= 'database name'
-
specifies the FAME database to access. Use this option only with the
FILETYPE=FAME option.
The character string you specify in the DBNAME= option is passed
through to FAME.
Specify the value of this option as you would when accessing the
database from within FAME software.
- EBCDIC
-
specifies that the incoming data are in EBCDIC. This option is needed when
the native character set of your host machine is ASCII.
- FAMEPRINT
-
prints the FAME command file generated by PROC DATASOURCE
and the log file produced by the FAME component of the interface system.
Use this option only with the FILETYPE=FAME option.
- FILETYPE= entry
-
- DBTYPE= dbtype
-
specifies the kind of input data file to process.
See the "Supported File Types" section
for a list of supported file types.
The FILETYPE= option is required.
- INDEX
-
creates a set of single indexes from BY variables for
the OUT= data set.
Under some circumstances, creating indexes for a SAS
data set can increase the efficiency of locating observations
when BY or WHERE statements are used in subsequent steps.
Refer to SAS Language: Reference, Version 7, First Edition
for more information on SAS indexes.
The INDEX option is ignored when no OUT= data set is created
or when the data file does not contain any BY variables.
The INDEX= data set option can be used to override the
index variable definitions.
- INFILE= fileref
-
- INFILE= (fileref1 fileref2 ... filerefn)
-
specifies the fileref assigned to the input data file.
The default value is DATAFILE.
The fileref used in the INFILE= option (or, if no INFILE= option
is specified, the fileref DATAFILE) must be associated
with the physical data file in a FILENAME statement.
(On some operating systems, the fileref assignment can be made
with the system's control language, and a FILENAME
statement may not be needed.
Refer to SAS Language: Reference, Version 7, First Edition
for more details on the FILENAME statement.)
Physical data files can reside on tapes, disks, diskettes, CD-ROM,
or other media.
For some file types, the data are distributed over several files.
In this case, the INFILE= option is required, and it lists in parentheses
the filerefs for each of the files making up the database. The order in which
these filerefs are listed is important and must conform to the specifics of
each file type as explained in the "Supported File Types" section.
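The fileref association described above can be sketched as follows. This is a minimal illustration, not a definitive invocation: the physical file name, the fileref MYDATA, the output data set name, and the choice of FILETYPE=CITIBASE are all assumptions made for the example, so substitute the values that match your own data file.

```sas
/* Hypothetical physical file name; replace it with the location   */
/* of your own data file.                                          */
filename mydata 'host-specific-file-name';

/* INFILE= names the fileref assigned above. If the fileref were   */
/* called DATAFILE instead, the INFILE= option could be omitted,   */
/* because DATAFILE is the default.                                */
/* FILETYPE=CITIBASE is an assumption about the file being read.   */
proc datasource filetype=citibase
                infile=mydata
                out=work.series;
run;
```

For a database spread over several files, the same pattern applies with one FILENAME statement per file and the filerefs listed in parentheses in the order required by the file type.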
- LRECL= lrecl
-
- LRECL= (lrecl1 lrecl2 ... lrecln)
-
specifies the logical record length, in bytes, of the input file.
Use this option only if you need to override the default LRECL of the file.
For some file types, the data are distributed over several files.
In this case, the LRECL= option lists in parentheses
the LRECL values for each of the files making up the database. The order in which
these values are listed is important and must conform to the specifics of
each file type as explained in the "Supported File Types" section.
- RECFM= recfm
-
- RECFM= (recfm1 recfm2 ... recfmn)
-
specifies the record format of the input file. Use this option only if you
need to override the default record format of the file.
For some file types, the data are distributed over several files.
In this case, the RECFM= option lists in parentheses
the RECFM values for each of the files making up the database. The order in which
these values are listed is important and must conform to the specifics of
each file type as explained in the "Supported File Types" section.
The possible values of RECFM= are:
- F or FIXED for fixed length records
- N or BIN for binary records
- D or VAR for varying length records
- U or DEF for host default record format
- DOM_V or DOMAIN_VAR or BIN_V or BIN_VAR for UNIX binary record format
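As a rough sketch of overriding the file defaults, the step below sets both RECFM= and LRECL= for a single input file. The record length of 80, the file name, the output data set name, and FILETYPE=CITIBASE are hypothetical values chosen only for illustration; the correct settings depend on how your file was written.

```sas
/* Hypothetical physical file name.                                */
filename datafile 'host-specific-file-name';

/* Override the default record format and record length.           */
/* RECFM=FIXED requests fixed length records; LRECL=80 is an       */
/* assumed record length, not a documented default.                */
proc datasource filetype=citibase
                infile=datafile
                recfm=fixed
                lrecl=80
                out=work.series;
run;
```

In most cases neither option is needed, because the defaults for the file type apply.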
- INTERVAL= interval
-
- FREQUENCY= interval
-
- TYPE= interval
-
specifies the periodicity of series selected
for output to the OUT= data set.
The OUT= data set created by PROC DATASOURCE can contain only
time series with the same periodicity.
Some data files contain time series with different periodicities;
for example, a file may contain both monthly series and quarterly series.
Use the INTERVAL= option to indicate which periodicity you want.
If you want to extract series with different periodicities,
use different PROC DATASOURCE invocations with the desired INTERVAL= options.
Common values for INTERVAL= are YEAR, QUARTER, MONTH, WEEK,
and DAY. The values allowed, as well as the default value of
the INTERVAL= option, depend on the file type.
See the "Supported File Types" section for the
INTERVAL= values appropriate to the data file type you are reading.
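Because each OUT= data set holds series of one periodicity only, a file with mixed periodicities is read once per periodicity. The following sketch assumes a file that contains both monthly and quarterly series; the file name, data set names, and FILETYPE=CITIBASE are assumptions made for the example.

```sas
/* Hypothetical physical file name.                                */
filename datafile 'host-specific-file-name';

/* First invocation: extract only the monthly series.              */
proc datasource filetype=citibase
                infile=datafile
                interval=month
                out=work.monthly;
run;

/* Second invocation: extract only the quarterly series            */
/* from the same file into a separate data set.                    */
proc datasource filetype=citibase
                infile=datafile
                interval=quarter
                out=work.quarterly;
run;
```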
- OUT= SAS-data-set
-
names the output data set for the time series
extracted from the data file.
If none of the output data set options are specified, including
the OUT= data set itself, an OUT= data set is created and named
according to the DATAn convention. However, when you create
any of the other output data sets, such as OUTCONT=, OUTBY=, OUTALL=,
or OUTEVENT=, you must explicitly specify the OUT= data set; otherwise,
it will not be created.
See the "OUT= Data Set" section for further details.
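The interaction between OUT= and the other output data set options can be illustrated with two steps. This is a sketch under stated assumptions: the file name, data set names, and FILETYPE=CITIBASE are hypothetical choices for the example.

```sas
/* Hypothetical physical file name.                                */
filename datafile 'host-specific-file-name';

/* Only OUTCONT= is specified, so no OUT= data set is created.     */
/* This is a convenient way to inspect the contents of a file      */
/* without extracting any series.                                  */
proc datasource filetype=citibase
                infile=datafile
                interval=month
                outcont=work.contents;
run;

/* OUT= is named explicitly alongside OUTCONT=, so both the        */
/* extracted series and the contents listing are written.          */
proc datasource filetype=citibase
                infile=datafile
                interval=month
                out=work.series
                outcont=work.contents;
run;
```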
- OUTALL= SAS-data-set
-
writes information on the contents of the input data file
to an output data set.
The OUTALL= data set includes descriptive information,
time ranges, and observation counts for all the time series
within each BY group. By default, no OUTALL= data set is created.
The OUTALL= data set contains the Cartesian product of the
information output by the OUTCONT= and OUTBY= options.
In data files for which there are no cross sections,
the OUTALL= and OUTCONT= data sets are almost equivalent,
except that the OUTALL= data set also reports time ranges and observation
counts of series.
See the "OUTALL= Data Set" section for further details.
- OUTBY= SAS-data-set
-
writes information on the BY variables to an output data set.
The OUTBY= data set contains the list of cross sections in the
database delimited by the unique set of values that the
BY variables assume.
Unless the OUTSELECT=OFF option is present, only the selected BY groups get
written to the OUTBY= data set.
If you omit the OUTBY= option, no OUTBY= data set is created.
See the "OUTBY= Data Set" section for further details.
- OUTCONT= SAS-data-set
-
writes information on the contents of the input data file
to an output data set.
The OUTCONT= data set includes descriptive information on
the unique series of the selected periodicity in the data file.
Unless the OUTSELECT=OFF option is specified,
the OUTCONT= data set includes observations only for
the series selected for output to the OUT= data set;
with OUTSELECT=OFF, it describes all series of the selected periodicity.
By default, no OUTCONT= data set is created.
See the "OUTCONT= Data Set" section for further details.
- OUTEVENT= SAS-data-set
-
names the output data set for event-oriented time series data.
This option can be used only when CRSP stock files are
being processed. For all other file types, it is ignored.
See the "OUTEVENT= Data Set" section for further details.
- OUTSELECT= ON | OFF
-
determines whether to output all observations (OUTSELECT=OFF) or
only those corresponding to the selected time series and selected BY
groups (OUTSELECT=ON) to the OUTCONT=, OUTBY=, and OUTALL= data sets. The
default is OUTSELECT=ON. The OUTSELECT= option is relevant only when
at least one of the OUTCONT=, OUTBY=, and OUTALL= options is specified.
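The effect of OUTSELECT= is easiest to see when a subset of series is selected. The sketch below assumes that series are selected with the procedure's KEEP statement, which is described outside this section; the series names SERIES1 and SERIES2, the file name, the data set names, and FILETYPE=CITIBASE are hypothetical.

```sas
/* Hypothetical physical file name.                                */
filename datafile 'host-specific-file-name';

/* With OUTSELECT=OFF, the OUTCONT= data set describes every       */
/* monthly series in the file, while the OUT= data set still       */
/* contains only the series selected by the KEEP statement.        */
/* SERIES1 and SERIES2 are placeholder series names.               */
proc datasource filetype=citibase
                infile=datafile
                interval=month
                outselect=off
                out=work.series
                outcont=work.contents;
   keep series1 series2;
run;
```

Dropping OUTSELECT=OFF (or specifying OUTSELECT=ON) would limit the OUTCONT= data set to the selected series.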
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.