Creating a SAS Data File or a SAS Data View
You can create
either a SAS data file, a data set that holds actual data, or a SAS data view,
a data set that references data that is stored elsewhere. By default, you
create a SAS data file. To create a SAS data view instead, use the VIEW= option
on the DATA statement. With a data view you can, for example, process monthly
sales figures without having to edit your DATA step. Whenever you need to
create output, the output from a data view reflects the current input data
values.
The following DATA statement creates a data view called
MONTHLY_SALES.
data monthly_sales / view=monthly_sales;
The following DATA statement creates a data file called
TEST_RESULTS.
data test_results;
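The difference between the two is easiest to see in a complete program. The following is a minimal sketch, not a definitive implementation: the SALES_RAW data set, its variables, its values, and the 5% commission rate are all assumptions made for illustration. The DATA step that names the view only stores the view definition; the input data is read when the view is later used, here by the PRINT procedure.

```sas
/* Hypothetical source data; the name, variables, and values are assumptions. */
data sales_raw;
   input Month $ Amount;
   datalines;
JAN 1200
FEB 1350
;

/* Define the view. No observations are read at this point. */
data monthly_sales / view=monthly_sales;
   set sales_raw;
   Commission=Amount*0.05;   /* illustrative calculation */
run;

/* The view executes here and reads the current contents of SALES_RAW. */
proc print data=monthly_sales;
run;
```

If SALES_RAW is replaced with next month's figures, rerunning only the PROC PRINT step should show the new values, because the view rereads its input each time it is used.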
You select data-reading statements based on the source of your input data.
There are at least six sources of input data:
- raw data in an external file
- raw data in the jobstream (instream data)
- data in SAS data sets
- data that is created by programming statements
- data that you can remotely access through an FTP protocol, TCP/IP socket,
  a SAS catalog entry, or a URL
- data that is stored in a Database Management System (DBMS) or other
  vendor's data files.
Usually DATA steps read input data records from only
one of the first three sources of input. However, DATA steps can use a combination
of some or all of the sources.
Example 1: Reading External File Data
The components of a DATA step that produce a SAS data set from
raw data stored in an external file are outlined here.
data weight; [1]
infile 'your-input-file'; [2]
input IDnumber $ Week1 Week16; [3]
WeightLoss=Week1-Week16; [4]
run; [5]
proc print data=weight; [6]
run; [7]
[1] Begin the DATA step and create a SAS data set called WEIGHT.
[2] Specify the external file that contains your data.
[3] Read a record and assign values to three variables.
[4] Calculate a value for variable WeightLoss.
[5] Execute the DATA step.
[6] Print data set WEIGHT using the PRINT procedure.
[7] Execute the PRINT procedure.
Example 2: Reading Instream Data Lines
This example reads raw data from instream data lines.
data weight2; [1]
input IDnumber $ Week1 Week16; [2]
WeightLoss2=Week1-Week16; [3]
datalines; [4]
2477 195 163
2431 220 198
2456 173 155
2412 135 116
; [5]
proc print data=weight2; [6]
run; [7]
[1] Begin the DATA step and create SAS data set WEIGHT2.
[2] Read a data line and assign values to three variables.
[3] Calculate a value for variable WeightLoss2.
[4] Begin the data lines.
[5] Signal end of data lines with a semicolon and execute the DATA step.
[6] Print data set WEIGHT2 using the PRINT procedure.
[7] Execute the PRINT procedure.
Example 3: Reading Instream Data Lines with Missing Values
You can also take advantage of options on the INFILE statement
when you read instream data lines. This example shows the use of the MISSOVER
option, which prevents the INPUT statement from reading the next data line
when a record is too short and assigns missing values to the variables
for which the record contains no data.
data weight2;
infile datalines missover; [1]
input IDnumber $ Week1 Week16;
WeightLoss2=Week1-Week16;
datalines; [2]
2477 195 163
2431
2456 173 155
2412 135 116
; [3]
proc print data=weight2; [4]
run; [5]
[1] Use the MISSOVER option to assign missing values to variables that do
    not contain values.
[2] Begin data lines.
[3] Signal end of data lines and execute the DATA step.
[4] Print data set WEIGHT2 using the PRINT procedure.
[5] Execute the PRINT procedure.
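To see why MISSOVER matters, the following sketch reruns the same data lines without the INFILE statement. The data set name WEIGHT2_FLOWOVER is an assumption made for illustration. Under the default behavior (FLOWOVER), when the INPUT statement reaches the end of the short record for 2431, SAS moves to the next data line to look for values for Week1 and Week16. The values are therefore expected to be misaligned, and the log should contain a note that SAS went to a new line.

```sas
/* Same data lines as Example 3, but without INFILE DATALINES MISSOVER. */
/* WEIGHT2_FLOWOVER is a hypothetical name used only for this sketch.   */
data weight2_flowover;
   input IDnumber $ Week1 Week16;
   WeightLoss2=Week1-Week16;
   datalines;
2477 195 163
2431
2456 173 155
2412 135 116
;

/* Compare this listing with the output of Example 3. */
proc print data=weight2_flowover;
run;
```

Comparing the two listings shows the design choice: MISSOVER keeps one observation per data line, at the cost of missing values for short records.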
Example 4: Using Multiple Input Files in Instream Data
This example shows how to read multiple input files whose locations are
supplied as instream data to your program. The first INPUT statement reads a
file location from the data lines, and the FILEVAR= option on the INFILE
statement opens that file. The program reads the records in each file and
creates the ALL_ERRORS SAS data set. The program then sorts the observations
by Station and creates a sorted data set called SORTED_ERRORS. The PRINT
procedure prints the results.
options pageno=1 nodate linesize=60 pagesize=80;

data all_errors;
   length filelocation $ 60;
   input filelocation; /* reads instream data */
   infile daily filevar=filelocation
                filename=daily end=done;
   do while (not done);
      input Station $ Shift $ Employee $ NumberOfFlaws;
      output;
   end;
   put 'Finished reading ' daily=;
   datalines;
. . .myfile_A. . .
. . .myfile_B. . .
. . .myfile_C. . .
;

proc sort data=all_errors out=sorted_errors;
   by Station;
run;

proc print data = sorted_errors;
   title 'Flaws Report sorted by Station';
run;
Multiple Input Files in Instream Data

               Flaws Report sorted by Station              1

                                             Number
Obs    Station     Shift    Employee        OfFlaws
  1    Amherst       2      Lynne              0
  2    Goshen        2      Seth               4
  3    Hadley        2      Jon                3
  4    Holyoke       1      Walter             0
  5    Holyoke       1      Barb               3
  6    Orange        2      Carol              5
  7    Otis          1      Kay                0
  8    Pelham        2      Mike               4
  9    Stanford      1      Sam                1
 10    Suffield      2      Lisa               1
Reading Data from SAS Data Sets
This
example reads data from one SAS data set, generates a value
for a new variable, and creates a new data set.
data average_loss; [1]
set weight; [2]
Percent=round((WeightLoss * 100) / Week1); [3]
run; [4]
[1] Begin the DATA step and create a SAS data set called AVERAGE_LOSS.
[2] Read an observation from SAS data set WEIGHT.
[3] Calculate a value for variable Percent.
[4] Execute the DATA step.
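The same pattern can be tried without an external file by reading the WEIGHT2 data set that Example 2 builds from instream data. This is a sketch under stated assumptions: the output data set name PERCENT_LOSS2 and the variable name Percent2 are hypothetical, and WEIGHT2 is rebuilt here only so that the program is self-contained.

```sas
/* Rebuild WEIGHT2 exactly as in Example 2 so the sketch runs on its own. */
data weight2;
   input IDnumber $ Week1 Week16;
   WeightLoss2=Week1-Week16;
   datalines;
2477 195 163
2431 220 198
2456 173 155
2412 135 116
;

/* PERCENT_LOSS2 and Percent2 are hypothetical names.               */
/* SET reads one observation from WEIGHT2 per DATA step iteration.  */
data percent_loss2;
   set weight2;
   Percent2=round((WeightLoss2*100)/Week1);
run;

proc print data=percent_loss2;
run;
```

Because the SET statement supplies every variable in WEIGHT2, the new data set contains the original variables plus the calculated one.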
Generating Data from Programming Statements
You can create data for a SAS data set by generating observations
with programming statements rather than by reading data. A DATA step that
reads no input goes through only one iteration.
data investment; [1]
   begin='01JAN1990'd;
   end='31DEC2009'd;
   do year=year(begin) to year(end); [2]
      Capital+2000 + .07*(Capital+2000);
      output; [3]
   end;
   put 'The number of DATA step iterations is '_n_; [4]
run; [5]
proc print data=investment; [6]
   format Capital dollar12.2; [7]
run; [8]
[1] Begin the DATA step and create a SAS data set called INVESTMENT.
[2] Calculate a value based on a $2,000 capital investment and 7% interest
    each year from 1990 to 2009. Calculate variable values for one
    observation per iteration of the DO loop.
[3] Write each observation to data set INVESTMENT.
[4] Write a note to the SAS log proving that the DATA step iterates only
    once.
[5] Execute the DATA step.
[6] To see your output, print the INVESTMENT data set with the PRINT
    procedure.
[7] Use the FORMAT statement to write numeric values with dollar signs,
    commas, and decimal points.
[8] Execute the PRINT procedure.
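The line that updates Capital is a sum statement: the variable to the left of the first plus sign is retained across iterations, starts at 0, and has the rest of the expression added to it. The following sketch spells that out with an explicit RETAIN statement and the SUM function. The data set name INVESTMENT_CHECK and the shortened three-year range are assumptions made for illustration.

```sas
/* INVESTMENT_CHECK is a hypothetical name; three years keep the listing short. */
data investment_check;
   retain Capital 0;                 /* what the sum statement does implicitly */
   do year=1990 to 1992;
      /* Add this year's $2,000 deposit plus 7% interest on the new balance. */
      Capital=sum(Capital, 2000 + .07*(Capital+2000));
      output;
   end;
run;

proc print data=investment_check;
   format Capital dollar12.2;
run;
```

For 1990 the balance starts at 0, so the calculation is 2000 + .07*2000, which gives $2,140.00 for the first observation.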
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.