Chapter Contents |
Previous |
Next |
Concatenating |
In the simplest case, all input data sets contain the same variables.
If the input data sets contain different variables, observations from one
data set have missing values for variables defined only in other data sets.
In either case, the variables in the new data set are the same as the variables
in the old data sets.
Use this form of the SET statement to concatenate data sets:
SET data-set(s); |
where
For a complete description of the SET statement, see
SAS Language Reference: Dictionary.
ANIMAL PLANT OBS Common Animal Number OBS Common Plant Number 1 a Ant 5 1 g Grape 69 2 b Bird 2 h Hazelnut 55 3 c Cat 17 3 i Indigo 4 d Dog 9 4 j Jicama 14 5 e Eagle 5 k Kale 5 6 f Frog 76 6 l Lentil 77
The following program uses a SET statement to concatenate the data sets and then prints the results:
libname example 'SAS-data-library'; data example.concatenation; set example.animal example.plant; run; proc print data=example.concatenation; var Common Animal Plant Number; title 'Data Set CONCATENATION'; run;
Concatenated Data Sets (DATA Step)
Data Set CONCATENATION 1 Obs Common Animal Plant Number 1 a Ant 5 2 b Bird . 3 c Cat 17 4 d Dog 9 5 e Eagle . 6 f Frog 76 7 g Grape 69 8 h Hazelnut 55 9 i Indigo . 10 j Jicama 14 11 k Kale 5 12 l Lentil 77 |
The resulting data set CONCATENATION has 12 observations, which is the
sum of the observations from the combined data sets. The program data vector
contains all variables from all data sets. The values of variables found
in one data set but not in another are set to missing.
You can also use the SQL language to concatenate tables. In this example, SQL reads each row in both tables and creates a new table named COMBINED. The following shows the YEAR1 and YEAR2 input tables:
YEAR1 YEAR2 Date1 Date2 1996 1997 1997 1998 1998 1999 1999 2000 2001
The following SQL code creates and prints the table COMBINED.
proc sql; title 'SQL Table COMBINED'; create table combined as select * from year1 outer union corr select * from year2; select * from combined; quit;
Concatenated Tables (SQL)
SQL Table COMBINED 1 Year -------- 1996 1997 1998 1999 1997 1998 1999 2000 2001 |
Instead of concatenating data sets or tables, you can append them and produce the same results as concatenation. SAS concatenates data sets (DATA step) and tables (SQL) by reading each row of data to create a new file. To avoid reading all the records, you can append the second file to the first file by using the APPEND procedure:
proc append base=year1 data=year2; run;
The YEAR1 file will contain all rows from both tables.
Note: You cannot use PROC APPEND to add observations to a SAS data set
in a sequential library.
If no additional processing is necessary, using PROC APPEND or the APPEND statement in PROC DATASETS is more efficient than using a DATA step to concatenate data sets.
Interleaving |
Use this form of the SET statement to interleave data sets when you use a BY variable:
SET data-set(s); |
BY variable(s); |
where
Use this form of the SET statement to interleave data sets when you use an index:
SET data-set-1 . . . data-set-n KEY= index; |
where
For a complete description of the SET statement, including SET with
the KEY= option, see
SAS Language Reference: Dictionary.
Before you can interleave data sets, the observations must be sorted
or grouped by the same variable or variables that you use in the BY statement,
or you must have an appropriate index for the data sets.
ANIMAL PLANT OBS Common Animal OBS Common Plant 1 a Ant 1 a Apple 2 b Bird 2 b Banana 3 c Cat 3 c Coconut 4 d Dog 4 d Dewberry 5 e Eagle 5 e Eggplant 6 f Frog 6 f Fig
The following program uses SET and BY statements to interleave the data sets, and prints the results:
data example.interleaving; set example.animal example.plant; by Common; run; proc print data=example.interleaving; title 'Data Set INTERLEAVING'; run;
Interleaved Data Sets
Data Set INTERLEAVING 1 Obs common animal plant 1 a Ant 2 a Apple 3 b Bird 4 b Banana 5 c Cat 6 c Coconut 7 d Dog 8 d Dewberry 9 e Eagle 10 e Eggplant 11 f Frog 12 f Fig |
The resulting data set INTERLEAVING has 12 observations, which is the sum of the observations from the combined data sets. The new data set contains all variables from both data sets. The value of variables found in one data set but not in the other are set to missing, and the observations are arranged by the values of the BY variable.
If the data sets contain duplicate values of the BY variables, the observations are written to the new data set in the order in which they occur in the original data sets. This example contains duplicate values of the BY variable Common. The following shows the ANIMAL1 and PLANT1 input data sets:
ANIMAL1 PLANT1 OBS Common Animal1 OBS Common Plant1 1 a Ant 1 a Apple 2 a Ape 2 b Banana 3 b Bird 3 c Coconut 4 c Cat 4 c Celery 5 d Dog 5 d Dewberry 6 e Eagle 6 e Eggplant
The following program uses SET and BY statements to interleave the data sets, and prints the results:
data example.interleaving2; set example.animal1 example.plant1; by Common; run; proc print data=example.interleaving2; title 'Data Set INTERLEAVING2: Duplicate BY Values'; run;
Interleaved Data Sets with Duplicate Values of the BY Variable
Data Set INTERLEAVING2: Duplicate BY Values 1 Obs Common Animal1 Plant1 1 a Ant 2 a Ape 3 a Apple 4 b Bird 5 b Banana 6 c Cat 7 c Coconut 8 c Celery 9 d Dog 10 d Dewberry 11 e Eagle 12 e Eggplant |
The number of observations in the new data set is the sum of the observations
in all the data sets. The observations are written to the new data set in
the order in which they occur in the original data sets.
The data sets ANIMAL2 and PLANT2 both contain BY values that are present in one data set but not in the other. The following shows the ANIMAL2 and the PLANT2 input data sets:
ANIMAL2 PLANT2 OBS Common Animal2 OBS Common Plant2 1 a Ant 1 a Apple 2 c Cat 2 b Banana 3 d Dog 3 c Coconut 4 e Eagle 4 e Eggplant 5 f Fig
This program uses SET and BY statements to interleave these data sets, and prints the results:
data example.interleaving3; set example.animal2 example.plant2; by Common; run; proc print data=example.interleaving3; title 'Data Set INTERLEAVING3: Different BY Values'; run;
Interleaving Data Sets with Different BY Values
Data Set INTERLEAVING3: Different BY Values 1 Obs Common Animal2 Plant2 1 a Ant 2 a Apple 3 b Banana 4 c Cat 5 c Coconut 6 d Dog 7 e Eagle 8 e Eggplant 9 f Fig |
The resulting data set has nine observations arranged by the values
of the BY variable.
One-to-One Reading |
Use this form of the SET statement for one-to-one reading:
SET data-set-1; |
SET data-set-2; |
where
For a complete description of the SET statement, see
SAS Language Reference: Dictionary.
ANIMAL PLANT OBS Common Animal OBS Common Plant 1 a Ant 1 a Apple 2 b Bird 2 b Banana 3 c Cat 3 c Coconut 4 d Dog 4 d Dewberry 5 e Eagle 5 e Eggplant 6 f Frog 6 g Fig
The following program uses two SET statements to combine observations from ANIMAL and PLANT, and prints the results:
data twosets; set animal; set plant; run; proc print data=twosets; title 'Data Set TWOSETS - Equal Number of Observations'; run;
Data Set Created from Two Data Sets That Have Equal Observations
Data Set TWOSETS - Equal Number of Observations 1 Obs Common Animal Plant 1 a Ant Apple 2 b Bird Banana 3 c Cat Coconut 4 d Dog Dewberry 5 e Eagle Eggplant 6 g Frog Fig |
Each observation in the new data set contains all the variables from
all the data sets. Note, however, that the Common variable value in observation
6 contains a "g." The value of Common in observation 6 of the
ANIMAL data set was overwritten by the value in PLANT, which was the data
set that SAS read last.
One-to-One Merging |
If you use the MERGENOBY= SAS system option, you can control whether
SAS issues a message when MERGE processing occurs without an associated BY
statement.
Use this form of the MERGE statement to merge SAS data sets:
MERGE data-set(s); |
where
For a complete description of the MERGE statement, see
SAS Language Reference: Dictionary.
ANIMAL PLANT OBS Common Animal OBS Common Plant 1 a Ant 1 a Apple 2 b Bird 2 b Banana 3 c Cat 3 c Coconut 4 d Dog 4 d Dewberry 5 e Eagle 5 e Eggplant 6 f Frog 6 g Fig
The following program merges these data sets and prints the results:
data combined; merge animal plant; run; proc print data=combined; title 'Data Set COMBINED'; run;
Merged Data Sets That Have an Equal Number of Observations
Data Set COMBINED 1 Obs Common Animal Plant 1 a Ant Apple 2 b Bird Banana 3 c Cat Coconut 4 d Dog Dewberry 5 e Eagle Eggplant 6 g Frog Fig |
Each observation in the new data set contains all variables from all
data sets. If two data sets contain the same variables, the values from the
second data set replace the values from the first data set, as shown in observation
6.
The SAS data sets ANIMAL1 and PLANT1 both contain the variable Common, and the observations are arranged by the values of Common. The PLANT1 data set has fewer observations than the ANIMAL1 data set. The following shows the ANIMAL1 and the PLANT1 input data sets:
ANIMAL1 PLANT1 OBS Common Animal OBS Common Plant 1 a Ant 1 a Apple 2 b Bird 2 b Banana 3 c Cat 3 c Coconut 4 d Dog 5 e Eagle 6 f Frog
The following program merges these unequal data sets and prints the results:
data combined1; merge animal1 plant1; run; proc print data=combined1; title 'Data Set COMBINED1'; run;
Merged Data Sets That Have an Unequal Number of Observations
Data Set COMBINED1 1 Obs Common Animal Plant 1 a Ant Apple 2 b Bird Banana 3 c Cat Coconut 4 d Dog 5 e Eagle 6 f Frog |
Note that observations 4 through 6 contain missing values for the variable
Plant.
The following example shows the undesirable results that you can obtain by using one-to-one merging with data sets that contain duplicate values of common variables. The value from the last data set that is read is the one that is written to the new data set. The variables are combined exactly as they are read from each data set. In the following example, the data sets ANIMAL1 and PLANT1 contain the variable Common, and each data set contains observations with duplicate values of Common. The following shows the ANIMAL1 and the PLANT1 input data sets:
ANIMAL1 PLANT1 OBS Common Animal OBS Common Plant 1 a Ant 1 a Apple 2 a Ape 2 b Banana 3 b Bird 3 c Coconut 4 c Cat 4 c Celery 5 d Dog 5 d Dewberry 6 e Eagle 6 e Eggplant
The following program produces the data set MERGE1 data set and prints the results:
/* This program illustrates undesirable results. */ data merge1; merge animal1 plant1; run; proc print data=merge1; title 'Data Set MERGE1'; run;
Undesirable Results with Duplicate Values of Common Variables
Data Set MERGE1 1 Obs Common Animal1 Plant1 1 a Ant Apple 2 b Ape Banana 3 c Bird Coconut 4 c Cat Celery 5 d Dog Dewberry 6 e Eagle Eggplant |
The number of observations in the new data set is six. Note that observations
2 and 3 contain undesirable values. SAS reads the second observation from
data set ANIMAL1. It then reads the second observation from data set PLANT1
and replaces the values for the variables Common and Plant1. The third observation
is created in the same way.
The following example shows the undesirable results obtained from using the one-to-one merge to combine data sets with different values of common variables. If a variable exists in more than one data set, the value from the last data set that is read is the one that is written to the new data set even if the value is missing. Once SAS processes all observations in a data set, all subsequent observations in the new data set have missing values for the variables that are unique to that data set. In this example, the data sets ANIMAL2 and PLANT2 have different values of the Common variable. The following shows the ANIMAL2 and the PLANT2 input data sets:
ANIMAL2 PLANT2 OBS Common Animal OBS Common Plant 1 a Ant 1 a Apple 2 c Cat 2 b Banana 3 d Dog 3 c Coconut 4 e Eagle 4 e Eggplant 5 f Fig
The following program produces the data set MERGE2 and prints the results:
/* This program illustrates undesirable results. */ data merge2; merge animal2 plant2; run; proc print data=merge2; title 'Data Set MERGE2'; run;
Undesirable Results with Different Values of Common Variables
Data Set MERGE2 1 Obs Common Animal2 Plant2 1 a Ant Apple 2 b Cat Banana 3 c Dog Coconut 4 e Eagle Eggplant 5 f Fig |
The results from a one-to-one merge are similar to the results obtained from using two or more SET statements to combine observations. However, with the one-to-one merge, SAS continues processing all observations in all data sets that were named in the MERGE statement.
Match-Merging |
Use this form of the MERGE statement to match-merge data sets:
MERGE data-set(s); |
BY variable(s); |
where
For a complete description of the MERGE and the BY statements, see
SAS Language Reference: Dictionary.
ANIMAL PLANT OBS Common Animal OBS Common Plant 1 a Ant 1 a Apple 2 b Bird 2 b Banana 3 c Cat 3 c Coconut 4 d Dog 4 d Dewberry 5 e Eagle 5 e Eggplant 6 f Frog 6 f Fig
The following program merges the data sets according to the values of the BY variable Common, and prints the results:
data combined; merge animal plant; by Common; run; proc print data=combined; title 'Data Set COMBINED'; run;
Data Sets Combined by Match-Merging
Data Set COMBINED 1 Obs Common Animal Plant 1 a Ant Apple 2 b Bird Banana 3 c Cat Coconut 4 d Dog Dewberry 5 e Eagle Eggplant 6 f Frog Fig |
Each observation in the new data set contains all the variables from
all the data sets.
When SAS reads the last observation from a BY group in one data set, SAS retains its values in the program data vector for all variables that are unique to that data set until all observations for that BY group have been read from all data sets. In the following example, the data sets ANIMAL1 and PLANT1 contain duplicate values of the BY variable Common. The following shows the ANIMAL1 and the PLANT1 input data sets:
ANIMAL1 PLANT1 OBS Common Animal1 OBS Common Plant1 1 a Ant 1 a Apple 2 a Ape 2 b Banana 3 b Bird 3 c Coconut 4 c Cat 4 c Celery 5 d Dog 5 d Dewberry 6 e Eagle 6 e Eggplant
The following program produces the merged data set MATCH1, and prints the results:
data match1; merge animal1 plant1; by Common; run; proc print data=match1; title 'Data Set MATCH1'; run;
Match-Merged Data Set with Duplicate BY Values
Data Set MATCH1 1 Obs Common Animal1 Plant1 1 a Ant Apple 2 a Ape Apple 3 b Bird Banana 4 c Cat Coconut 5 c Cat Celery 6 d Dog Dewberry 7 e Eagle Eggplant |
In observation 2 of the output, the value of the variable Plant1 is
retained until all observations in the BY group are written to the new data
set. Match-merging also produced duplicate values in ANIMAL1 for observations
4 and 5.
When SAS performs a match-merge with nonmatched observations in the input data sets, SAS retains the values of all variables in the program data vector even if the value is missing. The data sets ANIMAL2 and PLANT2 do not contain all values of the BY variable Common. The following shows the ANIMAL2 and the PLANT2 input data sets:
ANIMAL2 PLANT2 OBS Common Animal2 OBS Common Plant2 1 a Ant 1 a Apple 2 c Cat 2 b Banana 3 d Dog 3 c Coconut 4 e Eagle 4 e Eggplant 5 f Fig
The following program produces the merged data set MATCH2, and prints the results:
data match2; merge animal2 plant2; by Common; run; proc print data=match2; title 'Data Set MATCH2'; run;
Match-Merged Data Set with Nonmatched Observations
Data Set MATCH2 1 Obs Common Animal2 Plant2 1 a Ant Apple 2 b Banana 3 c Cat Coconut 4 d Dog 5 e Eagle Eggplant 6 f Fig |
As the output shows, all values of the variable Common are represented in the new data set, including missing values for the variables that are in one data set but not in the other.
Updating with the UPDATE and the MODIFY Statements |
You can update data sets by using the UPDATE statement or the MODIFY statement:
UPDATE | uses observations from the transaction data set to change the values of corresponding observations from the master data set. You must use a BY statement with the UPDATE statement because all observations in the transaction data set are keyed to observations in the master data set according to the values of the BY variable. |
MODIFY | can replace, delete, and append observations in an existing data set. Using the MODIFY statement can save disk space because it modifies data in place, without creating a copy of the data set. |
The number of observations in the new data set is the sum of the number of observations in the master data set and the number of unmatched observations in the transaction data set.
For complete information about the UPDATE and the MODIFY statements,
see "Statements" in SAS Language Reference: Dictionary.
Use this form of the UPDATE statement to update a master data set:
UPDATE master-data-set transaction-data-set; |
BY variable-list; |
where
If the transaction data set contains duplicate values of the BY variable, SAS applies both transactions to the observation. The last values that are copied into the program data vector are written to the new data set. If your data is in this form, use the MODIFY statement instead of the UPDATE statement to process your data.
For complete information about the UPDATE statement, see
SAS Language Reference: Dictionary.
This form of the MODIFY statement is used in the examples that follow:
MODIFY master-data-set; |
BY variable-list; |
where
Note: The MODIFY statement does not support
changing the descriptor
portion of a SAS data set, such as adding a variable.
For complete information about the MODIFY statement, see
SAS Language Reference: Dictionary.
In the UPDATE statement, if an observation in the master data set does not have a corresponding observation in the transaction data set, SAS writes the observation to the new data set without modifying it. Any observation from the transaction data set that does not correspond to an observation in the master data set is written to the program data vector and becomes the basis for an observation in the new data set. The data in the program data vector can be modified by other transactions before it is written to the new data set. If a master data set observation does not need updating, the corresponding observation can be omitted from the transaction data set.
SAS does not replace existing values in the master data set with missing values if those values are coded as periods (for numeric variables) or blanks (for character variables) in the transaction data set. To replace existing values with missing values, you must either create a transaction data set in which missing values are coded with the special missing value characters, or use the UPDATEMODE=NOMISSINGCHECK statement option.
With UPDATE, the transaction data set can contain new variables to be added to all observations in the master data set.
To view a sample program, see
Example 3: Using UPDATE for Processing Nonmatched Observations, Missing Values, and New Variables.
If you do not use an index, both the master data set and the transaction data set must be sorted by the same variable or variables that you specify in the BY statement that accompanies the UPDATE statement. The values of the BY variable should be unique for each observation in the master data set. If you use more than one BY variable, the combination of values of all BY variables should be unique for each observation in the master data set. The BY variable or variables should be ones that you never need to update.
Note: The MODIFY statement does not require sorted files. However, sorting
the data improves efficiency.
The MODIFY statement maintains the index. You do not have to rebuild
the index like you do for the UPDATE statement.
Issue | MODIFY with BY | UPDATE |
---|---|---|
Disk space | saves disk space because it updates data in place | requires more disk space because it produces an updated copy of the data set |
Sort and index | sorted input data sets are not required, although for good performance, it is strongly recommended that both data sets be sorted and that the master data set be indexed | requires only that both data sets be sorted |
When to use | use only when you expect to process a SMALL portion of the data set | use if you expect to need to process most of the data set |
Where to specify the modified data set | specify the updated data set in both the DATA and the MODIFY statements | specify the updated data set in the DATA and the UPDATE statements |
Duplicate BY-values | allows duplicate BY-values in both the master and the transaction data sets | allows duplicate BY-values in the transaction data set only (If duplicates exist in the master data set, SAS issues a warning.) |
Scope of changes | cannot change the data set descriptor information, so changes such as adding or deleting variables, variable labels, and so on, are not valid | can make changes that require a change in the descriptor portion of a data set, such as adding new variables, and so on |
Error checking | has error-checking capabilities using the _IORC_ automatic variable and the SYSRC autocall macro | needs no error checking because transactions without a corresponding master record are not applied but are added to the data set |
Data set integrity | data may only be partially updated due to an abnormal task termination | no data loss occurs because UPDATE works on a copy of the data |
For more information about tools for combining SAS data sets, see
Statements or Procedures for Combining SAS Data Sets.
The MODIFY statement has three primary uses:
Several of the examples that follow demonstrate these
uses.
MASTER NEWPLANT OBS Common Animal Plant OBS Common Plant 1 a Ant Apple 1 a Apricot 2 b Bird Banana 2 b Barley 3 c Cat Coconut 3 c Cactus 4 d Dog Dewberry 4 d Date 5 e Eagle Eggplant 5 e Escarole 6 f Frog Fig 6 f Fennel
The following program updates MASTER with the transactions in the data set NEWPLANT, writes the results to UPDATE_FILE, and prints the results:
data update_file; update master newplant; by common; run; proc print data=update_file; title 'Data Set Update_File'; run;
Master Data Set Updated by Transaction Data Set
Data Set Update_File 1 Obs Common Animal Plant 1 a Ant Apricot 2 b Bird Barley 3 c Cat Cactus 4 d Dog Date 5 e Eagle Escarole 6 f Frog Fennel |
Each observation in the new data set contains a new value for the variable
Plant.
If the master data set contains two observations with the same value of the BY variable, the first observation is updated and the second observation is ignored. SAS writes a warning message to the log. If the transaction data set contains duplicate values of the BY variable, SAS applies both transactions to the observation. The last values copied into the program data vector are written to the new data set. The following shows the MASTER1 and the DUPPLANT input data sets.
MASTER1 DUPPLANT OBS Common Animal1 Plant1 OBS Common Plant1 1 a Ant Apple 1 a Apricot 2 b Bird Banana 2 b Barley 3 b Bird Banana 3 c Cactus 4 c Cat Coconut 4 d Date 5 d Dog Dewberry 5 d Dill 6 e Eagle Eggplant 6 e Escarole 7 f Frog Fig 7 f Fennel
The following program applies the transactions in DUPPLANT to MASTER1 and prints the results:
data update1; update master1 dupplant; by Common; run; proc print data=update1; title 'Data Set Update1'; run;
Updating Data Sets with Duplicate BY Values
Data Set Update1 1 Obs Common Animal1 Plant1 1 a Ant Apricot 2 b Bird Barley 3 b Bird Banana 4 c Cat Cactus 5 d Dog Dill 6 e Eagle Escarole 7 f Frog Fennel |
When this DATA step executes, SAS generates a warning message stating that there is more than one observation for a BY group. However, the DATA step continues to process, and the data set UPDATE1 is created.
The resulting data set has seven observations. Observations 2 and 3
have duplicate values of the BY variable Common. However, the value of the
variable PLANT1 was not updated in the second occurrence of the duplicate
BY value.
In this example, the data set MASTER2 is a master data set. It contains a missing value for the variable Plant2 in the first observation, and not all of the values of the BY variable Common are included. The transaction data set NONPLANT contains a new variable Mineral, a new value of the BY variable Common, and missing values for several observations. The following shows the MASTER2 and the NONPLANT input data sets:
MASTER2 NONPLANT OBS Common Animal2 Plant2 OBS Common Plant2 Mineral 1 a Ant 1 a Apricot Amethyst 2 c Cat Coconut 2 b Barley Beryl 3 d Dog Dewberry 3 c Cactus 4 e Eagle Eggplant 4 e 5 f Frog Fig 5 f Fennel 6 g Grape Garnet
The following program updates the data set MASTER2 and prints the results:
data update2_file; update master2 nonplant; by Common; run; proc print data=update2_file; title 'Data Set Update2_File'; run;
Results of Updating with New Variables, Nonmatched Observations, and Missing Values
Data Set Update2_File 1 Obs Common Animal2 Plant2 Mineral 1 a Ant Apricot Amethyst 2 b Barley Beryl 3 c Cat Cactus 4 d Dog Dewberry 5 e Eagle Eggplant 6 f Frog Fennel 7 g Grape Garnet |
As shown, all observations now include values for the variable Mineral. The value of Mineral is set to missing for some observations. Observations 2 and 6 in the transaction data set did not have corresponding observations in MASTER2, and they have become new observations. Observation 3 from the master data set was written to the new data set without change, and the value for Plant2 in observation 4 was not changed to missing. Three observations in the new data set have updated values for the variable Plant2.
The following program uses the UPDATEMODE statement option on the UPDATE statement, and prints the results:
data update2_file; update master2 nonplant updatemode=nomissingcheck; by Common; run; proc print data=update2_file; title 'Data Set Update2_File - UPDATEMODE Option'; run;
Results of Updating with the UPDATEMODE Option
Data Set Update2_File - UPDATEMODE Option 1 Obs Common Animal2 Plant2 Mineral 1 a Ant Apricot Amethyst 2 b Barley Beryl 3 c Cat Cactus 4 d Dog Dewberry 5 e Eagle 6 f Frog Fennel 7 g Grape Garnet |
The value of Plant2 in observation 5 is set to missing because the UPDATEMODE=NOMISSINGCHECK option is in effect.
For detailed examples for updating data sets, see Combining and Modifying SAS Data Sets: Examples.
If the transaction data set contains an observation that does not match an observation in the master data set, you must alter the program. The Year value in observation 5 of TRANSACTION has no match in MASTER. The following shows the MASTER and the TRANSACTION input data sets:
MASTER TRANSACTION OBS Year VarX VarY OBS Year VarX VarY 1 1985 x1 y1 1 1991 x2 2 1986 x1 y1 2 1992 x2 y2 3 1987 x1 y1 3 1993 x2 4 1988 x1 y1 4 1993 y2 5 1989 x1 y1 5 1995 x2 y2 6 1990 x1 y1 7 1991 x1 y1 8 1992 x1 y1 9 1993 x1 y1 10 1994 x1 y1
You must use an explicit OUTPUT statement to write a new observation to a master data set. (The default action for a DATA step using a MODIFY statement is REPLACE, not OUTPUT.) Once you specify an explicit OUTPUT statement, you must also specify a REPLACE statement. The following DATA step updates data set MASTER, based on values in TRANSACTION, and adds a new observation. This program also uses the _IORC_ automatic variable for error checking. (For more information about error checking, see Error Checking When Using Indexes to Randomly Access or Update Data.
data master; modify master transaction; by Year; if _iorc_=%sysrc(_sok) then replace; else if _iorc_=%sysrc(_dsenmr) then do; output; _error_=0; end; else do; put "Unexpected error at Observation: " _n_; _error_=0; stop; end; run; proc print data=master; title 'Updated Master Data Set -- MODIFY'; title2 'One Observation Added'; run;
Modified MASTER Data Set
Updated Master Data Set -- MODIFY 1 One Observation Added Obs Year VarX VarY 1 1985 x1 y1 2 1986 x1 y1 3 1987 x1 y1 4 1988 x1 y1 5 1989 x1 y1 6 1990 x1 y1 7 1991 x2 y1 8 1992 x2 y2 9 1993 x2 y2 10 1994 x1 y1 11 1995 x2 y2 |
SAS added a new observation, observation 11, to the MASTER data set and updated observations 7, 8, and 9.
Chapter Contents |
Previous |
Next |
Top of Page |
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.