Chapter Contents |
Previous |
Next |
MODIFY |
Valid: | in a DATA step |
Category: | File-handling |
Type: | Executable |
Restriction: | Cannot modify the descriptor portion of a SAS data set, such as adding a variable |
Syntax |
MODIFY master-data-set
<(data-set-option(s))> transaction-data-set <(data-set-option(s))>
<NOBS=variable> <END=variable> <UPDATEMODE=MISSINGCHECK| NOMISSINGCHECK>; BY by-variable; |
MODIFY master-data-set <(data-set-option(s))> KEY=index </ UNIQUE> <NOBS=variable> <END=variable> ; |
MODIFY master-data-set <(data-set-option(s))> <NOBS=variable> POINT=variable; |
MODIFY master-data-set <(data-set-option(s))> <NOBS=variable> <END=variable>; |
Observations in native SAS data files may have incorrect data values, or the data file may become unreadable. DBMS tables that are referenced by views are not affected.
Note:
If you modify a password-protected data set, specify
the password with the appropriate data set option (ALTER= or PW=) within the
MODIFY statement, and not in the DATA statement.
Restriction: | This data set must also appear in the DATA statement. |
Restriction: | For sequential and
matching access, the master data set can be a SAS data file, a SAS/ACCESS
view, an SQL view, or a DBMS engine for the LIBNAME statement. It cannot
be a DATA step view or a passthrough view.
For random access using POINT=, the master data set must be a SAS data file or an SQL view that references a SAS data file. For direct access using KEY=, the master data set can be a SAS data file or the DBMS engine for the LIBNAME statement. If it is a SAS file, it must be indexed and the index name must be specified on the KEY= option. For a DBMS, the KEY= is set to the keyword DBKEY and the column names to use as an index must be specified on the DBKEY= data set option. These column names are used in constructing a WHERE expression that is passed to the DBMS. |
Restriction: | Specify this data set only when the DATA step contains a BY statement. |
Explanation: | The variable, which
is initialized to zero, is set to 1 when the MODIFY statement reads the last
observation of the data set being modified (for sequential access )
or the last observation of the transaction data set (for matching access ).
It is also set to 1 when MODIFY cannot find a match for a KEY= value (random
access ).
This variable is not added to any data set. |
Restriction: | Do not use this argument in the same MODIFY statement with the POINT= argument. POINT= indicates that MODIFY uses random access. The value of the END= variable is never set to 1 for random access. |
Default: | If the KEY= value is not found, the automatic variable _ERROR_ is set to 1, and the automatic variable _IORC_ receives the value corresponding to the SYSRC autocall macro's mnemonic _DSENOM. See Automatic Variable _IORC_ and the SYSRC Autocall Macro . |
Restriction: | KEY= processing is different for SAS/ACCESS engines. See the SAS/ACCESS documentation for more information. |
Tip: | Examples of sources for index values include a separate SAS data set named in a SET statement and an external file that is read by an INPUT statement. |
Tip: | If duplicates exist in the
master file, only the first occurrence is updated unless you use a DO-LOOP
to execute a SET statement for the data set listed on the KEY=option for all
duplicates in the master data set.
If duplicates exist in the transaction data set, and they are consequtive, use the UNIQUE option to force the search for a match in the master data set to begin at the top of the index. Write an accumulation statement to add each duplicate transaction to the observation in master. Without the UNIQUE option, only the first duplicate transaction observation updates the master. If the duplicates in the transaction data set are not consecutive, the search begins at the beginning of the index each time, so that each duplicate is applied to the master. Write an accumulation statement to add each duplicate to the master. |
See Also: | UNIQUE |
Featured in: | Modifying Observations Located by an Index, Handling Duplicate Index Values, and Controlling I/O |
Explanation: | At compilation time, SAS reads the descriptor portion of the data set and assigns the value of the NOBS= variable automatically. Thus, you can refer to the NOBS= variable before the MODIFY statement. The variable is available in the DATA step but is not added to the new data set. |
Tip: | The NOBS= and POINT= options are independent of each other. |
Featured in: | Modifying Observations Located by Observation Number |
Requirement: | When using the POINT=
argument, include one or both of the following:
|
Restriction: | You cannot use the
POINT= option with any of the following:
|
Restriction: | You can use POINT= with compressed data sets only if the data set was created with the POINTOBS= data set option set to YES, the default value. |
Restriction: | You can use the random access method on compressed files only with SAS version 7 and beyond. |
Tip: | If the POINT= value does not match an observation number, SAS sets the automatic variable _ERROR_ to 1. |
Featured in: | Modifying Observations Located by Observation Number |
Restriction: | UNIQUE can appear only with the KEY= option. |
Tip: | Use UNIQUE when there are consecutive duplicate KEY= values in the transaction data set, so that the search for a match in the master data set begins at the top of the index file for each duplicate transaction. You must include an accumulation statement or the duplicate values overwrite each other causing only the last transaction value to be the result in the master observation. |
Featured in: | Handling Duplicate Index Values |
Default: | MISSINGCHECK |
Details |
The matching access method uses the BY statement to match observations from the transaction data set with observations in the master data set. The BY statement specifies a variable that is in the transaction data set and the master data set.
When the MODIFY statement reads an observation from the transaction data set, it uses dynamic WHERE processing to locate the matching observation in the master data set. The observation in the master data set can be either
Modifying Observations Using a Transaction Data Set
shows the matching access method.
Duplicates in the master and transaction data sets affect processing.
This method requires that you use the KEY= option in the MODIFY statement to name an indexed variable from the data set that is being modified. Use another data source (typically a SAS data set named in a SET statement or an external file read by an INPUT statement) to provide a like-named variable whose values are supplied to the index. MODIFY uses the index to locate observations in the data set that is being modified.
Modifying Observations Located by an Index
shows the direct-access-by-indexed-values method.
Handling Duplicate Index Values shows how to handle duplicate index values.
You can use the POINT= option in the MODIFY statement to name a variable from another data source (not the master data set), whose value is the number of an observation that you want to modify in the master data set. MODIFY uses the values of the POINT= variable to retrieve observations in the data set that you are modifying. (You can use POINT= on a compressed data set only if the data set was created with the POINTOBS= data set option.)
It is good programming practice to validate the value of the POINT= variable and to check the status of the automatic variable _ERROR_.
Modifying Observations Located by Observation Number shows the direct (random) access by observation number method.
The sequential access method is the simplest form of the MODIFY statement,
but it provides less control than the direct access methods. With the sequential
access method, you may use the NOBS= and END= options to modify a data set;
you do not use the POINT= or KEY= options.
The automatic variable _IORC_ contains the return code for each I/O operation that the MODIFY statement attempts to perform. The best way to test for values of _IORC_ is with the mnemonic codes that are provided by the SYSRC autocall macro. Each mnemonic code describes one condition. The mnemonics provide an easy method for testing problems in a DATA step program. These codes are useful:
Note: Beginning in Version 7, the IORCMSG function returns a formatted
error message associated with the current value of _IORC_.
Controlling I/O
shows how to use the automatic variable _IORC_ and the SYSRC autocall macro.
The way SAS writes observations to a SAS data set when the DATA step contains a MODIFY statement depends on whether certain other statements are present. The possibilities are
Keep in mind the following as you work with these statements:
Replacing and Removing Observations and Writing Observations to Different SAS Data Sets
shows how to use the OUTPUT, REMOVE, and REPLACE statements to write observations.
In a SAS/SHARE environment, the MODIFY statement accesses an observation in update mode. That is, the observation is locked from the time MODIFY reads it until a REPLACE or REMOVE statement executes. At that point the observation is unlocked. It cannot be accessed until it is re-read with the MODIFY statement. The MODIFY statement opens the data set in update mode, but the control level is based on the statement used. For example, KEY= and POINT= are member-level locking. Refer to SAS/SHARE User's Guide for more information.
Comparisons |
For information on DBMS replacement rules, see the SAS/ACCESS documentation.
Note: Dynamic
WHERE processing can be costly if the MODIFY statement modifies a SAS data
set that is not in sorted order or has not been indexed. Having the master
data set in sorted order or indexed and having the transaction data set in
sorted order reduces processing overhead, especially for large files.
Input Data Set for Examples |
The examples modify the INVTY.STOCK data set. INVTY.STOCK contains these variables:
libname invty 'SAS-data-library';
options yearcutoff= 1920; data invty.stock(index=(partno)); input PARTNO $ DESC $ INSTOCK @17 RECDATE date7. @25 PRICE; format recdate date7.; datalines; K89R seal 34 27jul95 245.00 M4J7 sander 98 20jun95 45.88 LK43 filter 121 19may96 10.99 MN21 brace 43 10aug96 27.87 BC85 clamp 80 16aug96 9.55 NCF3 valve 198 20mar96 24.50 KJ66 cutter 6 18jun96 19.77 UYN7 rod 211 09sep96 11.55 JD03 switch 383 09jan97 13.99 BV1E timer 26 03jan97 34.50 ;
Examples |
This example replaces the date on all records in the data set INVTY.STOCK with the current date. This code replaces the value of the variable RECDATE with the current date for all observations in INVTY.STOCK:
data invty.stock; modify invty.stock; recdate=today(); run;
A printing of INVTY.STOCK shows that RECDATE has been modified:
INVTY.STOCK 1 PARTNO DESC INSTOCK RECDATE PRICE K89R seal 34 14MAR97 245.00 M4J7 sander 98 14MAR97 45.88 LK43 filter 121 14MAR97 10.99 MN21 brace 43 14MAR97 27.87 BC85 clamp 80 14MAR97 9.55 NCF3 valve 198 14MAR97 24.50 KJ66 cutter 6 14MAR97 19.77 UYN7 rod 211 14MAR97 11.55 JD03 switch 383 14MAR97 13.99 BV1E timer 26 14MAR97 34.50 |
The MODIFY statement opens INVTY.STOCK for update processing.
SAS reads one observation of INVTY.STOCK for each iteration of the DATA step
and performs any operations that the code specifies. In this case, the code
replaces the value of RECDATE with the result of the TODAY function for every
iteration of the DATA step. An implicit REPLACE statement at the end of the
step writes each observation to its previous location in INVTY.STOCK.
This example adds the quantity of newly received stock to its data set INVTY.STOCK as well as updating the date on which stock was received. The transaction data set ADDINV in the WORK library contains the new data.
The ADDINV data set is the data set that contains the updated information. ADDINV contains these variables:
ADDINV is the second data set in the MODIFY statement. SAS uses it as the transaction data set and reads each observation from ADDINV sequentially. Because the BY statement specifies the common variable PARTNO, MODIFY finds the first occurrence of the value of PARTNO in INVTY.STOCK that matches the value of PARTNO in ADDINV. For each observation with a matching value, the DATA step changes the value of RECDATE to today's date and replaces the value of INSTOCK with the sum of INSTOCK and NWSTOCK (from ADDINV). MODIFY does not add NWSTOCK to the INVTY.STOCK data set because that would modify the data set descriptor. Thus, it is not necessary to put NWSTOCK in a DROP statement.
This example specifies ADDINV as the transaction data set that contains information to modify INVTY.STOCK. A BY statement specifies the shared variable whose values locate the observations in INVTY.STOCK.
This DATA step creates ADDINV:
data addinv; input PARTNO $ NWSTOCK; datalines; K89R 55 M4J7 21 LK43 43 MN21 73 BC85 57 NCF3 90 KJ66 2 UYN7 108 JD03 55 BV1E 27 ;
This DATA step uses values from ADDINV to update INVTY.STOCK.
libname invty 'SAS-data-library';
data invty.stock; modify invty.stock addinv; by partno; RECDATE=today(); INSTOCK=instock+nwstock; if _iorc_=0 then replace; run;
A printing of INVTY.STOCK shows that INSTOCK and RECDATE have been modified:
INVTY.STOCK 1 PARTNO DESC INSTOCK RECDATE PRICE K89R seal 89 14MAR97 245.00 M4J7 sander 119 14MAR97 45.88 LK43 filter 164 14MAR97 10.99 MN21 brace 116 14MAR97 27.87 BC85 clamp 137 14MAR97 9.55 NCF3 valve 288 14MAR97 24.50 KJ66 cutter 8 14MAR97 19.77 UYN7 rod 319 14MAR97 11.55 JD03 switch 438 14MAR97 13.99 BV1E timer 53 14MAR97 34.50 |
This example reads the data set NEWP, determines which observation number in INVTY.STOCK to update based on the value of TOOL_OBS, and performs the update. This example explicitly specifies the update activity by using an assignment statement to replace the value of PRICE with the value of NEWP.
The data set NEWP contains two variables:
This DATA step creates NEWP:
data newp; input TOOL_OBS NEWP; datalines; 251.00 2 49.33 3 12.32 4 30.00 5 15.00 6 25.75 7 22.00 8 14.00 9 14.32 0 35.00 ;
This DATA step updates INVTY.STOCK:
libname invty 'SAS-data-library';
data invty.stock; set newp; modify invty.stock point=tool_obs nobs=max_obs; if _error_=1 then do; put 'ERROR occurred for TOOL_OBS=' tool_obs / 'during DATA step iteration' _n_ / 'TOOL_OBS value may be out of range.'; _error_=0; stop; end; PRICE=newp; RECDATE=today(); run;
A printing of INVTY.STOCK shows that RECDATE and PRICE have been modified:
INVTY.STOCK 1 PARTNO DESC INSTOCK RECDATE PRICE K89R seal 34 14MAR97 251.00 M4J7 sander 98 14MAR97 49.33 LK43 filter 121 14MAR97 12.32 MN21 brace 43 14MAR97 30.00 BC85 clamp 80 14MAR97 15.00 NCF3 valve 198 14MAR97 25.75 KJ66 cutter 6 14MAR97 22.00 UYN7 rod 211 14MAR97 14.00 JD03 switch 383 14MAR97 14.32 BV1E timer 26 14MAR97 35.00 |
This example uses the KEY= option to identify observations to retrieve by matching the values of PARTNO from ADDINV with the indexed values of PARTNO in INVTY.STOCK. ADDINV is created in Modifying Observations Using a Transaction Data Set.
KEY= supplies index values that allow MODIFY to access directly the observations to update. No dynamic WHERE processing occurs. In this example, you specify that the value of INSTOCK in the master data set INVTY.STOCK increases by the value of the variable NWSTOCK from the transaction data set ADDINV.
libname invty 'SAS-data-library';
data invty.stock; set addinv; modify invty.stock key=partno; INSTOCK=instock+nwstock; RECDATE=today(); if _iorc_=0 then replace; run;
A printing of INVTY.STOCK shows that INSTOCK and RECDATE have been modified.
INVTY.STOCK 1 PARTNO DESC INSTOCK RECDATE PRICE K89R seal 89 14MAR97 245.00 M4J7 sander 119 14MAR97 45.88 LK43 filter 164 14MAR97 10.99 MN21 brace 116 14MAR97 27.87 BC85 clamp 137 14MAR97 9.55 NCF3 valve 288 14MAR97 24.50 KJ66 cutter 8 14MAR97 19.77 UYN7 rod 319 14MAR97 11.55 JD03 switch 438 14MAR97 13.99 BV1E timer 53 14MAR97 34.50 |
This example shows how MODIFY handles duplicate values of the variable in the SET data set that is supplying values to the index on the master data set.
The NEWINV data set is the data set that contains the updated information. NEWINV contains these variables:
M4J7
appears twice.This DATA step creates NEWINV:
data newinv; input PARTNO $ NWSTOCK; datalines; K89R 55 M4J7 21 M4J7 26 LK43 43 MN21 73 BC85 57 NCF3 90 KJ66 2 UYN7 108 JD03 55 BV1E 27 ;
This DATA step terminates with an error when it tries
to locate an observation in INVTY.STOCK to match with the second occurrence
of M4J7
in NEWINV:
libname invty 'SAS-data-library';
/* This DATA step terminates with an error! */ data invty.stock; set newinv; modify invty.stock key=partno; INSTOCK=instock+nwstock; RECDATE=today(); run;
This message appears in the SAS log:
ERROR: No matching observation was found in MASTER data set. PARTNO=K89R NWSTOCK=55 DESC= INSTOCK=. RECDATE=14MAR97 PRICE=. _ERROR_=1 _IORC_=1230015 _N_=1 NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 689:19 NOTE: The SAS System stopped processing this step because of errors. NOTE: The data set INVTY.STOCK has been updated. There were 0 observations rewritten, 0 observations added and 0 observations deleted. |
Adding the UNIQUE option to the MODIFY statement avoids
the error in the previous DATA step. The UNIQUE option causes the DATA step
to return to the top of the index each time it looks for a match for the value
from the SET data set. Thus, it finds the M4J7
in the MASTER data set for each occurrence of M4J7
in the SET data set. The updated result for M4J7
in the output shows that both values of NWSTOCK from NEWINV for M4J7
are added to the value of INSTOCK for M4J7
in INVTY.STOCK. An
accumulation statement sums the values;
without it, only the value of the last instance of M4J7
would be the result in INVTY.STOCK.
data invty.stock; set newinv; modify invty.stock key=partno / unique; INSTOCK=instock+nwstock; RECDATE=today(); if _iorc_=0 then replace; run;
A printing of INVTY.STOCK shows that INSTOCK and RECDATE have been modified:
Results of Using the UNIQUE Option 1 PARTNO DESC INSTOCK RECDATE PRICE K89R seal 89 14MAR97 245.00 M4J7 sander 145 14MAR97 45.88 LK43 filter 164 14MAR97 10.99 MN21 brace 116 14MAR97 27.87 BC85 clamp 137 14MAR97 9.55 NCF3 valve 288 14MAR97 24.50 KJ66 cutter 8 14MAR97 19.77 UYN7 rod 319 14MAR97 11.55 JD03 switch 438 14MAR97 13.99 BV1E timer 53 14MAR97 34.50 |
This example uses the SYSRC autocall macro and the _IORC_ automatic variable to control I/O condition. This technique helps to prevent unexpected results that could go undetected. This example uses the direct access method with an index to update INVTY.STOCK. The data in the NEWSHIP data set updates INVTY.STOCK.
This DATA step creates NEWSHIP:
options yearcutoff= 1920; data newship; input PARTNO $ DESC $ NWSTOCK @17 SHPDATE date7. @25 NWPRICE; datalines; K89R seal 14 14nov96 245.00 M4J7 sander 24 23aug96 47.98 LK43 filter 11 29jan97 14.99 MN21 brace 9 09jan97 27.87 BC85 clamp 12 09dec96 10.00 ME34 cutter 8 14nov96 14.50 ;
Each WHEN clause in the SELECT statement specifies actions for each input/output return code that is returned by the SYSRC autocall macro:
libname invty 'SAS-data-library';
data invty.stock; set newship; modify invty.stock key=partno; select (_iorc_); when (%sysrc(_sok)) do; INSTOCK=instock+nwstock; RECDATE=shpdate; PRICE=nwprice; replace; end; when (%sysrc(_dsenom)) do; INSTOCK=nwstock; RECDATE=shpdate; PRICE=nwprice; output; _error_=0; end; otherwise do; put 'An unexpected I/O error has occurred.'/ 'Check your data and your program'; _error_=0; stop; end; end; run;
INVTY.STOCK Data Set 1 PARTNO DESC INSTOCK RECDATE PRICE K89R seal 48 14NOV96 245.00 M4J7 sander 122 23AUG96 47.98 LK43 filter 132 29JAN97 14.99 MN21 brace 52 09JAN97 27.87 BC85 clamp 92 09DEC96 10.00 NCF3 valve 198 20MAR96 24.50 KJ66 cutter 6 18JUN96 19.77 UYN7 rod 211 09SEP96 11.55 JD03 switch 383 09JAN97 13.99 BV1E timer 26 03JAN97 34.50 ME34 cutter 8 14NOV96 14.50 |
This example shows that you can replace and remove (delete) observations and write observations to different data sets. Further, this example shows that if an OUTPUT, REPLACE, or REMOVE statement is present, you must specify explicitly what action to take because no default statement is generated.
The parts that were received in 1997 are output to INVTY.STOCK97 and are removed from INVTY.STOCK. Likewise, the parts that were received in 1995 are output to INVTY.STOCK95 and are removed from INVTY.STOCK. Only the parts that were received in 1996 remain in INVTY.STOCK, and the PRICE is updated only in INVTY.STOCK.
libname invty 'SAS-data-library';
data invty.stock invty.stock95 invty.stock97; modify invty.stock; if recdate>'01jan97'd then do; output invty.stock97; remove invty.stock; end; else if recdate<'01jan96'd then do; output invty.stock95; remove invty.stock; end; else do; price=price*1.1; replace invty.stock; end; run;
New Prices for Stock Rcvd in '96 1 PARTNO DESC INSTOCK RECDATE PRICE LK43 filter 121 19MAY96 12.089 MN21 brace 43 10AUG96 30.657 BC85 clamp 80 16AUG96 10.505 NCF3 valve 198 20MAR96 26.950 KJ66 cutter 6 18JUN96 21.747 UYN7 rod 211 09SEP96 12.705 |
See Also |
Statements:
| |||||||||
"Reading, Combining, and Modifying SAS Data Sets" in SAS Language Reference: Concepts | |||||||||
"SQL Procedure" in the SAS Procedures Guide |
Chapter Contents |
Previous |
Next |
Top of Page |
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.