The DATASETS Procedure

APPEND Statement

Adds the observations from one SAS data set to the end of another SAS data set.

Reminder: You can use data set options with the BASE= and DATA= options. See Statements with the Same Function in Multiple Procedures for details. You can also use any global statements as well. See Global Statements for a list.

Restriction: The BASE= data set must be a member of a SAS library that supports UPDATE access. The TAPE engine and the XPORT engine are examples of engines that do not support UPDATE access.

Restriction: If the BASE= data set is accessed through a SAS server and if no other user has the data set open at the time the APPEND statement begins processing, the BASE= data set defaults to CNTLLEV=MEMBER. When this happens, no other user can update the file while the data set is processed.

Featured in: Concatenating Two SAS Data Sets

APPEND BASE=<libref.>SAS-data-set <options> ;

Required Arguments

BASE=<libref.> SAS-data-set

names the data set to which you want to add observations.

libref: specifies the library that contains the SAS data set. If you omit libref, the default is the libref for the procedure input library. If you are using PROC APPEND, the default for libref is either WORK or USER.
SAS-data-set: names a SAS data set. If the APPEND statement cannot find an existing data set with this name, it creates a new data set in the library. In other words, you can use the APPEND statement to create a data set by specifying a new data set name in the BASE= argument.

The BASE= data set is the current SAS data set after all append operations regardless of whether you are creating a new data set or appending to an existing data set.

Alias: OUT=
Featured in: Concatenating Two SAS Data Sets

Options

APPENDVER=V6

uses the Version 6 behavior for appending observations to the BASE= data set. You must specify V6.

See also:

Appending to an Indexed Data Set

DATA=<libref.> SAS-data-set

names the SAS data set containing observations that you want to append to the end of the SAS data set specified in the BASE= argument.

libref: specifies the library that contains the SAS data set. If you omit libref, the default is the libref for the procedure input library. The DATA= data set can be from any SAS data library, but you must use the two-level name if the data set resides in a library other than the procedure input library.
SAS-data-set: names a SAS data set. If the APPEND statement cannot find an existing data set with this name, it stops processing.

Alias:	NEW=
Default:	the most recently created SAS data set, from any SAS data library
Featured in:	Concatenating Two SAS Data Sets

FORCE

forces the APPEND statement to concatenate data sets when the DATA= data set contains variables that either

are not in the BASE= data set
do not have the same type as the variables in the BASE= data set
are longer than the variables in the BASE= data set.

See also:	Appending to Data Sets with Different Variables and Appending to Data Sets That Contain Variables with Different Attributes
Featured in:	Concatenating Two SAS Data Sets

Restricting the Observations That Are Appended
You can use the WHERE= data set option with the DATA= data set to restrict the observations that are appended. Likewise, you can use the WHERE statement to restrict the observations from the DATA= data set. The WHERE statement has no affect on the BASE= data set. If you use the WHERE= data set option with the BASE= data set, WHERE= has no affect.

Choosing between the SET Statement and the APPEND Statement
If you use the SET statement in a DATA step to concatenate two data sets, the SAS System must process all the observations in both data sets to create a new one. The APPEND statement bypasses the processing of data in the original data set and adds new observations directly to the end of the original data set. Using the APPEND statement can be more efficient than using a SET statement if

the BASE= data set is large
all variables in the BASE= data set have the same length and type as the variables in the DATA= data set and if all variables exist in both data sets.
Note: You can use the CONTENTS statement to see the variable lengths and types.

The APPEND statement is especially useful if you frequently add observations to a SAS data set (for example, in production programs that are constantly appending data to a journal-type data set).

Appending Password-Protected SAS Data Sets
In order to use the APPEND statement, you need read access to the DATA= data set and write access to the BASE= data set. To gain access, use the READ= and WRITE= data set options in the APPEND statement the way you would use them in any other SAS statement, in parentheses immediately after the data set name. When you are appending password-protected data sets, remember the following guidelines:

If you do not give the read password for the DATA= data set in the APPEND statement, by default the procedure looks for the read password for the DATA= data set in the PROC DATASETS statement. However, the procedure does not look for the write password for the BASE= data set in the PROC DATASETS statement. Therefore, you must specify the write password for the BASE= data set in the APPEND statement.
If the BASE= data set is read-protected only, you must specify its read password in the APPEND statement.

Appending to a Compressed Data Set
You can concatenate compressed SAS data sets. Either or both of the BASE= and DATA= data sets can be compressed. If the BASE= data set allows the reuse of space from deleted observations, the APPEND statement may insert the observations into the middle of the BASE= data set to make use of available space.

For information on the COMPRESS= and REUSE= data set and system options, see SAS Language: Reference.

Appending to an Indexed Data Set
Beginning with Version 7, the behavior of appending to an indexed data set has changed to improve performance.

In Version 6, when you appended to an indexed data set, the index was updated for each added observation. Index updates tend to be random; therefore, disk I/O could have been high.
Currently, SAS does not update the index until all observations are added to the data set. After the append, SAS internally sorts the observations and inserts the data into the index in sequential order, which reduces most of the disk I/O and results in a faster append method.

The current method is used by default when the following requirements are met; otherwise, the Version 6 method is used:

The BASE= data set is open for member-level locking.
The BASE= data set does not contain referential integrity constraints.
The BASE= data set is not accessed using the Cross Environment Data Access (CEDA) facility.
The BASE= data set is not using a WHERE= data set option.

To display information in the SAS log about the append method that is being used, you can specify the MSGLEVEL= system option as follows:

options msglevel=i;

Either a message displays if the fast-append method is in use or a message or messages display as to why the fast-append method is not in use.

The current append method initially adds observations to the BASE= data set regardless of the restrictions that are determined by the index. For example, a variable that has an index that was created with the UNIQUE option does not have its values validated for uniqueness until the index is updated. Then, if a nonunique value is detected, the offending observation is deleted from the data set. This means that after observations are appended, some of them may subsequently be deleted.

For a simple example, consider that the BASE= data set has ten observations numbered from 1 to 10 with a UNIQUE index for the variable ID. You append a data set that contains five observations numbered from 1 to 5, and observations 3 and 4 both contain the same value for ID. The following occurs

After the observations are appended, the BASE= data set contains 15 observations numbered from 1 to 15.
SAS updates the index for ID, validates the values, and determines that observations 13 and 14 contain the same value for ID.
SAS deletes one of the observations from the BASE= data set, resulting in 14 observations that are numbered from 1 to 15. For example, observation 13 is deleted. Note that you cannot predict which observation will be deleted, because the internal sort may place either observation first. (In Version 6, you could predict that observation 13 would be added and observation 14 would be rejected.)

If you do not want the current behavior (which could result in deleted observations) or if you want to be able to predict which observations are appended, request the Version 6 append method by specifying the APPENDVER=V6 option:

proc data sets;
     append base=a data=b appendver=v6;
run;

Note: In Version 6, deleting the index and then recreating it after the append could improve performance. The current method may eliminate the need to do that. However, the performance depends on the nature of your data. [cautionend]

Appending to Data Sets with Different Variables
If the DATA= data set contains variables that are not in the BASE= data set, use the FORCE option in the APPEND statement to force the concatenation of the two data sets. The APPEND statement drops the extra variables and issues a warning message.

If the BASE= data set contains a variable that is not in the DATA= data set, the APPEND statement concatenates the data sets, but the observations from the DATA= data set have a missing value for the variable that was not present in the DATA= data set. The FORCE option is not necessary in this case.

Appending to Data Sets That Contain Variables with Different Attributes
If a variable has different attributes in the BASE= data set than it does in the DATA= data set, the attributes in the BASE= data set prevail.

If the length of a variable is longer in the DATA= data set than in the BASE= data set, or if the same variable is a character variable in one data set and a numeric variable in the other, use the FORCE option. Using FORCE has these consequences:

The length of the variables in the BASE= data set takes precedence. The SAS System truncates values from the DATA= data set to fit them into the length that is specified in the BASE= data set.
The type of the variables in the BASE= data set takes precedence. The APPEND statement replaces values of the wrong type (all values for the variable in the DATA= data set) with missing values.

Appending Data Sets That Contain Integrity Constraints
If the DATA= data set contains integrity constraints and the BASE= data set does not exist, the APPEND statement copies both general and referential integrity constraints. If the BASE= data set exists, the APPEND action copies only observations.

Appending with Generation Groups
You can use the GENNUM= data set option to append to a specific generation file. Here are examples:

SAS Statements Result

proc datasets; append base=a data=b(gennum=2);
appends B (GENNUM=2) to A

proc datasets; append base=a(gennum=2) data=b(gennum=2);
appends B (GENNUM=2) to A(GENNUM=2)

Using the APPEND Procedure Instead of the APPEND Statement
The only difference between the APPEND procedure and the APPEND statement in PROC DATASETS, is the default for libref in the BASE= and DATA= arguments. For PROC APPEND, the default is either WORK or USER. For the APPEND statement, the default is the libref of the procedure input library.

System Failures
If a system failure or some other type of interruption occurs while the procedure is executing, the append operation may not be successful; it is possible that not all, perhaps none, of the observations will be added to the BASE= data set. In addition, the BASE= data set may suffer damage. The APPEND operation performs an update in place, which means that it does not make a copy of the original data set before it begins to append observations.

Chapter Contents
Previous
Next
Top of Page

Reminder:	You can use data set options with the BASE= and DATA= options. See Statements with the Same Function in Multiple Procedures for details. You can also use any global statements as well. See Global Statements for a list.
Restriction:	The BASE= data set must be a member of a SAS library that supports UPDATE access. The TAPE engine and the XPORT engine are examples of engines that do not support UPDATE access.
Restriction:	If the BASE= data set is accessed through a SAS server and if no other user has the data set open at the time the APPEND statement begins processing, the BASE= data set defaults to CNTLLEV=MEMBER. When this happens, no other user can update the file while the data set is processed.
Featured in:	Concatenating Two SAS Data Sets

Alias:	OUT=
Featured in:	Concatenating Two SAS Data Sets

SAS Statements	Result
proc datasets; append base=a data=b(gennum=2);	appends B (GENNUM=2) to A
proc datasets; append base=a(gennum=2) data=b(gennum=2);	appends B (GENNUM=2) to A(GENNUM=2)