Chapter Contents |
Previous |
Next |
WHERE operand <operator> <operand> |
operand | something to be operated on. An operand can be a variable, a SAS function, or a constant. See Specifying an Operand. |
operator | a symbol that requests a comparison, logical operation, or arithmetic calculation. All SAS expression operators are valid for a WHERE expression, which include arithmetic, comparison, logical, minimum and maximum, concatenation, parentheses to control order of evaluation, and prefix operators. In addition, you can use special WHERE expression operators, which include BETWEEN-AND, CONTAINS, IS NULL or IS MISSING, LIKE, sounds-like, and SAME-AND. See Specifying an Operator. |
For more information on SAS expressions, see Expressions.
Specifying an Operand |
where score > 50; where date >= '01jan1998'd and time >= '9:00't; where state = 'Texas';
In a WHERE expression, you cannot use automatic variables created by the DATA step (for example, FIRST.variable, LAST.variable, _N_, or variables created in assignment statements).
As in other SAS expressions, the names of numeric variables can stand alone. SAS treats numeric values of 0 or missing as false; other values are true. For example, the following WHERE expression returns all values for EMPNUM and SSN that are not missing or that have a value of 0:
where empnum and ssn;
The names of character variables can also stand alone. SAS selects observations where the value of the character variable is not blank. For example, the following WHERE expression returns all values not equal to blank:
where lastname;
The following DATA step produces a SAS data set
that contains only observations
from data set CUSTOMER in which the value of NAME begins with Mac
and the value of variable CITY is Charleston
or
Atlanta
:
data testmacs; set customer; where substr (name,1,3) = 'Mac' and (city='Charleston' or city='Atlanta'); run;
Note: SAS functions used in a WHERE expression that can be
optimized by an index are the SUBSTR function and the TRIM function.
For more information on SAS functions, see
Functions and CALL Routines
The value will be either numeric or character. Note the following rules regarding whether to use quotation marks:
where price > 200;
where lastname eq 'Martin';
where item = '6" decorative pot'; where name ? "D'Amico";
where birthday = '24sep1975'd; where birthday = "24sep1975"d;
Specifying an Operator |
Symbol | Definition | Example |
---|---|---|
* | multiplication | where bonus = salary * .10; |
/ | division | where f = g/h; |
+ | addition | where c = a+b; |
- | subtraction | where f = g-h; |
** | exponentiation | where y = a**2; |
where zipcode eq 78753;
The following table lists the comparison operators:
Symbol | Mnemonic Equivalent | Definition | Example |
---|---|---|---|
= | EQ | equal to | where empnum eq 3374; |
^= or ~= or ¬= | NE | not equal to | where status ne fulltime; |
> | GT | greater than | where hiredate gt '01jun1982'd; |
< | LT | less than | where empnum < 2000; |
>= | GE | greater than or equal to | where empnum >= 3374; |
<= | LE | less than or equal to | where empnum <= 3374; |
IN | equal to one from a list of values | where state in ('NC','TX'); |
where lastname=: 'S';Note that in the SQL procedure, the colon modifier used in conjunction with an operator is not supported; you can use the LIKE operator instead.
For example, suppose you want all sites that are in North Carolina or Texas. You could specify:
where state = 'NC' or state = 'TX';
However, the easier way would be to use the IN operator, which says you want any state in the list:
where state in ('NC','TX');
In addition, you can use the NOT logical operator to exclude a list. For example,
where state not in ('CA', 'TN', 'MA');
where 500 <= empnum <= 1000;
Note that the previous range condition expression is equivalent to the following:
where empnum >= 500 and empnum <= 1000;
You can combine the NOT logical operator with a fully-bounded range condition to select observations that fall outside the range. Note that parentheses are required:
where not (500 <= empnum <= 1000);
You can specify the limits of the range as constants or expressions. Any range you specify is an inclusive range, so that a value equal to one of the limits of the range is within the range. The general syntax for using BETWEEN-AND is:
WHERE variable BETWEEN value AND value;
For example:
where empnum between 500 and 1000; where taxes between salary*0.30 and salary*0.50;
You can combine the NOT logical operator with the BETWEEN-AND operator to select observations that fall outside the range:
where empnum not between 500 and 1000;
Note: The BETWEEN-AND operator and a fully-bounded range condition produce the same results. That is, the following WHERE expressions are equivalent:
where 500 <= empnum <= 1000; where empnum between 500 and 1000;
The following examples select observations having the values Mobay
and Brisbayne
for the
variable COMPANY, but they do not select observations containing Bayview
:
where company contains 'bay'; where company ? 'bay';
You can combine the NOT logical operator with the CONTAINS operator to select observations that are not included in a specified string:
where company not contains 'bay';
proc sql; select * from table1 as a, table 2 as b where a.fullname contains trim(b.lastname) and a.fullname contains trim(b.firstname);
In addition, the TRIM function is helpful when you search on a macro variable. For example:
proc print; where fullname contains trim("&lname"); run;
where idnum is missing where name is null;
Using the above examples, the following is equivalent for character data:
where name = ' ';And the following is equivalent for numeric data for which missing values can be differentiated with special missing value characters:
where idnum <= .Z;
You can combine the NOT logical operator with IS NULL or IS MISSING to select nonmissing values, as follows:
where salary is not missing;
percent sign (%) | specifies that any number of characters can occupy
that position. The following WHERE expression selects all employees with a
name that starts with the letter N . The names
can be of any length.
where lastname like 'N%'; |
underscore (_) | matches just one character in the value for each
underscore character. You can specify more than one consecutive underscore
character in a pattern, and you can specify a percent sign and an underscore
in the same pattern. For example, you can use different forms of the LIKE
operator to select character values from this list of first
names:
|
The following table shows which of these names is selected by various forms using the LIKE operators:
Pattern | Name Selected |
---|---|
like 'D_an' | Dyan |
like 'D_an_' | Diana, Diane |
like 'D_an__' | Dianna |
like 'D_an%' | all names from list |
You can use a SAS character expression to specify a pattern, but you cannot use a SAS character expression that uses a SAS function.
You can combine the NOT logical operator with LIKE to select values that do not have the specified pattern, such as:
where frstname not like 'D_an%';
Although the sounds-like operator is useful, it does not always select all possible values. For example, consider that you want to select observations from the following list of names that sound like Smith:
Schmitt
Smith
Smithson
Smitt
Smythe
The
following WHERE expression selects all the names from this list
except Schmitt
and Smithson
:
where lastname=* 'Smith';
You can combine the NOT logical operator with the sounds-like operator to select values that do not contain a spelling variation of a specified word or words, such as:
where lastname not =* 'Smith';
Note: The sounds-like operator cannot be optimized with an index.
Use the SAME-AND operator when you already have a WHERE expression defined and you want to insert additional conditions. The SAME-AND operator has the following form:
where-expression-1;
. . . SAS statements. . .
WHERE SAME AND where-expression-2;
. . . SAS statements. . .
WHERE SAME AND where-expression-n;
SAS selects observations that satisfy the conditions after the SAME-AND operator in addition to any previously defined conditions. SAS treats all of the existing conditions as though they were conditions separated by AND operators in a single WHERE expression.
The following example shows how to use the SAME-AND operator within RUN groups in the GPLOT procedure. The SAS data set YEARS has three variables and contains quarterly data for the 1990-1997 period:
proc gplot data=years; plot unit*quar=year; run; where year > '01jan1991'd; run; where same and year < '01jan1996'd; run;
The following WHERE expression is equivalent to the preceding code:
where year > '01jan1991'd and year < '01jan1996'd;
For example, if A is less than B, then the following would return the value of A:
where x = (a min b);
Note:
The
symbol representation >< is not supported, and <> is interpreted as
not equals.
For example,
where name = 'John'||'Smith';
where z = -(x + y);
Note: The NOT operator is also considered a prefix operator.
Chapter Contents |
Previous |
Next |
Top of Page |
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.