SAS Companion for the Microsoft Windows Environment
Using PEEK Functions to Access Character String Arguments
For example, suppose you have a routine named GetPath in a library named SERVICES.DLL. It has two arguments, an integer function code and a pointer to a pointer. The function code determines what action GetPath will take, and the second argument points to a pointer that will be updated by GetPath to refer to a system character string. The calling code in C might be

    GetPath(1,&stgptr);
    printf("GetPath indicates string is '%s'.\n",stgptr);

Using MODULE, the corresponding attribute table entry would be

    ROUTINE GetPath MINARG=2 MAXARG=2 MODULE=SERVICES;
    ARG 1 NUM INPUT BYVALUE FORMAT=PIB4.;
    ARG 2 NUM OUTPUT BYADDR FORMAT=PIB4.;

and could be invoked as follows:

    call module('SERVICES,GetPath',1,stgptr);
    put stgptr= stgptr=hex8.;

If the pointer value in STGPTR is 0035F780, STGPTR would actually be set to the decimal value 3536768, which is the decimal equivalent of hexadecimal 0035F780. So the PUT statement above would produce:

    STGPTR=3536768 STGPTR=0035F780

However, you want the data at address 0035F780, not the value of the pointer itself. To access that data, you need to use the PEEKC function.
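The two values printed by the PUT statement describe the same 4-byte pointer. The following Python sketch is an illustration only (Python is not part of the SAS interface, and the variable names are chosen for this example); it reproduces the decimal and hexadecimal views of the pointer value assumed above.

```python
# Illustration only: the pointer value from the example, viewed both ways.
stgptr = 0x0035F780                 # value GetPath is assumed to store in STGPTR

decimal_text = str(stgptr)          # the plain numeric view (PUT stgptr=)
hex_text = format(stgptr, "08X")    # the 8-digit hexadecimal view (HEX8.)

print(f"STGPTR={decimal_text} STGPTR={hex_text}")
```

Both strings carry the same information; neither is the character data that the pointer refers to.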
The PEEKC function is given two arguments, a pointer via a numeric variable (such as STGPTR above) and a length in bytes (characters). PEEKC returns a character string of the specified length containing the characters at the pointer location.
In the example, suppose that GetPath sets the second argument's pointer value to the address of the null-terminated character string C:\XYZ. You can access the character data with:

    call module('SERVICES,GetPath',1,stgptr);
    length path $64;
    path = peekc(stgptr,64);
    i = index(path,'00'x);
    if i then substr(path,i)=' ';
    /* path now contains the string */

The PEEKC function copies 64 bytes starting at the location referred to by the pointer in STGPTR. Because you need only the data up to the null terminator (but not including it), you search for the null terminator with the INDEX function, then blank out all characters including and after that point.
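The copy-then-trim logic described above can be modeled outside SAS. In this Python sketch, `peekc` and `blank_from_null` are hypothetical stand-ins written for illustration (they are not SAS functions), and `memory` is a simulated block of bytes rather than real process memory.

```python
def peekc(memory: bytes, address: int, length: int) -> bytes:
    """Hypothetical stand-in for PEEKC: copy `length` bytes starting at `address`."""
    return memory[address:address + length]

def blank_from_null(raw: bytes) -> bytes:
    """Blank out the first null byte and everything after it (the INDEX/SUBSTR step)."""
    i = raw.find(b"\x00")
    if i < 0:
        return raw                      # no terminator found: leave the data alone
    return raw[:i] + b" " * (len(raw) - i)

# Simulated memory: the null-terminated string followed by unrelated nonzero bytes.
memory = b"C:\\XYZ\x00" + bytes(range(1, 90))

raw = peekc(memory, 0, 64)              # 64 bytes, including junk after the null
path = blank_from_null(raw)             # junk replaced with blanks
print(repr(path.rstrip().decode("ascii")))
```

The fixed length of 64 is only an upper bound on the string size; the bytes after the terminator are whatever happens to follow the string in memory, which is why they must be blanked.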
You can also use the $CSTR format in this scenario to simplify your code slightly:

    call module('SERVICES,GetPath',1,stgptr);
    length path $64;
    path = put(peekc(stgptr,64),$cstr64.);

The $CSTR format accepts as input a character string of a specified width. It looks for a null terminator and pads the output string with blanks from that point. For more information, see $CSTRw. Format.
Accessing External DLLs Efficiently
For example, suppose the attribute table contained the following entries:

    * routines XYZ and BBB in FIRST.DLL;
    ROUTINE XYZ MINARG=1 MAXARG=1 MODULE=FIRST;
    ARG 1 NUM INPUT;
    ROUTINE BBB MINARG=1 MAXARG=1 MODULE=FIRST;
    ARG 1 NUM INPUT;
    * routines ABC and DDD in SECOND.DLL;
    ROUTINE ABC MINARG=1 MAXARG=1 MODULE=SECOND;
    ARG 1 NUM INPUT;
    ROUTINE DDD MINARG=1 MAXARG=1 MODULE=SECOND;
    ARG 1 NUM INPUT;

and the DATA step looked like:

    filename sascbtbl 'myattr.tbl';
    data _null_;
       do i=1 to 50;
          /* FIRST.DLL is loaded only once */
          value = modulen('XYZ',i);
          /* SECOND.DLL is loaded only once */
          value2 = modulen('ABC',value);
          put i= value= value2=;
       end;
    run;

In this example, MODULEN parses the attribute table during DATA step compilation. In the first loop iteration (i=1), FIRST.DLL is loaded and the XYZ routine is accessed when MODULEN calls for it. Next, SECOND.DLL is loaded and the ABC routine is accessed. For subsequent loop iterations (starting when i=2), FIRST.DLL and SECOND.DLL remain loaded, so the MODULEN function simply accesses the XYZ and ABC routines. The SAS System unloads both DLLs at the end of the DATA step.
Note that the attribute table can contain any number of descriptions for routines that are not accessed for a given step. This does not cause any additional overhead (apart from a few bytes of internal memory to hold the attribute descriptions). In the above example, BBB and DDD are in the attribute table but are not accessed by the DATA step.
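The load-once behavior described above resembles a simple cache keyed by module name. The following Python sketch is a toy model only: the `StepLoader` class and its methods are hypothetical names invented for this illustration, and no DLL is actually loaded.

```python
class StepLoader:
    """Toy model of one DATA step: each DLL is loaded on first use, then reused."""

    def __init__(self, attribute_table):
        # routine name -> module name, as parsed once from the attribute table
        self.attribute_table = dict(attribute_table)
        self.loaded = {}         # module name -> number of calls served
        self.load_events = []    # order in which modules were loaded

    def call(self, routine):
        module = self.attribute_table[routine]
        if module not in self.loaded:
            self.load_events.append(module)   # first access: load the DLL
            self.loaded[module] = 0
        self.loaded[module] += 1              # later accesses reuse the loaded DLL

    def end_step(self):
        self.loaded.clear()                   # everything is unloaded at step end


table = {"XYZ": "FIRST", "BBB": "FIRST", "ABC": "SECOND", "DDD": "SECOND"}
step = StepLoader(table)
for i in range(1, 51):
    step.call("XYZ")
    step.call("ABC")
print(step.load_events)
```

In this model the unused entries BBB and DDD cost only their dictionary slots, which mirrors the remark about attribute descriptions that are never accessed.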
Grouping SAS Variables as Structure Arguments
For example, consider the GetClientRect routine, which is part of the Win32 API in USER32.DLL. This routine retrieves the coordinates of a window's client area. This also requires the use of another routine, GetActiveWindow, to get the window handle for the window you want the coordinates from.
The C prototypes for these routines are

    HWND GetActiveWindow(VOID);
    BOOL GetClientRect(HWND hWnd, LPRECT lprc);

In C, the code to invoke them is:

    typedef struct tagRECT {
       int left;
       int top;
       int right;
       int bottom;
    } RECT;

    /* rect is a structure variable of type RECT */
    RECT rect;
    .... /* other code */
    /* Need the window handle first */
    hWnd=GetActiveWindow();
    /* Function call, passing the address of rect */
    GetClientRect(hWnd, &rect);

To call these routines using MODULE, you would use the following attribute table entries:

    routine GetActiveWindow minarg=0 maxarg=0 stackpop=called module=USER32 returns=ushort;
    routine GetClientRect minarg=5 maxarg=5 stackpop=called module=USER32;
    arg 1 num input byvalue format=pib4.;
    arg 2 num update fdstart format=ib4.;
    arg 3 num update format=ib4.;
    arg 4 num update format=ib4.;
    arg 5 num update format=ib4.;

with the following DATA step:

    filename sascbtbl 'sascbtbl.dat';
    data _null_;
       hwnd=modulen('GetActiveWindow');
       call module('GetClientRect',hwnd,left,top,right,bottom);
       put left= top= right= bottom=;
    run;

The use of the FDSTART option in the ARG statement for argument 2 indicates that argument 2 and all subsequent arguments are to be gathered together into a single parameter block.
The output in the log from the PUT statement would look like:
LEFT=2 TOP=2 RIGHT=400 BOTTOM=587
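Gathering four separate values into one parameter block amounts to laying them out back to back in memory, as the fields of the RECT structure are. The Python sketch below illustrates that layout with the standard `struct` module; the helper names `gather` and `scatter` are hypothetical, the initial zeros are placeholders, and the values written "by the routine" are simply the ones from the sample log.

```python
import struct

# Four 4-byte signed integers, back to back, little-endian (Intel byte order).
RECT_LAYOUT = "<4i"

def gather(left, top, right, bottom):
    """Pack four separate values into one 16-byte parameter block."""
    return struct.pack(RECT_LAYOUT, left, top, right, bottom)

def scatter(block):
    """Unpack the block back into four separate values after the call."""
    return struct.unpack(RECT_LAYOUT, block)

block = gather(0, 0, 0, 0)                         # one block, passed by address
block = struct.pack(RECT_LAYOUT, 2, 2, 400, 587)   # stand-in for what the routine writes
left, top, right, bottom = scatter(block)
print(f"LEFT={left} TOP={top} RIGHT={right} BOTTOM={bottom}")
```

The routine sees a single pointer to 16 bytes, while the caller continues to work with four ordinary variables.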
Using Constants and Expressions as Arguments to MODULE
You can specify input arguments as constants and arithmetic expressions. However, because output and update arguments must be able to be modified and returned, you can pass only a variable for these parameters. If you specify a constant or expression where a value that can be updated is expected, the SAS System issues a warning message pointing out the error. Processing continues, but the MODULExy routine cannot update a constant or expression argument (meaning that the value of the argument you wanted to update will be lost).
Consider these examples. Here is the attribute table:

    * attribute table entry for ABC;
    routine abc minarg=2 maxarg=2;
    arg 1 input format=ib4.;
    arg 2 output format=ib4.;

Here is the DATA step with the MODULE calls:

    data _null_;
       x=5;
       /* passing a variable as the second argument - OK */
       call module('abc',1,x);
       /* passing a constant as the second argument - INVALID */
       call module('abc',1,2);
       /* passing an expression as the second argument - INVALID */
       call module('abc',1,x+1);
    run;

In the above example, the first call to MODULE is correct because the variable x is updated with what the abc routine returns for the second argument. The second call to MODULE is not correct because a constant is passed. MODULE issues a warning indicating you have passed a constant, and MODULE passes a temporary area instead. The third call to MODULE is not correct as an arithmetic expression is passed, causing a temporary location from the DATA step to be used, and the returned value is lost.
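The difference between the three calls can be modeled as follows. This Python sketch is a loose analogy, not a description of how SAS is implemented: `call_abc` is a hypothetical function, the value 42 is an invented result, and a variable is represented by its name while a constant or expression is represented by its value.

```python
import warnings

def call_abc(variables, second_arg):
    """Toy model of CALL MODULE('abc', 1, second_arg).

    `second_arg` is a variable name (str) or a constant/expression value.
    The pretend routine always returns 42 through its second argument.
    """
    result = 42  # hypothetical value written by the routine
    if isinstance(second_arg, str):
        variables[second_arg] = result      # variable: the update is kept
    else:
        warnings.warn("constant or expression passed for an output argument")
        _temporary = result                 # temporary area: the update is lost
    return variables

data_step = {"x": 5}
call_abc(data_step, "x")                        # variable: x is updated
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    call_abc(data_step, 2)                      # constant
    call_abc(data_step, data_step["x"] + 1)     # expression x+1
print(data_step, len(caught))
```

Only the first call leaves a visible result; the other two each produce a warning and discard what the routine wrote.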
Specifying Formats and Informats to Use with MODULE Arguments
Usually, the format you use corresponds to a variable type for a given programming language. The following sections describe the proper formats that correspond to different variable types in various programming languages.
C Type | SAS Format/Informat |
---|---|
double | RB8. |
float | FLOAT4. |
signed int | IB4. |
signed short | IB2. |
signed long | IB4. |
char * | IB4. |
unsigned int | PIB4. |
unsigned short | PIB2. |
unsigned long | PIB4. |
char[w] | $CHARw. or $CSTRw. (see $CSTRw. Format) |
Note: For information about passing character data other than as pointers to character strings, see $BYVALw. Format.
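The sizes and signedness in the C table can be cross-checked with Python's `struct` module. The mapping below is an assumption made for this illustration (little-endian codes, matching Intel byte order); it is not taken from SAS documentation, and the dictionary name is hypothetical.

```python
import struct

# Assumed correspondence between the SAS formats in the table and
# Python struct codes for little-endian (Intel) data.
SAS_TO_STRUCT = {
    "RB8.":    "<d",  # double
    "FLOAT4.": "<f",  # float
    "IB4.":    "<i",  # signed int / signed long
    "IB2.":    "<h",  # signed short
    "PIB4.":   "<I",  # unsigned int / unsigned long
    "PIB2.":   "<H",  # unsigned short
}

for sas_format, code in SAS_TO_STRUCT.items():
    print(f"{sas_format:8} {struct.calcsize(code)} bytes")

# The same bytes read differently as signed and unsigned values.
raw = struct.pack("<h", -2)                      # signed 2-byte view: -2
print(raw.hex(), struct.unpack("<H", raw)[0])    # unsigned 2-byte view of the same bytes
```

Choosing a signed format for an unsigned C type (or the reverse) does not change the bytes that are passed, only how they are interpreted, which is why a mismatch can go unnoticed until a value exceeds the signed range.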
FORTRAN Type | SAS Format/Informat |
---|---|
integer*2 | IB2. |
integer*4 | IB4. |
real*4 | RB4. |
real*8 | RB8. |
character*w | $CHARw. |
The MODULE routines can support FORTRAN character arguments only if they are not expected to be passed by descriptor.
PL/I Type | SAS Format/Informat |
---|---|
FIXED BIN(15) | IB2. |
FIXED BIN(31) | IB4. |
FLOAT BIN(21) | RB4. |
FLOAT BIN(31) | RB8. |
CHARACTER(w) | $CHARw. |
The PL/I descriptions are added here for completeness; this does not guarantee that you will be able to invoke PL/I routines.
COBOL Format | SAS Format/Informat | Description |
---|---|---|
PIC Sxxxx BINARY | IBw. | integer binary |
COMP-2 | RB8. | double-precision floating point |
COMP-1 | RB4. | single-precision floating point |
PIC xxxx or Sxxxx | Fw. | printable numeric |
PIC yyyy | $CHARw. | character |
The following COBOL specifications might not properly match with the Institute-supplied formats because zoned and packed decimal are not truly defined for systems based on Intel architecture.
COBOL Format | SAS Format/Informat | Description |
---|---|---|
PIC Sxxxx DISPLAY | ZDw. | zoned decimal |
PIC Sxxxx PACKED-DECIMAL | PDw. | packed decimal |
The following COBOL specifications do not have true native equivalents and are only usable in conjunction with the corresponding S370Fxxx informat and format, which allows for IBM mainframe-style representations to be read and written in the PC environment.
COBOL Format | SAS Format/Informat | Description |
---|---|---|
PIC xxxx DISPLAY | S370FZDUw. | zoned decimal unsigned |
PIC Sxxxx DISPLAY SIGN LEADING | S370FZDLw. | zoned decimal leading sign |
PIC Sxxxx DISPLAY SIGN LEADING SEPARATE | S370FZDSw. | zoned decimal leading sign separate |
PIC Sxxxx DISPLAY SIGN TRAILING SEPARATE | S370FZDTw. | zoned decimal trailing sign separate |
PIC xxxx BINARY | S370FIBUw. | integer binary unsigned |
PIC xxxx PACKED-DECIMAL | S370FPDUw. | packed decimal unsigned |
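The mainframe-style representations differ from the native Intel ones mainly in byte order and digit encoding. The Python sketch below illustrates two of them under stated assumptions: unsigned binary integers are taken to be big-endian, and unsigned zoned decimal is taken to store one EBCDIC digit (hex F0 through F9) per byte. The function names are hypothetical, and only nonnegative values are handled.

```python
def s370_unsigned_binary(value: int, width: int) -> bytes:
    """Big-endian unsigned integer, as on IBM mainframes (sketch of S370FIBUw.)."""
    return value.to_bytes(width, "big")

def s370_zoned_unsigned(value: int, width: int) -> bytes:
    """Unsigned zoned decimal: one EBCDIC digit per byte (sketch of S370FZDUw.)."""
    digits = str(value).zfill(width)
    if len(digits) > width:
        raise ValueError("value does not fit in the requested width")
    return bytes(0xF0 + int(d) for d in digits)

print(s370_unsigned_binary(258, 2).hex())    # mainframe byte order
print((258).to_bytes(2, "little").hex())     # Intel byte order, for contrast
print(s370_zoned_unsigned(123, 3).hex())
```

A routine compiled to expect mainframe-style data would misread the native little-endian bytes, which is the situation these formats are meant to address.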
For example, given the following attribute table entry:

    * attribute table entry;
    routine abc minarg=1 maxarg=1;
    arg 1 input char format=$cstr10.;

you can use the following DATA step:

    data _null_;
       rc = modulen('abc','my string');
    run;

The $CSTR format adds a null terminator to the character string my string before passing it to the abc routine. This is equivalent to the following attribute entry:

    * attribute table entry;
    routine abc minarg=1 maxarg=1;
    arg 1 input char format=$char10.;

with the following DATA step:

    data _null_;
       rc = modulen('abc','my string'||'00'x);
    run;

The first example is easier to understand and easier to use when using variable or expression arguments.
The $CSTR informat converts a null-terminated string into a blank-padded string of the specified length. If the DLL routine is supposed to update a character argument, use the $CSTR informat in the argument attribute.
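The two directions of the conversion can be sketched in Python. The functions `char_out`, `cstr_out`, and `cstr_in` are hypothetical models written for this illustration; in particular, how the real format handles a string that fills the entire width is not modeled here.

```python
def char_out(value: str, width: int) -> bytes:
    """Sketch of $CHARw. on output: the characters as given, blank-padded to width."""
    return value.encode("ascii").ljust(width, b" ")[:width]

def cstr_out(value: str) -> bytes:
    """Sketch of $CSTRw. on output: trailing blanks dropped, null terminator added."""
    return value.rstrip(" ").encode("ascii") + b"\x00"

def cstr_in(raw: bytes, width: int) -> str:
    """Sketch of the $CSTRw. informat: text before the null, blank-padded to width."""
    text = raw.split(b"\x00", 1)[0]
    return text.decode("ascii")[:width].ljust(width)

explicit = char_out("my string" + "\x00", 10)   # 'my string'||'00'x with $CHAR10.
implicit = cstr_out("my string")                # 'my string' with $CSTR10.
print(explicit == implicit)
print(repr(cstr_in(b"C:\\XYZ\x00leftover", 10)))
```

Under these assumptions both outgoing forms hand the routine the same ten bytes, and the incoming form discards anything after the terminator.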

    long xyz(a,b)
    long a;
    double b;
    {
       static char c = 'Y';
       if (a == 'X')
          return(1);
       else if (b == c)
          return(2);
       else
          return(3);
    }

In this example, the xyz routine expects two arguments, a long and a double. If the long is an X, the actual value of the long is 88 in decimal. This is because an ASCII X is stored as hex 58, and this is promoted to a long, represented as 0x00000058 (or 88 decimal). If the value of a is X, or 88, a 1 is returned. If the second argument, a double, is Y (which is interpreted as 89), then 2 is returned.
Now suppose that you want to pass characters as the arguments to xyz. In C, you would invoke them as follows:

    x = xyz('X',(double)'Z');
    y = xyz('Q',(double)'Y');

This is because the X and Q values are automatically promoted to ints (which are the same as longs for the sake of this example), and the integer values corresponding to Z and Y are cast to doubles.

To call xyz using the MODULEN function, your attribute table must reflect the fact that you want to pass characters:

    routine xyz minarg=2 maxarg=2 returns=long;
    arg 1 input char byvalue format=$byval4.;
    arg 2 input char byvalue format=$byval8.;

Note that it is important that the BYVALUE option appear in the ARG statement as well. Otherwise, MODULEN assumes that you want to pass a pointer to the routine, instead of a value.
Here is the DATA step that invokes MODULEN and passes it characters:

    data _null_;
       x = modulen('xyz','X','Z');
       put x= ' (should be 1)';
       y = modulen('xyz','Q','Y');
       put y= ' (should be 2)';
    run;

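The expected results can be traced with a Python transcription of the xyz routine. The helpers `byval_long` and `byval_double` are hypothetical models of what the text above describes for $BYVAL4. and $BYVAL8. (the character code passed as an integer value and as a double value); they are not SAS functions.

```python
def xyz(a: int, b: float) -> int:
    """Python transcription of the C routine xyz shown above."""
    c = "Y"
    if a == ord("X"):
        return 1
    elif b == ord(c):
        return 2
    return 3

def byval_long(ch: str) -> int:
    """Model of $BYVAL4.: pass the character's code as an integer value."""
    return ord(ch)

def byval_double(ch: str) -> float:
    """Model of $BYVAL8.: pass the character's code as a double value."""
    return float(ord(ch))

x = xyz(byval_long("X"), byval_double("Z"))
y = xyz(byval_long("Q"), byval_double("Y"))
print(f"x={x} y={y}")
```

The first call matches on the integer 88 for X; the second falls through to the comparison of 89.0 with the code for Y.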
Understanding MODULE Log Messages
If you specify i in the control string parameter to MODULE, the SAS System prints several informational messages to the log. You can use these messages to determine whether you have passed incorrect arguments or coded the attribute table incorrectly.
Consider this example that uses MODULEIN from within the IML procedure. It uses the MODULEIN function to invoke the changi routine (stored in a hypothetical TRYMOD.DLL). In the example, MODULEIN passes the constant 6 and the matrix x2, which is a 4x5 matrix to be converted to an integer matrix. The attribute table for changi is as follows:

    routine changi module=trymod returns=long;
    arg 1 input num format=ib4. byvalue;
    arg 2 update num format=ib4.;

The following IML step invokes MODULEIN:

    proc iml;
       x1 = J(4,5,0);
       do i=1 to 4;
          do j=1 to 5;
             x1[i,j] = i*10+j+3;
          end;
       end;
       y1= x1; x2 = x1; y2 = y1;
       rc = modulein('*i','changi',6,x2);
       ....

The '*i' control string causes the lines shown in MODULEIN Output to be printed in the log.

    ---PARM LIST FOR MODULEIN ROUTINE---
    CHR PARM 1 885E0AA8 2A69 (*i)
    CHR PARM 2 885E0AD0 6368616E6769 (changi)
    NUM PARM 3 885E0AE0 0000000000001840
    NUM PARM 4 885E07F0 0000000000002C400000000000002E40000000000000304000000000000031400000000000003240
      000000000000384000000000000039400000000000003A400000000000003B400000000000003C40
      0000000000004140000000000080414000000000
    ---ROUTINE changi LOADED AT ADDRESS 886119B8 (PARMLIST AT 886033A0)---
    PARM 1 06000000 <CALL-BY-VALUE>
    PARM 2 88604720 0E0000000F00000010000000110000001200000018000000190000001A0000001B0000001C000000
      22000000230000002400000025000000260000002C0000002D0000002E0000002F00000030000000
    ---VALUES UPON RETURN FROM changi ROUTINE---
    PARM 1 06000000 <CALL-BY-VALUE>
    PARM 2 88604720 140000001F0000002A0000003500000040000000820000008D00000098000000A3000000AE000000
      F0000000FB00000006010000110100001C0100005E01000069010000740100007F0100008A010000
    ---VALUES UPON RETURN FROM MODULEIN ROUTINE---
    NUM PARM 3 885E0AE0 0000000000001840
    NUM PARM 4 885E07F0 00000000000034400000000000003F4000000000000045400000000000804A400000000000005040
      00000000004060400000000000A06140000000000000634000000000006064400000000000C06540
      0000000000006E400000000000606F4000000000

The output is divided into four sections.
The 'CHR PARM n' portion indicates that character parameter n was passed. In the example, 885E0AA8 is the actual address of the first character parameter to MODULEIN. The value at the address is hex 2A69, and the ASCII representation of that value ('*i') is in parentheses after the hex value. The second parameter is likewise printed. Only these first two arguments have their ASCII equivalents printed; this is because other arguments might contain unreadable binary data.
The remaining parameters appear with only hex representations of their values (NUM PARM 3 and NUM PARM 4 in the example).
The third parameter to MODULEIN is numeric, and it is at address 885E0AE0. The hex representation of the floating point number 6 is shown. The fourth parameter is at address 885E07F0, which points to an area containing all the values for the 4x5 matrix. The *i option prints the entire argument; be careful if you use this option with large matrices because the log might become quite large.
The log contains the status of each argument as it is passed. For example, the first parameter in the example is call-by-value (as indicated in the log). The second parameter is the address of the matrix. The log shows the address, along with the data to which it points.
Note that all the values in the first parameter and in the matrix are long integers because the attribute table states that the format is IB4.

The third section shows the argument values upon return from changi. The call-by-value argument is unchanged, but the other argument (the matrix) contains different values.
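The hex strings in the log can be decoded by hand or with a few lines of code. The Python sketch below uses the standard `struct` module to decode values copied from the sample log, on the assumption that the data is little-endian (Intel byte order); the helper names `doubles` and `longs` are hypothetical.

```python
import struct

def doubles(hex_text):
    """Decode little-endian 8-byte floating-point values."""
    raw = bytes.fromhex(hex_text)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))

def longs(hex_text):
    """Decode little-endian 4-byte signed integers (the IB4. form)."""
    raw = bytes.fromhex(hex_text)
    return list(struct.unpack(f"<{len(raw) // 4}i", raw))

# First section: NUM PARM 3 and the first five values of NUM PARM 4.
print(doubles("0000000000001840"))
print(doubles("0000000000002C400000000000002E40"
              "000000000000304000000000000031400000000000003240"))

# Second and third sections: PARM 1, then the first five values of PARM 2
# before and after the call to changi.
print(longs("06000000"))
print(longs("0E0000000F000000100000001100000012000000"))
print(longs("140000001F0000002A0000003500000040000000"))
```

The decoded values show the same numbers in two forms: floating point on the MODULEIN side and 4-byte integers on the routine side, with the first matrix row going from 14 through 18 before the call to 20, 31, 42, 53, 64 after it.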
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.