SAS. Are variables set to missing at each iteration of the data step?

I always thought that variables are set to missing on each iteration of the data step. However, in the following code, the variable seems to keep the value it received on the very first iteration. I do not understand why this happens.

    data one;
      input x $ y;
      datalines;
    a 10
    a 13
    a 14
    b 9
    ;
    run;

    data two;
      input z;
      datalines;
    45
    ;
    run;

    data test;
      if _n_ = 1 then set two; /* when _n_=2 the PDV assigns missing values, right ? */
      set one;
    run;

    proc print;
    run;

Result

    z    x    y
    45   a    10
    45   a    13
    45   a    14
    45   b     9

I expected to get this

    z    x    y
    45   a    10
    .    a    13
    .    a    14
    .    b     9
+5
2 answers

SAS does not reset to missing the PDV values of variables read with a SET, MERGE, MODIFY, or UPDATE statement. Because z is read with a SET statement, SAS does not reset it:

    if _n_ = 1 then set two;

http://support.sas.com/documentation/cdl/en/lrcon/65287/HTML/default/viewer.htm#p08a4x7h9mkwqvn16jg3xqwfxful.htm

Read the "Execution Phase" section, item 5.

http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001290590.htm

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000961108.htm
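If the output from the question is what you actually want, one option is to clear z yourself on every iteration after the first. This is only a sketch: the data set name test_reset is made up for illustration, and CALL MISSING is used here as one convenient way to blank the variable.

```sas
data one;
  input x $ y;
  datalines;
a 10
a 13
a 14
b 9
;
run;

data two;
  input z;
  datalines;
45
;
run;

/* test_reset is a hypothetical name, not from the question */
data test_reset;
  if _n_ = 1 then set two;   /* z is loaded once, on the first iteration */
  else call missing(z);      /* z is retained, so clear it explicitly afterwards */
  set one;
run;

proc print data=test_reset;
run;
```

With this change z should be 45 on the first observation and missing on the rest, because nothing else ever resets a variable that comes from a SET statement.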

+3

SAS keeps a flag for each variable in the PDV that indicates what happens to the variable when the data step returns to the top of its loop. The flag says either that the variable will be set to missing, or that it will not be (and its current value will be retained).

By default, the flag says the variable should be reset to missing. It is commonly switched to "retain the value" in one of two ways.

  • First, if the variable appears in a RETAIN statement, or is the target of a sum statement ( x+1; ), the flag is set for that variable.
  • Second, if the variable comes from a set , merge , modify or update statement, the flag is set for that variable.

In this case, your z variable comes from a set statement, so it is automatically retained.
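The different cases can be seen side by side in one small step. This is an illustrative sketch: the data set name flags and the variables plain, kept and total are invented for the demonstration, and data set one is the one from the question.

```sas
data one;
  input x $ y;
  datalines;
a 10
a 13
a 14
b 9
;
run;

/* flags, plain, kept and total are hypothetical names */
data flags;
  set one;                /* x and y come from SET: never reset to missing */
  retain kept;            /* RETAIN statement: value survives each iteration */
  if _n_ = 1 then do;
    plain = 1;            /* ordinary assignment: reset to missing at the top of each iteration */
    kept  = 1;
  end;
  total + y;              /* sum statement: total is retained and starts at 0 */
run;

proc print data=flags;
run;
```

Here plain is 1 only on the first observation and missing afterwards, kept stays 1 on every observation, and total accumulates y (10, 23, 37, 46).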

Here is another good example of how this works.

    data test1;
      do x=1 to 5;
        y=2;
        output;
      end;
    run;

    data test2;
      do x=6 to 10;
        output;
      end;
    run;

    data test3;
      set test1 test2;
      if x=7 then y=4;
    run;

Here y is missing right after the last test1 record has been read, because at the end of a BY group or data set SAS does set all of the variables to missing once. However, y is still automatically retained; that flag is not something that changes. So when I set y=4; on the x=7 record, that 4 persists to the end. The x=6 record has a missing y , but x=7 through x=10 all have y=4 .

But wait, you say: my x and y variables are also in a set statement, and they are not retained automatically. They are reset every time the data step reads a record from the data set.

Nope. They get a new value, yes, but they are never set to missing. This matters in a few cases, notably many-to-one merges, which work basically as above but with by groups: the "one" record is combined with every "many" record not because it is read several times, but because it is read once and then not reset to missing (that is, it is retained). This is why a many-to-one merge is a little dangerous if you are not aware of it:

    data test1;
      do x=1 to 5;
        z=0;
        output;
      end;
    run;

    data test2;
      do x=1 to 5;
        do y=1 to 3;
          output;
        end;
      end;
    run;

    data testMerge;
      merge test1 test2;
      by x;
      if y=2 then z=1;
    run;

Note that z=1 for the y=2 record and for the y=3 record, even though I did not ask for that. Oops! This is because z was read from test1 once, on the first record of each x by group, and was not re-read after that; it was simply retained.
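One way to avoid this, sketched here under the assumption that you want z changed only on the y=2 records, is to read the "one" side's value under a different name and rebuild z on every iteration. The names testMergeFixed and z_in are made up for this example.

```sas
data test1;
  do x=1 to 5;
    z=0;
    output;
  end;
run;

data test2;
  do x=1 to 5;
    do y=1 to 3;
      output;
    end;
  end;
run;

/* testMergeFixed and z_in are hypothetical names */
data testMergeFixed;
  merge test1(rename=(z=z_in)) test2;  /* keep the retained value under another name */
  by x;
  z = z_in;                /* start every record from the untouched value */
  if y=2 then z=1;         /* the change now affects only this record */
  drop z_in;
run;
```

Since z is reassigned from z_in on every record, the overwrite on the y=2 record no longer carries over to y=3.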

+1

Source: https://habr.com/ru/post/1209047/

