Yes, any variable defined in a SET, MERGE, or UPDATE statement is automatically retained (not set to missing at the top of the data step loop). You can effectively override this with
output; call missing(of <list of variables to clear out>); run;
at the end of the data step.
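As a minimal sketch of that pattern, the step below concatenates two small data sets and clears one retained variable after each explicit OUTPUT. The data set names (a_rows, b_rows, cleared) and the variables x and y are made up for illustration and are not from the question.

```sas
/* Hypothetical input: a_rows has only x, b_rows has only y */
data a_rows;
  x = 1;
  output;
run;

data b_rows;
  do y = 1 to 3;
    output;
  end;
run;

data cleared;
  set a_rows b_rows;
  if y = 2 then x = 99;
  output;            /* write the row before clearing anything     */
  call missing(x);   /* reset x so it cannot carry to the next row */
run;
```

Without the last two statements, x should stay 99 on the y = 3 row. With them, x is missing there, because nothing in b_rows refills it.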
This is how MERGE works for one-to-many merges, by the way, and the reason why many-to-many merges don't usually work the way you want.
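The retain behaviour is easiest to see in a merge where one row on one side matches several rows on the other. The sketch below uses invented data sets (parents, children) and invented values. It only assumes both inputs are already sorted by id.

```sas
/* Hypothetical data: one parent row per id, several child rows per id */
data parents;
  input id name $;
  datalines;
1 Ann
2 Bob
;
run;

data children;
  input id child $;
  datalines;
1 Cat
1 Dan
2 Eve
;
run;

data merged;
  merge parents children;
  by id;
  /* name is read once per id from parents and then retained, */
  /* so it is still populated on the second child row of id 1 */
  put id= name= child=;
run;
```

The second row for id = 1 should show name=Ann even though parents has no second row to read. That carried-over value is the retain at work.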
The difference between the "together" and "separate" cases is that in the separate case you have two data sets with different variables. If you run this interactively, i.e. in the SAS Program Editor or Enhanced Editor (not EG or batch mode), you can use the data step debugger to see it a little more clearly. You will see the following:
At the end of the last row of the ones dataset:
i=3 A=1 B=.
Notice B exists but is missing. It then returns to the top of the data step loop. All three variables are retained because they all come from data sets. Then it tries to read from ones again, which produces:
i=. A=. B=.
Then it realizes that it cannot read any more from ones and begins to read from numbers. At the end of the first row of the numbers dataset:
i=. A=. B=1
Then it returns to the top, again changing nothing, and reads in 2 for B:
i=. A=. B=2
Then your program sets A to 2:
i=. A=2 B=2
Then it returns to the top of the data step loop:
i=. A=2 B=2
Then it reads in B = 3:
i=. A=2 B=3
It then continues the loop for B = 4 and 5.
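The trace above can be reproduced with a sketch like the following. The contents of ones and numbers are assumptions reconstructed from the values in the trace (i = 1 to 3 with A = 1, and B = 1 to 5), the output name separate is made up, and PUT is used here as a stand-in for stepping through the debugger.

```sas
/* Assumed inputs, reconstructed from the trace: ones has i and A, numbers has B */
data ones;
  do i = 1 to 3;
    A = 1;
    output;
  end;
run;

data numbers;
  do B = 1 to 5;
    output;
  end;
run;

data separate;
  set ones numbers;
  if B = 2 then A = 2;
  put i= A= B=;   /* log the values at the end of each iteration */
run;
```

The log should show A=2 on every row from B=2 through B=5. To step through it interactively instead, the DATA statement can be written as `data separate / debug;`.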
Now compare this to the single-dataset case. It is almost the same (with a slight difference at the switch between data sets, which does not change the result). Go to the step where A = 2 and B = 2:
i=. A=2 B=2
Now, when the data step reads the next row, that row has all three variables on it. Thus, it gives:
i=. A=. B=3
Since it reads A = . from the row, it sets A to missing. In the separate-datasets version, there was no value of A to read, so the 2 was not replaced with missing.
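For comparison, here is a sketch of the single-dataset case. The data set combined is a hypothetical stand-in for the already-concatenated data, typed in directly so that every row carries i, A and B. The values are assumed from the trace.

```sas
/* Hypothetical combined data: every row has all three variables */
data combined;
  input i A B;
  datalines;
1 1 .
2 1 .
3 1 .
. . 1
. . 2
. . 3
. . 4
. . 5
;
run;

data together;
  set combined;
  if B = 2 then A = 2;
  put i= A= B=;   /* log the values at the end of each iteration */
run;
```

Here A=2 should appear only on the B=2 row. Each later row supplies its own missing A, which overwrites the retained 2.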