PROC TRANSPOSE on an SPDE library takes ~60 times longer than on a v9 library

I moved all my data to SPDE libraries because I saw a remarkable performance improvement in everything. Everything, that is, until I ran PROC TRANSPOSE. It takes ~60 times longer to run against the SPDE dataset than against the same dataset stored in a regular v9 library. The datasets are sorted by item_id. In both cases the input is read from and the output is written to the same library.

Does anyone have an idea why this happens? Am I missing something important, or do SPDE and PROC TRANSPOSE simply not work well together?

SPDE library

    MPRINT(XMLIMPORT_VANTAGE):   proc transpose data = smplus.links_response_mechanism out = smplus.response_mechanism (drop = _NAME_) prefix = rm_;
    MPRINT(XMLIMPORT_VANTAGE):   by item_id;
    MPRINT(XMLIMPORT_VANTAGE):   id lookup_code;
    MPRINT(XMLIMPORT_VANTAGE):   var x;
    MPRINT(XMLIMPORT_VANTAGE):   run;

    NOTE: There were 5866747 observations read from the data set SMPLUS.LINKS_RESPONSE_MECHANISM.
    NOTE: The data set SMPLUS.RESPONSE_MECHANISM has 3209353 observations and 14 variables.
    NOTE: Compressing data set SMPLUS.RESPONSE_MECHANISM decreased size by 37.98 percent.
    NOTE: PROCEDURE TRANSPOSE used (Total process time):
          real time           28:27.63
          cpu time            28:34.64

V9 library

    MPRINT(XMLIMPORT_VANTAGE):   proc transpose data = mplus.links_response_mechanism out = mplus.response_mechanism (drop = _NAME_) prefix = rm_;
    MPRINT(XMLIMPORT_VANTAGE):   by item_id;
    68                          The SAS System          02:00 Thursday, August 8, 2013
    MPRINT(XMLIMPORT_VANTAGE):   id lookup_code;
    MPRINT(XMLIMPORT_VANTAGE):   var x;
    MPRINT(XMLIMPORT_VANTAGE):   run;

    NOTE: There were 5866747 observations read from the data set MPLUS.LINKS_RESPONSE_MECHANISM.
    NOTE: The data set MPLUS.RESPONSE_MECHANISM has 3209353 observations and 14 variables.
    NOTE: Compressing data set MPLUS.RESPONSE_MECHANISM decreased size by 27.60 percent.
          Compressed is 32271 pages; un-compressed would require 44572 pages.
    NOTE: PROCEDURE TRANSPOSE used (Total process time):
          real time           28.76 seconds
          cpu time            28.79 seconds
3 answers

It does look to me as if there is a problem with PROC TRANSPOSE and SPDE. Here is a simple SSCCE that shows a significant difference. It is not as large as yours, but that may be partly because I ran it on a desktop with no particular performance tuning in the first place. It sounds like a call to SAS technical support is in order.

    libname spdelib spde 'c:\temp\SPDE Main'
        datapath=('c:\temp\SPDE Data' 'd:\temp\SPDE Data')
        indexpath=('d:\temp\SPDE Index')
        partsize=512;
    libname mainlib 'c:\temp\';

    data mainlib.bigdata;
        do ID = 1 to 1500000;
            do _varn=1 to 10;
                varname=cats("Var_",_varn);
                vardata=ranuni(7);
                output;
            end;
        end;
    run;

    data spdelib.bigdata;
        do ID = 1 to 1500000;
            do _varn=1 to 10;
                varname=cats("Var_",_varn);
                vardata=ranuni(7);
                output;
            end;
        end;
    run;
    *These data steps take roughly the same amount of time, around 30 seconds each;

    proc transpose data=spdelib.bigdata out=spdelib.transdata;
        by id;
        id varname;
        var vardata;
    run;
    *Run a few times, this takes around 3 to 4 minutes, with 1.5 minutes CPU time;

    proc transpose data=mainlib.bigdata out=mainlib.transdata;
        by id;
        id varname;
        var vardata;
    run;
    *Run a few times, this takes around 30 to 45 seconds, with 20 seconds CPU time;

There have been known problems with SPDE and procedures in the past (they were not multithreaded), at least up to version 4.1. Which version are you using? (It can be seen in the "! install / logs" folder.)

This is definitely something to raise with SAS support. To "speed things up", I would recommend sending them a log produced with the following options:

    proc setinit noalias; run;
    proc options; run;
    %put _ALL_;
    options fullstimer msglevel=i;

also:

 options spdedebug='DA_TRACEIO_OCR CJNL=Trace.txt'; 

(The CJNL option simply redirects trace output to a text file)

In the meantime, you can take advantage of some of the following special SPD options:

http://support.sas.com/kb/11/349.html


This problem usually occurs when PROC TRANSPOSE is used with BY processing on compressed datasets. SAS is forced to read the same block of rows over and over, decompressing it each time, until all records are completely sorted.

Set the option COMPRESS=NO and it will work. See the log below: one run has COMPRESS=YES and the other COMPRESS=NO. The first took about 57 minutes, the second about 27 seconds (roughly half a minute).

    OPTIONS COMPRESS=YES;

    50   **tranpose from spde to spde;
    51   proc transpose data=spdelib.balancewalkoutput out=spdelib.spdelib_to_spdelib;
    52   var metric ;
    53   by balancewalk facility_id isretained isexisting isicaapnpl monthofmaturity vintage;
    54   run;

    NOTE: There were 10000000 observations read from the data set SPDELIB.BALANCEWALKOUTPUT.
    NOTE: The data set SPDELIB.SPDELIB_TO_SPDELIB has 160981 observations and 74 variables.
    NOTE: Compressing data set SPDELIB.SPDELIB_TO_SPDELIB decreased size by 69.96 percent.
    NOTE: PROCEDURE TRANSPOSE used (Total process time):
          real time                     56:58.54
          user cpu time                 52:03.65
          system cpu time               4:03.00
          memory                        19028.75k
          OS Memory                     34208.00k
          Timestamp                     09/16/2019 06:19:55 PM
          Step Count                    9  Switch Count  22476
          Page Faults                   0
          Page Reclaims                 4056
          Page Swaps                    0
          Voluntary Context Switches    142316
          Involuntary Context Switches  5726
          Block Input Operations        88
          Block Output Operations       569200

    OPTIONS COMPRESS=NO;

    50   **tranpose from spde to spde;
    51   proc transpose data=spdelib.balancewalkoutput out=spdelib.spdelib_to_spdelib;
    52   var metric ;
    53   by balancewalk facility_id isretained isexisting isicaapnpl monthofmaturity vintage;
    18                          The SAS System          16:04 Monday, September 16, 2019
    54   run;

    NOTE: There were 10000000 observations read from the data set SPDELIB.BALANCEWALKOUTPUT.
    NOTE: The data set SPDELIB.SPDELIB_TO_SPDELIB has 160981 observations and 74 variables.
    NOTE: PROCEDURE TRANSPOSE used (Total process time):
          real time                     26.73 seconds
          user cpu time                 14.52 seconds
          system cpu time               11.99 seconds
          memory                        13016.71k
          OS Memory                     27556.00k
          Timestamp                     09/16/2019 04:13:06 PM
          Step Count                    9  Switch Count  24827
          Page Faults                   0
          Page Reclaims                 2662
          Page Swaps                    0
          Voluntary Context Switches    162653
          Involuntary Context Switches  1678
          Block Input Operations        96
          Block Output Operations       1510040
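
If switching compression off for the whole session is too broad, the same idea can probably be applied to the output dataset alone. The following is a minimal sketch, not something timed in this thread. It assumes that the COMPRESS= dataset option on the OUT= dataset has the same effect as the global option in this situation, and it simply reuses the library and dataset names from the log above.

```sas
/* Sketch only: turn compression off for the transposed output dataset   */
/* instead of for the whole session. Names are taken from the log above. */
/* Assumption: the slowdown is tied to writing a compressed OUT= dataset. */
proc transpose data=spdelib.balancewalkoutput
               out=spdelib.spdelib_to_spdelib (compress=no);
    var metric;
    by balancewalk facility_id isretained isexisting isicaapnpl
       monthofmaturity vintage;
run;
```

In the original question, the equivalent change would be to add compress=no next to the existing drop= option on the OUT= dataset. Whether that recovers the v9 timings there would have to be measured.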

Source: https://habr.com/ru/post/1495998/

