So, there are a few things to think about here.
The first point, and probably the biggest one performance-wise, is getting the data from DB2 into SAS. (I'm assuming this is not an in-database deployment of SAS - correct me if it is.) That's a large table, and moving it over the wire takes time. Because of that, if you can compute all of these statistics inside DB2 with an SQL statement, that is likely to be your fastest option.
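For example, an explicit pass-through query along these lines makes DB2 do the grouping, so only the summarized rows come back over the wire. This is just a sketch - the connection options, schema, table, column names and credentials below are placeholders for whatever you actually have:

proc sql;
   /* MYDB2, bigschema.bigtable, grp, x and the credentials are placeholders */
   connect to db2 (database=MYDB2 user=&user password=&pass);
   /* DB2 does the GROUP BY; only the aggregated rows cross the network */
   create table work.summary as
   select * from connection to db2 (
      select grp,
             count(*) as n,
             avg(x)   as mean_x,
             sum(x)   as sum_x,
             min(x)   as min_x,
             max(x)   as max_x
      from bigschema.bigtable
      group by grp
   );
   disconnect from db2;
quit;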
Assuming you do pull the table over to the SAS server:
A table sorted by the CLASS variables will be much faster to process than an unsorted table. If SAS knows the table is sorted, it doesn't have to scan the table to figure out which records belong to which group; it can do block reads instead of random I/O.
If the table is not sorted, then the more groups there are, the more table scanning has to happen.
In other words, how fast the data can be fed from disk to the CPU is what matters most in the unsorted case.
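You can check what SAS thinks it knows about the sort order with PROC CONTENTS, and if the table happens to come out of DB2 already ordered but unflagged, the SORTEDBY= data set option will assert it. A sketch with placeholder names, and only safe if the order really is guaranteed:

proc contents data=work.test;   /* the "Sort Information" section shows what SAS knows */
run;

/* Assert an order that is guaranteed but not flagged (grp, x1 are placeholders) */
proc summary data=work.test (sortedby=grp);
   class grp;
   var x1;
   output out=work.chk mean= / autoname;
run;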
From there, you get into memory and CPU issues. PROC SUMMARY is multi-threaded, and SAS will read N groups at a time. If the group's data can fit into the memory allocated to that thread, you won't have a problem. If the groups are too large, SAS will have to page.
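If you suspect paging, FULLSTIMER makes the log show memory usage, and SUMSIZE= caps how much memory the summarization step may use (within MEMSIZE). With data that is already sorted, a BY statement instead of CLASS is the other way out, since only one group is held in memory at a time. A sketch, not tuned values - 4G is just an illustration, and the variable names are placeholders:

options fullstimer sumsize=4G;  /* fullstimer adds memory stats to the log;
                                   sumsize limits memory for summarization */

/* BY processing on pre-sorted data: one group in memory at a time */
proc summary data=work.test;
   by grp;
   var x1 x2 x3;
   output out=work.summary_by mean= sum= / autoname;
run;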
I cut the problem down to a 15M-row example:
%let grps=3000; %let pergrp=5000;
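Something along these lines reproduces the setup (a sketch for illustration, not the exact program behind the logs below; grp and x1-x3 are made-up names):

data work.test;
   call streaminit(1);
   do i = 1 to &grps * &pergrp;
      grp = ceil(rand('uniform') * &grps);  /* random group order, so the table is unsorted */
      x1  = rand('uniform');
      x2  = rand('uniform');
      x3  = rand('normal');
      output;
   end;
   drop i;
run;

/* Unsorted */
proc summary data=work.test;
   class grp;
   var x1 x2 x3;
   output out=work.summary mean= sum= / autoname;
run;

/* Sorted */
proc sort data=work.test;
   by grp;
run;

proc summary data=work.test;
   class grp;
   var x1 x2 x3;
   output out=work.summary mean= sum= / autoname;
run;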
Unsorted:
NOTE: There were 15000000 observations read from the data set WORK.TEST.
NOTE: The data set WORK.SUMMARY has 3001 observations and 9 variables.
NOTE: PROCEDURE SUMMARY used (Total process time):
      real time           20.88 seconds
      cpu time            31.71 seconds
Sorted:
NOTE: There were 15000000 observations read from the data set WORK.TEST.
NOTE: The data set WORK.SUMMARY has 3001 observations and 9 variables.
NOTE: PROCEDURE SUMMARY used (Total process time):
      real time           5.44 seconds
      cpu time            11.26 seconds
===============================
%let grps=300000; %let pergrp=50;
Unsorted:
NOTE: There were 15000000 observations read from the data set WORK.TEST.
NOTE: The data set WORK.SUMMARY has 300001 observations and 9 variables.
NOTE: PROCEDURE SUMMARY used (Total process time):
      real time           19.26 seconds
      cpu time            41.35 seconds
Sorted:
NOTE: There were 15000000 observations read from the data set WORK.TEST.
NOTE: The data set WORK.SUMMARY has 300001 observations and 9 variables.
NOTE: PROCEDURE SUMMARY used (Total process time):
      real time           5.43 seconds
      cpu time            10.09 seconds
I ran these several times and the run times were similar from run to run. The sorted times are roughly equal to each other, and faster than the unsorted ones.
The more-groups / fewer-rows-per-group case was faster unsorted in real time, but look at the total CPU usage - it is higher. My laptop has a very fast SSD, so I/O was probably not a limiting factor - the disk was able to keep up with what the multiple CPU cores demanded. On a system with slower disks, the total run times could come out differently.
In the end, it depends a great deal on your data structure and the particulars of your server and database.