The most important thing to understand when using SAS to access data in Teradata (or any other external database, for that matter) is that the SAS software prepares SQL and sends it to the database. The idea is to free you (the user) from all the database-specific details. SAS does this using a concept called "implicit pass-through", which just means that SAS translates your SAS code into DBMS code. One of the many things that happens along the way is data type conversion: SAS has two (and only two) data types, numeric and character.
SAS handles this translation for you, but the results can be confusing. For example, I have seen "lazy" database tables defined with VARCHAR(400) columns whose values never exceed some much shorter length (a column for a person’s name, say). This is not a big problem inside the database, but because SAS has no VARCHAR data type, it creates a variable 400 characters wide for every row. Even with dataset compression, this can make the resulting SAS dataset unnecessarily large.
An alternative is to use "explicit pass-through", where you write your own queries using the native syntax of the DBMS in question. These queries execute entirely inside the DBMS and return their results to SAS (which still performs the data type conversion for you). For example, here is a pass-through query that joins two tables and creates a SAS dataset as the result:

    proc sql;
       connect to teradata (user=userid password=password mode=teradata);
       create table mydata as
       select * from connection to teradata (
          select a.customer_id
               , a.customer_name
               , b.last_payment_date
               , b.last_payment_amt
          from base.customers a
          join base.invoices b
            on a.customer_id=b.customer_id
          where b.bill_month = date '2013-07-01'
            and b.paid_flag = 'N'
          );
    quit;

Note that everything inside the parentheses that follow "connection to teradata" is native Teradata SQL, and that the join itself is performed inside the database.
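Explicit pass-through also gives you one way to deal with the oversized VARCHAR columns mentioned earlier: cast the column to a shorter length inside the Teradata query, so SAS only ever sees the narrower type. The following is a minimal sketch, not a tested program. It assumes that customer_name is one of those over-wide columns and that 50 characters is enough for every value (both are assumptions for illustration), and it relies on Teradata session mode, where a CAST to a shorter character type truncates rather than raising an error.

```sas
proc sql;
   connect to teradata (user=userid password=password mode=teradata);
   create table mydata as
   select * from connection to teradata (
      select a.customer_id
             /* Hypothetical width: pick one that fits your real data. */
           , cast(a.customer_name as varchar(50)) as customer_name
      from base.customers a
      );
   disconnect from teradata;
quit;
```

The SAS variable customer_name should then be created with a length of 50 instead of 400, which shrinks every row of the result.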
The sample code you provided in your question is NOT a complete example of a SAS/Teradata program. To get better help, you need to show the real program, including any library references. For example, suppose your real program looks like this:

    proc sql;
       CREATE TABLE subset_data AS
       SELECT bigTable.id,
              SUM(bigTable.value) AS total
       FROM TDATA.bigTable bigTable
       JOIN TDATA.subset subset
         ON subset.id = bigTable.id
       WHERE bigTable.date BETWEEN a AND b
       GROUP BY bigTable.id
       ;
    quit;

This would imply a previously assigned LIBNAME statement through which SAS connects to Teradata. The syntax of that WHERE clause matters a great deal to whether SAS can pass the complete query to Teradata at all. (For example, you do not show what "a" and "b" refer to.) It is quite possible that the only way SAS can perform the join is to pull both tables back into a local work session and do the join on your SAS server.
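To see what SAS actually sends to the database, you can turn on SAS/ACCESS tracing and read the generated SQL in the log. The sketch below shows what such a setup might look like. The server name, credentials, and database in the LIBNAME statement are placeholders, and the SAS date literals stand in for your unspecified "a" and "b". Whether SAS passes the whole join to Teradata still depends on your environment, so treat this as a way to check rather than a guarantee.

```sas
/* Placeholder connection details: substitute your own. */
libname tdata teradata server=myserver user=userid password=password
        database=base;

/* Write the SQL that SAS generates for the DBMS into the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

proc sql;
   create table subset_data as
   select bigTable.id,
          sum(bigTable.value) as total
   from tdata.bigTable bigTable
   join tdata.subset subset
     on subset.id = bigTable.id
   /* SAS date literals, which SAS can translate into DBMS syntax. */
   where bigTable.date between '01JUL2013'd and '31JUL2013'd
   group by bigTable.id
   ;
quit;
```

If the log shows one Teradata query containing the join and the GROUP BY, the work is being done in the database. If it shows separate SELECT statements for each table, SAS is pulling the rows back and joining them locally.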
One thing I can strongly suggest is that you try to convince your Teradata administrators to let you create "driver" tables in some database. The idea is that you create a relatively small table inside Teradata containing the IDs you want to extract, and then use that table in explicit pass-through joins. You will probably need a bit more formal database training to do this well (for example, how to define a proper index and how to "collect statistics"), but with that knowledge and ability, your work will simply fly.
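As a rough sketch of the driver-table approach, the steps might look like the following. Everything specific here is hypothetical: the "sandbox" database you have been granted write access to, the work.my_ids dataset holding your IDs, the server name, and the table and column names in the query (base.big_table, id, amount). Your administrators may also prefer a different load method or index choice.

```sas
/* Hypothetical database in which you are allowed to create tables. */
libname tdwork teradata server=myserver user=userid password=password
        database=sandbox;

/* Load the IDs into Teradata, asking for a primary index on id. */
data tdwork.driver_ids (dbcreate_table_opts='primary index (id)');
   set work.my_ids (keep=id);
run;

proc sql;
   connect to teradata (server=myserver user=userid password=password
                        mode=teradata);

   /* Give the Teradata optimizer statistics on the join column. */
   execute (collect statistics on sandbox.driver_ids column (id))
      by teradata;

   /* The join and the aggregation both run inside Teradata. */
   create table subset_data as
   select * from connection to teradata (
      select b.id
           , sum(b.amount) as total
      from base.big_table b
      join sandbox.driver_ids d
        on d.id = b.id
      group by b.id
      );

   disconnect from teradata;
quit;
```

Only the small list of IDs travels to the database and only the aggregated result travels back, which is where the speed-up comes from.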
I could go on and on, but I will stop here. I use SAS with Teradata every day against what I am told is one of the largest Teradata environments on the planet. I enjoy programming in both.