I have three tables in SQL Server 2005 (corresponding rows are included):
Patient (560K rows, has many Appts)
ID (PK, Clustered)
Appt (5.8M rows, has many ApptCPTs)
ID (PK, Unique, Non-clustered)
PatientID (Indexed, Non-unique, Non-clustered)
ApptCPT (13.4M rows)
ID (PK, Clustered)
ApptID (Indexed, Non-unique, Non-clustered)
CPTID (Indexed, Non-unique, Non-clustered)
I want to get the latest ApptCPT for each CPT code for each patient, but the request that I still have takes about a minute to return ~ 7M lines.
My current request:
SELECT
p.ID AS PatientID,
MAX(ac.ID) AS ApptCPTID,
ac.CPTID
FROM
Patient p
INNER JOIN Appt a ON a.PatientID = p.ID
INNER JOIN ApptCPT ac ON ac.ApptID = a.ID
GROUP BY
p.ID, ac.CPTID
ORDER BY
p.ID, ac.CPTID
Output Example:
PatientID ApptCPTID CPTID
123456789 18627724 3088
123456789 7647238 3388
123456789 18627723 3419
123456789 9989220 3419
123456789 12190141 3448
123456789 18627721 3551
123456789 17264224 71020
123456789 15933265 77052
123456789 10095897 77057
123456789 5258166 77080
123456789 18627813 80053
123456789 18627814 80061
If I select MAX and put ac.ID in the GROUP BY clause, I see that I am getting the correct data with my original request because it receives the last ApptCPT containing this CPTID for this patient, but it is terribly slow. For reference, here the same output does not use the MAX clause:
PatientID ApptCPTID CPTID
123456789 18126508 3088
123456789 4596004 3088
123456789 18627724 3088
123456789 7647238 3388
123456789 18627723 3419
123456789 9989220 3419
123456789 12190141 3448
123456789 4595928 3551
123456789 9989218 3551
123456789 18627721 3551
123456789 17264224 71020
123456789 15933265 77052
123456789 10095897 77057
123456789 5258050 77057
123456789 5258166 77080
123456789 4595932 80053
123456789 18126505 80053
123456789 9989223 80053
123456789 18627813 80053
123456789 18627814 80061
Anyone have any ideas to do this (hopefully) much, much faster?