I need to query a table with several million rows, and I want the query to be as optimized as possible.
Let's assume that we want to control access to a movie theater with multiple screening rooms, and that each access is saved as follows:
AccessRecord (TicketId, TicketCreationTimestamp, TheaterId, ShowId, MovieId, SeatId, CheckInTimestamp)
To simplify, the "Id" columns are of data type "bigint" and the "Timestamp" columns are "datetime". Tickets are sold at any time, and people enter the theater in random order. The primary key (and therefore unique) is TicketId.
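For reference, the table described above could be sketched roughly like this. The column names and types come from the description; the NOT NULL constraints are an assumption made only for illustration.

```sql
-- Sketch of the AccessRecord table.
-- Assumption: every column is NOT NULL (not stated in the description).
CREATE TABLE AccessRecord (
    TicketId                bigint   NOT NULL PRIMARY KEY,  -- unique ticket identifier
    TicketCreationTimestamp datetime NOT NULL,              -- when the ticket was sold
    TheaterId               bigint   NOT NULL,
    ShowId                  bigint   NOT NULL,
    MovieId                 bigint   NOT NULL,
    SeatId                  bigint   NOT NULL,
    CheckInTimestamp        datetime NOT NULL               -- when the person entered
);
```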
For each movie, theater, and show (time), I want the AccessRecord information of the first and the last person who entered the theater to see the movie. If two check-ins happen at the same time, I just need one of them, either one.
My solution is to concatenate the aggregated column (CheckInTimestamp) and the PK in a subquery to identify the row:
select AccessRecord.*
from AccessRecord
inner join (
    select
        MAX(CONVERT(nvarchar(25), CheckInTimestamp, 121) + CONVERT(varchar(25), TicketId)) as MaxKey,
        MIN(CONVERT(nvarchar(25), CheckInTimestamp, 121) + CONVERT(varchar(25), TicketId)) as MinKey
    from AccessRecord
    group by MovieId, TheaterId, ShowId
) as MaxAccess
    on CONVERT(nvarchar(25), CheckInTimestamp, 121) + CONVERT(varchar(25), TicketId) = MaxKey
    or CONVERT(nvarchar(25), CheckInTimestamp, 121) + CONVERT(varchar(25), TicketId) = MinKey
Conversion style 121 is the ODBC canonical date format, yyyy-mm-dd hh:mi:ss.mmm (24h), so ordering the values as strings gives the same result as ordering them as datetime.
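To illustrate the point about style 121, here is a small sketch with two made-up timestamps (the values are hypothetical, chosen only for the demonstration). Because every part of the string is zero-padded and fixed-width, comparing the strings should agree with comparing the datetime values.

```sql
-- Hypothetical sample values, not taken from the real table.
DECLARE @early datetime = '2013-09-05T09:30:00.000';
DECLARE @late  datetime = '2013-09-05T21:05:00.000';

SELECT
    CONVERT(nvarchar(25), @early, 121) AS EarlyKey,  -- 2013-09-05 09:30:00.000
    CONVERT(nvarchar(25), @late, 121)  AS LateKey,   -- 2013-09-05 21:05:00.000
    CASE
        WHEN CONVERT(nvarchar(25), @early, 121) < CONVERT(nvarchar(25), @late, 121)
        THEN 'string order matches datetime order'
        ELSE 'string order differs'
    END AS Comparison;
```

The appended TicketId only acts as a tie-breaker between identical timestamps. It is not zero-padded, so it does not sort numerically as a string, which is acceptable here because either of the tied rows will do.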
As you can see, the join in my query is not very optimized. Any ideas?
Update with how I tested the various solutions:
I tested all your answers on a real database with SQL Server 2008 R2 and a 3M-row table in order to choose the right one.
If I get only the first or only the last person who accessed:
- Joe Taras's solution takes 10 seconds.
- GarethD's solution takes 21 seconds.
If I get the same access, but with the result ordered by the grouping columns:
- Joe Taras's solution takes 10 seconds.
- GarethD's solution takes 46 seconds.
If I get both (first and last) people who accessed, with an ordered result:
- Joe Taras's solution (doing a union) takes 19 seconds.
- GarethD's solution takes 49 seconds.
The rest of the solutions (even mine) took more than 60 seconds in the first test, so I canceled them.