Comparing GROUP BY vs. OVER (PARTITION BY)

Assume a single CAR table with two columns, CAR_ID (int) and VERSION (int).

I want to get the maximum version of each car.

There are at least two solutions:

 select car_id, max(version) as max_version from car group by car_id; 

Or:

 select car_id, max_version
 from (
   select car_id, version,
          max(version) over (partition by car_id) as max_version
   from car
 ) max_ver
 where max_ver.version = max_ver.max_version;

Are these two queries equally efficient?

+4
3 answers

Yes, the choice can affect performance.

The second query is an example of an inline view. This is a very useful technique for reports that need various kinds of counts or other aggregate functions.

Oracle executes the subquery and then uses the resulting rows as a view in the FROM clause.

As far as performance is concerned, inline views are generally recommended over other types of subquery.

One more difference: the second query returns every row that holds the maximum version, so ties show up as duplicate rows, while the first returns exactly one row per car.
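The difference in the returned rows can be checked on a tiny data set. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Oracle (it assumes the bundled SQLite is 3.25 or newer, which is needed for window functions), and the sample rows are made up so that car 1 has a tie on its maximum version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (car_id INTEGER, version INTEGER)")
# Hypothetical sample data: car 1 has two rows tied at the maximum version 3.
conn.executemany(
    "INSERT INTO car (car_id, version) VALUES (?, ?)",
    [(1, 1), (1, 3), (1, 3), (2, 2)],
)

# Query 1 from the question: one row per car_id.
group_by_rows = conn.execute(
    "SELECT car_id, MAX(version) AS max_version "
    "FROM car GROUP BY car_id ORDER BY car_id"
).fetchall()

# Query 2 from the question: every row whose version equals the per-car maximum.
window_rows = conn.execute(
    """
    SELECT car_id, max_version
    FROM (
        SELECT car_id, version,
               MAX(version) OVER (PARTITION BY car_id) AS max_version
        FROM car
    ) max_ver
    WHERE max_ver.version = max_ver.max_version
    ORDER BY car_id
    """
).fetchall()

print(group_by_rows)  # [(1, 3), (2, 2)]
print(window_rows)    # [(1, 3), (1, 3), (2, 2)]
```

If (CAR_ID, VERSION) pairs are unique in the real table, the two result sets coincide; the duplicates only appear when the maximum is tied.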


+2

This will depend on your indexing scheme and the amount of data in the table. The optimizer is likely to make different decisions based on the data that is actually in the table.

I have found, at least in SQL Server (I know you asked about Oracle), that the optimizer is more likely to perform a full scan with the PARTITION BY query than with the GROUP BY query. But this only applies when you have an index that contains CAR_ID and VERSION (DESC).

The moral of the story is to test carefully before choosing one. For small tables it does not matter. For really large data sets, neither of them may be fast ...
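One possible shape for such a test is sketched below. It is not a benchmark of Oracle or SQL Server: it runs against an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25 or newer is assumed for window functions), and the table size, the generated data, and the index name car_id_version_idx are all hypothetical. The same idea, running both queries on realistic data and comparing timings and plans, carries over to the real database.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (car_id INTEGER, version INTEGER)")
# Hypothetical test data: 1,000 cars with 20 distinct versions each.
conn.executemany(
    "INSERT INTO car (car_id, version) VALUES (?, ?)",
    [(car_id, version) for car_id in range(1000) for version in range(20)],
)
# An index of the kind mentioned above: CAR_ID plus VERSION descending.
conn.execute("CREATE INDEX car_id_version_idx ON car (car_id, version DESC)")

QUERIES = {
    "group_by": (
        "SELECT car_id, MAX(version) AS max_version FROM car GROUP BY car_id"
    ),
    "partition_by": """
        SELECT car_id, max_version
        FROM (
            SELECT car_id, version,
                   MAX(version) OVER (PARTITION BY car_id) AS max_version
            FROM car
        ) max_ver
        WHERE max_ver.version = max_ver.max_version
    """,
}

results = {}
for name, sql in QUERIES.items():
    start = time.perf_counter()
    results[name] = sorted(conn.execute(sql).fetchall())
    elapsed = time.perf_counter() - start
    # Timings vary by machine and engine; only compare them on the real system.
    print(f"{name}: {len(results[name])} rows in {elapsed:.4f}s")
```

Because the generated (car_id, version) pairs are unique, both queries return the same rows here, so any difference that shows up is purely in how the engine executes them.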

+2

I know this is very old, but I thought this should be pointed out.

 select car_id, max_version
 from (
   select car_id, version,
          max(version) over (partition by car_id) as max_version
   from car
 ) max_ver
 where max_ver.version = max_ver.max_version

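Not sure why you wrote option 2 like this ... in theory the subselect should be slower, because the inline view carries every row of the table through the window computation and the outer query then has to filter those rows again.

If you only need the maximum, drop version and the outer filter and add DISTINCT (without it you get one row per row of CAR); the result is then the same as the GROUP BY query.

 select distinct car_id, max(version) over (partition by car_id) as max_version from car 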

Performance really depends on the optimizer in this situation, but yes, as the first answer suggests, inline views are worth using when they narrow down the results. This is not a very good example of that, though, because it reads the same table with no filters in the subselect.

Partitioning is also useful when you select a large number of columns but need different aggregates alongside the result set. Otherwise, you would have to group by all the other columns.
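A minimal sketch of that last point, again using Python's sqlite3 module as a stand-in for Oracle (SQLite 3.25 or newer assumed). The color column and the sample rows are hypothetical additions for illustration; the CAR table in the question has only CAR_ID and VERSION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "color" is a hypothetical extra column, not part of the original table.
conn.execute("CREATE TABLE car (car_id INTEGER, version INTEGER, color TEXT)")
conn.executemany(
    "INSERT INTO car (car_id, version, color) VALUES (?, ?, ?)",
    [(1, 1, "red"), (1, 3, "blue"), (2, 2, "green")],
)

# Every detail row is kept, and two different per-car aggregates are attached
# to it, without listing version and color in a GROUP BY clause.
rows = conn.execute(
    """
    SELECT car_id, version, color,
           MAX(version) OVER (PARTITION BY car_id) AS max_version,
           COUNT(*)     OVER (PARTITION BY car_id) AS versions_per_car
    FROM car
    ORDER BY car_id, version
    """
).fetchall()

for row in rows:
    print(row)
# (1, 1, 'red', 3, 2)
# (1, 3, 'blue', 3, 2)
# (2, 2, 'green', 2, 1)
```

With GROUP BY, keeping version and color in the output would mean grouping by them as well, which changes what the aggregates are computed over.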

+2

Source: https://habr.com/ru/post/1397025/

