How to speed up min/max aggregation in Postgres without an index that is otherwise not needed

Let's say I have a table with an int column, and the only thing I will ever read from it is the MAX() of that column.

If I create an index on this column, Postgres can scan that index in reverse to get the MAX() value. But since every row in the index except one is just overhead, can we get the same performance without creating the full index?
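For reference, a minimal sketch of the full-index baseline the question describes. The index name test_a_idx is made up for illustration, and the plan that EXPLAIN reports depends on the table's size and statistics.

```sql
-- Table from the question; created here only so the sketch is self-contained.
CREATE TABLE IF NOT EXISTS test (a integer);

-- Full index on the column (hypothetical name).
CREATE INDEX test_a_idx ON test (a);

-- With the index in place the planner can answer max() from one end of the index.
EXPLAIN SELECT max(a) FROM test;

-- The same lookup written out by hand; max() ignores NULLs, hence the filter.
SELECT a FROM test WHERE a IS NOT NULL ORDER BY a DESC LIMIT 1;
```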

Yes, you can create a trigger that updates a single-row table tracking the MAX value, and query that table instead of running MAX() on the main table. But I'm looking for something more elegant, because I know that Postgres has partial indexes, and I cannot find a way to use them for this purpose.

Update: the partial index definition below is what I would ideally like, but Postgres does not allow subqueries in the WHERE clause of a partial index.

create index on test(a) where a = (select max(a) from test);

+4
2 answers

You cannot use aggregate functions or subquery expressions in the predicate of a partial index. It would hardly make sense logically anyway, given the IMMUTABLE nature of index entries.

If you have a range of integers and you can guarantee that the maximum will always be greater than x, you can still benefit from this meta-information.

 CREATE INDEX text_max_idx ON test (a) WHERE a > x; 

The query planner will only use this index if you include a WHERE clause that matches the index predicate. For instance:

 SELECT max(a) FROM test WHERE a > x; 

There can be more conditions, but this one has to be included for the index to be used.
I mean the guarantee seriously: if no row satisfies the predicate, the query returns NULL instead of the actual maximum.
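As a concrete illustration of the idea above, here is a small sketch with the literal 99000 standing in for x. The threshold, the sample data and the index name are assumptions made for the example; whether the planner actually picks the partial index depends on table size and statistics, so check the EXPLAIN output on your own data.

```sql
-- Table from the question; created here only so the sketch is self-contained.
CREATE TABLE IF NOT EXISTS test (a integer);

-- Sample data (assumption): the integers 1 .. 100000.
INSERT INTO test SELECT g FROM generate_series(1, 100000) AS g;

-- Partial index that covers only the rows above the assumed threshold.
CREATE INDEX test_a_gt_99000_idx ON test (a) WHERE a > 99000;
ANALYZE test;

-- The WHERE clause repeats the index predicate, so the planner is allowed
-- to answer the query from the small partial index.
EXPLAIN SELECT max(a) FROM test WHERE a > 99000;

-- Without the matching predicate the partial index cannot be used.
EXPLAIN SELECT max(a) FROM test;
```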

You can build in a fallback:

 SELECT COALESCE( (SELECT max(a) FROM test WHERE a > x), (SELECT max(a) FROM test)); 

You can generalize this approach with more than one partial index. It is similar to this technique, only much simpler.
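A sketch of what the generalization to several partial indexes might look like. The two thresholds and the index names are hypothetical; pick thresholds from what you know about your data. It relies on the documented behavior that COALESCE only evaluates the arguments it needs, so the later, more expensive subqueries run only when the earlier ones return NULL.

```sql
-- Table from the question; created here only so the sketch is self-contained.
CREATE TABLE IF NOT EXISTS test (a integer);

-- Two tiers of partial indexes with hypothetical thresholds.
CREATE INDEX test_a_gt_1000000_idx ON test (a) WHERE a > 1000000;
CREATE INDEX test_a_gt_1000_idx    ON test (a) WHERE a > 1000;

SELECT COALESCE(
         (SELECT max(a) FROM test WHERE a > 1000000),  -- smallest index first
         (SELECT max(a) FROM test WHERE a > 1000),     -- next tier
         (SELECT max(a) FROM test)                     -- last resort: no partial index applies
       ) AS max_a;
```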

I would still consider the trigger approach, though, except for tables with a very heavy write load.

+7

The other rows in the index are not unnecessary: they let Postgres keep the maximum accurate even when rows are deleted or when updates lower the current maximum.

If you have no such operations (in other words, the max only ever increases), you can maintain the maximum value yourself. Do this in application code or in a trigger.
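A minimal sketch of the trigger variant, assuming the max never decreases (no deletes, no updates that lower a). The side table test_max, the function test_track_max and the trigger name are invented for this example. EXECUTE FUNCTION requires PostgreSQL 11 or later; older versions spell it EXECUTE PROCEDURE.

```sql
-- Table from the question; created here only so the sketch is self-contained.
CREATE TABLE IF NOT EXISTS test (a integer);

-- Hypothetical one-row side table holding the running maximum.
CREATE TABLE test_max (max_a integer);
INSERT INTO test_max SELECT max(a) FROM test;  -- always inserts exactly one row

CREATE FUNCTION test_track_max() RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
   -- Touch the side table only when the new value raises the maximum.
   UPDATE test_max
   SET    max_a = NEW.a
   WHERE  max_a IS NULL OR max_a < NEW.a;
   RETURN NULL;  -- the return value is ignored for AFTER row triggers
END
$$;

CREATE TRIGGER test_track_max_trg
AFTER INSERT OR UPDATE OF a ON test
FOR EACH ROW EXECUTE FUNCTION test_track_max();

-- Readers query the side table instead of aggregating over test.
SELECT max_a FROM test_max;
```

Every write that raises the maximum has to lock the single row in test_max, so concurrent writers queue up behind each other at that point.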

Postgres cannot know that the max will only increase. It has to support deletes and updates as well.

+4

Source: https://habr.com/ru/post/1493262/