I noticed some issues with simple aggregate performance in Postgres (8.3). The problem is that if I have a table (say, 200M rows) that is unique on (customer_id, order_id), then the query select customer_id, max(order_id) from large_table group by customer_id runs more than an order of magnitude slower than a simple Java/JDBC program that does the following:
1) Initialize an empty HashMap customerMap (this will map customer_id → max order_id)
2) Execute "select customer_id, order_id from large_table" and get a streaming result set
3) Iterate over the result set, doing the following for each row:
long id = resultSet.getLong("customer_id");
long order = resultSet.getLong("order_id");
if (!customerMap.containsKey(id))
    customerMap.put(id, order);
else
    customerMap.put(id, Math.max(order, customerMap.get(id)));
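The aggregation logic above can be sketched as a self-contained program. Here the JDBC result set is replaced with an in-memory list of rows so the sketch runs standalone; the Row record, the class name, and the sample data are illustrative, not from the original question:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MaxOrderPerCustomer {
    // Minimal stand-in for one result-set row (customer_id, order_id).
    record Row(long customerId, long orderId) {}

    // One pass over the rows, keeping the largest order_id seen per customer.
    // Map.merge replaces the explicit containsKey/put dance from the question.
    static Map<Long, Long> maxOrderPerCustomer(List<Row> rows) {
        Map<Long, Long> customerMap = new HashMap<>();
        for (Row r : rows) {
            customerMap.merge(r.customerId(), r.orderId(), Math::max);
        }
        return customerMap;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row(1, 10), new Row(1, 42), new Row(2, 7), new Row(1, 5));
        System.out.println(maxOrderPerCustomer(rows));
    }
}
```

One practical caveat when doing this against a real Postgres connection: the Postgres JDBC driver only streams rows (rather than buffering the whole result in memory) when autocommit is off and a fetch size is set on the statement, e.g. stmt.setFetchSize(10000).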
Is such a difference in performance expected? I wouldn't have thought so, since as I understand it this is very close to what happens inside the database. Is this evidence that something is misconfigured in the db?