I have a DELETE query that needs to run on PostgreSQL 9.0.4. It performs well until the subquery returns 524,289 rows.
For example, with a limit of 524,288 rows there is no Materialize node in the plan, and the estimated cost looks reasonable:
explain DELETE FROM table1 WHERE pointLevel = 0 AND userID NOT IN
(SELECT userID FROM table2 fetch first 524288 rows only);
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Delete (cost=13549.49..17840.67 rows=21 width=6)
  -> Index Scan using jslps_userid_nopt on table1 (cost=13549.49..17840.67 rows=21 width=6)
       Filter: ((NOT (hashed SubPlan 1)) AND (pointlevel = 0))
       SubPlan 1
         -> Limit (cost=0.00..12238.77 rows=524288 width=8)
              -> Seq Scan on table2 (cost=0.00..17677.92 rows=757292 width=8)
(6 rows)
However, as soon as the limit reaches 524,289 rows, a Materialize node appears in the plan, the subplan is no longer hashed, and the estimated cost of the DELETE becomes much higher:
explain DELETE FROM table1 WHERE pointLevel = 0 AND userID NOT IN
(SELECT userID FROM table2 fetch first 524289 rows only);
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Delete (cost=0.00..386910.33 rows=21 width=6)
  -> Index Scan using jslps_userid_nopt on table1 (cost=0.00..386910.33 rows=21 width=6)
       Filter: ((pointlevel = 0) AND (NOT (SubPlan 1)))
       SubPlan 1
         -> Materialize (cost=0.00..16909.24 rows=524289 width=8)
              -> Limit (cost=0.00..12238.79 rows=524289 width=8)
                   -> Seq Scan on table2 (cost=0.00..17677.92 rows=757292 width=8)
(7 rows)
I worked around the problem by using a LEFT JOIN instead of the subquery:
SELECT s.userid FROM table1 s LEFT JOIN table2 p ON s.userid=p.userid WHERE p.userid IS NULL AND s.pointlevel=0
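The SELECT above only finds the rows. A minimal sketch of the same anti-join written directly as a DELETE might look like the following. It assumes the row limit on table2 is not actually needed, and that table2.userid contains no NULLs (NOT IN and NOT EXISTS behave differently when the subquery yields a NULL).

```sql
-- Sketch only: delete the table1 rows that have no matching userid in table2.
-- NOT EXISTS lets the planner choose an anti-join instead of a per-row subplan.
DELETE FROM table1 s
WHERE s.pointlevel = 0
  AND NOT EXISTS (
        SELECT 1
        FROM table2 p
        WHERE p.userid = s.userid
      );
```

Running it under EXPLAIN first should show whether the planner picks an anti-join for it.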
However, I would still like to understand why the plan with the Materialize node performs so much worse.
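One thing that may be worth checking is work_mem. As far as I understand it, the planner only uses a hashed SubPlan when it estimates that the subquery result fits in work_mem. 524,288 is 2^19, so if the planner budgets about 32 bytes per row here (an assumption on my part, not something shown in the plans above), the threshold would be exactly 16 MB. A sketch of the experiment:

```sql
-- Show the current setting; the hypothesis is that it is 16MB.
SHOW work_mem;

BEGIN;
-- Raise work_mem for this transaction only (64MB is an arbitrary test value).
SET LOCAL work_mem = '64MB';

-- EXPLAIN only plans the statement; it does not delete anything.
EXPLAIN DELETE FROM table1 WHERE pointLevel = 0 AND userID NOT IN
(SELECT userID FROM table2 FETCH FIRST 524289 ROWS ONLY);
ROLLBACK;
```

If the plan goes back to "hashed SubPlan 1" with the larger setting, that would point to the switch being a memory-budget decision and not something specific to this row count.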