I came across a very strange problem with my SQL functions. They seem to get different execution plans depending on whether the function is written in language sql or language plpgsql, but I can't tell which execution plan is used for the SQL version, since it requires this: Function final statement must be SELECT or INSERT/UPDATE/DELETE RETURNING. and will not allow me to use EXPLAIN.
As for how I know that they have different plans: the SQL version fails to run, complaining that it cannot connect to one of the foreign servers, which is currently down. The connection is made through foreign tables, and this table is partitioned by date (the date_col column), with some of its partitions physically located on the same server and some on a foreign server. The date parameter passed to the function ensures that only one partition should be scanned, and that partition is on the same server. This is also shown in the EXPLAIN output below, taken from running the query as plain SQL (not in a function):
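One possible way to look at the plan used inside a function is the auto_explain contrib module with auto_explain.log_nested_statements enabled. The following is only a sketch under assumptions: the module has to be installed, the session user must be allowed to LOAD it, and function_sql is a hypothetical name for the SQL-language version of the function. Note that auto_explain logs a plan when a statement finishes, so it would probably only show the plan while the foreign server is reachable.

```sql
-- Assumption: the auto_explain contrib module is installed and may be loaded
-- by the current user (normally a superuser).
LOAD 'auto_explain';

-- Log the plan of every statement, regardless of how long it takes.
SET auto_explain.log_min_duration = 0;

-- Also log statements executed inside functions.
SET auto_explain.log_nested_statements = on;

-- Send LOG-level messages to this session as well as to the server log.
SET client_min_messages = log;

-- function_sql is a hypothetical name for the language sql version.
SELECT * FROM function_sql('2017-07-30'::date, ARRAY[1, 2]);
```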
Append  (cost=2.77..39.52 rows=2 width=36)
  CTE ct
    ->  Result  (cost=0.00..0.51 rows=100 width=4)
  InitPlan 2 (returns $1)
    ->  Aggregate  (cost=2.25..2.26 rows=1 width=32)
          ->  CTE Scan on ct  (cost=0.00..2.00 rows=100 width=4)
  ->  Seq Scan on table1  (cost=0.00..0.00 rows=1 width=36)
        Filter: ((date_col = '2017-07-30'::date) AND (some_col = ANY ($1)))
  ->  Seq Scan on "part$_table1_201707"  (cost=0.00..36.75 rows=1 width=36)
        Filter: ((date_col = '2017-07-30'::date) AND (some_col = ANY ($1)))
The foreign partitions only hold data from before 2017, so this shows that the planner selects the correct partition and does not scan any others. This is true for plain SQL and for the plpgsql function, but not for the sql function. Why does this happen, and can I avoid it without rewriting my functions?
From what I understand, there must be some difference in how the parameters are passed to the sql function, since hard-coding the date in it keeps the query from scanning unnecessary partitions. Perhaps something like this is happening:
WITH ct AS (SELECT unnest(array[1,2]) AS arr)
SELECT col1, col2
FROM table1
WHERE date_col = (SELECT '2017-07-30'::date)
  AND some_col = ANY((SELECT array_agg(arr) FROM ct)::int[])
which produces this EXPLAIN output:
Append  (cost=2.78..183.67 rows=3 width=36)
  CTE ct
    ->  Result  (cost=0.00..0.51 rows=100 width=4)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.01 rows=1 width=4)
  InitPlan 3 (returns $2)
    ->  Aggregate  (cost=2.25..2.26 rows=1 width=32)
          ->  CTE Scan on ct  (cost=0.00..2.00 rows=100 width=4)
  ->  Seq Scan on table1  (cost=0.00..0.00 rows=1 width=36)
        Filter: ((date_col = $1) AND (some_col = ANY ($2)))
  ->  Seq Scan on "part$_table1_201707"  (cost=0.00..36.75 rows=1 width=36)
        Filter: ((date_col = $1) AND (some_col = ANY ($2)))
  ->  Foreign Scan on "part$_table1_201603"  (cost=100.00..144.14 rows=1 width=36)
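Another way to probe the same hypothesis outside a function might be a prepared statement, which also receives the date and the array as parameters instead of literals. This is only a sketch: probe is a hypothetical statement name, and PostgreSQL may plan a prepared statement either with the actual values or with placeholders, so whether the foreign partition appears in the output can differ between executions.

```sql
-- probe is a hypothetical name; the query body mirrors the filters used above.
PREPARE probe(date, int[]) AS
SELECT col1, col2
FROM table1
WHERE date_col = $1
  AND some_col = ANY($2);

-- Show the plan chosen for these parameter values.
EXPLAIN EXECUTE probe('2017-07-30', ARRAY[1, 2]);

-- Clean up the prepared statement.
DEALLOCATE probe;
```

If the plan lists only table1 and "part$_table1_201707", the planner used the parameter values for constraint exclusion; if the Foreign Scan shows up, it planned without them.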
For reference, you can reproduce the problem on PostgreSQL 9.6.4 using the following code:
CREATE SERVER broken_server
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'broken_server', dbname 'postgres', port '5432');

CREATE USER MAPPING FOR postgres
  SERVER broken_server
  OPTIONS (user 'foreign_username', password 'foreign_password');

CREATE TABLE table1 (id serial PRIMARY KEY, date_col date, some_col int, col1 int, col2 text);

CREATE TABLE part$_table1_201707 () INHERITS (table1);
ALTER TABLE part$_table1_201707
  ADD CONSTRAINT part$_table1_201707_date_chk
  CHECK (date_col BETWEEN '2017-07-01'::date AND '2017-07-31'::date);

CREATE FOREIGN TABLE part$_table1_201603 () INHERITS (table1)
  SERVER broken_server
  OPTIONS (schema_name 'public', table_name 'part$_table1_201603');
ALTER TABLE part$_table1_201603
  ADD CONSTRAINT part$_table1_201603_date_chk
  CHECK (date_col BETWEEN '2016-03-01'::date AND '2016-03-31'::date);

CREATE OR REPLACE FUNCTION function_plpgsql(param1 date, param2 int[])
RETURNS TABLE(col1 int, col2 text)
LANGUAGE plpgsql
SECURITY DEFINER
AS $function$
BEGIN