Why are Postgres queries on jsonb columns so slow?

I have a table targeting that has a column marital_status of type text[] and another column data of type jsonb. The content of these two columns is the same, only in a different format (this is just for demonstration purposes). Sample data:

 id |      marital_status      |                        data                       
----+--------------------------+---------------------------------------------------
  1 | null                     | {}
  2 | {widowed}                | {"marital_status": ["widowed"]}
  3 | {never_married,divorced} | {"marital_status": ["never_married", "divorced"]}
...

The table contains more than 690K records with random combinations of values.

Search in text[] column

EXPLAIN ANALYZE SELECT marital_status
FROM targeting
WHERE marital_status @> '{widowed}'::text[]

No index

Usually takes <900 ms without creating any indexes:

Seq Scan on targeting  (cost=0.00..172981.38 rows=159061 width=28) (actual time=0.017..840.084 rows=158877 loops=1)
  Filter: (marital_status @> '{widowed}'::text[])
  Rows Removed by Filter: 452033
Planning time: 0.150 ms
Execution time: 845.731 ms

With index

With an index, it usually takes <200 ms (75% improvement):

CREATE INDEX targeting_marital_status_idx ON targeting ("marital_status");

Result:

Index Only Scan using targeting_marital_status_idx on targeting  (cost=0.42..23931.35 rows=159061 width=28) (actual time=3.528..143.848 rows=158877 loops=1)
  Filter: (marital_status @> '{widowed}'::text[])
  Rows Removed by Filter: 452033
  Heap Fetches: 0
Planning time: 0.217 ms
Execution time: 148.506 ms

Search in jsonb column

EXPLAIN ANALYZE SELECT data
FROM targeting
WHERE (data -> 'marital_status') @> '["widowed"]'::jsonb

No index

Usually takes <5700 ms without creating any indexes (more than 6 times slower!):

Seq Scan on targeting  (cost=0.00..174508.65 rows=611 width=403) (actual time=0.095..5399.112 rows=158877 loops=1)
  Filter: ((data -> 'marital_status'::text) @> '["widowed"]'::jsonb)
  Rows Removed by Filter: 452033
Planning time: 0.172 ms
Execution time: 5408.326 ms

With index

With an index, it usually takes <3700 ms (a 35% improvement):

CREATE INDEX targeting_data_marital_status_idx ON targeting USING GIN ((data->'marital_status'));

Result:

Bitmap Heap Scan on targeting  (cost=144.73..2482.75 rows=611 width=403) (actual time=85.966..3694.834 rows=158877 loops=1)
  Recheck Cond: ((data -> 'marital_status'::text) @> '["widowed"]'::jsonb)
  Rows Removed by Index Recheck: 201080
  Heap Blocks: exact=33723 lossy=53028
  ->  Bitmap Index Scan on targeting_data_marital_status_idx  (cost=0.00..144.58 rows=611 width=0) (actual time=78.851..78.851 rows=158877 loops=1)
        Index Cond: ((data -> 'marital_status'::text) @> '["widowed"]'::jsonb)
Planning time: 0.257 ms
Execution time: 3703.492 ms

Questions

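  • Why is the search on the text[] column so much faster, even without an index?
  • Why does the index on the jsonb column give only a 35% improvement?
  • Is there a more efficient way to query jsonb?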

Try replacing jsonb_ops (the default GIN operator class) with jsonb_path_ops.

From the documentation: https://www.postgresql.org/docs/9.6/static/datatype-json.html

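Although the jsonb_path_ops operator class supports only queries with the @> operator, it has notable performance advantages over the default operator class jsonb_ops. A jsonb_path_ops index is usually much smaller than a jsonb_ops index over the same data, and the specificity of searches is better, particularly when queries contain keys that appear frequently in the data. Therefore search operations typically perform better than with the default operator class.

The technical difference between a jsonb_ops and a jsonb_path_ops GIN index is that the former creates independent index items for each key and value in the data, while the latter creates index items only for each value in the data. [1] Basically, each jsonb_path_ops index item is a hash of the value and the key(s) leading to it; for example, to index {"foo": {"bar": "baz"}}, a single index item would be created incorporating all three of foo, bar and baz into the hash value. Thus a containment query looking for this structure would result in an extremely specific index search; but there is no way at all to find out whether foo appears as a key. On the other hand, a jsonb_ops index would create three index items representing foo, bar and baz separately; then, to do the containment query, it would look for rows containing all three of these items. While GIN indexes can perform such an AND search fairly efficiently, it will still be less specific and slower than the equivalent jsonb_path_ops search, especially if there are a very large number of rows containing any single one of the three index items.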
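Applied to the table from the question, the suggestion might look roughly like the sketch below. The index name targeting_data_marital_status_path_idx is made up for illustration; the table, the expression and the existing index name are taken from the question. Whether it actually helps on this data set is something only EXPLAIN ANALYZE on the real table can show.

```sql
-- Hypothetical index name; the indexed expression is the one used in the question.
-- The operator class goes after the parenthesized expression.
CREATE INDEX targeting_data_marital_status_path_idx
    ON targeting USING GIN ((data -> 'marital_status') jsonb_path_ops);

-- Same containment query as in the question; @> is the operator jsonb_path_ops supports.
EXPLAIN ANALYZE SELECT data
FROM targeting
WHERE (data -> 'marital_status') @> '["widowed"]'::jsonb;

-- Compare the on-disk size of the jsonb_ops and jsonb_path_ops indexes.
SELECT pg_size_pretty(pg_relation_size('targeting_data_marital_status_idx'))      AS jsonb_ops_size,
       pg_size_pretty(pg_relation_size('targeting_data_marital_status_path_idx')) AS jsonb_path_ops_size;
```

If both indexes exist at the same time, the planner is free to pick either one, so it may be easier to drop the original index before comparing plans.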


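You are comparing two different things. A schema with a plain typed column, like this,

CREATE TABLE foo ( id int, key1 text );

will generally be faster than a schema that keeps the same data in jsonb, like this:

CREATE TABLE bar ( id int, foo jsonb );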

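@Craig answered this in a comment:

GIN indexing is generally less efficient than a b-tree, so that much is expected.

Also, the null row in the sample data should read

SELECT jsonb_build_object('marital_status',ARRAY[null]);
     jsonb_build_object     
----------------------------
 {"marital_status": [null]}
(1 row)

and not {}. PostgreSQL takes a number of shortcuts to make updating jsonb objects fast and to make indexing more space-efficient.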

If none of this makes sense, look at this quasi-example.

CREATE TABLE foo ( id int, x text, y text, z text );
CREATE INDEX ON foo(x);
CREATE INDEX ON foo(y);
CREATE INDEX ON foo(z);

Here we have three btree indexes on the table. Now let's look at a similar table.

CREATE TABLE bar ( id int, junk jsonb );
CREATE INDEX ON bar USING gin (junk);
INSERT INTO bar (id,junk) VALUES (1,$${"x": 10, "y": 42}$$);

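Here bar holds the same kind of data as foo, but where foo needs three separate btrees, one per column, bar has a single GIN index that covers every key in the document.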

INSERT INTO bar (id,junk) VALUES (1,$${"x": 10, "y": 42, "z":3}$$);

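With foo, searching by z efficiently requires its own btree on z, declared up front. With bar, the new key is picked up by the existing GIN index automatically. That flexibility is what jsonb gives you, and you pay for it in speed. With jsonb, you do not have to run CREATE INDEX for every new key.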
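To make this concrete, here is a minimal sketch of queries against bar, assuming the bar table, its GIN index and the two rows inserted above. Nothing new is created for z; the same index is a candidate for all three searches. On a table this small the planner will most likely choose a sequential scan anyway, so treat the EXPLAIN at the end as something to try on realistic data volumes.

```sql
-- Containment on a key that was present in the first inserted row.
SELECT id FROM bar WHERE junk @> '{"x": 10}';

-- Containment on z, which only appears in the second row; no extra DDL was needed.
SELECT id FROM bar WHERE junk @> '{"z": 3}';

-- Key-existence test; the default jsonb_ops class used by this index supports ?.
SELECT id FROM bar WHERE junk ? 'z';

-- Inspect which access path the planner picks for the z lookup.
EXPLAIN SELECT id FROM bar WHERE junk @> '{"z": 3}';
```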


Source: https://habr.com/ru/post/1663141/

