PostgreSQL - very slow index sampling

Question

I am running PostgreSQL 9.4 on CentOS 6.7. One of the tables contains many millions of records; this is its DDL:

CREATE TABLE domain.examples (
  id SERIAL,
  sentence VARCHAR,
  product_id BIGINT,
  site_id INTEGER,
  time_stamp BIGINT,
  category_id INTEGER,
  CONSTRAINT examples_pkey PRIMARY KEY(id)
) 
WITH (oids = false);

CREATE INDEX examples_categories ON domain.examples
  USING btree (category_id);

CREATE INDEX examples_site_idx ON domain.examples
  USING btree (site_id);

An application consumes the data using pagination, so we fetch it in batches of 1000 records. However, even when filtering on an indexed column, the fetch is very slow:

explain analyze
select *
from domain.examples e
where e.category_id = 105154
order by id asc 
limit 1000;

Limit  (cost=0.57..331453.23 rows=1000 width=280) (actual time=2248261.276..2248296.600 rows=1000 loops=1)
  ->  Index Scan using examples_pkey on examples e  (cost=0.57..486638470.34 rows=1468199 width=280) (actual time=2248261.269..2248293.705 rows=1000 loops=1)
        Filter: (category_id = 105154)
        Rows Removed by Filter: 173306740
Planning time: 70.821 ms
Execution time: 2248328.457 ms

What causes the query to be this slow, and how can it be improved?

Thanks!

sql postgresql

Seffy Feb 08 '17 at 20:36

2 answers

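Judging by the plan, PostgreSQL decided it would be faster to scan examples_pkey and filter on category_id = 105154; running ANALYZE or adjusting the planner GUCs (configuration settings) may help.

If rows with category_id = 105154 are relatively rare, you can use a CTE to force the use of examples_categories: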

with favorite_category as (
    select *
    from domain.examples e
    where e.category_id = 105154)
select *
from favorite_category
order by id asc
limit 1000;

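This works as long as the rows with category_id = 105154 can be sorted by id in memory (check the limit with show work_mem; and raise it if needed. The default is 4 MB).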
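
As a rough sketch of the two checks mentioned above, the statements below refresh the planner statistics and inspect the sort memory. The 64MB value is only an illustrative assumption, not a recommendation from this answer; pick a value that fits your server.

```sql
-- Refresh planner statistics for the table from the question.
ANALYZE domain.examples;

-- Show the memory available to each sort operation (default: 4MB).
SHOW work_mem;

-- Raise it for the current session only.
-- 64MB is an assumed, illustrative value.
SET work_mem = '64MB';
```

SET changes the value only for the current session, so it can be tried out without touching postgresql.conf.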

Kevin Johnson Feb 08 '17 at 21:59

Roman tkachuk · Accepted Answer · 2017-02-09T08:06:17+0000

You can create a composite index on both the category_id and id columns, so that the rows for one category are already stored in id order:

CREATE INDEX examples_site_idx2 ON domain.examples
  USING btree (category_id, id);
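
On a table as large as the one in the question, a plain CREATE INDEX blocks writes to the table while it builds. If that is a concern, a possible variant (a sketch, not something tested here against your data) is to build the same index with CONCURRENTLY, which PostgreSQL 9.4 supports:

```sql
-- Same index as above, built without blocking writes to the table.
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
-- and it takes longer than a plain CREATE INDEX.
CREATE INDEX CONCURRENTLY examples_site_idx2 ON domain.examples
  USING btree (category_id, id);
```

If a concurrent build fails, it leaves behind an invalid index that has to be dropped before retrying.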

I ran EXPLAIN ANALYZE on your query against a test table with 3,000,000 rows.

With the old index:

                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..9234.56 rows=1000 width=60) (actual time=0.655..597.193 rows=322 loops=1)
   ->  Index Scan using examples_pkey on examples e  (cost=0.43..138512.43 rows=15000 width=60) (actual time=0.654..597.142 rows=322 loops=1)
         Filter: (category_id = 105154)
         Rows Removed by Filter: 2999678
 Planning time: 2.295 ms
 Execution time: 597.257 ms
(6 rows)

With the new index:

                                                                   QUERY PLAN                                                                    
-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..2585.13 rows=1000 width=60) (actual time=0.027..28.814 rows=322 loops=1)
   ->  Index Scan using examples_site_idx2 on examples e  (cost=0.43..38770.93 rows=15000 width=60) (actual time=0.026..28.777 rows=322 loops=1)
         Index Cond: (category_id = 105154)
 Planning time: 1.471 ms
 Execution time: 28.860 ms
(5 rows)
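
Since the application reads the data page by page, the same index can also serve the later pages. The question does not show how the next page is requested, so the following is only a sketch under the assumption that the application remembers the largest id of the previous page (123456 is a made-up placeholder):

```sql
-- Keyset pagination: continue after the last id already fetched.
-- Assumption: 123456 stands for the largest id of the previous page.
SELECT *
FROM domain.examples e
WHERE e.category_id = 105154
  AND e.id > 123456
ORDER BY e.id ASC
LIMIT 1000;
```

With the (category_id, id) index, both conditions can be used as index conditions, so each page should start reading at the right place in the index instead of skipping rows that were already returned.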
