How do I optimize my PostgreSQL database for prefix search?

I have a table called "nodes" with approximately 1.7 million rows in my PostgreSQL database:

=#\d nodes
            Table "public.nodes"
 Column |          Type          | Modifiers 
--------+------------------------+-----------
 id     | integer                | not null
 title  | character varying(256) | 
 score  | double precision       | 
Indexes:
    "nodes_pkey" PRIMARY KEY, btree (id)

I want to use this table to autocomplete a search field, showing the user the ten titles with the highest score that match their input. So I used this query (here searching for all titles starting with "s"):

=# explain analyze select title,score from nodes where title ilike 's%' order by score desc; 
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Sort  (cost=64177.92..64581.38 rows=161385 width=25) (actual time=4930.334..5047.321 rows=161264 loops=1)
   Sort Key: score
   Sort Method:  external merge  Disk: 5712kB
   ->  Seq Scan on nodes  (cost=0.00..46630.50 rows=161385 width=25) (actual time=0.611..4464.413 rows=161264 loops=1)
         Filter: ((title)::text ~~* 's%'::text)
 Total runtime: 5260.791 ms
(6 rows)

This was far too slow to use for autocomplete. With some information from Using PostgreSQL in Web 2.0 Applications, I was able to improve it with a custom index:

=# create index title_idx on nodes using btree(lower(title) text_pattern_ops);
=# explain analyze select title,score from nodes where lower(title) like lower('s%') order by score desc limit 10;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=18122.41..18122.43 rows=10 width=25) (actual time=1324.703..1324.708 rows=10 loops=1)
   ->  Sort  (cost=18122.41..18144.60 rows=8876 width=25) (actual time=1324.700..1324.702 rows=10 loops=1)
         Sort Key: score
         Sort Method:  top-N heapsort  Memory: 17kB
         ->  Bitmap Heap Scan on nodes  (cost=243.53..17930.60 rows=8876 width=25) (actual time=96.124..1227.203 rows=161264 loops=1)
               Filter: (lower((title)::text) ~~ 's%'::text)
               ->  Bitmap Index Scan on title_idx  (cost=0.00..241.31 rows=8876 width=0) (actual time=90.059..90.059 rows=161264 loops=1)
                     Index Cond: ((lower((title)::text) ~>=~ 's'::text) AND (lower((title)::text) ~<~ 't'::text))
 Total runtime: 1325.085 ms
(9 rows)

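So this gives me a speedup by a factor of about 4. But can it be made faster? What if I want to use '%s%' instead of 's%'? Can I get decent performance from PostgreSQL in that case too, or should I try a different solution (Lucene?, Sphinx?) for the autocomplete feature?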

+3
3

text_pattern_ops is only needed if the database does not use the C locale.
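
One way to check which case applies is to look at the collation of the current database. This is a minimal sketch using the standard pg_database catalog. If datcollate is C or POSIX, a plain btree index can already serve LIKE prefix searches. Otherwise an index built with text_pattern_ops, like title_idx in the question, is needed. It assumes the title column does not carry its own per-column collation.

```sql
-- Show the collation of the database you are connected to.
SELECT datname, datcollate
FROM pg_database
WHERE datname = current_database();
```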

: .

+3

:

  • . , postgres.

  • postgresql , a > 98%. 0.5G, , 2G . , pg_stats.

  • , . 12 , . , .

  • , . , . 20000 1,2,3 .

  • , % abc% , , , lucene .

+2

Your query returns 150000+ rows, so add a limit:

select title,score
  from nodes
  where title ilike 's%'
  order by score desc
  limit 10;

Alternatively, try using ">=" and "<":

create index nodes_title_lower_idx on nodes (lower(title));
select title,score
  from nodes
  where lower(title)>='s' and lower(title)<'t'
  order by score desc
  limit 10;

You should also create an index on score, which will help in the ilike '%s%' case.
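
A minimal sketch of that suggestion, assuming the index name nodes_score_idx (hypothetical, not from the original post). With an index on score, the planner can walk rows from the highest score downward and stop as soon as ten of them pass the filter. That tends to pay off when matches are common, as with a single-letter pattern.

```sql
-- Hypothetical index name; score is the column from the nodes table.
create index nodes_score_idx on nodes (score);

-- The planner may scan the index backward and stop after ten matching rows.
select title,score
  from nodes
  where title ilike '%s%'
  order by score desc
  limit 10;
```

Because score is nullable, rows with a NULL score sort first under order by score desc. If that matters, declare nulls last in both the index definition and the query.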


Source: https://habr.com/ru/post/1750010/

