Combining multiple indices when WHERE has OR conditions

Question

If I run this query:

SELECT eid
FROM   entidades e
WHERE  distrito IN ( SELECT id FROM distritos WHERE distrito_t LIKE '%lisboa%' )

or

SELECT eid
FROM   entidades e
WHERE  concelho IN ( SELECT id FROM concelho WHERE concelho_t LIKE '%lisboa%' )

my indexes on distrito or concelho are used.

For either of the queries above, the EXPLAIN ANALYZE output looks something like this:

----------------------------------------------------------------------
Nested Loop  (cost=239.36..23453.18 rows=12605 width=4) (actual time=29.995..790.191 rows=100602 loops=1)
  ->  HashAggregate  (cost=1.38..1.39 rows=1 width=12) (actual time=0.081..0.085 rows=1 loops=1)
        ->  Seq Scan on distritos  (cost=0.00..1.38 rows=1 width=12) (actual time=0.058..0.068 rows=1 loops=1)
              Filter: ((distrito_t)::text ~~ '%lisboa%'::text)
  ->  Bitmap Heap Scan on entidades e  (cost=237.98..23294.23 rows=12605 width=7) (actual time=29.892..389.767 rows=100602 loops=1)
        Recheck Cond: (e.distrito = distritos.id)
        ->  Bitmap Index Scan on idx_t_ent_dis  (cost=0.00..234.83 rows=12605 width=0) (actual time=26.787..26.787 rows=100602 loops=1)
              Index Cond: (e.distrito = distritos.id)

However, for the next query, no indexes are used at all ...

SELECT eid
FROM   entidades e
WHERE  concelho IN ( SELECT id FROM concelho WHERE concelho_t LIKE '%lisboa%' )
OR     distrito IN ( SELECT id FROM distritos WHERE distrito_t LIKE '%lisboa%' )

----------------------------------------------------------------------
Seq Scan on entidades e  (cost=10.25..34862.71 rows=283623 width=4) (actual time=0.600..761.876 rows=100604 loops=1)
  Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
  SubPlan 1
    ->  Seq Scan on distritos  (cost=0.00..1.38 rows=1 width=12) (actual time=0.083..0.093 rows=1 loops=1)
          Filter: ((distrito_t)::text ~~ '%lisboa%'::text)
  SubPlan 2
    ->  Seq Scan on concelhos  (cost=0.00..8.86 rows=3 width=5) (actual time=0.173..0.258 rows=1 loops=1)
          Filter: ((concelho_t)::text ~~ '%lisboa%'::text)

How can I create an index that the previous query will use?
From this documentation you can ...
But I was probably not searching for the right thing, since I cannot find any example at all ...

update: added clarification for both types of queries ...

+3

sql postgresql

acm Mar 04 '11 at 14:52

3 answers

SELECT e.eid
FROM   entidades e
LEFT JOIN concelho c ON e.concelho = c.id
LEFT JOIN distritos d ON e.distrito = d.id
WHERE  c.concelho_t LIKE '%lisboa%'
OR     d.distrito_t LIKE '%lisboa%';

+1

krtek Mar 04 '11 at 14:56

Try:

SELECT eid
FROM   entidades e
WHERE  concelho IN ( SELECT id FROM concelho WHERE concelho_t LIKE '%lisboa%' )
UNION
SELECT eid
FROM   entidades e
WHERE  distrito IN ( SELECT id FROM distritos WHERE distrito_t LIKE '%lisboa%' )

But the real problem is the lack of normalization in the database: there is no hierarchy for country, territory, district (condominium, parroquia), city, suburb. If you had one, each entity would belong to the structure at the right place, and you would not have “Lisboa” appearing at both the concelho and the district levels.

+1

Performancedba Mar 07 '11 at 9:36

user533832 · Accepted Answer · 2011-03-07T09:01:41+0000

A test case on postgres (8.4), comparing the "or" form of the where clause with the "union" form:

create table distritos(id serial primary key, distrito_t text);
insert into distritos(distrito_t) select 'distrito'||generate_series(1, 10000);

create table concelho(id serial primary key, concelho_t text);
insert into concelho(concelho_t) select 'concelho'||generate_series(1, 10000);

create table entidades( eid serial primary key, 
                        distrito integer not null references distritos, 
                        concelho integer not null references concelho );
insert into entidades(distrito, concelho)
select generate_series(1, 10000), generate_series(1, 10000);

The "or" query:

explain analyze 
select eid from entidades
where concelho in (select id from concelho where concelho_t like '%lisboa%')
   or distrito in (select id from distritos where distrito_t like '%lisboa%');

                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Seq Scan on entidades  (cost=299.44..494.94 rows=7275 width=4) (actual time=8.978..8.978 rows=0 loops=1)
   Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
   SubPlan 1
     ->  Seq Scan on concelho  (cost=0.00..149.71 rows=2 width=4) (actual time=3.922..3.922 rows=0 loops=1)
           Filter: (concelho_t ~~ '%lisboa%'::text)
   SubPlan 2
     ->  Seq Scan on distritos  (cost=0.00..149.71 rows=2 width=4) (actual time=3.363..3.363 rows=0 loops=1)
           Filter: (distrito_t ~~ '%lisboa%'::text)

The "union" query:

explain analyze
  select eid from entidades 
  where concelho in (select id from concelho where concelho_t like '%lisboa%')
  union
  select eid from entidades
  where distrito in (select id from distritos where distrito_t like '%lisboa%');

                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=648.98..650.92 rows=194 width=4) (actual time=5.409..5.409 rows=0 loops=1)
   ->  Append  (cost=149.74..648.50 rows=194 width=4) (actual time=5.399..5.399 rows=0 loops=1)
         ->  Hash Semi Join  (cost=149.74..323.28 rows=97 width=4) (actual time=2.743..2.743 rows=0 loops=1)
               Hash Cond: (stack.entidades.concelho = concelho.id)
               ->  Seq Scan on entidades  (cost=0.00..147.00 rows=9700 width=8) (actual time=0.013..0.013 rows=1 loops=1)
               ->  Hash  (cost=149.71..149.71 rows=2 width=4) (actual time=2.723..2.723 rows=0 loops=1)
                     ->  Seq Scan on concelho  (cost=0.00..149.71 rows=2 width=4) (actual time=2.716..2.716 rows=0 loops=1)
                           Filter: (concelho_t ~~ '%lisboa%'::text)
         ->  Hash Semi Join  (cost=149.74..323.28 rows=97 width=4) (actual time=2.655..2.655 rows=0 loops=1)
               Hash Cond: (stack.entidades.distrito = distritos.id)
               ->  Seq Scan on entidades  (cost=0.00..147.00 rows=9700 width=8) (actual time=0.006..0.006 rows=1 loops=1)
               ->  Hash  (cost=149.71..149.71 rows=2 width=4) (actual time=2.642..2.642 rows=0 loops=1)
                     ->  Seq Scan on distritos  (cost=0.00..149.71 rows=2 width=4) (actual time=2.642..2.642 rows=0 loops=1)
                           Filter: (distrito_t ~~ '%lisboa%'::text)
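
The test tables above have no indexes on entidades.concelho or entidades.distrito, and no row matches '%lisboa%', so neither plan can show an index scan. A minimal sketch of how the comparison could be extended follows. The index names and the two sample rows are assumptions made for illustration, not part of the original test case, and which plan the planner picks will depend on the PostgreSQL version, the statistics, and the table sizes.

```sql
-- Hypothetical index names; the columns come from the test tables above.
create index idx_entidades_concelho on entidades(concelho);
create index idx_entidades_distrito on entidades(distrito);

-- Illustrative rows only, so that the '%lisboa%' filters match something.
update concelho  set concelho_t = 'lisboa'        where id = 1;
update distritos set distrito_t = 'grande lisboa' where id = 2;

-- Refresh planner statistics after the changes.
analyze distritos;
analyze concelho;
analyze entidades;

-- The "or" form.
explain analyze
select eid from entidades
where concelho in (select id from concelho where concelho_t like '%lisboa%')
   or distrito in (select id from distritos where distrito_t like '%lisboa%');

-- The "union" form. Because eid is the primary key, removing duplicates
-- with union returns the same set of rows as the "or" form.
explain analyze
select eid from entidades
where concelho in (select id from concelho where concelho_t like '%lisboa%')
union
select eid from entidades
where distrito in (select id from distritos where distrito_t like '%lisboa%');
```

Comparing the two plans on your own data shows whether each branch of the union gets its own index scan while the "or" form still falls back to a sequential scan.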
