Oracle SQL optimizer behavior when working with ORs and row-independent predicates (such as functions that return the same value regardless of the row)

I have heard that OR is bad, and that having multiple ORs can significantly hurt performance. But what about row-independent ORs? Take a look at an example:

    SELECT *
      FROM some_table t
     WHERE (    some_function('CONTEXT') = 'context of selecting by id'
            AND t.id = TO_NUMBER(another_function('ID')) )
        OR (    some_function('CONTEXT') = 'context of filtering by name'
            AND t.name LIKE '%' || another_function('NAME') || '%' )
        OR (    some_function('CONTEXT') = 'context of taking actual rows'
            AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date )
        ...

Here, some_function('CONTEXT') returns the same value regardless of the row (it does not take any row-dependent data, such as column values, as its arguments, and it does not change any internal state that would affect the result while the query is executing). It could also be just a package variable, for example some_package.context.
I think the optimizer should first evaluate some_function('CONTEXT') and then decide which OR branch to use.
But what actually happens? How can I be sure that there will be no performance penalty with such a query?

PS: 11.2

2 answers

You need to either use the undocumented hint use_concat(or_predicates(1)) or rewrite the query using UNION ALL. The optimizer has problems with these types of predicates, regardless of the functions involved.

Expected Plan

You need a plan that looks something like this:

    ------------------------------------------------------
    | Id  | Operation                     | Name         |
    ------------------------------------------------------
    |   0 | SELECT STATEMENT              |              |
    |   1 |  CONCATENATION                |              |
    |*  2 |   FILTER                      |              |
    |*  3 |    TABLE ACCESS FULL          | SOME_TABLE   |
    |*  4 |   FILTER                      |              |
    |*  5 |    TABLE ACCESS FULL          | SOME_TABLE   |
    |*  6 |   FILTER                      |              |
    |*  7 |    TABLE ACCESS BY INDEX ROWID| SOME_TABLE   |
    |*  8 |     INDEX UNIQUE SCAN         | SYS_C0010268 |
    ------------------------------------------------------

The FILTER operations in the Operation column are very different from the typical filter entries in the Predicate Information section of an explain plan. These FILTER operations evaluate a condition at run time and decide which part of the execution plan to use. Depending on the values passed to the functions, the plan will use either a full table scan (for the non-selective predicates on name or dates) or a unique index scan (for the very selective predicate on the ID).

This is exactly what you want with a query like yours. And if the query had only a small number of ANDs and ORs, it would probably use the FILTER operations.

Actual plan

But in reality, with a complex predicate, the plan is as follows:

    ----------------------------------------
    | Id  | Operation         | Name       |
    ----------------------------------------
    |   0 | SELECT STATEMENT  |            |
    |*  1 |  TABLE ACCESS FULL| SOME_TABLE |
    ----------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - filter("SOME_FUNCTION"('CONTEXT')='context of filtering by name' AND
                  "T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%' OR
                  "SOME_FUNCTION"('CONTEXT')='context of taking actual rows' AND
                  "T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') AND
                  "T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') OR
                  "SOME_FUNCTION"('CONTEXT')='context of selecting by id' AND
                  "T"."ID"=TO_NUMBER("ANOTHER_FUNCTION"('ID')))

Full table scans are not always bad. But they are pretty terrible for selecting a single primary key value.

Sample schema

Create a table with 1 million sample rows. Some columns are very selective, and some are very non-selective. They all have histograms, so the optimizer has a lot of good information to work with.

    drop table some_table purge;

    create table some_table
    (
        id number primary key,
        name varchar2(100),
        start_date date,
        end_date date
    );

    begin
        for i in 1 .. 10 loop
            insert into some_table
            select level+(i*100000), 'Name '||mod(level, 5)
                ,date '2000-01-01' + mod(level, 10000)
                ,date '2010-01-01' + mod(level, 10000)
            from dual connect by level <= 100000;
        end loop;
    end;
    /

    begin
        dbms_stats.gather_table_stats(user, 'SOME_TABLE'
            ,method_opt => 'for all columns size 254');
    end;
    /
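
To confirm that the setup above really produced 1 million rows and a histogram on every column, a quick look at the data dictionary might help. This is only a sketch, not part of the original answer: it assumes the table was created in the current schema, and the histogram types reported will depend on the Oracle version.

```sql
-- Sanity check: row count of the sample table.
select count(*) as row_count
  from some_table;

-- Per-column statistics; HISTOGRAM shows NONE when no histogram exists.
select column_name, num_distinct, num_buckets, histogram
  from user_tab_col_statistics
 where table_name = 'SOME_TABLE'
 order by column_name;
```

A HISTOGRAM value other than NONE for each column indicates that the optimizer has histogram data to work with.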

Function Examples

These functions are very static, and the optimizer should know this. This example uses some_function in a way that will never match anything. This is something of a best-case scenario; Oracle should very easily be able to work out that this query will return nothing.

    --Static functions.
    create or replace function some_function(p_context in varchar2) return varchar2 is
    begin
        return p_context;
    end;
    /

    --Btw, returning stringly-typed data is almost always a horrible idea.
    --(Although if you're dealing with sys_context you may not have a choice.)
    create or replace function another_function(p_type in varchar2) return varchar2 is
    begin
        if p_type = 'ID' then
            return '1';
        elsif p_type = 'NAME' then
            return 'Name 1';
        elsif p_type = 'ACTUAL_DATE' then
            return '2000-01-01';
        end if;
    end;
    /
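
To see why no branch of the query can match, it can help to call the functions directly. The sketch below assumes the two functions above compiled without errors; the context strings are the ones used in the question.

```sql
-- some_function simply echoes its argument, so it returns 'CONTEXT',
-- which is not equal to any of the three context strings in the query.
select some_function('CONTEXT')        as context_value,
       another_function('ID')          as id_value,
       another_function('NAME')        as name_value,
       another_function('ACTUAL_DATE') as date_value
  from dual;

-- Counts how many of the query's context strings match: expected to be 0.
select count(*) as matching_contexts
  from dual
 where some_function('CONTEXT') in ('context of selecting by id',
                                    'context of filtering by name',
                                    'context of taking actual rows');
```

Since none of the contexts match, every OR branch is false for every row, and an ideal plan would not need to touch the table at all.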

Default - Bad plan without FILTER operations

The default plan is very poor. The query should run in almost 0 seconds, but instead it performs a full table scan.

    explain plan for
    SELECT *
      FROM some_table t
     WHERE (    some_function('CONTEXT') = 'context of selecting by id'
            AND t.id = TO_NUMBER(another_function('ID')) )
        OR (    some_function('CONTEXT') = 'context of filtering by name'
            AND t.name LIKE '%' || another_function('NAME') || '%' )
        OR (    some_function('CONTEXT') = 'context of taking actual rows'
            AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date );

    select * from table(dbms_xplan.display);

    Plan hash value: 3038250352

    --------------------------------------------------------------------------------
    | Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |            |   525 | 14700 |  1504  (17)| 00:00:01 |
    |*  1 |  TABLE ACCESS FULL| SOME_TABLE |   525 | 14700 |  1504  (17)| 00:00:01 |
    --------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - filter("SOME_FUNCTION"('CONTEXT')='context of filtering by name' AND
                  "T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%' OR
                  "SOME_FUNCTION"('CONTEXT')='context of taking actual rows' AND
                  "T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') AND
                  "T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') OR
                  "SOME_FUNCTION"('CONTEXT')='context of selecting by id' AND
                  "T"."ID"=TO_NUMBER("ANOTHER_FUNCTION"('ID')))

use_concat(or_predicates(1)) - Good plan with FILTERs

The USE_CONCAT hint converts the query into separate UNION ALL steps. Each predicate is then simple and gets its own FILTER operation. Unfortunately, USE_CONCAT has some weird limitations. Sometimes it only works if there are indexes (see My Oracle Support document 259741.1). And sometimes it just does not work, the workaround does not work, and it is still not fixed in 12c (document 14545269.8).

Adding or_predicates(1) makes it work, but it is completely undocumented.

    explain plan for
    SELECT --+ use_concat(or_predicates(1))
           *
      FROM some_table t
     WHERE (    some_function('CONTEXT') = 'context of selecting by id'
            AND t.id = TO_NUMBER(another_function('ID')) )
        OR (    some_function('CONTEXT') = 'context of filtering by name'
            AND t.name LIKE '%' || another_function('NAME') || '%' )
        OR (    some_function('CONTEXT') = 'context of taking actual rows'
            AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date );

    select * from table(dbms_xplan.display);

    Plan hash value: 1618041905

    ----------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |              | 52500 |  1435K|  2721   (8)| 00:00:01 |
    |   1 |  CONCATENATION                |              |       |       |            |          |
    |*  2 |   FILTER                      |              |       |       |            |          |
    |*  3 |    TABLE ACCESS FULL          | SOME_TABLE   |  2500 | 70000 |  1362   (8)| 00:00:01 |
    |*  4 |   FILTER                      |              |       |       |            |          |
    |*  5 |    TABLE ACCESS FULL          | SOME_TABLE   | 49999 |  1367K|  1356   (7)| 00:00:01 |
    |*  6 |   FILTER                      |              |       |       |            |          |
    |*  7 |    TABLE ACCESS BY INDEX ROWID| SOME_TABLE   |     1 |    28 |     3   (0)| 00:00:01 |
    |*  8 |     INDEX UNIQUE SCAN         | SYS_C0010269 |     1 |       |     2   (0)| 00:00:01 |
    ----------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       2 - filter("SOME_FUNCTION"('CONTEXT')='context of taking actual rows')
       3 - filter("T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') AND
                  "T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...'))
       4 - filter("SOME_FUNCTION"('CONTEXT')='context of filtering by name')
       5 - filter("T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%' AND
                  (LNNVL("SOME_FUNCTION"('CONTEXT')='context of taking actual rows') OR
                  LNNVL("T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...')) OR
                  LNNVL("T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...'))))
       6 - filter("SOME_FUNCTION"('CONTEXT')='context of selecting by id')
       7 - filter((LNNVL("SOME_FUNCTION"('CONTEXT')='context of filtering by name') OR
                  LNNVL("T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%')) AND
                  (LNNVL("SOME_FUNCTION"('CONTEXT')='context of taking actual rows') OR
                  LNNVL("T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...')) OR
                  LNNVL("T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...'))))
       8 - access("T"."ID"=TO_NUMBER("ANOTHER_FUNCTION"('ID')))
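
The LNNVL predicates in the plan above are how the OR expansion avoids returning the same row from two branches: each later branch also requires that the earlier branches did not match. A hand-written, two-branch approximation of that idea could look like the following. It is only a sketch to illustrate the predicates, not the exact transformation Oracle performs, and the branch order is chosen for readability rather than copied from the plan. LNNVL(condition) is true when the condition is false or unknown.

```sql
-- Branch 1: select by id.
select *
  from some_table t
 where some_function('CONTEXT') = 'context of selecting by id'
   and t.id = to_number(another_function('ID'))
union all
-- Branch 2: filter by name, excluding rows that branch 1 already returned.
-- NOT (a AND b) is written as LNNVL(a) OR LNNVL(b), so that a NULL
-- comparison counts as "did not match".
select *
  from some_table t
 where some_function('CONTEXT') = 'context of filtering by name'
   and t.name like '%' || another_function('NAME') || '%'
   and (   lnnvl(some_function('CONTEXT') = 'context of selecting by id')
        or lnnvl(t.id = to_number(another_function('ID'))));
```

When the contexts are mutually exclusive, as in the question, the extra LNNVL conditions never remove any rows, which is why a plain UNION ALL rewrite returns the same result.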

UNION ALL - Good plan with FILTERs

Manually expanding the query is probably the safest approach. But it can get very ugly depending on the complexity of your query.

    explain plan for
    SELECT * FROM some_table t
     WHERE some_function('CONTEXT') = 'context of selecting by id'
       AND t.id = TO_NUMBER(another_function('ID'))
    union all
    SELECT * FROM some_table t
     WHERE some_function('CONTEXT') = 'context of filtering by name'
       AND t.name LIKE '%' || another_function('NAME') || '%'
    union all
    SELECT * FROM some_table t
     WHERE some_function('CONTEXT') = 'context of taking actual rows'
       AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date;

    select * from table(dbms_xplan.display);

(Plan not shown - it is basically the same as the `USE_CONCAT` version.)
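
Explain plan only shows the estimated plan. To check that the FILTER operations really skip their child steps at run time, one option is to run the query with row-source statistics enabled and look at the Starts column of the actual plan. The following is a sketch for SQL*Plus, not something from the original answer. It assumes the session has the privileges that dbms_xplan.display_cursor needs (select access to v$session, v$sql, v$sql_plan and v$sql_plan_statistics_all), and it uses only two of the three branches to keep it short.

```sql
-- serveroutput must be off, otherwise display_cursor may report on the
-- internal dbms_output call instead of the query.
set serveroutput off

-- Collect actual row-source statistics for statements in this session.
alter session set statistics_level = all;

select * from some_table t
 where some_function('CONTEXT') = 'context of selecting by id'
   and t.id = to_number(another_function('ID'))
union all
select * from some_table t
 where some_function('CONTEXT') = 'context of filtering by name'
   and t.name like '%' || another_function('NAME') || '%';

-- Show the actual plan of the last statement run in this session.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

If the FILTER operations behave as described, the table access steps under a FILTER whose context does not match should show Starts = 0.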

CASE - Bad plan without FILTERs

Rewriting the predicates as a single CASE was a good idea, but it does not seem to work here. Although this may only be a problem with my specific example.

    explain plan for
    SELECT * FROM some_table t
     WHERE case
             when some_function('CONTEXT') = 'context of selecting by id'
              AND t.id = TO_NUMBER(another_function('ID'))
             then 1
             when some_function('CONTEXT') = 'context of filtering by name'
              AND t.name LIKE '%' || another_function('NAME') || '%'
             then 1
             when some_function('CONTEXT') = 'context of taking actual rows'
              AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date
             then 1
             else 0
           end = 1;

    select * from table(dbms_xplan.display);

(Plan not shown - it is basically the same as the default version with the full table scan.)

You are right - this is what the optimizer should do. In my experience, this is not what it does.

Strangely, you can still get the behavior you want in this case if you convert the predicates to a CASE expression, for example:

    case
        when some_function('CONTEXT') = 'context of selecting by id'
         AND t.id = TO_NUMBER(another_function('ID'))
        then 1 -- satisfied
        when some_function('CONTEXT') = 'context of filtering by name'
         AND t.name LIKE '%' || another_function('NAME') || '%'
        then 1 -- satisfied
        when some_function('CONTEXT') = 'context of taking actual rows'
         AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date
        then 1 -- satisfied
        ...
        else 0 -- unsatisfied
    end = 1 -- rows from candidate set are only in the result set when
            -- they are "satisfied"

Oracle then typically resolves this as a filter operation instead of combining, which avoids the "normal" performance problems that people often run into when using logical ORs.

As a bonus, this method often works even when the context returned by "some_function(...)" is not static across rows!


Source: https://habr.com/ru/post/1496756/

