Always true conditional aggregate versus count

I came across code that uses a conditional aggregate on 1 = 1 instead of count:

select sum(case when 1=1 then 1 else 0 end) 

In my eyes, this is equivalent to:

 select count(1) 

At first I assumed the developer had intended to come back and add more logic to this conditional aggregate, but then I found it in another script, and then another. That prompted me to ask around, and I learned that a previous developer had insisted that this sum is in all cases more efficient and faster than the count function (insisted strongly enough that other developers later followed it as a standard). That seems counterintuitive to me: forcing the system to evaluate the condition 1 = 1 may be insignificant, but it is still extra work compared with the count function. I figured I would ask here before coming back with a firm statement.

a) Is there any truth to what this developer said, that the conditional sum will be faster than count?

And a completely secondary question: b) Has there ever been a database system that would evaluate the conditional aggregate faster than a count?

This is an Oracle 11g database, although I suspect the scripts were written for Oracle 8i.

And for bonus points ... I was asked to optimize this code. Will replacing this with the count function improve speed at all? The number of records exceeds 100 million at times.

2 answers

Summary: it makes no difference, and it never has in Oracle, at least since version 6 (1989), when I first started hearing about clever ways of counting faster by selecting a primary key column and so on, as if Oracle were unaware that people sometimes count things.

You can see what the parser / optimizer does with the expression by using it in a filter and checking the Predicate Information section of the execution plan.

 create table demo
 ( demo_id      integer generated always as identity constraint demo_pk primary key
 , othercolumn  integer );

 insert into demo (othercolumn)
 select dbms_random.value(0,1000) from dual connect by rownum <= 10000;

 commit;

 call dbms_stats.gather_table_stats(user, 'demo');
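The plans in this answer can be reproduced with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. What follows is only a minimal sketch, assuming the demo table created above and a plan table available to the session; the exact output will vary with version and statistics.

```sql
-- Ask the optimizer for the plan of the statement without executing it.
explain plan for
select sum(case when 1=1 then 1 else 0 end)
from   demo
having sum(case when 1=1 then 1 else 0 end) > 0;

-- Display the plan just explained; the default format includes
-- the Predicate Information section.
select * from table(dbms_xplan.display);
```

The having clause is there only to make the aggregate show up as a filter predicate, which is where the rewritten expression becomes visible.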

Normal count(*) (Oracle 12.1):

 select count(*)
 from   demo
 having count(*) > 1

 Plan hash value: 1044424301

 --------------------------------------------------------------------------
 | Id  | Operation              | Name    | Rows  | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------
 |   0 | SELECT STATEMENT       |         |     1 |     7   (0)| 00:00:01 |
 |*  1 |  FILTER                |         |       |            |          |
 |   2 |   SORT AGGREGATE       |         |     1 |            |          |
 |   3 |    INDEX FAST FULL SCAN| DEMO_PK | 10000 |     7   (0)| 00:00:01 |
 --------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    1 - filter(COUNT(*)>1)

Clever superfast expression:

 select sum(case when 1=1 then 1 else 0 end)
 from   demo
 having sum(case when 1=1 then 1 else 0 end) > 0

 Plan hash value: 1044424301

 --------------------------------------------------------------------------
 | Id  | Operation              | Name    | Rows  | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------
 |   0 | SELECT STATEMENT       |         |     1 |     7   (0)| 00:00:01 |
 |*  1 |  FILTER                |         |       |            |          |
 |   2 |   SORT AGGREGATE       |         |     1 |            |          |
 |   3 |    INDEX FAST FULL SCAN| DEMO_PK | 10000 |     7   (0)| 00:00:01 |
 --------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    1 - filter(SUM(1)>0)

Notice the Predicate Information section, which shows that the sum expression has been evaluated and replaced with sum(1). (I don’t have time to dig into trace files right now, but I’m sure they would show that the rewrite happened before CBO optimization.)

Here is what it does with count(1), another expression that is sometimes believed to be more efficient than the standard one:

 select count(1)
 from   demo
 having count(1) > 1

 Plan hash value: 1044424301

 --------------------------------------------------------------------------
 | Id  | Operation              | Name    | Rows  | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------
 |   0 | SELECT STATEMENT       |         |     1 |     7   (0)| 00:00:01 |
 |*  1 |  FILTER                |         |       |            |          |
 |   2 |   SORT AGGREGATE       |         |     1 |            |          |
 |   3 |    INDEX FAST FULL SCAN| DEMO_PK | 10000 |     7   (0)| 00:00:01 |
 --------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    1 - filter(COUNT(*)>1)

And here is a plan without filters:

 select sum(case when 1=1 then 1 else 0 end) as rowcount
 from   demo

 Plan hash value: 2242940774

 -------------------------------------------------------------------------
 | Id  | Operation             | Name    | Rows  | Cost (%CPU)| Time     |
 -------------------------------------------------------------------------
 |   0 | SELECT STATEMENT      |         |     1 |     7   (0)| 00:00:01 |
 |   1 |  SORT AGGREGATE       |         |     1 |            |          |
 |   2 |   INDEX FAST FULL SCAN| DEMO_PK | 10000 |     7   (0)| 00:00:01 |
 -------------------------------------------------------------------------

As you can see, they are all the same (apart from the differences in the artificial filter predicate).

In addition, sum(1) does not give the same result as count(*) when there are no rows: the sum is null, while the count is 0.

 select sum(case when 1=1 then 1 else 0 end) as sum1
      , count(*)
 from   demo
 where  1=2

       SUM1   COUNT(*)
 ---------- ----------
                     0
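If the sum form has to behave like count(*) on an empty input, the null has to be mapped to 0 explicitly. Here is a minimal sketch using the standard COALESCE function against the same demo table; the aliases sum1_fixed and cnt are made up for illustration.

```sql
-- sum() over zero rows returns NULL, while count(*) returns 0.
-- coalesce() turns the NULL into 0 so that both columns agree
-- when the where clause matches nothing.
select coalesce(sum(case when 1=1 then 1 else 0 end), 0) as sum1_fixed
     , count(*) as cnt
from   demo
where  1=2;
```

This is one more reason to prefer plain count(*): it needs no wrapper to return 0 for an empty set.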

The easiest way to find the answer is, I think, to explain both options and see what Oracle says.

First, the usual COUNT:

 SQL> set autotrace on explain
 SQL> select /*+ choose */ count(*) from tob_stavke_rac;

   COUNT(*)
 ----------
   53195373

 Execution Plan
 ----------------------------------------------------------
 Plan hash value: 3099656827

 --------------------------------------------------------------------------------
 | Id  | Operation             | Name           | Rows  | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT      |                |     1 | 30846   (2)| 00:00:02 |
 |   1 |  SORT AGGREGATE       |                |     1 |            |          |
 |   2 |   INDEX FAST FULL SCAN| SRC_S_STA_FK_I |    53M| 30846   (2)| 00:00:02 |
 --------------------------------------------------------------------------------

Then, an unusual SUM with CASE:

 SQL> select /*+ choose */ sum(case when 1 = 1 then 1 else 0 end) from tob_stavke_rac;

 SUM(CASEWHEN1=1THEN1ELSE0END)
 -----------------------------
                      53195373

 Execution Plan
 ----------------------------------------------------------
 Plan hash value: 3099656827

 --------------------------------------------------------------------------------
 | Id  | Operation             | Name           | Rows  | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT      |                |     1 | 30846   (2)| 00:00:02 |
 |   1 |  SORT AGGREGATE       |                |     1 |            |          |
 |   2 |   INDEX FAST FULL SCAN| SRC_S_STA_FK_I |    53M| 30846   (2)| 00:00:02 |
 --------------------------------------------------------------------------------

 SQL>

There is no advantage in this database: both statements get the same plan and the same cost. Perhaps there was one in Oracle 8i (by now a 20-year-old piece of software), but today, in version 12c, I would not say so. Besides, Oracle generally rewrites a query when the optimizer concludes that the rewritten form will run citius, altius, fortius (a tribute to the Olympic Games in South Korea).

[EDIT, showing what the plan looks like with the RBO]
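Estimated cost is not the same as measured time, so a quick sanity check in SQL*Plus is to run both statements with timing switched on. This is only a rough sketch against the same table; each statement should be run more than once, since the first execution may pay for physical reads that later ones find in the buffer cache.

```sql
-- SQL*Plus command: print the elapsed time after each statement.
set timing on

select count(*) from tob_stavke_rac;

select sum(case when 1 = 1 then 1 else 0 end) from tob_stavke_rac;

-- Switch the elapsed-time display back off.
set timing off
```

Given identical plans, any difference between the two timings is more likely to be caching or system load than the choice of expression.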

Some information, as you can see, is missing ...

 SQL> select count(*) from tob_stavke_rac;

   COUNT(*)
 ----------
   53195373

 Execution Plan
 ----------------------------------------------------------
 Plan hash value: 3371741006

 ---------------------------------------------
 | Id  | Operation          | Name           |
 ---------------------------------------------
 |   0 | SELECT STATEMENT   |                |
 |   1 |  SORT AGGREGATE    |                |
 |   2 |   TABLE ACCESS FULL| TOB_STAVKE_RAC |
 ---------------------------------------------

 Note
 -----
    - rule based optimizer used (consider using cbo)

 SQL>

Source: https://habr.com/ru/post/1275277/

