This is going to be long, so here is a quick summary to draw you in: my top-N query with COUNT STOPKEY and SORT ORDER BY STOPKEY in its plan is still slow, for no good reason.
Now, the details. It starts with a slow function. In real life, the function does string manipulation with regular expressions. For demonstration purposes, here is an intentionally dumb recursive Fibonacci algorithm. I found it pretty fast for inputs up to 25, slowish around 30, and laughably slow at 35.
    -- I repeat: Please no advice on how to do Fibonacci correctly.
    -- This is slow on purpose!
    CREATE OR REPLACE FUNCTION tmp_fib (
      n INTEGER
    )
    RETURN INTEGER
    AS
    BEGIN
      IF n = 0 OR n = 1 THEN
        RETURN 1;
      END IF;
      RETURN tmp_fib(n-2) + tmp_fib(n-1);
    END;
    /
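To reproduce the timings described above, a minimal SQL*Plus sketch could look like the following. It assumes the session is SQL*Plus (or a compatible client that understands SET TIMING), and the elapsed times will of course depend on the machine.

```sql
-- Assumption: running in SQL*Plus, where SET TIMING ON prints elapsed time
-- after each statement.
SET TIMING ON

-- Expected to be fast according to the description above.
SELECT tmp_fib(25) FROM dual;

-- Expected to be noticeably slower.
SELECT tmp_fib(30) FROM dual;

-- The input that dominates the queries in this question.
SELECT tmp_fib(33) FROM dual;

SET TIMING OFF
```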
Now some input: a list of names and numbers.
    CREATE TABLE tmp_table (
      name VARCHAR2(20) UNIQUE NOT NULL,
      num NUMBER(2,0)
    );

    INSERT INTO tmp_table (name,num)
      SELECT 'Alpha', 10 FROM dual UNION ALL
      SELECT 'Bravo', 11 FROM dual UNION ALL
      SELECT 'Charlie', 33 FROM dual;
Here's an example of a slow query: use the slow Fibonacci function to select the rows whose number generates a Fibonacci number containing a doubled digit (two identical adjacent digits).
    SELECT p.name, p.num
    FROM tmp_table p
    WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
    ORDER BY p.name;
This is true for 11 and 33 (tmp_fib(11) = 144 and tmp_fib(33) = 5702887), so Bravo and Charlie are in the output. It takes about 5 seconds to run, almost all of which is the slow calculation of tmp_fib(33). So I want a faster version of the slow query, by converting it to a top-N query. With N = 1, it looks like this:
    SELECT * FROM (
      SELECT p.name, p.num
      FROM tmp_table p
      WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
      ORDER BY p.name
    )
    WHERE ROWNUM <= 1;
And now it returns the top result, Bravo. But it still takes 5 seconds to run! The only explanation is that it is still calculating tmp_fib(33), even though the result of that calculation is irrelevant to the result. It should have been able to decide that Bravo was going to be output, so there was no need to test the WHERE condition for the rest of the table.
I thought maybe the optimizer just needs to be told that tmp_fib is expensive. So I tried telling it, like this:
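One way to check the suspicion that the filter is evaluated for every row is to count top-level calls to the function. The sketch below is not part of the original test case: the package tmp_call_counter and the wrapper tmp_fib_counted are hypothetical names introduced only for illustration, and the counter lives in session-level package state.

```sql
-- Hypothetical helper: a package variable that survives across calls
-- within one session.
CREATE OR REPLACE PACKAGE tmp_call_counter IS
  calls INTEGER := 0;
END;
/

-- Hypothetical wrapper: counts only top-level calls, not the recursion
-- inside tmp_fib itself.
CREATE OR REPLACE FUNCTION tmp_fib_counted (n INTEGER) RETURN INTEGER AS
BEGIN
  tmp_call_counter.calls := tmp_call_counter.calls + 1;
  RETURN tmp_fib(n);
END;
/

-- Same top-1 query, with the wrapper in the WHERE clause.
SELECT * FROM (
  SELECT p.name, p.num
  FROM tmp_table p
  WHERE REGEXP_LIKE(tmp_fib_counted(p.num), '(.)\1')
  ORDER BY p.name
)
WHERE ROWNUM <= 1;

-- Report how many rows the filter was evaluated for.
SET SERVEROUTPUT ON
BEGIN
  DBMS_OUTPUT.PUT_LINE('top-level calls: ' || tmp_call_counter.calls);
END;
/
```

If the reported count equals the number of rows in tmp_table, the filter ran for every row before the sort, which would match the observed run time.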
ASSOCIATE STATISTICS WITH FUNCTIONS tmp_fib DEFAULT COST (999999999,0,0);
This changes some of the cost numbers in the plan, but it does not make the query run faster.
Output from SELECT * FROM v$version, in case this is version-dependent:
    Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
    PL/SQL Release 11.2.0.2.0 - Production
    CORE 11.2.0.2.0 Production
    TNS for 64-bit Windows: Version 11.2.0.2.0 - Production
    NLSRTL Version 11.2.0.2.0 - Production
And here is the autotrace of the top-1 query. It appears to claim the query took 1 second (the Time column is only the optimizer's estimate), but that is not true. It ran for about 5 seconds.
    NAME                        NUM
    -------------------- ----------
    Bravo                        11

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 548796432

    -------------------------------------------------------------------------------------
    | Id  | Operation               | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT        |           |     1 |    55 |     4  (25)| 00:00:01 |
    |*  1 |  COUNT STOPKEY          |           |       |       |            |          |
    |   2 |   VIEW                  |           |     1 |    55 |     4  (25)| 00:00:01 |
    |*  3 |    SORT ORDER BY STOPKEY|           |     1 |    55 |     4  (25)| 00:00:01 |
    |*  4 |     TABLE ACCESS FULL   | TMP_TABLE |     1 |    55 |     3   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - filter(ROWNUM<=1)
       3 - filter(ROWNUM<=1)
       4 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("P"."NUM")),'(.)\1'))

    Note
    -----
       - dynamic sampling used for this statement (level=2)

    Statistics
    ----------------------------------------------------------
             27  recursive calls
              0  db block gets
             25  consistent gets
              0  physical reads
              0  redo size
            593  bytes sent via SQL*Net to client
            524  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              1  sorts (memory)
              0  sorts (disk)
              1  rows processed
UPDATE: As I mentioned in the comments, an INDEX hint really helps this query. That is good enough to be accepted as the correct answer, even though it does not translate well to my real-world scenario. And in an ironic twist, Oracle seems to have learned from the experience, and now chooses the INDEX plan by default; I have to give it a NO_INDEX hint to reproduce the original slow behavior.
In the real-world scenario I applied a more complex solution, rewriting the query as a PL/SQL function. Here is what my technique looks like, applied to the fib problem:
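For readers who want to try the hint, a hinted version of the top-1 query might look like the sketch below. The exact hint text used in the comments is not shown here, so the form INDEX(p), which names only the table alias and leaves the choice of index to the optimizer, is an assumption. The only index in this test case is the system-named one created by the UNIQUE constraint on name.

```sql
-- Assumption: INDEX(p) without an index name; the optimizer picks the
-- index backing the UNIQUE constraint on tmp_table.name.
SELECT * FROM (
  SELECT /*+ INDEX(p) */ p.name, p.num
  FROM tmp_table p
  WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
  ORDER BY p.name
)
WHERE ROWNUM <= 1;

-- To reproduce the original slow behavior once the optimizer prefers
-- the index on its own, the opposite hint can be used.
SELECT * FROM (
  SELECT /*+ NO_INDEX(p) */ p.name, p.num
  FROM tmp_table p
  WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
  ORDER BY p.name
)
WHERE ROWNUM <= 1;
```

The idea is that reading rows through the index on name delivers them already sorted, so the filter can be applied row by row and the scan can stop at the first match instead of filtering the whole table before sorting.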
    CREATE OR REPLACE PACKAGE tmp_package IS
      TYPE t_namenum IS TABLE OF tmp_table%ROWTYPE;
      FUNCTION get_interesting_names (howmany INTEGER) RETURN t_namenum PIPELINED;
    END;
    /

    CREATE OR REPLACE PACKAGE BODY tmp_package IS
      FUNCTION get_interesting_names (howmany INTEGER) RETURN t_namenum PIPELINED IS
        -- Read rows in output order so the loop can stop after the first N matches.
        CURSOR c IS SELECT name, num FROM tmp_table ORDER BY name;
        rec c%ROWTYPE;
        outcount INTEGER;
      BEGIN
        OPEN c;
        outcount := 0;
        WHILE outcount < howmany LOOP
          FETCH c INTO rec;
          EXIT WHEN c%NOTFOUND;
          -- The slow function runs only for the rows fetched so far.
          IF REGEXP_LIKE(tmp_fib(rec.num), '(.)\1') THEN
            PIPE ROW(rec);
            outcount := outcount + 1;
          END IF;
        END LOOP;
        -- Release the cursor and end the pipelined function explicitly.
        CLOSE c;
        RETURN;
      END;
    END;
    /

    SELECT * FROM TABLE(tmp_package.get_interesting_names(1));
Thanks to the responders who read the question, ran the tests, and helped me understand the execution plans. I will dispose of this question however they suggest.