Refactoring large cursor queries by splitting into multiple cursors

Another PL/SQL refactoring question!

I have several cursors that have a common simplified form:

    cursor cursor_1 is
      with X as (select col1, col2 from TAB where col1 = '1'),
           Y as (select col1, col2 from TAB where col2 = '3')
      /*main select*/
      select count(X.col1), ...
      from X inner join Y on...
      group by rollup (X.col1, ...

    cursor cursor_2 is
      with X as (select col1, col2 from TAB where col1 = '7' and col2 = '9' and col3 = 'TEST'),
           Y as (select col1, col2 from TAB where col3 = '6')
      /*main select*/
      select count(X.col1), ...
      from X inner join Y on...
      group by rollup (X.col1, ...

    cursor cursor_3 is
      with X as (select col1, col2 from TAB where col1 IS NULL),
           Y as (select col1, col2 from TAB where col2 IS NOT NULL)
      /*main select*/
      select count(X.col1), ...
      from X inner join Y on...
      group by rollup (X.col1, ...
    ...
    begin
      for r in cursor_1 loop
        print_report_results(r);
      end loop;
      for r in cursor_2 loop
        print_report_results(r);
      end loop;
      ...
    end;

In principle, all of these cursors (there are more than three) produce the same summary report. The only difference is in the factored subqueries (the WITH clause). There are always two of them, "X" and "Y", and they always select the same columns to feed the main reporting query.

The problem is that the main reporting query is VERY large, about 70 lines. That in itself is not so bad, but it has been copied into EVERY report cursor (I think there are more than a dozen).

Since the only difference is the factored subqueries (they all return the same columns, so they really differ only in the tables they select from and their conditions), I was hoping to find a way to refactor all this so that there is ONE copy of the giant reporting query plus a small pair of subqueries for each report. Then, when the way the report is built changes, I only need to change it in one place, not in a dozen. Not to mention a file that is much easier to read and navigate!

I just don't know how to refactor something like this properly. I was thinking about pipelined functions, but I'm not sure whether they are suitable for this or whether there is an easier way...

On the other hand, I also wonder whether splitting up the reporting query will make performance significantly worse. Performance (speed) is already a problem for this system, so I would prefer not to introduce a change for developer convenience if it adds significant runtime.
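One way to settle the performance question before committing to a refactoring is to time the original and the refactored query on realistic data. The following is only a minimal sketch: the `dual`-based row generator is a stand-in (an assumption, not part of the original system) and should be replaced first by one of the existing cursor queries and then by its refactored form.

```sql
set serveroutput on

declare
  v_start number;
  v_rows  pls_integer := 0;
begin
  -- dbms_utility.get_time returns a counter in hundredths of a second
  v_start := dbms_utility.get_time;

  -- Stand-in query: substitute the original or the refactored report query here
  for r in (select level as col1 from dual connect by level <= 1000) loop
    v_rows := v_rows + 1;
  end loop;

  dbms_output.put_line(v_rows || ' rows in '
    || (dbms_utility.get_time - v_start) / 100 || ' s');
end;
/
```

Running each variant several times and comparing the timings gives a rough indication; for a closer look, the execution plans of the two statements can be compared as well.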


I imagine ending up with something like this (I'm just not sure how to write it so that it actually compiles):

    cursor main_report_cursor (in_X, in_Y) is
      with X as (select * from in_X),
           Y as (select * from in_Y)
      /*main select*/
      select count(X.col1), ...
      from X inner join Y on...
      group by rollup (X.col1, ...

    cursor x_1 is select col1, col2 from TAB where col1 = '1';
    cursor y_1 is select col1, col2 from TAB where col2 = '3'
    ...
    begin
      for r in main_report_cursor(x_1,y_1) loop
        print_report_results(r);
      end loop;
      for r in main_report_cursor(x_2,y_2) loop
        print_report_results(r);
      end loop;
      ...

(using Oracle 10g)
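A cursor cannot take another cursor or a query as a parameter, which is why the sketch above does not compile. One alternative technique that gets close to it is to build the statement dynamically and return a ref cursor. The block below is a minimal sketch of that idea, not a finished solution: `open_report` is a hypothetical helper, the main select is shortened to two columns, and the two `dual` queries are stand-ins for the real X and Y subqueries.

```sql
set serveroutput on

declare
  v_cur  sys_refcursor;
  v_col1 varchar2(10);
  v_cnt  number;

  -- Hypothetical helper: wraps two subquery texts in the shared report query.
  -- p_x_sql and p_y_sql must be trusted, hard-coded strings, never user input.
  function open_report(p_x_sql in varchar2, p_y_sql in varchar2)
    return sys_refcursor
  is
    v_rc sys_refcursor;
  begin
    open v_rc for
         'with X as (' || p_x_sql || '), Y as (' || p_y_sql || ') '
      || 'select X.col1, count(*) from X inner join Y on X.col1 = Y.col1 '
      || 'group by rollup (X.col1)';
    return v_rc;
  end open_report;
begin
  -- Stand-in subqueries; the real ones would select from the report tables
  v_cur := open_report(
    q'[select '1' col1, 'a' col2 from dual]',
    q'[select '1' col1, 'b' col2 from dual]');
  loop
    fetch v_cur into v_col1, v_cnt;
    exit when v_cur%notfound;
    -- The rollup total row has a null col1
    dbms_output.put_line(nvl(v_col1, '(total)') || ': ' || v_cnt);
  end loop;
  close v_cur;
end;
/
```

The main query text lives in one place, and each report only supplies its two subqueries. Every distinct combination is still parsed and optimized as a complete statement of its own, so the optimizer sees the same SQL it would see with the copied cursors.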

+6
4 answers

Use a pipelined function. For example:

    drop table my_tab;

    create table my_tab (
      col1 number,
      col2 varchar2(10),
      col3 char(1)
    );

    insert into my_tab values (1, 'One', 'X');
    insert into my_tab values (1, 'One', 'Y');
    insert into my_tab values (2, 'Two', 'X');
    insert into my_tab values (2, 'Two', 'Y');
    insert into my_tab values (3, 'Three', 'X');
    insert into my_tab values (4, 'Four', 'Y');
    commit;

    -- define types
    create or replace package refcur_pkg is
      --type people_tab is table of people%rowtype;
      type my_subquery_tab is table of my_tab%rowtype;
    end refcur_pkg;
    /

Create the pipelined function:

    -- create pipelined function
    -- p_cur_num selects the report, p_cur_type selects its X or Y subquery
    create or replace function get_tab_data(p_cur_num in number, p_cur_type in char)
      return REFCUR_PKG.my_subquery_tab pipelined
    is
    begin
      if (p_cur_num = 1) then
        if (upper(p_cur_type) = 'X') then
          for rec in (select * from my_tab where col1=1 and col3='X') loop
            pipe row(rec);
          end loop;
        elsif (upper(p_cur_type) = 'Y') then
          for rec in (select * from my_tab where col1=1 and col3='Y') loop
            pipe row(rec);
          end loop;
        else
          return;
        end if;
      elsif (p_cur_num = 2) then
        if (upper(p_cur_type) = 'X') then
          for rec in (select * from my_tab where col1=2 and col3='X') loop
            pipe row(rec);
          end loop;
        elsif (upper(p_cur_type) = 'Y') then
          for rec in (select * from my_tab where col1=2 and col3='Y') loop
            pipe row(rec);
          end loop;
        else
          return;
        end if;
      end if;
      return;
    end;
    /

Basic usage example:

    -- main procedure/usage
    declare
      cursor sel_cur1 is
        with X as (select * from table(get_tab_data(1, 'x'))),
             Y as (select * from table(get_tab_data(1, 'y')))
        select X.col1, Y.col2
        from X,Y
        where X.col1 = Y.col1;
    begin
      for rec in sel_cur1 loop
        dbms_output.put_line(rec.col1 || ',' || rec.col2);
      end loop;
    end;
    /

All your subqueries come down to calling a single pipelined function that defines the returned rows.
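Before wiring the function into a large report query, it may help to query it on its own and check which rows each parameter combination returns. A small check, assuming the `my_tab` sample data and the standalone `get_tab_data` function from this answer have been created as shown:

```sql
-- Report 1, subquery X: pipes the sample rows with col1 = 1 and col3 = 'X'
select col1, col2, col3 from table(get_tab_data(1, 'x'));

-- A cursor number the function does not handle falls through every branch,
-- so no rows are piped
select count(*) from table(get_tab_data(3, 'x'));
```

An unhandled cursor number silently produces an empty result instead of an error, which is worth keeping in mind when new reports are added.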

EDIT:

To combine all the necessary types and functions into one procedure, and to use variables for the parameters of the subquery function, here is a further example:

    create or replace procedure my_pipe
    is
      -- define types
      type my_subquery_tab is table of my_tab%rowtype;
      type ref_cur_t is ref cursor;
      v_ref_cur ref_cur_t;

      -- define vars
      v_with_sql varchar2(4000);
      v_main_sql varchar2(32767);
      v_x1 number;
      v_x2 char;
      v_y1 number;
      v_y2 char;
      v_col1 my_tab.col1%type;
      v_col2 my_tab.col2%type;

      -- define local functions/procs
      -- Caveat: SQL, including dynamic SQL, can only call functions visible at
      -- schema or package level, so the get_tab_data referenced in v_with_sql
      -- resolves to the standalone function created earlier, not to this local copy.
      function get_tab_data(p_cur_num in number, p_cur_type in char)
        return my_subquery_tab pipelined
      is
      begin
        if (p_cur_num = 1) then
          if (upper(p_cur_type) = 'X') then
            for rec in (select * from my_tab where col1=1 and col3='X') loop
              pipe row(rec);
            end loop;
          elsif (upper(p_cur_type) = 'Y') then
            for rec in (select * from my_tab where col1=1 and col3='Y') loop
              pipe row(rec);
            end loop;
          else
            return;
          end if;
        elsif (p_cur_num = 2) then
          if (upper(p_cur_type) = 'X') then
            for rec in (select * from my_tab where col1=2 and col3='X') loop
              pipe row(rec);
            end loop;
          elsif (upper(p_cur_type) = 'Y') then
            for rec in (select * from my_tab where col1=2 and col3='Y') loop
              pipe row(rec);
            end loop;
          else
            return;
          end if;
        end if;
        return;
      end;

    BEGIN
      ---------------------------------
      -- Setup SQL for cursors
      ---------------------------------
      -- this will have different parameter values for subqueries
      v_with_sql := q'{
        with X as (select * from table(get_tab_data(:x1, :x2))),
             Y as (select * from table(get_tab_data(:y1, :y2)))
      }';

      -- this will stay the same for all cursors
      v_main_sql := q'{
        select X.col1, Y.col2
        from X,Y
        where X.col1 = Y.col1
      }';

      ---------------------------------
      -- set initial subquery parameters
      ---------------------------------
      v_x1 := 1;
      v_x2 := 'x';
      v_y1 := 1;
      v_y2 := 'y';

      open v_ref_cur for v_with_sql || v_main_sql using v_x1, v_x2, v_y1, v_y2;
      loop
        fetch v_ref_cur into v_col1, v_col2;
        exit when v_ref_cur%notfound;
        dbms_output.put_line(v_col1 || ',' || v_col2);
      end loop;
      close v_ref_cur;

      ---------------------------------
      -- change subquery parameters
      ---------------------------------
      v_x1 := 2;
      v_x2 := 'x';
      v_y1 := 2;
      v_y2 := 'y';

      open v_ref_cur for v_with_sql || v_main_sql using v_x1, v_x2, v_y1, v_y2;
      loop
        fetch v_ref_cur into v_col1, v_col2;
        exit when v_ref_cur%notfound;
        dbms_output.put_line(v_col1 || ',' || v_col2);
      end loop;
      close v_ref_cur;
    end;
    /

Note that the advantage now is that, even if you have many different cursors, you only need to define the main query and the subquery SQL once. After that, you simply change the variables.

Greetings

+4

One possibility you might consider is to use two global temporary tables (GTTs) for X and Y. Then you only need one cursor, but you have to clear and refill the two GTTs several times, and if the data volumes are large you may also want to gather optimizer statistics on the GTTs each time.

Here is what I mean:

    cursor cursor_gtt is
      select count(X.col1), ...
      from GTT_X X inner join GTT_Y Y on...
      group by rollup (X.col1, ...

    begin
      insert into gtt_x select col1, col2 from TAB where col1 = '1';
      insert into gtt_y select col1, col2 from TAB where col2 = '3';
      -- maybe get stats for gtt_x and gtt_y here
      for r in cursor_gtt loop
        print_report_results(r);
      end loop;

      delete gtt_x;
      delete gtt_y;

      insert into gtt_x select col1, col2 from TAB where col1 = '7' and col2 = '9' and col3 = 'TEST';
      insert into gtt_y select col1, col2 from TAB where col3 = '6';
      -- maybe get stats for gtt_x and gtt_y here
      for r in cursor_gtt loop
        print_report_results(r);
      end loop;
      ...
    end;

So, the same 2 GTTs are refilled, and the same cursor is used each time.
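The answer does not show how the two GTTs are defined or how statistics would be gathered, so the following is only a sketch under stated assumptions: the `varchar2(10)` column types are guesses that should be matched to the real columns, and the `dual` inserts are stand-ins for the real X and Y subqueries. Because `DBMS_STATS.GATHER_TABLE_STATS` commits, the tables are created with `on commit preserve rows`; with `on commit delete rows` the freshly inserted data would be gone before the cursor is opened.

```sql
-- Assumed column types; match them to the real columns selected into X and Y
create global temporary table gtt_x (
  col1 varchar2(10),
  col2 varchar2(10)
) on commit preserve rows;

create global temporary table gtt_y (
  col1 varchar2(10),
  col2 varchar2(10)
) on commit preserve rows;

begin
  insert into gtt_x select '1', 'a' from dual;  -- stand-in for the real X subquery
  insert into gtt_y select '1', 'b' from dual;  -- stand-in for the real Y subquery

  -- Optional: gather statistics on the rows this session just inserted
  dbms_stats.gather_table_stats(ownname => user, tabname => 'GTT_X');
  dbms_stats.gather_table_stats(ownname => user, tabname => 'GTT_Y');

  -- The single report cursor would be opened and fetched at this point

  -- Empty both tables before loading the next report's data
  delete from gtt_x;
  delete from gtt_y;
end;
/
```

Gathering statistics on every refill adds its own cost, so it is worth measuring whether the better plans make up for it at the actual data volumes.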

+2

    --Create views that will be replaced by common table expressions later.
    --The column names have to be the same, the actual content doesn't matter.
    create or replace view x as select 'wrong' col1, 'wrong' col2 from dual;
    create or replace view y as select 'wrong' col1, 'wrong' col2 from dual;

    --Put the repetitive logic in one view
    create or replace view main_select as
    select count(x.col1) total, x.col2
    from X inner join Y on x.col1 = y.col1
    group by rollup (x.col1, x.col2);

    --Just querying the view produces the wrong results
    select * from main_select;

    --But when you add the common table expressions X and Y they override
    --the dummy views and produce the real results.
    declare
      cursor cursor_1 is
        with X as (select 'right' col1, 'right' col2 from dual),
             Y as (select 'right' col1, 'right' col2 from dual)
        select total, col2 from main_select;
      --... repeat for each cursor, just replace X and Y as necessary
    begin
      for r in cursor_1 loop
        dbms_output.put_line(r.col2);
      end loop;
      null;
    end;
    /

This solution is a little weirder than the pipelined approach, and it requires 3 new objects (the views), but it will probably run faster because there is less context switching between SQL and PL/SQL.

+2

How about creating a view for the main query? That tidies up your code and centralizes the main reporting query.

+1

Source: https://habr.com/ru/post/888336/
