Optimizing Oracle Stored Procedures

Recently I was tasked with optimizing some existing Oracle stored procedures. Each of the stored procedures queries the database and generates output as an XML file. One in particular took about 20 minutes to run. Looking into it, I found several nested loops and unnecessary queries. For example, instead of executing

SELECT *
  FROM Employee e, Department d
 WHERE e.DEPT_ID = d.ID
-- write data from query to XML

it was more like

 FOR emp_rec IN ( SELECT * FROM employee ) LOOP
   SELECT * FROM Department WHERE id = emp_rec.DEPT_ID;
   -- write data from query to XML
 END LOOP;

Changing all these cases to look more like the first option sped the procedure up dramatically. My question is: why? Why is a join inside a single SELECT faster than joining the tables manually in a loop? What are the underlying mechanics?
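For reference, the rewritten cases ended up looking roughly like the sketch below. The table and column names come from the snippets above; write_employee_xml is just a hypothetical placeholder for whatever the real procedure does to turn each row into XML.

 FOR rec IN ( SELECT e.*, d.ID AS department_id  -- pick whichever Department columns you need
                FROM Employee e
                JOIN Department d ON d.ID = e.DEPT_ID )
 LOOP
   write_employee_xml(rec);  -- placeholder for the XML-generation logic
 END LOOP;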

+4
2 answers

Consider how the original version gets processed.

 FOR emp_rec IN ( SELECT * FROM employee ) LOOP
   SELECT * FROM Department WHERE id = emp_rec.DEPT_ID;
   -- write data from query to XML
 END LOOP;

The query driving the loop will most likely do a full table scan of employee. Then, for each row returned, it executes the inner query. Assuming id is the primary key of department, each execution of that query will probably do a unique lookup via the primary key index.
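You can check this with EXPLAIN PLAN. This is only a sketch using the object names from the question; the actual plans depend on your schema and statistics.

 EXPLAIN PLAN FOR SELECT * FROM employee;
 SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);   -- expect TABLE ACCESS FULL on EMPLOYEE

 EXPLAIN PLAN FOR SELECT * FROM Department WHERE id = :dept_id;
 SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);   -- expect INDEX UNIQUE SCAN on the PK index,
                                            -- executed once per loop iteration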

Sounds great, right? A unique index lookup is usually the fastest way to fetch a single row (apart from an explicit lookup by ROWID). But think about what happens across many iterations of the loop. Presumably every employee belongs to a department, every department has employees, and most or all departments have several employees.

So, over many iterations of the loop, you repeat the same work for the inner query again and again. Yes, the data blocks will be cached, so you should not incur repeated physical reads, but even accessing data in the cache carries CPU overhead, and that overhead becomes very significant when the same blocks are hit over and over.

Also, in the end you probably need every row of department at least once, and possibly many times. Since every block of the table has to be read anyway, the index lookups are not saving work; they are adding work.
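One way to see that extra work, assuming you have access to the V$ views, is to snapshot your session's I/O statistics before and after running each version of the procedure and compare the deltas:

 -- Requires SELECT on V$MYSTAT and V$STATNAME.
 SELECT sn.name, ms.value
   FROM v$mystat ms
   JOIN v$statname sn ON sn.statistic# = ms.statistic#
  WHERE sn.name IN ('consistent gets', 'physical reads');

The loop version will typically show far more consistent gets (logical I/O) for the same output.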

When you rewrite the loop as a single query, the optimizer can take all of this into account. One of its options is a nested loops join driven by employee, which would be essentially the same as the explicit PL/SQL loop (minus the context switching, as Mark pointed out). However, given the relationship between the two tables and the absence of any filter predicates, the optimizer can decide that it is more efficient to simply full-scan both tables and do a merge join or hash join. That actually results in fewer physical I/Os (assuming the cache is cold at the start of each run) and far fewer logical I/Os.
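To see what the optimizer actually chooses for the single-statement version, and how that compares to the nested-loops shape the PL/SQL loop hard-codes, you can explain both. This is a sketch; the hint in the second statement is only there for the comparison, not something to leave in production code.

 -- Unhinted: with no filter predicates, expect full scans feeding a HASH JOIN (or merge join).
 EXPLAIN PLAN FOR
   SELECT * FROM employee e JOIN Department d ON d.ID = e.DEPT_ID;
 SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

 -- Forced nested loops, i.e. what the row-by-row loop effectively does; compare the plan costs.
 EXPLAIN PLAN FOR
   SELECT /*+ LEADING(e) USE_NL(d) */ *
     FROM employee e JOIN Department d ON d.ID = e.DEPT_ID;
 SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);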

+5

"Core processes" require a great response. I will leave Tom Kyte to answer this question;)

+4

Source: https://habr.com/ru/post/1394612/

