Numerical optimization of PL / SQL or alternatives

Question

Numerical optimization of PL / SQL or alternatives

We have a requirement to do some computational heavy lifting to connect to an Oracle database. So far, we have performed our numerical calculations in PL / SQL, and to a large extent suffered a lack of performance.

Now we are implementing a multi-tier inventory model ( https://www.researchgate.net/publication/222409130_Evaluation_of_time-varying_availability_in_multi-echelon_spare_parts_systems_with_passivation ). I'm more concerned about performance due to the size of the problem.

I implemented part of the algorithm in three languages: Fortran (90-2008 corresponded to gfortran), VBA in Excel and PL / SQL, and wrapped a test cycle for one million calls around it. Even using the binary_double data binary_double and native compilation using PLSQL_CODE_TYPE=NATIVE (both of which lead to improvement), the test code below still takes 37 seconds to run (Oracle XE 11.2). For comparison, VBA ranks 16 and Fortran 1.6 on the same hardware.

Although it would be too difficult to ask about performance approaching a Fortran figure (although that would obviously be very desirable), I am surprised that even a modest VBA output performs PL / SQL.

So my question has two parts:

Is there anything else I can try on the Oracle side to improve performance?
If we are considering moving from PL / SQL, what alternatives should we consider when interacting with an Oracle database? Heavy lifting will be called relatively rarely in a mode similar to batch, although a performance improvement approaching Fortran speed will allow us to consider introducing some heavy lifting in the interactive parts of the application. My preferred language for numerical work remains Fortran for ease of implementation of masked multidimensional arrays.

Also, although I didn’t immediately after criticizing my source code as such, I would appreciate it if anyone could notice any obvious optimizations that I can enable.

The timeebo function is a test function that I call with a simple select timeebo from dual; in SQL Developer.

 create or replace FUNCTION gammln( x IN binary_double) RETURN binary_double IS --Lanczos' approximation to the Log Gamma function gammln binary_double; ser binary_double; tmp binary_double; BEGIN tmp := x + 5.5; tmp :=(x + 0.5) * Ln(tmp) - tmp; ser := 1.000000000190015; ser := ser + 76.18009172947146 /(x + 1.0) ; ser := ser - 86.50532032941677 /(x + 2.0) ; ser := ser + 24.01409824083091 /(x + 3.0) ; ser := ser - 1.231739572450155 /(x + 4.0) ; ser := ser + 1.208650973866179E-03 /(x + 5.0) ; ser := ser - 5.395239384953E-06 /(x + 6.0) ; RETURN tmp + Ln(2.5066282746310005 * ser / x) ; END; / CREATE OR REPLACE FUNCTION PoissonDist( k IN INTEGER, lambda IN binary_double) RETURN binary_double IS BEGIN RETURN Exp((k * Ln(lambda)) - lambda - gammln(k + 1)) ; END; / CREATE OR REPLACE FUNCTION EBO( stock IN pls_integer, pipeline IN binary_double, accuracy IN binary_double DEFAULT 0.0000000001) RETURN binary_double IS i pls_integer; EBO binary_double; term binary_double; temp binary_double; PoissonVal binary_double; peaked BOOLEAN; --Flag the Poisson curve as having peaked BEGIN EBO := 0.0; IF(pipeline = 0.0) THEN RETURN EBO; END IF; --Initialise i := 1; peaked := false; PoissonVal := PoissonDist(stock + 1, pipeline) ; --Get p() value IF(PoissonVal < accuracy AND floor(pipeline) > stock) THEN --If p() is very -- small... i := floor(pipeline) - stock; --Revise i to just below peak of Poisson curve PoissonVal := PoissonDist(stock + i, pipeline) ; --Get p() value close to -- peak temp := PoissonVal *(pipeline / CAST(stock + i + 1 AS binary_double)) ; -- -- Store poisson value just above peak LOOP term := CAST(i AS binary_double) * PoissonVal; EBO := EBO + term; i := i - 1; --Work backwards PoissonVal := PoissonVal *(CAST(stock + i + 1 AS DOUBLE PRECISION) / pipeline) ; --Revise Poisson -- value for next time EXIT WHEN(term < accuracy OR i = 0) ; END LOOP; i := 1 + floor(pipeline) - stock; PoissonVal := temp; peaked := true; END IF; LOOP term := CAST(i AS binary_double) * PoissonVal; EBO := EBO + term; i := i + 1; PoissonVal := PoissonVal *(pipeline / CAST(stock + i AS binary_double)) ; --Revise Poisson value for next time IF(CAST(stock + i AS binary_double) > pipeline) THEN peaked := true; END IF; EXIT WHEN(term < accuracy AND peaked) ; END LOOP; IF(EBO < accuracy) THEN EBO := 0.0; END IF; RETURN EBO; END; / CREATE OR REPLACE FUNCTION timeebo RETURN binary_double IS i pls_integer; EBOVal binary_double; acc binary_double; BEGIN acc := 0.0; FOR i IN 1..1000000 LOOP EBOVal := EBO(500, CAST(i AS binary_double) / 1000.0) ; acc := acc + EBOVal; END LOOP; RETURN acc; END;

+6

performance optimization oracle plsql numerical

Matt wenham Oct 22 '16 at 15:46

source share

2 answers

mathguy · Answer 1 · 2016-10-22T23:20:27+0000

This is not an answer to the OP question (but it is a “proof of concept” to show how something like what it does in PL / SQL can be done in plain SQL). I use only the response format, because what I will do below does not fit into the comment.

OP asked to see how summing an infinite series to the right degree of accuracy can be done in pure SQL. I will give two examples.

First example :

Infinite series with positive terms: e = sum [ j = 0 to infinity ] ( 1 / factorial strong> (j) ) . The trivial upper bound of the error for the sum from 0 to n is the last term added to the row. The recursive query below (which requires Oracle 11.1 or higher - actually 11.2, as I wrote it, with the column names in the declaration, but it is easy to change it for 11.1) calculates the e value accurate to 38 decimal places (the maximum precision available in Oracle). The inverse factorial series for e converges very quickly; it takes only 35 steps and works in less than 0.001 seconds on my old home computer (which is just a big Dell tablet with a keyboard).

Edit : Doo! Only I can write something where e = 3.71828! Despite the fact that in the recursive query I add ALL terms (including 1/0!), I started the sum with 1 instead of 0. (Fixed now, but had this error before the correction.)

 with rec ( j, s, next_term, err ) as ( select 0, 0, 1, 2 from dual union all select j+1, s + next_term, next_term/(j+1), next_term from rec where err > power(10, -38) and j < 1000000000 ) select max(j) as steps, round(max(s), 38) as e from rec ; STEPS E ----- ---------------------------------------- 35 2.71828182845904523536028747135266249776

Second example :

OK, so now we take an alternating series (where the absolute value of the last term is always the upper limit of the error), and let it take a very slow approach:

ln ( 2 ) = sum [ j = 1 to infinity ] ( strong> (-1) ^ (j - 1) / j )

The query below computes ln (2), accurate to five decimal places; here we know in advance that we need 100,000 steps, and the calculations on my machine took about 1.1 seconds. (Remember, this is a very slowly converging series.)

 with rec ( j, s, sgn ) as ( select 0, 0, 1 from dual union all select j+1, s + sgn / (j+1), -sgn from rec where j <= 100000 ) select 100000 as steps, round(s, 5) as ln_2 from rec where j = 100000 ; STEPS LN_2 ------ ------- 100000 0.69314

Matt wenham · Answer 2 · 2016-10-24T12:03:29+0000

To summarize my conclusions based on the suggestions in the comments above:

The elimination of the casting had the greatest impact on the proposed refinement. Executing all binary_double causes the code to run about 3.5 times faster and more than ten seconds for hair. From here, using simple_double and pragma_inline , reduce this to about 9.8 s.

The current version of my code with the above changes is as follows. The PoissonDist function can be optimized additionally, but the original question was about how to make existing code faster, rather than optimizing the code itself.

EDIT: The code has been revised with the proper use of pragma_inline , which reduced execution time to 9.5 s.

 create or replace FUNCTION gammln( x IN simple_double) RETURN simple_double IS --Lanczos' approximation to the Log Gamma function ser simple_double := 1.000000000190015d; tmp simple_double := x + 5.5d; BEGIN tmp :=(x + 0.5d) * Ln(tmp) - tmp; ser := ser + 76.18009172947146d /(x + 1.0d) ; ser := ser - 86.50532032941677d /(x + 2.0d) ; ser := ser + 24.01409824083091d /(x + 3.0d) ; ser := ser - 1.231739572450155d /(x + 4.0d) ; ser := ser + 1.208650973866179E-03d /(x + 5.0d) ; ser := ser - 5.395239384953E-06d /(x + 6.0d) ; RETURN tmp + Ln(2.5066282746310005d * ser / x) ; END; / create or replace FUNCTION PoissonDist( k IN simple_double, lambda IN simple_double) RETURN simple_double IS BEGIN PRAGMA INLINE (gammln, 'YES'); RETURN Exp((k * Ln(lambda)) - lambda - gammln(k + 1d)) ; END; / CREATE OR REPLACE FUNCTION EBO( stock IN simple_double, pipeline IN simple_double, accuracy IN simple_double DEFAULT 0.0000000001) RETURN simple_double IS i simple_double := 1d; EBO simple_double := 0d; term simple_double := 0d; temp simple_double := 0d; PRAGMA INLINE(PoissonDist, 'YES') ; PoissonVal simple_double := PoissonDist(stock + 1d, pipeline) ; peaked BOOLEAN := false; --Flag the Poisson curve as having peaked BEGIN IF(pipeline = 0.0d) THEN RETURN EBO; END IF; IF(PoissonVal < accuracy AND floor(pipeline) > stock) THEN --If p() is very -- small... i := floor(pipeline) - stock; --Revise i to just below peak of Poisson curve PRAGMA INLINE(PoissonDist, 'YES') ; PoissonVal := PoissonDist(stock + i, pipeline) ; --Get p() value close to -- peak temp := PoissonVal *(pipeline /(stock + i + 1d)) ; -- -- Store poisson value just above peak LOOP term := i * PoissonVal; EBO := EBO + term; i := i - 1d; --Work backwards PoissonVal := PoissonVal *((stock + i + 1) / pipeline) ; --Revise Poisson -- value for next time EXIT WHEN(term < accuracy OR i = 0d) ; END LOOP; i := 1d + floor(pipeline) - stock; PoissonVal := temp; peaked := true; END IF; LOOP term := i * PoissonVal; EBO := EBO + term; i := i + 1d; PoissonVal := PoissonVal *(pipeline /(stock + i)) ; --Revise Poisson value -- for next time IF((stock + i) > pipeline) THEN peaked := true; END IF; EXIT WHEN(term < accuracy AND peaked) ; END LOOP; IF(EBO < accuracy) THEN EBO := 0.0d; END IF; RETURN EBO; END; / create or replace FUNCTION timeebo RETURN binary_double IS i binary_double; EBOVal binary_double; acc binary_double; BEGIN acc := 0.0d; FOR i IN 1d..1000000d LOOP PRAGMA INLINE (EBO, 'YES'); EBOVal := EBO(500d, i / 1000d) ; acc := acc + EBOVal; END LOOP; RETURN acc; END;

Numerical optimization of PL / SQL or alternatives

More articles: