Using lambda functions in the RK4 algorithm

There are two ways to implement the classic Runge-Kutta (RK4) scheme in Python, presented here: the first uses lambda functions, the second does without them.
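The question's original code is not reproduced here, but a minimal sketch of the two styles being compared (my own names, using the same test equation dy/dx = x·√y as the answers below) might look like this:

```python
import numpy as np

def f(x, y):
    # sample right-hand side: dy/dx = x * sqrt(y)
    return x * np.sqrt(y)

# Style 1: the whole RK4 step as nested lambdas,
# each immediately applied to compute the next slope
rk4_lambda = lambda f, x, y, dx: (
    lambda k1: (
        lambda k2: (
            lambda k3: (
                lambda k4: y + (k1 + 2*k2 + 2*k3 + k4) / 6
            )(dx * f(x + dx, y + k3))
        )(dx * f(x + dx/2, y + k2/2))
    )(dx * f(x + dx/2, y + k1/2))
)(dx * f(x, y))

# Style 2: a plain function with named intermediate slopes
def rk4_plain(f, x, y, dx):
    k1 = dx * f(x, y)
    k2 = dx * f(x + dx/2, y + k1/2)
    k3 = dx * f(x + dx/2, y + k2/2)
    k4 = dx * f(x + dx, y + k3)
    return y + (k1 + 2*k2 + 2*k3 + k4) / 6
```

Both compute the same weighted average of four slopes; the question is purely about the cost of the extra lambda calls.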

Which one will be faster and why exactly?

3 answers

If you preprocess the code with the Coconut transpiler, which implements tail-call optimization, the two versions become effectively equivalent (both as fast as the faster version is without preprocessing), so you can use whichever style suits you best.

```
# Save berna1111's code as rk4.coco; no modifications necessary.
$ coconut --target 3 rk4.coco && python3 rk4.py

         50007 function calls in 0.055 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.097    0.097 <string>:1(<module>)
    40000    0.038    0.000    0.038    0.000 rk4.py:243(f)
        1    0.000    0.000    0.000    0.000 rk4.py:246(RK4)
    10000    0.007    0.000    0.088    0.000 rk4.py:247(<lambda>)
        1    0.010    0.010    0.097    0.097 rk4.py:250(test_RK4)
        1    0.000    0.000    0.097    0.097 {built-in method builtins.exec}
        2    0.000    0.000    0.000    0.000 {built-in method numpy.core.multiarray.empty}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

         50006 function calls in 0.057 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.057    0.057 <string>:1(<module>)
    40000    0.030    0.000    0.030    0.000 rk4.py:243(f)
    10000    0.019    0.000    0.049    0.000 rk4.py:265(rk4_step)
        1    0.007    0.007    0.057    0.057 rk4.py:273(test_rk4)
        1    0.000    0.000    0.057    0.057 {built-in method builtins.exec}
        2    0.000    0.000    0.000    0.000 {built-in method numpy.core.multiarray.empty}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```

I adapted the code from this link and used cProfile to compare the two methods:

```python
import numpy as np
import cProfile as cP


def theory(t):
    return (t**2 + 4.)**2 / 16.


def f(x, y):
    return x * np.sqrt(y)


def RK4(f):
    return lambda t, y, dt: (
        lambda dy1: (
            lambda dy2: (
                lambda dy3: (
                    lambda dy4: (dy1 + 2*dy2 + 2*dy3 + dy4)/6
                )(dt * f(t + dt, y + dy3))
            )(dt * f(t + dt/2, y + dy2/2))
        )(dt * f(t + dt/2, y + dy1/2))
    )(dt * f(t, y))


def test_RK4(dy=f, x0=0., y0=1., x1=10, n=10):
    vx = np.empty(n+1)
    vy = np.empty(n+1)
    dy = RK4(f=dy)
    dx = (x1 - x0) / float(n)
    vx[0] = x = x0
    vy[0] = y = y0
    i = 1
    while i <= n:
        vx[i], vy[i] = x + dx, y + dy(x, y, dx)
        x, y = vx[i], vy[i]
        i += 1
    return vx, vy


def rk4_step(dy, x, y, dx):
    k1 = dx * dy(x, y)
    k2 = dx * dy(x + 0.5 * dx, y + 0.5 * k1)
    k3 = dx * dy(x + 0.5 * dx, y + 0.5 * k2)
    k4 = dx * dy(x + dx, y + k3)
    return x + dx, y + (k1 + k2 + k2 + k3 + k3 + k4) / 6.


def test_rk4(dy=f, x0=0., y0=1., x1=10, n=10):
    vx = np.empty(n+1)
    vy = np.empty(n+1)
    dx = (x1 - x0) / float(n)
    vx[0] = x = x0
    vy[0] = y = y0
    i = 1
    while i <= n:
        vx[i], vy[i] = rk4_step(dy=dy, x=x, y=y, dx=dx)
        x, y = vx[i], vy[i]
        i += 1
    return vx, vy


cP.run("test_RK4(n=10000)")
cP.run("test_rk4(n=10000)")
```

And received:

```
         90006 function calls in 0.095 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.095    0.095 <string>:1(<module>)
    40000    0.036    0.000    0.036    0.000 untitled1.py:13(f)
        1    0.000    0.000    0.000    0.000 untitled1.py:16(RK4)
    10000    0.008    0.000    0.086    0.000 untitled1.py:17(<lambda>)
    10000    0.012    0.000    0.069    0.000 untitled1.py:18(<lambda>)
    10000    0.012    0.000    0.048    0.000 untitled1.py:19(<lambda>)
    10000    0.009    0.000    0.027    0.000 untitled1.py:20(<lambda>)
    10000    0.009    0.000    0.009    0.000 untitled1.py:21(<lambda>)
        1    0.009    0.009    0.095    0.095 untitled1.py:28(test_RK4)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.empty}

         50005 function calls in 0.064 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.064    0.064 <string>:1(<module>)
    40000    0.032    0.000    0.032    0.000 untitled1.py:13(f)
    10000    0.026    0.000    0.058    0.000 untitled1.py:43(rk4_step)
        1    0.006    0.006    0.064    0.064 untitled1.py:51(test_rk4)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.empty}
```

So I would say that the function-call overhead of the lambda implementation makes it slower.
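The point about call overhead can be illustrated with a `timeit` micro-benchmark (my own example, not the answer's code): each extra Python-level call adds a fixed interpreter cost, so a chain of nested lambdas costs more per step than one plain call.

```python
import timeit

def step_plain(x):
    # one function call per invocation
    return x + 1

# same result, but three extra lambda calls per invocation
step_nested = lambda x: (lambda a: (lambda b: (lambda c: c + 1)(b))(a))(x)

t_plain = timeit.timeit("step_plain(1.0)", globals=globals(), number=100_000)
t_nested = timeit.timeit("step_nested(1.0)", globals=globals(), number=100_000)
print(f"plain:  {t_plain:.4f}s")
print(f"nested: {t_nested:.4f}s")  # typically slower, purely from the extra calls
```

The bodies do identical arithmetic, so any measured difference is the per-call overhead the profiles above attribute to the `<lambda>` lines.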

However, beware that I seem to have lost some accuracy somewhere: the results, although they agree with each other, are larger than those given in the example:

```
>>> vx, vy = test_rk4()
>>> vy
array([   1.        ,    1.56110667,    3.99324757,  ...,  288.78174798,
        451.27952013,  675.64427775])
>>> vx, vy = test_RK4()
>>> vy
array([   1.        ,    1.56110667,    3.99324757,  ...,  288.78174798,
        451.27952013,  675.64427775])
```
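Since the code above defines the exact solution as `theory(t)`, the accuracy question can be checked directly; a short self-contained sketch (restating the answer's `f`, `theory` and `rk4_step`):

```python
import numpy as np

def theory(t):
    # exact solution of dy/dx = x*sqrt(y), y(0) = 1
    return (t**2 + 4.)**2 / 16.

def f(x, y):
    return x * np.sqrt(y)

def rk4_step(dy, x, y, dx):
    k1 = dx * dy(x, y)
    k2 = dx * dy(x + 0.5 * dx, y + 0.5 * k1)
    k3 = dx * dy(x + 0.5 * dx, y + 0.5 * k2)
    k4 = dx * dy(x + dx, y + k3)
    return x + dx, y + (k1 + 2*k2 + 2*k3 + k4) / 6.

x, y = 0., 1.
for _ in range(10):       # 10 steps of dx = 1 to reach x = 10, as in the answer
    x, y = rk4_step(f, x, y, 1.)
print(y, theory(10))      # ~675.644 vs exact 676.0
```

So the value 675.64427775 is simply the RK4 truncation error at this coarse step size (dx = 1), not a bug; refining the grid shrinks the error roughly as O(dx**4).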

Both @berna1111 and @matt2000 are correct. Because of the extra function calls, the lambda version carries additional overhead. Tail-call optimization converts tail calls into a while loop (that is, it automatically turns the lambda version into the while version), eliminating the function-call overhead.
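What tail-call elimination does can be sketched in plain Python (my own toy example, with a simplified Euler `step` standing in for a full RK4 step to keep it short):

```python
def step(x, y, dx):
    # stand-in for one integration step (Euler here, just for the sketch)
    return x + dx, y + dx * x

def integrate_rec(x, y, dx, n):
    # tail-recursive: the recursive call is the function's final action
    if n == 0:
        return x, y
    x, y = step(x, y, dx)
    return integrate_rec(x, y, dx, n - 1)

def integrate_loop(x, y, dx, n):
    # what tail-call optimization produces: the same control flow as a loop,
    # with no new call frame per step
    while n > 0:
        x, y = step(x, y, dx)
        n -= 1
    return x, y
```

CPython executes `integrate_rec` with one stack frame per step (and will hit the recursion limit for large `n`); a transpiler like Coconut rewrites it into the `integrate_loop` form mechanically.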

See fooobar.com/questions/48026 / ... for why Python does not perform this optimization automatically, and why you need a tool like Coconut to do the preprocessing.


Source: https://habr.com/ru/post/1013761/

