A quick way to find the smallest positive true root of a quaternary polynomial of 4 degrees in python

[What I want] is to find the only one least positive real root of the quadratic function a x ^ 4 + b x ^ 3 + c x ^ 2 + d x + e

[Existing method] My equation for collision prediction, the maximum degree is the quartic function as f (x) = a x ^ 4 + b x ^ 3 + c x ^ 2 + d x + e and a, b, c, d, e can be positive / negative / zero ( real float value ). So my function f (x) can be quartic, cubic or quadratic depending on the input coefficient a, b, c, d, e.

I am currently using numpy to search for root as shown below.

import numpy

root_output = numpy.roots ([a, b, c, d, e])

The " root_output " from the numpy module can be all possible real / complex roots depending on the input factor. So I have to look one by one at “ root_output ” and check which root is the smallest real positive value (root> 0?)

[Problem] My program needs to execute numpy.roots ([a, b, c, d, e]) many times, so many times the execution of numpy.roots is too slow for my project. and (a, b, c, d, e) always changes every time numpy.roots is executed

My attempt is to run the code on the Raspberry Pi2. The following is an example of processing time.

  • Running many times numpy.roots on PC: 1.2 seconds
  • Runs many times numpy.roots on raspberry Pi2: 17 seconds

Could you advise me to find the smallest positive real root in the quickest solution ? Using scipy.optimize or implementing some algorithm to speed up the search for the root or any advice from you will be great.

Thanks.

[Decision]

  • A quadratic function needs only real positive roots (please remember division by zero)

    def SolvQuadratic(a, b ,c): d = (b**2) - (4*a*c) if d < 0: return [] if d > 0: square_root_d = math.sqrt(d) t1 = (-b + square_root_d) / (2 * a) t2 = (-b - square_root_d) / (2 * a) if t1 > 0: if t2 > 0: if t1 < t2: return [t1, t2] return [t2, t1] return [t1] elif t2 > 0: return [t2] else: return [] else: t = -b / (2*a) if t > 0: return [t] return [] 
  • Quartic Function for apartment function, you can use pure python / numba version as below answer from @BM . I also add another version of cython from BB code. You can use the code below as a .pyx file and then compile it to get about 2 times faster than pure python (note the rounding problems).

     import cmath cdef extern from "complex.h": double complex cexp(double complex) cdef double complex J=cexp(2j*cmath.pi/3) cdef double complex Jc=1/J cdef Cardano(double a, double b, double c, double d): cdef double z0 cdef double a2, b2 cdef double p ,q, D cdef double complex r cdef double complex u, v, w cdef double w0, w1, w2 cdef double complex r1, r2, r3 z0=b/3/a a2,b2 = a*a,b*b p=-b2/3/a2 +c/a q=(b/27*(2*b2/a2-9*c/a)+d)/a D=-4*p*p*p-27*q*q r=cmath.sqrt(-D/27+0j) u=((-qr)/2)**0.33333333333333333333333 v=((-q+r)/2)**0.33333333333333333333333 w=u*v w0=abs(w+p/3) w1=abs(w*J+p/3) w2=abs(w*Jc+p/3) if w0<w1: if w2<w0 : v = v*Jc elif w2<w1 : v = v*Jc else: v = v*J r1 = u+v-z0 r2 = u*J+v*Jc-z0 r3 = u*Jc+v*J-z0 return r1, r2, r3 cdef Roots_2(double a, double complex b, double complex c): cdef double complex bp cdef double complex delta cdef double complex r1, r2 bp=b/2 delta=bp*bp-a*c r1=(-bp-delta**.5)/a r2=-r1-b/a return r1, r2 def SolveQuartic(double a, double b, double c, double d, double e): "Ferrarai Method" "resolution of P=ax^4+bx^3+cx^2+dx+e=0, coeffs reals" "First shift : x= zb/4/a => P=z^4+pz^2+qz+r" cdef double z0 cdef double a2, b2, c2, d2 cdef double p, q, r cdef double A, B, C, D cdef double complex y0, y1, y2 cdef double complex a0, b0 cdef double complex r0, r1, r2, r3 z0=b/4.0/a a2,b2,c2,d2 = a*a,b*b,c*c,d*d p = -3.0*b2/(8*a2)+c/a q = b*b2/8.0/a/a2 - 1.0/2*b*c/a2 + d/a r = -3.0/256*b2*b2/a2/a2 + c*b2/a2/a/16 - b*d/a2/4+e/a "Second find y so P2=Ay^3+By^2+Cy+D=0" A=8.0 B=-4*p C=-8*r D=4*r*pq*q y0,y1,y2=Cardano(A,B,C,D) if abs(y1.imag)<abs(y0.imag): y0=y1 if abs(y2.imag)<abs(y0.imag): y0=y2 a0=(-p+2*y0)**.5 if a0==0 : b0=y0**2-r else : b0=-q/2/a0 r0,r1=Roots_2(1,a0,y0+b0) r2,r3=Roots_2(1,-a0,y0-b0) return (r0-z0,r1-z0,r2-z0,r3-z0) 

[The problem of the Ferrari method] We encountered a problem when the coefficients of the apartment equation are [0.00614656, -0.0933333333333, 0.527664995846, -1.31617928376, 1.21906444869] the output from numpy.roots and ferrari methods are completely different (numpy.roots is the correct conclusion).

  import numpy as np import cmath J=cmath.exp(2j*cmath.pi/3) Jc=1/J def ferrari(a,b,c,d,e): "Ferrarai Method" "resolution of P=ax^4+bx^3+cx^2+dx+e=0, coeffs reals" "First shift : x= zb/4/a => P=z^4+pz^2+qz+r" z0=b/4/a a2,b2,c2,d2 = a*a,b*b,c*c,d*d p = -3*b2/(8*a2)+c/a q = b*b2/8/a/a2 - 1/2*b*c/a2 + d/a r = -3/256*b2*b2/a2/a2 +c*b2/a2/a/16-b*d/a2/4+e/a "Second find y so P2=Ay^3+By^2+Cy+D=0" A=8 B=-4*p C=-8*r D=4*r*pq*q y0,y1,y2=Cardano(A,B,C,D) if abs(y1.imag)<abs(y0.imag): y0=y1 if abs(y2.imag)<abs(y0.imag): y0=y2 a0=(-p+2*y0)**.5 if a0==0 : b0=y0**2-r else : b0=-q/2/a0 r0,r1=Roots_2(1,a0,y0+b0) r2,r3=Roots_2(1,-a0,y0-b0) return (r0-z0,r1-z0,r2-z0,r3-z0) #~ @jit(nopython=True) def Cardano(a,b,c,d): z0=b/3/a a2,b2 = a*a,b*b p=-b2/3/a2 +c/a q=(b/27*(2*b2/a2-9*c/a)+d)/a D=-4*p*p*p-27*q*q r=cmath.sqrt(-D/27+0j) u=((-qr)/2)**0.33333333333333333333333 v=((-q+r)/2)**0.33333333333333333333333 w=u*v w0=abs(w+p/3) w1=abs(w*J+p/3) w2=abs(w*Jc+p/3) if w0<w1: if w2<w0 : v*=Jc elif w2<w1 : v*=Jc else: v*=J return u+v-z0, u*J+v*Jc-z0, u*Jc+v*J-z0 #~ @jit(nopython=True) def Roots_2(a,b,c): bp=b/2 delta=bp*bp-a*c r1=(-bp-delta**.5)/a r2=-r1-b/a return r1,r2 coef = [0.00614656, -0.0933333333333, 0.527664995846, -1.31617928376, 1.21906444869] print("Coefficient A, B, C, D, E", coef) print("") print("numpy roots: ", np.roots(coef)) print("") print("ferrari python ", ferrari(*coef)) 
+5
source share
3 answers

Another answer:

do this with analytic methods ( Ferrari , Cardan ), and code speed using Just in Time ( Numba ) compilation:

First, look at the improvement:

 In [2]: P=poly1d([1,2,3,4],True) In [3]: roots(P) Out[3]: array([ 4., 3., 2., 1.]) In [4]: %timeit roots(P) 1000 loops, best of 3: 465 µs per loop In [5]: ferrari(*P.coeffs) Out[5]: ((1+0j), (2-0j), (3+0j), (4-0j)) In [5]: %timeit ferrari(*P.coeffs) #pure python without jit 10000 loops, best of 3: 116 µs per loop In [6]: %timeit ferrari(*P.coeffs) # with numba.jit 100000 loops, best of 3: 13 µs per loop 

Then the ugly code:

for order 4:

 @jit(nopython=True) def ferrari(a,b,c,d,e): "resolution of P=ax^4+bx^3+cx^2+dx+e=0" "CN all coeffs real." "First shift : x= zb/4/a => P=z^4+pz^2+qz+r" z0=b/4/a a2,b2,c2,d2 = a*a,b*b,c*c,d*dp = -3*b2/(8*a2)+c/a q = b*b2/8/a/a2 - 1/2*b*c/a2 + d/a r = -3/256*b2*b2/a2/a2 +c*b2/a2/a/16-b*d/a2/4+e/a "Second find X so P2=AX^3+BX^2+C^X+D=0" A=8 B=-4*p C=-8*r D=4*r*pq*q y0,y1,y2=cardan(A,B,C,D) if abs(y1.imag)<abs(y0.imag): y0=y1 if abs(y2.imag)<abs(y0.imag): y0=y2 a0=(-p+2*y0.real)**.5 if a0==0 : b0=y0**2-r else : b0=-q/2/a0 r0,r1=roots2(1,a0,y0+b0) r2,r3=roots2(1,-a0,y0-b0) return (r0-z0,r1-z0,r2-z0,r3-z0) 

for order 3:

 J=exp(2j*pi/3) Jc=1/J @jit(nopython=True) def cardan(a,b,c,d): u=empty(2,complex128) z0=b/3/a a2,b2 = a*a,b*bp=-b2/3/a2 +c/a q=(b/27*(2*b2/a2-9*c/a)+d)/a D=-4*p*p*p-27*q*q r=sqrt(-D/27+0j) u=((-qr)/2)**0.33333333333333333333333 v=((-q+r)/2)**0.33333333333333333333333 w=u*v w0=abs(w+p/3) w1=abs(w*J+p/3) w2=abs(w*Jc+p/3) if w0<w1: if w2<w0 : v*=Jc elif w2<w1 : v*=Jc else: v*=J return u+v-z0, u*J+v*Jc-z0,u*Jc+v*J-z0 

for order 2:

 @jit(nopython=True) def roots2(a,b,c): bp=b/2 delta=bp*bp-a*c u1=(-bp-delta**.5)/a u2=-u1-b/a return u1,u2 

There should probably be a test, but effective.

+3
source

numpy solution for this without a loop:

 p=array([a,b,c,d,e]) r=roots(p) r[(r.imag==0) & (r.real>=0) ].real.min() 

scipy.optimize methods will be slower if you don't need precision:

 In [586]: %timeit r=roots(p);r[(r.imag==0) & (r.real>=0) ].real.min() 1000 loops, best of 3: 334 µs per loop In [587]: %timeit newton(poly1d(p),10,tol=1e-8) 1000 loops, best of 3: 555 µs per loop In [588]: %timeit newton(poly1d(p),10,tol=1) 10000 loops, best of 3: 177 µs per loop 

And you have to find the min ...

EDIT

for a 2x coefficient, do it yourself, which roots do:

 In [638]: b=zeros((4,4),float);b[1:,:-1]=eye(3) In [639]: c=b.copy();c[0]=-(p/p[0])[1:];eig(c)[0] Out[639]: array([-7.40849430+0.j , 5.77969794+0.j , -0.18560182+3.48995646j, -0.18560182-3.48995646j]) In [640]: roots(p) Out[640]: array([-7.40849430+0.j , 5.77969794+0.j , -0.18560182+3.48995646j, -0.18560182-3.48995646j]) In [641]: %timeit roots(p) 1000 loops, best of 3: 365 µs per loop In [642]: %timeit c=b.copy();c[0]=-(p/p[0])[1:];eig(c) 10000 loops, best of 3: 181 µs per loop 
+2
source

If the polynomial coefficients are known in advance, you can speed up, vectorize the calculation in the roots (given that Numpy> = 1.10 or so):

 import numpy as np def roots_vec(p): p = np.atleast_1d(p) n = p.shape[-1] A = np.zeros(p.shape[:1] + (n-1, n-1), float) A[...,1:,:-1] = np.eye(n-2) A[...,0,:] = -p[...,1:]/p[...,None,0] return np.linalg.eigvals(A) def roots_loop(p): r = [] for pp in p: r.append(np.roots(pp)) return r p = np.random.rand(2000, 4) # 2000 polynomials of 4th order assert np.allclose(roots_vec(p), roots_loop(p)) In [35]: %timeit roots_vec(p) 100 loops, best of 3: 4.49 ms per loop In [36]: %timeit roots_loop(p) 10 loops, best of 3: 81.9 ms per loop 

This may not beat the analytic solution + Numba, but allows higher orders.

+1
source

Source: https://habr.com/ru/post/1244425/


All Articles