Recurrence relation in pandas

I have a pandas DataFrame, df, with columns df.A and df.B, and I'm trying to create a third series, df.C, that depends on A and B as well as on its own previous value. That is:

C[0]=A[0]

C[n]=A[n] + B[n]*C[n-1]

What is the most efficient way to do this? Ideally, I would not have to fall back on a for loop.


Edit

This is the desired result for C given A and B. Now you just need to figure out how ...

 import pandas as pd

 a = [ 2,  3, -8, -2,  1]
 b = [ 1,  1,  4,  2,  1]
 c = [ 2,  5, 12, 22, 23]

 df = pd.DataFrame({'A': a, 'B': b, 'C': c})
 df
2 answers

You can vectorize this with awkward cumulative products combined with other vectors, but that will not save you any time. In fact, it is likely to be numerically unstable.
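To make the cumulative-product idea concrete, here is a minimal sketch of one possible closed form, not necessarily what the answer had in mind. Writing P[n] = B[1] * ... * B[n] with P[0] = 1, the recurrence becomes C[n] = P[n] * sum(A[k] / P[k] for k <= n). The function name `recurrence_cumprod` is hypothetical, and the sketch assumes no value in B[1:] is zero. It also shows where the instability comes from: the division fails for zeros and P can overflow or underflow on long inputs.

```python
import numpy as np

def recurrence_cumprod(a, b):
    """Closed form of C[n] = A[n] + B[n] * C[n-1] with C[0] = A[0].

    Hypothetical helper for illustration; assumes b[1:] contains no zeros.
    """
    a = np.asarray(a, dtype=float)
    p = np.asarray(b, dtype=float).copy()
    p[0] = 1.0            # B[0] never enters the recurrence
    p = np.cumprod(p)     # P[n] = B[1] * ... * B[n]
    # Dividing by P and multiplying back is the numerically fragile step.
    return p * np.cumsum(a / p)

a = [2, 3, -8, -2, 1]
b = [1, 1, 4, 2, 1]
c = recurrence_cumprod(a, b)
print(c)
```

For the sample data in the question this reproduces 2, 5, 12, 22, 23 as floats.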

Instead, you can use numba to speed up the loop.

 from numba import njit
 import numpy as np
 import pandas as pd

 @njit
 def dynamic_alpha(a, b):
     # Start from a copy of A so that c[0] = a[0]
     c = a.copy()
     for i in range(1, len(a)):
         c[i] = a[i] + b[i] * c[i - 1]
     return c

 df.assign(C=dynamic_alpha(df.A.values, df.B.values))

    A  B   C
 0  2  1   2
 1  3  1   5
 2 -8  4  12
 3 -2  2  22
 4  1  1  23

For this simple calculation, it is about as fast as a simple vectorized expression such as:

 df.assign(C=np.arange(len(df)) ** 2 + 2) 

 df = pd.concat([df] * 10000)
 %timeit df.assign(C=dynamic_alpha(df.A.values, df.B.values))
 %timeit df.assign(C=np.arange(len(df)) ** 2 + 2)

 337 µs ± 5.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
 333 µs ± 20.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

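Try the following (Python 3.8+, since it uses an assignment expression to carry the previous result forward):

 c_prev = A[0]
 C = [c_prev] + [c_prev := A[i] + B[i] * c_prev for i in range(1, len(A))]

This may be somewhat faster than an explicit for loop, but it still iterates in Python.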
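The same idea of carrying the previous result forward can also be sketched with `itertools.accumulate` from the standard library. This is an illustrative alternative, not the answer's own code. The function name `recurrence_accumulate` is hypothetical, the inputs are assumed to be non-empty plain lists of equal length, and the `initial` argument requires Python 3.8 or later.

```python
from itertools import accumulate

def recurrence_accumulate(a, b):
    """Compute C[0] = A[0], C[n] = A[n] + B[n] * C[n-1] in pure Python.

    Hypothetical helper for illustration; assumes non-empty lists.
    """
    pairs = zip(a[1:], b[1:])  # (A[n], B[n]) for n >= 1
    # Each step receives the previous C value and the next (A[n], B[n]) pair.
    return list(
        accumulate(pairs, lambda c_prev, ab: ab[0] + ab[1] * c_prev, initial=a[0])
    )

a = [2, 3, -8, -2, 1]
b = [1, 1, 4, 2, 1]
c = recurrence_accumulate(a, b)
print(c)
```

This still runs one Python-level step per element, so it should not be expected to compete with the compiled loop.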


Source: https://habr.com/ru/post/1275940/

