How to use `apply()` or another vectorized approach when the previous value matters

Suppose I have a DataFrame of the following form, where the first column holds random numbers and each remaining column is computed from the value in the previous column.

[image: the input DataFrame]

For simplicity, let's say I want each number to be the square of the number in the previous column. The result would then look as follows.

[image: the desired output DataFrame]

I know that I can write a fairly simple loop for this, but I also know that loops are not the most efficient approach in Python / pandas. How can this be done with apply() or rolling_apply()? Or is there a more efficient way?

My unsuccessful attempts:

In [12]: a = pandas.DataFrame({0:[1,2,3,4,5],1:0,2:0,3:0})

In [13]: a
Out[13]: 
   0  1  2  3
0  1  0  0  0
1  2  0  0  0
2  3  0  0  0
3  4  0  0  0
4  5  0  0  0

In [14]: a = a.apply(lambda x: x**2)

In [15]: a
Out[15]: 
    0  1  2  3
0   1  0  0  0
1   4  0  0  0
2   9  0  0  0
3  16  0  0  0
4  25  0  0  0


In [16]: a = pandas.DataFrame({0:[1,2,3,4,5],1:0,2:0,3:0})

In [17]: pandas.rolling_apply(a,1,lambda x: x**2)
C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\spyderlib\widgets\externalshell\start_ipython_kernel.py:1: FutureWarning: pd.rolling_apply is deprecated for DataFrame and will be removed in a future version, replace with 
        DataFrame.rolling(center=False,window=1).apply(args=<tuple>,kwargs=<dict>,func=<function>)
  # -*- coding: utf-8 -*-
Out[17]: 
      0    1    2    3
0   1.0  0.0  0.0  0.0
1   4.0  0.0  0.0  0.0
2   9.0  0.0  0.0  0.0
3  16.0  0.0  0.0  0.0
4  25.0  0.0  0.0  0.0

In [18]: a = pandas.DataFrame({0:[1,2,3,4,5],1:0,2:0,3:0})

In [19]: a = a[:-1]**2

In [20]: a
Out[20]: 
    0  1  2  3
0   1  0  0  0
1   4  0  0  0
2   9  0  0  0
3  16  0  0  0

In [21]: 

So, my problem is mainly how to refer to the previous column value in my DataFrame calculations.


One option is a short loop over the columns, applying the function to the previous column each time:

a = pd.DataFrame({0:[1,2,3,4,5],1:0,2:0,3:0})

for i in range(3):
    a[i+1] = a[i].apply(lambda x: x**2)
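
The per-element `apply` inside the loop is not strictly needed: squaring already works on a whole column at once, so only the loop over columns remains. A minimal sketch, assuming the same DataFrame `a` as in the question; `fill_from_previous` and its `func` parameter are hypothetical names introduced here for illustration, not part of pandas:

```python
import pandas as pd


def fill_from_previous(frame, func):
    """Fill every column after the first by applying func to the column before it.

    Hypothetical helper (not a pandas API); func must accept and return a Series.
    """
    out = frame.copy()  # leave the caller's frame untouched
    cols = list(out.columns)
    for prev, cur in zip(cols, cols[1:]):
        out[cur] = func(out[prev])  # one vectorized call per column
    return out


a = pd.DataFrame({0: [1, 2, 3, 4, 5], 1: 0, 2: 0, 3: 0})
result = fill_from_previous(a, lambda col: col ** 2)
print(result)
```

The loop still runs once per column, but each step operates on the full column rather than calling a Python function for every cell, which is usually the more expensive part.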


a[1] = a[0].apply(lambda x: x**2)
a[2] = a[1].apply(lambda x: x**2)
a[3] = a[2].apply(lambda x: x**2)

   0   1    2       3
0  1   1    1       1
1  2   4   16     256
2  3   9   81    6561
3  4  16  256   65536
4  5  25  625  390625

  • Column 0 is whatever is in column 0, to the power 1
  • Column 1 is whatever is in column 0, to the power 2
  • Column 2 is whatever is in column 1, to the power 2...
    • or whatever is in column 0, to the power 4
  • Column 3 is whatever is in column 2, to the power 2...
    • or whatever is in column 1, to the power 4...
    • or whatever is in column 0, to the power 8

So we can indeed vectorize your example (here `df` is the DataFrame `a` from the question) with

np.power(df.values[:, [0]], np.power(2, np.arange(4)))

array([[     1,      1,      1,      1],
       [     2,      4,     16,    256],
       [     3,      9,     81,   6561],
       [     4,     16,    256,  65536],
       [     5,     25,    625, 390625]])

Wrap it in a nice DataFrame

pd.DataFrame(
    np.power(df.values[:, [0]], np.power(2, np.arange(4))),
    df.index, df.columns)

   0   1    2       3
0  1   1    1       1
1  2   4   16     256
2  3   9   81    6561
3  4  16  256   65536
4  5  25  625  390625
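
For completeness, here is a self-contained sketch that builds the frame, applies the broadcast power, and checks the result against the loop over columns. It assumes `df` is the question's DataFrame and that the values are small enough for the powers to fit in a 64-bit integer (NumPy integer powers overflow silently). The closed form works only because repeated squaring collapses into a single power; a recurrence without such a closed form would still need the loop.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [1, 2, 3, 4, 5], 1: 0, 2: 0, 3: 0})

# Column k holds column 0 raised to 2**k, so the exponents are [1, 2, 4, 8].
exponents = np.power(2, np.arange(df.shape[1]))

# Request int64 explicitly so the result does not depend on the platform default.
base = df[[0]].to_numpy(dtype=np.int64)  # shape (5, 1), broadcasts against (4,)
vectorized = pd.DataFrame(
    np.power(base, exponents), index=df.index, columns=df.columns
)

# Reference result: square the previous column, one column at a time.
looped = df.copy()
for i in range(df.shape[1] - 1):
    looped[i + 1] = looped[i] ** 2

print(vectorized)
print((vectorized.to_numpy() == looped.to_numpy()).all())
```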

Source: https://habr.com/ru/post/1673041/

