Subtract the first row from all rows in a Pandas DataFrame

I have a pandas dataframe:

import numpy as np
import pandas as pd
a = pd.DataFrame(np.random.rand(5, 6) * 10, index=pd.date_range(start='2005', periods=5, freq='A'))
a.columns = pd.MultiIndex.from_product([('A','B'),('a','b','c')])

I want to subtract the row a['2005'] from a. To do this, I tried:

In [22]:

a - a.ix['2005']

Out[22]:
              A              B
              a    b    c    a    b    c
2005-12-31    0    0    0    0    0    0
2006-12-31  NaN  NaN  NaN  NaN  NaN  NaN
2007-12-31  NaN  NaN  NaN  NaN  NaN  NaN
2008-12-31  NaN  NaN  NaN  NaN  NaN  NaN
2009-12-31  NaN  NaN  NaN  NaN  NaN  NaN

This obviously does not work, because pandas aligns on the index when performing the operation. This works:

In [24]:

pd.DataFrame(a.values - a['2005'].values, index=a.index, columns=a.columns)

Out[24]:
                   A                             B
                   a         b         c         a         b         c
2005-12-31  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
2006-12-31 -3.326761 -7.164628  8.188518 -0.863177  0.519587 -3.281982
2007-12-31  3.529531 -4.719756  8.444488  1.355366  7.468361 -4.023797
2008-12-31  3.139185 -8.420257  1.465101 -2.942519  1.219060 -5.146019
2009-12-31 -3.459710  0.519435 -1.049617 -2.779370  4.792227 -1.922461
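The alignment behaviour behind the NaN rows can be reproduced in isolation. The following is a minimal sketch on a smaller, made-up frame (the names small, one_row and aligned are illustrative and not part of the original post); it uses .loc for the partial-string row selection.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 frame with a yearly DatetimeIndex, just for illustration.
idx = pd.DatetimeIndex(['2005-12-31', '2006-12-31', '2007-12-31'])
small = pd.DataFrame(np.arange(6.0).reshape(3, 2), index=idx, columns=['x', 'y'])

# Selecting by the year string yields a one-row DataFrame, not a Series.
one_row = small.loc['2005']

# DataFrame - DataFrame matches rows by label, so only the 2005 row has a
# partner; every other row becomes NaN.
aligned = small - one_row
print(aligned)
```

Only the row labelled 2005-12-31 exists in both operands, which is why the remaining rows cannot be computed.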

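However, I would rather not build a new DataFrame every time. I tried apply(): a.apply(lambda x: x-a['2005'].values), but it raises ValueError: cannot copy sequence with size 6 to array axis with dimension 5. Is there a simpler way to do this? I also tried sub(), but that did not give the result I want either.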

+4
2

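Pandas aligns on the index by default. When you want Pandas to ignore the index, you have to step outside Pandas and use NumPy. Here, convert the DataFrame a.loc['2005'] to a 1-dimensional NumPy array: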

In [56]: a - a.loc['2005'].values.squeeze()
Out[56]: 
                   A                             B                    
                   a         b         c         a         b         c
2005-12-31  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
2006-12-31  0.325968  1.314776 -0.789328 -0.344669 -2.518857  7.361711
2007-12-31  0.084203  2.234445 -2.838454 -6.176795 -3.645513  8.955443
2008-12-31  3.798700  0.299529  1.303325 -2.770126 -1.284188  3.093806
2009-12-31  1.520930  2.660040  0.846996 -9.437851 -2.886603  6.705391

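squeeze converts the NumPy array obtained from a.loc['2005'] from shape (1, 6) to shape (6,), which allows it to be broadcast (across the rows) as desired.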
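The shape change can be checked directly. This is a small sketch, assuming a frame built like the one in the question (the index is written out explicitly here, and the values are random, so no output is shown). The comparison with a.iloc[0] is an addition for illustration, not part of the answer above: selecting the first row by position gives a Series indexed by the columns, which Pandas also subtracts from every row.

```python
import numpy as np
import pandas as pd

# A frame shaped like the one in the question; the index is spelled out
# explicitly rather than generated from a frequency alias.
idx = pd.DatetimeIndex(['2005-12-31', '2006-12-31', '2007-12-31',
                        '2008-12-31', '2009-12-31'])
cols = pd.MultiIndex.from_product([('A', 'B'), ('a', 'b', 'c')])
a = pd.DataFrame(np.random.rand(5, 6) * 10, index=idx, columns=cols)

two_d = a.loc['2005'].values      # one-row selection: shape (1, 6)
one_d = two_d.squeeze()           # length-1 axis dropped: shape (6,)
print(two_d.shape, one_d.shape)

# A 1-D array with one entry per column is subtracted from every row.
result = a - one_d

# Illustrative alternative (an assumption, not from the answer): a Series
# taken with iloc[0] is aligned on the columns and gives the same values.
alternative = a - a.iloc[0]
print(np.allclose(result.values, alternative.values))
```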

+5

Here is another way that uses apply().

First make a simple DataFrame.

import numpy as np
import pandas as pd
#make a simple DataFrame
df = pd.DataFrame(np.fromfunction(lambda i, j: i+1 , (3, 3), dtype=int))

It looks like this:

# 1 1 1
# 2 2 2
# 3 3 3

first_row = df.iloc[[0]].values[0]

Then use apply() to subtract it from each row.

df.apply(lambda row: row - first_row, axis=1)

This gives the result, i.e. each row with row 1 subtracted:

#  0 0 0
#  1 1 1
#  2 2 2
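
As a cross-check, the same toy example can be run end to end and compared with plain broadcasting. This is a sketch; the names via_apply and via_broadcast are illustrative and not from the answer.

```python
import numpy as np
import pandas as pd

# Same toy frame as above: rows of 1s, 2s and 3s.
df = pd.DataFrame(np.fromfunction(lambda i, j: i + 1, (3, 3), dtype=int))
first_row = df.iloc[[0]].values[0]   # 1-D array holding the first row

# Row-by-row subtraction through apply().
via_apply = df.apply(lambda row: row - first_row, axis=1)

# Subtracting the 1-D array directly broadcasts it over every row.
via_broadcast = df - first_row

print(via_apply.values.tolist())      # [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
print(via_apply.equals(via_broadcast))
```

On larger frames the broadcast form avoids calling a Python function once per row.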
+2

Source: https://habr.com/ru/post/1545719/

