Python / Pandas: Unexpected indices when performing a group application

I use Pandas and Numpy in Python3 with the following versions:

  • Python 3.5.1 (via Anaconda 2.5.0) 64 bit
  • Pandas 0.19.1
  • Numpy 1.11.2 (perhaps not relevant here)

Here is the minimal code creating the problem:

import pandas as pd
import numpy as np

a = pd.DataFrame({'i' : [1,1,1,1,1], 'a': [1,2,5,6,100], 'b': [2, 4,10, np.nan, np.nan]})
a.set_index(keys='a', inplace=True)
v = a.groupby(level=0).apply(lambda x: x.sort_values(by='i')['b'].rolling(2, min_periods=0).mean())
v.index.names

This is a simple groupby followed by apply, but I do not understand the resulting index names:

FrozenList(['a', 'a'])

For some reason, the index names of the result are ['a', 'a'], which seems like a pretty dubious choice on pandas' part. I would expect a simple ['a'].

Does anyone know why pandas duplicates a level in the index here?

Thanks in advance.

1 answer

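This has nothing to do with rolling or sort_values. The function you pass to apply returns a Series (the same happens with a DataFrame), and groupby prepends the group key to the index of each returned object. You can see the same thing with a plain shift on column "b":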

In [99]:
v = a.groupby(level=0).apply(lambda x: x['b'].shift())
v

Out[99]:
a    a  
1    1     NaN
2    2     NaN
5    5     NaN
6    6     NaN
100  100   NaN
Name: b, dtype: float64
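
If the extra level is not wanted, one option is the group_keys parameter of groupby. The following is only a minimal sketch against the frame from the question, assuming a reasonably recent pandas; the names nested and flat are just illustrative:

```python
import numpy as np
import pandas as pd

# Same frame as in the question, indexed by 'a'.
a = pd.DataFrame({'i': [1, 1, 1, 1, 1],
                  'a': [1, 2, 5, 6, 100],
                  'b': [2, 4, 10, np.nan, np.nan]}).set_index('a')

# group_keys=True (the default, spelled out here): the group key is
# prepended to the index of every Series returned by the function.
nested = a.groupby(level=0, group_keys=True).apply(lambda x: x['b'].shift())

# group_keys=False: the returned Series are concatenated as they are,
# so only their own index 'a' remains.
flat = a.groupby(level=0, group_keys=False).apply(lambda x: x['b'].shift())

print(list(nested.index.names))
print(list(flat.index.names))
```

The values are the same in both cases; only the index layout differs.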

If you pass as_index=False, the group key level is replaced by a default integer index:

In [102]:
v = a.groupby(level=0, as_index=False).apply(lambda x: x['b'].shift())
v

Out[102]:
   a  
0  1     NaN
1  2     NaN
2  5     NaN
3  6     NaN
4  100   NaN
Name: b, dtype: float64

If the function returns a scalar instead, the index is not duplicated:

In [104]:
v = a.groupby(level=0).apply(lambda x: x['b'].max())
v

Out[104]:
a
1       2.0
2       4.0
5      10.0
6       NaN
100     NaN
dtype: float64

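So the extra level appears because the function returns a Series that still carries its own index "a", and groupby puts the group key "a" in front of it.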
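
For the original rolling example, the redundant level can also be dropped after the fact. This is a sketch rather than the one correct fix; it assumes a pandas version that provides Series.droplevel, and the name fixed is made up for illustration:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'i': [1, 1, 1, 1, 1],
                  'a': [1, 2, 5, 6, 100],
                  'b': [2, 4, 10, np.nan, np.nan]})
a.set_index(keys='a', inplace=True)

# Expression from the question; group_keys=True is the default and is
# written out only to make the behaviour explicit.
v = a.groupby(level=0, group_keys=True).apply(
    lambda x: x.sort_values(by='i')['b'].rolling(2, min_periods=0).mean())

# Both levels are named 'a', so drop the outer one by position, not by name.
fixed = v.droplevel(0)

print(list(v.index.names))
print(list(fixed.index.names))
```

Dropping by position avoids the ambiguity of referring to a level by a name that occurs twice.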


Source: https://habr.com/ru/post/1669042/

