One option is to use groupby twice. Once for the index:
    In [11]: df.groupby(lambda x: x // 2).mean()
    Out[11]:
         0    1  2    3
    0  1.5  3.0  3  3.5
    1  2.5  1.5  2  2.5
and once for the columns:
    In [12]: df.groupby(lambda x: x // 2).mean().groupby(lambda y: y // 2, axis=1).mean()
    Out[12]:
          0     1
    0  2.25  3.25
    1  2.00  2.25
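To make this reproducible end to end, here is a self-contained sketch of the same two-step grouping. The 4x4 frame of `np.arange` values is my own stand-in for the df in the question, and the sketch transposes instead of passing `axis=1`, since newer pandas versions deprecate `groupby(..., axis=1)`:

```python
import numpy as np
import pandas as pd

# Hypothetical 4x4 frame; integer division maps rows/columns
# 0, 1 -> group 0 and 2, 3 -> group 1
df = pd.DataFrame(np.arange(16.0).reshape(4, 4))

# First average pairs of rows, then pairs of columns
row_means = df.groupby(lambda x: x // 2).mean()
block_means = row_means.T.groupby(lambda y: y // 2).mean().T

print(block_means)  # 2x2 frame of the per-block means
```

The transpose round-trip does the same job as `axis=1` and works on both old and current pandas.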
Note: a solution that calculates the mean only once might be preferable ... one option is to stack, group, take the mean, and unstack, but at the moment this is a little fiddly.
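For what that stack-based, single-mean route could look like, here is a sketch under the same assumed 4x4 `np.arange` frame; the grouping keys are built by hand from the levels of the stacked MultiIndex:

```python
import numpy as np
import pandas as pd

# Hypothetical 4x4 frame standing in for df
df = pd.DataFrame(np.arange(16.0).reshape(4, 4))

# Stack into one long Series, group by (row pair, column pair),
# take the mean once, then unstack back into a 2x2 frame
stacked = df.stack()
row_pairs = stacked.index.get_level_values(0) // 2
col_pairs = stacked.index.get_level_values(1) // 2
block_means = stacked.groupby([row_pairs, col_pairs]).mean().unstack()

print(block_means)
```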
This appears to be significantly faster than Viktor's solution:
    In [21]: df = pd.DataFrame(np.random.randn(100, 100))

    In [22]: %timeit df.groupby(lambda x: x // 2).mean().groupby(lambda y: y // 2, axis=1).mean()
    1000 loops, best of 3: 1.64 ms per loop

    In [23]: %timeit viktor()
    1 loops, best of 3: 822 ms per loop
In fact, Viktor's solution crashes my (underpowered) laptop for larger DataFrames:
    In [31]: df = pd.DataFrame(np.random.randn(1000, 1000))

    In [32]: %timeit df.groupby(lambda x: x // 2).mean().groupby(lambda y: y // 2, axis=1).mean()
    10 loops, best of 3: 42.9 ms per loop

    In [33]: %timeit viktor()
As Viktor points out, this does not work with a non-integer index; if you need that, you can just save the index and columns as temporary variables and put them back afterwards:
    df_index, df_cols, df.index, df.columns = df.index, df.columns, np.arange(len(df.index)), np.arange(len(df.columns))
    res = df.groupby(...
    res.index, res.columns = df_index[::2], df_cols[::2]
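Filled out as a runnable sketch: the string labels below are invented for illustration, and the elided groupby is replaced by the transpose variant of the two-step mean (since `groupby(..., axis=1)` is deprecated in newer pandas):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with non-integer (string) labels
df = pd.DataFrame(np.arange(16.0).reshape(4, 4),
                  index=list("abcd"), columns=list("wxyz"))

# Stash the real labels and substitute integer positions
# so the // 2 trick works
df_index, df_cols = df.index, df.columns
df.index, df.columns = np.arange(len(df.index)), np.arange(len(df.columns))

res = df.groupby(lambda x: x // 2).mean().T.groupby(lambda y: y // 2).mean().T

# Restore every other original label on the result
res.index, res.columns = df_index[::2], df_cols[::2]

print(res)  # 2x2 block means labelled a/c and w/y
```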