With Pandas, this can be done in a more flexible way.
First, let's prepare the data:
    import pandas as pd
    import numpy as np
    from sklearn.datasets import load_iris

    iris_data = load_iris()
    # shorten the feature names: 'sepal length (cm)' -> 'sepl', etc.
    iris = pd.DataFrame(iris_data.data,
                        columns=[c[0:3] + c[6] for c in iris_data.feature_names])
    iris['Species'] = iris_data.target_names[iris_data.target]
Now we can simulate the mutate_each pipeline:
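    # calculate the aggregates for the sepal columns
    # (the columns are named sepl, sepw, petl, petw, so match on 'sep')
    sepal_cols = iris.columns[iris.columns.str.startswith('sep')]
    pivot = iris.groupby("Species")[sepal_cols].aggregate(['min', 'max', 'mean'])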
The result, pivot, is a small table with one row per species (shown here with the two column levels joined into single names):
                seplmin  seplmax  seplmean  sepwmin  sepwmax  sepwmean
    Species
    setosa          4.3      5.8     5.006      2.3      4.4     3.418
    versicolor      4.9      7.0     5.936      2.0      3.4     2.770
    virginica       4.9      7.9     6.588      2.2      3.8     2.974
And new_iris, obtained by merging pivot back into iris on Species, is a 150x11 table with all the columns from iris and pivot combined, identical to what dplyr produces.
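The code that names the aggregate columns and builds new_iris is not shown above. The following is a minimal sketch of one way to do it, assuming the two column levels are simply concatenated and the aggregates are attached with merge on Species. To keep it runnable without scikit-learn, it uses a six-row stand-in frame with the same column names; those values are illustrative only.

```python
import pandas as pd

# Stand-in for the iris frame built earlier: same column names, only six rows.
iris = pd.DataFrame({
    "sepl": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "sepw": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "petl": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "petw": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    "Species": ["setosa", "setosa", "versicolor", "versicolor",
                "virginica", "virginica"],
})

# Aggregate the sepal columns per species; the result has two column levels.
sepal_cols = [c for c in iris.columns if c.startswith("sep")]
pivot = iris.groupby("Species")[sepal_cols].aggregate(["min", "max", "mean"])

# Join the two column levels into one name: ("sepl", "min") -> "seplmin".
pivot.columns = ["".join(col) for col in pivot.columns]

# Attach each species' aggregates to every row of that species.
new_iris = iris.merge(pivot, left_on="Species", right_index=True)

print(new_iris.shape)
```

On the stand-in frame this yields 6 rows and 11 columns (5 from iris plus 6 aggregates); on the full 150-row iris frame the same steps give the 150x11 table described above.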