I have a pandas DataFrame that looks like this and contains groups of data identified by an id column:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))
df['id'] = ['w', 'w', 'w', 'x', 'x', 'y', 'y', 'y', 'z', 'z']
print(df)
          A         B         C         D id
0  0.347501 -1.152416  1.441144 -0.144545  w
1  0.775828 -1.176764  0.203049 -0.305332  w
2  1.036246 -0.467927  0.088138 -0.438207  w
3 -0.737092 -0.231706  0.268403  0.464026  x
4 -1.857346 -1.420284 -0.515517 -0.231774  x
5 -0.970731  0.217890  0.193814 -0.078838  y
6 -0.318314 -0.244348  0.162103  1.204386  y
7  0.340199  1.074977  1.201068 -0.431473  y
8  0.202050  0.790434  0.643458 -0.068620  z
9 -0.882865  0.687325 -0.008771 -0.066912  z
Now I want to create new DataFrames (called df_w, df_x, df_y, df_z) that each store only their own group's rows from the original DataFrame, ideally collected in some iterable, e.g. a list:
df_w
          A         B         C         D id
0  0.347501 -1.152416  1.441144 -0.144545  w
1  0.775828 -1.176764  0.203049 -0.305332  w
2  1.036246 -0.467927  0.088138 -0.438207  w
df_x
          A         B         C         D id
0 -0.737092 -0.231706  0.268403  0.464026  x
1 -1.857346 -1.420284 -0.515517 -0.231774  x
df_y
          A         B         C         D id
0 -0.970731  0.217890  0.193814 -0.078838  y
1 -0.318314 -0.244348  0.162103  1.204386  y
2  0.340199  1.074977  1.201068 -0.431473  y
df_z
          A         B         C         D id
0  0.202050  0.790434  0.643458 -0.068620  z
1 -0.882865  0.687325 -0.008771 -0.066912  z
Is there a smart (vectorized, pandas-idiomatic) way to achieve this with groupby, apply and/or applymap and a function?
I thought about iterating over the DataFrame rows, but that doesn't seem very elegant.
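To make the goal concrete, here is a minimal sketch of the groupby route, offered as one possible approach rather than a confirmed best answer. It relies on the fact that iterating over df.groupby('id') yields (key, sub-frame) pairs. The names frames and frames_list are made up for illustration, and reset_index(drop=True) is only there because the desired output restarts the index at 0 in each piece.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question: four numeric columns plus an id column
df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))
df['id'] = ['w', 'w', 'w', 'x', 'x', 'y', 'y', 'y', 'z', 'z']

# Iterating over a groupby yields (key, sub-frame) pairs, so a dict
# comprehension gives one DataFrame per id without a row-by-row loop.
# 'frames' is a hypothetical name; reset_index(drop=True) renumbers each piece from 0.
frames = {key: group.reset_index(drop=True) for key, group in df.groupby('id')}

# If a plain list is preferred over a dict keyed by id (hypothetical name again)
frames_list = list(frames.values())
```

With this, frames['w'] would play the role of df_w, and frames_list holds the four pieces in sorted key order, since groupby sorts the group keys by default.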
Thanks in advance for any tips!