Reshape columns into rows in a pandas DataFrame

Question

I am trying to reshape an existing pandas DataFrame.

I have a CSV file that I import, and it looks something like this (this is a simplified version):

trial_num  trial_name  unit_1_act  unit_2_act  unit_3_act  unit_4_act
0          face        0.0         0.000000    0.272244    0.006428
1          face        0.0         0.000000    0.898450    0.000000
2          face        0.0         0.893845    0.000000    0.000000
3          scene       0.0         0.879367    0.000000    0.006312
4          scene       0.0         0.000000    0.000000    0.000000

In this form, each row holds several observations (each "unit_X_act" column is a separate observation). I want to separate them so that there is one observation per row.

In other words, instead of columns called "unit_1_act", "unit_2_act", etc., I would like to have one column called "unit number", whose value can be "unit_1", "unit_2", etc., and one column called "activity", which holds the value that was previously under the corresponding unit_X_act column.
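For reproducibility, the sample above can be rebuilt without the CSV. This is only a sketch: the variable name df is an assumption (it matches what the answers use), and the values are copied from the simplified table.

```python
import pandas as pd

# Rebuild the simplified sample from the table above (the real data comes from a CSV).
df = pd.DataFrame({
    "trial_num": [0, 1, 2, 3, 4],
    "trial_name": ["face", "face", "face", "scene", "scene"],
    "unit_1_act": [0.0, 0.0, 0.0, 0.0, 0.0],
    "unit_2_act": [0.0, 0.0, 0.893845, 0.879367, 0.0],
    "unit_3_act": [0.272244, 0.898450, 0.0, 0.0, 0.0],
    "unit_4_act": [0.006428, 0.0, 0.0, 0.006312, 0.0],
})

# One row per (trial, unit) pair is the goal, so the long format
# should have len(df) * number_of_unit_columns rows.
n_units = sum(col.startswith("unit_") for col in df.columns)
expected_rows = len(df) * n_units
print(expected_rows)
```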

python pandas dataframe

Victoria R Jul 22 '17 at 20:58

4 answers

Maxu · Answer 1 · 2017-07-22T21:35:44+0000

We can also use the method pd.lreshape():

In [74]: x = np.repeat(df.columns[df.columns.str.contains(r'^unit_')].str.replace('_act','').values,
    ...:               len(df))
    ...:
    ...: pd.lreshape(df, {'activity': df.columns[df.columns.str.contains(r'^unit_')]}) \
    ...:   .assign(unit_number=x)
    ...:
Out[74]:
   trial_name  trial_num  activity unit_number
0        face          0  0.000000      unit_1
1        face          1  0.000000      unit_1
2        face          2  0.000000      unit_1
3       scene          3  0.000000      unit_1
4       scene          4  0.000000      unit_1
5        face          0  0.000000      unit_2
6        face          1  0.000000      unit_2
7        face          2  0.893845      unit_2
8       scene          3  0.879367      unit_2
9       scene          4  0.000000      unit_2
10       face          0  0.272244      unit_3
11       face          1  0.898450      unit_3
12       face          2  0.000000      unit_3
13      scene          3  0.000000      unit_3
14      scene          4  0.000000      unit_3
15       face          0  0.006428      unit_4
16       face          1  0.000000      unit_4
17       face          2  0.000000      unit_4
18      scene          3  0.006312      unit_4
19      scene          4  0.000000      unit_4

cmaher · Answer 2 · 2017-07-22T21:07:30+0000

You can do this by first renaming the unit_ columns and then using melt:

# remove "_act" suffix from "unit_" columns
df.columns = [x.replace("_act", "") for x in df.columns]

# melt "unit_" columns into key-value columns "unit_number" and "activity"
df.melt(
    id_vars=["trial_num", "trial_name"],
    value_vars=[x for x in df.columns if "unit_" in x],
    var_name="unit_number",
    value_name="activity",
)

#     trial_num trial_name unit_number  activity
# 0           0       face      unit_1  0.000000
# 1           1       face      unit_1  0.000000
# 2           2       face      unit_1  0.000000
# 3           3      scene      unit_1  0.000000
# 4           4      scene      unit_1  0.000000
# 5           0       face      unit_2  0.000000
# 6           1       face      unit_2  0.000000
# ...         ...    ...        ...     ...

Here value_vars restricts melt to the columns whose names contain "unit_".
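As a side note on value_vars: when it is omitted, melt unpivots every column that is not listed in id_vars. A minimal sketch of that variant, assuming the sample frame from the question (rebuilt here so the block runs on its own; long_df is a hypothetical name):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "trial_num": [0, 1, 2, 3, 4],
    "trial_name": ["face", "face", "face", "scene", "scene"],
    "unit_1_act": [0.0, 0.0, 0.0, 0.0, 0.0],
    "unit_2_act": [0.0, 0.0, 0.893845, 0.879367, 0.0],
    "unit_3_act": [0.272244, 0.898450, 0.0, 0.0, 0.0],
    "unit_4_act": [0.006428, 0.0, 0.0, 0.006312, 0.0],
})

long_df = (
    # strip the "_act" suffix without modifying df in place
    df.rename(columns=lambda c: c.replace("_act", ""))
      # no value_vars: every column outside id_vars is melted
      .melt(id_vars=["trial_num", "trial_name"],
            var_name="unit_number", value_name="activity")
)
print(long_df.head())
```

This only works as intended while every non-identifier column is a unit column; if the CSV has other columns, keeping the explicit value_vars filter is the safer choice.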

Sebastian · Answer 3 · 2017-07-22T21:09:02+0000

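You can also use stack after renaming the columns.

# index by the identifier columns so that only the unit columns get stacked
df.set_index(['trial_num','trial_name'],inplace=True)
# stack() returns a new object (it has no inplace option), so chain the calls
df.stack().reset_index()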
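A fuller sketch of the stack approach, including the rename step and the final column names. The data is rebuilt from the question so the block runs on its own; the name stacked and the use of rename_axis and Series.rename to label the result are choices made for this illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "trial_num": [0, 1, 2, 3, 4],
    "trial_name": ["face", "face", "face", "scene", "scene"],
    "unit_1_act": [0.0, 0.0, 0.0, 0.0, 0.0],
    "unit_2_act": [0.0, 0.0, 0.893845, 0.879367, 0.0],
    "unit_3_act": [0.272244, 0.898450, 0.0, 0.0, 0.0],
    "unit_4_act": [0.006428, 0.0, 0.0, 0.006312, 0.0],
})

stacked = (
    df.rename(columns=lambda c: c.replace("_act", ""))  # unit_1_act -> unit_1
      .set_index(["trial_num", "trial_name"])           # keep identifiers out of the stack
      .rename_axis(columns="unit_number")               # name the level that stack() creates
      .stack()                                          # columns -> innermost index level
      .rename("activity")                               # name the resulting Series
      .reset_index()                                    # back to a flat DataFrame
)
print(stacked.head())
```

Unlike melt, the rows come out grouped by trial rather than by unit.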

piRSquared · Answer 4 · 2017-07-23T07:51:28+0000

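You can use set_index, stack, and reset_index, renaming the columns along the way, like this:

d1 = df.set_index(['trial_num', 'trial_name'])
# split 'unit_1_act' into ('unit_1', 'act'), giving two column levels
# (n must be passed by keyword in recent pandas versions)
d1.columns = d1.columns.str.rsplit('_', n=1, expand=True)
d1.columns.names = ['unit_number', None]

# move the unit level into the rows, then keep the 'act' column as 'activity'
d1.stack(0).act.reset_index(name='activity')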

    trial_num trial_name unit_number  activity
0           0       face      unit_1  0.000000
1           0       face      unit_2  0.000000
2           0       face      unit_3  0.272244
3           0       face      unit_4  0.006428
4           1       face      unit_1  0.000000
5           1       face      unit_2  0.000000
6           1       face      unit_3  0.898450
7           1       face      unit_4  0.000000
8           2       face      unit_1  0.000000
9           2       face      unit_2  0.893845
10          2       face      unit_3  0.000000
11          2       face      unit_4  0.000000
12          3      scene      unit_1  0.000000
13          3      scene      unit_2  0.879367
14          3      scene      unit_3  0.000000
15          3      scene      unit_4  0.006312
16          4      scene      unit_1  0.000000
17          4      scene      unit_2  0.000000
18          4      scene      unit_3  0.000000
19          4      scene      unit_4  0.000000
