I am trying to reshape a dataframe of key/value rows into a table with the keys as columns and the values as cells. For instance:
Input dataframe of keys and values:
>>> import pandas as pd
>>> df = pd.DataFrame([['TIME', 'VAL1', 'VAL2', 'VAL3',
...                     'TIME', 'VAL1', 'VAL2', 'VAL3'],
...                    ["00:00:01", 1, 2, 3, "00:00:02", 1, 2, 3]]).T
>>> df
      0         1
0  TIME  00:00:01
1  VAL1         1
2  VAL2         2
3  VAL3         3
4  TIME  00:00:02
5  VAL1         1
6  VAL2         2
7  VAL3         3
I want it to look like this:
TIME      VAL1  VAL2  VAL3
00:00:01     1     2     3
00:00:02     1     2     3
I can almost get what I want with a pivot:
>>> df.pivot(columns=0, values=1)
       TIME  VAL1  VAL2  VAL3
0  00:00:01  None  None  None
1      None     1  None  None
2      None  None     2  None
3      None  None  None     3
4  00:00:02  None  None  None
5      None     1  None  None
6      None  None     2  None
7      None  None  None     3
And I can collapse the rows to get what I want:
>>> df.pivot(columns=0, values=1).ffill().drop_duplicates(subset='TIME',
...     keep='last').set_index('TIME')
TIME      VAL1  VAL2  VAL3
00:00:01     1     2     3
00:00:02     1     2     3
But this seems like a rather roundabout way to do it, and the mostly empty intermediate frame will waste a lot of memory for a large data set. Is there a simpler method?
I tried looking at pd.DataFrame.from_items() and pd.DataFrame.from_records(), but was not successful.
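For what it's worth, one direction that might avoid the sparse intermediate frame is to label each record before pivoting, so that the pivot has a proper index to group on. This is only a sketch: it assumes every record starts with a TIME row, and the names record_id and wide are mine, chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame([['TIME', 'VAL1', 'VAL2', 'VAL3',
                    'TIME', 'VAL1', 'VAL2', 'VAL3'],
                   ["00:00:01", 1, 2, 3, "00:00:02", 1, 2, 3]]).T

# Assumption: each record begins with a TIME row, so a running count of
# TIME rows gives every row the number of the record it belongs to (0, 1, ...).
record_id = (df[0] == 'TIME').cumsum() - 1

# Pivot with the record number as the index: one row per record,
# one column per key, then promote TIME to the index.
wide = (df.assign(record=record_id)
          .pivot(index='record', columns=0, values=1)
          .set_index('TIME'))
wide.columns.name = None  # drop the leftover column-axis label

print(wide)
```

If the keys did not arrive in complete, TIME-first groups, record_id would mislabel rows, so this would need a different grouping rule in that case.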