The answers by joris in this thread and by punchagan in the duplicate thread are very elegant. However, they will not give correct results if the column used for the keys contains any duplicate values: a dictionary holds only one value per key, so later rows overwrite earlier ones.
For example:
>>> import pandas as p
>>> ptest = p.DataFrame([['a',1],['a',2],['b',3]], columns=['id', 'value'])
>>> ptest
  id  value
0  a      1
1  a      2
2  b      3
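To make the problem concrete, here is a minimal sketch of what happens with this frame. It assumes the one-liners referred to above look roughly like `dict(zip(...))` or `set_index(...).to_dict()`; the exact code in those answers may differ, and the names `via_zip` and `via_index` are only illustrative.

```python
import pandas as pd

ptest = pd.DataFrame([['a', 1], ['a', 2], ['b', 3]], columns=['id', 'value'])

# Assumed form of the one-liners: each builds a dict with one entry per id,
# so the second row for 'a' silently overwrites the first.
via_zip = dict(zip(ptest['id'], ptest['value']))
via_index = ptest.set_index('id')['value'].to_dict()

# Both results keep only the last value seen for 'a'; the value 1 is lost.
print(via_zip)
print(via_index)
```

In both cases the key 'a' ends up mapped to 2 only.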
If you have duplicate entries and don't want to lose them, you can use this ugly but working code:
>>> mydict = {}
>>> for x in range(len(ptest)):
...     currentid = ptest.iloc[x, 0]
...     currentvalue = ptest.iloc[x, 1]
...     # create the list on first sight of the id, then append to it
...     mydict.setdefault(currentid, []).append(currentvalue)
...
>>> mydict
{'a': [1, 2], 'b': [3]}
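As a possible alternative to the explicit loop, the same grouping can probably be expressed with `groupby`. This is only a sketch and not part of the original answer; the variable name `grouped` is illustrative.

```python
import pandas as pd

ptest = pd.DataFrame([['a', 1], ['a', 2], ['b', 3]], columns=['id', 'value'])

# Group the rows by id, collect each group's values into a list,
# then turn the resulting Series (indexed by id) into a dict.
grouped = ptest.groupby('id')['value'].apply(list).to_dict()

print(grouped)
```

Every id then maps to a list of all its values, so nothing is dropped when an id repeats.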
dalloliogm, Jun 23 '14 at 14:35