Split and map data in two columns of a pandas DataFrame

I want to split the data in two columns from the data frame and build new columns using this data.

My data frame:

import pandas as pd

dfc = pd.DataFrame({
    "A": ["GT:DP:RO:QR:AO:QA:GL", "GT:DP:RO:QR:AO:QA:GL", "GT:DP:RO:QR:AO:QA:GL", "GT:DP:GL", "GT:DP:GL"],
    "B": ["0/1:71:43:1363:28:806:-71.1191,0,-121.278", "0/1:71:43:1363:28:806:-71.1191,0,-121.278", "0/1:71:43:1363:28:806:-71.1191,0,-121.278", "1/1:49:-103.754,0,-3.51307", "1/1:49:-103.754,0,-3.51307"],
})

I need separate columns named GT, DP, RO, QR, AO, QA, GL, filled with the corresponding values from column B.

I want to create an output like this: [image: desired output table]

We can split the two columns with a = dfc.A.str.split(":", expand=True) and b = dfc.B.str.split(":", expand=True) to get two separate data frames. They can be combined with c = pd.merge(a, b, left_index=True, right_index=True) to obtain all the desired data, but not in the expected format.
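To make the problem with this attempt concrete, here is a minimal sketch of it, assuming the same dfc as above. It shows that expand=True assigns columns by position, so the GL token of the shorter rows lands in the same position as RO in the longer ones, and the merged frame only has suffixed positional names.

```python
import pandas as pd

dfc = pd.DataFrame({
    "A": ["GT:DP:RO:QR:AO:QA:GL"] * 3 + ["GT:DP:GL"] * 2,
    "B": ["0/1:71:43:1363:28:806:-71.1191,0,-121.278"] * 3
         + ["1/1:49:-103.754,0,-3.51307"] * 2,
})

# Split each column on ':'; the resulting columns are numbered by position (0..6)
a = dfc.A.str.split(":", expand=True)
b = dfc.B.str.split(":", expand=True)

# Join side by side on the row index; overlapping names get _x / _y suffixes
c = pd.merge(a, b, left_index=True, right_index=True)

# Position 2 holds 'RO' in the long rows but 'GL' in the short ones
print(a[2].tolist())
print(c.shape)
```

The keys and values sit next to each other, but nothing ties a value to its key, which is why the result is not in the wanted format.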

Any suggestions? Perhaps split A and B, then build a dict from A and B.


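Use an OrderedDict instead of a plain dict, because a plain dict does not guarantee key order in older Python versions, and convert the result to a list.

Then pass the list to the DataFrame constructor.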

from collections import OrderedDict

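# For each row, pair the keys from A with the values from B
L = dfc.apply(
    lambda x: OrderedDict(zip(x['A'].split(':'), x['B'].split(':'))), axis=1).tolist()
# One dict per row; keys missing from a row become NaN
pd.DataFrame(L)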
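As a variation on the snippet above, and not the answer's own code, the same idea can be sketched with a list comprehension and a plain dict. This assumes Python 3.7 or later, where dict preserves insertion order; the names rows and out are only illustrative.

```python
import pandas as pd

dfc = pd.DataFrame({
    "A": ["GT:DP:RO:QR:AO:QA:GL"] * 3 + ["GT:DP:GL"] * 2,
    "B": ["0/1:71:43:1363:28:806:-71.1191,0,-121.278"] * 3
         + ["1/1:49:-103.754,0,-3.51307"] * 2,
})

# One dict per row: keys come from A, values from B
# (assumption: Python 3.7+, so a plain dict keeps insertion order)
rows = [
    dict(zip(keys.split(":"), values.split(":")))
    for keys, values in zip(dfc["A"], dfc["B"])
]

# Keys that a row does not have are filled with NaN
out = pd.DataFrame(rows)
print(out)
```

All values stay strings here; converting DP or RO to numbers would be a separate step.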


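  • Split on ':'. To do this for both columns at once, stack first, then use str.split.
  • Group by level=0, which is the original row index.
  • zip each pair of lists into a dict, with keys from A and values from B.
  • unstack to get the final data frame.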

gb = dfc.stack().str.split(':').groupby(level=0)
gb.apply(lambda x: dict(zip(*x))).unstack()
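The steps listed above can also be traced one at a time. The following is a sketch under assumptions, not the answer's exact code: it uses the same dfc, builds the per-row dicts in a comprehension instead of groupby.apply, and passes them to the DataFrame constructor in place of the final unstack. The names stacked, records and out are illustrative.

```python
import pandas as pd

dfc = pd.DataFrame({
    "A": ["GT:DP:RO:QR:AO:QA:GL"] * 3 + ["GT:DP:GL"] * 2,
    "B": ["0/1:71:43:1363:28:806:-71.1191,0,-121.278"] * 3
         + ["1/1:49:-103.754,0,-3.51307"] * 2,
})

# Step 1: stack A and B into one Series indexed by (row, column),
# then split every string on ':' to get a list of tokens
stacked = dfc.stack().str.split(":")

# Steps 2-3: group by the original row (index level 0); each group holds
# the A tokens followed by the B tokens, which zip pairs into a dict
records = [
    dict(zip(pair.iloc[0], pair.iloc[1]))
    for _, pair in stacked.groupby(level=0)
]

# Step 4: one dict per row becomes one output row; missing keys become NaN
out = pd.DataFrame(records)
print(out)
```

Inspecting stacked before grouping is a quick way to see why level=0 identifies the original rows.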



Source: https://habr.com/ru/post/1665195/
