Pandas row manipulation in two columns

Here is my dataframe:

    A                  B
0   asdf|afsdf|fasd    sdsd|wer|wer
1   sdfsdf             sdfsdff
2   sdf|s              sdfsde|sdf

I would like to create a column C that combines the values from column A and column B up to the first |; if there is no |, it simply joins the two values. When concatenating, I would also like to insert -- between them. Here is what column C should look like:

    C
0   asdf--sdsd
1   sdfsdf--sdfsdff
2   sdf--sdfsde

I can go through each row with df.loc and get what I need, but it's slow, and I wonder if there is a faster way to do this.

+4
2 answers

Short answer using str and split

df['C'] = df.A.str.split('|').str.get(0).add('--') \
        + df.B.str.split('|').str.get(0)
df
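
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The helper name first_piece is hypothetical (it is not from the answer), and passing n=1 to split is an assumption on my part: it only stops splitting after the first |, which should not change the result.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "A": ["asdf|afsdf|fasd", "sdfsdf", "sdf|s"],
    "B": ["sdsd|wer|wer", "sdfsdff", "sdfsde|sdf"],
})


def first_piece(col: pd.Series, sep: str = "|") -> pd.Series:
    """Return the text before the first `sep` (hypothetical helper).

    Values without `sep` come back unchanged, because splitting them
    yields a one-element list.
    """
    return col.str.split(sep, n=1).str.get(0)


# Join the leading pieces of A and B with "--".
df["C"] = first_piece(df["A"]) + "--" + first_piece(df["B"])
print(df)
```

Rows with no | in them (row 1 here) pass through whole, which matches the behaviour asked for in the question.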



You can extend this to pair up every |-separated piece, not just the first

df['C'] = df.A.str.split('|', expand=True).stack() \
    .add('--').add(df.B.str.split('|', expand=True).stack()) \
    .groupby(level=0).apply('|'.join)
df
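
A runnable sketch of this pairwise variant, under the assumption that A and B have the same number of |-separated pieces in each row (true for the sample data). The dropna() call is my addition: it discards any positions that are missing on either side, so the join only ever sees strings.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "A": ["asdf|afsdf|fasd", "sdfsdf", "sdf|s"],
    "B": ["sdsd|wer|wer", "sdfsdff", "sdfsde|sdf"],
})

# One row per (original row, piece position) for each column.
a = df["A"].str.split("|", expand=True).stack()
b = df["B"].str.split("|", expand=True).stack()

# Pair pieces position by position; drop positions missing on either side.
pairs = (a + "--" + b).dropna()

# Collapse back to one string per original row.
df["C"] = pairs.groupby(level=0).agg("|".join)
print(df["C"].tolist())
```

If the piece counts differ within a row, the unmatched pieces are silently dropped in this sketch; whether that is acceptable depends on your data.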


+3


In [1]: import pandas as pd

In [2]: d = {'A': ('asdf|afsdf|fasd', 'sdfsdf', 'sdf|s'),
             'B': ('sdsd|wer|wer', 'sdfsdff', 'sdfsde|sdf')}

In [3]: data = pd.DataFrame(d)

In [4]: data['C'] = data['A'].str.split('|').str.get(0) + "--" + data['B'].str.split('|').str.get(0)

In [5]: data
Out[5]: 
                 A             B                 C
0  asdf|afsdf|fasd  sdsd|wer|wer       asdf--sdsd
1           sdfsdf       sdfsdff  sdfsdf--sdfsdff
2            sdf|s    sdfsde|sdf      sdf--sdfsde
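
Since the question is about speed, another option worth timing on your own data is a plain Python list comprehension with the built-in str.partition, which is a different technique from the .str accessor used above. This is only a sketch: it assumes neither column contains missing values, and I am not claiming it is faster in any particular case.

```python
import pandas as pd

# Sample frame from the question.
data = pd.DataFrame({
    "A": ["asdf|afsdf|fasd", "sdfsdf", "sdf|s"],
    "B": ["sdsd|wer|wer", "sdfsdff", "sdfsde|sdf"],
})

# str.partition("|")[0] is the text before the first "|",
# or the whole string when no "|" is present.
data["C"] = [
    f"{a.partition('|')[0]}--{b.partition('|')[0]}"
    for a, b in zip(data["A"], data["B"])
]
print(data)
```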


+5

Source: https://habr.com/ru/post/1657109/

