Doing Pandas correctly ... instead of using a loop

I'm just starting out with Pandas, and I find it hard to think about data in terms of dataframes. From time to time, I just can't figure out how to do something without iterating through the rows.

For example, I have a dataframe with budget information. I want to extract the "vendor" from the "short description", which is a string in one of three possible forms:

  • blah blah blah to vendor name
  • blah blah blah at vendor name
  • vendor name

I can do this using the following code, but I cannot help but feel that it is not using Pandas correctly. Any thoughts on improving it?

for i, row in dataframe.iterrows():
    current = dataframe['short description'][i]
    if 'to' in current:
        point_of_break = current.index('to') + 3
        dataframe['vendor'][i] = current[point_of_break:]
    elif 'at' in current:
        point_of_break = current.index('at') + 3
        dataframe['vendor'][i] = current[point_of_break:]
    else:
        dataframe['vendor'][i] = current
1 answer

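I think you can use str.split with a regular expression that matches to or at, and then select the last item of each resulting list with str[-1].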

import pandas as pd

df = pd.DataFrame({'A':['blah blah blah to "vendor name"',
                        'blah blah blah at "vendor name"',
                        '"vendor name"']})
print(df)

                                 A
0  blah blah blah to "vendor name"
1  blah blah blah at "vendor name"
2                    "vendor name"

print(df.A.str.split(r'\s+(?:at|to)\s+'))
0    [blah blah blah, "vendor name"]
1    [blah blah blah, "vendor name"]
2                    ["vendor name"]
Name: A, dtype: object

# split on the standalone words "at"/"to"; the space after the keyword stays in the result
df['vendor'] = df.A.str.split(r'\b(?:at|to)\b').str[-1]
print(df)
                                 A          vendor
0  blah blah blah to "vendor name"   "vendor name"
1  blah blah blah at "vendor name"   "vendor name"
2                    "vendor name"   "vendor name"

To drop the leading space as well, let the pattern consume the whitespace too:

df['vendor'] = df.A.str.split(r'\s+(?:at|to)\s+').str[-1]
print(df)
                                 A         vendor
0  blah blah blah to "vendor name"  "vendor name"
1  blah blah blah at "vendor name"  "vendor name"
2                    "vendor name"  "vendor name"
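
A caveat worth checking on real data: a pattern that matches at or to anywhere in the string will also match inside words, including inside the vendor name itself. Below is a minimal sketch, assuming the keywords always appear as separate words surrounded by whitespace. The sample descriptions and vendor names are made up for illustration, and the column name is taken from the question.

```python
import pandas as pd

# Hypothetical sample data: the vendor names deliberately contain
# "to" and "at" inside words (Stanton, Data, Motors).
df = pd.DataFrame({'short description': [
    'payment to Stanton Data',
    'services at Atlas Motors',
    'Stanton Data',
]})

# Match "at"/"to" only as standalone words between whitespace.
# n=1 splits at the first keyword only, like str.index in the original loop.
pattern = r'\s+(?:at|to)\s+'
df['vendor'] = df['short description'].str.split(pattern, n=1).str[-1]

print(df['vendor'].tolist())
# ['Stanton Data', 'Atlas Motors', 'Stanton Data']
```

Rows without a keyword are left as a one-element list by the split, so str[-1] returns the whole description, which covers the third form.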

Source: https://habr.com/ru/post/1669129/

