I'm just starting out with Pandas, and I'm having a hard time thinking about data in terms of dataframes. Every so often I just can't figure out how to do something without iterating over the rows.
For example, I have a dataframe with budget information. I want to extract the "vendor" from the "short description", which is a string in one of three possible forms:
- blah blah blah to vendor name
- blah blah blah at vendor name
- vendor name
I can do this using the following code, but I cannot help but feel that it is not using Pandas correctly. Any thoughts on improving it?
    for i, row in dataframe.iterrows():
        current = dataframe['short description'][i]
        if 'to' in current:
            point_of_break = current.index('to') + 3
            dataframe['vendor'][i] = current[point_of_break:]
        elif 'at' in current:
            point_of_break = current.index('at') + 3
            dataframe['vendor'][i] = current[point_of_break:]
        else:
            dataframe['vendor'][i] = current
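For comparison, here is a minimal sketch of how the same extraction might look without a row loop, using the pandas string accessor. The sample rows are made up for illustration, and the pattern is an assumption: it treats "to" and "at" as whole words followed by whitespace, which is slightly stricter than the substring check in the loop above (so a word like "tomato" would not trigger a split).

```python
import pandas as pd

# Hypothetical sample data; the real dataframe comes from the budget file.
dataframe = pd.DataFrame({
    "short description": [
        "payment to Acme Corp",
        "lunch at Joe's Diner",
        "Globex",
    ]
})

# Capture everything after the first standalone "to" or "at".
# Rows that do not match the pattern yield NaN.
pattern = r"^.*?\b(?:to|at)\s+(.*)$"
extracted = dataframe["short description"].str.extract(pattern, expand=False)

# Fall back to the full description when neither keyword is present.
dataframe["vendor"] = extracted.fillna(dataframe["short description"])

print(dataframe)
```

Whether this counts as "more correct" Pandas is a judgment call, but it avoids both `iterrows()` and the chained assignment `dataframe['vendor'][i] = ...`, which pandas may warn about.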