I have a pandas dataframe:
order_id buyer_id phone_no
611 261 9920570003
681 261 9321613595
707 261 9768270700
707 261 9768270700
707 261 9768270700
708 261 9820895896
710 261 7208615775
710 261 7208615775
710 261 7208615775
711 261 9920986486
800 234 Null
801 256 Null
803 289 Null
I need to replace the buyer_id column as follows:
order_id buyer_id phone_no
611 261_01 9920570003
681 261_02 9321613595
707 261_03 9768270700
707 261_03 9768270700
707 261_03 9768270700
708 261_04 9820895896
710 261_05 7208615775
710 261_05 7208615775
710 261_05 7208615775
711 261_06 9920986486
800 234 Null
801 256 Null
803 289 Null
So, if the phone number is the same, the rows should be treated as one and the same customer; otherwise a new suffix should be added to 261. I want only buyer_id 261 to be renamed; the other rows should stay the same, since I assign 261 to all orders that come in by phone.
I can generate the suffixes for buyer_id 261 with the following code:
for i in range(len(phone_orders)):
    print('261_%d' % i)
segments_data['buyer_id']
phone_orders contains all phone orders.
But I do not understand how to replace the buyer_id column to get the desired output.
I tried this:
df['buyer_id'] = '261_' + (df['phone_no'] !=
                           df['phone_no'].shift()).cumsum().map("{:02}".format)
which gives:
buyer_id phone_no
261_01 9920570003
261_02 9321613595
261_03 9768270700
261_03 9768270700
261_03 9768270700
261_04 9820895896
261_05 7208615775
261_05 7208615775
261_05 7208615775
261_06 9920986486
261_07 9768270700
261_07 9768270700
261_07 9768270700
261_08 9820895896
261_09 7208615775
261_09 7208615775
261_09 7208615775
So phone_no 7208615775 should get 261_05 again, but it gets 261_09.
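The numbering restarts because comparing each row with `shift()` and taking `cumsum()` counts consecutive runs, so a phone number that reappears later starts a new run. One possible way around this, as a minimal sketch, is `pd.factorize`, which gives every distinct value one code in order of first appearance. The sample frame below is rebuilt from the data above with two repeated rows added for illustration, and it assumes that the Null entries are real missing values and that buyer_id can be stored as strings.

```python
import pandas as pd

# Illustrative sample: the rows from the question plus two repeated
# phone numbers (orders 707 and 710) that are not adjacent to the first ones.
df = pd.DataFrame({
    "order_id": [611, 681, 707, 707, 707, 708, 710, 710, 710, 711,
                 707, 710, 800, 801, 803],
    "buyer_id": [261] * 12 + [234, 256, 289],
    "phone_no": ["9920570003", "9321613595", "9768270700", "9768270700",
                 "9768270700", "9820895896", "7208615775", "7208615775",
                 "7208615775", "9920986486", "9768270700", "7208615775",
                 None, None, None],  # assumption: Null means missing
})

# Store ids as strings so "261_01" and "234" fit in the same column.
df["buyer_id"] = df["buyer_id"].astype(str)

# Rename only rows of buyer 261 that actually have a phone number.
mask = (df["buyer_id"] == "261") & df["phone_no"].notna()

# One code per distinct phone number (0, 1, 2, ...), by first appearance.
codes, _ = pd.factorize(df.loc[mask, "phone_no"])

df.loc[mask, "buyer_id"] = ["261_{:02d}".format(int(c) + 1) for c in codes]
print(df)
```

With this approach the suffix depends only on the phone number, not on row order, so every row with 7208615775 gets the same id. If the Null values are actually the string "Null", the `notna()` part of the mask would need to be replaced by a comparison against that string.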