The title is a bit confusing, but I will do my best to explain my problem here. I have two pandas DataFrames, a and b:
    >> print a
    id | value
    1  | 250
    2  | 150
    3  | 350
    4  | 550
    5  | 450

    >> print b
    low | high | class
    100 | 200  | 'A'
    200 | 300  | 'B'
    300 | 500  | 'A'
    500 | 600  | 'C'
I want to create a new column named class in table a that holds, for each row, the class from the row of table b whose low/high range contains that row's value. Here is the result I want:
    >> print a
    id | value | class
    1  | 250   | 'B'
    2  | 150   | 'A'
    3  | 350   | 'A'
    4  | 550   | 'C'
    5  | 450   | 'A'
I have the following code written that does what I want:
    import pandas as pd

    a['class'] = pd.Series(dtype=object)
    for i in range(len(a)):
        val = a['value'][i]
        # first row of b whose [low, high] range contains val
        cl = b['class'][(b['low'] <= val) &
                        (b['high'] >= val)].iat[0]
        a.at[i, 'class'] = cl
The problem is that this is fast enough for tables with 10 or so rows, but I am trying to do this with 100,000+ rows in both a and b. Is there a faster way to do this using some function or attribute in pandas?
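For reference, here is one possible direction, as a minimal sketch rather than a benchmarked answer. It assumes the ranges in b can be sorted by low and overlap only at shared endpoints (as in the sample data), and it replaces the per-row scan with a binary search via numpy's searchsorted. The names b_sorted, idx, idx_safe and in_range are made up for illustration. A value sitting exactly on a shared endpoint gets the earlier range, which is what the loop's .iat[0] does too.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
a = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                  'value': [250, 150, 350, 550, 450]})
b = pd.DataFrame({'low': [100, 200, 300, 500],
                  'high': [200, 300, 500, 600],
                  'class': ['A', 'B', 'A', 'C']})

# Assumption: after sorting by 'low', ranges touch only at their endpoints.
b_sorted = b.sort_values('low').reset_index(drop=True)
values = a['value'].to_numpy()

# For each value, find the first range whose 'high' is >= the value.
idx = np.searchsorted(b_sorted['high'].to_numpy(), values, side='left')

# Values above the largest 'high' yield idx == len(b); clip so indexing is safe.
idx_safe = np.clip(idx, 0, len(b_sorted) - 1)

# A value matches only if it is also >= the 'low' of the range that was found.
in_range = (idx < len(b_sorted)) & (b_sorted['low'].to_numpy()[idx_safe] <= values)

# Rows that fall outside every range get None instead of a class.
a['class'] = np.where(in_range, b_sorted['class'].to_numpy()[idx_safe], None)
print(a)
```

The lookup costs one sort of b plus a binary search per row of a, instead of a full scan of b for every row. If the ranges in b can genuinely overlap, this sketch does not apply as written.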