Pandas - find the closest dates between two DataFrames without a loop

I am trying to find the closest previous date using two separate DataFrames. I have working code for this, but it uses a for loop that I would prefer to avoid, especially since my actual DataFrames will be significantly larger than the following snippet:

import pandas as pd

date_x = pd.to_datetime(['1/15/2015','2/14/2015','3/16/2015','4/15/2015','5/15/2015','6/14/2015','7/14/2015'])
date_y = pd.to_datetime(['1/1/2015','3/1/2015','6/14/2015','8/1/2015'])

dfx = pd.DataFrame({'date_x':date_x})
dfy = pd.DataFrame({'date_y':date_y})

z_list = []
for x in range(dfx['date_x'].count()):
    z_list.append(dfy['date_y'][dfy['date_y'] <= dfx['date_x'][x]].max())

dfx['date_z'] = z_list

gives ...

      date_x     date_z
0 2015-01-15 2015-01-01
1 2015-02-14 2015-01-01
2 2015-03-16 2015-03-01
3 2015-04-15 2015-03-01
4 2015-05-15 2015-03-01
5 2015-06-14 2015-06-14
6 2015-07-14 2015-06-14

which is what I want, but I think there is a more idiomatic pandas way.

1 answer

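Try using the pd.merge_asof() function:

Note: this function was added in pandas 0.19.0. Both DataFrames must be sorted by their key columns. By default it matches each date_x with the last date_y that is less than or equal to it, so the date_y column in the output corresponds to the date_z column in the question.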

In [17]: pd.merge_asof(dfx, dfy, left_on='date_x', right_on='date_y')
Out[17]:
      date_x     date_y
0 2015-01-15 2015-01-01
1 2015-02-14 2015-01-01
2 2015-03-16 2015-03-01
3 2015-04-15 2015-03-01
4 2015-05-15 2015-03-01
5 2015-06-14 2015-06-14
6 2015-07-14 2015-06-14
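
To get the exact column layout from the question, the merged column can be renamed to date_z. The following is a minimal sketch using the question's sample data; the names dfx_sorted, dfy_sorted, result and nearest are only illustrative. It also shows direction='nearest', which assumes a pandas version that supports the direction argument, for the case where the closest date on either side is wanted rather than the closest previous one.

```python
import pandas as pd

date_x = pd.to_datetime(['1/15/2015', '2/14/2015', '3/16/2015', '4/15/2015',
                         '5/15/2015', '6/14/2015', '7/14/2015'])
date_y = pd.to_datetime(['1/1/2015', '3/1/2015', '6/14/2015', '8/1/2015'])

dfx = pd.DataFrame({'date_x': date_x})
dfy = pd.DataFrame({'date_y': date_y})

# merge_asof requires both frames to be sorted by their key columns.
# The sample data is already sorted; sorting here guards against unsorted input.
dfx_sorted = dfx.sort_values('date_x')
dfy_sorted = dfy.sort_values('date_y')

# direction='backward' (the default) picks the last date_y <= date_x,
# which is what the loop in the question computes.
result = pd.merge_asof(dfx_sorted, dfy_sorted,
                       left_on='date_x', right_on='date_y',
                       direction='backward').rename(columns={'date_y': 'date_z'})

# direction='nearest' picks the closest date_y on either side instead.
nearest = pd.merge_asof(dfx_sorted, dfy_sorted,
                        left_on='date_x', right_on='date_y',
                        direction='nearest').rename(columns={'date_y': 'date_z'})

print(result)
print(nearest)
```

Rows of dfx with no earlier or equal date in dfy get NaT in date_z under the backward direction, which matches what .max() on an empty selection returns in the loop version.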

Source: https://habr.com/ru/post/1670012/

