I want to take a column of datetime objects and return a column of integers that are "days from that date until today." I can do it in an ugly way, but I am looking for a more elegant (and faster) way.
So, suppose I have a dataframe with a datetime column, for example:
11 2014-03-04 17:16:26+00:00
12 2014-03-10 01:35:56+00:00
13 2014-03-15 02:35:51+00:00
14 2014-03-20 05:55:47+00:00
15 2014-03-26 04:56:33+00:00
Name: datetime, dtype: object
And each element looks like this:
datetime.datetime(2014, 3, 4, 17, 16, 26, tzinfo=<UTC>)
Suppose I want to calculate how many days ago each observation happened, and return it as a plain integer. I know that I can just use apply twice, but is there a vectorized / cleaner way?
import datetime
today = datetime.datetime.today().date()
# first apply: drop the time component of each element
df_dates = df['datetime'].apply(lambda x: x.date())
days_ago = today - df_dates
Which gives a series of timedelta64[ns]:
11 56 days, 00:00:00
12 50 days, 00:00:00
13 45 days, 00:00:00
14 40 days, 00:00:00
15 34 days, 00:00:00
Name: datetime, dtype: timedelta64[ns]
And finally, if I want this as an integer (the second apply):
days_ago_as_int = days_ago.apply(lambda x: x.item().days)
days_ago_as_int
11 56
12 50
13 45
14 40
15 34
Name: datetime, dtype: int64
Any thoughts?
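For reference, here is a minimal sketch of what a vectorized version might look like. It assumes a reasonably recent pandas where the `.dt` accessor is available, and the sample frame, the `days_ago` helper, and its `today` parameter are made up for illustration; `today` is pinned to a fixed date only so the numbers match the output above.

```python
import datetime

import pandas as pd

# Hypothetical sample data mirroring the column above (tz-aware UTC datetimes).
utc = datetime.timezone.utc
df = pd.DataFrame(
    {
        "datetime": [
            datetime.datetime(2014, 3, 4, 17, 16, 26, tzinfo=utc),
            datetime.datetime(2014, 3, 10, 1, 35, 56, tzinfo=utc),
            datetime.datetime(2014, 3, 15, 2, 35, 51, tzinfo=utc),
        ]
    },
    index=[11, 12, 13],
)


def days_ago(series, today=None):
    """Whole days between each timestamp's UTC date and today's UTC date.

    `today` is an optional naive date string (assumed to be UTC); when
    omitted, the current UTC time is used.
    """
    # Make sure the column is a tz-aware datetime64 series, not object dtype.
    stamps = pd.to_datetime(series, utc=True)
    if today is None:
        today = pd.Timestamp.now(tz="UTC")
    else:
        today = pd.Timestamp(today, tz="UTC")
    # normalize() sets the time to midnight, so the difference is whole days.
    delta = today.normalize() - stamps.dt.normalize()
    # .dt.days extracts the day component as integers, with no apply needed.
    return delta.dt.days


result = days_ago(df["datetime"], today="2014-04-29")
print(result)
```

Both steps (dropping the time of day and extracting the day count) run on the whole column at once, so neither apply is needed. Whether this is actually faster on a given dataset would need to be measured.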
Related questions that did not quite answer what I was asking:
Pandas Python-
Pandas
Trying Karl D's suggestion, I get the following:
import datetime as dt
import numpy as np
# truncate each timestamp to day resolution
converted_dates = df['date'].values.astype('datetime64[D]')
today_date = np.datetime64(dt.date.today())
print(converted_dates)
print(today_date)
print(today_date - converted_dates)
[2014-01-16 00:00:00
2014-01-19 00:00:00
2014-01-22 00:00:00
2014-01-26 00:00:00
2014-01-29 00:00:00]
2014-04-30 00:00:00
[16189 days, 0:08:20.637994
16189 days, 0:08:20.637991
16189 days, 0:08:20.637988
16189 days, 0:08:20.637984
16189 days, 0:08:20.637981]
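As a point of comparison, here is a small sketch of the same NumPy idea with both operands explicitly held at day resolution. It assumes a recent NumPy, uses made-up sample dates taken from the printed array, and pins "today" to the date printed above so the result is reproducible. It is not meant as an explanation of the output shown above.

```python
import datetime as dt

import numpy as np

# Hypothetical dates copied from the printed array above.
dates = np.array(
    ["2014-01-16", "2014-01-19", "2014-01-22"], dtype="datetime64[ns]"
)

# Truncate both sides to whole days so the subtraction is in days.
converted = dates.astype("datetime64[D]")
today = np.datetime64(dt.date(2014, 4, 30), "D")

diff = today - converted      # dtype: timedelta64[D]
days = diff.astype("int64")   # plain integer day counts
print(days)
```

With both sides in `datetime64[D]`, the difference is a `timedelta64[D]` array, and casting it to `int64` yields the number of days directly.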