How to slice a DataFrame that has a date field as its index?

In my DataFrame I changed the index to a date field:

df.index = df.TRX_DATE   # transaction date; its type is pandas.core.series.Series

Now I want to slice my DataFrame by two dates, or by a date difference.

But I get errors.

# currentdate is today date
startdate = currentdate - timedelta(days=30)

dflast30 = df.loc[startdate:currentdate]  # error

I tried creating a mask:

mask = (df['TRX_DATE'] >= startdate) & (df['TRX_DATE'] <= currentdate )
dflast30 = df.loc[mask]


TypeError: unorderable types: str() > datetime.datetime()

Then I tried truncate:

dflast30 = df.truncate(before = currentdate, after = startdate)

And I get the same error.

I am confused and would appreciate advice on the following:

  • Is it possible to convert the index (the TRX_DATE field) to type datetime?

  • Or should I change the type of the field itself?

  • Or should I leave the default index as it is and filter on the date field for my current requirement?

  • Or could you give an example of how to make a date field the index and slice by a date range, and please also show the output.


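The error suggests that TRX_DATE holds strings rather than datetimes.

Convert TRX_DATE to datetime and assign it to the index: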

df.index = pd.to_datetime(df['TRX_DATE'])
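To make the effect of the conversion visible, here is a minimal sketch. It assumes the dates were stored as strings (which the TypeError in the question suggests); the three sample rows and the fixed slice bounds are made up for illustration.

```python
import pandas as pd

# Hypothetical data: TRX_DATE stored as strings (an assumption, not the asker's real data)
df = pd.DataFrame({"TRX_DATE": ["2015-10-08", "2015-11-05", "2015-12-06"],
                   "A": [2, 2, 3]})

# Before conversion the column is not a datetime column
was_datetime = pd.api.types.is_datetime64_any_dtype(df["TRX_DATE"])

# Convert the strings and use the result as the index
df.index = pd.to_datetime(df["TRX_DATE"])
index_is_datetime = isinstance(df.index, pd.DatetimeIndex)

# Label-based slicing on a DatetimeIndex includes both endpoints
sliced = df.loc[pd.Timestamp("2015-10-07"):pd.Timestamp("2015-11-06")]
print(sliced["A"].tolist())
```

Once the index is a DatetimeIndex, the slice from the question should no longer compare str with datetime.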

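Or, if the TRX_DATE column is already datetime, set it as the index (this moves the column rather than copying it):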

df = df.set_index(['TRX_DATE'])

All together:

import pandas as pd
import io
import datetime as dt

temp=u"""TRX_DATE;A
2013-07-05;1
2013-08-06;1
2015-09-05;2
2015-10-08;2
2015-11-05;2
2015-11-25;2
2015-12-06;3"""

# parse_dates=[0] parses the first column (TRX_DATE) as datetime
df = pd.read_csv(io.StringIO(temp), sep=";", parse_dates=[0])
print(df)
#    TRX_DATE  A
#0 2013-07-05  1
#1 2013-08-06  1
#2 2015-09-05  2
#3 2015-10-08  2
#4 2015-11-05  2
#5 2015-11-25  2
#6 2015-12-06  3

print(df.dtypes)
#TRX_DATE    datetime64[ns]
#A                    int64
#dtype: object

#copy column TRX_DATE to index
#df.index = pd.to_datetime(df['TRX_DATE'])
#no copy, only set column TRX_DATE to index
df = df.set_index(['TRX_DATE'])
print(df)
#            A
#TRX_DATE
#2013-07-05  1
#2013-08-06  1
#2015-09-05  2
#2015-10-08  2
#2015-11-05  2
#2015-11-25  2
#2015-12-06  3

#outputs below are from a run on 2015-11-06
#wrap the date in a Timestamp so it compares cleanly with datetime64 values
currentdate = pd.Timestamp(dt.date.today())
print(currentdate)
#2015-11-06 00:00:00

startdate = currentdate - pd.Timedelta(days=30)
print(startdate)
#2015-10-07 00:00:00

#slicing a DatetimeIndex by label includes both endpoints
dflast30 = df.loc[startdate:currentdate]
print(dflast30)
#            A
#TRX_DATE
#2015-10-08  2
#2015-11-05  2

#move TRX_DATE back to an ordinary column
dflast30 = dflast30.reset_index()
print(dflast30)
#    TRX_DATE  A
#0 2015-10-08  2
#1 2015-11-05  2
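The truncate attempt from the question can be made to work the same way once the index is a DatetimeIndex. A minimal sketch, assuming a fixed reference date (2015-11-06, chosen only to keep the result reproducible) in place of today's date. Note that in the question the two arguments look swapped: before should be the earlier date and after the later one.

```python
import pandas as pd

# Same sample data as in the answer, built directly with a DatetimeIndex
df = pd.DataFrame(
    {"A": [1, 1, 2, 2, 2, 2, 3]},
    index=pd.to_datetime(["2013-07-05", "2013-08-06", "2015-09-05", "2015-10-08",
                          "2015-11-05", "2015-11-25", "2015-12-06"]),
)
df.index.name = "TRX_DATE"

# Fixed reference date (an assumption) instead of today's date
currentdate = pd.Timestamp("2015-11-06")
startdate = currentdate - pd.Timedelta(days=30)

# truncate expects a sorted index; rows before `before` and after `after` are dropped
df = df.sort_index()
dflast30 = df.truncate(before=startdate, after=currentdate)
print(dflast30)
```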

Alternatively, filter df on the TRX_DATE column directly, without a DatetimeIndex.

import pandas as pd
import io
import datetime as dt

temp=u"""TRX_DATE;A
2013-07-05;1
2013-08-06;1
2015-09-05;2
2015-10-08;2
2015-11-05;2
2015-11-25;2
2015-12-06;3"""

# parse_dates=[0] parses the first column (TRX_DATE) as datetime
df = pd.read_csv(io.StringIO(temp), sep=";", parse_dates=[0])
print(df)
#    TRX_DATE  A
#0 2013-07-05  1
#1 2013-08-06  1
#2 2015-09-05  2
#3 2015-10-08  2
#4 2015-11-05  2
#5 2015-11-25  2
#6 2015-12-06  3

print(df.dtypes)
#TRX_DATE    datetime64[ns]
#A                    int64
#dtype: object

#outputs below are from a run on 2015-11-06
#wrap the date in a Timestamp so it compares cleanly with datetime64 values
currentdate = pd.Timestamp(dt.date.today())
print(currentdate)
#2015-11-06 00:00:00

startdate = currentdate - pd.Timedelta(days=30)
print(startdate)
#2015-10-07 00:00:00

#boolean mask on the column; the default integer index is kept
dflast30 = df[(df.TRX_DATE >= startdate) & (df.TRX_DATE <= currentdate)]
print(dflast30)
#    TRX_DATE  A
#3 2015-10-08  2
#4 2015-11-05  2
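The two-sided comparison can also be written with Series.between, which is inclusive on both ends by default. A minimal sketch, again assuming a fixed reference date (2015-11-06) rather than today's date so that the result is reproducible:

```python
import pandas as pd

# Same sample data as in the answer, with TRX_DATE as an ordinary datetime column
df = pd.DataFrame({
    "TRX_DATE": pd.to_datetime(["2013-07-05", "2013-08-06", "2015-09-05", "2015-10-08",
                                "2015-11-05", "2015-11-25", "2015-12-06"]),
    "A": [1, 1, 2, 2, 2, 2, 3],
})

# Fixed reference date (an assumption) instead of today's date
currentdate = pd.Timestamp("2015-11-06")
startdate = currentdate - pd.Timedelta(days=30)

# between() builds the same boolean mask as (>= startdate) & (<= currentdate)
mask = df["TRX_DATE"].between(startdate, currentdate)
dflast30 = df[mask]
print(dflast30)
```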

Source: https://habr.com/ru/post/1614722/

