Pandas: daily value lookup for a half-hourly data index

I have a pandas DataFrame with a half-hourly timeseries index and a separate Series of daily data whose value I need to look up, by date, for each row of the DataFrame. The following code works, using .get() inside a loop, but it is slow and seems pretty non-pythonic.

I tried turning the Series into a single-column DataFrame so I could merge or join it in, but for various reasons I couldn't get that to work (a rough sketch of that idea follows the output below). Some dates are also missing from the daily data, so KeyErrors are possible with some of the obvious approaches.

Previously answered questions don't seem to apply. Perhaps someone good with lambda functions or the .asfreq method can come up with something.

import pandas as pd
import numpy as np

# Make a 2 day series
days = 2
dates = pd.date_range('20130102',periods=days)
ts_d = pd.Series(np.random.randn(days),index=dates)
ts_d

# Output

2013-01-02   -1.044139
2013-01-03   -1.061720
Freq: D, dtype: float64

# Make an overlapping 4 day dataframe with 60min index
datetimes = pd.date_range('20130101 00:00',periods=4*24, freq = '60min')
df_t = pd.DataFrame(np.random.randn(4*24,4),index=datetimes,columns=list('ABCD'))

# Begin clunkiness
df_t['date'] = df_t.index.date
for t in df_t.index:
    d = df_t.loc[t, 'date']
    df_t.loc[t, 'E'] = ts_d.get(d)
df_t

Some output:

                         A          B           C           D          date          E
2013-01-01 20:00:00 -0.173764   -1.440833   -0.163796    0.479593    2013-01-01  None
2013-01-01 21:00:00  1.915522    2.308827   -0.849182   -1.478981    2013-01-01  None
2013-01-01 22:00:00 -0.013391   -1.534994   -2.365495    0.747692    2013-01-01  None
2013-01-01 23:00:00  0.739665   -0.566568    0.413195    0.665017    2013-01-01  None
2013-01-02 00:00:00 -0.358202   -1.625681    0.120250   -1.122430    2013-01-02 -1.044139
2013-01-02 01:00:00  1.048837   -0.328021    0.933473   -0.234328    2013-01-02 -1.044139
2013-01-02 02:00:00  1.178195   -1.389543   -0.144850   -2.430063    2013-01-02 -1.044139
2013-01-02 03:00:00 -0.420962    0.244130    1.819005   -0.982521    2013-01-02 -1.044139
.
.
.
2013-01-02 15:00:00  1.809403   -2.505042   -0.509833   -1.238630    2013-01-02 -1.044139
2013-01-02 16:00:00  0.740123   -0.205582    0.795701    0.459017    2013-01-02 -1.044139
2013-01-02 17:00:00  1.252692    1.025432   -0.235781   -0.506460    2013-01-02 -1.044139
2013-01-02 18:00:00 -1.456726   -1.983843   -1.623061    0.629214    2013-01-02 -1.044139
2013-01-02 19:00:00  1.126687   -0.253415    0.163900    0.059876    2013-01-02 -1.044139
2013-01-02 20:00:00  0.156657    0.066207    0.103946   -0.762910    2013-01-02 -1.044139
2013-01-02 21:00:00 -1.123818    0.314226   -0.281381    0.947381    2013-01-02 -1.044139
2013-01-02 22:00:00 -0.945620    0.538180    1.403452   -0.065406    2013-01-02 -1.044139
2013-01-02 23:00:00  0.059012    2.599817   -0.623826    0.796559    2013-01-02 -1.044139
2013-01-03 00:00:00  0.859748    1.476591    0.607554   -1.575007    2013-01-03  -1.06172
2013-01-03 01:00:00  0.678326    0.084930    0.762786   -1.139595    2013-01-03  -1.06172
2013-01-03 02:00:00 -0.034952   -1.224600    0.317359   -1.620755    2013-01-03  -1.06172
2013-01-03 03:00:00 -1.208597   -1.864493   -0.883250   -0.814249    2013-01-03  -1.06172
2013-01-03 04:00:00 -0.061918    0.461941    0.163563    0.532755    2013-01-03  -1.06172
.
.
.
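For reference, the merge/join idea mentioned above would look roughly like this. This is an illustrative sketch only, reusing ts_d and df_t from the code above; the column name 'E2' is arbitrary:

# Sketch of the "turn the Series into a one-column DataFrame and join it" idea
ts_df = ts_d.rename('E2').to_frame()   # daily series as a one-column DataFrame
ts_df.index = ts_df.index.date         # key on plain dates to match df_t['date']
df_t = df_t.join(ts_df, on='date')     # left join; days missing from ts_d become NaN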

The most pandathonic way I can think of:

First, add the date of each timestamp as a proper Timestamp column:

df_t['Date'] = pd.to_datetime(df_t.index.date)

Then temporarily make that date column the index:

df_t = df_t.reset_index().set_index('Date')

Now a plain column assignment aligns on the date (dates missing from ts_d simply become NaN):

df_t['E'] = ts_d

Then reset back to the original timestamp index:

df_t = df_t.reset_index().set_index('index')

To check one day:

df_t.loc[pd.to_datetime('20130102')]

* edit: thanks, Jeff
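Putting those steps together as one self-contained, runnable snippet (a sketch of the approach above, with .loc in place of the old, now-removed .ix; the random numbers will of course differ):

import numpy as np
import pandas as pd

# Daily series and the hourly frame, as in the question
ts_d = pd.Series(np.random.randn(2), index=pd.date_range('20130102', periods=2))
idx = pd.date_range('20130101 00:00', periods=4 * 24, freq='60min')
df_t = pd.DataFrame(np.random.randn(4 * 24, 4), index=idx, columns=list('ABCD'))

# 1. Each timestamp's date as a Timestamp column, so it can align with ts_d's index
df_t['Date'] = pd.to_datetime(df_t.index.date)

# 2. Swap the index to the date and let the assignment align; missing dates become NaN
df_t = df_t.reset_index().set_index('Date')
df_t['E'] = ts_d

# 3. Restore the original timestamp index (reset_index named the old index 'index')
df_t = df_t.reset_index().set_index('index')

# Spot-check one day
print(df_t.loc['2013-01-02'])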


Alternatively, you can group df_t by the date part of its index and fill the column group by group:

df_t.loc[:, 'E'] = None
for k, group in df_t.groupby(df_t.index.date):
    df_t.loc[group.index, 'E'] = ts_d.get(k)

Since ts_d.get(k) returns None when a date is missing from ts_d, days without daily data are simply left empty instead of raising a KeyError.
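Not from either answer above, but for completeness: with a reasonably recent pandas the whole lookup can be done without any Python-level loop by normalizing the timestamps to midnight and reindexing the daily series against them. A sketch, again reusing ts_d and df_t from the question:

# Vectorized lookup: map each timestamp to midnight, then align against the daily series.
# Dates absent from ts_d come back as NaN, so no KeyError is possible.
df_t['E'] = ts_d.reindex(df_t.index.normalize()).to_numpy()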


Source: https://habr.com/ru/post/1548677/

