Zipline backtesting using non-US (European) intraday data

Question


I am trying to get zipline to work with non-US, intraday data that I loaded into a pandas DataFrame:

                            BARC    HSBA     LLOY     STAN
    Date
    2014-07-01 08:30:00  321.250  894.55  112.105  1777.25
    2014-07-01 08:32:00  321.150  894.70  112.095  1777.00
    2014-07-01 08:34:00  321.075  894.80  112.140  1776.50
    2014-07-01 08:36:00  321.725  894.80  112.255  1777.00
    2014-07-01 08:38:00  321.675  894.70  112.290  1777.00

I followed the tutorial here, replacing "AAPL" with my own symbol code and the history calls with "1m" data instead of "1d".

Then I make the final call with algo_obj.run(DataFrameSource(mydf)), where mydf is the DataFrame above.

However, there are all kinds of problems associated with TradingEnvironment . According to the source code:

    # This module maintains a global variable, environment, which is
    # subsequently referenced directly by zipline financial
    # components. To set the environment, you can set the property on
    # the module directly:
    #       from zipline.finance import trading
    #       trading.environment = TradingEnvironment()
    #
    # or if you want to switch the environment for a limited context
    # you can use a TradingEnvironment in a with clause:
    #       lse = TradingEnvironment(bm_index="^FTSE", exchange_tz="Europe/London")
    #       with lse:
    #           # the code here will have lse as the global trading.environment
    #           algo.run(start, end)

However, using the with-clause context does not seem to work fully. I still get errors, for example stating that my timestamps are before the market open (and indeed, trading.environment.open_and_closes shows the times for the US market).
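The "before market open" error is easier to see once both times are expressed in UTC. The following is a minimal sketch using only pandas; it assumes the bars are London wall-clock times and that the default environment follows the regular NYSE session opening at 09:30 New York time. Neither assumption is confirmed by the error message itself.

```python
import pandas as pd

# First bar from the question, assumed to be London wall-clock time.
bar_utc = (pd.Timestamp("2014-07-01 08:30:00")
           .tz_localize("Europe/London")
           .tz_convert("UTC"))

# Assumed default session open: 09:30 New York time on the same day.
us_open_utc = (pd.Timestamp("2014-07-01 09:30:00")
               .tz_localize("America/New_York")
               .tz_convert("UTC"))

print("bar (UTC):     ", bar_utc)
print("US open (UTC): ", us_open_utc)
print("bar is before the US open:", bar_utc < us_open_utc)
```

Under these assumptions every morning bar from London falls hours before the US open, which is consistent with the error described.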

My question is: has anyone managed to use zipline with non-US, intraday data? Could you point me to a resource and ideally a code example on how to do this?

N.B. I saw that there are tests on GitHub that relate to trading calendars (tradingcalendar_lse.py, tradingcalendar_tse.py, etc.), but these seem to handle data only at the daily level. I would need to fix:

opening / closing times
reference data for the benchmark
and maybe more ...


zipline hft

Luciano Aug 6 '14 at 16:29


2 answers

Luciano · Answer 1 · 2014-08-08T08:00:32+0000

I got it working after experimenting with the tutorial. Sample code is below. It uses the DataFrame mid, as described in the original question. A few points to note:

Trading calendar. I create one manually and assign it to trading.environment, using non_trading_days from tradingcalendar_lse.py. Alternatively, you can create one that exactly matches your data (however, this may be a problem for out-of-sample data). Two fields need to be defined: trading_days and open_and_closes.
sim_params. There is a problem with the start / end defaults, as they are not timezone-aware. You must therefore create a sim_params object and pass timezone-aware start / end parameters.
In addition, run() must be called with the argument overwrite_sim_params=False, because calculate_first_open / calculate_first_close raise timestamp errors.
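The open_and_closes table can also be built from local session times rather than fixed offsets from midnight, so that daylight saving changes are handled by the time zone database. The following is a sketch in plain pandas, not zipline code: build_sessions is a hypothetical helper, the 08:00 to 16:30 London session is an assumption, and the single holiday is only an illustrative entry.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay


def build_sessions(start, end, holidays, tz="Europe/London",
                   open_time="08:00:00", close_time="16:30:00"):
    """Return UTC market_open/market_close per trading day (hypothetical helper)."""
    # Weekdays between start and end, minus the supplied holidays.
    days = pd.date_range(start=start, end=end,
                         freq=CustomBusinessDay(holidays=holidays))
    # Attach local wall-clock session times, then convert to UTC.
    open_utc = (days + pd.Timedelta(open_time)).tz_localize(tz).tz_convert("UTC")
    close_utc = (days + pd.Timedelta(close_time)).tz_localize(tz).tz_convert("UTC")
    return pd.DataFrame({"market_open": open_utc, "market_close": close_utc},
                        index=days)


# Illustrative holiday list (UK summer bank holiday 2014); a real run would
# take the full list from a maintained calendar.
holidays = [pd.Timestamp("2014-08-25")]

summer = build_sessions("2014-08-22", "2014-08-27", holidays)
winter = build_sessions("2014-12-01", "2014-12-01", holidays)

print(summer)
print(winter)
```

The summer rows open at 07:00 UTC and the winter row at 08:00 UTC, so a table built this way stays aligned with London hours across the clock change, whereas a fixed offset matches only one half of the year.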

Note that it is also possible to pass pandas Panel data, with the fields open, high, low, close, price and volume on the minor_axis. In that case, however, all of these fields are required; otherwise errors are raised.

Please note that this code only produces a daily performance summary. I am sure there must be a way to get results at minute resolution (I thought this was controlled by emission_rate, but apparently it is not). If anyone knows, please comment and I will update the code. Also, I am not sure which API call triggers analyze() (i.e. when using the %%zipline magic in IPython, as in the tutorial, the analyze() method is called automatically. How do I call it manually?)

    import pandas as pd

    from zipline.algorithm import TradingAlgorithm
    from zipline.api import order_target, record, symbol, history, add_history
    from zipline.finance import trading
    from zipline.finance.trading import TradingEnvironment
    from zipline.utils import tradingcalendar_lse
    from zipline.utils.factory import create_simulation_parameters


    def initialize(context):
        # Register 2 histories that track minute prices,
        # one with a 10-bar window and one with a 30-bar window
        add_history(10, '1m', 'price')
        add_history(30, '1m', 'price')
        context.i = 0


    def handle_data(context, data):
        # Skip first 30 mins to get full windows
        context.i += 1
        if context.i < 30:
            return

        # Compute averages.
        # history() has to be called with the same params
        # as above and returns a pandas DataFrame.
        short_mavg = history(10, '1m', 'price').mean()
        long_mavg = history(30, '1m', 'price').mean()
        sym = symbol('BARC')

        # Trading logic
        if short_mavg[sym] > long_mavg[sym]:
            # order_target orders as many shares as needed to
            # achieve the desired number of shares.
            order_target(sym, 100)
        elif short_mavg[sym] < long_mavg[sym]:
            order_target(sym, 0)

        # Save values for later inspection
        record(BARC=data[sym].price,
               short_mavg=short_mavg[sym],
               long_mavg=long_mavg[sym])


    def analyze(context, perf):
        perf["pnl"].plot(title="Strategy P&L")


    # This is needed to handle the correct calendar. Assume that market data
    # has the right index for tradeable days.
    # Passing in env_trading_calendar=tradingcalendar_lse doesn't appear to
    # work, as it doesn't implement open_and_closes.
    trading.environment = TradingEnvironment(bm_symbol='^FTSE',
                                             exchange_tz='Europe/London')
    # Alternative: take the trading days directly from the data.
    # trading.environment.trading_days = mid.index.normalize().unique()
    trading.environment.trading_days = pd.date_range(
        start=mid.index.normalize()[0],
        end=mid.index.normalize()[-1],
        freq=pd.tseries.offsets.CDay(holidays=tradingcalendar_lse.non_trading_days))
    trading.environment.open_and_closes = pd.DataFrame(
        index=trading.environment.trading_days,
        columns=["market_open", "market_close"])
    # 420 and 930 minutes after midnight, i.e. 07:00 and 15:30, which match the
    # 08:00-16:30 London session in UTC during British Summer Time.
    trading.environment.open_and_closes.market_open = (
        trading.environment.open_and_closes.index
        + pd.to_timedelta(60 * 7, unit="T")).to_pydatetime()
    trading.environment.open_and_closes.market_close = (
        trading.environment.open_and_closes.index
        + pd.to_timedelta(60 * 15 + 30, unit="T")).to_pydatetime()

    # Bug in code doesn't set tz if start/end are not specified
    # (finance/trading.py:SimulationParameters.calculate_first_open[close])
    sim_params = create_simulation_parameters(
        start=pd.to_datetime("2014-07-01 08:30:00").tz_localize("Europe/London").tz_convert("UTC"),
        end=pd.to_datetime("2014-07-24 16:30:00").tz_localize("Europe/London").tz_convert("UTC"),
        data_frequency="minute",
        emission_rate="minute",
        sids=["BARC"])

    # Create algorithm object passing in initialize and
    # handle_data functions
    algo_obj = TradingAlgorithm(initialize=initialize,
                                handle_data=handle_data,
                                sim_params=sim_params)

    # Run algorithm.
    # overwrite_sim_params=True calls calculate_first_open[close] (see above)
    perf_manual = algo_obj.run(mid, overwrite_sim_params=False)
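As a cross-check that does not depend on zipline, the crossover rule in handle_data can be approximated in plain pandas. This is only a sketch: crossover_targets is a hypothetical helper, the price series is synthetic, and order fills, commission and slippage are not modelled, so the output is the target position per bar rather than a performance figure.

```python
import numpy as np
import pandas as pd


def crossover_targets(prices, short_window=10, long_window=30, size=100):
    """Target share count per bar for a moving-average crossover (hypothetical helper)."""
    short_mavg = prices.rolling(short_window).mean()
    long_mavg = prices.rolling(long_window).mean()

    target = pd.Series(np.nan, index=prices.index)
    target[short_mavg > long_mavg] = size   # short average above long: hold `size` shares
    target[short_mavg < long_mavg] = 0      # short average below long: hold nothing
    # Equal averages keep the previous target; bars before the long window
    # is full compare as False (NaN) and default to a flat position.
    return target.ffill().fillna(0).astype(int)


# Synthetic, steadily rising prices on a 2-minute grid (illustrative data only).
index = pd.date_range("2014-07-01 08:30", periods=40, freq="2min")
rising = pd.Series(321.0 + np.arange(40, dtype=float), index=index)

targets = crossover_targets(rising)
print(targets.tail())
```

Comparing these targets bar by bar with the recorded short_mavg and long_mavg values from a backtest is one way to confirm that the history windows are being filled as expected.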

user4634850 · Answer 2 · 2015-03-05T02:48:38+0000

@Luciano

You can add analyze(None, perf_manual) at the end of your code to run the analysis step manually.
