I have a DataFrame with 1,500,000 rows of one-minute stock market data (Open, High, Low, Close, Volume) that I bought from QuantQuote.com. I am trying to run some home-made backtests of stock trading strategies. The straightforward Python approach to transaction processing is too slow, and I wanted to try numba to speed things up. The trouble is that numba does not work with pandas functions.
Google searches turn up surprisingly little information on using numba with pandas, which makes me wonder whether I am misguided in even considering this.
My setup is Numba 0.13.0-1 and pandas 0.13.1-1, on Windows 7 with MS VS2013 + PTVS, Python 2.7, and Enthought Canopy.
My existing Python + pandas inner loop has the following general structure:
- Compute the indicator columns (using pd.ewma, pd.rolling_max, pd.rolling_min, etc.).
- Compute “event” columns for predefined events, such as moving-average crosses, new highs, etc.
Then I use DataFrame.iterrows to step through the DataFrame row by row.
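For concreteness, the pipeline above can be sketched roughly like this. The column names, indicator parameters, and entry/exit rules are made up for illustration, not the asker's actual strategy; also note that `pd.ewma` and `pd.rolling_max` were removed in later pandas, so the sketch uses the modern `.ewm()` / `.rolling()` equivalents:

```python
import numpy as np
import pandas as pd

# Hypothetical minute-bar data standing in for the QuantQuote file.
n = 500
rng = np.random.default_rng(0)
df = pd.DataFrame({"close": 100 + np.cumsum(rng.normal(0, 0.1, n))})

# Step 1: indicator columns (modern equivalents of pd.ewma / pd.rolling_max).
df["ema_fast"] = df["close"].ewm(span=12, adjust=False).mean()
df["ema_slow"] = df["close"].ewm(span=26, adjust=False).mean()
df["roll_max"] = df["close"].rolling(window=60, min_periods=1).max()

# Step 2: "event" columns, e.g. a moving-average cross and a new rolling high.
fast_above = df["ema_fast"] > df["ema_slow"]
df["cross_up"] = fast_above & ~fast_above.shift(1, fill_value=False)
df["new_high"] = df["close"] >= df["roll_max"]

# Step 3: the row-by-row transaction loop via iterrows -- the slow part.
position = 0
trades = 0
for _, row in df.iterrows():
    if row["cross_up"] and position == 0:
        position = 1          # enter on a cross up
        trades += 1
    elif not row["new_high"] and position == 1:
        position = 0          # exit when the new-high condition fails
```

The vectorized steps 1 and 2 are fast; it is the interpreted step-3 loop, where each `iterrows` call builds a fresh Series, that dominates the runtime.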
I have tried various optimizations, but it is still not as fast as I would like, and the optimizations tend to introduce errors.
I want to use numba to speed up the row-by-row processing. Are there any preferred approaches for doing this?
Rather than iterating over the DataFrame itself, extract the underlying NumPy arrays with DataFrame.values (or the per-column Series .values) and pass those to a numba-compiled function. Numba compiles plain loops over NumPy arrays down to fast machine code, but it does not understand pandas objects, so pulling out the arrays via DataFrame.values is the key step.