How can I avoid for loops and nested if statements and be more Pythonic?
At first glance, this may look like a "do all my work for me" question. I can assure you that it is not. I am trying to learn some real Python and would like to find ways to speed up my code, based on a reproducible example and a predefined function.
I calculate returns from specific signals in the financial markets using loads of for loops and nested if statements. I have made several attempts, but I just do not get anywhere with vectorization, comprehensions, or other more Pythonic tools. I have been fine with this so far, but I am finally starting to feel the pain of using functions that are too slow at scale.
I have a dataframe with two indicators and one specific event. The first two code snippets are included to show the procedure step by step. I have included the whole thing, with some predefined settings and wrapped in a function, at the very end.
In [1]
import numpy as np
import pandas as pd
import datetime

np.random.seed(12345678)
Observations = 10

# Two random indicators (0-9) and a random 0/1 event flag
df = pd.DataFrame(np.random.randint(0, 10, size=(Observations, 2)),
                  columns=['IndicatorA', 'IndicatorB'])
df['Event'] = np.random.randint(0, 2, size=Observations)

# Daily dates starting today, used as the index
datelist = pd.date_range(datetime.datetime.today().strftime('%Y-%m-%d'),
                         periods=Observations).tolist()
df['Dates'] = datelist
df = df.set_index(['Dates'])
df['Signal'] = 0
print(df)
Out [1]

The dataframe is indexed by date. The signal I'm looking for is determined by the interaction of the indicators and the event. The signal is calculated as follows (continuing from the snippet above):
In [2]
i = 0
for signals in df['Signal']:
    if i == 0:
        # The first row has no previous row to look at
        df.loc[df.index[i], 'Signal'] = 0
    else:
        if df.loc[df.index[i], 'IndicatorA'] > 5:
            df.loc[df.index[i], 'Signal'] = 1
        else:
            # Otherwise: yesterday's IndicatorB together with today's Event
            if df.loc[df.index[i - 1], 'IndicatorB'] > 5 and df.loc[df.index[i], 'Event'] > 1:
                df.loc[df.index[i], 'Signal'] = 1
            else:
                df.loc[df.index[i], 'Signal'] = 0
    i = i + 1
print(df['Signal'])
Out [2]
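For comparison, the rule in the loop above can also be written with whole-column boolean masks. This is only a minimal sketch of one possible approach, not a tuned solution. The names `signal_loop` and `signal_vectorized` are made up for illustration, and the test data draws Event from 0 to 2 (an assumption that differs from the 0/1 values above) so that the `Event > 1` branch can actually fire.

```python
import numpy as np
import pandas as pd

def signal_loop(df):
    """Row-by-row reference implementation of the signal rule."""
    out = []
    for i in range(len(df)):
        if i == 0:
            out.append(0)
        elif df['IndicatorA'].iloc[i] > 5:
            out.append(1)
        elif df['IndicatorB'].iloc[i - 1] > 5 and df['Event'].iloc[i] > 1:
            out.append(1)
        else:
            out.append(0)
    return pd.Series(out, index=df.index)

def signal_vectorized(df):
    """Same rule expressed with boolean masks over whole columns."""
    a_high = df['IndicatorA'] > 5
    # shift(1) lines up the previous row's IndicatorB with the current row;
    # the first row becomes NaN, and NaN > 5 evaluates to False.
    b_high_prev = df['IndicatorB'].shift(1) > 5
    event_hit = df['Event'] > 1
    signal = (a_high | (b_high_prev & event_hit)).astype(int)
    signal.iloc[0] = 0  # the first row is always 0, as in the loop
    return signal

# Hypothetical test data: Event in 0..2 is an assumption made for this sketch
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    'IndicatorA': rng.integers(0, 10, size=n),
    'IndicatorB': rng.integers(0, 10, size=n),
    'Event': rng.integers(0, 3, size=n),
}, index=pd.date_range('2020-01-01', periods=n))

# Both versions should agree on every row
print((signal_loop(df) == signal_vectorized(df)).all())
```

The parentheses around each comparison matter here: `&` and `|` bind more tightly than `>`, so each mask is built first and only then combined.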

Below is the whole thing wrapped in a function that returns the mean of the Signal column. It can be timed with %time in IPython.
import numpy as np
import pandas as pd
import datetime

def fxSlow(Observations):
    np.random.seed(12345678)
    # Two random indicators (0-9) and a random 0/1 event flag
    df = pd.DataFrame(np.random.randint(0, 10, size=(Observations, 2)),
                      columns=['IndicatorA', 'IndicatorB'])
    df['Event'] = np.random.randint(0, 2, size=Observations)
    datelist = pd.date_range(datetime.datetime.today().strftime('%Y-%m-%d'),
                             periods=Observations).tolist()
    df['Signal'] = 0
    df['Dates'] = datelist
    df = df.set_index(['Dates'])
    i = 0
    for signals in df['Signal']:
        if i == 0:
            # The first row has no previous row to look at
            df.loc[df.index[i], 'Signal'] = 0
        else:
            if df.loc[df.index[i], 'IndicatorA'] > 5:
                df.loc[df.index[i], 'Signal'] = 1
            else:
                # Otherwise: yesterday's IndicatorB together with today's Event
                if df.loc[df.index[i - 1], 'IndicatorB'] > 5 and df.loc[df.index[i], 'Event'] > 1:
                    df.loc[df.index[i], 'Signal'] = 1
                else:
                    df.loc[df.index[i], 'Signal'] = 0
        i = i + 1
    return np.mean(df['Signal'])

So, how can I make this faster and more Pythonic?
And how would it perform for, say, 100000 observations?
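To get a feel for the scaling question outside IPython, something like the following could be used. It is a rough sketch under a few assumptions: `fx_fast` is a hypothetical name, the random data comes from `np.random.default_rng` so the numbers will not match the seeded example above, and the index uses minute frequency from a fixed start date, because 100000 daily dates starting today would run past the year-2262 limit of pandas' nanosecond timestamps.

```python
import time

import numpy as np
import pandas as pd

def fx_fast(observations, seed=12345678):
    """Hypothetical mask-based counterpart to the looping function."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.integers(0, 10, size=(observations, 2)),
                      columns=['IndicatorA', 'IndicatorB'])
    df['Event'] = rng.integers(0, 2, size=observations)
    # Minute frequency keeps 100000 rows inside the supported timestamp range
    df.index = pd.date_range('2000-01-01', periods=observations,
                             freq='min', name='Dates')
    # Signal rule as boolean masks; shift(1) looks at the previous row
    hit = (df['IndicatorA'] > 5) | (
        (df['IndicatorB'].shift(1) > 5) & (df['Event'] > 1)
    )
    hit.iloc[0] = False  # the first row never signals
    df['Signal'] = hit.astype(int)
    return df['Signal'].mean()

# Wall-clock timing for a small and a large number of observations
for n in (1_000, 100_000):
    start = time.perf_counter()
    result = fx_fast(n)
    elapsed = time.perf_counter() - start
    print(f"n={n}: mean signal {result:.4f} in {elapsed:.4f} s")
```

Timings depend on the machine, so the printed numbers are only useful relative to each other and to the looping version run on the same hardware.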
