Does Pandas calculate ewm incorrectly?

Question

When I try to calculate the exponential moving average (EMA) of financial data in a DataFrame, the pandas ewm approach appears to give incorrect results.

The basics are well explained in the following link: http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages

The pandas documentation describes the following approach (with the adjust parameter set to False):

   weighted_average[0] = arg[0];
   weighted_average[i] = (1-alpha) * weighted_average[i-1] + alpha * arg[i]

In my opinion this is incorrect. arg should be (for example) the closing values, but the value playing the role of arg[0] should be the first average (i.e. the simple average of the first run of data whose length is the selected period), NOT the first closing value. Therefore arg[0] and arg[i] can never come from the same data. Using the min_periods parameter does not seem to solve this problem.

Can someone explain to me how (or if) Pandas can be used to correctly calculate EMA data?

+8

pandas exponential moving-average

jeronimo Jun 20 '16 at 13:55

4 answers

chrisb · Answer 1 · 2016-06-20T16:00:35+0000

There are several ways to initialize an exponential moving average, so I would not say that pandas is doing it wrong, just differently.

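Here is one way to calculate it the way you describe, seeding the EMA with the simple moving average of the first span values: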

In [20]: s.head()
Out[20]: 
0    22.27
1    22.19
2    22.08
3    22.17
4    22.18
Name: Price, dtype: float64

In [21]: span = 10

In [22]: sma = s.rolling(window=span, min_periods=span).mean()[:span]

In [24]: rest = s[span:]

In [25]: pd.concat([sma, rest]).ewm(span=span, adjust=False).mean()
Out[25]: 
0           NaN
1           NaN
2           NaN
3           NaN
4           NaN
5           NaN
6           NaN
7           NaN
8           NaN
9     22.221000
10    22.208091
11    22.241165
12    22.266408
13    22.328879
14    22.516356
15    22.795200
16    22.968800
17    23.125382
18    23.275312
19    23.339801
20    23.427110
21    23.507635
22    23.533520
23    23.471062
24    23.403596
25    23.390215
26    23.261085
27    23.231797
28    23.080561
29    22.915004
Name: Price, dtype: float64
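
For anyone who wants to run this without the original s, here is a self-contained sketch of the same idea. The helper name sma_seeded_ema and the sample prices are made up for illustration and are not part of the answer above. It seeds the EMA with the simple average of the first span values and cross-checks the result against a plain loop.

```python
import numpy as np
import pandas as pd


def sma_seeded_ema(prices: pd.Series, span: int) -> pd.Series:
    """EMA whose first value is the SMA of the first `span` prices."""
    # SMA over the first `span` rows: NaN everywhere except the last row.
    seed = prices.rolling(window=span, min_periods=span).mean().iloc[:span]
    # The remaining raw prices follow the seed value.
    rest = prices.iloc[span:]
    return pd.concat([seed, rest]).ewm(span=span, adjust=False).mean()


# Made-up prices, purely for illustration.
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0], name="Price")
span = 3
ema = sma_seeded_ema(prices, span)

# Cross-check with an explicit loop using alpha = 2 / (span + 1).
alpha = 2 / (span + 1)
expected = [np.nan] * (span - 1) + [prices.iloc[:span].mean()]
for price in prices.iloc[span:]:
    expected.append((1 - alpha) * expected[-1] + alpha * price)

assert np.allclose(ema.to_numpy(), expected, equal_nan=True)
```

The leading NaN rows are skipped by ewm, so the recursion effectively starts at the SMA value.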

arkochhar · Answer 2 · 2017-10-01T04:31:55+0000

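You can compute EWMA using either alpha or a coefficient (span) with the pandas ewm function.

Using alpha: (1 - alpha) * previous_val + alpha * current_val, where alpha = 1 / period

Using coeff: ((current_val - previous_val) * coeff) + previous_val, where coeff = 2 / (period + 1)

Here is how pandas can be used to compute the formulas above: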

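# df is the DataFrame, base the input column name, target the output column name;
# alpha is a boolean flag that selects the alpha form instead of the span form.
# Seed with the SMA of the first `period` rows, then append the remaining raw values.
con = pd.concat([df[:period][base].rolling(window=period).mean(), df[period:][base]])

if alpha:
    df[target] = con.ewm(alpha=1 / period, adjust=False).mean()
else:
    df[target] = con.ewm(span=period, adjust=False).mean()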
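
Both update rules above are the same recursion written two ways; only the smoothing factor differs (1 / period versus 2 / (period + 1)). Below is a small check, with made-up numbers and a hypothetical helper ema_loop, that the alpha= and span= arguments of ewm correspond to those two factors:

```python
import numpy as np
import pandas as pd


def ema_loop(values, factor):
    """Plain-Python recursion: prev + factor * (cur - prev)."""
    out = [values[0]]
    for cur in values[1:]:
        out.append(out[-1] + factor * (cur - out[-1]))
    return np.array(out)


values = [1.0, 2.0, 4.0, 8.0, 16.0]  # made-up data
period = 4
s = pd.Series(values)

# alpha form: smoothing factor 1 / period
by_alpha = s.ewm(alpha=1 / period, adjust=False).mean().to_numpy()
# span form: smoothing factor 2 / (period + 1)
by_span = s.ewm(span=period, adjust=False).mean().to_numpy()

assert np.allclose(by_alpha, ema_loop(values, 1 / period))
assert np.allclose(by_span, ema_loop(values, 2 / (period + 1)))
```

The two results differ from each other, so it matters which convention the indicator you are reproducing uses.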

Ben · Answer 3 · 2018-04-26T20:24:09+0000

Here is an example of how pandas calculates both the adjusted and the non-adjusted ewm:

import numpy as np
import pandas as pd

name = 'closing'
series = pd.Series([1, 2, 3, 5, 8, 13, 21, 34], name=name).to_frame()
period = 4
alpha = 2/(1+period)

series[name+'_ewma'] = np.nan
series.loc[0, name+'_ewma'] = series[name].iloc[0]

series[name+'_ewma_adjust'] = np.nan
series.loc[0, name+'_ewma_adjust'] = series[name].iloc[0]

for i in range(1, len(series)):
    series.loc[i, name+'_ewma'] = (1-alpha) * series.loc[i-1, name+'_ewma'] + alpha * series.loc[i, name]

    adjusted_weights = np.array([(1-alpha)**(i-t) for t in range(i+1)])
    series.loc[i, name+'_ewma_adjust'] = np.sum(series.iloc[0:i+1][name].values * adjusted_weights) / adjusted_weights.sum()

print(series)
print("diff adjusted=False -> ", np.sum(series[name+'_ewma'] - series[name].ewm(span=period, adjust=False).mean()))
print("diff adjusted=True -> ", np.sum(series[name+'_ewma_adjust'] - series[name].ewm(span=period, adjust=True).mean()))

More information is available at https://github.com/pandas-dev/pandas/issues/8861.

tentativafc · Answer 4 · 2019-07-01T02:44:29+0000

If you calculate the ewm of an ewm (as in the MACD formula), you will get poor results, because the second and subsequent ewm calls will use an index range starting at 0 and ending at period, even though the input now begins with NaN values. I am using the following solution.

sma = df['Close'].rolling(period, min_periods=period).mean()
# Number of leading NaNs in the input; shifts the start so the SMA seed
# lines up with the first valid value
idx_start = sma.isna().sum() + 1 - period
idx_end = idx_start + period
sma = sma[idx_start: idx_end]
rest = df['Close'][idx_end:]
ema = pd.concat([sma, rest]).ewm(span=period, adjust=False).mean()
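
To make the chaining concrete, here is a minimal sketch that wraps the steps above in a function and applies it twice. The function name sma_seeded_ema and the prices are made up for illustration, and the sketch assumes that any NaN values in the input appear only at the start.

```python
import numpy as np
import pandas as pd


def sma_seeded_ema(series: pd.Series, period: int) -> pd.Series:
    """SMA-seeded EMA that tolerates leading NaNs in `series`."""
    sma = series.rolling(period, min_periods=period).mean()
    # Position of the first non-NaN input value (assumes NaNs are leading only).
    idx_start = sma.isna().sum() + 1 - period
    idx_end = idx_start + period
    seed = sma.iloc[idx_start:idx_end]
    rest = series.iloc[idx_end:]
    # Rows before idx_start are dropped from the result's index.
    return pd.concat([seed, rest]).ewm(span=period, adjust=False).mean()


close = pd.Series(np.arange(1.0, 21.0))  # made-up prices 1..20
first = sma_seeded_ema(close, 3)   # first valid value at position 2
second = sma_seeded_ema(first, 4)  # EMA of an EMA, as in a MACD signal line

print(second.head(6))
```

The second call seeds itself with the average of the first four valid values of first, instead of averaging over the NaN rows at the top.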
