How to calculate moving rank correlation using Pandas

I would like to compute a moving (rolling) rank correlation between two columns of a DataFrame. However, the current rolling_corr in pandas does not support rank correlation. I tried to build a rolling rank correlation with rolling_apply, but had no success: rolling_apply seems to pass only a single array to the function, whereas a correlation needs two. Is there a reasonable way to compute a moving rank correlation with rolling_apply or by some other method? A rank-based version would be a good complement to rolling_corr.

1 answer

I don’t think rolling_apply can be used to compute a moving correlation, since it appears to split a DataFrame into 1-dimensional arrays. There may be more efficient ways to do this, but one solution is to write a generator that yields a slice for each window:

    def window(length, size=2, start=0):
        # Yield one slice per window position, moving forward one row at a time.
        while start + size <= length:
            yield slice(start, start + size)
            start += 1

and then iterate over it:

    from pandas import DataFrame
    import numpy as np

    df = DataFrame(np.arange(10).reshape(2, 5).T, columns=['a', 'b'])
    df.iloc[0, 1] = -1  # still perfect with rank correlation, but not with Pearson's r

    for w in window(len(df), size=3):
        df_win = df.iloc[w, :]
        # Spearman correlation is the Pearson correlation of the ranks.
        spearman = df_win['a'].rank().corr(df_win['b'].rank())
        pearson = df_win['a'].corr(df_win['b'])
        print(w.start, spearman, pearson)

This prints:

    0 1.0 0.917662935482
    1 1.0 1.0
    2 1.0 1.0
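The loop above can also be wrapped in a reusable function that returns a Series aligned with the input index. The following is only a sketch of that idea, not an existing pandas API: the name `rolling_spearman` and its parameters are hypothetical, and it assumes both inputs share the same index and that a plain Python loop is fast enough for your data.

```python
import numpy as np
import pandas as pd


def rolling_spearman(x, y, window):
    """Rank correlation of x and y over each full trailing window.

    Hypothetical helper: ranks are computed inside each window, then
    correlated with Pearson's r, which is Spearman's coefficient.
    """
    out = pd.Series(np.nan, index=x.index, dtype=float)
    for end in range(window, len(x) + 1):
        win = slice(end - window, end)
        # Store the result at the last row of the window.
        out.iloc[end - 1] = x.iloc[win].rank().corr(y.iloc[win].rank())
    return out


df = pd.DataFrame(np.arange(10).reshape(2, 5).T, columns=["a", "b"])
df.iloc[0, 1] = -1
result = rolling_spearman(df["a"], df["b"], window=3)
print(result)
```

Positions that do not yet have a full window are left as NaN, which mirrors how rolling statistics are usually reported. The cost grows with the number of windows times the window size, so this is better suited to moderate data sizes.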

Source: https://habr.com/ru/post/974856/

