Median pandas dataframe

Question


I have a DataFrame df:

    name  count
    aaaa   2000
    bbbb   1900
    cccc    900
    dddd    500
    eeee    100

I would like to see rows that are within 10 times of the median of the count column.

I tried df['count'].median() and got the median, but I don’t know how to proceed. Can you suggest how I could use pandas / numpy for this?

Expected Result:

    name  count  distance from median
    aaaa   2000  *****

I can use any measure as the distance from the median (absolute deviation from the median, quantile, etc.).

+6

python numpy pandas r

Ssank Apr 21 '15 at 16:58


2 answers

The median absolute deviation of a column can also be calculated with statsmodels.robust.scale.mad, which accepts a normalization constant c. Setting c=1, as here, returns the raw, unscaled value.

    >>> from statsmodels.robust.scale import mad
    >>> mad(df['count'], c=1)
    800.0
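If statsmodels is not installed, the same quantity can be sketched with pandas alone. The helper name mad_scaled below is hypothetical (it is not part of any library), and the frame is rebuilt from the data in the question. Dividing by c is meant to mirror how the normalization constant is applied, so c=1 leaves the raw median absolute deviation.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "name": ["aaaa", "bbbb", "cccc", "dddd", "eeee"],
    "count": [2000, 1900, 900, 500, 100],
})


def mad_scaled(values, c=1.0):
    """Hypothetical helper: median absolute deviation divided by the constant c."""
    values = pd.Series(values, dtype="float64")
    # Absolute distance of every value from the median, then the median of those.
    return (values - values.median()).abs().median() / c


raw_mad = mad_scaled(df["count"], c=1)
print(raw_mad)  # 800.0
```

With c=1 this matches the 800.0 shown above; any other c only rescales the result.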

+1

miradulo Feb 26 '17 at 1:52


Computerfellow · Accepted Answer · 2015-04-21T17:07:38+0000

If you are looking for how to calculate the median absolute deviation:

    In [1]: df['dist'] = abs(df['count'] - df['count'].median())

    In [2]: df
    Out[2]:
       name  count  dist
    0  aaaa   2000  1100
    1  bbbb   1900  1000
    2  cccc    900     0
    3  dddd    500   400
    4  eeee    100   800

    In [3]: df['dist'].median()
    Out[3]: 800.0
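To get from the distance column to the rows the question asks about, one option is to compare each distance against a multiple of the median absolute deviation. This is only a sketch: the cutoff k is a hypothetical threshold chosen for illustration, not something specified in the question, and the frame is rebuilt from the question's data.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "name": ["aaaa", "bbbb", "cccc", "dddd", "eeee"],
    "count": [2000, 1900, 900, 500, 100],
})

# Absolute distance of each row from the column median (900 here).
median = df["count"].median()
df["dist"] = (df["count"] - median).abs()

# Median absolute deviation: the median of those distances.
mad = df["dist"].median()

# Assumed cutoff: keep rows no further than k MADs from the median.
k = 1.0
within = df[df["dist"] <= k * mad]

print(within["name"].tolist())  # ['cccc', 'dddd', 'eeee']
```

Raising k widens the band around the median; with k = 1.5 every row in this small example would pass, since the largest distance is 1100 and the cutoff becomes 1200.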
