Get the element-wise minimum and maximum of two index-aligned Series in pandas

Suppose I have 2 series in pandas:

    from datetime import datetime, timedelta
    import pandas as pd

    d = datetime.now()
    index = [d + timedelta(seconds=i) for i in range(5)]
    a = pd.Series([1, 4, 5, 7, 8], index=index)
    b = pd.Series([2, 3, 6, 7, 8], index=index)

What is the best way to get the min / max values for the corresponding index elements? For example:

    min_func(a, b): [1,3,5,7,8] (for given index)
    max_func(a, b): [2,4,6,7,8]

The only functions I could find in the documentation are the min / max methods, which return the min / max within a single series, and .apply does not accept an index argument. Is there a better way to do this without manual sequential iteration or arithmetic tricks (e.g. min_func: a * (a < b) + b * (b <= a), max_func: a * (a > b) + b * (b >= a))?

thanks

1 answer

Combine the two series into a DataFrame, which automatically aligns them on the index:

    In [51]: index
    Out[51]:
    [datetime.datetime(2013, 8, 26, 18, 33, 48, 990974),
     datetime.datetime(2013, 8, 26, 18, 33, 49, 990974),
     datetime.datetime(2013, 8, 26, 18, 33, 50, 990974),
     datetime.datetime(2013, 8, 26, 18, 33, 51, 990974),
     datetime.datetime(2013, 8, 26, 18, 33, 52, 990974)]

    In [52]: a = pd.Series([1,4,5,7,8], index = index)

    In [53]: b = pd.Series([2,3,6,7,8], index = index)

    In [54]: a
    Out[54]:
    2013-08-26 18:33:48.990974    1
    2013-08-26 18:33:49.990974    4
    2013-08-26 18:33:50.990974    5
    2013-08-26 18:33:51.990974    7
    2013-08-26 18:33:52.990974    8
    dtype: int64

    In [55]: b
    Out[55]:
    2013-08-26 18:33:48.990974    2
    2013-08-26 18:33:49.990974    3
    2013-08-26 18:33:50.990974    6
    2013-08-26 18:33:51.990974    7
    2013-08-26 18:33:52.990974    8
    dtype: int64

    In [56]: df = pd.DataFrame({ 'a' : a, 'b' : b })

    In [57]: df
    Out[57]:
                                a  b
    2013-08-26 18:33:48.990974  1  2
    2013-08-26 18:33:49.990974  4  3
    2013-08-26 18:33:50.990974  5  6
    2013-08-26 18:33:51.990974  7  7
    2013-08-26 18:33:52.990974  8  8

Min / max

    In [9]: df.max(1)
    Out[9]:
    2013-08-26 18:33:48.990974    2
    2013-08-26 18:33:49.990974    4
    2013-08-26 18:33:50.990974    6
    2013-08-26 18:33:51.990974    7
    2013-08-26 18:33:52.990974    8
    Freq: S, dtype: int64

    In [10]: df.min(1)
    Out[10]:
    2013-08-26 18:33:48.990974    1
    2013-08-26 18:33:49.990974    3
    2013-08-26 18:33:50.990974    5
    2013-08-26 18:33:51.990974    7
    2013-08-26 18:33:52.990974    8
    Freq: S, dtype: int64
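The row-wise min / max above can be wrapped into the min_func / max_func helpers the question asks for. This is only a minimal sketch: the helper names come from the question, the fixed start time is an assumption made so the example is reproducible, and the np.minimum / np.maximum lines are a NumPy alternative that is not part of the answer.

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd


def min_func(a: pd.Series, b: pd.Series) -> pd.Series:
    """Element-wise minimum of two Series, aligned on their index."""
    # concat with axis=1 puts the Series side by side as columns;
    # min(axis=1) then reduces across the columns of each row.
    return pd.concat([a, b], axis=1).min(axis=1)


def max_func(a: pd.Series, b: pd.Series) -> pd.Series:
    """Element-wise maximum of two Series, aligned on their index."""
    return pd.concat([a, b], axis=1).max(axis=1)


# Fixed start time (an assumption; the question uses datetime.now()).
d = datetime(2013, 8, 26, 18, 33, 48)
index = [d + timedelta(seconds=i) for i in range(5)]
a = pd.Series([1, 4, 5, 7, 8], index=index)
b = pd.Series([2, 3, 6, 7, 8], index=index)

print(min_func(a, b).tolist())  # [1, 3, 5, 7, 8]
print(max_func(a, b).tolist())  # [2, 4, 6, 7, 8]

# NumPy alternative (not from the answer): element-wise ufuncs applied
# directly to two Series that share the same index.
np_min = np.minimum(a, b)
np_max = np.maximum(a, b)
```

If the two indexes differ, concat keeps the union of the labels, and min / max skip the resulting missing values by default, so a label present in only one series yields that series' value.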

Column label of the min / max in each row

    In [11]: df.idxmax(1)
    Out[11]:
    2013-08-26 18:33:48.990974    b
    2013-08-26 18:33:49.990974    a
    2013-08-26 18:33:50.990974    b
    2013-08-26 18:33:51.990974    a
    2013-08-26 18:33:52.990974    a
    Freq: S, dtype: object

    In [12]: df.idxmin(1)
    Out[12]:
    2013-08-26 18:33:48.990974    a
    2013-08-26 18:33:49.990974    b
    2013-08-26 18:33:50.990974    a
    2013-08-26 18:33:51.990974    a
    2013-08-26 18:33:52.990974    a
    Freq: S, dtype: object
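In the output above, the last two rows hold equal values in both columns, and both idxmax and idxmin report a for them: on a tie, the first matching column label is returned. The sketch below reproduces this with a default integer index (an assumption made to keep it short) and adds a hypothetical tied mask for telling ties apart from real differences.

```python
import pandas as pd

# Same values as in the answer, with a default integer index for brevity.
df = pd.DataFrame({"a": [1, 4, 5, 7, 8], "b": [2, 3, 6, 7, 8]})

# Column label holding the largest / smallest value in each row.
largest = df.idxmax(axis=1)
smallest = df.idxmin(axis=1)

# On ties (rows 3 and 4) the first column, "a", is reported for both.
print(largest.tolist())   # ['b', 'a', 'b', 'a', 'a']
print(smallest.tolist())  # ['a', 'b', 'a', 'a', 'a']

# Hypothetical helper mask: True where the two columns are equal.
tied = df["a"] == df["b"]
print(tied.tolist())      # [False, False, False, True, True]
```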

Source: https://habr.com/ru/post/952550/

