I have a data frame that I need to split based on two conditions, and these conditions apply to 3 different columns, and then the use of these new splits is performed according to some calculations on other columns in pairs.
So I have df A
A = pd.DataFrame([[1, 5, 2, 3, 4, 1], [2, 4, 4, 5, 5, 1], [3, 3, 1, 2, 5, 4], [4, 2, 2, 3, 5, 5], [5, 1, 4, 2, 3, 1]], columns=['A', 'B', 'C', 'D', 'E', 'F'], index=[1, 2, 3, 4, 5])
I need to split the data frame based on two conditions and apply to the first three columns, for example, this
condition1_colA = A[A['A'] > 1] condition2_colA = A[A['A'] < 1] condition1_colB = A[A['B'] > 1] condition2_colB = A[A['B'] < 1] condition1_colC = A[A['C'] > 1] condition2_colC = A[A['C'] < 1]
Now I need to do 3 calculations on each of 6 new cut data frames, however these calculations are soared as follows:
- condition1_colA and condition2_colA with column ['D']
- condition1_colB and condition2_colB with column ['E']
- condition1_colC and condition2_colC with column ['F']
Here is an example for the first pair:
x = condition1_colA['D'].mean() y = condition1_colA['D'].std() z = condition1_colA['D'][condition1_colA['D'] > 0].count() x1 = condition2_colA['D'].mean() y1 = condition2_colA['D'].std() z1 = condition2_colA['D'][condition1_colA['D'] < 0].count()
Then we would do the same for the other two couples. And we will organize 18 calculations on a datframe. I know how to do this manually, but I hope for some help with this using a group or perhaps a pivot table.
thanks