Find the last occurrence across multiple columns in a data frame

Suppose I have a large data frame with a structure like the one below:

 home| away|  home_score| away_score
    A|    B|           1|          0
    B|    C|           1|          1
    C|    A|           1|          0

I want to find each team's most recent result, regardless of whether it played at home or away. For example, the last scores of teams A, B, and C are 0, 1, and 1, respectively. I then want to fill these values back into the original data frame:

 home| away|  home_score| away_score| last_score_home| last_score_away|
    A|    B|           1|          0|                |                |
    B|    C|           1|          1|               0|                |
    C|    A|           1|          0|               1|               1|
 ...

I tried groupby and shift, but I'm not sure how to combine the home and away results.

1 answer

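Here is one option: 1) rename the team columns so that every column name has the form side_field; 2) split the column names into a two-level column index; 3) stack the side level, group by team, and take the last row of each group: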

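# "home"/"away" -> "home_team"/"away_team", then split on "_" into two column levels
df.columns = df.columns.str.replace("^([^_]+)$", "\\1_team", regex=True).str.split("_", expand=True)
# move the home/away level into the rows and keep the last row for each team
df.stack(level=0).groupby("team").tail(1)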

#         score   team
#1  home      1      B
#2  away      0      A
#   home      1      C

Update:

To attach the result to the reshaped data frame, use join:

df.columns = df.columns.str.replace("^([^_]+)$", "\\1_team", regex=True).str.split("_", expand=True)
df1 = df.stack(level=0).groupby("team").tail(1)

# join the result back to the original transformed data frame
df2 = df.stack(level=0).join(df1.score, rsuffix="_last").unstack(level=1)
# flatten the two-level column index into names such as "score_home"
df2.columns = [x + "_" + y for x, y in df2.columns]
df2
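If the goal is the table from the question, where each row carries the score from a team's previous match, a groupby followed by shift on a long-format copy is one possible route. The sketch below is an alternative under stated assumptions, not part of the answer above: it assumes the rows are already in chronological order, and the names side_frame, long_df, match, side and last_score are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "home": ["A", "B", "C"],
    "away": ["B", "C", "A"],
    "home_score": [1, 1, 1],
    "away_score": [0, 1, 0],
})


def side_frame(side):
    """One row per match for the given side ("home" or "away")."""
    return pd.DataFrame({
        "match": df.index,                      # original row label
        "side": side,
        "team": df[side].to_numpy(),
        "score": df[f"{side}_score"].to_numpy(),
    })


# Long format: one row per team appearance, ordered by match.
long_df = pd.concat([side_frame("home"), side_frame("away")], ignore_index=True)
long_df = long_df.sort_values("match", kind="stable")

# Score of the same team in its previous appearance (NaN if there is none).
long_df["last_score"] = long_df.groupby("team")["score"].shift(1)

# Back to one row per match, with one column per side.
prev = long_df.pivot(index="match", columns="side", values="last_score")
result = df.assign(last_score_home=prev["home"], last_score_away=prev["away"])
print(result)
```

Because shift introduces missing values, the two new columns come out as floats, with NaN where a team has no earlier match.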



Source: https://habr.com/ru/post/1670941/

