Subtract specific rows

I have data that looks like this:

    Participant Round Total
              1   100     5
              1   101     8
              1   102    12
              1   200    42
              2   100    14
              2   101    71
             40   100    32
             40   101    27
             40   200    18

I want a table with the Total of the last Round (200) minus the Total of the first Round (100).

For example, for participant 1, this is 42 - 5 = 37.

The end result should look like this:

    Participant Total
              1    37
              2
             40   -14
4 answers

With base R

    aggregate(Total ~ Participant, df[df$Round %in% c(100, 200), ], diff)
    #   Participant Total
    # 1           1    37
    # 2           2
    # 3          40   -14

Or, similarly, using the subset argument

    aggregate(Total ~ Participant, df, subset = Round %in% c(100, 200), diff)

Or using data.table

    library(data.table)
    setDT(df)[Round %in% c(100, 200), diff(Total), by = Participant]
    #    Participant  V1
    # 1:           1  37
    # 2:          40 -14

Or using a binary join

    setkey(setDT(df), Round)
    df[.(c(100, 200)), diff(Total), by = Participant]
    #    Participant  V1
    # 1:           1  37
    # 2:          40 -14

Or using dplyr

    library(dplyr)
    df %>%
      group_by(Participant) %>%
      filter(Round %in% c(100, 200)) %>%
      summarise(Total = diff(Total))
    # Source: local data table [2 x 2]
    #
    #   Participant Total
    # 1           1    37
    # 2          40   -14
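The data.table and dplyr variants above drop participant 2, who has no round 200, while the requested result keeps that participant with an empty Total. A minimal base R sketch that keeps such participants as NA is shown below. It assumes the data from the question; the names `first`, `last`, `both` and `result` are only illustrative.

```r
# Data from the question (participant 2 has no round 200).
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
40 100 32
40 101 27
40 200 18")

# Totals for the first round (100) and the last round (200).
first <- df[df$Round == 100, c("Participant", "Total")]
last  <- df[df$Round == 200, c("Participant", "Total")]

# all = TRUE keeps participants that appear in only one of the two subsets;
# the suffixes rename the two Total columns to Total_100 and Total_200.
both <- merge(first, last, by = "Participant", all = TRUE,
              suffixes = c("_100", "_200"))

# A missing round becomes NA, so the difference is NA for that participant.
result <- data.frame(Participant = both$Participant,
                     Total = both$Total_200 - both$Total_100)
result
```

With this data, participants 1 and 40 get 37 and -14, and participant 2 gets NA instead of being dropped.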

Try the following:

    df <- read.table(header = TRUE, text = "
    Participant Round Total
    1 100 5
    1 101 8
    1 102 12
    1 200 42
    2 100 14
    2 101 71
    2 200 80
    40 100 32
    40 101 27
    40 200 18")

    library(data.table)
    setDT(df)[, .(Total = Total[Round == 200] - Total[Round == 100]), by = Participant]

You can try this:

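    library(dplyr)
    group_by(df, Participant) %>%
      # keep only the first and the last row of each participant
      filter(row_number() == 1 | row_number() == max(row_number())) %>%
      # last Total minus first Total within each participant
      mutate(df = diff(Total)) %>%
      select(Participant, df) %>%
      unique()
    # Source: local data frame [3 x 2]
    # Groups: Participant
    #
    #   Participant  df
    # 1           1  37
    # 2           2  57
    # 3          40 -14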
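Note that this approach picks rows by position, so it relies on the rows of each participant already being ordered by Round, and for participant 2 it reports 71 - 14 = 57 because round 101 is that participant's last row. The same idea can be sketched in base R with an explicit sort. This is only an illustration on the question's data, and the name `res` is hypothetical.

```r
# Data from the question.
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
40 100 32
40 101 27
40 200 18")

# Sort explicitly so that "first" and "last" refer to the lowest and highest Round.
df <- df[order(df$Participant, df$Round), ]

# For each participant: Total of the last row minus Total of the first row.
res <- sapply(split(df$Total, df$Participant),
              function(x) x[length(x)] - x[1])
res
```

`res` is a named vector with one element per participant (names "1", "2" and "40").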

Everyone loves a little sqldf, so if you are not required to use apply, try the following:

Firstly, some test data:

    df <- read.table(header = TRUE, text = "
    Participant Round Total
    1 100 5
    1 101 8
    1 102 12
    1 200 42
    2 100 14
    2 101 71
    2 200 80
    40 100 32
    40 101 27
    40 200 18")

Then use SQL to create two columns, one for round 100 and one for round 200, and subtract them:

    library(sqldf)

    rolled <- sqldf("
      SELECT tab_a.Participant AS Participant
            ,tab_b.Total_200 - tab_a.Total_100 AS Difference
      FROM (
            SELECT Participant
                  ,Total AS Total_100
            FROM df
            WHERE Round = 100
           ) tab_a
      INNER JOIN (
            SELECT Participant
                  ,Total AS Total_200
            FROM df
            WHERE Round = 200
           ) tab_b
      ON (tab_a.Participant = tab_b.Participant)
    ")

Source: https://habr.com/ru/post/987411/

