How to get frequencies, then add them as a variable in an array?

Say I have an array in this format:

    X Y Z
    A 1 0
    A 2 1
    B 1 1
    B 2 1
    B 1 0

I want to find the frequency of X and the frequency of Y given X, and then add them to the array as new columns:

    X Y Z F(X) F(Y|X)
    A 1 0    2      1
    A 2 1    2      1
    B 1 1    3      2
    B 2 1    3      1
    B 1 0    3      2
3 answers

Here is a data.table method:

    require(data.table)
    DT <- data.table(dat)
    DT[, nx := .N, by = X][, nxy := .N, by = list(X, Y)]

In the last step, two columns were created:

    DT
    #    X Y Z nx nxy
    # 1: A 1 0  2   1
    # 2: A 2 1  2   1
    # 3: B 1 1  3   2
    # 4: B 2 1  3   1
    # 5: B 1 0  3   2

And it could be written in two lines instead of one:

    DT[, nx := .N, by = X]
    DT[, nxy := .N, by = list(X, Y)]
    # Assuming your data frame is called df:
    df$Fx <- ave(as.numeric(as.factor(df$X)), df$X, FUN = length)

    df2 <- as.data.frame(with(df, table(X, Y)), responseName = "Fyx")
    df3 <- merge(df, df2)
    # please see @thelatemail clean `ave`-only calculation of 'Fyx'

    df3
    #   X Y Z Fx Fyx
    # 1 A 1 0  2   1
    # 2 A 2 1  2   1
    # 3 B 1 1  3   2
    # 4 B 1 0  3   2
    # 5 B 2 1  3   1

    # And a ddply alternative
    library(plyr)
    df2 <- ddply(.data = df, .variables = .(X), mutate, Fx = length(X))
    ddply(.data = df2, .variables = .(X, Y), mutate, Fxy = length(Y))
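Note that `merge` sorts its result by the key columns by default, which is why the last two rows of `df3` come back in a different order than in the input. If the original row order matters, one possible variation (a sketch only, not part of the answer above) is to look the counts up directly in the `table` objects. The sample data below is rebuilt from the question, and `stringsAsFactors = FALSE` is an assumption made so that `X` can be used for lookup by name.

```r
# Sample data from the question; character X is an assumption of this sketch
df <- data.frame(
  X = c("A", "A", "B", "B", "B"),
  Y = c(1, 2, 1, 2, 1),
  Z = c(0, 1, 1, 1, 0),
  stringsAsFactors = FALSE
)

# One-way counts of X, looked up by name for every row
df$Fx <- as.vector(table(df$X)[df$X])

# Two-way counts of (X, Y), looked up with a two-column character index
tab <- table(df$X, df$Y)
df$Fyx <- tab[cbind(df$X, as.character(df$Y))]

print(df)
```

Because nothing is merged, the rows stay exactly where they were in `df`.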

Using ave, and assuming your data is called dat:

    dat$Fx <- with(dat, ave(Y, list(X), FUN = length))
    dat$Fyx <- with(dat, ave(Y, list(X, Y), FUN = length))

Result:

      X Y Z Fx Fyx
    1 A 1 0  2   1
    2 A 2 1  2   1
    3 B 1 1  3   2
    4 B 2 1  3   1
    5 B 1 0  3   2

If the data has no numeric column for ave to work on, use the row indices instead:

    dat$Fx <- with(dat, ave(seq_len(nrow(dat)), list(X), FUN = length))
    dat$Fyx <- with(dat, ave(seq_len(nrow(dat)), list(X, Y), FUN = length))
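To try the row-index variant end to end, here is a minimal self-contained sketch. The sample data is rebuilt from the question; the helper name `idx` is introduced only for illustration. The values passed to `ave` are never used, since `FUN = length` looks only at how many rows fall into each group.

```r
# Sample data from the question (the name `dat` follows this answer)
dat <- data.frame(
  X = c("A", "A", "B", "B", "B"),
  Y = c(1, 2, 1, 2, 1),
  Z = c(0, 1, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Row indices act as a placeholder vector; only group sizes matter
idx <- seq_len(nrow(dat))

dat$Fx  <- ave(idx, dat$X, FUN = length)         # rows sharing the same X
dat$Fyx <- ave(idx, dat$X, dat$Y, FUN = length)  # rows sharing the same (X, Y)

print(dat)
```

`ave` returns a vector the same length as its input, aligned with the original rows, so the counts can be assigned straight back as columns without any merge step.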

Source: https://habr.com/ru/post/954025/

