Frequency table with ddply function

    ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4",
            "R3","R3","R3","R3","R2","R2","R2","R5","R6")
    event <- c("a","b","b","M","s","f","y","b","a",
               "a","a","a","s","c","c","b","m","a")
    df <- data.frame(ID, event)

1. How can I modify the code below to get the table shown after it?
2. How can I get the average frequency for each event? For example, the average frequency for a would be (1 + 3 + 1 + 1)/4 = 1.5.

    ddply(df, .(ID), summarise, N = sum(!is.na(ID)), frequency = length(event))

    ID  N  Number-event-level  levels   frequency
    R1  1  1                   a        a=1
    R2  5  2                   b,c      b=3,c=2
    R3  6  3                   M,a,s    M=1,a=3,s=2
    R4  4  4                   f,y,b,a  f=1,y=1,b=1,a=1
    R5  1  1                   m        m=1
    R6  1  1                   a        a=1
1 answer

Here is the answer to the first question:

    library(plyr)  # provides ddply() and .()

    ddply(df, .(ID), summarise,
          N = length(event),
          Number.event.level = length(unique(event)),
          levels = paste(sort(unique(event)), collapse = ","),
          # table(event) > 0 drops factor levels that do not occur for this ID
          frequency = paste(paste(sort(unique(event)),
                                  table(event)[table(event) > 0],
                                  sep = "="),
                            collapse = ","))
    #   ID N Number.event.level  levels        frequency
    # 1 R1 1                  1       a              a=1
    # 2 R2 5                  2     b,c          b=3,c=2
    # 3 R3 6                  3   a,M,s      a=3,M=1,s=2
    # 4 R4 4                  4 a,b,f,y  a=1,b=1,f=1,y=1
    # 5 R5 1                  1       m              m=1
    # 6 R6 1                  1       a              a=1
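For readers who would rather not depend on plyr, the same per-ID summary can be sketched in base R with split() and lapply(). This is only an illustration, not part of the original answer: the helper name summarise_events is hypothetical, and the order of mixed-case levels such as "M" and "a" depends on the locale.

```r
ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4",
        "R3","R3","R3","R3","R2","R2","R2","R5","R6")
event <- c("a","b","b","M","s","f","y","b","a",
           "a","a","a","s","c","c","b","m","a")
df <- data.frame(ID, event)

# Hypothetical helper: build the one-row summary for the events of a single ID.
summarise_events <- function(ev) {
  # as.character() ensures only the events that actually occur are counted
  counts <- table(as.character(ev))
  data.frame(
    N = length(ev),
    Number.event.level = length(counts),
    levels = paste(names(counts), collapse = ","),
    frequency = paste(names(counts), counts, sep = "=", collapse = ","),
    stringsAsFactors = FALSE
  )
}

# Split the events by ID, summarise each piece, and stack the rows.
pieces <- lapply(split(df$event, df$ID), summarise_events)
result <- cbind(ID = names(pieces), do.call(rbind, pieces))
rownames(result) <- NULL
result
```

Using table() on the character values sidesteps the need to filter out zero counts, which is what the table(event) > 0 step does in the ddply() version.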

For your second question, it looks like you want the mean frequency of each event, taken only over the IDs where that frequency is greater than 0. If so, you can do this:

    apply(table(df), 2, function(x) mean(x[x > 0]))
    #   a   b   c   f   m   M   s   y
    # 1.5 2.0 2.0 1.0 1.0 1.0 2.0 1.0
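The one-liner above packs several steps together, so here is a step-by-step sketch of the same calculation. It is an alternative formulation rather than the answer's own code; the names counts and avg and the column name n are chosen here for illustration.

```r
ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4",
        "R3","R3","R3","R3","R2","R2","R2","R5","R6")
event <- c("a","b","b","M","s","f","y","b","a",
           "a","a","a","s","c","c","b","m","a")
df <- data.frame(ID, event)

# One row per (ID, event) pair, with the number of occurrences in column n.
counts <- as.data.frame(table(df), responseName = "n")

# Keep only the pairs that actually occur, so zeros do not pull the mean down.
counts <- counts[counts$n > 0, ]

# Mean count per event over the IDs in which that event appears.
avg <- tapply(counts$n, counts$event, mean)
avg[["a"]]  # a occurs 1, 3, 1 and 1 times in R1, R3, R4 and R6: (1 + 3 + 1 + 1)/4
```

Dropping the zero rows before averaging is what x[x > 0] does inside the apply() call.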

Update

If you want to do this last part for each level of a third variable, and you still want to use ddply(), you can do the following:

    df1 <- rbind(df, df)
    df1$cat <- rep(c("a", "b"), each = nrow(df))
    ddply(df1, .(cat), function(y) apply(table(y), 2, function(x) mean(x[x > 0])))
    #   cat   a b c f m M s y
    # 1   a 1.5 2 2 1 1 1 2 1
    # 2   b 1.5 2 2 1 1 1 2 1

Source: https://habr.com/ru/post/1240201/

