I searched online for a long time and did not find an answer to this specific question (I think).
The best way to explain myself would be with some code that replicates my problem. I made some temporary data:
x <- runif(100,1,2)
y <- runif(100,2,3)
z <- c(rep(1,100))
temp <- cbind(x,y,z)
temp[1:25,3] = temp[1:25,3] +2
temp <- as.data.frame(temp)
And this is what temp looks like:
          x        y z
1  1.512620 2.552271 3
2  1.133614 2.455296 3
3  1.543242 2.490120 3
4  1.047618 2.069474 3
.         .        . .
.         .        . .
27 1.859012 2.687665 1
28 1.231450 2.196395 1
and it continues until the end of the data frame (100 rows).
What I want to do is apply a function to the data frame, but on subsets of the data. For example, I want to apply the mean function to columns x and y for the rows where z = 3, and apply the mean function to columns x and y for the rows where z = 1. That gives me 4 values: the mean of x for z = 1 and for z = 3, and the mean of y for z = 1 and for z = 3. In my actual data set, the number of rows for a given value of z varies greatly.
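To make the target concrete, here is a minimal sketch that writes out the four values by hand. The `set.seed` call and the names `target`, `x.z1`, `y.z1`, `x.z3`, and `y.z3` are assumptions added for illustration only; this shows the desired output and is not meant as a solution, since it clearly does not scale to many values of z.

```r
# Recreate the example data (set.seed is an assumption, added only for reproducibility)
set.seed(42)
temp <- data.frame(x = runif(100, 1, 2),
                   y = runif(100, 2, 3),
                   z = rep(c(3, 1), times = c(25, 75)))  # first 25 rows have z = 3

# The four target values, written out one subset at a time
target <- c(x.z1 = mean(temp$x[temp$z == 1]),
            y.z1 = mean(temp$y[temp$z == 1]),
            x.z3 = mean(temp$x[temp$z == 3]),
            y.z3 = mean(temp$y[temp$z == 3]))
target
```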
I do have a way of doing this; however, it is long and tedious, and it relies on a for loop.
x <- c(unique(temp$z))
^^ This finds the unique values of z (in this case z = 3 and z = 1).
for(i in x){
assign(paste("newdata",i,sep=""),subset(temp[which(temp$z==i),],select=c("x","y")))
}
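From this I get two data frames, newdata1 and newdata3, which are subsets of temp. newdata1 holds the x and y columns for the rows where z = 1, and newdata3 holds them for the rows where z = 3.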
library(gdata)
blah <-cbindX(newdata1,newdata3)
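cbindX lets me combine data frames with different numbers of rows (it pads the shorter ones with NA). However, I have to list the data frames by hand, unless I write another for loop for that. My actual data has many values of z, so this gets tedious. If z ranged from 1 to 50, I would have newdata1, newdata2, newdata3, and so on.
Then I compute the summary statistics: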
summ.test <- apply(blah,2,function(x) {
c(min(x,na.rm=TRUE),median(x,na.rm=TRUE),max(x,na.rm=TRUE),sum(!is.na(x)))})
             x         y         x         y
[1,]  1.028332  2.018162  1.012379  2.009595
[2,]  1.509049  2.504000  1.427981  2.455296
[3,]  1.992704  2.998483  1.978359  2.970695
[4,] 75.000000 75.000000 25.000000 25.000000
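The rows are the minimum, median, maximum, and number of non-missing values. The columns are, in order: x for z = 1, y for z = 1, x for z = 3, y for z = 3.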
So, my question: the for loop works, but it is tedious. Is there a simpler way to do this, ideally without the loop?
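One possible direction, sketched under assumptions rather than offered as the definitive approach, is to let base R's `split` create the per-z subsets and then summarise each piece with `lapply`/`sapply`, so that no `assign` call or hand-written list of data frames is needed. The `set.seed` call and the names `summ`, `pieces`, `summ.by.z`, and `wide` are hypothetical and introduced only for this illustration.

```r
# Recreate the example data (set.seed is an assumption, added only for reproducibility)
set.seed(42)
temp <- data.frame(x = runif(100, 1, 2),
                   y = runif(100, 2, 3),
                   z = rep(c(3, 1), times = c(25, 75)))

# Same four statistics as in the apply() call: min, median, max, non-missing count
summ <- function(v) {
  c(min = min(v, na.rm = TRUE),
    median = median(v, na.rm = TRUE),
    max = max(v, na.rm = TRUE),
    n = sum(!is.na(v)))
}

# Split the x and y columns into one data frame per value of z
pieces <- split(temp[c("x", "y")], temp$z)

# Summarise every column of every piece; each element is a 4 x 2 matrix
summ.by.z <- lapply(pieces, function(d) sapply(d, summ))

# Optional: bind the pieces side by side to mimic the wide layout
wide <- do.call(cbind, summ.by.z)
wide
```

Because `split` produces one piece per distinct value of z, the same code should work unchanged whether z takes 2 values or 50. Keeping the result as a named list (`summ.by.z[["1"]]`, `summ.by.z[["3"]]`) also avoids the duplicated x/y column names of the wide matrix.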
Thanks in advance for any help!