I am trying to simulate three small datasets that contain x1 , x2 , x3 , x4 , trt and IND .
However, when I try to split simulated data using IND using "split" in R, I get warning messages and exits. Can someone please give me a hint what I did wrong in my R code?
# Step 2: simulate data Alpha = 0.05 S = 3 # number of replicates x = 8 # number of covariates G = 3 # number of treatment groups N = 50 # number of subjects per dataset tot = S*N # total subjects for a simulation run # True parameters alpha = c(0.5, 0.8) # intercepts b1 = c(0.1,0.2,0.3,0.4) # for pi_1 of trt A b2 = c(0.15,0.25,0.35,0.45) # for pi_2 of trt B b = c(1.1,1.2,1.3,1.4); ############################################################################## # Scenario 1: all covariates are independent standard normally distributed # ############################################################################## set.seed(12) x1 = rnorm(n=tot, mean=0, sd=1);x2 = rnorm(n=tot, mean=0, sd=1); x3 = rnorm(n=tot, mean=0, sd=1);x4 = rnorm(n=tot, mean=0, sd=1); ############################################################################### p1 = exp(alpha[1]+b1[1]*x1+b1[2]*x2+b1[3]*x3+b1[4]*x4)/ (1+exp(alpha[1]+b1[1]*x1+b1[2]*x2+b1[3]*x3+b1[4]*x4) + exp(alpha[2]+b2[1]*x1+b2[2]*x2+b2[3]*x3+b2[4]*x4)) p2 = exp(alpha[2]+b2[1]*x1+b2[2]*x2+b2[3]*x3+b2[4]*x4)/ (1+exp(alpha[1]+b1[1]*x1+b1[2]*x2+b1[3]*x3+b1[4]*x4) + exp(alpha[2]+b2[1]*x1+b2[2]*x2+b2[3]*x3+b2[4]*x4)) p3 = 1/(1+exp(alpha[1]+b1[1]*x1+b1[2]*x2+b1[3]*x3+b1[4]*x4) + exp(alpha[2]+b2[1]*x1+b2[2]*x2+b2[3]*x3+b2[4]*x4)) # To assign subjects to one of treatment groups based on response probabilities tmp = function(x){sample(c("A","B","C"), 1, prob=x, replace=TRUE)} trt = apply(cbind(p1,p2,p3),1,tmp) IND=rep(1:S,each=N) #create an indicator for split simulated data sim=data.frame(x1,x2,x3,x4,trt, IND) Aset = subset(sim, trt=="A") Bset = subset(sim, trt=="B") Cset = subset(sim, trt=="C") Anew = split(Aset, f = IND) Bnew = split(Bset, f = IND) Cnew = split(Cset, f = IND)
Warning message:
> Anew = split(Aset, f = IND) Warning message: In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) : data length is not a multiple of split variable
and the output will be
$`2` x1 x2 x3 x4 trt IND 141 1.0894068 0.09765185 -0.46702047 0.4049424 A 3 145 -1.2953113 -1.94291045 0.09926239 -0.5338715 A 3 148 0.0274979 0.72971804 0.47194731 -0.1963896 A 3 $`3` [1] x1 x2 x3 x4 trt IND <0 rows> (or 0-length row.names)
I checked my R-code several times, but I canβt understand what I did wrong. Thank you very much in advance