Hey, I have a question on how to simulate data from
“A novel approach for biomarker selection and the integration of repeated measures experiments from two assays”.
I want to compare using Xw to X for PLSDA filter methods. I want to simulate data similar to the block correlation used for the subject random effect and error term in the paper. However I am using only two treatment groups.
This is my code:
library(MASS)
library(rlist)
library(Matrix)
mujk=list(c(rep(2,40),rep(0,160)),
c(rep(1,40),rep(0,160))) #means separating certain genes for both treatments
rho1=0.8 #correlation for subject random effect
rho2=0.7 #correlation for error
varR=3 #variance of subject random effect
varE=0.5 #variance of error
sigmarandom=list((diag(20) + (rho1varR)(1-diag(20)))-diag(20)+(diag(20)varR),(diag(20) + (rho1varR)*(1-diag(20)))-diag(20)+(diag(20)*varR)) #homogenous covariance matrix rho =0.8 for subject random effect block diagonal
sigmaerror=list((diag(20) + (rho2varE)(1-diag(20)))-diag(20)+(diag(20)varE),(diag(20) + (rho2varE)*(1-diag(20)))-diag(20)+(diag(20)*varE)) #homogenous covariance matrix rho =0.7 for error random effect block diagonal
Rsig=as.matrix(bdiag(rep(sigmarandom,5)))
Esig=as.matrix(bdiag(rep(sigmaerror,5)))
perswitch=c(1,-1) #gives labels to treatments
lmmper=list(2) #treatments
lmmS=list(12) #total subjects to simulate
for(s in 1:12){ #12 is number of subjects
###random effect is changing
Randeff=mvrnorm(n=1,mu=rep(0,200),Sigma=Rsig)# Random effect for subject same for each #observation on subject
for(p in 1:2){ #for each treatment
lmmper[[p]]=c(s,perswitch[p],mujk[[p]]+Randeff+mvrnorm(n=1,mu=rep(0,200),Sigma=Esig))#(sample ID, Y, X)
}
lmmS[[s]]=lmmper
}
lmmX=lapply(1:length(lmmS),function(i){list.rbind(lmmS[[i]])})
lmmX=data.frame(list.rbind(lmmX))
colnames(lmmX)[1:2]<-c(“sample”,“response”)
lmmY=data.frame(lmmX[,1:2])
X=lmmX[,-c(1,2)]
Y=lmmY$response
I am curious if I am following the logic correctly from the paper. I can simulate the data using a block diagonal right? If so does my approach look correct or am I misunderstanding how to conduct the simulation?I am comparing filter methods with PLSDA on this design and am not seeing any difference in feature selection. Makes me think I am doing something wrong.
I appreciate the help!