Hey, I have a question on how to simulate data from

“A novel approach for biomarker selection and the integration of repeated measures experiments from two assays”.

I want to compare using Xw to X for PLSDA filter methods. I want to simulate data similar to the block correlation used for the subject random effect and error term in the paper. However I am using only two treatment groups.

This is my code:

library(MASS)

library(rlist)

library(Matrix)

mujk=list(c(rep(2,40),rep(0,160)),

c(rep(1,40),rep(0,160))) #means separating certain genes for both treatments

rho1=0.8 #correlation for subject random effect

rho2=0.7 #correlation for error

varR=3 #variance of subject random effect

varE=0.5 #variance of error

sigmarandom=list((diag(20) + (rho1*varR)*(1-diag(20)))-diag(20)+(diag(20)*varR),(diag(20) + (rho1*varR)*(1-diag(20)))-diag(20)+(diag(20)*varR)) #homogenous covariance matrix rho =0.8 for subject random effect block diagonal

sigmaerror=list((diag(20) + (rho2*varE)*(1-diag(20)))-diag(20)+(diag(20)*varE),(diag(20) + (rho2*varE)*(1-diag(20)))-diag(20)+(diag(20)*varE)) #homogenous covariance matrix rho =0.7 for error random effect block diagonal

Rsig=as.matrix(bdiag(rep(sigmarandom,5)))

Esig=as.matrix(bdiag(rep(sigmaerror,5)))

perswitch=c(1,-1) #gives labels to treatments

lmmper=list(2) #treatments

lmmS=list(12) #total subjects to simulate

for(s in 1:12){ #12 is number of subjects

###random effect is changing

Randeff=mvrnorm(n=1,mu=rep(0,200),Sigma=Rsig)# Random effect for subject same for each #observation on subject

for(p in 1:2){ #for each treatment

lmmper[[p]]=c(s,perswitch[p],mujk[[p]]+Randeff+mvrnorm(n=1,mu=rep(0,200),Sigma=Esig))#(sample ID, Y, X)

}

lmmS[[s]]=lmmper

}

lmmX=lapply(1:length(lmmS),function(i){list.rbind(lmmS[[i]])})

lmmX=data.frame(list.rbind(lmmX))

colnames(lmmX)[1:2]<-c(“sample”,“response”)

lmmY=data.frame(lmmX[,1:2])

X=lmmX[,-c(1,2)]

Y=lmmY$response

I am curious if I am following the logic correctly from the paper. I can simulate the data using a block diagonal right? If so does my approach look correct or am I misunderstanding how to conduct the simulation?I am comparing filter methods with PLSDA on this design and am not seeing any difference in feature selection. Makes me think I am doing something wrong.

I appreciate the help!