withinVariation() on part of dataset

Kang · December 18, 2021, 7:22pm

Merry Christmas and Happy New Year!

I am trying to apply withinVariation() before fitting my data with sPLS-DA and DIABLO. However, only one-third of the subjects in my dataset have repeated measurements (pre- vs. post-drug treatment).

I compared PCA with vs. without “multilevel = subjects” on these repeated measurements, and the difference was quite obvious. Can I apply withinVariation() only to the subjects with repeated measurements and then combine the output with the rest of the dataset to fit sPLS-DA?

Thank you!

MaxBladen · February 3, 2022, 1:48am

Hi @Kang,

Yes, this is possible. The code below uses the SRBCT dataset as an example, pretending that the first 20 samples are repeated measurements.

Note that with the data used below, withinVariation() introduced negative values into the data, which contained none prior to the decomposition. I would therefore advise you to be careful doing this, as it may drastically change the distributions of the repeated samples while the non-repeated samples retain their original distributions.

This may have detrimental effects on your model. I would suggest running PCA on the original data (X below) and on the partially decomposed data (X.final below) and assessing the differences.
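
To make the point about negative values concrete, here is a minimal base-R sketch. It assumes the single-factor case, where the within-subject part of each sample is that sample minus its subject's mean. It does not call withinVariation() itself, and the names X.toy, subject, and subject.means are made up for illustration.

```r
# Toy data: 3 subjects x 2 samples each, 2 strictly positive features
set.seed(1)
X.toy <- matrix(runif(12, min = 1, max = 10), nrow = 6,
                dimnames = list(paste0("s", 1:6), c("geneA", "geneB")))
subject <- factor(rep(1:3, each = 2))

# Mean of each feature within each subject, expanded to one row per sample
subject.means <- apply(X.toy, 2, function(col) ave(col, subject))

# Within-subject part: each sample minus its subject's mean
X.within <- X.toy - subject.means

range(X.toy)     # every value is positive
range(X.within)  # centred on zero, so negative values appear

# For every feature, each subject's samples now sum to zero
rowsum(X.within, subject)
```

Rows treated this way sit around zero, while untreated rows stay on their original scale. That scale mismatch is the risk when only part of a dataset is decomposed.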

Hope this helped.

Cheers,
Max.

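library(mixOmics)
data(srbct)
X <- srbct$gene
Y <- srbct$class

# pretend that the first 20 samples of this dataset come from a repeated design:
# 10 subjects with 2 samples each
repeated.samples <- 1:20 # row indices of the "repeated samples"
subject <- rep(1:10, each = 2) # named 'subject' to avoid masking base::sample()

# set multilevel design (one column: the subject ID of each repeated sample)
design <- data.frame(sample = subject)

# subset X for only the repeated samples
X.r <- X[repeated.samples, ]

# decompose: keep only the within-subject variation
X.r.w <- withinVariation(X = X.r, design = design)

# combine decomposed repeated samples with the remaining, non-repeated samples
X.final <- rbind(X.r.w, X[-repeated.samples, ])

# rbind() puts the repeated samples first, so reorder Y the same way
# (this changes nothing here, because the repeated samples are already rows 1-20)
row.order <- c(repeated.samples, setdiff(seq_along(Y), repeated.samples))
Y.final <- Y[row.order]

# form sPLS-DA model
model <- splsda(X.final, Y.final, keepX = c(15, 15))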
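
The PCA comparison suggested in the post could look roughly like the sketch below. It rebuilds X.final with the same pretend design so that it runs on its own. The status factor is a hypothetical helper, added only to colour decomposed versus untouched samples in the plots. pca() and plotIndiv() are used with their basic arguments, and the choice of two components is arbitrary.

```r
library(mixOmics)

data(srbct)
X <- srbct$gene

# Same pretend design as in the post: rows 1-20 are 10 subjects x 2 samples
repeated.samples <- 1:20
design <- data.frame(sample = rep(1:10, each = 2))

# Decompose the repeated samples only, then stack them on the untouched rows
X.r.w <- withinVariation(X = X[repeated.samples, ], design = design)
X.final <- rbind(X.r.w, X[-repeated.samples, ])

# Hypothetical helper: flags which rows were decomposed (first 20 in both matrices)
status <- factor(rep(c("decomposed", "untouched"),
                     times = c(length(repeated.samples),
                               nrow(X) - length(repeated.samples))))

# PCA on the original and on the partially decomposed data
pca.orig <- pca(X, ncomp = 2)
pca.final <- pca(X.final, ncomp = 2)

# Compare where the two groups of samples fall in each projection
plotIndiv(pca.orig, group = status, legend = TRUE,
          title = "PCA: original X")
plotIndiv(pca.final, group = status, legend = TRUE,
          title = "PCA: partially decomposed X.final")
```

If the decomposed samples form a tight cluster away from the untouched ones in the second plot but not in the first, the leading components are probably picking up the processing difference rather than biology.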
