withinVariation() on part of dataset

MaxBladen · February 3, 2022, 1:48am

Yes, this is possible. The code below uses the SRBCT dataset as an example and pretends that the first 20 samples are repeated measurements (10 samples measured twice each).

Note that with the data used below, withinVariation() introduced negative values into the data, which had no negative values prior to the decomposition. Hence, I would advise you to be careful doing this, as it may drastically change the distributions of the repeated samples while the non-repeated samples retain their original distributions.
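To see why negative values can appear, here is a minimal base R sketch of within-subject centering on made-up numbers. It is only an illustration under the assumption that the decomposition for a single repeated-measures factor amounts to subtracting each sample's own mean from its measurements; X.toy, subject and X.within are hypothetical names, and this is not the mixOmics code itself.

```r
# Toy data (made up): 2 subjects, 2 measurements each, all values positive
X.toy <- matrix(c(5, 7, 2, 4,      # gene1
                  10, 12, 1, 3),   # gene2
                nrow = 4,
                dimnames = list(paste0("obs", 1:4), c("gene1", "gene2")))

# which subject each row belongs to
subject <- factor(c(1, 1, 2, 2))

# per-subject mean of every column, repeated for each row of that subject
subject.means <- apply(X.toy, 2, function(col) ave(col, subject))

# within-subject part: each measurement minus its subject's mean
X.within <- X.toy - subject.means

print(X.within)
# every subject now has mean zero in every column, so roughly half of
# the centered values end up negative even though X.toy was all positive
```

Any rows that are not centered this way keep their original, all-positive scale, which is where the mismatch between repeated and non-repeated samples comes from.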

Such a mismatch between the two groups of samples may have detrimental effects on your model. I would suggest running PCA on the original data (the equivalent of X below) and on the partially decomposed data (X.final below) and assessing the differences.

Hope this helped.

Cheers,
Max.

library(mixOmics)
data(srbct) 
X <- srbct$gene
Y <- srbct$class

# pretend that the first 20 samples from this dataset are of repeated design
# row indices of the "repeated samples"
repeated.samples <- 1:20
# ID of the sample each of those rows belongs to (10 samples, 2 measurements each)
sample.id <- rep(1:10, each = 2)
# set multilevel design (one row per repeated measurement)
design <- data.frame(sample = sample.id)

# subset X dataframe for only repeated samples
X.r <- X[repeated.samples, ] 

# decompose
X.r.w <- withinVariation(X = X.r, design = design) 

# combine decomposed repeated samples with the remaining, non-repeated samples
# (the rows stay aligned with Y only because the repeated samples are the first rows of X)
X.final <- rbind(X.r.w, X[-repeated.samples, ])

# form sPLS-DA model
model <- splsda(X.final, Y, keepX = c(15,15))
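The PCA comparison suggested above could look something like the sketch below. It rebuilds X.final so that it runs on its own, then uses pca() and plotIndiv() from mixOmics; the group factor, the plot titles and the choice of ncomp = 2 with centering and scaling are my own assumptions, not something prescribed by the package.

```r
library(mixOmics)
data(srbct)
X <- srbct$gene

# same pretend design as above: first 20 rows are 10 samples measured twice
repeated.samples <- 1:20
design <- data.frame(sample = rep(1:10, each = 2))

# decompose only the repeated rows and write them back in place
X.final <- X
X.final[repeated.samples, ] <- withinVariation(X = X[repeated.samples, ],
                                               design = design)

# compare value ranges before and after the partial decomposition
print(range(X[repeated.samples, ]))
print(range(X.final[repeated.samples, ]))

# PCA on the original and on the partially decomposed data
pca.orig  <- pca(X,       ncomp = 2, center = TRUE, scale = TRUE)
pca.final <- pca(X.final, ncomp = 2, center = TRUE, scale = TRUE)

# colour samples by whether they were decomposed or not
group <- factor(ifelse(seq_len(nrow(X)) %in% repeated.samples,
                       "repeated", "non-repeated"))

plotIndiv(pca.orig,  group = group, legend = TRUE,
          title = "PCA on original X")
plotIndiv(pca.final, group = group, legend = TRUE,
          title = "PCA on partially decomposed X.final")
```

If the repeated samples separate from the rest in the second plot but not in the first, that separation is likely an artefact of the partial decomposition rather than biology.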
