Multilevel sPLS1 to find treatment response biomarkers

Luke · May 15, 2025, 8:26am

Hi,
I have just started using mixOmics and I’m really enjoying exploring the package so far. I’m a PhD student working on a multi-omics study with longitudinal samples collected at 12-week intervals from 15 patients during a clinical trial.

At the moment I’m focusing on the proteomics data (250 proteins). The question I want to answer is: ‘Alterations in which proteins are associated with treatment response?’ After studying your website and forum, I have decided to use sPLS1 with a multilevel adjustment for the repeated samples from individual patients.

My samples
Patients = 15
Timepoints = 4

My sPLS variables
X = Proteomics data (counts of 250 proteins)
y = Total Improvement Score (a continuous compound clinical measurement used to assess response of patients to drug - 0 is minimum, 100 is maximum)
Multilevel = Patient ID

Optimal ncomp = 1
Optimal keepX = 250
Optimal keepY = 1

My question for the forum is: will looking at the proteins with the greatest loadings on sPLS component 1 tell me, e.g., ‘an increase/decrease in protein 1 is most important for an increase in y’? Or is there another way you’d suggest I answer this question?
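For readers new to the multilevel adjustment mentioned in this post: with a single repeated-measures factor, it amounts to removing each patient's own mean from their samples, so that only within-patient change over time remains. The following is a minimal Python sketch of that idea, not the mixOmics implementation; the function name `within_patient_center` and the toy data are hypothetical.

```python
from collections import defaultdict

def within_patient_center(rows, patient_ids):
    """Subtract each patient's per-feature mean from that patient's samples.

    rows        : list of samples, each a list of feature values
    patient_ids : list of patient labels, one per sample
    """
    sums = {}
    counts = defaultdict(int)
    for row, pid in zip(rows, patient_ids):
        if pid not in sums:
            sums[pid] = [0.0] * len(row)
        for j, value in enumerate(row):
            sums[pid][j] += value
        counts[pid] += 1
    # Per-patient, per-feature means
    means = {pid: [s / counts[pid] for s in total] for pid, total in sums.items()}
    # Within-patient deviations: what is left after removing the patient baseline
    return [[value - means[pid][j] for j, value in enumerate(row)]
            for row, pid in zip(rows, patient_ids)]

# Toy data (hypothetical): 2 patients x 2 timepoints x 2 proteins
X = [[10.0, 1.0], [14.0, 3.0], [100.0, 5.0], [104.0, 9.0]]
patients = ["P1", "P1", "P2", "P2"]
Xw = within_patient_center(X, patients)
```

In the toy data the two patients have very different baselines for the first protein (around 12 versus around 102), but after centering both show the same within-patient change. This is why loadings from a multilevel model are best read as describing change within a patient rather than differences between patients.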

evahamrud · May 16, 2025, 2:50am

Hi @Luke,

Yes, the proteins with the greatest loadings on the sPLS components are the ones that are most informative about the improvement score across your samples. You can also try tuning with a range of smaller test.keepX values (I see you settled on 250, but perhaps even fewer will work?). The variables that remain after tuning will also be the ones most strongly associated with the total improvement score outcome.

Cheers,
Eva

Luke · May 16, 2025, 9:32am

Hi @evahamrud,

Thanks very much for the advice.

I’ve tried tuning over a range of up to 100 proteins and found that the optimal keepX within this range was 14 proteins.
However, the MSEP for this model was 0.60, compared to 0.50 with keepX = 250 proteins.
Would you agree that this suggests I should stick with keepX = 250?

Thanks,
Luke

evahamrud · May 30, 2025, 5:55am

Hi @Luke,

I think the thing to consider here is what you would like your end goal to be.

If you would like a short list of proteins (say 10 or so) that you can dig deeper into, you can take the proteins that have the highest loadings in your sPLS1 model. You can visualise and extract these using the plotLoadings() function.
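As a rough illustration of what "highest loadings" means in practice: proteins are ranked by the absolute value of their loading, and the sign is kept so the direction can be inspected afterwards. The sketch below is plain Python, not mixOmics code; `top_loadings` and the example values are hypothetical, and the loadings are assumed to have already been exported from the fitted model as name/value pairs.

```python
def top_loadings(loadings, n=10):
    """Return the n (name, loading) pairs with the largest absolute loading."""
    ranked = sorted(loadings.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:n]

# Hypothetical component-1 loadings for four proteins
example = {
    "protein_A": 0.05,
    "protein_B": -0.62,
    "protein_C": 0.31,
    "protein_D": -0.10,
}

shortlist = top_loadings(example, n=2)
for name, value in shortlist:
    print(f"{name}: {value:+.2f}")
```

The sign of a loading only has meaning relative to the sign of the y loading on the same component, so it is worth confirming the direction of any shortlisted protein against the data before interpreting it as an increase or decrease.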

If instead you would like to build a model that can accurately predict improvement score from your proteomics data, I would recommend tuning over the full range of 1 to 250 proteins until you find the optimal number. You can run a broad grid first (like every 20), and then a more fine-grained grid around the region where you saw the lowest error.
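The broad-then-fine search described above can be sketched as follows. This is a generic Python illustration, not the mixOmics tuning function: `coarse_then_fine` is a hypothetical helper, and `cv_error` stands in for whatever cross-validated error (for example MSEP) you obtain for a given number of kept proteins. The toy error curve is invented purely so the example runs.

```python
def coarse_then_fine(cv_error, lo=1, hi=250, coarse_step=20):
    """Two-pass grid search for the keepX value with the lowest error.

    cv_error : callable taking a keepX value and returning an error (assumed,
               e.g. a cross-validated MSEP computed elsewhere)
    """
    # Pass 1: broad grid, e.g. 1, 21, 41, ..., plus the upper bound
    coarse = list(range(lo, hi + 1, coarse_step))
    if coarse[-1] != hi:
        coarse.append(hi)
    best_coarse = min(coarse, key=cv_error)

    # Pass 2: every value within one coarse step of the best coarse point
    fine_lo = max(lo, best_coarse - coarse_step)
    fine_hi = min(hi, best_coarse + coarse_step)
    best_fine = min(range(fine_lo, fine_hi + 1), key=cv_error)
    return best_coarse, best_fine

# Toy error curve (hypothetical) with its minimum at keepX = 37
def toy_error(k):
    return (k - 37) ** 2 + 5

best_coarse, best_fine = coarse_then_fine(toy_error)
print(best_coarse, best_fine)
```

With a real cross-validated error each evaluation is expensive and noisy, so in practice it helps to cache results and to use repeated cross-validation before trusting small differences between neighbouring keepX values.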

Cheers,
Eva
