sPLS tuning always selecting minimum in range of test values

bolomics · July 10, 2023, 8:39am

I have a dataset of around 3000 preprocessed features for 164 observations (Y) that I want to relate to a dataset of 17 exposures for the same 164 people (X). Following the PLS2 documentation, I am trying to run an sPLS model to answer the question “if I consider the features as response data, can I model the features given the predictor variables of the exposure data?”. However, whenever I supply a range of test values for X and Y, the smallest value in the range is always returned as the choice.keep, no matter what range I put in. Has anyone had similar issues? Is there anything else that I can try?

kimanh.lecao · July 13, 2023, 10:30pm

Hi @bolomics,

First, I would probably consider:
Y = exposure
X = your ‘pre-processed features’
The question being: which features, in combination, can explain specific (or all) exposures? (You can then try Y = a specific exposure type, or select them in combination with sPLS2.)

I’d start with PCA on the exposure data and look at the plotVar correlation circle plot to understand the correlations and associations between exposure types.
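As a rough sketch of that first step (not from the original reply): the `exposure` matrix below is simulated placeholder data with the dimensions described in the question, and the choice of 3 components and of scaling the variables are assumptions.

```r
library(mixOmics)

set.seed(1)
# Placeholder standing in for the real exposure data (164 people x 17 exposures)
exposure <- matrix(rnorm(164 * 17), nrow = 164,
                   dimnames = list(NULL, paste0("exp", 1:17)))

# PCA on the exposures; scaling is an assumption that suits variables
# measured in different units
pca.exposure <- pca(exposure, ncomp = 3, center = TRUE, scale = TRUE)

# Correlation circle plot: exposures that sit close together near the circle
# are positively correlated on the displayed components
plotVar(pca.exposure, comp = c(1, 2))
```

Groups of exposures that cluster on this plot can guide how many exposure types it is sensible to keep per component later on.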

Then:
The tuning of PLS in that context is very tricky, so I would adopt a more lenient approach and select, say, 50 features per component and about 5 exposure types (depending on what you find above). Again, I’d look at the plotVar plot to see if that makes sense. Then I’d use the perf function to evaluate how well this model is doing.
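A minimal sketch of this lenient approach, again on simulated placeholder data. The object names, the 2 components, the regression mode, and the cross-validation settings (5 folds, 5 repeats) are all assumptions for illustration, not recommendations from the reply.

```r
library(mixOmics)

set.seed(1)
n <- 164
# Placeholders for the real data: X = pre-processed features, Y = exposures
features <- matrix(rnorm(n * 3000), nrow = n,
                   dimnames = list(NULL, paste0("feat", 1:3000)))
exposure <- matrix(rnorm(n * 17), nrow = n,
                   dimnames = list(NULL, paste0("exp", 1:17)))

ncomp <- 2  # assumed number of components

# Fix keepX / keepY by hand instead of tuning them:
# 50 features and 5 exposures selected on each component
spls.fit <- spls(X = features, Y = exposure, ncomp = ncomp,
                 keepX = rep(50, ncomp), keepY = rep(5, ncomp),
                 mode = "regression")

# Correlation circle plot of the selected features and exposures
plotVar(spls.fit, comp = c(1, 2))

# Repeated M-fold cross-validation to assess the model
perf.fit <- perf(spls.fit, validation = "Mfold", folds = 5, nrepeat = 5)
plot(perf.fit, criterion = "Q2.total")
```

With random placeholder data the performance measures will of course be poor; the point is only the sequence of calls. On real data, the values in `keepX` and `keepY` can be adjusted after inspecting the plotVar output.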

Kim-Anh
