How to determine the number of variables to be used when 𝑛≪𝑝?

amnah · May 3, 2021, 6:48pm

We know that PLS regression (PLSr) is a good modelling method when 𝑛≪𝑝, but how do we determine the best number of variables to use? How do we know whether the number of variables we are using is appropriate, or whether it is affecting the model when it is very large, say 20k? Is there a reference on statistical power, or on how many of the p variables a feature selection method should retain, when deriving a subset of features from large datasets such as transcriptomics or metabolomics with thousands of variables (up to 20k)? I would appreciate it if somebody could point me to a reference that I can use properly and cite in my analysis.

aljabadi · May 10, 2021, 12:55am

Hi @amnah,

The number of variables to select can be tuned using the tune.spls function. It uses cross-validation to find the best set of features to keep in the model. Please refer to ?tune.spls for more details.
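To make the idea of tuning by cross-validation concrete, here is a minimal sketch in plain Python. It is not the mixOmics implementation and does not reproduce tune.spls. It assumes a single response, a single component, simulated data, and a simplified sparsity rule (keep the variables with the largest absolute covariance with the response). All function names (`fit_sparse_pls1`, `cv_mse`, `tune_keep`) and the candidate grid are hypothetical and chosen only for illustration.

```python
import random

def fit_sparse_pls1(X, y, keep):
    """One-component sparse PLS1 sketch: keep the `keep` variables
    with the largest |covariance| with y, zero out the rest."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Loading of each variable: X'y computed on centred data.
    w = [sum((X[i][j] - x_mean[j]) * (y[i] - y_mean) for i in range(n))
         for j in range(p)]
    # Sparsity step (simplified): retain only the `keep` largest loadings.
    kept = sorted(range(p), key=lambda j: -abs(w[j]))[:keep]
    # Latent scores t = Xw over the kept variables, then regress y on t.
    t = [sum((X[i][j] - x_mean[j]) * w[j] for j in kept) for i in range(n)]
    tt = sum(v * v for v in t)
    slope = sum(t[i] * (y[i] - y_mean) for i in range(n)) / tt if tt > 0 else 0.0
    return kept, w, x_mean, y_mean, slope

def predict(model, row):
    kept, w, x_mean, y_mean, slope = model
    return y_mean + slope * sum((row[j] - x_mean[j]) * w[j] for j in kept)

def cv_mse(X, y, keep, folds, rng):
    """Mean squared prediction error over M-fold cross-validation.
    Variable selection is redone inside every training fold."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    sq_err = 0.0
    for f in range(folds):
        test = idx[f::folds]
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        model = fit_sparse_pls1([X[i] for i in train], [y[i] for i in train], keep)
        sq_err += sum((y[i] - predict(model, X[i])) ** 2 for i in test)
    return sq_err / len(X)

def tune_keep(X, y, grid, folds=5, seed=1):
    # Re-seeding per candidate gives every candidate the same fold split.
    scores = {k: cv_mse(X, y, k, folds, random.Random(seed)) for k in grid}
    best = min(scores, key=scores.get)
    return best, scores

# Simulated n << p data (an assumption for illustration):
# 40 samples, 300 variables, only the first 5 carry signal.
rng = random.Random(0)
n, p, n_signal = 40, 300, 5
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(row[:n_signal]) + rng.gauss(0, 0.5) for row in X]

best_keep, scores = tune_keep(X, y, grid=[1, 5, 20, 100, 300])
print("CV error per candidate:", scores)
print("Selected number of variables:", best_keep)
```

The point of the sketch is that the number of variables to keep is treated as a tuning parameter: each candidate value is scored by its prediction error on held-out samples, and the selection step is repeated inside each training fold so the held-out samples never influence which variables are chosen.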

Hope it helps,

Al
