Find DE genes downstream of PLS-DA

tkapello · October 14, 2019, 8:03am

Hi all,

I have a human dataset and I am investigating the effect of disease on my favourite cell type. The
samples looked intertwined on a PCA, so I decided to run a PLS-DA to discriminate for the presence
(or absence) of the disease. My ultimate goal is to identify DE genes between the two conditions and
validate them experimentally. When I tune the PLS-DA, I end up with 8 features on Component 1 and 7
features on Component 2, which feels like very few to me. Is there a way to get the maximum number of
features that can still discriminate the two conditions sufficiently? Or am I missing the principle of
PLS-DA?

Thanks in advance,
Theo

kimanh.lecao · October 17, 2019, 2:40am

Hi Theo,

If your aim is primarily to identify DE genes, then PLS-DA is not appropriate: it evaluates a signature as a whole, whereas a classical univariate analysis tests each gene individually and independently.
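To make the contrast concrete, here is a minimal base-R sketch of what "univariate" means in practice: one test per gene, followed by a multiple-testing correction. The data are simulated and the names (`expr`, `condition`, `de_genes`) are made up for illustration. For real expression data a dedicated DE package would normally be a better choice than plain t-tests.

```r
# Illustrative only: simulated log-expression matrix (samples in rows, genes in columns).
set.seed(1)
n_per_group <- 10
n_genes <- 200
expr <- matrix(rnorm(2 * n_per_group * n_genes),
               nrow = 2 * n_per_group,
               dimnames = list(NULL, paste0("gene", seq_len(n_genes))))
condition <- factor(rep(c("control", "disease"), each = n_per_group))

# Shift the first 10 genes in the disease group so there is something to detect.
expr[condition == "disease", 1:10] <- expr[condition == "disease", 1:10] + 2

# One Welch t-test per gene: each gene is tested independently of all the others.
p_values <- apply(expr, 2, function(g) t.test(g ~ condition)$p.value)

# Benjamini-Hochberg adjustment to control the false discovery rate.
p_adjusted <- p.adjust(p_values, method = "BH")

# Genes called DE at an adjusted p-value threshold of 0.05.
de_genes <- names(p_adjusted)[p_adjusted < 0.05]
head(de_genes)
```

Each gene gets its own p-value here, which is why the resulting list can be much longer than an sPLS-DA signature: the multivariate model only keeps the variables that jointly discriminate the groups.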

But assuming you are still interested in PLS-DA, then yes, you can set keepX (the number of variables to select per component) as you wish. The tuning gives you an indication, but ultimately you can vary this parameter. Once you have fitted your final sPLS-DA model, it is worthwhile to run the perf() function, which uses cross-validation to estimate how well the model performs.
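As a rough sketch of that workflow, the snippet below fits an sPLS-DA with a hand-picked keepX and then runs perf(). It uses the `srbct` example data shipped with mixOmics rather than Theo's dataset, and the keepX values, fold count and repeat count are arbitrary assumptions chosen for illustration, not recommendations.

```r
library(mixOmics)

# Example data bundled with mixOmics (stand-in for your own X and Y).
data(srbct)
X <- srbct$gene   # samples x genes expression matrix
Y <- srbct$class  # factor of class labels

# Fit the final model with a manually chosen number of variables per component.
# keepX = c(50, 50) is an arbitrary choice for illustration.
final_model <- splsda(X, Y, ncomp = 2, keepX = c(50, 50))

# Repeated M-fold cross-validation to estimate classification performance.
perf_res <- perf(final_model, validation = "Mfold", folds = 5,
                 nrepeat = 10, progressBar = FALSE)

# Balanced error rate per component and prediction distance.
perf_res$error.rate$BER

# Names of the variables selected on component 1.
selectVar(final_model, comp = 1)$name
```

Refitting with a few different keepX values and comparing the error rates is one way to see how far the signature can be enlarged before performance starts to degrade.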

Kim-Anh
