Include components from sPLS-DA in regression?

I am working on human gut microbiome data and its association with a few outcomes. I started with PCA, kept the first 3 principal components (using this code: modeling <- data.frame(MyResult.spca$x[,1:3], df3)) and noted the variables selected in each component. I then ran regression analyses of the outcomes on the 3 principal components individually, and on all variables selected in each component, adjusting for potential confounders.
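For readers following along, here is a minimal sketch of that PCA-then-regression step. It swaps in base R's prcomp() for the sPCA object used above and runs on simulated data; abund, outcome and age are hypothetical names, not the real dataset.

```r
# Simulated stand-ins (assumptions): 40 samples, 8 taxa, one binary outcome, one confounder
set.seed(1)
n <- 40
abund <- matrix(rnorm(n * 8), n, 8,
                dimnames = list(NULL, paste0("taxon", 1:8)))
df3 <- data.frame(outcome = rbinom(n, 1, 0.5),
                  age     = rnorm(n, mean = 50, sd = 10))

# PCA on centred and scaled abundances; pca$x holds the sample scores
pca <- prcomp(abund, center = TRUE, scale. = TRUE)

# Same pattern as in the post: first 3 component scores next to the covariates
modeling <- data.frame(pca$x[, 1:3], df3)

# Logistic regression of the outcome on the 3 components, adjusted for age
fit_pca <- glm(outcome ~ PC1 + PC2 + PC3 + age,
               data = modeling, family = binomial)
summary(fit_pca)$coefficients
```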

However, because PCA is unsupervised and does not use the outcomes, the microbes selected at this stage were not really related to any of our outcomes.

Therefore, we went on to perform sPLS-DA, and selected 3 components and the variables present in each component. I repeated the regression analyses for all variables selected in the components, but I am not able to extract the component scores as I could from PCA. Is there a way to extract the components from an sPLS-DA object and run a regression on them?

Thank you

Hi @dr.vinay.muc,

You want to extract the components, also called variates, from an sPLS-DA object: they are stored in $variates (see ?splsda). You can use the selectVar() function to extract the selected variables (I am not sure that applies in your case).
Just be mindful of whatever Y you set up in sPLS-DA: if the outcomes in your downstream regression analyses are closely related to that Y, you will be overfitting your model.
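As a rough illustration of the above, the sketch below fits an sPLS-DA on simulated data, pulls the sample scores from $variates$X, lists the variables selected on component 1, and uses the scores in a downstream regression. The data, the keepX values and the names outcome and age are assumptions made for the example only.

```r
library(mixOmics)

# Simulated stand-ins (assumptions): 60 samples, 50 taxa, a two-class Y
set.seed(1)
n <- 60
p <- 50
X <- matrix(rnorm(n * p), n, p,
            dimnames = list(paste0("s", 1:n), paste0("taxon", 1:p)))
Y <- factor(rep(c("no_mild", "moderate_severe"), each = n / 2),
            levels = c("no_mild", "moderate_severe"))

# sPLS-DA with 3 components, keeping 10 variables per component (arbitrary choice)
fit <- splsda(X, Y, ncomp = 3, keepX = c(10, 10, 10))

# Component scores (variates) for the samples: an n x 3 matrix
comps <- fit$variates$X
colnames(comps) <- paste0("comp", 1:3)  # set names explicitly for the formula below

# Names of the variables selected on component 1
selected_comp1 <- selectVar(fit, comp = 1)$name

# Downstream regression on the components, adjusted for a hypothetical confounder.
# If 'outcome' is closely related to Y, this fit will be overoptimistic.
covars <- data.frame(outcome = rnorm(n), age = rnorm(n, mean = 50, sd = 10))
modeling <- data.frame(comps, covars)
fit_lm <- lm(outcome ~ comp1 + comp2 + comp3 + age, data = modeling)
summary(fit_lm)$coefficients
```

The same data.frame(comps, covars) pattern mirrors what was done with the PCA scores, so the rest of the regression code can stay unchanged.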


Thank you Kim-Anh, that really helps. Is there a way to change the order of the levels of the Y variable in the individual plots? I have coded them as a factor with levels 0 (no/mild) and 1 (moderate/severe), but the plot shows moderate/severe first.
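One thing that may be worth checking is the level order of the factor itself. The base R sketch below shows how labels sort alphabetically by default and how to set the order explicitly; the severity vector is made up for illustration, and whether the plot legend follows the level order is an assumption to verify on your own object.

```r
# Hypothetical labels for four samples
severity <- c("moderate/severe", "no/mild", "no/mild", "moderate/severe")

# Default: levels are sorted alphabetically, so "moderate/severe" comes first
Y_default <- factor(severity)
levels(Y_default)

# Explicit order: "no/mild" first, "moderate/severe" second
Y <- factor(severity, levels = c("no/mild", "moderate/severe"))
levels(Y)

# Starting from 0/1 codes instead, attach labels in the desired order
codes <- c(1, 0, 0, 1)
Y_from_codes <- factor(codes, levels = c(0, 1),
                       labels = c("no/mild", "moderate/severe"))
```

The reordered factor would then be the one passed as Y when refitting the model, before redrawing the plot.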