Principal Components in plsda()

Hi, I’m Alberto and this is my first topic in this forum.

Recently, I’ve discovered mixOmics because I needed to calculate a PLS-DA. But I think I’m doing something wrong.

I used the following command:
df3 <- mixOmics::plsda(df2, df1$disease, ncomp = 2, scale = TRUE)

But in the plot, the first PC has 2.83% while the second PC has 3.06%. This seems a little odd to me, because with PCA or PCoA the first component always has the highest value. For this reason, I ran a test with the first 100 components. Here is a screenshot with two data frames (only the first 10 rows are shown). The first lists the components in the order they appear in the analysis. The second is sorted by percentage value.

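[Screenshot: two data frames of explained variance per component, one in the original order and one sorted by percentage]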

Can someone explain why this happens, and how I can solve it?

Thanks!

Hi @Athalberht,

You are confusing PCA with PLS-DA. PCA aims to maximise the variance of the data, so its components explain as much variance as possible, in decreasing order. PLS-DA instead maximises the covariance between the components and the outcome, so the explained variance, while interpretable (i.e. the outcome can be explained by xx variance from the data), is not maximised here and need not decrease from one component to the next.
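To make the difference concrete, here is a minimal base R sketch on simulated data. It is not your data and not the mixOmics implementation: the helpers `pls_scores()` and `prop_var()` are hypothetical names written for this illustration, the outcome is a two-class 0/1 variable, the data are centred but not scaled, and the proportion of explained variance is computed with a simple formula that may differ from what mixOmics reports. The simulated data have one low-variance variable related to the class and two high-variance variables unrelated to it.

```r
set.seed(1)
n <- 2000
y <- rep(c(0, 1), each = n / 2)            # two-class outcome, dummy coded

# Simulated predictors (assumption for illustration only)
X <- cbind(
  signal = 2 * y + rnorm(n, sd = 0.5),     # low variance, related to the class
  noise1 = rnorm(n, sd = 5),               # high variance, unrelated to the class
  noise2 = rnorm(n, sd = 3)                # high variance, unrelated to the class
)
X  <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)

# Hypothetical helper: PLS scores for a single outcome, with deflation of X.
# Each weight vector points in the direction of maximal covariance with y.
pls_scores <- function(X, y, ncomp) {
  scores <- matrix(0, nrow(X), ncomp)
  for (h in seq_len(ncomp)) {
    w  <- crossprod(X, y)                  # proportional to cov(X, y)
    w  <- w / sqrt(sum(w^2))
    tt <- X %*% w                          # component (scores)
    pp <- crossprod(X, tt) / sum(tt^2)     # loadings
    X  <- X - tcrossprod(tt, pp)           # remove what this component explains
    scores[, h] <- tt
  }
  scores
}

# Hypothetical helper: share of the total variance of X captured by each component
prop_var <- function(X, scores) {
  apply(scores, 2, function(tt) sum(crossprod(X, tt)^2) / sum(tt^2)) / sum(X^2)
}

pca <- prcomp(X, center = FALSE)
pca_var <- prop_var(X, pca$x[, 1:2])
pls_var <- prop_var(X, pls_scores(X, yc, ncomp = 2))

print(round(100 * pca_var, 2))             # PCA: decreasing by construction
print(round(100 * pls_var, 2))             # PLS: ordered by covariance with y, not by variance
```

In this setup the first PLS component follows the class-related variable, which carries little of the total variance, so its percentage comes out lower than that of the second component, while the PCA percentages always decrease.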

You can have a look at the mixOmics handbook that gives more details on this.

Kim-Anh