I have a question regarding the presence of a single variable in more than one component when doing a sPLS-DA. My understanding was that components are orthogonal, and therefore a single variable could be selected on only one component. In my analysis, however, some variables are present in two components. Is that normal?

Hi @Fabien-Filaire,

It depends on how many variables are selected on each component, i.e. if keepX is a large number then it may result in some overlap.

It also depends on how difficult it is to classify your samples, maybe this variable is important on both components to achieve this.

Finally, when we say the components are orthogonal, it also means that the loading weights should differ between components (the extreme case being that gene A has a coefficient of 0 in component 1 and 1.5 in component 2, but of course this may vary).
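To see which variables are shared and how their loading weights compare, something along these lines might help. It is only a sketch: the srbct example data that ships with mixOmics and keepX = c(50, 50) are assumptions for illustration, and `fit` stands in for your own object.

```r
library(mixOmics)

# Illustrative data only: swap in your own X and Y
data(srbct)
X <- srbct$gene
Y <- srbct$class

# keepX values are arbitrary here, chosen just for the example
fit <- splsda(X, Y, ncomp = 2, keepX = c(50, 50))

# Names of the variables selected on each component
sel1 <- selectVar(fit, comp = 1)$name
sel2 <- selectVar(fit, comp = 2)$name

# Variables selected on both components (may be empty)
shared <- intersect(sel1, sel2)
print(shared)

# Loading weights of the shared variables, one column per component
shared_loadings <- fit$loadings$X[shared, , drop = FALSE]
print(shared_loadings)
```

If `shared` is short and the weights differ clearly between the two columns, the overlap is probably nothing to worry about.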

Since you mentioned earlier that your data seemed to be highly collinear, it could be difficult to separate the signal. Try calculating the correlation between the two components (from your object$variates$X) to see how (un)correlated those components are. It may tell you a bit more about what is going on in your data, and whether the PLS is coping.
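A minimal sketch of that check, again assuming the srbct example data and an arbitrary keepX purely for illustration (use your own fitted object in place of `fit`):

```r
library(mixOmics)

# Illustrative data and tuning values only
data(srbct)
fit <- splsda(srbct$gene, srbct$class, ncomp = 2, keepX = c(50, 50))

# fit$variates$X has one column of sample scores per component;
# cor() returns the component-by-component correlation matrix
comp_cor <- cor(fit$variates$X)
print(round(comp_cor, 3))
```

The off-diagonal entry is the one to look at: the closer it is to 0, the less correlated the two components are.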

As you can see above, there are a lot of maybes about what could happen. If it is only one variable I would not be too concerned. If more than that, then it is worth investigating further.