Keep in mind that these methods are exploratory, so we cannot really talk about significance (let alone statistical significance, since we are not testing any hypothesis).
In PLS2, Q2 is defined by comparing the Predicted Error Sum of Squares (PRESS, calculated on the test sets defined during the CV process) with the Residual Sum of Squares (RSS, calculated directly from the fitted data).
Each is summed over all the Y variables for a given component. You would like to see:
\sqrt(PRESS) < \sqrt(RSS), or, if you want to keep some margin, \sqrt(PRESS) < 0.95 * \sqrt(RSS).
After squaring both sides (PRESS < 0.95^2 * RSS) and rearranging the terms, you come up with
Q^2 = 1 - PRESS/RSS >= 1 - 0.95^2 = 0.0975
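As a toy numerical check (the PRESS and RSS values below are made up purely for illustration):

```r
# Hypothetical values for one component, summed over all Y variables
press <- 40
rss   <- 50

q2 <- 1 - press / rss   # 1 - 40/50 = 0.2
q2 >= 0.0975            # TRUE: this component would pass the usual threshold
```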
So if your Q2 is negative, it means that the model is not good at predicting / generalising (PRESS > RSS). It could be because your number of samples is too small for the CV process (even if you use leave-one-out, it may give you an insufficient estimation); or, as you say, because X does not explain Y.
If the Q2 is low but positive, you are still in the right 'bandwidth', since PRESS < RSS.
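In mixOmics, these Q2 values are reported by perf(); a sketch, assuming you already have your X and Y data matrices loaded (the exact output slot, $Q2.total below, may differ between package versions):

```r
library(mixOmics)

# Fit a PLS2 model with a few components
pls.res <- pls(X, Y, ncomp = 3)

# Cross-validation: "loo" = leave-one-out, or use validation = "Mfold"
perf.res <- perf(pls.res, validation = "loo")

# Q2 per component; keep components with Q2 >= 0.0975
perf.res$Q2.total
```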
I like to look at plotIndiv() to work out whether the sample scores from X and Y are similar (or you could extract the $variates from X and Y and plot one against the other for each component, which gives similar information).
Then, only if I see some common information that seems to be extracted, I look at plotVar() to figure out the correlation between specific subsets of variables.
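For example (a sketch, assuming pls.res is a fitted pls object as above; argument values are illustrative):

```r
# Sample scores projected in the X-space and in the Y-space
plotIndiv(pls.res, rep.space = "X-variate")
plotIndiv(pls.res, rep.space = "Y-variate")

# Or extract the latent variates yourself and plot them per component
plot(pls.res$variates$X[, 1], pls.res$variates$Y[, 1],
     xlab = "X-variate 1", ylab = "Y-variate 1")

# Correlation circle plot; cutoff hides weakly correlated variables
plotVar(pls.res, cutoff = 0.5)
```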
Considering a sparse model with sPLS could also help to filter out some variables. We are currently looking at a new criterion to tune sPLS, hopefully in the next mixOmics update.
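A minimal sPLS sketch (the keepX / keepY values here are arbitrary and would normally be tuned):

```r
# Keep only 10 X-variables and 5 Y-variables per component
spls.res <- spls(X, Y, ncomp = 2,
                 keepX = c(10, 10), keepY = c(5, 5))

# List the variables selected on component 1
selectVar(spls.res, comp = 1)
```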