Hi everyone,
I am working on a multi-omics analysis with the block.pls() method. There are 5 X-blocks with different numbers of features, and the Y-block has 4 continuous variables. The aim of the study is to understand the mechanism rather than to make predictions.
As a first step, I would like to choose the number of components by checking the variance explained by each component in each block. However, the variance explained in block Y summed to a value greater than 1 (about 7.3 over 20 components), which I could not understand:
> fit.block.pls <- block.pls(X, Y, ncomp = 20, design = "full", scale = TRUE, mode = "regression")
> sum(fit.block.pls$prop_expl_var$X1)
[1] 0.1021845
> sum(fit.block.pls$prop_expl_var$X2)
[1] 0.1782524
> sum(fit.block.pls$prop_expl_var$X3)
[1] 0.7210476
> sum(fit.block.pls$prop_expl_var$X4)
[1] 0.6524423
> sum(fit.block.pls$prop_expl_var$X5)
[1] 0.8613584
> sum(fit.block.pls$prop_expl_var$Y)
[1] 7.309105
I read a previous post about sPLS-DA where the variance explained in Y was 1 at the first component. But I think the explanation does not apply here because my Y block has 4 continuous variables instead of 1 binary variable.
The variance explained by each component in Y looks like this:
Given this, is there no clear way to select the number of components for Y? For the X-blocks it is much easier, because the plots show clear elbows at around component 8 (I could not paste a plot here because of the limit for new users).
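Since I cannot attach the plots, here is a rough sketch of how the per-component and cumulative proportions could be tabulated to look for an elbow. The helper expl_var_table() is hypothetical (it is not part of mixOmics), and the toy vector contains made-up numbers for illustration only, not values from my analysis. It assumes each element of fit.block.pls$prop_expl_var is a numeric vector with one value per component.

```r
# Hypothetical helper (not a mixOmics function): build a table of
# per-component and cumulative proportions of explained variance.
expl_var_table <- function(prop) {
  prop <- as.numeric(prop)
  data.frame(
    comp       = seq_along(prop),  # component index
    prop       = prop,             # proportion explained by each component
    cumulative = cumsum(prop)      # running total across components
  )
}

# Made-up values, for illustration only (not results from this analysis).
toy <- c(0.40, 0.25, 0.10, 0.05)
tab <- expl_var_table(toy)
print(tab)

# With the fitted model above, the same helper could be applied to every block:
# lapply(fit.block.pls$prop_expl_var, expl_var_table)
```

A cumulative column that passes 1, as it does for my Y block, would show up immediately in such a table.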
Many thanks in advance for your help!
Sincerely,
CR