Tune.block.spls?

Hi mixOmics team,

Thanks for a great package. I’m aiming to use it for neuroimaging analysis and have had success with the DIABLO function.

I was wondering if there’s a tune.block.spls function, workaround, or equivalent for tuning keepX. I can’t find anything specific, and the generic tune function doesn’t seem to cover it either. Apologies if I’ve missed something obvious.

Thank you,
Dave

hi @djaka,

Unfortunately, we have no tuning function (and very few or no performance measures) for that module so far. We are about to launch tune.spls() as a first step towards this, but it will be a while before we get to the multi-block case. In the meantime, you can take a more exploratory approach rather than relying on an ‘optimal’ keepX.

Kim-Anh

Thanks Kim-Anh.

I’ve put together a brief function that is conceptually the same as tune.spls (i.e. repeated M-fold cross-validation, using the lowest MSE to select the optimal values), but looped over all combinations of keepX across the blocks. I believe this is what tune.block.splsda does. Luckily I’m only interested in 1 component.
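A minimal sketch of the procedure described above (repeated M-fold CV over every keepX combination, picking the lowest MSE), written in Python for illustration. It is not the function mentioned in this post and not mixOmics code. The names `top_k_weights`, `fit_predict` and `tune_keep_x` are hypothetical, and the one-component model is a simplified stand-in (covariance weights with hard top-k selection, no scaling) rather than the sparse PLS algorithm itself.

```python
import itertools
import random
import statistics


def top_k_weights(block, y, k):
    """Covariance of each feature with y; keep the k largest in absolute value."""
    n = len(y)
    y_mean = sum(y) / n
    p = len(block[0])
    cov = []
    for j in range(p):
        col_mean = sum(row[j] for row in block) / n
        cov.append(sum((row[j] - col_mean) * (yi - y_mean)
                       for row, yi in zip(block, y)) / n)
    keep = sorted(range(p), key=lambda j: abs(cov[j]), reverse=True)[:k]
    return [cov[j] if j in keep else 0.0 for j in range(p)]


def fit_predict(train_blocks, y_train, test_blocks, keep_x):
    """Stand-in one-component model: sparse weights per block, then y ~ score."""
    weights = [top_k_weights(b, y_train, k) for b, k in zip(train_blocks, keep_x)]

    def scores(blocks):
        n = len(blocks[0])
        return [sum(w * x for bw, b in zip(weights, blocks)
                    for w, x in zip(bw, b[i])) for i in range(n)]

    t = scores(train_blocks)
    t_mean, y_mean = statistics.fmean(t), statistics.fmean(y_train)
    var = sum((ti - t_mean) ** 2 for ti in t)
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y_train)) / var
             if var > 0 else 0.0)
    return [y_mean + slope * (ti - t_mean) for ti in scores(test_blocks)]


def tune_keep_x(blocks, y, grids, folds=5, nrepeat=3, seed=0):
    """Repeated M-fold CV; returns the keepX combination with the lowest MSE."""
    rng = random.Random(seed)
    n = len(y)
    sq_err = {combo: [] for combo in itertools.product(*grids)}
    for _ in range(nrepeat):
        idx = list(range(n))
        rng.shuffle(idx)  # new random fold assignment for each repeat
        for f in range(folds):
            test = idx[f::folds]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            tr_blocks = [[b[i] for i in train] for b in blocks]
            te_blocks = [[b[i] for i in test] for b in blocks]
            y_tr = [y[i] for i in train]
            for combo in sq_err:  # every keepX combination sees the same folds
                pred = fit_predict(tr_blocks, y_tr, te_blocks, combo)
                sq_err[combo].extend((p - y[i]) ** 2 for p, i in zip(pred, test))
    mse = {combo: statistics.fmean(e) for combo, e in sq_err.items()}
    best = min(mse, key=mse.get)
    return best, mse
```

Each block is a list of rows (samples by features), and `grids` holds one list of candidate keepX values per block, for example `tune_keep_x([X1, X2], y, grids=[[1, 2, 4], [1, 3]])`. Feature selection is redone inside every training fold so the held-out samples never influence which variables are kept.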

Conceptually, can I check if there’s any reason this wouldn’t be valid?

Thanks again,
Dave

Dear @djaka,

Given that you are (I assume) only dealing with one y response, I think it should be fine, as this is a similar case to tune.block.splsda but with a continuous y variable. The reason we are still struggling with a tuning function is that we consider a Y with multiple response variables.

Good luck in your analyses,

Kim-Anh

Dear @kimanh.lecao ,

I would also like to tune parameters, in particular the number of components. I found an article in which cross-validation was applied to a sparse block PLS regression with a multi-response Y matrix (1). I wanted to know whether this method could be applied to the block.spls function of the package, in which case I would be happy to share my R source code. I’m sure you’ve heard of this article, so perhaps you already have the answer?
Sincerely

(1): Identifying multi-layer gene regulatory modules from multi-dimensional genomic data, Wenyuan Li, Shihua Zhang, Chun-Chi Liu and Xianghong Jasmine Zhou, Bioinformatics Vol. 28 no. 19, 2012, pages 2458–2466, doi:10.1093/bioinformatics/bts476
see section 2.5 page 2461

hi @gdrd,

Thanks for sharing that paper. I think at the time that paper was published we had not yet started developing block.spls. I read section 2.5 and I think that could work (we have not yet had time to think about tuning block.spls, only the 2-block spls).
One computational limitation I see is that you would need to repeat the K-fold CV several times. This is what we do for block.splsda(), for every combination of the keepX parameter.
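To give a rough sense of that cost, the number of model fits for a single component grows as the product of the grid sizes times the number of folds times the number of repeats. A small Python sketch follows. The helper `n_model_fits` and the grid sizes are hypothetical, chosen only for illustration.

```python
import math


def n_model_fits(grid_sizes, folds, nrepeat):
    """Model fits needed to evaluate every keepX combination for one component."""
    return math.prod(grid_sizes) * folds * nrepeat


# Three blocks with 5 candidate keepX values each, 10-fold CV repeated 10 times.
print(n_model_fits([5, 5, 5], folds=10, nrepeat=10))  # 12500
```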

If we wanted to implement such a tuning procedure (and maybe the time is ripe, as we are starting to have data that fit that framework), we would need to test it more extensively.

Would you be able to share your code as an issue on our GitHub (potentially with a dummy example from breast.TCGA) as a suggestion for improvement? We will then see how we could include it in the package after testing.

Thanks!

Kim-Anh

Hi @djaka ,

Would you be able to share this function with me? I’m also interested in tuning keepX for only one response variable in Y.

Thanks!