Is this function performing a repeated CV, and if so, what is the default number of repeats? (And if not, can we specify how many permutations we want?)
Unfortunately, the answer to both questions is no. It’s an extremely underdeveloped function and does not have the same capabilities as the more widely used tune.spls().
Is the function applying a regression or canonical mode?
It defaults to canonical mode.
I have at times been getting "Warning: The SGCCA algorithm did not converge". Is this an issue?
This means that the variates did not reach the stability threshold within the allowed number of iterations. I’d recommend adjusting the set.seed() value until this warning no longer appears, then sticking with that value.
Overall, a pipeline that will serve you better than the tune.splslevel.wrapper() function I wrote above is the following:
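library(mixOmics)
data(vac18)

# Split the gene matrix into two blocks to act as X and Y
X <- vac18$genes[, 1:500]
Y <- vac18$genes[, 501:1000]

# Repeated-measures design: one column identifying the subject of each sample
ML <- vac18$sample
design <- data.frame(sample = ML)

ncomp <- 5
# Candidate numbers of variables to keep on each component
test.keepX <- seq(10, 100, 10)
test.keepY <- seq(10, 100, 10)

# Extract the within-subject variation from each block
Xw <- withinVariation(X = X, design = design)
Yw <- withinVariation(X = Y, design = design)

# Tune on the within-subject matrices; replace each "..." with your own settings
tune.spls(Xw, Yw,
          ncomp = ncomp,
          test.keepX = test.keepX, test.keepY = test.keepY,
          validation = "...",
          mode = "...",
          folds = ...,
          nrepeat = ...,
          measure = "...",
          BPPARAM = BiocParallel::SnowParam(workers = "..."))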
Essentially, we get the same functionality as tune.splslevel() by calling the withinVariation() function manually. Then, by passing the result to tune.spls(), you have full control over the mode (you can set it to "regression"), the nrepeat, etc.
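To give a feel for what the withinVariation() step contributes, here is a minimal base-R sketch of the idea for a design with a single repeated-measures factor: each subject's own mean is subtracted from that subject's samples, so only the within-subject variation is left. This is an illustration under that assumption, not the mixOmics implementation; the helper within_center(), the toy matrix X_toy and the subject labels are all made up for the example.

```r
# Toy data (made-up values): 3 subjects, 2 samples each, 2 variables
X_toy <- matrix(c(1, 3, 10, 14, 5, 5,
                  2, 4, 20, 20, 7, 9),
                ncol = 2,
                dimnames = list(NULL, c("gene_a", "gene_b")))
subject <- factor(c("s1", "s1", "s2", "s2", "s3", "s3"))

# Hypothetical helper: for every column, subtract the subject-level mean
# from each of that subject's samples (ave() returns the group means
# aligned to the original row order)
within_center <- function(X, subject) {
  apply(X, 2, function(col) col - ave(col, subject))
}

Xw_toy <- within_center(X_toy, subject)

# After centering, every subject's samples sum to zero in each column
rowsum(Xw_toy, subject)
```

Because the between-subject differences are removed before tuning, tune.spls() only sees how each subject changes across its repeated measurements, which is the point of the multilevel approach.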