I have two questions about the number of variables to select in a DIABLO model (I’m using mixOmics version 6.20.0):
- One of my datasets may, in some cases, not provide relevant information, and in those cases no variable from it should be selected. My idea was to run tune.block.splsda() with 0 among the candidate keepX values for at least one block. However, when I try that, I get the error shown below. Is there a way to allow 0 for at least one of the blocks?
> data(breast.TCGA)
> X <- list(mRNA = breast.TCGA$data.train$mrna,
+           miRNA = breast.TCGA$data.train$mirna,
+           protein = breast.TCGA$data.train$protein)
> Y <- breast.TCGA$data.train$subtype
> mydesign <- matrix(1,
+                    ncol = length(X), nrow = length(X),
+                    dimnames = list(names(X), names(X)))
> set.seed(123)
> tune <- tune.block.splsda(
+   X = X,
+   Y = Y,
+   test.keepX = list(
+     mRNA = c(16, 17),
+     miRNA = c(0, 18),
+     protein = 5
+   ),
+   design = mydesign,
+   ncomp = 2, scale = FALSE, nrepeat = 3)
Design matrix has changed to include Y; each block will be linked to Y.
You have provided a sequence of keepX of length: 2 for block mRNA and 2 for block miRNA and 1 for block protein.
This results in 4 models being fitted for each component and each nrepeat, this may take some time to run, be patient!
You can look into the 'BPPARAM' argument to speed up computation time.
Error: BiocParallel errors
  1 remote errors, element index: 1
  2 unevaluated and other errors
  first remote error:
Error in if (diff.value < tol | iter > max.iter) break: missing value where TRUE/FALSE needed
- Separately, when tune.block.splsda() suggests selecting only 1 variable for a specific block and component, I’ve found that I cannot draw the circosPlot() for the first component. Here is example code that reproduces the error: the first version works well (no block is set to select only 1 variable on any component), whereas the second version gives the error (here, for the “protein” block, I select only 1 variable on the first component):
> data(breast.TCGA)
> X <- list(mRNA = breast.TCGA$data.train$mrna,
+           miRNA = breast.TCGA$data.train$mirna,
+           protein = breast.TCGA$data.train$protein)
> Y <- breast.TCGA$data.train$subtype
> mydesign <- matrix(1,
+                    ncol = length(X), nrow = length(X),
+                    dimnames = list(names(X), names(X)))
>
> lx_1 <- list(mRNA = c(16, 17), miRNA = c(18,5), protein = c(5, 5))
> res1 <- block.splsda(X, Y, keepX=lx_1)
> circosPlot(res1, cutoff=0.7, comp = 1)
>
> lx_2 <- list(mRNA = c(16, 17), miRNA = c(18,5), protein = c(1, 5))
> res2 <- block.splsda(X, Y, keepX=lx_2)
> circosPlot(res2, cutoff=0.7, comp = 1)
Error in do.call(cbind, X)[, colnames(simMat)] : subscript out of bounds
Thank you very much in advance!!!