tune.block.splsda() allowing 0 and 1

Hi!

I have 2 questions about the number of variables to select in a DIABLO model (I’m using mixOmics version 6.20.0):

  1. One of my datasets may, in some cases, not provide relevant information, so it should be possible for no variable from this dataset to be included. My idea was to run “tune.block.splsda()” with “0” as a candidate value in test.keepX for at least one block. However, when I try that, I get the error shown below. Is there a way to allow “0” for at least one of the blocks?
> data(breast.TCGA)
> X <- list(mRNA = breast.TCGA$data.train$mrna, 
+           miRNA = breast.TCGA$data.train$mirna, 
+           protein = breast.TCGA$data.train$protein)
> Y <- breast.TCGA$data.train$subtype
> mydesign <- matrix(1,
+                    ncol = length(X), nrow = length(X),
+                    dimnames = list(names(X), names(X)))
> set.seed(123)
> tune <- tune.block.splsda(
+     X = X, 
+     Y = Y, 
+     test.keepX = list(
+         mRNA = c(16, 17), 
+         miRNA = c(0, 18), 
+         protein = 5
+     ), 
+     design = mydesign, 
+     ncomp = 2, scale = FALSE, nrepeat = 3)
Design matrix has changed to include Y; each block will be
            linked to Y.

You have provided a sequence of keepX of length: 2 for block mRNA and 2 for block miRNA and 1 for block protein.
This results in 4 models being fitted for each component and each nrepeat, this may take some time to run, be patient!

You can look into the 'BPPARAM' argument to speed up computation time.
Error: BiocParallel errors
  1 remote errors, element index: 1
  2 unevaluated and other errors
  first remote error:
Error in if (diff.value < tol | iter > max.iter) break: missing value where TRUE/FALSE needed
  2. On the other hand, when the output of tune.block.splsda() suggests keeping only 1 variable for a specific block and component, I’ve realised that I cannot draw the circosPlot() on the first component. Here is example code that reproduces the error: the first version works well (no block or component is set to select only 1 variable), whereas the second version gives the error (here, for the block “protein”, I’m selecting only 1 variable on the first component):
> data(breast.TCGA)
> X <- list(mRNA = breast.TCGA$data.train$mrna, 
+           miRNA = breast.TCGA$data.train$mirna, 
+           protein = breast.TCGA$data.train$protein)
> Y <- breast.TCGA$data.train$subtype
> mydesign <- matrix(1,
+                    ncol = length(X), nrow = length(X),
+                    dimnames = list(names(X), names(X)))
> 
> lx_1 <- list(mRNA = c(16, 17), miRNA = c(18,5), protein = c(5, 5))
> res1 <- block.splsda(X, Y, keepX=lx_1)
> circosPlot(res1, cutoff=0.7, comp = 1)
> 
> lx_2 <- list(mRNA = c(16, 17), miRNA = c(18,5), protein = c(1, 5))
> res2 <- block.splsda(X, Y, keepX=lx_2)
> circosPlot(res2, cutoff=0.7, comp = 1)
Error in do.call(cbind, X)[, colnames(simMat)] : subscript out of bounds

Thank you very much in advance!!!
Mar

Regarding your first point: unfortunately, no, there is no way for the models to consider 0 features for one block. There is also not much point to it: if you want to try a model without that block’s features, just don’t include the block in your analysis.
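
As a rough sketch of what leaving the block out could look like, the code below fits one model with all three blocks and one without miRNA, then compares cross-validated error rates with perf(). The helper full_design() and the object names are made up for illustration, the keepX values are simply the ones from the working example in the question, and the perf() settings (5 folds, 3 repeats) are arbitrary assumptions rather than recommendations.

```r
library(mixOmics)

data(breast.TCGA)
X <- list(mRNA    = breast.TCGA$data.train$mrna,
          miRNA   = breast.TCGA$data.train$mirna,
          protein = breast.TCGA$data.train$protein)
Y <- breast.TCGA$data.train$subtype

# Hypothetical helper (not part of mixOmics): all-ones design matrix,
# built the same way as 'mydesign' in the question.
full_design <- function(blocks) {
  matrix(1, nrow = length(blocks), ncol = length(blocks),
         dimnames = list(names(blocks), names(blocks)))
}

# Model A: all three blocks, keepX taken from the question's working example.
keepX_all <- list(mRNA = c(16, 17), miRNA = c(18, 5), protein = c(5, 5))
res_all <- block.splsda(X, Y, keepX = keepX_all, ncomp = 2,
                        design = full_design(X))

# Model B: the "0 features from miRNA" case, expressed by dropping the block.
X_reduced <- X[setdiff(names(X), "miRNA")]
keepX_reduced <- keepX_all[names(X_reduced)]
res_reduced <- block.splsda(X_reduced, Y, keepX = keepX_reduced, ncomp = 2,
                            design = full_design(X_reduced))

# Cross-validated performance of both models (settings are arbitrary here).
set.seed(123)
perf_all <- perf(res_all, validation = "Mfold", folds = 5, nrepeat = 3)
perf_reduced <- perf(res_reduced, validation = "Mfold", folds = 5, nrepeat = 3)

# Compare the weighted-vote error rates per component and distance.
perf_all$WeightedVote.error.rate
perf_reduced$WeightedVote.error.rate
```

If the reduced model performs about as well as the full one, that would suggest the miRNA block adds little here. The same subsetting should also work as input to tune.block.splsda() if you want to tune keepX for the remaining blocks.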

To your second point, I have a strong suspicion as to what is causing this. I’ll try to get a fix to you when I can.
