tune.block.splsda error

Hi,

I am analysing three datasets together (metabolomics, proteomics and phosphoproteomics) and would like to tune the number of variables to keep in the block.splsda model. However, when I run tune.block.splsda I get this error:

You have provided a sequence of keepX of length: 10 for block metabolomics and 10 for block proteomics and 10 for block phosphoproteomics.
This results in 1000 models being fitted for each component and each nrepeat, this may take some time to run, be patient!
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: Please check the rownames of the data, there seems to be some discrepancies

I checked the rownames and they are the same across the datasets (namely 1-18). While searching for this error I saw that it, or a variation of it, comes up often (there was a post on this forum as well), but the solution is never really stated. Somewhere it was suggested to use mixOmics 6.11.4 (I currently have 6.10.9), but when I try to install mixOmics directly from GitHub, R misbehaves (this is not specific to mixOmics; it also happens when I try to install other packages from GitHub). So my question is: do you know how to solve this error?

This is what I was running:

test.keepX <- list(metabolomics = c(5:9, seq(10, 18, 2)),
                   proteomics = c(5:9, seq(10, 18, 2)),
                   phosphoproteomics = c(5:9, seq(10, 18, 2)))

design <- matrix(1, ncol = length(X), nrow = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0

tune.TCGA <- tune.block.splsda(X = X, Y = Y, ncomp = 5,
                               test.keepX = test.keepX, design = design,
                               validation = 'Mfold', folds = 3, nrepeat = 5,
                               cpus = 2, dist = "centroids.dist")
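
For context, the 1000 models mentioned in the message come from the full grid of keepX combinations across the three blocks. A small sketch of that arithmetic in base R (n_models is just an illustrative name, not a mixOmics object):

```r
# Same grid as above: 10 candidate keepX values per block
test.keepX <- list(metabolomics = c(5:9, seq(10, 18, 2)),
                   proteomics = c(5:9, seq(10, 18, 2)),
                   phosphoproteomics = c(5:9, seq(10, 18, 2)))

# Number of candidate values in each block
lengths(test.keepX)

# One model per combination of values: 10 * 10 * 10 = 1000
n_models <- prod(lengths(test.keepX))
n_models
```

Shortening each sequence shrinks the grid quickly; for example, 5 values per block would give 125 combinations per component and repeat.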

Kind regards,
Lonneke Nouwen

UPDATE:
I managed to install a newer version of mixOmics from GitHub (6.11.25), but the problem persists.

Hi @lonnekenouwen,

It might be because the rownames (sample names) are not consistent across the X blocks. You can check using the following code:

## should return 1
length(unique(lapply(X, rownames)))

Let us know if the result is 1. Otherwise, the rownames should be harmonised across blocks.
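
A stricter variant of this check, as a minimal sketch on made-up toy blocks: it also fails when a block has no rownames at all or when the samples are in a different order. The names X_toy, X_bad and same_rownames are hypothetical and only for illustration.

```r
# Toy blocks standing in for the real data (made-up values, 18 samples each)
set.seed(1)
sample_ids <- as.character(1:18)
X_toy <- list(
  metabolomics = matrix(rnorm(18 * 3), nrow = 18,
                        dimnames = list(sample_ids, paste0("m", 1:3))),
  proteomics   = matrix(rnorm(18 * 3), nrow = 18,
                        dimnames = list(sample_ids, paste0("p", 1:3)))
)

# TRUE only if every block has rownames and they match the first block
# exactly, including their order
same_rownames <- function(blocks) {
  rn <- lapply(blocks, rownames)
  has_names <- !vapply(rn, is.null, logical(1))
  all(has_names) && all(vapply(rn, identical, logical(1), rn[[1]]))
}

same_rownames(X_toy)

# Reversing the sample order in one block should make the check fail
X_bad <- X_toy
X_bad$proteomics <- X_bad$proteomics[rev(sample_ids), ]
same_rownames(X_bad)
```

On your own data you would call same_rownames(X) and expect TRUE.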

Hope that helps

Al

Hi Al,

Thank you for your reply!
The result of your code is 1.

Kind regards,
Lonneke

I am not sure whether it is related, but when I plot with the plotIndiv function, the legend does not match the group names (names are 0, 2, 4, 6, 8, 10, as indicated in the plot). Could that have something to do with the error about the rownames?

Hi @lonnekenouwen,

Thanks for your clarifications.

Could you please save the X and Y objects and forward them to us so that we can reproduce this issue?

save(X, Y, file = 'diablo_inputs.RData')

You can click on this text to send us an email.
Alternatively, you can right-click on the above text and choose ‘Copy Email Address’.

Best,

Al

I have sent it to you!
Thanks in advance.

Kind regards,
Lonneke

Hi @lonnekenouwen,

Thanks for emailing the data file. Generally, as mentioned in the documentation, it’s advised that you use matrices for mixOmics functions. The issue stemmed from the conversion of data.frames to matrices, which should now be fixed in the latest development version (https://github.com/mixOmicsTeam/mixOmics#development-version).
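
If you prefer to convert the blocks yourself before tuning, here is a minimal base R sketch on made-up data.frames. It only illustrates the "use matrices" advice and is not a description of the change made in the package. X_df and X_mat are hypothetical names; rownames.force = TRUE is used because as.matrix otherwise drops automatic row names (1, 2, ...) from a data.frame.

```r
# Toy data.frame blocks with automatic row names 1..18 (made-up values)
set.seed(1)
X_df <- list(
  metabolomics = data.frame(m1 = rnorm(18), m2 = rnorm(18)),
  proteomics   = data.frame(p1 = rnorm(18), p2 = rnorm(18))
)

# Convert every block to a numeric matrix, keeping the row names "1".."18"
X_mat <- lapply(X_df, as.matrix, rownames.force = TRUE)

# Each block should now be a matrix with character rownames
vapply(X_mat, is.matrix, logical(1))
lapply(X_mat, function(m) head(rownames(m), 3))
```

The converted list can then be passed as X in place of the list of data.frames.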

Please let us know if you have any other issues/questions.

Best,

Al

Thank you very much! I should read the manual more carefully then :slightly_smiling_face:
I have one other question, about the circos plots: in my plots the block coloring is sometimes drawn on top of the variable names. Is there a way to ensure the variable names are always drawn on top of the block coloring?

Kind regards,
Lonneke

Hi @lonnekenouwen,

I’m afraid that’s not something we want to change at the moment. You can use the color.blocks argument to choose colors with better contrast, if that’s something you’re interested in.

Hope that helps.

Best wishes,

Al