Hi, I have three datasets I wish to integrate. I have performed separate sPLS-DA for each dataset and I have two categories: healthy (n=9) and diseased (n=6). The ideal ncomp and keep.X according to tune.splsda were as follows:
dataset   ncomp   keep.X
A         1       9
B         3       600, 460, 110
C         1       10
On further analysis of B, only the first 30 variables selected on comp1 are significantly changed.
I would therefore like to integrate these variables with DIABLO.
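# keepX: number of variables to keep per block on each of the 2 components
# (block names are assumed to match names(X))
list.keepX <- list(colon = c(9, 1), plasma = c(30, 1), olink = c(10, 1))
MyResult.diablo <- block.splsda(X, Y, keepX = list.keepX, ncomp = 2)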
But when I visualise the result, e.g. with circosPlot, many of the variables shown in the plot are not the ones I wanted to select with colon = c(9,1), plasma = c(30,1), olink = c(10,1), and many interesting ones are missing. So it seems my code is not extracting the intended variables. Did I misunderstand something? How can I integrate only the variables of interest? Or would you argue against doing this at all, given the small number of samples? Alternatively, how can I identify the best number of variables to keep in each dataset for DIABLO? Is there something similar to tune.splsda that works when X is a list of three datasets?
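To make the question concrete, here is a minimal sketch of what I mean by integrating only the variables of interest. As far as I understand, keepX only sets how many variables are selected per block and component, not which ones, so the sketch subsets each block to a pre-chosen list of names before running a non-sparse block.plsda. The toy data, the make_block helper and the vars.of.interest list are hypothetical placeholders, not my real data; in practice the names might come from selectVar() on each separate sPLS-DA model.

```r
# Hypothetical toy data standing in for the three blocks (15 samples each).
set.seed(1)
n <- 15
make_block <- function(n_vars, prefix) {
  matrix(rnorm(n * n_vars), nrow = n,
         dimnames = list(paste0("s", seq_len(n)),
                         paste0(prefix, "_", seq_len(n_vars))))
}
X <- list(colon  = make_block(50, "colon"),
          plasma = make_block(700, "plasma"),
          olink  = make_block(40, "olink"))
Y <- factor(rep(c("healthy", "diseased"), times = c(9, 6)))

# Hypothetical variable names chosen in the separate analyses
# (here simply the first 9 / 30 / 10 columns of each block).
vars.of.interest <- list(colon  = colnames(X$colon)[1:9],
                         plasma = colnames(X$plasma)[1:30],
                         olink  = colnames(X$olink)[1:10])

# Keep only those columns in each block; block names are preserved.
X.sub <- Map(function(block, vars) block[, vars, drop = FALSE],
             X, vars.of.interest)
print(sapply(X.sub, ncol))

# With the blocks already restricted, a non-sparse model uses every
# remaining variable, so no keepX is needed. Run only if mixOmics is installed.
if (requireNamespace("mixOmics", quietly = TRUE)) {
  res <- mixOmics::block.plsda(X.sub, Y, ncomp = 2)
}
```

Whether pre-filtering like this is statistically sound with only 15 samples is exactly what I am unsure about, since the variables were already selected using the same outcome.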
The variables we identified using the separate analyses are highly significant and make sense, so I wish to identify relationships amongst them.
Thank you very much for your help.
I very much enjoy mixOmics; it makes PLS-DA very easy and produces beautiful figures.