How to link data in DIABLO

[posted via email from Ravi]
I have a theoretical question regarding the DIABLO analysis. It is my understanding that it is able to take multiple -omics platforms. Is there any “structure” by which the data need to be theoretically organized? For example, one could say that DNA goes to mRNA, which goes to protein. If you have mRNA and protein data, protein would be at a “higher” level than the mRNA data.

But suppose I have two omics datasets from the same level (e.g. the brain, [brain morphometry and brain connectivity]). Would DIABLO be appropriate for this?

Hi Ravi,
Our methods are purely data-driven, and there is no ‘direction’ you can give to the data. So far, our team has actually found no evidence in the data for a direction based on biological dogma. Potentially the data are too complex, the biology is too complex, and/or the data are too noisy!

So in your case I think DIABLO will be perfect, as we only try to maximise the correlation between the datasets, with no direction. You could also consider PLS in canonical mode if you want a first insight into whether the biological variation corresponds to the treatment of interest. For DIABLO you will see in our tutorial that you can also play a bit with the weights to decide on the ‘amount’ of correlation. Our paper discusses in detail the compromise between correlation and discrimination.
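
To make the "no direction" point concrete, here is a minimal sketch of the kind of weight matrix involved. It assumes that the weights mentioned above are the block-by-block design matrix, with values between 0 and 1, that mixOmics takes through the `design` argument of `block.splsda`. mixOmics itself is an R package, so this is only a plain-Python illustration of the matrix's shape. The helper `make_design` and the block names are hypothetical.

```python
def make_design(blocks, weight):
    """Build a symmetric block-by-block design matrix as nested dicts.

    Off-diagonal entries hold the same weight in both directions and the
    diagonal is 0, so no block is treated as 'upstream' of another.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return {a: {b: (0.0 if a == b else weight) for b in blocks} for a in blocks}


# Hypothetical block names for two datasets from the same 'level'.
blocks = ["morphometry", "connectivity"]

# Assumed convention: a small weight leans towards discrimination,
# a weight close to 1 leans towards correlation between the blocks.
design = make_design(blocks, 0.1)

# The matrix is symmetric: the link A -> B equals the link B -> A.
is_symmetric = all(design[a][b] == design[b][a] for a in blocks for b in blocks)
print(is_symmetric)
```

Since the matrix is symmetric by construction, there is nowhere to encode that one dataset sits "above" the other. The only choice is how strongly the blocks are linked.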

Good luck!
Kim-Anh
