I have 3 omics datasets (metabolomics, transcriptomics, and proteomics), all untargeted, on a matched set of bacteria (2 strains, WT and gene KO) that were grown with 2 different treatments (+/- drug) and harvested at 2 different timepoints (6 and 24h). What mixOmics methods should I be using? I want to:
- perform N-integration across all 3 omics
- see which proteins/genes correlate with certain known metabolites of interest
- explore other patterns in the data
I was looking into DIABLO but I was not sure since my categorical variables are treatment variables, rather than outcome variables.
Hi @Eunice,
Firstly, I recommend checking out the web page we have for helping you select which mixOmics method will work best for your data and biological question.
From what I’ve understood of your experimental design, DIABLO would actually work well for your data. DIABLO is good for identifying which features across your X datasets (metabolomics, transcriptomics, proteomics) discriminate between the classes of your Y variable. We often call the Y variable an ‘outcome variable’ (for example, in our DIABLO case study we look at whether our omics data can discriminate between different cancer subtypes), but it can also be your treatment variable. This is because the question ‘Which features across my omics best distinguish which treatment a sample had?’ is basically the same as asking ‘Which features are most strongly affected by my treatments?’.
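To make that concrete, here is a minimal sketch of what a DIABLO call could look like for a design like yours. Everything about the data is an assumption: the blocks are simulated random noise, the 3 replicates per condition, the feature counts, the `keepX` values and the 0.1 design weight are placeholders, so the output is only useful for checking that the code runs. In mixOmics, DIABLO is fitted with `block.splsda()`, which takes Y as a single factor.

```r
library(mixOmics)

set.seed(1)

# Hypothetical layout: 2 strains x 2 treatments x 2 timepoints x 3 replicates = 24 samples.
# The number of replicates is an assumption, not something stated in the question.
meta <- expand.grid(
  replicate = 1:3,
  strain    = c("WT", "KO"),
  treatment = c("drug", "no_drug"),
  time      = c("6h", "24h")
)
n <- nrow(meta)
sample_ids <- paste0("S", seq_len(n))

# Simulated stand-in for an omics block: samples in rows, features in columns.
sim_block <- function(p, prefix) {
  matrix(rnorm(n * p), nrow = n,
         dimnames = list(sample_ids, paste0(prefix, seq_len(p))))
}

# Named list of blocks measured on the same samples, in the same row order.
X <- list(
  metabolomics    = sim_block(50,  "met"),
  transcriptomics = sim_block(200, "gene"),
  proteomics      = sim_block(100, "prot")
)

# Y is one factor. Here it is the treatment; a combined label such as
# interaction(meta$strain, meta$treatment, meta$time) would be another option.
Y <- factor(meta$treatment)

# Between-block design matrix: 0.1 off the diagonal is a placeholder weight.
design <- matrix(0.1, nrow = length(X), ncol = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0

# Number of features to keep per block and per component (placeholder values;
# in a real analysis these would be tuned).
keepX <- list(
  metabolomics    = c(10, 10),
  transcriptomics = c(20, 20),
  proteomics      = c(15, 15)
)

diablo_fit <- block.splsda(X, Y, ncomp = 2, keepX = keepX, design = design)

# Proteins selected on component 1.
prot_selected <- selectVar(diablo_fit, block = "proteomics", comp = 1)$proteomics$name
```

With real data you would replace the simulated blocks with your own matrices and then look at the fit with functions such as `plotIndiv(diablo_fit)` or `circosPlot(diablo_fit, cutoff = 0.7)`, where the cutoff is again just an example value.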
If you would like to integrate your 3 omics datasets without including treatment or time information, you can also try multiblock (s)PLS.
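As a rough sketch of that unsupervised route, one possible setup (my assumption, not the only way to do it) is to put your known metabolites of interest in a numeric Y matrix and use the transcriptomics and proteomics as the X blocks in `block.spls()`. The data below are simulated random noise, and the sample count, feature counts and `keepX` values are placeholders; the `design` argument is left at its default.

```r
library(mixOmics)

set.seed(2)

# Hypothetical sample count (e.g. 8 conditions x 3 replicates).
n <- 24
sample_ids <- paste0("S", seq_len(n))

# Simulated stand-in for an omics block: samples in rows, features in columns.
sim_block <- function(p, prefix) {
  matrix(rnorm(n * p), nrow = n,
         dimnames = list(sample_ids, paste0(prefix, seq_len(p))))
}

# X blocks: the omics layers in which to look for correlated features.
X <- list(
  transcriptomics = sim_block(200, "gene"),
  proteomics      = sim_block(100, "prot")
)

# Y: hypothetical matrix of 5 metabolites of interest (no treatment or time labels).
Y <- sim_block(5, "met")

# Placeholder numbers of features to keep per block and per component.
keepX <- list(transcriptomics = c(20, 20), proteomics = c(15, 15))

pls_fit <- block.spls(X, Y, ncomp = 2, keepX = keepX)

# Genes selected on component 1.
gene_selected <- selectVar(pls_fit, block = "transcriptomics", comp = 1)$transcriptomics$name
```

The selected genes and proteins are then the ones that covary most with the metabolite block on each component, which you can inspect further with the usual mixOmics plotting functions.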
Hope that helps!
Eva