Dear mixOmics developers,
I am working on a project involving data from multiple omics layers across different tissues and treatments, and I would appreciate your guidance on the feasibility of integrating these datasets using the mixOmics package.
Here are the specifics of my data:
- Metabolomics: samples from four tissues (meat, adipose tissue, liver, and blood plasma).
- Transcriptomics: samples from two tissues (liver and muscle).
- Microbiome (metagenomics): samples from two tissues (fecal and ruminal).
- Genomics: SNP data.
All the datasets are derived from the same animals, but some tissues differ across omics (for instance, transcriptomics and microbiome data come from different tissues). Additionally, I have three treatment groups across all animals (n = 5 per treatment).
My key questions are:
- Multi-omics integration: Is it possible to integrate these omics datasets given that they originate from different tissues, but the animals are the same? How should I approach this in mixOmics, particularly in light of the tissue-specific variation?
- Phenotype inclusion: I would also like to incorporate some phenotypic variables (e.g., weight, growth rate, etc.) into the integration. Is this possible, and if so, how should I go about doing it in the context of multi-omics analysis?
- Genomics (SNP) data: Can I also include my SNP data as one of the omics layers for integration, and how should this be done?
Thank you in advance for your time and assistance. Any advice or examples on how to structure the analysis with these types of data would be greatly appreciated!
Best regards, Guilherme!
Hi @Gpolizel,
Apologies for the very late answer.
Since your samples are the same across all those datasets, I think you should divide the datasets by tissue, e.g. metab-meat, metab-adipose, etc.
Then you can decide what you want to integrate; e.g. metab-liver and transcript-liver, or all. Just go slowly first before you go to a full integration! By considering these tissue-blocks you will be able to assess which blocks include similar information.
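As a rough illustration of the tissue-block idea, here is a minimal sketch in plain Python (not mixOmics code, which is an R package). The block names, animal IDs and values are hypothetical toy data, and `align_blocks` is a made-up helper. It shows one thing to check before integrating any subset of blocks: every chosen block should contain the same animals, in the same row order.

```python
# Hypothetical toy data: one dict per tissue-block, keyed by animal ID.
blocks = {
    "metab_liver": {"A1": [0.2, 1.1], "A2": [0.4, 0.9], "A3": [0.1, 1.3]},
    "transcript_liver": {"A2": [5.0, 2.0, 7.1], "A1": [4.8, 2.2, 6.9], "A3": [5.3, 1.9, 7.4]},
    "metab_adipose": {"A1": [0.7], "A3": [0.6]},  # animal A2 is missing here
}


def align_blocks(blocks, names):
    """Keep only animals present in every chosen block, in one shared row order."""
    shared = set.intersection(*(set(blocks[name]) for name in names))
    ids = sorted(shared)
    aligned = {name: [blocks[name][animal] for animal in ids] for name in names}
    return ids, aligned


# Start small: integrate two liver blocks first.
ids, X = align_blocks(blocks, ["metab_liver", "transcript_liver"])

# Then try all blocks; animals missing from any block are dropped.
ids_all, X_all = align_blocks(blocks, list(blocks))
```

With only five animals per treatment, dropping animals this way is costly, so it is worth checking how many remain before adding each block.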
You can also include another dataset that contains the phenotypic variables. Again, go step by step. For example, I would first run a simple PLS or sPLS (with arbitrary but small keepX and keepY values) and use the plotVar() function to see whether there is any common information with the phenotype. You could then consider DIABLO to include the treatment information.
Finally, including the SNP data will be tricky (there are previous posts about this) because they include very little information. It would be better if you could obtain a polygenic score; you could then include this information as a single variable, also as a block. Maybe contact us when you have reached this stage.
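To make the polygenic score suggestion concrete, here is a minimal sketch in plain Python. It assumes the common definition of a polygenic score as a weighted sum of allele dosages (0, 1 or 2 copies of the effect allele). The dosages and the per-SNP weights below are hypothetical; in practice the weights would have to come from an external source such as an association study, which this thread does not specify.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages for one animal."""
    if len(dosages) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical dosages for three SNPs per animal.
snp_dosages = {"A1": [0, 1, 2], "A2": [2, 2, 0], "A3": [1, 0, 1]}
# Hypothetical per-SNP effect weights (assumption, not from the thread).
snp_weights = [0.5, -0.25, 1.0]

# One row per animal and a single column: a one-variable "block".
score_block = {
    animal: [polygenic_score(dosages, snp_weights)]
    for animal, dosages in snp_dosages.items()
}
```

The result has one row per animal and one column, so it can sit alongside the other blocks or be added as an extra column of the phenotype dataset.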
Another approach you could consider is a multilevel analysis, where you assume repeated measurements on the same animals. In that case you would only be able to consider metabolomics on its own, or transcriptomics on its own, and you would focus mostly on treatment discrimination (see Multilevel | mixOmics).
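The idea behind a multilevel analysis can be sketched as follows, assuming each animal contributes several rows to one dataset (for example, one row per tissue). Subtracting each animal's own mean profile leaves only the within-animal variation. This is a plain Python illustration with hypothetical toy values and a made-up helper name, not the mixOmics implementation.

```python
def within_variation(rows, animal_ids):
    """Subtract each animal's mean profile from that animal's rows."""
    sums, counts = {}, {}
    for row, animal in zip(rows, animal_ids):
        acc = sums.setdefault(animal, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[animal] = counts.get(animal, 0) + 1
    means = {a: [s / counts[a] for s in sums[a]] for a in sums}
    return [
        [value - mean for value, mean in zip(row, means[animal])]
        for row, animal in zip(rows, animal_ids)
    ]


# Hypothetical toy data: two animals, two repeated measurements each.
rows = [[1.0, 10.0], [3.0, 14.0], [2.0, 20.0], [6.0, 22.0]]
animal_ids = ["A1", "A1", "A2", "A2"]

Xw = within_variation(rows, animal_ids)
```

After this step, each animal's rows are centred on zero, so differences between animals no longer dominate the analysis.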
Kim-Anh
Hello @kimanh.lecao!! Thank you so much for your reply.
I want to focus on my treatments, and perhaps find biomarkers associated with them through plsda. Using DIABLO, is it possible to discriminate each omics tissue block, as you suggested, and assess the impact of my treatments?
I will start my analyses in December. When I reach the SNP integration, I will get in touch again!!
Hi @Gpolizel,
It depends on how you organise your data.
If you combine all tissues into a single dataset (say, metabolomics) and your Y encodes tissue * treatment, then you can discriminate all the different tissue-by-treatment groups.
If you consider each tissue as a block, then your Y would only include the treatment information, which is what you will discriminate.
(you can do both and explore)
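The two ways of organising the data can be sketched like this in plain Python (not mixOmics code). The animal IDs, treatment labels and values are hypothetical. The stacked layout also assumes the same features (here, the same two metabolites) were measured in every tissue.

```python
# Hypothetical treatment labels, one per animal.
treatment = {"A1": "ctrl", "A2": "trtA", "A3": "trtB"}

# Hypothetical metabolomics data for two tissues, same two metabolites.
metab = {
    "liver": {"A1": [0.2, 1.1], "A2": [0.4, 0.9], "A3": [0.1, 1.3]},
    "plasma": {"A1": [0.6, 2.0], "A2": [0.5, 2.4], "A3": [0.8, 1.7]},
}

# Layout 1: stack the tissues into one dataset; Y encodes tissue * treatment.
X_stacked, Y_stacked = [], []
for tissue, samples in metab.items():
    for animal in sorted(samples):
        X_stacked.append(samples[animal])
        Y_stacked.append(f"{tissue}_{treatment[animal]}")

# Layout 2: one block per tissue, rows aligned by animal; Y is treatment only.
ids = sorted(treatment)
X_blocks = {
    tissue: [samples[animal] for animal in ids]
    for tissue, samples in metab.items()
}
Y_blocks = [treatment[animal] for animal in ids]
```

In the first layout each animal appears once per tissue, so Y has as many levels as there are tissue-by-treatment combinations. In the second, each animal appears once per block and Y has one entry per animal.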
Kim-Anh
Perfect!! I will try!!
Thank you so much for all information provided!!
I will get in touch again soon to discuss the SNPs.
Thank you again!!