N-integration from different sample groups

Peptoabysmal · May 4, 2023, 7:52am

We have proteomic and metabolomic data from a transgenic mouse model, comparing wild-type to transgenics. I would like to perform N-integration, however the data was not collected with multi-omics in mind.

We have 5 WT and 5 +/- animals for proteomics, and 10 WT and 11 +/- animals for metabolomics, all different animals, all hippocampal tissue. Transcriptome data may be upcoming, again from different animals.

Ideally we would of course have liked a larger number of animals, with all samples taken from the same animals, so any conclusions are going to be pretty tentative. However, I am wondering whether the DIABLO model can produce anything useful if the samples are not individually matched (garbage in, garbage out)?

If it cannot, might it be better to do a simpler, more manual pathway analysis, for example with KEGG? Or to use some other method, e.g. 3Omics or PaintOmics?

One additional question from this bioinformatics newbie: can PCA be used to pull out relevant proteins/metabolites? I have done PCA with some other experimental groups, and if I see that, e.g., Principal Component 2 accounts for the split in genotype, would the top proteins/metabolites on that component then be indicative of the pathways involved? If so, could I weight them for pathway analysis, and how?

All advice gratefully received.

kimanh.lecao · May 25, 2023, 11:16pm

Hi @Peptoabysmal,

Unfortunately, as the name suggests, N-integration requires the same N samples to be measured in every data set. This is because we need to calculate the covariance between the different data sets, and that is only defined when the N (sample) dimension is common to all of them.
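
To make the covariance point concrete, here is a minimal Python sketch (standard library only). It is not mixOmics or DIABLO code: the function names and toy numbers are invented for illustration. It only shows that a cross-covariance between two blocks pairs row i of one block with row i of the other, so it cannot be formed when the blocks come from different animals.

```python
def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross_covariance(X, Y):
    """Cross-covariance (p x q) between two blocks given as lists of sample rows.

    Row i of X and row i of Y are assumed to describe the same sample.
    """
    if len(X) != len(Y):
        raise ValueError(
            f"blocks have {len(X)} and {len(Y)} samples; "
            "cross-covariance needs the same N matched samples"
        )
    n = len(X)
    x_cols = [center(list(col)) for col in zip(*X)]
    y_cols = [center(list(col)) for col in zip(*Y)]
    return [
        [sum(a * b for a, b in zip(xc, yc)) / (n - 1) for yc in y_cols]
        for xc in x_cols
    ]


# Matched toy blocks: 3 samples, 2 "proteins" and 1 "metabolite" (made-up numbers).
proteins = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
metabolites = [[1.0], [2.0], [3.0]]
matched = cross_covariance(proteins, metabolites)

# Unmatched blocks, mimicking 10 proteomics animals vs 21 metabolomics animals.
try:
    cross_covariance([[0.0]] * 10, [[0.0]] * 21)
    unmatched_error = None
except ValueError as err:
    unmatched_error = str(err)

print(matched)
print(unmatched_error)
```

Even if the two blocks happened to have the same number of rows, the result would be meaningless unless each row really came from the same animal in both blocks.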

As you suggested, you will have to analyse each omics separately, and then do some interpretative integration from the results. I am not really familiar with the other methods you mention.

You can use PCA or sparse PCA to identify the variables driving most of the variance in the data. It remains an exploratory approach of course, so there is no super clear criterion on how many variables you should look at. Remember to center and scale your data (in the PCA arguments), unless you are interested in pulling variables with the largest variance across your samples.
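
As a rough illustration of why scaling matters when ranking variables, here is a small Python sketch (standard library only). It is not the mixOmics implementation: the helper names and toy data are hypothetical, and it extracts only the first component by power iteration, so a later component such as PC2 would need an extra deflation step that is not shown.

```python
import math


def standardize(X, scale=True):
    """Return the columns of X centered, and optionally scaled to unit variance."""
    cols = []
    for col in zip(*X):
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        if scale:
            sd = math.sqrt(sum(v * v for v in centered) / (len(col) - 1))
            if sd > 0:
                centered = [v / sd for v in centered]
        cols.append(centered)
    return cols


def first_pc_loadings(cols, iterations=200):
    """Loadings of the first principal component via power iteration."""
    n, p = len(cols[0]), len(cols)
    cov = [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
           for ci in cols]
    v = [float(i + 1) for i in range(p)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def rank_variables(names, loadings):
    """Variable names ordered by decreasing absolute loading."""
    order = sorted(range(len(names)), key=lambda i: abs(loadings[i]), reverse=True)
    return [names[i] for i in order]


# Toy data (made up): A has a huge variance but is unrelated to B and C,
# while B and C are small in magnitude but strongly correlated.
names = ["A", "B", "C"]
data = [
    [100.0, 1.0, 1.1],
    [-100.0, 2.0, 1.9],
    [0.0, 3.0, 3.2],
    [0.0, 4.0, 3.8],
    [-100.0, 5.0, 5.1],
    [100.0, 6.0, 5.9],
]

raw_loadings = first_pc_loadings(standardize(data, scale=False))
scaled_loadings = first_pc_loadings(standardize(data, scale=True))
raw_rank = rank_variables(names, raw_loadings)
scaled_rank = rank_variables(names, scaled_loadings)

print("centered only:      ", raw_rank)
print("centered and scaled:", scaled_rank)
```

Without scaling, the high-variance variable A dominates the first component. With scaling, the correlated pair B and C drives it instead, and A drops to the bottom of the ranking.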

You could have a look at this (rather old) method called Pathifier to do the weighting directly. We had some good results with it in a pathway analysis of a breast cancer data set: "Personalised pathway analysis reveals association between DNA repair pathway dysregulation and chromosomal instability in sporadic breast cancer" (ScienceDirect).
Also look at the concept of ‘eigengenes’, which is based on the same PCA principles.

Kim-Anh

Peptoabysmal · May 30, 2023, 10:15am

Thank you so much Kim-Anh, that answers my question regarding N-integration. I will check out the Pathifier package and look at eigengenes.
Sparse PCA sounds useful; I may try it in my current R script using nsprcomp, or in mixOmics.
