Transform both the gene expression and metabolite datasets into pathway datasets

Hi all,
Thanks for your innovative work. mixOmics is really helpful.
I am having trouble with Case study 2: Allergic Asthma.
“A module based approach (also known as eigengene summarization) was used to transform both the gene expression and metabolite datasets into pathway datasets. Consequently, each variable in those two datasets now represented the scaled pathway activity expression level for each sample instead of direct gene/metabolite expression. The mRNA dataset was transformed into a dataset of metabolic pathways whereas the metabolite dataset was transformed into a metabolite pathway dataset.”
Does the method above mean:
1. Calculate module eigengenes (1st principal component) of modules in a given single dataset.
2. Use the selected variable names (component 1/2…) to do pathway enrichment for genes/metabolites.
My guess is based on the following code in the asthma_analysis.Rmd file:
#########################################################################
diablow = block.splsda(X = A, Y = time, ncomp = ncomp, keepX = keepX, design = design)
diablowPanel <- list(
  cells = c(selectVar(diablow, comp = 1)$cells$name,
            selectVar(diablow, comp = 2)$cells$name),
  gene.module = c(selectVar(diablow, comp = 1)$gene.module$name,
                  selectVar(diablow, comp = 2)$gene.module$name),
  metabolite.module = c(selectVar(diablow, comp = 1)$metabolite.module$name,
                        selectVar(diablow, comp = 2)$metabolite.module$name))
##############################################################################
3. Can I use the transformed pathway modules in the standard DIABLO model, without the multilevel step?
It works well in the pathway integration.
Looking forward to your help. Thank you in advance.
Yoyo

Hi Yoyo,

What you need to do is:
1 - Assign the genes / metabolites to their given pathways. If you have 15 gene pathways, you should end up with 15 gene data sets; the gene sets can overlap.
2 - Run a PCA on each data set individually and extract the first component (this can be done in mixOmics). You should now have 15 components.
3 - Create your ‘pathway summarised’ data set X_gene_pathway, which includes 15 columns named gene_pathway1 … 15.
4 - Do the same for your metabolites => X_metabolite_pathway.
5 - Input your X as a list, X <- list(gene_path = X_gene_pathway, metabolites_path = X_metabolite_pathway), then run DIABLO on it (as shown in your code above).
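
The steps above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the code used in the case study: the data are simulated, the pathway memberships are made up, the helper name eigengene() is hypothetical, and base R's prcomp() stands in for the PCA step (which can equally be done in mixOmics). Scaling each variable before the PCA is also an assumption.

```r
# Toy data (simulated, for illustration only): 20 samples x 30 genes.
set.seed(1)
n_samples <- 20
X_gene <- matrix(rnorm(n_samples * 30), nrow = n_samples,
                 dimnames = list(paste0("sample", 1:n_samples),
                                 paste0("gene", 1:30)))

# Step 1: assign genes to pathways (made-up memberships; overlap is allowed).
gene_pathways <- list(
  gene_pathway1 = paste0("gene", 1:12),
  gene_pathway2 = paste0("gene", 10:20),
  gene_pathway3 = paste0("gene", 18:30)
)

# Step 2: PCA on one pathway's variables, keeping the first component scores.
# `eigengene` is a hypothetical helper name, not a mixOmics function.
eigengene <- function(data, members) {
  members <- intersect(members, colnames(data))  # ignore unmeasured variables
  pca <- prcomp(data[, members, drop = FALSE], center = TRUE, scale. = TRUE)
  pca$x[, 1]
}

# Step 3: one column per pathway, samples kept in rows.
X_gene_pathway <- sapply(gene_pathways,
                         function(members) eigengene(X_gene, members))
rownames(X_gene_pathway) <- rownames(X_gene)

# Step 4: same procedure for the metabolites (again simulated).
X_metab <- matrix(rnorm(n_samples * 12), nrow = n_samples,
                  dimnames = list(rownames(X_gene), paste0("metab", 1:12)))
metab_pathways <- list(
  metabolite_pathway1 = paste0("metab", 1:6),
  metabolite_pathway2 = paste0("metab", 5:12)
)
X_metabolite_pathway <- sapply(metab_pathways,
                               function(members) eigengene(X_metab, members))
rownames(X_metabolite_pathway) <- rownames(X_metab)

# Step 5: list of pathway-level blocks, ready to be passed to DIABLO.
X <- list(gene_path = X_gene_pathway,
          metabolites_path = X_metabolite_pathway)

# With mixOmics loaded, the list is then used as in the question's code:
# diablow <- block.splsda(X = X, Y = time, ncomp = ncomp,
#                         keepX = keepX, design = design)
```

Two caveats: the sign of a principal component is arbitrary, so a pathway score may come out flipped between runs or software, and scale. = TRUE fails for variables with zero variance, so those would need to be filtered out first.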

And yes, you can remove the multilevel step (which would happen much higher in that code). Basically, just use your normalised matrices to calculate the eigengenes.

Let us know if this is not clear,

Kim-Anh
