Extracting Highly Correlated Genes from rCCA or sPLS Objects

Apologies for the basic question, but I think this is the best place to ask. I am interested in finding gene sets from the expression matrix that are correlated with microbiome taxa. I have two questions:

  1. Is Canonical Correlation Analysis (CCA) the best model for this?
  2. If I am interested in grouping the genes based on their correlation with different microbiome taxa, are the ‘loadings’ the best way to represent this?

Hi @AhmedOsman,

  1. Is Canonical Correlation Analysis (CCA) the best model for this?

No. rCCA does not perform variable selection, so try sPLS instead: it selects the variables that are correlated across the two data sets.

  2. If I am interested in grouping the genes based on their correlation with different microbiome taxa, are the ‘loadings’ the best way to represent this?

plotLoadings() will show you the most important genes and taxa selected on each component in each data set, but we prefer plotVar(), cim() and network() to visualise the correlations between the two data sets (see our website and vignette for examples).
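
As a rough sketch of that workflow (not a tuned analysis), the snippet below fits an sPLS model and then groups the selected genes by the Y variable they correlate with most strongly. Several things here are assumptions for illustration only: the nutrimouse example data shipped with mixOmics stands in for your data (its lipids play the role of the taxa), the keepX/keepY values are arbitrary, mode = "canonical" is chosen because the question is about correlation rather than prediction, and the final grouping step is plain Pearson correlation in base R rather than a mixOmics function.

```r
library(mixOmics)

# Stand-in data shipped with mixOmics: genes as X, lipids in place of taxa as Y.
data(nutrimouse)
X <- nutrimouse$gene
Y <- nutrimouse$lipid

# keepX / keepY are arbitrary here and should be tuned for real data.
fit <- spls(X, Y, ncomp = 2, keepX = c(25, 25), keepY = c(5, 5),
            mode = "canonical")

# Variables selected on component 1 in each data set.
genes <- selectVar(fit, comp = 1)$X$name
taxa  <- selectVar(fit, comp = 1)$Y$name

# Base-R follow-up (an assumption, not part of mixOmics): assign each
# selected gene to the selected Y variable with the largest |Pearson r|.
r      <- cor(X[, genes], Y[, taxa])
best   <- colnames(r)[apply(abs(r), 1, which.max)]
groups <- split(rownames(r), best)

# Correlation-oriented plots mentioned above.
plotVar(fit)
network(fit)
```

Here groups is a named list with one vector of gene names per Y variable. cim(fit) is called in the same way as the other two plots, and network() also takes a cutoff argument if you only want to display the stronger associations.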

Kim-Anh
