CCA analysis (cross validation vs shrinkage method)

Hi all,

I wish to N-integrate two datasets from the same human patients: host transcriptome data (single-cell data aggregated to cluster level) and microbiome data (16S rRNA), to understand how the host affects the microbiome and vice versa along a numerical variable of disease progression. I am using the CCA method and wanted to ask whether there are any guidelines on when to use the regularized as opposed to the shrinkage method. My ultimate goal is to validate these interactions experimentally with intervention studies. My cohort has only 13 samples, so it is on the small side. Apologies if this basic question has been answered elsewhere.

Hi @thkapell,

If your focus is on validating potential interactions experimentally, I would advise you to use sPLS rather than CCA, as it allows for variable selection. Have a look at our website for examples, and also search previous posts on how to pre-filter the data beforehand, to perhaps 5k variables per data set at most, since your number of samples is quite small.
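
As a rough illustration of the pre-filtering step only (not taken from the mixOmics documentation): the sketch below keeps at most 5,000 features per data set, ranked by variance across samples. Ranking by variance is an assumption here, since the reply does not name a filtering criterion. The function name `prefilter_top_variance` and the synthetic matrices are hypothetical, and the sketch is in plain Python rather than R.

```python
import random
from statistics import variance


def prefilter_top_variance(rows, max_features=5000):
    """Keep at most `max_features` columns with the highest sample variance.

    `rows` is a list of samples, each a list of feature values.
    Returns (filtered_rows, kept_column_indices).
    Variance ranking is an assumed criterion, not one stated in the thread.
    """
    n_features = len(rows[0])
    if n_features <= max_features:
        # Nothing to drop: already within the cap.
        return rows, list(range(n_features))

    # Sample variance of each column (feature) across samples.
    col_var = [variance([row[j] for row in rows]) for j in range(n_features)]

    # Rank columns by decreasing variance, keep the top ones,
    # then restore the original column order.
    ranked = sorted(range(n_features), key=lambda j: col_var[j], reverse=True)
    keep = sorted(ranked[:max_features])

    filtered = [[row[j] for j in keep] for row in rows]
    return filtered, keep


# Synthetic stand-ins for the two data sets (13 samples each).
random.seed(0)
host = [[random.gauss(0, 1) for _ in range(6000)] for _ in range(13)]
microbiome = [[random.gauss(0, 1) for _ in range(300)] for _ in range(13)]

host_f, host_idx = prefilter_top_variance(host)
micro_f, micro_idx = prefilter_top_variance(microbiome)
print(len(host_f), len(host_f[0]))    # samples, features kept (host)
print(len(micro_f), len(micro_f[0]))  # samples, features kept (microbiome)
```

The filtered matrices would then be passed to the integration method; the 16S table is already under the cap in this toy example, so it is returned unchanged.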

Kim-Anh