Data integration with small sample size


I am trying to integrate data from both RNAseq and MS. I have data across 15 samples, with pre- and post-treatment timepoints for each sample. However, 5 samples have just RNAseq, 6 have just MS, and only 4 have both RNAseq and MS data.

This may be a bit of a naive question, but is there any way to integrate these data together while still including the data from the patients that have either RNAseq or MS, not both?

If that’s not possible (or advisable) and I would have to limit my integration to just the 4 samples with both RNAseq and MS, I was thinking of using DIABLO to integrate them. Is that a reasonable choice?

Thank you for your help.


hi @bryanf,

Integrating only four samples is not going to be very useful for you.
Have a look at methods for ‘mosaic’ integration (where only some samples have every data type), although 15 samples is still very low.

You may have to do 2 separate analyses, one for RNA-seq and one for MS, identify some interesting variables in each, and then rely more on interpretation to link the two (with so few matched samples you won’t be able to do a correlation analysis between the two data types either). Have a look at the multilevel decomposition in mixOmics for the pre vs post comparison: see the ‘Multilevel’ page on the mixOmics website, and also any past posts related to multilevel.
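
To give a rough idea of what the multilevel decomposition does with paired pre/post data, here is a minimal sketch in Python/NumPy. It is not the mixOmics code (mixOmics is an R package); the function name `within_subject_variation`, the subject labels, and the toy numbers are all made up for illustration. The idea is to subtract each subject's own mean profile, so that what remains is the within-subject (pre vs post) variation rather than the differences between subjects.

```python
import numpy as np

def within_subject_variation(X, subject_ids):
    """Subtract each subject's mean profile from that subject's rows.

    Hypothetical helper for illustration; not part of mixOmics.
    X           : (n_observations, n_features) array
    subject_ids : length n_observations, one subject label per row
    """
    X = np.asarray(X, dtype=float)
    subject_ids = np.asarray(subject_ids)
    Xw = np.empty_like(X)
    for s in np.unique(subject_ids):
        rows = subject_ids == s
        # Centre this subject's repeated measures on the subject's own mean
        Xw[rows] = X[rows] - X[rows].mean(axis=0)
    return Xw

# Toy data (invented): 3 subjects x 2 timepoints, 2 features
X = np.array([
    [10.0, 1.0],   # S1 pre
    [12.0, 1.5],   # S1 post
    [20.0, 3.0],   # S2 pre
    [23.0, 3.5],   # S2 post
    [5.0,  0.0],   # S3 pre
    [6.0,  0.5],   # S3 post
])
subjects = np.array(["S1", "S1", "S2", "S2", "S3", "S3"])

Xw = within_subject_variation(X, subjects)
print(Xw)
```

After this step the large baseline differences between subjects are gone and each subject's pre and post rows are centred around zero, which is what lets a downstream analysis focus on the treatment effect. You would apply it separately to the RNA-seq and MS matrices, each with its own set of subjects.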