Integration with uneven number of samples following outlier removal

Hello,

I am hoping to use an N-integration technique to combine metabolomics and proteomics datasets, both of which initially contained the same number of samples taken across the same batches and timepoints (no replicates). However, a couple of these samples were removed as outliers, leaving an unequal number of samples across the datasets. For instance, one sample at a particular timepoint and batch is missing from the proteomics data, while the metabolomics datasets still have this sample at that timepoint and batch.

Is it still possible to continue with N-integration by Block (s)PLS or DIABLO using my remaining samples? Or would I need to remove the samples matching the outliers from my other datasets, so that every dataset contains the same samples?

Thank you,

Evelyn

Hi @evelynsq ,

Apologies for the delay in our answers; we are currently lacking a maintainer in the mixOmics team.

For DIABLO you will need the same matching samples in every dataset, unfortunately. However, I also note that you have several timepoints. If there are more than 3, you could use timeOmics to infer a missing sample at a given time point, using the smoothing spline approach. You can browse older posts on the topic.

Note that DIABLO treats all samples as independent of the time points, so if time is of interest to you, you will need to choose a different approach.
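
If you go the route of keeping only matching samples, the idea is simply to intersect the sample IDs across datasets and subset each one in the same order. Below is a minimal sketch of that logic in plain Python rather than mixOmics code; the function name `keep_shared_samples`, the sample IDs and the values are all hypothetical and only there for illustration.

```python
def keep_shared_samples(blocks):
    """Restrict every block to the sample IDs present in all blocks."""
    # Intersect the sample IDs across all blocks.
    shared = set.intersection(*(set(rows) for rows in blocks.values()))
    # Sort so that every block lists its samples in the same order.
    order = sorted(shared)
    return {name: {s: rows[s] for s in order} for name, rows in blocks.items()}


# Hypothetical toy data: sample IDs encode batch and timepoint.
blocks = {
    "metabolomics": {"B1_T1": [0.2, 1.1], "B1_T2": [0.4, 0.9], "B2_T1": [0.3, 1.0]},
    "proteomics": {"B1_T1": [5.0], "B2_T1": [4.7]},  # B1_T2 removed as an outlier
}

matched = keep_shared_samples(blocks)
print(list(matched["metabolomics"]))  # ['B1_T1', 'B2_T1']
print(list(matched["proteomics"]))    # ['B1_T1', 'B2_T1']
```

The same intersect-then-subset step applies to the sample (row) names of your data matrices before they are passed to the integration method.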

Kim-Anh