Best model for repeated measures?

Hi,

I have a question about model selection for a clinical dataset and would greatly appreciate feedback. For context: I have a group of n=64 subjects with urine metabolomics and urine proteomics measured at 8 unevenly spaced timepoints. On top of this, patients had brain MRS measurements at one timepoint, and I am trying to find a robust way to extract maximum information from this noisy, p >>> n, multimodal, multi-timepoint setting.

  1. Given repeated measures on the same patients, is the recommended workhorse for analysis of this format meant to be multilevel sPLS-DA? And this is in most cases equivalent to sPLS-DA on within-patient variation?
  2. Can DIABLO be used for N- and t-integration (time)? In one online lecture you mention integrating up to 15 diverse modalities (methylation, proteomics, metabolomics, transcriptomics), could some of these be for example metabolomics_time1, metabolomics_time2, etc? Are there existing mixOmics tools for time integration at discrete timepoints, as opposed to timeOmics for time modeling?
  3. I understand MINT is meant for different populations on the same variables. Just to confirm, this runs sPLS under the hood with mitigation of batch effects? Can this be applied on dependent measures of the same patients or is multilevel sPLS-DA / DIABLO the preferred option?
  4. How does the use case vary compared to N-way PLS DA/Khatri Rao structure? Does NPLS obscure signal if molecules behave differently in time (each metabolite gets one weight across all timepoints and each timepoint gets one weight across all metabolites)?

Apologies in advance if I’ve misunderstood something or if the questions above are too broad. I understand timeOmics would be an option for splines/temporal modeling, which may be an interesting but different question. Many thanks to Prof. Lê Cao and your team for making tools to push omics forward, and for making lectures available on YouTube. Best wishes

Hi @tomqu,

Thank you for your detailed questions. I appreciate you doing quite a bit of background reading on the topic!

  1. Given repeated measures on the same patients, is the recommended workhorse for analysis of this format meant to be multilevel sPLS-DA? And this is in most cases equivalent to sPLS-DA on within-patient variation?

    Given the unevenly spaced nature of your timepoints I would say yes. Just make sure the multilevel decomposition makes a notable difference, otherwise there is no point doing it (i.e. there are scenarios where the time effect is stronger than the within-patient variation).
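
As a rough illustration of what "within-patient variation" means here, the sketch below splits each measurement into its subject mean (between part) and the deviation from that mean (within part), then reports how much of each variable's variation sits between subjects. This is a plain-Python toy under stated assumptions, not the mixOmics implementation (in R, mixOmics handles this step itself, as far as I know through `withinVariation()` and the `multilevel` argument). The function names and the toy data are made up for the example.

```python
from collections import defaultdict


def split_within_between(rows, subjects):
    """Split each row into its subject mean (between) and the deviation from it (within)."""
    n_vars = len(rows[0])
    sums = defaultdict(lambda: [0.0] * n_vars)
    counts = defaultdict(int)
    for row, subj in zip(rows, subjects):
        counts[subj] += 1
        for j, value in enumerate(row):
            sums[subj][j] += value
    means = {s: [total / counts[s] for total in sums[s]] for s in sums}
    between = [list(means[s]) for s in subjects]
    within = [[v - m for v, m in zip(row, means[s])] for row, s in zip(rows, subjects)]
    return within, between


def between_subject_share(rows, subjects):
    """Per variable, the fraction of the total sum of squares carried by the subject means."""
    within, between = split_within_between(rows, subjects)
    n = len(rows)
    shares = []
    for j in range(len(rows[0])):
        grand = sum(r[j] for r in rows) / n
        ss_between = sum((b[j] - grand) ** 2 for b in between)
        ss_within = sum(w[j] ** 2 for w in within)
        total = ss_between + ss_within
        shares.append(ss_between / total if total > 0 else 0.0)
    return shares


# Hypothetical toy data: 2 subjects x 3 time points, 2 variables.
# Variable 0 differs mostly between subjects; variable 1 changes mostly over time.
rows = [[10.0, 1.0], [10.5, 2.0], [11.0, 3.0],
        [20.0, 1.1], [20.5, 2.1], [21.0, 3.1]]
subjects = ["s1", "s1", "s1", "s2", "s2", "s2"]

within, between = split_within_between(rows, subjects)
shares = between_subject_share(rows, subjects)
print(shares)
```

A variable with a share close to 1 is dominated by subject-to-subject differences, which is where removing the subject means changes the picture most; when shares are close to 0 across the board, the decomposition is unlikely to make a notable difference.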

  2. Can DIABLO be used for N- and t-integration (time)? In one online lecture you mention integrating up to 15 diverse modalities (methylation, proteomics, metabolomics, transcriptomics), could some of these be for example metabolomics_time1, metabolomics_time2, etc? Are there existing mixOmics tools for time integration at discrete timepoints, as opposed to timeOmics for time modeling?

    Currently in mixOmics we can only do one or the other. DIABLO considers each sample independently. Your proposal (one block per time point) works only if you are interested in covariation between the different blocks regardless of time. We have a new set of tensor methods that would do both N- and t-integration. We are still working on it, but we will present it at our next advanced workshop in March 2026, if that is of interest to you (Workshops – mixOmics). It seems that most of your questions relate to the topics we want to cover!
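
To make the "one block per time point" layout concrete, here is a minimal data-arrangement sketch in plain Python. It assumes long-format records of the form (subject, timepoint, feature values), which is a hypothetical layout, and it builds one samples-by-features block per time point with rows in the same subject order. The function name and block names are made up, and the sketch only arranges data; it does not run DIABLO.

```python
def blocks_by_timepoint(records, prefix="metabolomics"):
    """Build one block per time point, with rows in the same subject order in every block.

    records: iterable of (subject_id, timepoint, feature_values) tuples (assumed layout).
    """
    by_time = {}
    for subject, timepoint, values in records:
        by_time.setdefault(timepoint, {})[subject] = list(values)

    # Keep only subjects measured at every time point so that rows line up across blocks.
    complete = set.intersection(*(set(block) for block in by_time.values()))
    subjects = sorted(complete)

    blocks = {
        f"{prefix}_t{timepoint}": [by_time[timepoint][s] for s in subjects]
        for timepoint in sorted(by_time)
    }
    return subjects, blocks


# Hypothetical toy records: three patients, two visits, two features.
records = [
    ("p1", 1, [0.2, 1.3]), ("p1", 2, [0.4, 1.1]),
    ("p2", 1, [0.9, 0.7]), ("p2", 2, [1.0, 0.5]),
    ("p3", 1, [0.5, 0.8]),  # p3 has no second visit, so it is dropped
]
subjects, blocks = blocks_by_timepoint(records)
print(subjects, list(blocks))
```

Note that nothing in this layout records the order of, or spacing between, the time points: each block is just another data set on the same patients, which is why it only answers questions about covariation between blocks regardless of time.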

  3. I understand MINT is meant for different populations on the same variables. Just to confirm, this runs sPLS under the hood with mitigation of batch effects? Can this be applied on dependent measures of the same patients or is multilevel sPLS-DA / DIABLO the preferred option?

    MINT is designed to account for batch or study effects, but I would not use it in this context to remove the time effect.

  4. How does the use case vary compared to N-way PLS DA/Khatri Rao structure? Does NPLS obscure signal if molecules behave differently in time (each metabolite gets one weight across all timepoints and each timepoint gets one weight across all metabolites)?

    DIABLO allows for a supervised approach and feature selection; N-way PLS-DA potentially cannot do the latter. But DIABLO can’t handle the time information, whereas N-way PLS-DA does. Sounds like our new tensor method (which includes DIABLO-style analysis for longitudinal measurements) will be suited to your case study! The multilevel step in your case would really be there to emphasise the time effect, but in effect it does not weight based on the timepoint. As you said, it would weight each metabolite across all time points, and take a similar approach for the other omics to extract correlated information. In practice it may still mean that the omics would be correlated with each other and across time, but that needs to be tested. Our tensor method should be out on bioRxiv by the end of this month if you want to check it out.
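
The constraint raised in the question (one weight per metabolite and one weight per time point) can be made concrete with a small sketch. Under that constraint, the weight of each (metabolite, time point) pair on a single component is the product of the two, so the unfolded weight matrix is an outer product and has rank at most one. The plain-Python toy below checks whether a given pattern fits that form; the helper names and the example weights are made up, and it illustrates the structure only, not any particular N-way PLS implementation.

```python
def outer(w_feature, w_time):
    """Weights for every (feature, time point) pair under the rank-one constraint."""
    return [[wf * wt for wt in w_time] for wf in w_feature]


def is_rank_one(matrix, tol=1e-12):
    """True if every 2x2 minor vanishes, i.e. the matrix has rank at most one."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    for i in range(n_rows):
        for k in range(i + 1, n_rows):
            for j in range(n_cols):
                for l in range(j + 1, n_cols):
                    minor = matrix[i][j] * matrix[k][l] - matrix[i][l] * matrix[k][j]
                    if abs(minor) > tol:
                        return False
    return True


# Two metabolites sharing one time profile: expressible as feature weight x time weight.
shared_profile = outer([1.0, 0.5], [0.2, 1.0])

# Metabolite A matters only at the first time point, metabolite B only at the second.
# No single pair of weight vectors reproduces this pattern on one component.
opposite_profiles = [[1.0, 0.0],
                     [0.0, 1.0]]

print(is_rank_one(shared_profile))     # fits the constraint
print(is_rank_one(opposite_profiles))  # does not fit the constraint
```

Several components together can approximate richer patterns, so molecules with different time behaviour are not necessarily lost, but on any single component they cannot be given different time profiles.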

Kim-Anh