Proportion explained variance in PLS vs sPLS model

Hi @windsnowflake,

So your data includes:

  • metabolites (232 continuous variables)
  • before/after intervention (1 categorical variable)
  • age (1 continuous variable)
  • sex (1 categorical variable)

The type of model you will want to build depends on which of these variables you are interested in and which ones might be confounding. I imagine you are most interested in the effect of your intervention, in which case you should set before/after intervention as your Y variable and run a (s)PLS-DA model. If this is the case, there are a couple of things I would consider:

  1. You have the same individuals sampled before and after treatment, so as you pointed out this is paired data, also called multilevel data. You can read more about multilevel data on this page, but essentially you need to account for this pairing when you build any model in mixOmics, because we expect the differences between individuals to be greater than the difference between before and after treatment. You can check whether this is the case with a PCA plot, colouring your samples by individual and using different shapes for before/after intervention.

  2. Age and sex are factors that you might not be primarily interested in, but they may also influence your metabolite data. These are called confounders or covariates. Again, you can check what effect they have on your data with a simple PCA plot. Unfortunately, mixOmics doesn’t currently have functionality to account for covariates, but there are other things you can do to get around this: see this related question and this one.
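
To make point 1 a bit more concrete, here is a minimal sketch of the idea behind a multilevel analysis: subtract each individual's mean profile from their own samples, so that only the within-individual (before/after) changes remain. This is plain Python for illustration, not mixOmics code. The function name `within_subject_deviation` and the toy numbers are made up, and in mixOmics itself you would not do this by hand (as far as I know, the `multilevel` argument handles it for you).

```python
from collections import defaultdict


def within_subject_deviation(X, subject_ids):
    """Subtract each subject's mean profile from that subject's samples.

    X is a list of samples (rows), each a list of feature values.
    subject_ids gives the individual each row belongs to.
    (Hypothetical helper for illustration only.)
    """
    n_features = len(X[0])
    sums = defaultdict(lambda: [0.0] * n_features)
    counts = defaultdict(int)

    # Accumulate per-subject sums so we can compute per-subject means.
    for row, sid in zip(X, subject_ids):
        counts[sid] += 1
        for j, value in enumerate(row):
            sums[sid][j] += value

    means = {sid: [s / counts[sid] for s in sums[sid]] for sid in sums}

    # Each row becomes its deviation from its own subject's mean.
    return [
        [value - means[sid][j] for j, value in enumerate(row)]
        for row, sid in zip(X, subject_ids)
    ]


# Toy data (invented): 2 individuals, each measured before and after,
# with 2 metabolites per sample.
X = [
    [10.0, 5.0],  # individual A, before
    [12.0, 4.0],  # individual A, after
    [30.0, 9.0],  # individual B, before
    [33.0, 7.0],  # individual B, after
]
subjects = ["A", "A", "B", "B"]

Xw = within_subject_deviation(X, subjects)
for row in Xw:
    print(row)
```

In the toy data the two individuals differ far more from each other than before differs from after. Once each individual's mean is removed, the large A-versus-B offset disappears and the before/after shift (first metabolite up, second down) is the main pattern left.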

Hope that helps!
Eva

PS: in future, could you please post different questions in separate posts? It just helps others who might have similar questions to find what they need :slight_smile:
