PLS/PLSDA for multiple variables

Hello there, I moved this part of the discussion here so other people can find it more easily! The initial discussion is here.

Thanks for your responses so far, @evahamrud! What happens if I am actually interested in all three: the intervention, as well as whether the response differs by age and sex? This is why I was thinking I should first check age and sex at baseline (time = 0) and post intervention (time = 1): are there any age or sex differences in the metabolome at those timepoints? Then I would construct an sPLSDA on the full data with a multilevel design.

I wanted to do all of this in one model for each dataset (baseline, post intervention, full), but from the links you sent me it doesn’t look like I can build a PLS model which includes all the variables of interest. Could you expand a little more on how to look at the age and sex variables in my dataset, please? :slight_smile:

Many thanks in advance,
Evelyn

Hello @windsnowflake,

If you would like to look at all your variables at once:

  • metabolites (232 continuous variables)
  • before/after intervention (1 categorical variable)
  • age (1 continuous variable)
  • sex (1 categorical variable)

I would first break up the problem to see which of the covariates (before/after, age, sex) strongly affect your metabolite data. You can do this with a PCA plot of your metabolite data, colouring the samples by each covariate in turn. You might find that some of these covariates do not strongly impact your data and therefore may not be important to include downstream.
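To make the PCA check concrete, here is a minimal sketch in plain Python/NumPy rather than mixOmics. The data are simulated (the sample size, effect sizes and variable names are all made up for illustration), and correlating the first components with each covariate is just one quick numeric stand-in for looking at the coloured plot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 40 samples x 232 metabolites (simulated, NOT real data).
n_samples, n_metabolites = 40, 232
sex = np.repeat([0, 1], n_samples // 2)       # categorical covariate coded 0/1
age = rng.uniform(20, 70, size=n_samples)     # continuous covariate
X = rng.normal(size=(n_samples, n_metabolites))
X[:, :30] += 3.0 * sex[:, None]               # simulate a sex effect on 30 metabolites

# PCA via SVD on centred, unit-variance-scaled data.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
scores = U * S                                # sample coordinates on each PC
explained = S**2 / np.sum(S**2)               # proportion of variance per PC

# Correlation of the first two PCs with each covariate.
for name, cov in [("sex", sex), ("age", age)]:
    r = [np.corrcoef(scores[:, k], cov)[0, 1] for k in range(2)]
    print(name, np.round(r, 2))
```

A covariate that correlates strongly with one of the leading components (here the simulated sex effect) is one you probably need to account for downstream. One that shows no structure on the first few components may matter less.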

It is possible to construct a PLS model where your X block is the metabolite data and your Y block is your other three variables. In this case you turn your categorical variables (before/after intervention and sex) into numeric ones (1 or 2). See this post for an example of someone doing this with clinical data. The problem is that you no longer have a discriminant analysis model (you would be running PLS rather than PLS-DA), which I think would be much more difficult to interpret, particularly if you are primarily interested in the effect of the intervention on your metabolites.
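As a rough illustration of what that Y block looks like and why the result is harder to read, here is a sketch in plain Python/NumPy (not the mixOmics implementation). It uses simulated data and hypothetical variable names, and it only computes the first PLS component, taking the weight vectors from the leading singular vectors of the X'Y cross-product matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data (simulated, NOT real): 40 samples x 232 metabolites.
n = 40
time = np.tile([1, 2], n // 2)                # before = 1, after = 2
sex = np.repeat([1, 2], n // 2)               # categorical variable coded 1 or 2
age = rng.uniform(20, 70, size=n)
X = rng.normal(size=(n, 232))
X[:, :20] += 2.0 * (time[:, None] - 1)        # simulated intervention effect

# Y block: the three covariates as numeric columns.
Y = np.column_stack([time, age, sex]).astype(float)

def scale(M):
    """Centre each column and scale it to unit variance."""
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)

Xs, Ys = scale(X), scale(Y)

# First PLS component: leading singular vectors of the cross-product X'Y.
U, S, Vt = np.linalg.svd(Xs.T @ Ys, full_matrices=False)
x_weights = U[:, 0]                           # one weight per metabolite
y_weights = Vt[0, :]                          # one weight per covariate
x_scores = Xs @ x_weights                     # sample scores in the X block
y_scores = Ys @ y_weights                     # sample scores in the Y block

print("Y weights (time, age, sex):", np.round(y_weights, 2))
```

Note that `y_weights` is a blend of time, age and sex, so the component describes a mixture of the three rather than a clean before/after separation.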

Instead of building a PLS model, I would recommend correcting for your covariates (i.e. age and sex) before building a PLS-DA model. mixOmics does not currently include this functionality, but you can have a look through the forum to see what tools others have used (perhaps this post is useful, where someone is correcting for sex).
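One common way to do this kind of correction is to regress each metabolite on the covariates and carry the residuals forward. The sketch below shows that idea in plain Python/NumPy on simulated data. It is an assumption about what "correcting" could look like, not a mixOmics function and not necessarily what the linked post used.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical toy data (simulated, NOT real): 40 samples x 232 metabolites.
n = 40
sex = np.repeat([0.0, 1.0], n // 2)
age = rng.uniform(20, 70, size=n)
X = rng.normal(size=(n, 232))
X += 0.05 * age[:, None] + 1.5 * sex[:, None]   # simulated age and sex effects

# Design matrix: intercept plus the covariates to remove.
D = np.column_stack([np.ones(n), age, sex])

# Least-squares fit of every metabolite on the covariates at once.
beta, *_ = np.linalg.lstsq(D, X, rcond=None)

# Residuals: the part of each metabolite not linearly explained by age and sex.
X_adj = X - D @ beta
```

`X_adj` would then be the input to the PLS-DA on before/after intervention. Keep in mind that this only removes linear main effects of age and sex, so it does not tell you whether the response to the intervention itself differs by age or sex.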

Hope that helps!
Eva