Using the multilevel approach for a parallel-design clinical intervention?

Dear mixOmics community and team,
We would like to know whether the multilevel approach is right for our research question, and we would be very happy to get support from your team. We have 2 groups (intervention and control) and 2 time points (pre and post). We have 23 participants in total, and each provided 2 stool samples (one pre, one post).
We are interested in finding the microbes that changed due to the intervention. We have tried the multilevel approach with the following design:

design <- data.frame(sample = meta.data$Subject_ID) # set the multilevel design

Subject ID: Unique ID for each subject
Sample ID: Unique ID for each sample

and we have the following output:

Here, we see differences in the 2 plots. Does this mean that the multi-level approach worked?
Is that design correct in this case? I checked different forum posts and am confused about whether to use "Sample ID" or "Subject ID" in the design. One more question: you mentioned in a forum post that the multilevel approach removes "inter-patients variability". Would inter-patients variability in this case mean the change in a subject's microbiome between the two time points?

Alternate strategy:
Since we are interested in finding which microbes changed in the intervention group compared to the control group, can we use the change in the microbiome as the input to PCA or sPLS-DA (response variable: group)? The exact input would be the change in CLR-transformed microbiome count data (post minus pre). Would this strategy work instead of the multilevel approach?

Thank you

  • Aakash

I am attaching the code used:

taxo <- tax_table(phyloseq_data) # extract the taxonomy
meta.data <- phyloseq_data@sam_data # extract the sample metadata

# extract the OTU table from the phyloseq object
# samples should be in rows and variables in columns
data.raw <- t(otu_table(phyloseq_data))

# add an offset of 1 so that the CLR transformation never takes log(0)
data.offset <- data.raw + 1

# remove low-count OTUs
# (low.count.removal() is assumed to be the helper function defined in the mixOmics microbiome tutorial)
result.filter <- low.count.removal(data.offset, percent = 0.01)
data.filter <- result.filter$data.filter

X <- as.data.frame(data.filter) # filtered genus-level counts as the predictor data frame
Y <- microbiome_markers$Intervention # intervention group as the response vector

# standard PCA with CLR transformation, centering and scaling
pca <- pca(X,
           scale = TRUE,
           center = TRUE,
           logratio = 'CLR')

# multilevel PCA with CLR transformation, centering and scaling
pca.multilevel <- pca(X,
                      scale = TRUE,
                      center = TRUE,
                      logratio = 'CLR',
                      multilevel = design)

# plot the samples on the standard PCs
plotIndiv(pca,
          group = meta.data$Intervention,
          ind.names = meta.data$Person_ID,
          legend = TRUE,
          legend.title = 'Intervention group',
          title = '(a) PCA on microbiome data')

# plot the samples on the multilevel PCs
plotIndiv(pca.multilevel,
          group = meta.data$Intervention,
          ind.names = meta.data$Person_ID,
          legend = TRUE,
          legend.title = 'Intervention group',
          title = '(b) Multilevel PCA on microbiome data')

Hi @Aakash,

In the design you need to use the Subject ID, as it indicates that each individual is sampled twice, so your design is correct. Your plots show that without the multilevel decomposition, samples from the same individual cluster together, while this is not the case in the multilevel plot. This is what we mean by removing the individual variation. However, your PCA plots do not show much discrimination. Can you try a multilevel (s)PLS-DA instead?
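For reference, a minimal sketch of what a multilevel sPLS-DA call could look like. It runs on simulated counts, not on the data from this thread. The choice of the time point as the response `Y`, the `keepX` values and all object names are assumptions made for illustration only, and `keepX` would normally be tuned.

```r
library(mixOmics)

set.seed(1)
n_subj <- 23   # number of participants, as in the question
n_otu  <- 30   # hypothetical number of OTUs

# One row per stool sample: two consecutive rows (pre, post) per subject
subject <- factor(rep(seq_len(n_subj), each = 2))
time    <- factor(rep(c("pre", "post"), times = n_subj), levels = c("pre", "post"))

# Simulated count table with an offset of 1, so the CLR transform is defined
X <- matrix(rpois(2 * n_subj * n_otu, lambda = 20), nrow = 2 * n_subj) + 1
colnames(X) <- paste0("OTU", seq_len(n_otu))
rownames(X) <- paste(subject, time, sep = "_")

# Multilevel design: a single column identifying the repeated-measures unit (the subject)
design <- data.frame(sample = subject)

# Multilevel sPLS-DA on CLR-transformed data, discriminating pre vs post
fit <- splsda(X, Y = time, ncomp = 2, keepX = c(10, 10),
              logratio = "CLR", multilevel = design)

# Sample plot and the OTUs selected on the first component
plotIndiv(fit, legend = TRUE, title = "Multilevel sPLS-DA (simulated data)")
selected_comp1 <- selectVar(fit, comp = 1)$name
```

Because the simulated counts are pure noise, no real discrimination is expected here. The sketch only shows how `multilevel = design` is passed alongside `logratio = "CLR"`.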

One more question: you mentioned in a forum post that the multilevel approach removes "inter-patients variability". Would inter-patients variability in this case mean the change in a subject's microbiome between the two time points?

Yes, or accounting for that bias, since you are primarily interested in the intervention group. I guess it depends on what your aim is here: focusing on the differences between pre and post, or between intervention and control? Those methods can't do both at once!
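To make the decomposition concrete, here is a small base-R sketch on made-up numbers. It assumes that a one-factor multilevel decomposition amounts to subtracting each subject's own mean from that subject's samples, which is how I understand the within-subject step. The toy values and object names are hypothetical.

```r
# Toy data: 3 subjects, each measured pre and post, 2 features (values are made up)
subject <- factor(rep(c("S1", "S2", "S3"), each = 2))
time    <- rep(c("pre", "post"), times = 3)
X <- matrix(c(10, 12, 20, 26, 30, 31,   # feature f1
               5,  4,  8,  8,  2,  6),  # feature f2
            ncol = 2,
            dimnames = list(paste(subject, time, sep = "_"), c("f1", "f2")))

# Between-subject part: each subject's mean, repeated for both of its samples
subject_means <- apply(X, 2, function(v) ave(v, subject))

# Within-subject part: what remains once the subject's own level is removed
X_within <- X - subject_means

# With exactly two samples per subject, the "post" rows of the within part
# equal half of the post-minus-pre change (and the "pre" rows are the negative)
half_change <- (X[time == "post", ] - X[time == "pre", ]) / 2
all.equal(unname(X_within[time == "post", ]), unname(half_change))
```

Under that assumption, the differences between subjects' overall levels are removed, and what is left to analyse is each subject's change between the two time points.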

You can also use the strategy that you propose, which essentially corrects for baseline. I think that would be a similar approach to multilevel. We did this for that study: https://www.nature.com/articles/s41467-019-08794-x
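A rough base-R sketch of that baseline-corrected input, again on simulated counts. The pseudocount of 1, the subject and group labels, and the use of `prcomp` in place of the mixOmics functions are all assumptions for illustration.

```r
set.seed(42)
n_subj <- 23   # number of participants, as in the question
n_otu  <- 15   # hypothetical number of OTUs

subject <- rep(sprintf("S%02d", seq_len(n_subj)), each = 2)
time    <- rep(c("pre", "post"), times = n_subj)
# Hypothetical group label, one per subject
group   <- factor(rep(c("Intervention", "Control"), length.out = n_subj))

counts <- matrix(rpois(2 * n_subj * n_otu, lambda = 30), nrow = 2 * n_subj,
                 dimnames = list(paste(subject, time, sep = "_"),
                                 paste0("OTU", seq_len(n_otu))))

# CLR transform with a pseudocount of 1: log counts minus the per-sample mean log count
log_counts <- log(counts + 1)
clr <- log_counts - rowMeans(log_counts)

# Split by time point and check that the rows line up subject by subject
pre  <- clr[time == "pre",  , drop = FALSE]
post <- clr[time == "post", , drop = FALSE]
stopifnot(identical(subject[time == "pre"], subject[time == "post"]))

# One row per subject: change in CLR abundance (post minus pre)
delta <- post - pre
rownames(delta) <- subject[time == "post"]

# Unsupervised look at the changes; the data are already CLR-transformed
pca_delta <- prcomp(delta, center = TRUE, scale. = FALSE)
```

The `delta` matrix has one row per subject, so it could then be passed to `pca()` or `splsda()` with the group as `Y`. Since it already contains CLR differences (including negative values), `logratio` would be left at `"none"` and no `multilevel` argument would be needed.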

Kim-Anh
