sPLSDA multilevel and timepoints

upi · December 6, 2022, 3:00pm

Hi!
I have a dataset with 3 different timepoints. It covers 24 plants from 9 different locations, but at each timepoint only about 14-15 plants were sampled to measure their metabolome. The data are unbalanced: I have Plant1_T1_Loc1, but I don’t necessarily have the same plant at timepoint T2, because it died and another nearby plant from those 24 was taken instead.
I ran a multilevel PCA, which shows differences between the timepoints. The chosen locations do not seem to have an influence.
I then tried to confirm that with an sPLS-DA using the code below, and the locations do not cluster at all.

X <- tlog  # log2-transformed, normalized data
Y <- as.factor(location)
summary(Y)

MyResult.splsda <- splsda(X, Y, multilevel = sampleID, scale = TRUE)

My question is now if I can use the timepoints as group and additionally multilevel in sPLS-DA?

MaxBladen · December 8, 2022, 1:35am

This will be a tricky dataset to draw meaningful conclusions from. Without consistent samples across time or space, there are likely to be a lot of spurious relationships in your data. Given this, it’s unsurprising that the locations didn’t cluster.

Additionally, within your call to splsda(), you use sampleID as the multilevel parameter. If you want to control for the time measurements or location, you need to pass that information to multilevel. I’d recommend exploring the withinVariation() function too, as it gives you a bit more control.
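As a rough illustration of withinVariation(), here is a minimal sketch on simulated data. Everything in it (the 6 plants, 20 metabolites, the names plant_id, met1...met20 and the keepX values) is made up for the example, and it assumes a balanced design where every plant is measured at every timepoint, which is not the case in the dataset described above. As far as I understand the mixOmics documentation, the first column of the design holds the ID of the unit that was measured repeatedly.

```r
library(mixOmics)

set.seed(1)
n_plants <- 6   # hypothetical number of plants
n_time   <- 3   # three timepoints, as in the question

# Hypothetical design: one column holding the repeated-measures unit (plant ID)
design <- data.frame(plant_id = factor(rep(seq_len(n_plants), each = n_time)))
timepoint <- factor(rep(paste0("T", seq_len(n_time)), times = n_plants))

# Simulated stand-in for the log-transformed metabolite matrix
X <- matrix(rnorm(n_plants * n_time * 20), ncol = 20)
colnames(X) <- paste0("met", 1:20)

# Extract the within-plant variation explicitly ...
Xw <- withinVariation(X, design = design)

# ... and run sPLS-DA on it, here discriminating the timepoints
res <- splsda(Xw, timepoint, ncomp = 2, keepX = c(5, 5))
```

Doing the decomposition by hand like this lets you inspect Xw before modelling, which is the extra control mentioned above.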

My question is now if I can use the timepoints as group and additionally multilevel in sPLS-DA?

I’m not sure I fully understand the question as I don’t know why you would do this. If you use the timepoints as your multilevel parameter, the algorithm will attempt to “remove” the between-timepoint variation in your dataset. Then, if you pass the same timepoint vector as your Y in splsda(), it will attempt to generate a model which best discriminates between the timepoints. However, you would have removed the variation between timepoints, meaning splsda() is unlikely to perform well at all.
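To make that concrete, here is a small base R sketch on simulated data. It is a simplification, not the mixOmics implementation: it just subtracts the per-timepoint mean from each variable, which is the kind of centering a one-factor multilevel decomposition applies to whatever grouping you hand it. The shift values and dimensions are arbitrary assumptions.

```r
set.seed(42)

# 3 timepoints x 5 samples each, 4 simulated variables
timepoint <- factor(rep(c("T1", "T2", "T3"), each = 5))
shift <- c(T1 = 0, T2 = 2, T3 = 4)   # built-in timepoint effect
X <- matrix(rnorm(15 * 4), ncol = 4) + shift[as.character(timepoint)]

# Before: timepoint means clearly differ
means_before <- apply(X, 2, function(v) tapply(v, timepoint, mean))

# "Remove" between-timepoint variation: subtract each timepoint's mean
group_means <- apply(X, 2, function(v) ave(v, timepoint))
Xw <- X - group_means

# After: every timepoint has mean zero in every variable,
# so there is no mean difference left to discriminate on
means_after <- apply(Xw, 2, function(v) tapply(v, timepoint, mean))

print(round(means_before, 2))
print(round(means_after, 2))
```

Once the timepoint means are all zero, a classifier using timepoint as Y has nothing systematic left to separate the groups.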
