Hi mixOmics team,
I went through your case study of sPLS with the Liver Toxicity dataset (sPLS Liver Toxicity Case Study | mixOmics) and applied the methods shown there to my own datasets. I then discovered that the tuning results, especially the choice.keepX and choice.keepY outputs of tune.spls, change when I swap which dataset is passed as X and which as Y. I reproduced this with the code from the case study to show what I mean:
Original code:
library(mixOmics)
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic
spls.liver <- spls(X = X, Y = Y, ncomp = 5, mode = 'regression')
perf.spls.liver <- perf(spls.liver, validation = 'Mfold',
                        folds = 10, nrepeat = 5)
list.keepX <- c(seq(20, 50, 5))
list.keepY <- c(3:10)
tune.spls.liver <- tune.spls(X, Y, ncomp = 2,
                             test.keepX = list.keepX,
                             test.keepY = list.keepY,
                             nrepeat = 1, folds = 10,
                             mode = 'regression', measure = 'cor')
tune.spls.liver$choice.keepX
#Output:
# comp1 comp2
# 20 40
tune.spls.liver$choice.keepY
#Output:
# comp1 comp2
# 3 3
Swapped X and Y dataset:
#datasets swapped
X2 <- liver.toxicity$clinic
Y2 <- liver.toxicity$gene
spls.liver2 <- spls(X = X2, Y = Y2, ncomp = 5, mode = 'regression')
perf.spls.liver2 <- perf(spls.liver2, validation = 'Mfold',
                         folds = 10, nrepeat = 5)
# also swap lists as datasets are swapped
list.keepX2 <- c(3:10)
list.keepY2 <- c(seq(20, 50, 5))
tune.spls.liver2 <- tune.spls(X2, Y2, ncomp = 2,
                              test.keepX = list.keepX2,
                              test.keepY = list.keepY2,
                              nrepeat = 1, folds = 10, # use 10 folds
                              mode = 'regression', measure = 'cor')
tune.spls.liver2$choice.keepX
#Output:
# comp1 comp2
# 10 9
tune.spls.liver2$choice.keepY
#Output:
# comp1 comp2
# 30 50
As you can see, the resulting keepX and keepY values differ completely from the ones before. I would have expected the values simply to be swapped (so that keepY2 now holds the values keepX had before, …), so I wonder why this happens.
Could someone explain why these results differ, and how I should decide which dataset to use as X and which as Y? The choice obviously leads to different results in the plots (CIM, …) that follow the tuning.
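In case it helps narrow this down: since I used nrepeat = 1, part of the difference might come from the random fold assignment rather than from the swap itself. Below is a rough sketch of how that could be checked by re-running the original (unswapped) tuning with a few different seeds. The helper name tune_once is made up for this post, and I am assuming that set.seed() fixes the folds when tune.spls runs sequentially; I have not confirmed this for every mixOmics version.

```r
library(mixOmics)
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic

# Hypothetical helper: run the same tuning as above for a given seed
tune_once <- function(seed) {
  # Assumption: the seed controls the fold assignment in a sequential run
  set.seed(seed)
  res <- tune.spls(X, Y, ncomp = 2,
                   test.keepX = seq(20, 50, 5),
                   test.keepY = 3:10,
                   nrepeat = 1, folds = 10,
                   mode = 'regression', measure = 'cor')
  # Collect the selected values for both components in one vector
  c(keepX = res$choice.keepX, keepY = res$choice.keepY)
}

# Same data and same X/Y orientation, three different seeds;
# each column of 'runs' holds the choices from one seed
runs <- sapply(1:3, tune_once)
runs
```

If the columns already differ from each other, the selection is unstable with a single repeat, and the swapped and unswapped runs would need a larger nrepeat before they can be compared fairly.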
Best regards,
Katharina