perf() step gives inconsistent results?

Hi, I’m working through the steps of building a block.plsda() model for a DIABLO analysis. I have run block.plsda() followed by perf(), but the results keep changing between runs. Any suggestions on what I might be missing to make this step give reproducible results?

I ran pls() on each pair of blocks in my X list to determine the design, and I am using that design matrix in block.plsda().

diablo.MD <- block.plsda(X, Y, ncomp = 3, design = design)
perf.diablo.MD <- perf(diablo.MD, validation = 'Mfold',
                       folds = 6, nrepeat = 10)
plot(perf.diablo.MD)
perf.diablo.MD$choice.ncomp$WeightedVote
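For reference, the design step mentioned above can be sketched like this, following the approach in the mixOmics DIABLO vignette. The block names and the breast.TCGA example data are stand-ins for your own X list, not your actual data:

```r
library(mixOmics)
data(breast.TCGA)  # example multi-omics data shipped with mixOmics
X <- list(mRNA  = breast.TCGA$data.train$mrna,
          miRNA = breast.TCGA$data.train$mirna)

# Pairwise PLS between blocks: the correlation between the first pair of
# latent components indicates how strongly to connect the two blocks.
pls.res <- pls(X$mRNA, X$miRNA, ncomp = 1)
cor(pls.res$variates$X[, 1], pls.res$variates$Y[, 1])

# A typical compromise design: connect all off-diagonal block pairs
# with a small weight such as 0.1.
design <- matrix(0.1, nrow = length(X), ncol = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0
```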

Hi @atan,

The cross-validation step in perf() randomly assigns samples to the training and test sets, hence the instability of the results. We usually advise nrepeat = 50 to obtain more stable performance estimates. However, depending on your sample size the results may still vary: with few samples per fold, each random split changes the estimates noticeably, so the folds argument matters too.
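A minimal sketch of a more stable, reproducible run (assuming the diablo.MD model from your post; set.seed() is base R, and fixing the seed immediately before perf() pins the random fold assignment, at least for serial execution):

```r
# Fixing the RNG seed right before perf() makes the random M-fold splits
# identical across reruns (assumption: serial execution; parallel
# back-ends may need their own seed handling).
set.seed(123)                       # any fixed value
perf.diablo.MD <- perf(diablo.MD, validation = 'Mfold',
                       folds = 6, nrepeat = 50)   # 50 repeats for stability
perf.diablo.MD$choice.ncomp$WeightedVote
```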

You could also consider validation = 'loo' (leave-one-out), but I’d say this option is only valid when you have a very small sample size (<= 10 or so), and it may lead to overoptimistic, overfitted results.
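For completeness, the leave-one-out variant looks like this (a sketch on the same diablo.MD model; folds and nrepeat are ignored because each sample is left out exactly once, so the splits are deterministic and no seed is needed):

```r
# Leave-one-out CV: n deterministic splits, one per sample, hence
# reproducible by construction -- but typically over-optimistic.
perf.loo <- perf(diablo.MD, validation = 'loo')
plot(perf.loo)
```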

Kim-Anh