Optimal components from perf() and tune.splsda() functions are not optimal?

I am building a PLS-DA model and was experimenting with the number of components in the final model. At first I used the number of components suggested by the perf() and tune() functions (taking into account your advice in the manual and in other topics).
However, I noticed that the suggested number of components (and variables) is often not optimal.

For example, here it seems like 1 component would be best for max.dist. But with 1 component, 3 of the 26 samples are incorrectly classified, whereas with 6 components all 26 samples are correctly classified.

I used 7-fold cross-validation repeated 75 times, so that should not be the problem, right? Does it have something to do with the threshold that determines whether the model improves or not?

Thanks in advance!

When determining the optimal number of components, the perf() and tune() functions employ the following algorithm:

  • for x in 1 : ncomp:
    • generate the component values and loadings by decomposition.
    • use M-fold cross-validation to estimate the error rate of a model using these variates. This yields a distribution of nrepeat error rates.
    • if x > 1:
      • run a one-sided t-test comparing the error-rate distribution of the current optimal ncomp against the error-rate distribution of component x.
      • if component x shows a statistically significant improvement, set x as the new optimal ncomp.
    • else:
      • set x as the optimal ncomp.
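The loop above can be sketched roughly as follows. This is a minimal Python illustration of the logic only (mixOmics itself is written in R); the function names, the normal approximation to the t-test, and the alpha threshold are all assumptions for the sketch, not the package's actual internals:

```python
# Hypothetical sketch of the component-selection loop described above.
# Each entry of error_rates is the distribution of nrepeat CV error rates
# for one component count: error_rates[0] -> 1 component, and so on.
import math
from statistics import NormalDist, mean, stdev

def one_sided_p(current, candidate):
    """One-sided Welch-style test: p-value for the hypothesis that the
    candidate's mean error rate is lower than the current optimum's.
    Uses a normal approximation, which is reasonable when nrepeat is
    large (e.g. the 75 repeats used in the question)."""
    se = math.sqrt(stdev(current) ** 2 / len(current)
                   + stdev(candidate) ** 2 / len(candidate))
    z = (mean(current) - mean(candidate)) / se
    return 1 - NormalDist().cdf(z)

def select_ncomp(error_rates, alpha=0.05):
    optimal = 0                      # start with component 1 as the optimum
    for x in range(1, len(error_rates)):
        # keep x only if it improves significantly on the current optimum
        if one_sided_p(error_rates[optimal], error_rates[x]) < alpha:
            optimal = x
    return optimal + 1               # return a 1-based component count
```

Note how this explains the behaviour in the question: once the error-rate distributions of later components overlap heavily with the current optimum's, no t-test reaches significance, so the simpler model is kept even if its point-estimate error is not the lowest.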

Despite there being this "calculated" optimal ncomp, at the end of the day we as users need to decide what value to use for this parameter. We always need to balance model accuracy against model complexity. This is why the mixOmics tuning functions tend to lean towards suggesting simpler models (i.e. fewer components and features).

Additionally, if we look at the error bars in your figure, there is a large degree of overlap. This suggests that the t-tests for components 2 to 10 were all insignificant. The sharp increase beyond component 1 explains why it was selected.

Thanks for the quick response!
I’m trying to get the maximum classification accuracy, so I think I’m just going to take the model that gives the best classification results for my training samples and then use it to classify my test samples.