Correct interpretation of performance and error rate results?

Hello,

Over the past months, I have analysed several data sets, all from mass spectrometry, using the sPLS-DA method. The main aim of the study was, first, to identify the variables that best discriminate the samples, and then to correctly predict the class of new data sets.

During the analysis workflow, I paid particular attention to the performance results, especially the error rates, on the assumption that the closer the error rate is to 0, the better the accuracy. After the tuning process, I usually obtain error rates between 0.25 and 0.4. I used to consider these bad results, since to me an error rate of 0.4 means the prediction would be incorrect 40% of the time. Meanwhile, the cim figures look good, and sPLS-DA separates the samples well, with no ellipse overlap.
In addition, when I run an AUROC analysis, the sensitivity tends to be high, sometimes around 90%, suggesting a good model.

Finally, looking at the literature, all the publications I have seen present figures but do not mention error rates, which makes me question the importance of error rates and performance assessment.

Given all of the above, I am very confused about what I should focus on. Are the error rates the most important criterion? If the sPLS-DA and cim figures separate the samples well, is the analysis acceptable? And since, in the book (I have it) and the online course (I followed it in October 2021), you mention that AUROC results and performance results can differ, which one should I be more interested in?

Thank you in advance for your help.

Best regards

Fabien

hi @Fabien-Filaire ,

Great to hear that you are progressing in your analyses.

Yes, people often do not mention the error rate, I suspect because often they are pretty poor :slight_smile: and they prefer to adopt a more ‘exploratory’ approach. It is all data dependent, so 0.25 might be quite good in your case. There are a few insights you can get from this process:

  • from the perf outputs, you can look at the classification prediction per class (and identify if there is a class that is difficult to classify)
  • the stability measures from perf help you prioritise specific selected features

Your end results (cim, plotIndiv) tell you that on the full data set your samples are separable, whereas perf tells you whether these results are reproducible when you perturb the data set (this is what the error rates reflect). The AUROC is largely inflated (it is a kind of overfitting, because the PLS-DA has already been fitted to discriminate the classes).
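To make the error rate concrete: it is simply the proportion of misclassified samples across the cross-validation repeats, reported overall and per class. mixOmics computes this in R inside perf; the sketch below redoes the same arithmetic in Python, purely for illustration, on made-up cross-validated predictions (the labels and values are not from any real data set).

```python
# Illustration only: mixOmics' perf() computes these rates in R;
# this is just the underlying arithmetic, on hypothetical predictions.
from collections import Counter

def error_rates(y_true, y_pred):
    """Overall and per-class classification error rates."""
    overall = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    counts = Counter(y_true)
    for cls in counts:
        errs = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        per_class[cls] = errs / counts[cls]
    return overall, per_class

# Hypothetical cross-validated predictions for 10 samples, 2 classes
y_true = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
y_pred = ["A", "A", "A", "B", "B", "B", "B", "B", "B", "A"]
overall, per_class = error_rates(y_true, y_pred)
# overall = 0.3; class A error = 0.4, class B error = 0.2
```

Note how the per-class breakdown (0.4 vs 0.2 here) immediately shows which class is harder to classify, which an overall rate alone would hide.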

In summary, feel free to show mostly the graphical outputs, and also focus your attention on the features that are selected and those that are stable, if your study is about identifying discriminative features. If your study is about designing a classifier for some sort of diagnostic, then the error rates, together with specificity and sensitivity, are important.

Kim-Anh

Hello @kimanh.lecao
Thank you for your answer, it helped me a lot.

Yes, I usually look in more detail at the error rates per class; I find them more insightful.

About the stability, I have a question: what stability threshold should I use to select variables? And should this threshold be the same for all components?

Finally, when you speak about specificity and sensitivity, do you mean the classical sensitivity and specificity calculations (TP/(TP+FN) and TN/(TN+FP))?

Thank you for your help

Fabien

hi @Fabien-Filaire,

About the stability, I have a question: what stability threshold should I use to select variables? And should this threshold be the same for all components?

The threshold would differ from one component to another (you may see that some components are more stable than others). And no, there is no magic number; it depends on what you would be happy with. For example, if you expect your variables to be highly correlated, then you may want a low threshold (e.g. 50%) in order to capture those in your interpretation. If you want the most stable biomarkers, then you should go higher.

Finally, when you speak about specificity and sensitivity, do you mean the classical sensitivity and specificity calculations (TP/(TP+FN) and TN/(TN+FP))?

Yes, as they would be classically calculated for a ROC curve.
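For completeness, here are those two formulas written out on a hypothetical confusion matrix (the counts are invented for illustration, not taken from any mixOmics output):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Classical definitions: sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical binary confusion matrix: 45 true positives, 5 false negatives,
# 40 true negatives, 10 false positives.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=40, fp=10)
# sens = 0.9 (45/50), spec = 0.8 (40/50)
```

A ROC curve is traced by recomputing this pair at every prediction threshold, and the AUROC summarises it.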

Kim-Anh