Choice of components for DIABLO

bzavala · April 10, 2024, 6:55pm

I’m dealing with a dataset that has 2 classes, with 5 and 7 samples respectively. I know the sample size is small, but I was wondering how I should go about performing DIABLO. I ran the perf() function to determine the number of components to keep and found that the optimal number was 5, which is a lot. I am unsure how to approach this. I also noticed in other discussions that others have had this type of question, so what is the best way to approach it?

kimanh.lecao · April 18, 2024, 11:32pm

hi @bzavala

I am not sure whether you used M-fold cross-validation or leave-one-out (loo). With this sample size you should use loo, or folds = 3 with nrepeat.

In this case, I would say there is probably little benefit in considering that many components. The increase in the error rate for the Mahalanobis distance is also odd (we expect the error rate to either stabilise or decrease as components are added).

I would rather base the choice on visual considerations from the sample plots. It is highly likely that 1 or 2 components would be enough to discriminate your sample groups.

Kim-Anh

bzavala · May 3, 2024, 3:08am

I used loo. Could the result also be interpreted as the classes being nearly indistinguishable to the DIABLO classifier?

kimanh.lecao · May 9, 2024, 11:21pm

Your error rate is ~ 17% for 1 or 2 components. I’ll let you calculate how many misclassified samples that may represent!

Kim-Anh
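
As a rough worked example of the calculation suggested above: with 5 + 7 = 12 samples and leave-one-out validation, each sample is predicted exactly once, so an error rate translates directly into a count of misclassified samples. The sketch below is plain Python with hypothetical helper names (it does not call mixOmics). The thread does not say whether the ~17% is an overall or a balanced error rate (BER), so both are checked, and the rounding tolerance is an assumption.

```python
# Class sizes from the original post: 5 and 7 samples.
N_A, N_B = 5, 7
REPORTED = 0.17   # "~17%" quoted in the thread
TOL = 0.01        # assumed rounding tolerance, not from the thread

def overall_error(err_a, err_b):
    """Fraction of all samples that are misclassified."""
    return (err_a + err_b) / (N_A + N_B)

def balanced_error(err_a, err_b):
    """Mean of the two per-class error rates (BER)."""
    return (err_a / N_A + err_b / N_B) / 2

def consistent_counts(metric):
    """All (errors in class A, errors in class B) matching the reported rate."""
    return [
        (a, b)
        for a in range(N_A + 1)
        for b in range(N_B + 1)
        if abs(metric(a, b) - REPORTED) <= TOL
    ]

print("overall :", consistent_counts(overall_error))
print("balanced:", consistent_counts(balanced_error))
```

Under either definition, every matching combination adds up to 2 misclassified samples out of 12.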

bzavala · May 9, 2024, 11:50pm

Oh sorry, I made some alterations, and these are the results I am referring to.

kimanh.lecao · May 16, 2024, 9:48pm

hi @bzavala

Yes, your interpretation is correct. The classifier gets it wrong all the time. You can extract the output from perf() ($error.rate.per.class) and work out what is going on. You could also try another classifier, e.g. Random Forest, to confirm similar results.

Kim-Anh
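
To illustrate what a per-class error rate can reveal, here is a minimal sketch in plain Python. It does not call mixOmics, the function name is hypothetical, and the labels and predictions are made up for illustration (they are not the results posted in this thread). It assumes one leave-one-out prediction per sample.

```python
from collections import Counter

def per_class_error(truth, predicted):
    """Return {class: fraction of that class's samples predicted wrongly}."""
    totals = Counter(truth)
    wrong = Counter(t for t, p in zip(truth, predicted) if t != p)
    return {cls: wrong[cls] / totals[cls] for cls in totals}

# Made-up example: 5 samples of class "A", 7 of class "B".
truth = ["A"] * 5 + ["B"] * 7
# Hypothetical leave-one-out predictions from a classifier that
# mostly predicts the larger class "B".
predicted = ["B", "B", "B", "B", "A"] + ["B"] * 6 + ["A"]

rates = per_class_error(truth, predicted)
ber = sum(rates.values()) / len(rates)  # balanced error rate
print(rates)
print(round(ber, 3))
```

In this made-up case, class "A" is wrong 4 times out of 5 while class "B" is wrong once out of 7. A pattern like that would suggest the classifier is defaulting to the larger class rather than separating the groups.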
