Why select ncomp = 2 when 1 component is sufficient?

Hi all,

I am using mixOmics version 6.18.1 and following the example for the perf function:

library(mixOmics)
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic
liver.pls <- pls(X, Y, ncomp = 5)
liver.val <- perf(liver.pls, validation = "Mfold", folds = 5)

The summary in liver.val$measures$Q2.total$summary gives me the following result:
  feature comp       mean sd
1       Y    1  0.2446896 NA
2       Y    2 -0.1014370 NA
3       Y    3 -0.3209418 NA
4       Y    4 -4.1172609 NA
5       Y    5 -2.1834656 NA

The example concludes with "# ncomp = 2 is enough". Why are 2 components appropriate in this case?
From the result above, component 1 has the largest Q2 value of all components, and it is also above 0.0975 (the threshold I read about in an older mixOmics example).

Could anyone please help explain this, and tell me whether there has been any change in the way Q2.total is calculated or presented?

Many thanks,
Tuan nguyen

Hello @Tuan173,

I would recommend following the example found here rather than the examples in the documentation. The documentation contains little to no explanation of the decision-making required when using mixOmics methods.

You are correct in your interpretation that ncomp = 1 would be appropriate, because the Q2 score for component 1 is above 0.0975 while the scores for all later components fall below it. In a real-life context, this is the decision you would make. However, the example states ncomp = 2, likely because this allows the model to be visualised using the various plotting functions within mixOmics (e.g. plotVar() and plotIndiv()).
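
As a rough sketch of that decision rule, the snippet below applies the 0.0975 cutoff to the Q2.total values quoted in the question. It does not call mixOmics at all: the data frame is typed in by hand from the output above, and the names q2, threshold, ncomp_q2 and ncomp_plot are made up for illustration. It assumes the rule is "keep leading components until the first one that falls below the cutoff".

```r
# Q2.total means copied by hand from the output posted above
# (liver.val$measures$Q2.total$summary); not recomputed here.
q2 <- data.frame(
  comp = 1:5,
  mean = c(0.2446896, -0.1014370, -0.3209418, -4.1172609, -2.1834656)
)

threshold <- 0.0975  # rule-of-thumb cutoff discussed in this thread

# Keep the leading components whose Q2 meets the cutoff,
# stopping at the first component that falls below it.
below <- q2$mean < threshold
ncomp_q2 <- if (any(below)) which(below)[1] - 1 else nrow(q2)

# Assumption for illustration: use at least 2 components when the
# goal is to draw two-dimensional plots of the model.
ncomp_plot <- max(ncomp_q2, 2)

print(ncomp_q2)
print(ncomp_plot)
```

With the values from the question this selects 1 component by the Q2 rule and 2 for plotting, which matches the difference between the real-life choice and the one stated in the documentation example.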

Hope this clarified things a bit for you.

Cheers,
Max.

Hi Max,

Thank you very much for your explanation. It was of great help to me.

BW,
Tuan