Centroids dist vs Centroid in arrow plot

Jeni · January 5, 2023, 1:44pm

Hi!

I would like to know what exactly “centroid” means in the arrow plot. I can't find that information in the manual. In addition, when assessing the model, I can check its performance with the max, centroids and mahalanobis distances. Are the two centroids the same? If so, is there any way to show max.dist in the arrow plots instead of centroids? I am asking because the model performs better with max.dist than with centroids.dist.

Thanks!

MaxBladen · January 9, 2023, 9:19pm

The term “centroid” means different, but related, things in the arrow plot and in model prediction. Both build on the same base concept: a centroid is the average of a set of points in N-dimensional space.

In an arrow plot, each sample has a centroid, whose X and Y positions are that sample's component values averaged across blocks. E.g., if the x axis is component 1 and the y axis is component 2, then sample 1's component 1 values from each block are averaged to give the x position of its centroid, and its component 2 values are averaged in the same way to give the y position.
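As a rough illustration of that averaging, here is a sketch in plain Python rather than actual mixOmics code. The block names, the score values and the function name `arrow_plot_centroid` are all made up for the example.

```python
# Hypothetical component scores for ONE sample in each block,
# given as (component 1, component 2). Values are invented.
sample_scores = {
    "mRNA": (1.0, 2.0),
    "miRNA": (3.0, 0.0),
    "protein": (2.0, 1.0),
}

def arrow_plot_centroid(scores_by_block):
    """Average a sample's component scores across all blocks."""
    n_blocks = len(scores_by_block)
    n_comp = len(next(iter(scores_by_block.values())))
    return tuple(
        sum(scores[c] for scores in scores_by_block.values()) / n_blocks
        for c in range(n_comp)
    )

centroid = arrow_plot_centroid(sample_scores)
print(centroid)  # (2.0, 1.0): x = mean of component 1, y = mean of component 2
```

In the plot, the arrows for a sample would then run from this centroid out to that sample's position in each individual block.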

In contrast, prediction uses the term “centroid” to refer to a different process. I think the best explanation can be found on our website, click here.
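As I understand it, the centroid distance in prediction behaves roughly like a nearest-centroid classifier: one centroid is computed per class from the training samples' component scores, and a new sample is assigned to the class whose centroid is closest in Euclidean distance. Below is a minimal sketch of that idea in plain Python. The data and the function names `class_centroids` and `predict_nearest_centroid` are hypothetical, and this is not the mixOmics implementation.

```python
import math

# Hypothetical training scores on two components, grouped by class.
train = {
    "A": [(1.0, 1.0), (2.0, 1.0), (1.5, 2.0)],
    "B": [(6.0, 5.0), (7.0, 6.0), (6.5, 5.5)],
}

def class_centroids(scores_by_class):
    """Mean position of each class in component space."""
    centroids = {}
    for label, points in scores_by_class.items():
        n = len(points)
        n_comp = len(points[0])
        centroids[label] = tuple(
            sum(p[c] for p in points) / n for c in range(n_comp)
        )
    return centroids

def predict_nearest_centroid(point, centroids):
    """Return the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))

centroids = class_centroids(train)
print(predict_nearest_centroid((2.0, 2.0), centroids))  # A
```

So here the centroid summarises a whole class of samples, whereas in the arrow plot it summarises one sample across blocks.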

If max.dist is yielding better results, it's likely that there is a clear decision boundary in the reduced-dimensional space between the samples of your different groups, which is great! However, if the centroid and mahalanobis predictions show a large drop in performance, it may be a case of overfitting.
