# Interpretation of scatterplot from plotDiablo

Hello,

I have a question regarding the description and interpretation of this scatterplot, especially the confidence ellipses. We used data on volatile organic compounds (VOC) analyzed from different biological matrices (exhaled breath, serum, ruminal fluid, milk and urine).
We ran a trial with cows fed hay (n=16) or silage (n=16). We wanted to discriminate the VOC profiles between the animals fed the different diets and to check the correlations between the VOC profiles from the different biological matrices.

For the results section of our paper I wrote:
The Pearson correlations between the VOC profiles of the different biological matrices ranged from a correlation coefficient of 0.6 (between VOC from exhaled breath and milk) to 0.98 (between VOC from ruminal fluid and urine). The VOC from exhaled breath showed the strongest correlation with those from urine, with correlation coefficients ranging from 0.70 to 0.86, followed by ruminal fluid with correlation coefficients ranging from 0.69 to 0.84, milk with correlation coefficients ranging from 0.60 to 0.76 and serum with correlation coefficients ranging from 0.69 to 0.78.

Discussion
A strong correlation structure between the VOC profiles from the different biological matrices was observed. Component 1 of urine and of ruminal fluid discriminates well between the diets.

• Is this correct, and how can I interpret “correlation structure”? What exactly does this term mean?

What can I improve and how can I describe and interpret the confidence ellipses?

Julia

Hi @Julia,

One thing you need to remember is that this correlation is calculated between the components associated with each dataset, not between the datasets themselves. These components have been defined to maximise the covariance / correlation between the datasets, and they summarise this information into one dimension (if you look at the first component).
So the statement 'The Pearson correlations between the VOC profiles of the different biological matrices' is not quite correct.
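
To make the distinction concrete, here is a minimal sketch in Python/NumPy rather than mixOmics. It is only an illustration under stated assumptions: the two blocks are simulated, the names `make_block` and `first_components` are hypothetical, and the components come from a plain covariance-maximising (PLS-style) step, which is a simplification of what DIABLO does. The point is that the Pearson correlation is computed between two score vectors (one value per animal), not between the full data matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32  # 16 hay + 16 silage animals, as in the trial
latent = rng.normal(size=n)  # simulated shared signal (assumption)


def make_block(p, noise=1.0):
    """Simulate one data block (n samples x p VOC) driven by the shared signal."""
    loadings = rng.normal(size=p)
    return np.outer(latent, loadings) + noise * rng.normal(size=(n, p))


def first_components(X, Y):
    """First pair of components maximising the covariance between X and Y.

    PLS-style simplification, not the DIABLO algorithm itself.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return Xc @ U[:, 0], Yc @ Vt[0]


breath = make_block(20)  # stand-in for the exhaled breath block
urine = make_block(15)   # stand-in for the urine block

t_breath, t_urine = first_components(breath, urine)

# One correlation value per pair of blocks, computed on the component scores.
r = np.corrcoef(t_breath, t_urine)[0, 1]
print(f"correlation between component 1 scores: {r:.2f}")
```

Each block is reduced to one score per animal before the correlation is taken, so the reported value describes the components and not the individual VOC.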

I would say something like [of course fix my text below, or break into 2 sentences]:
The Pearson correlations between the components resulting from the DIABLO integration analysis indicate that we were able to extract correlated information between datasets, with correlation values ranging from …

Your discussion point is correct. Basically you are saying that, from the data, the DIABLO analysis was able to extract correlated information between the datasets (i.e. this correlation structure exists, which is not a given) and that you were also able to discriminate the diet groups.
The ellipse plots in this case are not extremely relevant; they are more of a visual aid. They just show the variance across groups.
I don't know if you looked at / reported the perf() outputs, including the classification error rate per group. This could complete this part if the results are worth showing.
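
As a rough illustration of what a per-group classification error rate means, here is a small Python sketch. It does not call mixOmics or reproduce perf(); the function name `per_class_error` is hypothetical and the predictions are invented purely for the example, not taken from your data.

```python
import numpy as np


def per_class_error(y_true, y_pred):
    """Return the misclassification rate within each true class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    rates = {}
    for label in np.unique(y_true):
        mask = y_true == label
        rates[str(label)] = float(np.mean(y_pred[mask] != label))
    return rates


# Invented example: 16 hay and 16 silage animals with made-up predictions.
y_true = ["hay"] * 16 + ["silage"] * 16
y_pred = ["hay"] * 14 + ["silage"] * 2 + ["silage"] * 15 + ["hay"] * 1

rates = per_class_error(y_true, y_pred)
# Balanced error rate: the average of the per-class error rates.
ber = sum(rates.values()) / len(rates)
print(rates, ber)
```

Reporting the error per diet group alongside the overall rate shows whether one diet is harder to classify than the other.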

Kim-Anh

Hi Kim-Anh,

Thank you for your explanation, this helps me a lot!

Have a nice day

Julia