[DIABLO]: circosPlot - export correlation matrix

Hello,

I’ve been using the DIABLO approach to integrate three omics datasets and came across the circosPlot function, which is very useful for visualizing pairwise correlations among the three data matrices.

My suggestion is that the correlation matrices computed internally be made accessible as an optional output, possibly built into the circosPlot function as an export argument. This would complement the plot and make the results easier to interpret.

Nevertheless, congrats on the development of the mixOmics package.

Rafael

Hi Rafael,

That sounds like a good idea. We’ll let you know once it’s implemented. In the meantime, you can get the pairwise correlations using the cor function from the base R stats package, with appropriate values for the “method” and “use” arguments. Here is an example:

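## toy omic matrices (features in rows, samples in columns)
set.seed(42)  # make the random toy data reproducible
omic1 <- matrix(sample(1:100, 12), ncol = 4, dimnames = list(features = letters[10:12], samples = LETTERS[1:4]))
omic2 <- matrix(sample(1:100, 20), ncol = 4, dimnames = list(features = letters[13:17], samples = LETTERS[1:4]))
## add a missing value
omic2[2, 4] <- NA
## pairwise correlations b/w omic1 and omic2 features
## (cor() correlates columns, hence the transposes; "complete.obs" drops any sample with an NA)
cor(t(omic1), t(omic2), use = "complete.obs", method = "pearson")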

Hope it helps.

Best

Al

Hi @RafaSilva,

Just an update: the correlation matrix can now be saved as an output of circosPlot, e.g.:

library(mixOmics)
data(nutrimouse)
Y = nutrimouse$diet
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)
design = matrix(c(0,1,1,1,0,1,1,1,0), ncol = 3, nrow = 3, byrow = TRUE)


nutrimouse.sgccda <- wrapper.sgccda(X=data,
                                    Y = Y,
                                    design = design,
                                    keepX = list(gene=c(10,10), lipid=c(15,15)),
                                    ncomp = 2,
                                    scheme = "horst")

corMat <- circosPlot(nutrimouse.sgccda, cutoff = 0.7, ncol.legend = 2, size.legend = 1.1)
corMat

Please let me know if you have any other questions/suggestions.

Thanks

Al

Hey Al,
I really appreciate the update.

I tried to get the correlation using the cor function in R, but the result is a bit different from the one exported directly from circosPlot. Could you please check whether I did anything wrong? Or is the correlation from circosPlot a modified version of the pairwise correlation?

corMat1 = cor(nutrimouse$gene, nutrimouse$lipid, use = "complete.obs", method = "pearson")

Thanks in advance for your help.

Kai

Hi @Kai,

Thanks for the code and for sharing your experience. We have started a discussion about how the similarities are calculated, and so far I believe we should revisit some of these calculations. The similarity measure returned by circosPlot is based on the correlation between the variables and the variates, so it gives different values from the ones I mentioned earlier using the cor function. My apologies for the confusion. This is a high-priority matter and we’ll update you soon.

Best,

Al

Hi,

Thank you for this great function.
Is there a way to extract which correlations are actually significant (e.g., r < 0.9)?

It would be nice if it could produce a table (a single column) with

a1 from dataset A - b1 from dataset B with correlation = 0.85
a2 from dataset A - b2 from dataset B with correlation = 0.35
a3 from dataset A - b1 from dataset B with correlation = -0.25
.
.
.

Because the same accession names are used across datasets for different variables, it is difficult to tell which pairs are the ones shown on the circosPlot.

Thanks,
John
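
A table like the one requested above can be sketched in base R. This is only an illustration: the matrix below is made up, and it assumes you have already subset the circosPlot output so that rows are features from dataset A and columns are features from dataset B. The names simMat, cutoff, featureA and featureB are hypothetical.

```r
## Hypothetical similarity values between features of two datasets,
## standing in for a cross-dataset block of the circosPlot() output
simMat <- matrix(c( 0.85, 0.10,
                    0.20, 0.35,
                   -0.25, 0.92),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("a1", "a2", "a3"), c("b1", "b2")))

cutoff <- 0.3  # assumed threshold on the absolute value

## as.table() + as.data.frame() reshapes the matrix to one row per pair
pairs <- as.data.frame(as.table(simMat), stringsAsFactors = FALSE)
names(pairs) <- c("featureA", "featureB", "similarity")

## keep pairs at or above the cutoff, strongest first
pairs <- pairs[abs(pairs$similarity) >= cutoff, ]
pairs <- pairs[order(-abs(pairs$similarity)), ]
rownames(pairs) <- NULL
pairs
```

Keeping the dataset of origin in separate columns avoids the ambiguity when the same name appears in more than one dataset.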

Hello @aljabadi ,

I just realized that the correlation matrix returned by circosPlot has diagonal values that are not 1, which is what we would expect for the correlation of a feature with itself. Am I misunderstanding something? Are these values the correlation of a single feature across the datasets considered?
Is the interpretation of these coefficients as correlations still valid?

Best wishes,
Guillaume

Was there an answer to your question @gsalle? @aljabadi I am also wondering why the matrix generated by circosPlot does not have 1 on the diagonal. It makes me think I am misunderstanding something.

Hi @jules21 and @gsalle,

I’ve briefly examined the source code behind the output of circosPlot and here are my initial thoughts. You are correct in assuming that the diagonal values should be equal to 1, as they would represent the correlation of a given feature with itself. However, that would only hold if the returned matrix contained correlations.

The discrepancy arises because the function returns a similarity matrix, not a correlation matrix. It is calculated via matrix multiplication of the feature values projected onto the components.
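
To see why such a matrix need not have 1 on the diagonal, here is a minimal base R sketch. It is not the mixOmics implementation: the data are simulated, and the first two principal components from prcomp stand in for the DIABLO variates, which is purely an assumption for illustration.

```r
set.seed(1)
## Toy data: 20 samples, two blocks of 3 features each (made-up values)
X1 <- matrix(rnorm(60), nrow = 20, dimnames = list(NULL, paste0("g", 1:3)))
X2 <- matrix(rnorm(60), nrow = 20, dimnames = list(NULL, paste0("l", 1:3)))
X  <- cbind(X1, X2)

## Stand-in for the variates: the first two principal components
## (an assumption for illustration, not what mixOmics computes)
variates <- prcomp(X, scale. = TRUE)$x[, 1:2]

## Coordinates of each feature = its correlation with each component
coords <- cor(X, variates)

## Similarity = inner products of those coordinates
simMat <- coords %*% t(coords)

## Ordinary Pearson correlation, for comparison
corMat <- cor(X)

round(diag(simMat), 2)  # below 1 when only some components are kept
diag(corMat)            # exactly 1
```

In this sketch each diagonal entry is the sum of a feature's squared correlations with the retained components, so it only reaches 1 when those components explain the feature completely.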

I’ll look into a way to return the correlation matrix and get back to you.

Hi, have you by any chance found a way to get the correlation matrix from the circosPlot results?
I want to display the correlations between my features in a correlation plot, but using the output of circosPlot is a bit misleading because the diagonal values are not equal to 1 (as it is a similarity matrix, like you mentioned).

Thank you very much for your reply

Hi @debosuissa,

You can have a look at the previous post explaining why it is the case, and why a scatterplot might not be suitable: Scatter plot using sgcca() output - #4 by kimanh.lecao

I am not sure which plot you are referring to here, but you could also consider replacing the diagonal values with 1, since the diagonal is not used in most correlation plots.
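
As a minimal sketch of that workaround (the matrix values below are made up, and simMat and plotMat are hypothetical names for the circosPlot output and the copy used for plotting):

```r
## Made-up 2 x 2 similarity matrix standing in for the circosPlot() output
simMat <- matrix(c(0.62, 0.40,
                   0.40, 0.71), nrow = 2,
                 dimnames = list(c("g1", "l1"), c("g1", "l1")))

## Work on a copy so the original similarities are kept
plotMat <- simMat
diag(plotMat) <- 1  # overwrite only the diagonal
plotMat
```

Note that the off-diagonal entries are still similarities rather than Pearson correlations, so the plot should be labelled accordingly.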

Kim-Anh

@kimanh.lecao and @MaxBladen! Thanks for your responses, I had the same question.