Hi all!
Based on a previous post, I’m really happy with the integration results I’ve been able to achieve with this package!
I was hoping for a clarification on some results.
I am trying to understand the correlation matrix extracted from my circos plot. I have 4 blocks, and based on my code I would expect this matrix to have all the features from the first component in both rows and columns, with the typical “triangle” structure: autocorrelations are perfect (1), and the x vs y and y vs x correlation coefficients are the same. However, this is not quite the case. The latter expectation holds true, but the values for autocorrelations are <1 and seem more or less random. I’ve attached a screenshot below.
Is there a reason for this? How can I interpret these values? Does this have to do with the design matrix used for building the model? I have experimented with both full and data-driven design matrices, and neither gives the expected result.
The calculation of the correlation is described in the paper “Visualising associations between paired ‘omics’ data sets” (BioData Mining) and summarised in the slides below. We don't calculate direct cross-correlations, but correlations with respect to the component. These calculations are designed only for cross-correlations (i.e. x with y, but not x with x), so you should ignore the x vs x values (or set them to 1 if you need this output for downstream analysis).
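To make the point about x vs x values concrete, here is a minimal base R sketch. It is a simplification and not the mixOmics implementation: it assumes a single simulated component `t1` and defines the similarity between two variables as the product of their correlations with that component. All names (`t1`, `X`, `Y`, `sim`, `simCross`) are made up for illustration.

```r
set.seed(1)
n  <- 50
t1 <- rnorm(n)  # stand-in for one component (assumption, not a real DIABLO variate)

# Two toy blocks whose variables are related to the component to varying degrees
X <- sapply(c(0.9, 0.6, 0.3), function(w) w * t1 + sqrt(1 - w^2) * rnorm(n))
Y <- sapply(c(0.8, 0.4),      function(w) w * t1 + sqrt(1 - w^2) * rnorm(n))
colnames(X) <- paste0("x", 1:3)
colnames(Y) <- paste0("y", 1:2)

# Correlation of every variable with the component (5 x 1 matrix)
r <- cor(cbind(X, Y), t1)

# Similarity with respect to the component: product of those correlations
sim <- r %*% t(r)

# The diagonal is the squared correlation with the component, so it is
# generally below 1 instead of the "perfect" autocorrelation one might expect
diag(sim)

# Keep only cross-block (x vs y) entries; blank out within-block ones
block     <- c(rep("X", ncol(X)), rep("Y", ncol(Y)))
sameBlock <- outer(block, block, "==")
simCross  <- sim
simCross[sameBlock] <- NA
simCross
```

In this toy setting the matrix is symmetric (x vs y equals y vs x), while the diagonal only reflects how strongly each variable is associated with the component. That matches the pattern described in the question, and masking the within-block entries is one way to follow the advice to ignore them.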
Just an add-on question/my own sanity check regarding the matrix from the circosPlot.
I’ve been following the book and its tutorials recently for my own metabolite and microbial data. Posts in this forum have helped a lot too!
So, I’ve come to the point of circosPlot and, like OP, I have extracted the underlying matrix.
If I plot my circosPlot without the parameter comp=, I get cross-correlations with respect to the components in my diablo model. I’m using ncomp=2, as I’ve found it describes my data sets well. So, this matrix shows me how my variables are cross-correlated in both comp1 and comp2, correct?
Then, if I add the comp=1 parameter to the plot, I’m now looking at how my variables are cross-correlated in that first component.
Are these correct assumptions?
I want to use the matrix in Cytoscape to show relationships between my metabolites and bacteria. (Networks in R come out quite miserable, and I’m not that awesome in R to begin with to make them better. I know I don’t need the matrix per se, because I can get the network with code in R, but I’m a very visual person, so I want to see everything to understand it.)
Yes, your interpretation is correct: you can choose to show the variables selected on each component (i.e. comp = 1) or across several components (comp = 1:2).
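As a hedged illustration of the two options, the sketch below fits a DIABLO model on the `breast.TCGA` example data shipped with mixOmics and extracts the matrix that `circosPlot` returns invisibly, once for `comp = 1` and once for `comp = 1:2`. The `keepX` values, the 0.1 design and the 0.7 cutoff are illustrative assumptions, not recommendations for your own data.

```r
library(mixOmics)

# Example data from the package (stand-in for your own blocks)
data(breast.TCGA)
X <- list(mRNA    = breast.TCGA$data.train$mrna,
          miRNA   = breast.TCGA$data.train$mirna,
          protein = breast.TCGA$data.train$protein)
Y <- breast.TCGA$data.train$subtype

# Illustrative design and keepX values (assumptions)
design <- matrix(0.1, nrow = length(X), ncol = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0
keepX <- list(mRNA = c(6, 14), miRNA = c(5, 18), protein = c(6, 7))

model <- block.splsda(X, Y, ncomp = 2, keepX = keepX, design = design)

# Variables selected on component 1 only
simComp1 <- circosPlot(model, comp = 1, cutoff = 0.7)

# Variables selected across components 1 and 2
simBoth <- circosPlot(model, comp = 1:2, cutoff = 0.7)

# Compare which variables appear in each matrix
dim(simComp1)
dim(simBoth)
```

Comparing the row names of the two matrices is a quick way to check which variables enter only when the second component is included.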
To input into Cytoscape, it is best to use the function network (network, plotVar and circosPlot are all built on the same underlying association calculation). Make sure you specify the component(s) of interest, and maybe the cutoff, then save the result as an object, e.g. myNetwork <- network(....)
Then extract myNetwork$gR (the returned object is a list, so use $ rather than @), which is the graph object to export as input for Cytoscape.
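A minimal sketch of that export, again using the `breast.TCGA` example data from mixOmics as a stand-in for your own model. The `keepX` values, design, cutoff and output file name are illustrative assumptions. The `comp` argument is left at its default here because the form it expects for multi-block models is best checked in `?network`.

```r
library(mixOmics)
library(igraph)

# Example data from the package (stand-in for your own blocks)
data(breast.TCGA)
X <- list(mRNA    = breast.TCGA$data.train$mrna,
          miRNA   = breast.TCGA$data.train$mirna,
          protein = breast.TCGA$data.train$protein)
Y <- breast.TCGA$data.train$subtype

# Illustrative design and keepX values (assumptions)
design <- matrix(0.1, nrow = length(X), ncol = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0
keepX <- list(mRNA = c(6, 14), miRNA = c(5, 18), protein = c(6, 7))

model <- block.splsda(X, Y, ncomp = 2, keepX = keepX, design = design)

# Build the relevance network between the three blocks and keep the result
myNetwork <- network(model, blocks = c(1, 2, 3), cutoff = 0.4)

# gR is an igraph object; write it to a file that Cytoscape can import
outFile <- file.path(tempdir(), "myNetwork.gml")  # hypothetical file name
write_graph(myNetwork$gR, file = outFile, format = "gml")
```

The resulting `.gml` file can then be opened in Cytoscape through its network import menu. Writing with `format = "graphml"` should work as well if you prefer that format.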