Plot DIABLO components

Hi,

I would like to make a figure similar to Fig. 3A in Singh et al. 2019. Using plotDiablo() I am able to plot the correlations between the omics types for a specific component. But how do I plot, for instance, components 1 and 2 against each other to see how well the samples separate along the components?

Kind regards,
Emmy

Have you explored the plotIndiv() function?

Thank you for your response. I have used the block.splsda (DIABLO) function and integrated 4 omics types for classification. I have explored the plotIndiv function and I think I found the parameter I am looking for: rep.space = c("X-variate", "XY-variate", "Y-variate", "multi"). However, this parameter does not seem to be available for my object, which is a block.splsda object?

Hi @emmy

Say you follow this example: DIABLO TCGA Case Study | mixOmics

You will want to add blocks = 'average' for a consensus plot, and style = 'graphics' so that you can then overlay the test samples on top using their predicted components.

plotIndiv(diablo.tcga, ind.names = FALSE, legend = TRUE, title = 'TCGA, DIABLO comp 1 - 2', blocks = 'average', style = 'graphics')

[Screenshot: consensus sample plot (components 1 and 2) produced by the call above]
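To overlay test samples on that base-graphics plot, something along these lines should work. This is a rough sketch, not from the thread: data.test is a hypothetical list of test blocks with the same block names and variables as the training data, and it assumes the $variates slot returned by predict() holds the predicted components per block.

```r
## Project the test samples onto the trained DIABLO model
predict.diablo <- predict(diablo.tcga, newdata = data.test)

## Average the per-block predicted components to get a 'consensus' position,
## comparable to the blocks = 'average' panel plotted above
## (assumes all elements of $variates are matrices of the same dimension)
test.comp <- Reduce("+", predict.diablo$variates) / length(predict.diablo$variates)

## style = 'graphics' means the plot above uses base graphics,
## so the test samples can be added with points()
points(test.comp[, 1], test.comp[, 2], pch = 17, col = "black")
```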

Kim-Anh

This is what I was looking for, thank you so much! :slight_smile:

Hello again,

I was wondering if it is possible to perform cross-validation with averaged components? When I plot my model with averaged components (blocks = 'average'), I get excellent separation. When I look at my four omics blocks separately, one of them gives excellent separation whereas the other three have slightly worse, but still OK, separation. If I understand the 'perf' function correctly, it performs cross-validation on the omics blocks separately, right? If so, the performance will be pulled down by the other three omics blocks, whereas if I could perform cross-validation on the averaged components, I would likely get excellent results, similar to the best-performing single omics block.

So my question is: is there a way to perform cross-validation directly on the averaged components, instead of performing cross-validation on the omics blocks separately and then taking the average of that?

Kind regards,
Emmy

hi @emmy

If I understand the 'perf' function correctly, it performs cross-validation on the omics blocks separately, right?

No, it does not. It runs block.splsda on cross-validation sets that take all omics into account. What your results mean is that while you see a visual separation (which is nice!), the cross-validation results are not so generalisable (make sure the number of folds is appropriate). It could also be that your sample size is fairly small.
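For reference, the cross-validation call on the full multi-block model looks like this. A minimal sketch, assuming a fitted DIABLO model named diablo.tcga as in the case study; the folds and nrepeat values are illustrative and should be adapted to your sample size.

```r
## Repeated M-fold cross-validation: each fold refits block.splsda on the
## training blocks and predicts the held-out samples using all omics
perf.diablo <- perf(diablo.tcga,
                    validation = "Mfold",
                    folds = 5,      # keep enough samples per class in each fold
                    nrepeat = 10)   # repeats stabilise the error estimates

plot(perf.diablo)  # error rates per component, block and distance
```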

It might be worthwhile for you to look deeper into the perf() results. There are quite a few outputs, and some may be more useful than others for understanding where the classification errors come from (e.g. maybe a specific class and a specific data set?).

For sgccda models, `perf` produces the following outputs:

| Output | Description |
| --- | --- |
| `error.rate` | Prediction error rate for each block of `object$X` and each `dist` |
| `error.rate.per.class` | Prediction error rate for each block of `object$X`, each `dist` and each class |
| `predict` | Predicted values of each sample for each class, each block and each component |
| `class` | Predicted class of each sample for each block, each `dist`, each component and each `nrepeat` |
| `features` | A list of features selected across the folds (`$stable.X` and `$stable.Y`) for the `keepX` and `keepY` parameters from the input object |
| `AveragedPredict.class` | If more than one block, the average predicted class over the blocks (average of the `Predict` output, prediction using the `max.dist` distance) |
| `AveragedPredict.error.rate` | If more than one block, the average predicted error rate over the blocks (using the `AveragedPredict.class` output) |
| `WeightedPredict.class` | If more than one block, the weighted predicted class over the blocks (weighted average of the `Predict` output, prediction using the `max.dist` distance). See details for more info on weights |
| `WeightedPredict.error.rate` | If more than one block, the weighted average predicted error rate over the blocks (using the `WeightedPredict.class` output) |
| `MajorityVote` | If more than one block, the majority class over the blocks. NA for a sample means there is no consensus on the predicted class for that sample over the blocks |
| `MajorityVote.error.rate` | If more than one block, the error rate of the `MajorityVote` output |
| `WeightedVote` | If more than one block, the weighted majority class over the blocks. NA for a sample means there is no consensus on the predicted class for that sample over the blocks |
| `WeightedVote.error.rate` | If more than one block, the error rate of the `WeightedVote` output |
| `weights` | The weights of each block used for the weighted predictions, for each `nrepeat` and each fold |
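The combined-block outputs above are the ones relevant to your question about averaged components. A hedged sketch of how to access them, reusing the hypothetical perf.diablo object from the call earlier:

```r
## Per-block error rates (one set per omics block and distance)
perf.diablo$error.rate

## Error rates after combining the blocks: averaged / weighted predictions
## and majority / weighted votes (only returned with more than one block)
perf.diablo$AveragedPredict.error.rate
perf.diablo$WeightedPredict.error.rate
perf.diablo$MajorityVote.error.rate
perf.diablo$WeightedVote.error.rate
```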

Kim-Anh

Ok great, thanks for your help! :slight_smile:
