Plot DIABLO components

Hi,

I would like to make a figure similar to Fig. 3A in Singh et al. 2019. Using plotDiablo() I am able to plot the correlations between the omics types for a specific component. But how do I plot, for instance, components 1 and 2 against each other to see how well the samples separate along the components?

Kind regards,
Emmy

Have you explored the plotIndiv() function?

Thank you for your response. I have used the block.splsda (DIABLO) function and integrated 4 omics types for classification. I have explored the plotIndiv function and I think I found the parameter I am looking for: rep.space = c("X-variate", "XY-variate", "Y-variate", "multi"). However, this parameter does not seem to be available for my object, which is a block.splsda object?

Hi @emmy

Say you follow this example: DIABLO TCGA Case Study | mixOmics

You will want to add blocks = 'average' for a consensus plot, and style = 'graphics' so that you can then overlay the test samples on top using their predicted components.

plotIndiv(diablo.tcga, ind.names = FALSE, legend = TRUE, title = 'TCGA, DIABLO comp 1 - 2', blocks = 'average', style = 'graphics')

[Screenshot: consensus sample plot (components 1 and 2) produced by the call above]
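To overlay test samples on that base-graphics plot, something along these lines should work. This is a rough sketch, not from the thread: data.test is a hypothetical list of test blocks with the same block names and variables as the training data, and it assumes the $variates slot returned by predict() holds the predicted components per block.

```r
## Project the test samples onto the trained DIABLO model
predict.diablo <- predict(diablo.tcga, newdata = data.test)

## Average the per-block predicted components to get a 'consensus' position,
## comparable to the blocks = 'average' panel plotted above
## (assumes all elements of $variates are matrices of the same dimension)
test.comp <- Reduce("+", predict.diablo$variates) / length(predict.diablo$variates)

## style = 'graphics' means the plot above uses base graphics,
## so the test samples can be added with points()
points(test.comp[, 1], test.comp[, 2], pch = 17, col = "black")
```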

Kim-Anh

This is what I was looking for, thank you so much! :slight_smile:

Hello again,

I was wondering if it is possible to perform cross-validation with averaged components? When I plot my model with averaged components (blocks = 'average'), I get excellent separation. When I look at my four omics blocks separately, one of them gives excellent separation whereas the other three have slightly worse, but still OK, separation. If I understand the 'perf' function correctly, it performs cross-validation on the omics blocks separately, right? If so, the performance will be pulled down by the other three omics blocks, whereas if I could perform cross-validation on the averaged components, I would likely get excellent results, similar to the best-performing single omics block.

So my question is: is there a way to perform cross-validation directly on the averaged components, instead of performing cross-validation on the omics blocks separately and then taking the average of that?

Kind regards,
Emmy

hi @emmy

If I understand the 'perf' function correctly, it performs cross-validation on the omics blocks separately, right?

No, it does not. It runs block.splsda on cross-validation sets that take all omics into account. What your results mean is that while you see a visual separation (which is nice!), the cross-validation results are not so generalisable (make sure the number of folds is appropriate). It could also be that your sample size is fairly small.
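For reference, the cross-validation call on the full multi-block model looks like this. A minimal sketch, assuming a fitted DIABLO model named diablo.tcga as in the case study; the folds and nrepeat values are illustrative and should be adapted to your sample size.

```r
## Repeated M-fold cross-validation: each fold refits block.splsda on the
## training blocks and predicts the held-out samples using all omics
perf.diablo <- perf(diablo.tcga,
                    validation = "Mfold",
                    folds = 5,      # keep enough samples per class in each fold
                    nrepeat = 10)   # repeats stabilise the error estimates

plot(perf.diablo)  # error rates per component, block and distance
```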

It might be worthwhile for you to look deeper into the perf() results. There are quite a few outputs, and some may be more useful than others for understanding where the classification errors come from (e.g. maybe a specific class and a specific data set?).

For sgccda models, `perf` produces the following outputs:

| Output | Description |
| --- | --- |
| `error.rate` | Prediction error rate for each block of `object$X` and each `dist` |
| `error.rate.per.class` | Prediction error rate for each block of `object$X`, each `dist` and each class |
| `predict` | Predicted values of each sample for each class, each block and each component |
| `class` | Predicted class of each sample for each block, each `dist`, each component and each `nrepeat` |
| `features` | A list of features selected across the folds (`$stable.X` and `$stable.Y`) for the `keepX` and `keepY` parameters from the input object |
| `AveragedPredict.class` | If more than one block, the average predicted class over the blocks (average of the `Predict` output, prediction using the `max.dist` distance) |
| `AveragedPredict.error.rate` | If more than one block, the average predicted error rate over the blocks (using the `AveragedPredict.class` output) |
| `WeightedPredict.class` | If more than one block, the weighted predicted class over the blocks (weighted average of the `Predict` output, prediction using the `max.dist` distance). See details for more info on weights |
| `WeightedPredict.error.rate` | If more than one block, the weighted average predicted error rate over the blocks (using the `WeightedPredict.class` output) |
| `MajorityVote` | If more than one block, the majority class over the blocks. NA for a sample means there is no consensus on the predicted class for that sample over the blocks |
| `MajorityVote.error.rate` | If more than one block, the error rate of the `MajorityVote` output |
| `WeightedVote` | If more than one block, the weighted majority class over the blocks. NA for a sample means there is no consensus on the predicted class for that sample over the blocks |
| `WeightedVote.error.rate` | If more than one block, the error rate of the `WeightedVote` output |
| `weights` | The weights of each block used for the weighted predictions, for each `nrepeat` and each fold |
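The combined-block outputs above are the ones relevant to your question about averaged components. A hedged sketch of how to access them, reusing the hypothetical perf.diablo object from the call earlier:

```r
## Per-block error rates (one set per omics block and distance)
perf.diablo$error.rate

## Error rates after combining the blocks: averaged / weighted predictions
## and majority / weighted votes (only returned with more than one block)
perf.diablo$AveragedPredict.error.rate
perf.diablo$WeightedPredict.error.rate
perf.diablo$MajorityVote.error.rate
perf.diablo$WeightedVote.error.rate
```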

Kim-Anh

Ok great, thanks for your help! :slight_smile:
