AUC for DIABLO object

Hello,

I am using DIABLO to integrate transcriptomics and proteomics in order to identify a signature made up of ~2 transcripts and ~2 proteins.

I have identified the signature using block.splsda, and now I would like to use leave one out cross validation to find out the AUC of the combined signature (genes and proteins).

I have used the perf function, but it does not return the AUCs. This is the code I ran:
perf.null <- perf(diablo.res.null, validation = 'loo', auc = TRUE,
                  nrepeat = 2, dist = 'mahalanobis.dist')

When I inspect perf.null$auc, it is empty.
Is there an alternative way to run cross-validation and then obtain the combined cross-validated AUC?

Many thanks in advance,
Heather Jackson

Hi Heather,
Once you have your parameters keepX etc, run a full block.splsda on the whole data set and then:
auroc(object).

However, auroc() does not perform cross-validation, so we will follow up on the perf() code in the meantime.
Note: if you do leave-one-out cross-validation, you do not need to repeat the CV; a single pass covers all possibilities!

Kim-Anh
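
To make the two points above concrete, here is a minimal sketch. It assumes the nutrimouse example data shipped with mixOmics as a stand-in for the transcript and protein blocks, and the keepX values are placeholders rather than tuned parameters. The object names (diablo.final, perf.loo, auc.gene) are chosen for illustration only.

```r
library(mixOmics)

# Example data from mixOmics, used here in place of the real omics blocks
data(nutrimouse)
Y <- nutrimouse$diet
X <- list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)

# Final model fitted on the whole data set (keepX values are placeholders)
diablo.final <- block.splsda(X = X, Y = Y, ncomp = 2,
                             keepX = list(gene = c(10, 10), lipid = c(15, 15)))

# AUC on the fitted data, one block at a time; this is NOT cross-validated
auc.gene <- auroc(diablo.final, roc.block = "gene", roc.comp = 2)

# Leave-one-out CV: one pass already leaves out every sample once,
# so nrepeat is left at its default
perf.loo <- perf(diablo.final, validation = "loo")
perf.loo$WeightedVote.error.rate
```

Because auroc() is evaluated on the same samples used to fit the model, its AUC is likely to be more optimistic than a cross-validated estimate.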

Hi Kim-Anh,

Thanks a lot for your reply, and for pointing out about leave-one-out cross validation!

So is it only possible to see the AUC for each block individually, rather than the combined AUC?

Thanks
Heather

Hi @hj4817,

You can now install the latest version in which the perf function calculates the combined AUC (averaged across all blocks) for each component. Also, feel free to use the cpus argument for faster computation.

Please let us know if you run into any issues.

Best wishes,

Al

Dear Al, Kim-Anh,

I am trying to run the auroc on the DIABLO perf data, but it is giving the following error:

auroc(perf.diablo.final)
Error in UseMethod("auroc") :
no applicable method for 'auroc' applied to an object of class "perf.sgccda.mthd"

I am running the block.splsda function, resulting in my diablo.final object:

class(diablo.final)
[1] "block.splsda" "block.spls" "sgccda" "sgcca" "DA"

When I run the perf() function, it apparently uses the sgccda class instead of block.splsda. I am not sure whether that is the reason why the auroc() function fails. So far I have not been able to figure out why it does not use block.splsda, or how to fix it:

class(perf.diablo.final)
[1] "perf.sgccda.mthd"

Any ideas?

Best, Lisette

Hi @LKogelman,

The auroc function can be applied to the diablo object directly. It cannot be applied to the perf object. You essentially use the perf function to choose the number of components and then evaluate the final diablo model using auroc. That is:

auroc.sgccda(diablo.final)

Let me know if you have further questions.

Al

Hi @aljabadi,

Thanks for your reply!
OK, then I misunderstood the line you wrote above: "You can now install the latest version in which the perf function calculates the combined AUC (averaged across all blocks) for each component."

How can I get the combined AUC for each component? From that line I understood that I had to run auroc after the perf function. Unfortunately, I cannot find the combined AUC of 'diablo.final' when using the auroc function.

Best,
Lisette

Hi @LKogelman,
You can use auc = TRUE with perf to calculate the average AUC. See example below.

data(nutrimouse)
Y = nutrimouse$diet
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)
design = matrix(c(0,1,1,1,0,1,1,1,0), ncol = 3, nrow = 3, byrow = TRUE)
nutrimouse.sgccda <- block.splsda(X=data,
                                  Y = Y,
                                  design = design,
                                  keepX = list(gene=c(10,10), lipid=c(15,15)),
                                  ncomp = 2,
                                  scheme = "horst")

perf.res <- perf(nutrimouse.sgccda, auc = TRUE, folds = 3, nrepeat = 3)
perf.res$auc

Please let me know if you have further questions.

Best,

Al

Hi @aljabadi

In the above example, how can one plot the result of perf.res$auc?

Thanks,

Ramiro

We have the auroc() function for that.

Hi @aljabadi ,

Thanks for the tweak to compute the combined AUC per component. But does the auroc plot function still plot per block and per component?

I am currently using mixOmics_6.15.1. I tried looking into the auroc function, but its current parameters do not seem to allow plotting the combined AUC. Is there something I might be missing?

Thanks again for your amazing suite of methods in mixOmics.

Hi @kimanh.lecao
I split my data into a training set and a test set.
I used the perf function to get the average AUC for the training set (DIABLO object).
Now I want to know how to calculate the combined average AUC for my test set,
since perf cannot be applied to a predict object.

Hi @shshahbazii

You can use the predict function on your test set, as shown at the end of the "DIABLO TCGA Case Study" vignette on the mixOmics website, and then calculate the prediction error, sensitivity, specificity, etc.

Kim-Anh
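
A rough sketch of that workflow follows, assuming the nutrimouse example data from mixOmics and a hypothetical split that holds out two mice per diet group. The keepX values are placeholders, and the choice of the weighted vote with centroids.dist mirrors the vignette rather than being a recommendation for every data set.

```r
library(mixOmics)

data(nutrimouse)
Y <- nutrimouse$diet

# Hypothetical split: hold out two samples per diet group as the test set
set.seed(42)
test.idx <- unlist(lapply(split(seq_along(Y), Y), sample, size = 2),
                   use.names = FALSE)

X.train <- list(gene  = as.matrix(nutrimouse$gene[-test.idx, ]),
                lipid = as.matrix(nutrimouse$lipid[-test.idx, ]))
X.test  <- list(gene  = as.matrix(nutrimouse$gene[test.idx, ]),
                lipid = as.matrix(nutrimouse$lipid[test.idx, ]))

# Train on the training samples only (keepX values are placeholders)
diablo.train <- block.splsda(X = X.train, Y = Y[-test.idx], ncomp = 2,
                             keepX = list(gene = c(10, 10), lipid = c(15, 15)))

# Predict the held-out samples
pred <- predict(diablo.train, newdata = X.test)

# Class calls from the weighted vote across blocks, using components 1 and 2
pred.class <- pred$WeightedVote$centroids.dist[, 2]

# Confusion matrix and balanced error rate on the test set
conf.mat <- get.confusion_matrix(truth = Y[test.idx], predicted = pred.class)
conf.mat
ber <- get.BER(conf.mat)
ber
```

Sensitivity and specificity per class can then be read off the rows and columns of conf.mat.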

Hello,

I am somewhat confused about how to interpret the combined AUC from DIABLO, and I would be grateful for your advice.

As mentioned in the comments above, the output of the perf() function with auc = TRUE is the combined AUC for each component.

I wonder whether this means that, when performing the classification, the model uses only the predicted scores from exactly that latent component, OR that the model uses all components up to that number.

For example, suppose I run the perf() function to calculate the combined AUC of a block.plsda model with two components, and the combined AUC is 0.9 for comp 1 and 0.8 for comp 2. Does it mean that the model with one component has AUC = 0.9 and the model with two components has AUC = 0.8, OR does it mean that, within the two-component model, comp 1 gives AUC = 0.9 and comp 2 gives AUC = 0.8?

If the AUC is given separately for each component, is there a way to combine the predictions of all components of the model into a single prediction? Of note, I do not have a test set to use with the predict() function, as my sample size is too small. I find it quite hard to interpret the AUC of each component separately.

Thank you so much in advance for your kind support. I look forward to your response.

Kỳ Phát

Dear @ngkyphat,

For example, suppose I run the perf() function to calculate the combined AUC of a block.plsda model with two components, and the combined AUC is 0.9 for comp 1 and 0.8 for comp 2. Does it mean that the model with one component has AUC = 0.9 and the model with two components has AUC = 0.8, OR does it mean that, within the two-component model, comp 1 gives AUC = 0.9 and comp 2 gives AUC = 0.8?

Our legend is poorly worded. All our PLS models are sequential, meaning that what you learn from component 2 includes any information learnt previously from component 1.
So in your case:
AUC with component 1 only = 0.9
AUC with components 1 and 2 = 0.8 (a decrease in performance when you add a second component). See my previous post about AUC and how it can be a bit optimistic compared to the classification error rate.

Kim-Anh
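
As an illustration of this cumulative reading, the sketch below runs perf() on the nutrimouse example data and prints the combined AUC next to the weighted-vote error rates. It assumes a mixOmics version in which perf() fills in $auc for DIABLO objects, and the keepX values, fold count and seed are placeholders.

```r
library(mixOmics)

data(nutrimouse)
X <- list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)

# Two-component DIABLO model (keepX values are placeholders)
model <- block.splsda(X = X, Y = nutrimouse$diet, ncomp = 2,
                      keepX = list(gene = c(10, 10), lipid = c(15, 15)))

set.seed(123)
perf.res <- perf(model, validation = "Mfold", folds = 3, nrepeat = 3,
                 auc = TRUE)

# Combined AUC: following the explanation above, the entry for component 2
# describes the model that uses components 1 and 2 together
perf.res$auc

# Error rates from the weighted vote are read cumulatively in the same way
perf.res$WeightedVote.error.rate
```

Reading both outputs side by side helps to check whether a drop in AUC after adding a component is matched by a rise in the classification error rate.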

Dear Prof. Kim Anh,

Thank you so much for your prompt response and clear explanation.

Your explanation helped me a lot with the model interpretation.

I wish you all the best.

Yours sincerely,

Kỳ Phát