Hello, I am trying to analyze three datasets (methylation, expression, and proteomics) for drug sensitivity with DIABLO. In the first part of the code, the plot of the classification error rate does not look right: all the lines are clustered at zero, as flat y = 0 lines (see the attached picture). Does anybody have an idea what the problem might be? Additionally, when I increase nrepeat, the plot changes only slightly.
Here is my code up to this plot:
## ----message = TRUE------------------------------------------------------
library(mixOmics)
##general
Data1 = read.csv("expression_traindata.csv", sep = ",", row.names = 1, header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)
Data2 = read.csv("methylation1_traindata.csv", sep = ",", row.names = 1, header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)
Data3 = read.csv("proteomics_traindata.csv", sep = ",", row.names = 1, header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)
data = list(expression = as.matrix(Data1), methylation = as.matrix(Data2), proteomics = as.matrix(Data3))
lapply(data, dim)
Y = read.csv("sensitive_resistant.csv", sep = ",", row.names = 1, header = TRUE, stringsAsFactors = TRUE)
summary(Y)
## ------------------------------------------------------------------------
design = matrix(0.1, ncol = length(data), nrow = length(data),
dimnames = list(names(data), names(data)))
diag(design) = 0
design
## ------------------------------------------------------------------------
sgccda.res = block.splsda(X = data, Y = Y$subtype, ncomp = 10, design = design)
set.seed(123)
perf.diablo = perf(sgccda.res, validation = "Mfold", folds = 3, nrepeat = 100)
#perf.diablo # lists the different outputs
plot(perf.diablo)