Error in Check.entry.single(newdata[[q]], ncomp[q], q = q) :
samples should have a unique identifier/rowname
I’ve checked that all the row names are unique, and I’ve worked through the suggestions from a similar problem on the forum (Diablo perf error - #7 by aljabadi), such as updating Bioconductor and the mixOmics package.
I wasn’t getting this error initially on a much smaller dataset. Any help would be appreciated, thank you!
Thanks for sharing your data. I’ve found that, as in the other post you linked (Diablo perf error - #7 by aljabadi), you have one sample that is in a class of its own. This causes the error when running perf with M-fold validation (but not with leave-one-out validation). The error message is unfortunately not very informative, so I will flag that as something to improve in the perf function!
The offending sample is sample 23, which has the class ‘#N/A’. Removing this sample avoids the error:
# model building with original data, perf returns error
basic.diablo.model = block.splsda(X = data, Y = Y, ncomp = 5, design = design, near.zero.var = TRUE)
perf.diablo <- perf(basic.diablo.model)
# Error in Check.entry.single(newdata[[q]], ncomp[q], q = q) :
# samples should have a unique identifier/rowname
# this error does not occur during LOO cross-validation
perf.diablo <- perf(basic.diablo.model, validation = "loo")
# one sample is in its own category '#N/A'
sort(table(Y))
#    #N/A   IV+IV AZLI+IV
#       1      19      20
# identify the offending sample - sample 23
Y
# [1] "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV"
# [17] "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "#N/A" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV" "IV+IV"
# [33] "IV+IV" "IV+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV" "AZLI+IV"
# remove sample 23 from data
Y <- Y[-23]
Bacteria_filtered <- Bacteria[-23, ]
Metabolites_table_filtered <- Metabolites_table[-23, ]
data = list(Bacteria = Bacteria_filtered,
            Metabolites_table = Metabolites_table_filtered)
# re-run model building and perf without errors
basic.diablo.model = block.splsda(X = data, Y = Y, ncomp = 5, design = design, near.zero.var = TRUE)
perf.diablo <- perf(basic.diablo.model)
plot(perf.diablo)
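
If it helps, the manual step of spotting sample 23 and dropping it from each block can also be done programmatically. The snippet below is only a sketch in base R on made-up toy data: the toy Y and data objects, and the names class_counts, rare_classes, drop_idx, keep, Y_filtered and data_filtered, are hypothetical stand-ins and not part of mixOmics. The cutoff of "fewer than two samples per class" is an assumption that matches the single-sample case in this thread.

```r
# Toy stand-ins for the real objects (hypothetical data, for illustration only)
Y <- c(rep("IV+IV", 3), "#N/A", rep("AZLI+IV", 3))
data <- list(
  Bacteria = matrix(seq_len(14), nrow = 7,
                    dimnames = list(paste0("s", 1:7), c("b1", "b2"))),
  Metabolites_table = matrix(seq_len(21), nrow = 7,
                             dimnames = list(paste0("s", 1:7), c("m1", "m2", "m3")))
)

# Classes represented by fewer than two samples (assumed cutoff)
class_counts <- table(Y)
rare_classes <- names(class_counts)[class_counts < 2]

# Positions of the samples that belong to those classes
drop_idx <- which(Y %in% rare_classes)

# Keep everything else; a logical mask stays correct even when nothing is dropped
keep <- !(Y %in% rare_classes)
Y_filtered <- Y[keep]

# Apply the same row filter to every block so samples stay aligned across blocks
data_filtered <- lapply(data, function(block) block[keep, , drop = FALSE])
```

On your real data, Y_filtered and data_filtered would then take the place of Y and data in the block.splsda call. A logical mask is used instead of Y[-drop_idx] because negative indexing with an empty index returns an empty vector rather than the full one.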