Hi everyone,
First, many thanks for the great package!
I am using mixOmics version 6.24.0.
I am running a DIABLO analysis in 223 individuals with the following data sets:
- 150 lipid-related metabolites (~normal distributions)
- 35 proteins (~normal distributions)
- 51 genetic variants (0/1/2 values)
The outcome is a 3-category variable (sample size per category = 103, 64 and 56).
X <- list(metabo = alldata[, c(104:155, 173:270)],
          prot = alldata[, c(2:36)],
          SNPs = alldata[, c(37:59, 76:103)])
Y <- alldata$Beck.orig; table(Y)
All X variables are numeric, centered and scaled (including the SNPs), and the outcome is coded as a factor. I chose a design matrix with 0.5 off the diagonal, based on the correlations between the first components of the different data sets (obtained from a PLS analysis).
design <- matrix(0.5, ncol = length(X), nrow = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0; design
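For anyone who wants to mirror this setup without access to the data, here is a minimal base-R sketch of the input checks implied by the description above. It is only an illustration under stated assumptions: `X_sim` and `Y_sim` are simulated stand-ins with the same block sizes and class counts (not the real `alldata`), and `check_blocks` is a hypothetical helper, not a mixOmics function.

```r
# Simulated stand-in data with the same block sizes as described above
# (assumption: the real `alldata` is not available here).
set.seed(1)
n <- 223
X_sim <- list(
  metabo = matrix(rnorm(n * 150), nrow = n),
  prot   = matrix(rnorm(n * 35), nrow = n),
  SNPs   = matrix(sample(0:2, n * 51, replace = TRUE), nrow = n)
)
# Center and scale every block, including the SNPs
X_sim <- lapply(X_sim, scale)
# Three-category outcome with the stated class sizes
Y_sim <- factor(rep(c("A", "B", "C"), times = c(103, 64, 56)))

# Hypothetical helper: one row of summary checks per block
check_blocks <- function(X, Y) {
  data.frame(
    block       = names(X),
    n_rows      = vapply(X, nrow, integer(1)),
    n_vars      = vapply(X, ncol, integer(1)),
    all_numeric = vapply(X, function(b) is.numeric(as.matrix(b)), logical(1)),
    n_missing   = vapply(X, function(b) sum(is.na(b)), integer(1)),
    rows_match  = vapply(X, function(b) nrow(b) == length(Y), logical(1)),
    row.names   = NULL
  )
}

checks <- check_blocks(X_sim, Y_sim)
print(checks)
print(table(Y_sim))
```

The same helper could be called on the real `X` and `Y` to confirm that every block has 223 rows, no missing values, and only numeric columns before the model is fitted.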
The “quick start” code works fine:
diablo.result1 <- block.plsda(X, Y)
plotIndiv(diablo.result1)
plotVar(diablo.result1)
However, here is the error message that I get when I use the perf() function:
tune.diablo <- block.plsda(X, Y, ncomp = 5, design = design)
perf.diablo <- perf(tune.diablo, validation = 'Mfold', folds = 5, nrepeat = 10)
Error in filter():
In argument: row_number(x) == n().
In group 1: Group.2 = "100".
Caused by error in vec_rank():
! Unsupported vctrs type null.
In file type-info.c at line 189.
This is an internal error that was detected in the vctrs package.
Please report it at https://github.com/r-lib/vctrs/issues with a reprex and the full backtrace.
Backtrace:
▆
├─mixOmics::perf(…)
├─mixOmics:::perf.sgccda(…)
│ └─base::lapply(…)
│ └─mixOmics (local) FUN(X[[i]], …)
│ └─mixOmics (local) repeat_cv_perf.diablo(nrep)
│ └─base::lapply(…)
│ └─mixOmics (local) FUN(X[[i]], …)
│ └─mixOmics:::predict.block.spls(model[], X.test[], dist = "all")
│ └─mixOmics:::internal_predict.DA(…)
│ ├─dplyr::filter(.data = data_max, row_number(x) == n())
│ └─dplyr:::filter.data.frame(.data = data_max, row_number(x) == n())
│ └─dplyr:::filter_rows(.data, dots, by)
│ └─dplyr:::filter_eval(…)
│ ├─base::withCallingHandlers(…)
│ └─mask$eval_all_filter(dots, env_filter)
│ └─dplyr (local) eval()
├─dplyr::row_number(x)
│ └─vctrs::vec_rank(x, ties = "sequential", incomplete = "na")
└─rlang:::stop_internal_c_lib(…)
└─rlang::abort(message, call = call, .internal = TRUE, .frame = frame)
I also ran separate PLS-DA analyses of each X data set against Y and did not encounter this issue.
I have tried the following, without success:
- removing the SNPs from X, to see whether these variables were causing the issue.
- changing ncomp and folds.
- changing the design matrix.
Any idea about what could be causing this issue?
Many thanks in advance.
Simon Nusinovici