Hi everyone,

First, many thanks for the great package!

I am using mixOmics version 6.24.0.

I am running a DIABLO analysis in 223 individuals with the following data sets:

- 150 lipid-related metabolites (~normal distributions)
- 35 proteins (~normal distributions)
- 51 genetic variants (0/1/2 values)

The outcome is a 3-category variable (sample size per category = 103, 64 and 56).

X <- list(metabo = alldata[, c(104:155, 173:270)],
          prot = alldata[, c(2:36)],
          SNPs = alldata[, c(37:59, 76:103)])

Y <- alldata$Beck.orig; table(Y)

All X variables are numeric, centered and scaled (including the SNPs). The outcome variable is coded as a factor. I chose a design matrix with off-diagonal values of 0.5, based on the correlations between the first components of the different data sets (obtained from a PLS analysis).

design <- matrix(0.5, ncol = length(X), nrow = length(X),
                 dimnames = list(names(X), names(X)))

diag(design) <- 0; design
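For context, the first-component correlations mentioned above can be approximated in base R. This is only a sketch on simulated toy blocks, not the code or data used in this post: `first_comp_cor()` and `toy` are hypothetical names, and the SVD of the cross-product matrix is used as a stand-in for a pairwise `pls()` fit with one component (the leading singular-vector pair gives the first pair of PLS weight vectors).

```r
# Correlation between the first PLS components of two blocks.
# Base-R stand-in for a one-component pairwise PLS fit (assumption):
# the first PLS weight vectors are the leading singular vectors of t(Xa) %*% Xb.
first_comp_cor <- function(Xa, Xb) {
  Xa <- scale(as.matrix(Xa))  # center and scale each column
  Xb <- scale(as.matrix(Xb))
  sv <- svd(crossprod(Xa, Xb), nu = 1, nv = 1)
  comp_a <- Xa %*% sv$u       # first component of block a
  comp_b <- Xb %*% sv$v       # first component of block b
  as.numeric(cor(comp_a, comp_b))
}

# Toy blocks (simulated): a and b share one latent signal, c is pure noise
set.seed(1)
n <- 223
z <- rnorm(n)
toy <- list(
  a = sapply(1:5, function(j) z + rnorm(n, sd = 0.5)),
  b = sapply(1:4, function(j) z + rnorm(n, sd = 0.5)),
  c = matrix(rnorm(n * 3), nrow = n)
)

# Pairwise first-component correlations, laid out like a design matrix
blocks <- names(toy)
cors <- matrix(0, length(blocks), length(blocks),
               dimnames = list(blocks, blocks))
for (i in seq_along(blocks)) {
  for (j in seq_along(blocks)) {
    if (i != j) cors[i, j] <- first_comp_cor(toy[[i]], toy[[j]])
  }
}
round(cors, 2)
```

The off-diagonal values of such a matrix are what a design value like 0.5 would be compared against. Because the first PLS components maximise covariance, even unrelated blocks give a positive correlation, so the values are best read relative to each other.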

The “quick start” code works fine:

diablo.result1 <- block.plsda(X, Y)
plotIndiv(diablo.result1)
plotVar(diablo.result1)

However, here is the error message that I get when I use the perf() function:

tune.diablo <- block.plsda(X, Y, ncomp = 5, design = design)
perf.diablo <- perf(tune.diablo, validation = 'Mfold', folds = 5, nrepeat = 10)

Error in `filter()`:
In argument: `row_number(x) == n()`.
In group 1: `Group.2 = "100"`.
Caused by error in `vec_rank()`:
! Unsupported vctrs type `null`.
In file type-info.c at line 189.
This is an internal error that was detected in the vctrs package.
Please report it at https://github.com/r-lib/vctrs/issues with a reprex and the full backtrace.

Backtrace:

▆

- ├─mixOmics::perf(…)
- ├─mixOmics:::perf.sgccda(…)
- │ └─base::lapply(…)
- │ └─mixOmics (local) FUN(X[[i]], …)
- │ └─mixOmics (local) repeat_cv_perf.diablo(nrep)
- │ └─base::lapply(…)
- │ └─mixOmics (local) FUN(X[[i]], …)
- │ └─mixOmics:::predict.block.spls(model[], X.test[], dist = "all")
- │ └─mixOmics:::internal_predict.DA(…)
- │ ├─dplyr::filter(.data = data_max, row_number(x) == n())
- │ └─dplyr:::filter.data.frame(.data = data_max, row_number(x) == n())
- │ └─dplyr:::filter_rows(.data, dots, by)
- │ └─dplyr:::filter_eval(…)
- │ ├─base::withCallingHandlers(…)
- │ └─mask$eval_all_filter(dots, env_filter)
- │ └─dplyr (local) eval()
- ├─dplyr::row_number(x)
- │ └─vctrs::vec_rank(x, ties = "sequential", incomplete = "na")
- └─rlang:::stop_internal_c_lib(…)
- └─rlang::abort(message, call = call, .internal = TRUE, .frame = frame)
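Since the error is raised in dplyr and vctrs calls made from inside mixOmics, the installed versions of those packages are probably relevant to any report. A minimal sketch for collecting them, using only base R (the package list is taken from the backtrace above; `pkgs` and `versions` are illustrative names):

```r
# Packages named in the backtrace above
pkgs <- c("mixOmics", "dplyr", "vctrs", "rlang")

# Installed version of each package, or NA if it is not installed.
# system.file() returns "" for a missing package and does not load it.
versions <- vapply(pkgs, function(p) {
  if (nzchar(system.file(package = p))) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

print(versions)
print(R.version.string)
```

Pasting this output together with the full backtrace (for example from `rlang::last_trace()`) would give others what they need to check for a version mismatch.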

I have run separate PLS-DA analyses of each X data set against Y and did not encounter this issue.

I have tried the following, without success:

- removing the SNPs from X, to see whether these variables were causing the issue;
- changing ncomp and folds;
- changing the design matrix.
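One more check that might help narrow this down is to run the same two calls on simulated data with the same dimensions. The sketch below is an assumption-laden stand-in, not the real data: `sim_block()`, `X_sim`, `Y_sim` and `design_sim` are hypothetical names, the values are random, and the row names are assumed to be the default 1..n. If the error also appears here, the real data are unlikely to be the cause.

```r
set.seed(42)
n <- 223

# Simulated stand-in for one block: random values, centered and scaled.
# snp = TRUE draws 0/1/2 genotype-like values before scaling.
sim_block <- function(prefix, p, snp = FALSE) {
  vals <- if (snp) sample(0:2, n * p, replace = TRUE) else rnorm(n * p)
  m <- scale(matrix(vals, nrow = n))
  # Row names assumed to be the default 1..n (the real data may differ)
  dimnames(m) <- list(as.character(seq_len(n)), paste0(prefix, seq_len(p)))
  m
}

X_sim <- list(metabo = sim_block("m", 150),
              prot   = sim_block("p", 35),
              SNPs   = sim_block("s", 51, snp = TRUE))

# Three outcome categories with the same sizes as in this post
Y_sim <- factor(rep(c("A", "B", "C"), times = c(103, 64, 56)))

design_sim <- matrix(0.5, nrow = length(X_sim), ncol = length(X_sim),
                     dimnames = list(names(X_sim), names(X_sim)))
diag(design_sim) <- 0

# Same calls as above; skipped if mixOmics is not installed.
# try() keeps the script running so the error message can be inspected.
if (requireNamespace("mixOmics", quietly = TRUE)) {
  fit_sim  <- mixOmics::block.plsda(X_sim, Y_sim, ncomp = 5, design = design_sim)
  perf_sim <- try(mixOmics::perf(fit_sim, validation = "Mfold",
                                 folds = 5, nrepeat = 10))
}
```

A script like this could also serve as the reprex that the error message asks for, since it does not depend on the original data.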

Any idea about what could be causing this issue?

Many thanks in advance.

Simon Nusinovici