Hi,

I’m having issues using parallelisation (with BiocParallel) in some of the mixOmics functions.

I’m not sure whether this is a user mistake or a bug, but the ‘BPPARAM’ argument doesn’t seem to have any effect on running time, at least in the perf() function.

Here is a fully reproducible example:

```
library(mixOmics)
library(dplyr)
library(BiocParallel)
library(microbenchmark) # provides microbenchmark(), used for the timings below
## -------------------------------------------------------------------------------------------------------------------
data(breast.TCGA) # load in the data
data = list(miRNA = breast.TCGA$data.train$mirna, # set a list of all the X dataframes
            mRNA = breast.TCGA$data.train$mrna,
            proteomics = breast.TCGA$data.train$protein)
Y = breast.TCGA$data.train$subtype # set the response variable as the Y dataframe
## -------------------------------------------------------------------------------------------------------------------
design = matrix(0.1, ncol = length(data),
                nrow = length(data), # for square matrix filled with 0.1s
                dimnames = list(names(data), names(data)))
diag(design) = 0 # set diagonal to 0s
basic.diablo.model = block.splsda(X = data, Y = Y, ncomp = 5, design = design) # form basic DIABLO
## -------------------------------------------------------------------------------------------------------------------
# Benchmark
n_rep = 1
res <- list(
  "MulticoreParam(10)" = microbenchmark(perf(basic.diablo.model, validation = 'Mfold',
                                             folds = 10, nrepeat = 10,
                                             progressBar = FALSE,
                                             BPPARAM = MulticoreParam(workers = 10)),
                                        times = n_rep),
  "MulticoreParam(5)" = microbenchmark(perf(basic.diablo.model, validation = 'Mfold',
                                            folds = 10, nrepeat = 10,
                                            progressBar = FALSE,
                                            BPPARAM = MulticoreParam(workers = 5)),
                                       times = n_rep),
  "MulticoreParam(2)" = microbenchmark(perf(basic.diablo.model, validation = 'Mfold',
                                            folds = 10, nrepeat = 10,
                                            progressBar = FALSE,
                                            BPPARAM = MulticoreParam(workers = 2)),
                                       times = n_rep),
  "SnowParam(10)" = microbenchmark(perf(basic.diablo.model, validation = 'Mfold',
                                        folds = 10, nrepeat = 10,
                                        progressBar = FALSE,
                                        BPPARAM = BiocParallel::SnowParam(workers = 10)),
                                   times = n_rep),
  "SnowParam(5)" = microbenchmark(perf(basic.diablo.model, validation = 'Mfold',
                                       folds = 10, nrepeat = 10,
                                       progressBar = FALSE,
                                       BPPARAM = BiocParallel::SnowParam(workers = 5)),
                                  times = n_rep),
  "SnowParam(2)" = microbenchmark(perf(basic.diablo.model, validation = 'Mfold',
                                       folds = 10, nrepeat = 10,
                                       progressBar = FALSE,
                                       BPPARAM = BiocParallel::SnowParam(workers = 2)),
                                  times = n_rep),
  "SerialParam(1)" = microbenchmark(perf(basic.diablo.model, validation = 'Mfold',
                                         folds = 10, nrepeat = 10,
                                         progressBar = FALSE,
                                         BPPARAM = SerialParam()),
                                    times = n_rep))
bind_rows(res)
```

The table below shows the results.

```
Unit: seconds
expr min lq mean median uq
BPPARAM = MulticoreParam(workers = 10) 25.17865 25.17865 25.17865 25.17865 25.17865
BPPARAM = MulticoreParam(workers = 5) 25.37876 25.37876 25.37876 25.37876 25.37876
BPPARAM = MulticoreParam(workers = 2) 25.19722 25.19722 25.19722 25.19722 25.19722
BPPARAM = SnowParam(workers = 10)) 25.45244 25.45244 25.45244 25.45244 25.45244
BPPARAM = SnowParam(workers = 5)) 25.81489 25.81489 25.81489 25.81489 25.81489
BPPARAM = SnowParam(workers = 2)) 25.91184 25.91184 25.91184 25.91184 25.91184
BPPARAM = SerialParam()) 25.55273 25.55273 25.55273 25.55273 25.55273
```

Regardless of the number of workers (10, 5, 2, or serial), the running time is essentially the same (about 25 seconds). MulticoreParam and SnowParam give similar results.

This was tested on a Mac (table above) and on a Linux cluster (results not shown here, but they were similar).

The problem doesn’t seem to come from BiocParallel itself:

```
# Test on a simple function
FUN <- function(x) { round(sqrt(x), 4) }
n_rep = 10
resb <- list(
  "MulticoreParam(10)" = microbenchmark(BiocParallel::bplapply(1:10, FUN,
                                                               BPPARAM = MulticoreParam(workers = 10)),
                                        times = n_rep),
  "MulticoreParam(5)" = microbenchmark(BiocParallel::bplapply(1:10, FUN,
                                                              BPPARAM = MulticoreParam(workers = 5)),
                                       times = n_rep),
  "MulticoreParam(2)" = microbenchmark(BiocParallel::bplapply(1:10, FUN,
                                                              BPPARAM = MulticoreParam(workers = 2)),
                                       times = n_rep),
  "SerialParam(1)" = microbenchmark(BiocParallel::bplapply(1:10, FUN,
                                                           BPPARAM = SerialParam()),
                                    times = n_rep))
bind_rows(resb)
```

```
Unit: milliseconds
expr min lq mean median uq
MulticoreParam(workers = 10)) 109.917966 112.847457 117.48494 117.635478 121.685909
MulticoreParam(workers = 5)) 105.232978 108.726998 111.34625 110.138341 111.077569
MulticoreParam(workers = 2)) 184.162119 184.523493 186.24020 185.903594 187.473689
SerialParam()) 2.200429 2.234254 2.32217 2.266336 2.274926
```

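→ BiocParallel seems to work as expected with a regular R function: the timings change with the backend and the number of workers. (For such a trivial function the parallel overhead dominates, which is why SerialParam is the fastest here.)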
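
Because the toy function above is so cheap that worker start-up dominates the timings, a slightly heavier check might be more convincing. The following is only a sketch: `slow_task` and `time_backend` are hypothetical helpers made up for illustration, and `Sys.sleep()` stands in for real work. It records the elapsed time and how many distinct R processes actually ran the tasks.

```r
library(BiocParallel)

# Hypothetical task: slow enough that parallelism could matter.
slow_task <- function(i) {
  Sys.sleep(0.5)  # stand-in for real computation
  Sys.getpid()    # report which R process ran this task
}

# Hypothetical helper: run 4 tasks on a given backend and summarise the run.
time_backend <- function(param) {
  elapsed <- system.time(
    pids <- bplapply(1:4, slow_task, BPPARAM = param)
  )[["elapsed"]]
  list(elapsed = elapsed,
       n_processes = length(unique(unlist(pids))))
}

serial   <- time_backend(SerialParam())
parallel <- time_backend(SnowParam(workers = 2))

print(serial)
print(parallel)
```

If `parallel$n_processes` is 2 while `serial$n_processes` is 1, the backend passed via BPPARAM is really being used. The elapsed times include cluster start-up, so they are only a rough indication.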

Would you have any idea why BPPARAM has no effect in the perf() function?

Thank you!