Hi @aljabadi, I have noticed a small bug in the latest devel versions (6.13.xx).
According to the score plot, the explained variance on PC1 and PC2 is always NaN or 1%, but only when doing a sparse PCA. It works fine with the normal PCA function and with the stable versions.
Hi @christoa,
Thanks for your post. Unfortunately, the tune.spca function could behave that way with specific pathological datasets. The only way we deal with this at the moment is to increase keepX (the number of variables kept on each component) to better capture the variation in such datasets. See the examples below.
suppressPackageStartupMessages(library(mixOmics))
data("liver.toxicity")
ncomp <- 5
X <- liver.toxicity$gene
## large number of features
dim(X)
#> [1] 64 3116
keepX <- rep(10, ncomp)
spca.rat.keep.low <- spca(X, ncomp = ncomp, keepX = keepX)
spca.rat.keep.low$explained_variance
#> $X
#> PC1 PC2 PC3 PC4 PC5
#> 0.002323020 0.002317836 0.001884928 0.002234816 0.001648253
## choose higher proportion of variables
keepX <- seq(500, 100, length.out = ncomp)
keepX
#> [1] 500 400 300 200 100
spca.rat.keep.high <- spca(X, ncomp = ncomp, keepX = keepX)
spca.rat.keep.high$explained_variance
#> $X
#> PC1 PC2 PC3 PC4 PC5
#> 0.084215877 0.053616591 0.026611077 0.017273549 0.007563077
Created on 2020-10-22 by the reprex package (v0.3.0)
An example with a less pathological dataset:
suppressPackageStartupMessages(library(mixOmics))
data("nutrimouse")
ncomp <- 5
X <- nutrimouse$gene
## far fewer features than liver.toxicity
dim(X)
#> [1] 40 120
keepX <- as.integer(seq(from=25, to = 5, length.out = ncomp))
keepX
#> [1] 25 20 15 10 5
spca.nutri <- spca(X, ncomp = ncomp, keepX = keepX)
spca.nutri$explained_variance$X
#> PC1 PC2 PC3 PC4 PC5
#> 0.13255426 0.10391290 0.05822852 0.03073844 0.01543979
Created on 2020-10-22 by the reprex package (v0.3.0)
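To make the effect of keepX more concrete, here is a minimal base R sketch on simulated data. It is only an illustration under stated assumptions: the data, the dimensions, and the helper `prop_var_sparse_pc1` are hypothetical, and the hard-thresholding step is a simple stand-in, not the algorithm that spca uses. It shows that when the variance is spread over many features, a component built from only a few of them captures a small share of the total.

```r
# Illustrative sketch only: base R, simulated data, hard-thresholded loadings.
# This is NOT the algorithm mixOmics::spca uses.
set.seed(1)
n <- 60    # samples (hypothetical)
p <- 500   # features (hypothetical)

# One latent factor spread evenly over all features, plus noise, so that no
# small subset of features carries much of the total variance.
z <- rnorm(n)
X <- outer(z, rep(1, p)) + matrix(rnorm(n * p), n, p)
X <- scale(X, center = TRUE, scale = FALSE)

# Proportion of total variance captured by the first principal axis after
# zeroing all but the `keep` largest-magnitude loadings.
prop_var_sparse_pc1 <- function(X, keep) {
  stopifnot(keep >= 1, keep <= ncol(X))
  v <- svd(X, nu = 0, nv = 1)$v[, 1]
  keep_idx <- order(abs(v), decreasing = TRUE)[seq_len(keep)]
  v_sparse <- numeric(length(v))
  v_sparse[keep_idx] <- v[keep_idx]
  v_sparse <- v_sparse / sqrt(sum(v_sparse^2))  # renormalise to unit length
  sum((X %*% v_sparse)^2) / sum(X^2)
}

keep_grid <- c(10, 50, 100, 250, p)
prop_var <- sapply(keep_grid, function(k) prop_var_sparse_pc1(X, k))
names(prop_var) <- keep_grid
print(round(prop_var, 4))
```

With `keep = p` nothing is zeroed, so the value is the ordinary PCA proportion for PC1; any smaller `keep` cannot exceed it, because the unrestricted first principal axis maximises the projected variance over all unit vectors.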
Hope it helps
Al
Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os macOS Catalina 10.15
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_AU.UTF-8
#> ctype en_AU.UTF-8
#> tz Australia/Melbourne
#> date 2020-10-22
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib
#> assertthat 0.2.1 2019-03-21 [1]
#> backports 1.1.10 2020-09-15 [1]
#> BiocParallel 1.23.3 2020-10-19 [1]
#> callr 3.5.1 2020-10-13 [1]
#> cli 2.1.0 2020-10-12 [1]
#> colorspace 1.4-1 2019-03-18 [1]
#> corpcor 1.6.9 2017-04-01 [1]
#> crayon 1.3.4 2017-09-16 [1]
#> desc 1.2.0 2018-05-01 [1]
#> devtools 2.3.2 2020-09-18 [1]
#> digest 0.6.25 2020-02-23 [1]
#> dplyr 1.0.2 2020-08-18 [1]
#> ellipse 0.4.2 2020-05-27 [1]
#> ellipsis 0.3.1 2020-05-15 [1]
#> evaluate 0.14 2019-05-28 [1]
#> fansi 0.4.1 2020-01-08 [1]
#> fs 1.5.0 2020-07-31 [1]
#> generics 0.0.2 2018-11-29 [1]
#> ggplot2 * 3.3.2 2020-06-19 [1]
#> ggrepel 0.8.2 2020-03-08 [1]
#> glue 1.4.2 2020-08-27 [1]
#> gridExtra 2.3 2017-09-09 [1]
#> gtable 0.3.0 2019-03-25 [1]
#> highr 0.8 2019-03-20 [1]
#> htmltools 0.5.0 2020-06-16 [1]
#> igraph 1.2.6 2020-10-06 [1]
#> knitr 1.30 2020-09-22 [1]
#> lattice * 0.20-41 2020-04-02 [1]
#> lifecycle 0.2.0 2020-03-06 [1]
#> magrittr 1.5 2014-11-22 [1]
#> MASS * 7.3-53 2020-09-09 [1]
#> Matrix 1.2-18 2019-11-27 [1]
#> matrixStats 0.57.0 2020-09-25 [1]
#> memoise 1.1.0 2017-04-21 [1]
#> mixOmics * 6.13.94 2017-02-06 [1]
#> munsell 0.5.0 2018-06-12 [1]
#> pillar 1.4.6 2020-07-10 [1]
#> pkgbuild 1.1.0 2020-07-13 [1]
#> pkgconfig 2.0.3 2019-09-22 [1]
#> pkgload 1.1.0 2020-05-29 [1]
#> plyr 1.8.6 2020-03-03 [1]
#> prettyunits 1.1.1 2020-01-24 [1]
#> processx 3.4.4 2020-09-03 [1]
#> ps 1.4.0 2020-10-07 [1]
#> purrr 0.3.4 2020-04-17 [1]
#> R6 2.4.1 2019-11-12 [1]
#> rARPACK 0.11-0 2016-03-10 [1]
#> RColorBrewer 1.1-2 2014-12-07 [1]
#> Rcpp 1.0.5 2020-07-06 [1]
#> remotes 2.2.0 2020-07-21 [1]
#> reshape2 1.4.4 2020-04-09 [1]
#> rlang 0.4.8 2020-10-08 [1]
#> rmarkdown 2.4.5 2020-10-16 [1]
#> rprojroot 1.3-2 2018-01-03 [1]
#> RSpectra 0.16-0 2019-12-01 [1]
#> scales 1.1.1 2020-05-11 [1]
#> sessioninfo 1.1.1 2018-11-05 [1]
#> stringi 1.5.3 2020-09-09 [1]
#> stringr 1.4.0 2019-02-10 [1]
#> testthat 2.3.2 2020-03-02 [1]
#> tibble 3.0.4 2020-10-12 [1]
#> tidyr 1.1.2 2020-08-27 [1]
#> tidyselect 1.1.0 2020-05-11 [1]
#> usethis 1.6.3 2020-09-17 [1]
#> vctrs 0.3.4 2020-08-29 [1]
#> withr 2.3.0 2020-09-22 [1]
#> xfun 0.18 2020-09-29 [1]
#> yaml 2.2.1 2020-02-01 [1]
#> source
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> Github (Bioconductor/BiocParallel@3527523)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> Bioconductor (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> Github (rstudio/rmarkdown@7b3b420)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.2)
#> CRAN (R 4.0.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library