`plsda`: NA values in Y data

Aditya · March 29, 2023, 10:06am

Hey guys, first of all, thank you very much for creating mixOmics. I use it very gratefully in my own BioC package autonomics and love it.

Recently I noticed that the PLS X variates plot changes depending on whether samples with missing Y values are dropped. I initially thought this was a bug, but then wondered whether it is by design. Given that PLS is a method that looks at the covariance of X and Y, it seems that when Y is missing it somehow falls back to the variance.

Let me give a reproducible example.
First go to github/bhagwataditya/autonomics.
Then download the devel version and install.
Then run:

library(autonomics)
file <- download_data('atkin18.metabolon.xlsx')
object <- read_metabolon(file)
object$subgroup[object$subgroup == 't2'] <- NA          # set Y to NA for the 't2' samples
biplot(pls(object))                                     # uses mixOmics::plsda internally
biplot(pls(filter_samples(object, !is.na(subgroup))))   # result differs

kimanh.lecao · April 14, 2023, 12:12am

hi @Aditya,

Apologies for the late answer, your post ended up in spam on the forum for some reason.

I am not sure what your Y includes (one single variable? several?), and I can only see sample plots, no biplot, so I won’t be able to answer your question very specifically. I would not know where to start from your GitHub repo.

I can only explain what is happening inside the PLS in general when you have missing values. The method performs local regressions of each data set onto the components, so when data are missing, they are dropped to 0 in the algorithm as we fit these regressions. We explain this in Chapters 9 and 10 of our book, if you can get hold of an electronic copy.
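
To make the idea of a local regression with missing values more concrete, here is a minimal base-R sketch. It is not the mixOmics source code: the helper `local_regression_loadings`, the toy matrix and the score vector are all made up for illustration, and it only shows the general principle of regressing each column onto a component using the observed entries.

```r
# Hypothetical helper (not part of mixOmics): regress each column of X on a
# component score vector, using only the observed (non-NA) entries.
local_regression_loadings <- function(X, score) {
  apply(X, 2, function(x) {
    obs <- !is.na(x)                          # entries that are present
    sum(x[obs] * score[obs]) / sum(score[obs]^2)
  })
}

set.seed(1)
X     <- matrix(rnorm(20), nrow = 5)          # toy data: 5 samples, 4 variables
score <- rnorm(5)                             # toy component scores

X_na <- X
X_na[2, 1] <- NA                              # one missing value in column 1

p_full <- local_regression_loadings(X, score)
p_na   <- local_regression_loadings(X_na, score)

# Replacing NA by 0 yields the same numerator as skipping the missing entry,
# which is one way to read "dropped to 0"; the denominator still uses only
# the scores of the observed samples.
X_zero <- X_na
X_zero[is.na(X_zero)] <- 0
numerator_zero <- as.vector(crossprod(X_zero, score))
```

In this sketch only the loading of the column that contains the missing value changes; columns without missing values give the same result as with the complete data.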


I don't know what is happening in your case or what you are trying to show.

Kim-Anh

Aditya · May 2, 2023, 1:02pm

Dear Kim, thank you very much for your response! Your book looks very interesting, thank you for that pointer! And for mixOmics, of course : ).
