PLS-DA: Handling Missing Data

Does the “plsda” function in this package impute missing data with NIPALS before proceeding to the classification, or are missing values simply ignored during fitting, without the rows that contain them having to be deleted?

I would also be grateful if you could point me to a paper on how missing data are processed in the plsda function.

Thank you in advance for your reply.

Best regards,

Aabir

Hi Aabir,

There are two ways of dealing with missing values in mixOmics:

  • through the NIPALS function, which you call explicitly and which reconstructs the data matrix (note: the code / function will change in the updated version of mixOmics); see the “Missing Values” page on the mixOmics website.
    There is a very good reference in ‘La régression PLS: théorie et pratique’ by Tenenhaus, assuming you read French.

  • internally in the function, as you discovered. In that case the method, say plsda() (but it applies to most of our methods), fits each local regression using only the observed values and ignores the missing ones; there is no reconstruction. This is also partly explained, for the PCA case, in the Tenenhaus book.
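
To make the two options above more concrete, here is a minimal sketch of the NIPALS idea for a single component. It is not the mixOmics code: the function names `nipals_first_component` and `impute_rank_one` are hypothetical, the data are a toy example, and centering/scaling is skipped for brevity. Each "local regression" simply sums over the observed cells only; the optional last step rebuilds the missing cells from the fitted scores and loadings.

```python
import math


def nipals_first_component(X, max_iter=500, tol=1e-12):
    """One NIPALS component of X (list of rows); None marks a missing cell.

    Illustrative sketch only: assumes every row and column has at least
    one observed value, and does no centering or scaling.
    """
    n, p = len(X), len(X[0])

    # Start the scores from the column with the largest observed sum of squares.
    def col_ss(j):
        return sum(X[i][j] ** 2 for i in range(n) if X[i][j] is not None)

    j0 = max(range(p), key=col_ss)
    t = [X[i][j0] if X[i][j0] is not None else 0.0 for i in range(n)]

    for _ in range(max_iter):
        # Loadings: regress each column on t, using observed rows only.
        load = []
        for j in range(p):
            rows = [i for i in range(n) if X[i][j] is not None]
            num = sum(X[i][j] * t[i] for i in rows)
            den = sum(t[i] ** 2 for i in rows)
            load.append(num / den)
        norm = math.sqrt(sum(v * v for v in load))
        load = [v / norm for v in load]

        # Scores: regress each row on the loadings, using observed columns only.
        t_new = []
        for i in range(n):
            cols = [j for j in range(p) if X[i][j] is not None]
            num = sum(X[i][j] * load[j] for j in cols)
            den = sum(load[j] ** 2 for j in cols)
            t_new.append(num / den)

        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change < tol:
            break
    return t, load


def impute_rank_one(X):
    """Optional reconstruction: fill missing cells with t[i] * load[j]."""
    t, load = nipals_first_component(X)
    return [
        [x if x is not None else t[i] * load[j] for j, x in enumerate(row)]
        for i, row in enumerate(X)
    ]


# Toy rank-one matrix with two cells removed.
X = [
    [1.0, 2.0, -1.0],
    [2.0, 4.0, None],
    [3.0, 6.0, -3.0],
    [None, 8.0, -4.0],
]
filled = impute_rank_one(X)
```

Calling only `nipals_first_component` corresponds to the "ignore the missing values" route, where the scores and loadings are estimated and nothing is filled in. Calling `impute_rank_one` corresponds to the explicit reconstruction route. On this toy matrix, which is exactly rank one, the two removed cells should come back very close to their original values; on real data you would normally need several components and centered data.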

We don't have a paper on this, although a book is coming in a few months (with some examples of application; the actual theory is the same as in Tenenhaus).

Kim-Anh