How do I calculate plsda residuals for new data?

Hello,

I was wondering what exactly the "mat.c" value in a plsda object is. In the documentation, it is described as "matrix of coefficients from the regression of X / residual matrices X on the X-variates, to be used internally by predict", but I cannot find it used in either predict or predict.mixo_pls.
I ask because I am interested in calculating the residuals of new data after a plsda object has been computed. Perhaps this can be done by means of the predict function?

Looking forward to your response,
Nick

Hi @NickBliziotis,

This is a bit of a tricky question because you can extract the deflated X matrix from a trained object, but not from a predicted object (which makes sense).
The relevant code in predict is at line 586. For Y, you can compute the residuals as Y.hat (the predicted values) minus Y (coded as a dummy matrix), and that is as far as you can go prediction-wise. For X, you would have to rely on object$defl.matrix from the trained object. That is the only workaround I can think of.
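To make the Y part concrete, here is a minimal sketch in plain Python of the subtraction described above. It is only an illustration, not mixOmics code: the function names `dummy_matrix` and `y_residuals` are hypothetical, the predicted values are made up, and it assumes the true classes of the new samples are known.

```python
def dummy_matrix(labels, classes):
    """One-hot encode class labels: one row per sample, one column per class."""
    return [[1.0 if label == cls else 0.0 for cls in classes] for label in labels]


def y_residuals(y_hat, labels, classes):
    """Return Y.hat - Y, where Y is the dummy matrix of the known labels."""
    y = dummy_matrix(labels, classes)
    return [
        [pred - true for pred, true in zip(pred_row, true_row)]
        for pred_row, true_row in zip(y_hat, y)
    ]


# Made-up example: three new samples, two classes.
classes = ["A", "B"]
labels = ["A", "B", "A"]                        # known classes of the new samples
y_hat = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]    # hypothetical predicted values
residuals = y_residuals(y_hat, labels, classes)
```

In practice the rows of `y_hat` would come from the predicted values returned for the new data, with the columns in the same order as `classes`.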

Kim-Anh

Hello Kim-Anh,

Thanks for your response. I will try to apply it.

Sincerely,
Nick

Hi @kimanh.lecao,

I have a related question about mat.c and the meaning of that matrix. In theory, what are the differences between the loading vectors and the matrix of coefficients from the regression of X (mat.c)?

Is it possible to use mat.c to infer that certain variables (in my case, different OTUs) are more associated with one of the groups? If so, what is the role of the loading vectors?

Looking forward to your response,
Guillermo

Dear @GMB,

The answer to your question is described in the PLS algorithm. Potentially, yes, mat.c could give you more insight into which OTUs are more strongly associated with Y, but we only use it for matrix deflation. The loading vectors provide a more global measure of importance.
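As a rough illustration of that distinction, the sketch below runs one component of a textbook PLS step in plain Python. It is not the mixOmics implementation. It assumes a single, centered response column and column-centered X, and all function names are hypothetical. The loading vector `a` says how the variables are combined to build the variate `t`, while `c` holds the coefficients from regressing each X column on `t`, which is the role the documentation gives to mat.c, and is what gets subtracted during deflation. The last function applies the same idea to a new sample, assuming it has been centered with the training means.

```python
from math import sqrt


def t_times(X, v):
    """Return X^T v for a matrix X stored as a list of rows."""
    return [sum(row[j] * vi for row, vi in zip(X, v)) for j in range(len(X[0]))]


def times(X, w):
    """Return X w."""
    return [sum(xij * wj for xij, wj in zip(row, w)) for row in X]


def pls_component(X, y):
    """One PLS component for column-centered X and a single centered response y."""
    a = t_times(X, y)                      # direction of maximal covariance with y
    length = sqrt(sum(v * v for v in a))
    a = [v / length for v in a]            # loading vector, unit norm
    t = times(X, a)                        # X-variate (one score per sample)
    tt = sum(v * v for v in t)
    c = [v / tt for v in t_times(X, t)]    # regression of each X column on t
    return a, t, c


def deflate(X, t, c):
    """Residual matrix X - t c^T."""
    return [[xij - ti * cj for xij, cj in zip(row, c)] for row, ti in zip(X, t)]


def residual_for_new_sample(x_new, a, c):
    """Project a new (training-centered) sample on a, then remove t_new * c."""
    t_new = sum(x * w for x, w in zip(x_new, a))
    return [x - t_new * cj for x, cj in zip(x_new, c)]


# Made-up, column-centered training data (3 samples, 3 variables).
X = [[1.0, -1.0, 0.0], [-1.0, 1.0, 2.0], [0.0, 0.0, -2.0]]
y = [1.0, -1.0, 0.0]
a, t, c = pls_component(X, y)
X_deflated = deflate(X, t, c)
new_residual = residual_for_new_sample([0.5, -0.5, 1.0], a, c)
```

With several components, a multi-column dummy Y, or scaled data, the bookkeeping is more involved, so values from this sketch should not be expected to match object$defl.matrix exactly.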

Kim-Anh

Thank you very much @kimanh.lecao

Guillermo