 # VIP score and PLS regression coefficient

Dear mixOmics users,

I want to obtain the VIP score from a PLS model. The `vip` function from the package returns a VIP coefficient for each predictor variable and for each PLS component. However, it is my understanding that the VIP summarizes the importance of each predictor variable across the PLS components. In other words, I would expect the VIP matrix returned by the function to be of dimension p x 1 even for ncomp > 1. The reference below describes how to obtain the VIP score.

https://www.sciencedirect.com/science/article/abs/pii/S0169743912001542

In addition, I would like to know how to obtain the matrix of PLS regression coefficients as I understand this is not a direct output from any function.

Best wishes

Hello all,
Can somebody help me with this query please?
I thought that the advantage of the VIP score over the loading coefficients was that the former describes the importance of the predictor variables across all PLS components, while the latter is the weight associated with the predictor variables for each component.
Can somebody explain to me why the output from the `vip` function of the package provides a VIP score for each component?

Hi @uc_55,

Regarding the VIP question:
We actually do not use the VIP much, as it is very specific to PLS in regression mode.

Short answer: we calculate the VIP per component, and this calculation includes the loading vectors from X. We use a formula similar to the one proposed by SIMCA-P / the Tenenhaus book (see details below). The idea is that, because components are orthogonal, one variable might be influential in one component but not in another. The paper you mention seems to sum over all components, so they are using a different formula. If you wish to have a global overview (e.g. what is the importance of the variables after 3 components?), then go for the definition of Mehmood. However, if your problem is complex (i.e. several X variables with a problem that can be solved orthogonally), then the definition we propose might be better.
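For comparison, here are the two definitions side by side, written from memory of how they are usually stated, so please check them against the Tenenhaus book and the linked paper. The symbols are my own notation, not package output: p is the number of X variables, q the number of Y variables, t_l the l-th X component, w_l its unit-norm weight vector, and q_a the Y loading on component a.

```latex
% Tenenhaus / SIMCA-P style: one value per variable j and per number of components h
\mathrm{VIP}_{hj} = \sqrt{\, p \,
  \frac{\sum_{l=1}^{h} Rd(Y, t_l)\, w_{lj}^{2}}{\sum_{l=1}^{h} Rd(Y, t_l)} \,},
\qquad
Rd(Y, t_l) = \frac{1}{q} \sum_{k=1}^{q} \mathrm{cor}^{2}(y_k, t_l)

% Summed-over-components form (as in the linked review): one value per variable j
\mathrm{VIP}_{j} = \sqrt{\, p \,
  \frac{\sum_{a=1}^{A} SS_a \left( w_{aj} / \lVert w_a \rVert \right)^{2}}{\sum_{a=1}^{A} SS_a} \,},
\qquad
SS_a = q_a^{2}\, t_a^{\top} t_a
```

If the first form is read this way, the value for h = H already weights all H components, so it gives one number per variable much like the second form does.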

Details on how to calculate the VIP:
1- We calculate cor2, the squared correlation between the Y variables and the PLS components associated with the X data set: `cor(object$Y, object$variates$X, use = "pairwise")^2`. It returns a matrix of size Q x H, where Q is the number of variables in Y and H is the number of components.
`sum(Rd)` takes into account all components, whereas the rest of the elements in the equation are component based.
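To make the dimensions concrete, here is a minimal base-R sketch on simulated data. It is not the package code: the SVD-based PLS loop is a simplified stand-in for the fitted model, the names `W`, `Tm`, `Rd` and `VIP` are only for illustration, and I am assuming the Tenenhaus-style definition in which column h accumulates components 1 to h.

```r
# Illustrative sketch only (simulated data, simplified PLS2 in regression mode).
set.seed(1)
n <- 50; p <- 6; q <- 2; H <- 3
X <- scale(matrix(rnorm(n * p), n, p))
Y <- scale(X[, 1:2] %*% matrix(c(1, 0.5, -0.5, 1), 2, 2) +
           matrix(rnorm(n * q, sd = 0.5), n, q))

W  <- matrix(0, p, H)  # unit-norm X weight vectors, one column per component
Tm <- matrix(0, n, H)  # X components (the analogue of object$variates$X)
Xh <- X; Yh <- Y
for (h in 1:H) {
  w   <- svd(crossprod(Xh, Yh))$u[, 1]  # direction of maximal X-Y covariance
  t_h <- Xh %*% w
  W[, h]  <- w
  Tm[, h] <- t_h
  # Regression-mode deflation: remove the part of X and Y explained by t_h
  Xh <- Xh - t_h %*% crossprod(t_h, Xh) / sum(t_h^2)
  Yh <- Yh - t_h %*% crossprod(t_h, Yh) / sum(t_h^2)
}

cor2 <- cor(Y, Tm, use = "pairwise")^2  # q x H: one row per Y variable
Rd   <- colSums(cor2)                   # one value per component

# Column h of VIP uses components 1..h (assumed Tenenhaus-style definition)
VIP <- sapply(1:H, function(h) {
  drop(sqrt(p * (W[, 1:h, drop = FALSE]^2 %*% Rd[1:h]) / sum(Rd[1:h])))
})

dim(cor2)  # q rows, H columns
dim(VIP)   # p rows, H columns
```

Because each weight vector has unit norm, the squared VIP values in every column sum to p, which is a quick sanity check on any implementation of this definition.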