Thank you for your question (and your patience).
Regarding the VIP question:
We actually do not use the VIP much, as it is very specific to a PLS model in regression mode.
Short answer: we calculate the VIP per component, and this calculation includes the loading vectors from X. We use a formula similar to the one proposed in SIMCA-P / the Tenenhaus book (see details below). The idea is that, because the components are orthogonal, a variable might be influential on one component but not on another. The paper you mention seems to sum over all components, so it uses a different formula. If you want a global overview, for example the importance of the variables after 3 components, then go for the definition of Mehmood. However, if your problem is complex (i.e. several X variables, with a problem that can be solved orthogonally), then the definition we propose might be better.
Details on how to calculate the VIP:
1 - We calculate cor2, the squared correlation between the Y variables and the PLS components associated with the X data set:
cor(object$Y, object$variates$X, use = "pairwise")^2
It returns a matrix of size Q x H, where Q is the number of variables in Y and H is the number of components. In what follows, P is the number of variables in X.
W = loading vectors from X, for each component h.
2 - We calculate the redundancy value Rd. For h = 1, Rd cancels out of the formula in step 3, so the VIP depends only on W^2 (the squared loadings). For h > 1, Rd is the sum of cor2 over all Y variables, for each component (this matters when you have more than one Y variable).
3 - For a given variable from X and a given component, the VIP is defined as sqrt(P * Rd %*% t(W^2)/sum(Rd))
What this means is that we assess the importance of a variable from X in explaining all Y variables (through the Rd calculation), while also taking into account its squared loading weight W^2. Note, however, that sum(Rd) takes into account all components, whereas the other elements in the equation are component based.
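To make the three steps concrete, here is a minimal base R sketch. It is not the package source: vip_sketch is a hypothetical helper, the toy loadings W and variates_X are random stand-ins for the output of a fitted PLS model, and I am assuming that for component h the product runs over components 1 to h.

```r
# Hypothetical helper (not the mixOmics source): per-component VIP from
# Y (n x Q), the X variates (n x H) and the X loadings W (P x H).
vip_sketch <- function(Y, variates_X, W) {
  Y <- as.matrix(Y)
  P <- nrow(W)
  H <- ncol(W)
  # Step 1: squared correlations between Y variables and X components (Q x H)
  cor2 <- cor(Y, variates_X, use = "pairwise")^2
  VIP <- matrix(0, nrow = P, ncol = H,
                dimnames = list(rownames(W), paste0("comp", seq_len(H))))
  for (h in seq_len(H)) {
    # Step 2: redundancy per component, summed over the Y variables
    # (assumption: components 1..h are used for component h)
    Rd <- colSums(cor2[, 1:h, drop = FALSE])
    # Step 3: weight the squared loadings by Rd and normalise by sum(Rd)
    VIP[, h] <- sqrt(P * (W[, 1:h, drop = FALSE]^2 %*% Rd) / sum(Rd))
  }
  VIP
}

# Toy inputs, for illustration only (random data, not a real PLS fit)
set.seed(1)
n <- 20; P <- 5; Q <- 2; H <- 3
X <- matrix(rnorm(n * P), n, P, dimnames = list(NULL, paste0("x", 1:P)))
Y <- matrix(rnorm(n * Q), n, Q)
# Stand-in for PLS output: orthonormal loadings and the matching variates
W <- qr.Q(qr(matrix(rnorm(P * H), P, H)))
rownames(W) <- colnames(X)
variates_X <- X %*% W

vip <- vip_sketch(Y, variates_X, W)
dim(vip)        # P rows (X variables), H columns (components)
colSums(vip^2)  # each column sums to P when the loading vectors have unit norm
```

The last line is a quick sanity check: with unit-norm loading vectors, the squared VIP values of each component sum to P, which is where the usual "VIP > 1" rule of thumb comes from. On a fitted model you would pass object$Y, object$variates$X and the X loading matrix instead of the toy inputs.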
Regarding the coefficients, I’ll get back to you, as I need to dig further into the code.