How to get regression coefficients by using function sPLS

yenerpol · November 29, 2022, 10:42pm

Hi everyone,

First, I apologize if any of my doubts sound like baby questions. I’m new to this field. Currently I’m trying to use spls to build models from spectroscopic data gathered using LIBS. My goal is a multi-element concentration regression. I have 300 spectra from each of 5 different steel alloys. Each spectrum has 6000 predictors (wavelengths). I want to predict the concentration of 5 elements, so my X matrix is 1500x6000 and my Y matrix is 1500x5. My plan is to use sPLS for feature selection (and possibly for the regression itself?). As a first try I treated this as a classification problem, just to warm up and get insight into which predictors might be important, and I got really good results. Now, as said above, I’m going further and doing regression on elemental composition. Signals from some elements are easier to detect than others, and the spectral data is full of noise.

My question is about the output of pls() and the predict method.

When I use predict on the pls() output, I get predictions for the 5 elements I’m modeling, but I get them for each component. For example, with 10 components I have 10 different predictions for each Yi column. Should I fit the final linear model between Y and the latent components myself, or is that result already in the pls output? Sorry, I’m reading the documentation and papers about the method, but I’m confused by the terminology.

Could I have a quick explanation of the elements “predict”, “variate” and “B.hat”?

Thank you!

MaxBladen · November 30, 2022, 11:34pm

I have for example 10 components so I have 10 different predictions for the Yi column

This is what you want to see: each column holds the predictions made by a model that uses up to that many components. In other words, the first column contains predictions from a 1-component model, the second column those from a 2-component model, etc.
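To make the "one set of predictions per number of components" idea concrete, here is a minimal sketch in plain Python. It is not mixOmics code: it assumes a single response (PLS1), the textbook NIPALS algorithm, and no sparsity, and the function names and toy numbers are invented purely for illustration. The point is only that the h-th prediction comes from the model built on the first h components.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, ncomp):
    """Fit a single-response PLS model with NIPALS (hypothetical helper)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(ncomp):
        # Weight vector: direction in X most covariant with the residual y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = dot(w, w) ** 0.5
        w = [v / norm for v in w]
        t = [dot(E[i], w) for i in range(n)]                   # X scores
        tt = dot(t, t)
        load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        c = dot(f, t) / tt                                     # y on scores
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - c * t[i] for i in range(n)]
        comps.append((w, load, c))
    return x_mean, y_mean, comps


def pls1_predict(model, x_new):
    """Return one prediction per model size: entry h-1 uses h components."""
    x_mean, y_mean, comps = model
    e = [v - m for v, m in zip(x_new, x_mean)]
    pred, out = y_mean, []
    for w, load, c in comps:
        t = dot(e, w)
        pred += c * t
        out.append(pred)
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return out


# Made-up toy data: y = 2*x1 - x2 + 1 exactly, two predictors.
X_toy = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0]]
y_toy = [2 * a - b + 1 for a, b in X_toy]

model = pls1_fit(X_toy, y_toy, ncomp=2)
preds = pls1_predict(model, [2.0, 3.0])
print(preds)  # first entry: 1-component model, second entry: 2-component model
```

With two predictors, the 2-component model coincides with ordinary least squares, so its prediction matches the exact linear relation in the toy data, while the 1-component prediction is only an approximation. Which column to report in practice is a model-selection question (how many components to keep), not something this sketch decides.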

Should I make the final linear model between Y and Latent components or is this result in the pls output?

I’m not quite sure what you’re asking here, sorry. Your model is essentially two sets of latent components, since components are built from the Y dataframe as well as the X dataframe. The spls() function returns all the relevant information for that model.

“predict”, “variate” and “B.hat”

For clarification, look at our website. This page, this page and this page might be of assistance.

$predict: The values generated by your model (model.spls) when it was provided your testing data (X.test).
$variates: The projection of your testing data (X.test) onto the components found in your model (model.spls), i.e. the predicted variates. Refer to the third link I sent for an explanation of these terms if anything is unclear.
$B.hat: as part of the model building process, we iteratively adjust the weights associated with each input feature using a “regression coefficient”. B.hat can be thought of as the final values of these coefficients.
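For readers who prefer formulas, the relationship between these three outputs can be sketched with the usual PLS regression algebra. The symbols are assumptions introduced here, not names from the thread: $W_h$ holds the first $h$ X weight vectors, $P_h$ the X loadings, $C_h$ the regression weights of Y on the X components, and $X_{\text{new}}$ is the test data centered (and scaled, if applicable) with the training statistics.

```latex
\begin{aligned}
T_{\text{new}}^{(h)} &= X_{\text{new}}\, W_h \left(P_h^{\top} W_h\right)^{-1}
  && \text{(predicted variates)} \\
\hat{B}^{(h)} &= W_h \left(P_h^{\top} W_h\right)^{-1} C_h^{\top}
  && \text{(regression coefficients, no intercept)} \\
\hat{Y}^{(h)} &= X_{\text{new}}\, \hat{B}^{(h)} = T_{\text{new}}^{(h)}\, C_h^{\top}
  && \text{(predictions from the $h$-component model)}
\end{aligned}
```

Under this reading, there is one coefficient matrix and one set of predictions per value of $h$, which is why the output carries a separate slice for each number of components; predictions are then moved back to the original scale of Y using the training means (and standard deviations, if Y was scaled).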

yenerpol · December 1, 2022, 12:26am

Many thanks @MaxBladen!! Your answer was exactly what I needed
