Hi everyone,
First, I apologize if any of my questions sound basic; I’m new to this field. I’m currently trying to use sPLS to build models from spectroscopic data gathered with LIBS. My goal is a multi-element concentration regression. I have 300 spectra from each of 5 different steel alloys, and each spectrum has 6000 predictors (wavelengths). I want to predict the concentrations of 5 elements, so my X matrix is 1500x6000 and my Y matrix is 1500x5. My plan is to use sPLS for feature selection (and possibly for the regression itself?).
As a warm-up, I first treated this as a classification problem, to get some insight into which predictors might be important, and I got really good results. Now, as said above, I’m moving on to regression on elemental composition. Signals from some elements are easier to detect than others, and the spectral data is full of noise.
My question is about the output of pls() and the predict method.
When I call predict on the pls() output, I get predictions for the 5 elements I’m modeling, but I get a separate set for each component. For example, with 10 components I get 10 different predictions for each Y column. Do I need to fit the final linear model between Y and the latent components myself, or is that result already in the pls output? Sorry, I’m reading the documentation and papers about the method, but I’m confused by the terminology.
Could someone give me a quick explanation of the elements “predict”, “variate” and “B.hat”?
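To make the question concrete, here is a minimal sketch of the situation. It assumes the pls() in question is the one from the mixOmics R package; random numbers stand in for the spectra, the dimensions are shrunk so it runs quickly, and all object names (X, Y, fit, pred) are placeholders. The comments reflect one reading of the package documentation and may need correcting.

```r
# Toy sketch only: random data stands in for the LIBS spectra.
# Assumes the mixOmics package is installed.
library(mixOmics)

set.seed(1)
n <- 60; p <- 200; q <- 5             # samples, wavelengths, elements (placeholders)
X <- matrix(rnorm(n * p), nrow = n)   # stand-in for the 1500 x 6000 spectra
Y <- matrix(rnorm(n * q), nrow = n)   # stand-in for the 1500 x 5 concentrations

fit  <- pls(X, Y, ncomp = 10, mode = "regression")
pred <- predict(fit, newdata = X)

# Predicted Y values: one slice per number of components kept
dim(pred$predict)    # samples x responses x components

# Scores of the new samples on the latent components
dim(pred$variates)   # samples x components

# Regression coefficients on the original X variables (no intercept),
# again one slice per number of components kept
dim(pred$B.hat)

# As I understand it, slice h uses components 1..h together,
# so this would be the prediction from the full 10-component model
y_hat_10 <- pred$predict[, , 10]
```

If that reading is right, the 10 sets of predictions are not 10 separate models: each slice is the prediction from a model that keeps the first h components, so choosing the number of components (for example by cross-validation) would pick the slice to use, without fitting an extra linear model on the latent components by hand.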
Thank you!