Feature selection vs. using selectVar function with no feature selection

Hello! For feature selection in timeOmics, the usual approach is to pass a list of how many variables to keep to block.spls(). Is that necessary, or can I just run selectVar() on the regular block.pls() output, take the top 10 (or so) variables, and use those instead? Is there anything wrong with that approach?

Hi @bzavala,

There is nothing wrong with this approach. However, the results will differ from a block.spls() where you specify keepX = (10,10 …): the components are constructed to be orthogonal, so the variables selected on component 2 depend on the variables selected on component 1, and so on.

If you don't want to select variables and just want to explain the main drivers of the patterns that you see on the sample plot, then your approach is appropriate.

If you are interested in feature selection then it’s better to use a sparse version.
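
To make the two options concrete, here is a minimal sketch of both. The toy blocks, the block names `rna` and `protein`, and the choice of 10 variables are assumptions for illustration only, and it relies on selectVar() returning variables ordered by absolute loading, which on a non-sparse model means every variable is returned.

```r
library(mixOmics)

set.seed(42)
n <- 30
samples <- paste0("s", 1:n)

# Hypothetical toy blocks (samples in rows); replace with your own data.
X <- list(
  rna     = matrix(rnorm(n * 50), n, 50,
                   dimnames = list(samples, paste0("rna_", 1:50))),
  protein = matrix(rnorm(n * 40), n, 40,
                   dimnames = list(samples, paste0("prot_", 1:40)))
)
Y <- matrix(rnorm(n * 5), n, 5, dimnames = list(samples, paste0("y_", 1:5)))

# Approach 1: non-sparse model, then keep the 10 highest-ranked variables.
# No loading is zero here, so selectVar() returns all variables in ranked order.
pls_res   <- block.pls(X = X, Y = Y, ncomp = 2)
top10_rna <- head(selectVar(pls_res, comp = 1)$rna$name, 10)

# Approach 2: sparse model keeping 10 variables per block on each component.
keepX      <- list(rna = c(10, 10), protein = c(10, 10))
spls_res   <- block.spls(X = X, Y = Y, ncomp = 2, keepX = keepX)
sparse_rna <- selectVar(spls_res, comp = 1)$rna$name

# Variables chosen by both approaches on component 1.
intersect(top10_rna, sparse_rna)
```

Changing `comp = 1` to `comp = 2` in both selectVar() calls is a quick way to check how far the two selections drift apart on the second component for your own data.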

Kim-Anh

OK, so will it make a difference if I’m only using the first component? Also, is there a major difference between selecting the top 10 variables and feature-selecting 10 variables? (I believed they were the same approach.)

Hi @bzavala,

If you only use the first component, then it does not matter. Results from a sparse and a non-sparse method start to differ from component 2 onwards.
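
The reason can be sketched in base R for the simplest case, a single block and a single response. This is not the mixOmics implementation: the helpers `weights()`, `sparsify()` and `deflate()` are hypothetical names, and hard top-k selection stands in for the soft-thresholding of sparse PLS (both keep the same set of variables). The point is that the data are deflated with the component just computed, so a sparse component 1 leaves a different matrix for component 2.

```r
set.seed(1)
n <- 40; p <- 30; k <- 10
X <- scale(matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("v", 1:p))))
y <- scale(X[, 1:5] %*% rnorm(5) + rnorm(n))

# Loading weights of one component with a single response: proportional to X'y.
weights <- function(X, y) drop(crossprod(X, y))

# Keep the k largest absolute weights and set the others to zero.
sparsify <- function(w, k) { w[rank(-abs(w)) > k] <- 0; w }

# Names of the k variables with the largest absolute weights.
top_k <- function(w, k) names(sort(abs(w), decreasing = TRUE))[1:k]

# Deflation: remove from X the part explained by the component t = X w.
deflate <- function(X, w) {
  t <- X %*% w
  X - t %*% crossprod(t, X) / drop(crossprod(t))
}

# Component 1: both routes rank the same weights, so the selections agree.
w1      <- weights(X, y)
s1      <- sparsify(w1, k)
dense1  <- top_k(w1, k)
sparse1 <- names(s1)[s1 != 0]
setequal(dense1, sparse1)

# Component 2: each route deflates X with its own component 1,
# so the weights are computed on different matrices.
X_dense  <- deflate(X, w1)
X_sparse <- deflate(X, s1)
dense2   <- top_k(weights(X_dense, y), k)
sparse2  <- top_k(weights(X_sparse, y), k)
length(intersect(dense2, sparse2))
```

With several blocks or a multivariate response, the component 1 selections of block.pls() and block.spls() may not match exactly variable for variable, so it is worth checking the overlap on your own data.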

Kim-Anh