Feature selection vs. using selectVar function with no feature selection

bzavala · November 11, 2024, 5:28am

Hello! For timeOmics, feature selection is typically done by supplying a list to block.spls(). Is that necessary, or can I just run selectVar() on the regular block.pls() output, take the top 10 (or so) variables, and use those instead? Is there anything wrong with that approach?

kimanh.lecao · November 14, 2024, 5:22am

hi @bzavala,

Nothing wrong with this approach. However, the results will differ from a block.spls where you specify keepX = (10,10 …), as the method is orthogonal: the selection of the variables on component 2 will depend on the selection on component 1, etc.

If you don't want to select variables and just want to explain the main drivers of the patterns that you see on the sample plot, then your approach is appropriate.

If you are interested in feature selection then it’s better to use a sparse version.

Kim-Anh

bzavala · November 14, 2024, 12:52pm

Ok, so will it make a difference if I'm using just the 1st component? Also, is there a major difference between selecting the top 10 variables and feature-selecting 10 variables (I believed they were the same approach)?

kimanh.lecao · December 11, 2024, 8:36pm

hi @bzavala,

If you use only the first component, then it does not matter. The results of a sparse and a non-sparse method start to differ from component 2 onward.

Kim-Anh
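
For readers who want to check this on their own data, here is a minimal sketch of the comparison discussed in this thread. It is not from the mixOmics team and has not been run against any particular package version. It assumes mixOmics is installed, uses simulated data, and the block names (`rna`, `prot`), the helper `make_block()` and the choice of 10 variables are purely illustrative.

```r
library(mixOmics)

set.seed(1)
n <- 20  # number of samples (rows); assumption for illustration

# Hypothetical helper: simulate one block with named rows and columns
make_block <- function(p, prefix) {
  m <- matrix(rnorm(n * p), nrow = n)
  dimnames(m) <- list(paste0("s", 1:n), paste0(prefix, 1:p))
  m
}

X <- list(rna = make_block(50, "gene"), prot = make_block(40, "prot"))
Y <- make_block(30, "metab")

# Non-sparse model: selectVar() ranks all variables by absolute loading
fit_pls <- block.pls(X, Y, ncomp = 2)

# Sparse model: keep 10 variables per block on each of the 2 components
fit_spls <- block.spls(X, Y, ncomp = 2,
                       keepX = list(rna = c(10, 10), prot = c(10, 10)))

# Compare "top 10 from block.pls" with "10 selected by block.spls", per component
for (k in 1:2) {
  top_pls  <- head(selectVar(fit_pls, comp = k)$rna$name, 10)
  sel_spls <- selectVar(fit_spls, comp = k)$rna$name
  cat("comp", k, ": variables in common =",
      length(intersect(top_pls, sel_spls)), "\n")
}
```

The overlap printed for each component shows how closely the two approaches agree on a given data set; per the replies above, any disagreement is expected to show up mainly from component 2 onward.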
