Hello Team,
I am currently using PLS-DA to identify metabolites associated with dietary exposures. The outcome is consumption of a particular food (a continuous variable), and I would like to identify the top metabolites associated with that food.
I have looked into elastic net regression. However, because of multicollinearity, it flips the sign of the coefficient for some highly correlated metabolites. My goal is to identify the metabolites most strongly associated with the outcome, so if two metabolites are correlated with each other and with the outcome, I want to identify both of them, and both should have a positive slope. I understand that I can change the elastic net penalties to reduce the impact of collinearity, but then the model no longer gives me the top metabolites correlated with the outcome.
I would really appreciate your thoughts on this, and any advice on the best analytical approach to answer this question.
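
To make the sign flip described above concrete, here is a minimal simulated sketch in Python with NumPy. Everything in it is an assumption made for illustration: the data are simulated, the names `m1`, `m2`, and `intake` are hypothetical, and the effect sizes are arbitrary. It uses plain least squares rather than elastic net, so it only approximates what a weakly penalized model would do.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Two hypothetical metabolites with correlation of about 0.95 (simulated data).
m1 = rng.standard_normal(n)
m2 = 0.95 * m1 + np.sqrt(1 - 0.95**2) * rng.standard_normal(n)

# Simulated food intake: positive effect of m1, negative effect of m2
# once m1 is held fixed (arbitrary illustrative coefficients).
intake = 1.0 * m1 - 0.5 * m2 + 0.5 * rng.standard_normal(n)

X = np.column_stack([m1, m2])

# Marginal slopes: regress intake on each metabolite separately.
marginal = np.array([np.polyfit(X[:, j], intake, 1)[0] for j in range(2)])

# Joint slopes: regress intake on both metabolites at once (with intercept).
design = np.column_stack([np.ones(n), X])
joint = np.linalg.lstsq(design, intake, rcond=None)[0][1:]

print("marginal slopes:", marginal.round(2))
print("joint slopes:   ", joint.round(2))
```

In this simulation both metabolites have a positive slope when tested one at a time, but the second one gets a negative slope in the joint model. That is the behaviour I am seeing, and the one-at-a-time (marginal) ranking is closer to what I am actually after.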
Thank you!