I'm doing my master's thesis in metabolomics, where I want to find the relation between metabolomics samples (normal diet and low-carb diet) and the increase in LDL.
It is known that LDL increases after following a ketogenic diet, but why it does so is not known. My aim is to find the metabolites that correlate with the change in LDL using PLS or sPLS, and to interpret those metabolites in terms of biological pathways.
Since I have 3515 metabolic features, do you guys recommend using sPLS or PLS? I think that sPLS, with its sparsity, will select the most important variables, so would it be easier to start the search in biological pathways from there?
As it is important to validate the models we use, I use the perf function with Mfold validation and 5 folds (I have 25 samples). I don't quite understand what Q2 and Q2.total mean. How is the criterion of Q2.total being under 0.0975 constructed, and how do we interpret it?
Can I rely on Q2 or Q2.total alone, or do I also need to look at R2 and MSEP when choosing the number of components for a model?
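
For context, the definition of Q2 usually given in the PLS literature is Q2_h = 1 - PRESS_h / RSS_(h-1) for component h, and the 0.0975 limit equals 1 - 0.95^2. The sketch below is only an illustration in Python, not the mixOmics implementation: it uses a one-predictor least-squares fit as a stand-in for a single PLS component, and the toy data and the helper names (`q2`, `mfold_press`) are made up for the example.

```python
# Illustration only: Q2 from M-fold cross-validation on toy data.
# A one-predictor least-squares fit stands in for one PLS component.

def q2(press, rss_prev):
    """Q2 for one component: 1 - PRESS_h / RSS_(h-1)."""
    return 1.0 - press / rss_prev

# Conventional limit: 1 - 0.95^2, i.e. about 0.0975
Q2_LIMIT = 1.0 - 0.95 ** 2

def mfold_press(x, y, folds):
    """Sum of squared prediction errors over held-out folds (PRESS)."""
    n = len(y)
    press = 0.0
    for k in range(folds):
        test = [i for i in range(n) if i % folds == k]
        train = [i for i in range(n) if i % folds != k]
        mx = sum(x[i] for i in train) / len(train)
        my = sum(y[i] for i in train) / len(train)
        sxx = sum((x[i] - mx) ** 2 for i in train)
        sxy = sum((x[i] - mx) * (y[i] - my) for i in train)
        slope = sxy / sxx
        for i in test:
            pred = my + slope * (x[i] - mx)
            press += (y[i] - pred) ** 2
    return press

# Hypothetical data: 25 samples, 5 folds, as in the setup described above
x = [float(v) for v in range(25)]
y = [2.0 * v + (1.0 if v % 2 else -1.0) for v in range(25)]

y_mean = sum(y) / len(y)
rss0 = sum((v - y_mean) ** 2 for v in y)   # RSS before any component is fitted
press1 = mfold_press(x, y, folds=5)
q2_1 = q2(press1, rss0)

print("Q2 for first component:", q2_1)
print("Above the limit:", q2_1 >= Q2_LIMIT)
```

Under this definition, a component whose Q2 falls below the limit adds little cross-validated predictive ability; how mixOmics scales the data and aggregates Q2 into Q2.total may differ in detail from this sketch.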