I have a question about a meta-analysis I am running separately on a number of datasets. For all of them I am facing the same problem: an odd Q2 total value for the first latent variable (LV).
I am running PLS in regression mode and want to select the number of components according to the Q2 criterion (mentioned by @mixOmics), where the rule of thumb is that a PLS component should be included in the model if its Q2 value is ≥ 0.0975.
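For reference, this is the definition of Q2 as I understand it for a single response; the notation (h for the component index, PRESS and RSS) follows the usual textbook convention and is my assumption, not something taken from the mixOmics source code:

```latex
% Q2 for component h with a single response variable
% PRESS_h   : cross-validated prediction error sum of squares when component h is added
% RSS_{h-1} : residual sum of squares of the model with h-1 components
%             (RSS_0 is the total sum of squares of the centred response)
Q^2_h = 1 - \frac{\mathrm{PRESS}_h}{\mathrm{RSS}_{h-1}}

% Rule of thumb: keep component h when
Q^2_h \ge 1 - 0.95^2 = 0.0975
\quad\Longleftrightarrow\quad
\sqrt{\mathrm{PRESS}_h} \le 0.95\,\sqrt{\mathrm{RSS}_{h-1}}
```

Under this definition a negative Q2 means that PRESS for that component is larger than the RSS of the previous model, i.e. the cross-validated predictions are worse than the baseline they are compared with.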
I have read that negative values and poor prediction can result from a low number of samples and/or a large number of variables.
In my case I have 171 samples and 17k genes (from a transcriptomics dataset) and a single response variable.
The Q2 total value of my first component (LV) comes out negative, and I am stuck on how to deal with this. Any help is appreciated.
Below are the code and the Q2 total values for the first eight components:
pls.GE <- pls(my_predictors, my_response, ncomp = 20, mode = "regression", scale = TRUE)
perf.pls <- perf(pls.GE, validation = "Mfold", folds = 10, nrepeat = 10)
1 comp -0.3664158
2 comp 0.3304083
3 comp 0.2942803
4 comp 0.2380420
5 comp 0.2939665
6 comp 0.2672010
7 comp 0.3311232
8 comp 0.2078379
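To make the rule of thumb concrete, here is a minimal base R sketch that applies the 0.0975 cutoff to the eight values listed above. The values are typed in by hand rather than pulled from the perf object, and the variable names (q2_total, threshold, passes, passing) are only illustrative:

```r
# Q2 total values copied by hand from the output above (components 1 to 8)
q2_total <- c(-0.3664158, 0.3304083, 0.2942803, 0.2380420,
              0.2939665, 0.2672010, 0.3311232, 0.2078379)

# Rule-of-thumb cutoff: 1 - 0.95^2 = 0.0975
threshold <- 1 - 0.95^2

# Flag each component against the cutoff
passes <- q2_total >= threshold
print(data.frame(comp = seq_along(q2_total), Q2.total = q2_total, passes = passes))

# Indices of the components at or above the cutoff
passing <- which(passes)
print(passing)
```

Only the first component falls below the cutoff here, which is exactly the part I do not know how to interpret.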