PLS-DA predictions over 100 splits of the data

Hi everyone.

I am using PLS-DA to predict a two-category outcome. I used the perf function on the original dataset to determine the number of components for the final model (ncomp=2). Now, for the prediction, I split the data into training and testing sets (80% and 20%). I want to repeat this process 100 times and compute the average AUC at the end. My questions are:
1-/ Should I use ncomp=2 for each split?
2-/ Or should I determine the number of components (using the perf function) for each of the 100 training sets? If so, how can I choose these numbers, given that with 100 splits I don’t have the chance to inspect each perf plot?

Thanks for your advice.

Option (1) is certainly preferable. If every model uses the same ncomp, the 100 models are comparable and the average AUC reflects the performance of that one model specification. Option (2) is possible (explore the choice.ncomp component of the perf() output, which gives the selected number of components without needing the plot), but it would really only be used when the goal is to evaluate how good perf() is at selecting the optimal ncomp.
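For what it's worth, here is a minimal sketch of option (1), not an official recipe. The helper names auc_rank and repeated_split_auc are hypothetical, and X (numeric matrix) and Y (two-level factor) stand in for your own data. It assumes the array returned in predict()$predict has its class columns in the order of levels(Y), and it computes the AUC by hand from the rank-sum statistic on the predicted scores. The stratified split is my own choice as well; a plain random split would also work.

```r
# AUC from the rank-sum (Mann-Whitney) statistic, base R only.
# scores: numeric scores for the positive class; labels: true classes.
auc_rank <- function(scores, labels, positive) {
  is_pos <- labels == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)  # AUC undefined
  r <- rank(scores)  # ties receive average ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: repeat an 80/20 split n_splits times with a fixed ncomp
# and summarise the test-set AUC. Requires the mixOmics package when called.
repeated_split_auc <- function(X, Y, ncomp = 2, n_splits = 100,
                               train_frac = 0.8, seed = 1) {
  set.seed(seed)
  Y <- factor(Y)
  positive <- levels(Y)[2]  # second level is treated as the positive class
  aucs <- numeric(n_splits)
  for (i in seq_len(n_splits)) {
    # Stratified split: draw train_frac of the samples within each class
    train_idx <- unlist(lapply(split(seq_along(Y), Y), function(idx) {
      idx[sample.int(length(idx), floor(train_frac * length(idx)))]
    }), use.names = FALSE)
    model <- mixOmics::plsda(X[train_idx, , drop = FALSE], Y[train_idx],
                             ncomp = ncomp)
    pred <- predict(model, newdata = X[-train_idx, , drop = FALSE])
    # Assumption: $predict is samples x classes x components, with class
    # columns ordered as levels(Y); use the scores of the last component
    scores <- pred$predict[, 2, ncomp]
    aucs[i] <- auc_rank(scores, Y[-train_idx], positive)
  }
  c(mean = mean(aucs, na.rm = TRUE), sd = sd(aucs, na.rm = TRUE))
}
```

You would then call something like repeated_split_auc(X, Y, ncomp = 2, n_splits = 100) and report the mean together with the standard deviation across splits. mixOmics also provides auroc(), which can take newdata and outcome.test, if you would rather not compute the AUC by hand.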

Thanks, Max, for the quick answer. I was leaning towards option (1) and wanted the point of view of the mixOmics support team.