Error for perf.plsda

Hello mixOmics team,

I am using perf to evaluate the performance of a plsda model on my data, but it fails with the error shown below.
Some basic information about the data:

  1. Raw peak areas from GC-TOF-MS data, without normalization, used for the plsda analysis
  2. 15 samples, each with 1825 features after mzMine processing

The call that fails:

    perf.TR.plsda <- perf(plsda.TR,
                          validation = "Mfold",
                          folds = 5)

    Error in solve.default(Sr) :
    system is computationally singular: reciprocal condition number = 3.94774e-17
    In addition: Warning messages:
    1: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    2: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    3: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    4: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    5: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    6: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    7: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    8: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4
    9: In MCVfold.spls(X, Y, multilevel = multilevel, validation = validation, :
    At least one class is not represented in one fold, which may unbalance the error rate.
    Consider a number of folds lower than the minimum in table(Y): 4

I am looking forward to your reply. Thanks in advance!
Best wishes,
Weichao Wu

Hi Weichao,

Thanks for using mixOmics.

The singular-matrix error may be caused by highly correlated variables in the data. Have you considered the sparse model (splsda), which performs variable selection and can remedy this issue?

Also, as the warning messages show, the smallest class has only 4 samples (minimum in table(Y): 4), so I recommend a lower value for the folds argument (4 or 3), or validation = 'loo' (leave-one-out).
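
As a rough sketch of how the fold count could be chosen from the class sizes: the labels below are hypothetical stand-ins (15 samples, smallest class of 4, to match the warning), since the real Y was not posted. The perf calls at the end only run if mixOmics is installed and the plsda.TR object from the question exists in the session.

```r
# Hypothetical class labels standing in for the real Y:
# 15 samples, with the smallest class holding 4 samples.
Y <- factor(rep(c("A", "B", "C"), times = c(4, 5, 6)))

# Samples per class; the smallest class limits the number of folds.
class_sizes <- table(Y)
smallest_class <- min(class_sizes)

# Conservative choice: stay strictly below the smallest class size,
# as the warning text suggests, but never go under 2 folds.
folds <- max(2, smallest_class - 1)

# With the fitted model from the question (assumed to be named plsda.TR),
# the two suggested alternatives would look like this.
if (requireNamespace("mixOmics", quietly = TRUE) && exists("plsda.TR")) {
  # M-fold cross-validation with the reduced number of folds
  perf.TR.plsda <- mixOmics::perf(plsda.TR, validation = "Mfold", folds = folds)

  # Leave-one-out cross-validation (no folds argument needed)
  perf.TR.loo <- mixOmics::perf(plsda.TR, validation = "loo")
}
```

With so few samples per class, leave-one-out is often the simpler option, because it does not depend on how the samples happen to be split into folds.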

Let us know how you go.

Best wishes,

Al