When talking about the analysis, should I report the BER instead of the variance explained?
Explained variance is worth mentioning if it takes notably high (or low) values, as it gives you an idea of how much non-discriminatory information is in your dataset. However, when it comes to quantitative evaluation of your method, the error rate (or BER, the balanced error rate, which averages the per-class error rates) is definitely the better metric to use.
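For instance, here's a minimal sketch of how you'd pull both numbers out of mixOmics, using the srbct example data that ships with the package (exact slot names such as prop_expl_var can differ slightly between package versions):

```r
library(mixOmics)

data(srbct)  # small tumour dataset bundled with mixOmics
res <- plsda(srbct$gene, srbct$class, ncomp = 3)

# Proportion of explained variance per component
# (called 'explained_variance' in older mixOmics versions)
res$prop_expl_var$X

# Cross-validated classification error, including BER,
# per component and prediction distance
perf.res <- perf(res, validation = "Mfold", folds = 5, nrepeat = 10)
perf.res$error.rate$BER
```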
What BER is usually considered good in one of these analyses?
There is no specific threshold, as every analysis is unique. For example, if you're running a preliminary study on a minimal dataset with 4 response classes, a BER of 60% is actually quite good: random class selection would be wrong 75% of the time, so that's a 15-percentage-point improvement. In a preliminary study, you're not looking for amazing results.
However, for more stringent experimental designs, you may be looking to minimise BER to 10-15%. I’d say generally, aiming for about 25% BER is a good starting point, but don’t take that as gospel!
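To make that arithmetic concrete: with k balanced classes, picking a class at random gets it wrong 1 - 1/k of the time, which is where the 15% figure above comes from:

```r
k <- 4
baseline <- 1 - 1/k   # random guessing among 4 classes errs 75% of the time
baseline - 0.60       # 0.15: a 60% BER beats random selection by 15 points
```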
Is there any other parameter used to evaluate the “efficiency” of the model result?
This may be semantics, but I wouldn't describe BER as measuring "efficiency", but rather accuracy. For assessing accuracy, have a look at the auroc() function (see ?auroc). It provides numerical and graphical ways for you to assess the specificity and sensitivity of your model. If you're unsure what these metrics mean or what AUROC represents, feel free to ask.
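As a rough sketch, assuming the plsda model res from the snippet above (roc.comp just sets how many components the predictions use):

```r
# Plots one ROC curve per class and returns the corresponding AUC values
auc.res <- auroc(res, roc.comp = 2)
```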
Regarding efficiency: if you mean runtime efficiency, then simply tracking the runtime will give you an assessment (via Sys.time()). If you're thinking about how efficiently the method explains your data, this is where the proportion of explained variance is handy. Additionally, examining the loadings (via the plotLoadings() function) will show how effective individual features were at discriminating the classes.
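A quick sketch of both ideas, again assuming the res model from earlier (the plotLoadings() arguments shown are common ones, but check ?plotLoadings for your version):

```r
# Runtime: bracket the call of interest with Sys.time()
start <- Sys.time()
res <- plsda(srbct$gene, srbct$class, ncomp = 3)
Sys.time() - start  # elapsed fitting time

# Loadings: the top 20 features on component 1, with bars coloured by
# the class each feature contributes to most (by median value)
plotLoadings(res, comp = 1, contrib = "max", method = "median", ndisplay = 20)
```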
Hope this info helps!