Prediction PLS-DA

Hi,

I am writing because I have a question about the prediction function. I have a PLS-DA model built on 5 chemical variables and I would like to predict the class of new samples. However, only 4 of the chemical variables are available for these samples. I ran the function on these samples and, even though they do not have all the variables, they were classified (with one variable missing).

Do you know how the function can classify unknown samples with a missing variable? How is this possible?

Thanks very much!!!

Hi @enzo,

I get an error with the following. Can you please confirm that you don’t receive this error and provide a reproducible example of your case?

library(mixOmics)
data(breast.tumors)
## training set: first 40 samples and their class labels
X.train <- breast.tumors$gene.exp[1:40,]
Y.train <- breast.tumors$sample$treatment[1:40]
## prediction samples lack one variable
X.pred <- breast.tumors$gene.exp[41:47,-1]

plsda.breast <- plsda(X.train, Y.train, ncomp = 2)

predict(plsda.breast, newdata = X.pred)
> Error in predict.mixo_pls(plsda.breast, newdata = X.pred) : 
>   'newdata' must include all the variables of 'object$X'

Best,

Al

Hi @aljabadi, thanks very much for your reply.

With your example, yes, I get this error. I don’t know how to give you a reproducible example with my data, because I don’t know how to upload my data to the forum.
But I think the difference between my case and yours is that the data set I want to predict is only partially incomplete.
I mean, for instance, that out of 20 samples (5 variables), 7 have no value for variable 4. Perhaps the algorithm handles the missing values? By imputation?
I will try to provide a reproducible example.
Another question:
Is it possible to save the PLS-DA model (results)? I would like to avoid loading the database, creating the model, selecting the latent variables, etc., every time before predicting. I would like to just save the PLS-DA results and, when I need them, load them and predict.

Thanks very much!!!

Enzo

Hi @enzo,

Yes, the algorithm ‘ignores’ the missing values and performs the prediction using the observed values.
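
To tell the two situations apart (a whole variable absent from the new data versus a few missing values inside it), a quick base-R check along these lines may help. This is only a sketch: the matrix X.new and the variable names are made up for illustration, and in practice the training variable names would come from the fitted model.

```r
# Hypothetical new samples: 5 samples x 5 variables (made-up values)
X.new <- matrix(seq_len(25) / 10, nrow = 5,
                dimnames = list(paste0("s", 1:5), paste0("var", 1:5)))
# Variable 4 was not measured for two of the samples
X.new[c("s2", "s5"), "var4"] <- NA

# Hypothetical training variable names
# (with a fitted model this could be colnames(plsda.res$X))
train.vars <- paste0("var", 1:5)

# Variables that are entirely absent from the new data
missing.cols <- setdiff(train.vars, colnames(X.new))

# Number of missing values per variable
na.per.variable <- colSums(is.na(X.new))

# Samples with at least one missing value
incomplete.samples <- rownames(X.new)[!complete.cases(X.new)]

print(missing.cols)
print(na.per.variable)
print(incomplete.samples)
```

If missing.cols is non-empty, the new data lacks a whole column and predict() stops with the error shown earlier in the thread. If only na.per.variable is non-zero, the columns are all present and just some values are missing.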

And yes, you can save the model. If, say, plsda.res is your plsda model, you can save it using:

save(plsda.res, file = "plsda.res.RData")

You can then load it using:

load("plsda.res.RData")
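
As an alternative, saveRDS() and readRDS() store a single object and let you choose its name when you read it back. Here is a minimal sketch, assuming mixOmics is installed; the file path and the object names are only for illustration.

```r
library(mixOmics)

# Fit a PLS-DA model on the breast.tumors data shipped with mixOmics
data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment
plsda.res <- plsda(X, Y, ncomp = 2)

# Save the fitted model to a single file (illustrative path)
model.path <- file.path(tempdir(), "plsda.res.rds")
saveRDS(plsda.res, file = model.path)

# Later, e.g. in a new R session: read it back under any name you like
plsda.restored <- readRDS(model.path)

# Check that the restored model carries the same X loadings as the original
print(all.equal(plsda.res$loadings$X, plsda.restored$loadings$X))
```

After reading the model back, predict(plsda.restored, newdata = ...) can be called as usual, as long as mixOmics is loaded in that session.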

Hope it helps,
Al