Thank you for your suggestion to update to the latest version of mixOmics using
devtools::install_github("mixOmicsTeam/mixOmics")
I followed your advice and installed version 6.31.4, but unfortunately the issue with the plotIndiv function mislabelling the sample groups persists.
I am not able to recreate your error with mis-allocated sample group colours. Could you please run this test code using the SRBCT dataset and check that the outputs are correct?
library(mixOmics)
data(srbct)
X <- srbct$gene
groups <- srbct$class
print(groups)
# [1] EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS
# [24] BL BL BL BL BL BL BL BL NB NB NB NB NB NB NB NB NB NB NB NB RMS RMS RMS
# [47] RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS
# Levels: EWS BL NB RMS
result.pca <- pca(X)
# plot where each sample is named and coloured by group
plotIndiv(result.pca,
ind.names = TRUE,
legend = TRUE,
group = groups,
legend.title = "Groups")
# plot where each sample shape and colour set by group
plotIndiv(result.pca,
ind.names = FALSE,
legend = TRUE,
group = groups,
legend.title = "Groups",
pch.levels = groups,
legend.title.pch = "Groups")
This code generates two plots. In the first, the sample points are drawn as the sample names (as set by ind.names = TRUE) and you can see they are coloured properly. In the second, the sample points are shaped and coloured based on their groups, which looks like what you are aiming for with your code (to do this you need to set ind.names = FALSE).
If you get the same plots as these with this example code, then your problem likely lies with data$Sample and factor(data$Time) that you have set. Make sure that these vectors look the same as groups in this example code.
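One way to compare your vectors against groups is to look at their class, length and levels. The sketch below uses base R only and a made-up data frame: the Sample and Time values are placeholders, not your data, and sample_groups and time_groups are illustrative names.

```r
# Hypothetical stand-in for your metadata; replace with your own data frame
data <- data.frame(
  Sample = rep(c("S1", "S2", "S3"), each = 4),
  Time   = rep(c("T0", "T1"), times = 6),
  stringsAsFactors = FALSE
)

sample_groups <- factor(data$Sample)
time_groups   <- factor(data$Time)

# Both should be factors with one entry per sample and no missing values
print(is.factor(sample_groups))
print(length(sample_groups) == nrow(data))
print(anyNA(time_groups))

# Level order, and number of samples in each Sample/Time combination
print(levels(time_groups))
print(table(sample_groups, time_groups))
```

The length should also match the number of rows of the matrix you pass to pca(), and the entries should be in the same row order.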
If you can’t find the issue feel free to send me your data (eva.hamrud@unimelb.edu.au) as RDS objects and I can have a look as well!
Thank you very much for your quick reply. The code you have shared doesn’t reproduce my problem because there is only one variable represented in the legend. The bug occurs when the data is also grouped by a second variable using ‘pch’. In that case the colours are correctly assigned (sample for me), but the shapes are not (time for me).
Don’t hesitate to ask me for more information if my explanation isn’t clear enough.
Ah, I thought we were talking about mis-colouring samples! So just to clarify - your samples are being coloured correctly (based on data$Sample) but the samples don’t have the correct shape (based on data$Time)?
I have modified my example code to create an independent variable by which to assign shapes. This seems to work on my end; do you get the same plot as me? If yes, you can send me your data objects using the email address I sent and I can investigate further.
library(mixOmics)
data(srbct)
X <- srbct$gene
groups_cols <- srbct$class
print(groups_cols)
# [1] EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS
# [24] BL BL BL BL BL BL BL BL NB NB NB NB NB NB NB NB NB NB NB NB RMS RMS RMS
# [47] RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS
# Levels: EWS BL NB RMS
groups_shapes <- as.factor(c(rep("A", 20), rep("B", 20), rep("C", 23)))
levels(groups_shapes) <- c("A", "B", "C")
print(groups_shapes)
# [1] A A A A A A A A A A A A A A A A A A A A B B B B B B B B B B B B B B B B B B B B C
# [42] C C C C C C C C C C C C C C C C C C C C C C
# Levels: A B C
result.pca <- pca(X)
plotIndiv(result.pca,
ind.names = FALSE,
legend = TRUE,
group = groups_cols,
legend.title = "Groups",
pch = groups_shapes,
pch.levels = groups_shapes,
legend.title.pch = "Groups")
“Your samples are being coloured correctly (based on data$Sample) but the samples don’t have the correct shape (based on data$Time)” → yes!
I get the same figure when I use the code. However, the graph remains the same when I change the order in which the letters are assigned, as in the following code:
library(mixOmics)
data(srbct)
X <- srbct$gene
groups_cols <- srbct$class
print(groups_cols)
# [1] EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS BL BL BL BL BL BL
# [30] BL BL NB NB NB NB NB NB NB NB NB NB NB NB RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS
# [59] RMS RMS RMS RMS RMS
# Levels: EWS BL NB RMS
groups_shapes <- as.factor(c(rep("B", 20), rep("C", 20), rep("A", 23)))
levels(groups_shapes) <- c("A", "B", "C")
print(groups_shapes)
# [1] B B B B B B B B B B B B B B B B B B B B C C C C C C C C C C C C C C C C C C C C A A A A A A A A A A A A A A A A A A A
# [60] A A A A
# Levels: A B C
result.pca <- pca(X)
plotIndiv(result.pca,
ind.names = FALSE,
legend = TRUE,
group = groups_cols,
legend.title = "Groups",
pch = groups_shapes,
pch.levels = groups_shapes,
legend.title.pch = "Groups")
In my own plot, shapes are assigned to the levels of data$Time in alphabetical order, not according to the time point each sample actually belongs to.
The simple example here shows the difficulty I’m having and I figure it might be of interest to others, so I’m continuing for the moment on the forum.
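On the alphabetical ordering mentioned above, here is a base R sketch of how the factor in the last example is built. as.factor() on a character vector sorts the levels alphabetically, and assigning to levels() renames levels by position rather than reordering them, so in that example the levels() line leaves the factor unchanged. Passing levels = to factor() is the usual way to control the order. Whether plotIndiv follows that level order when assigning shapes is an assumption that is not confirmed here; the variable names below are illustrative.

```r
shape_labels <- c(rep("B", 20), rep("C", 20), rep("A", 23))

# as.factor() sorts the levels alphabetically, whatever the data order
f_default <- as.factor(shape_labels)
print(levels(f_default))

# Assigning to levels() renames by position; with the same names it changes nothing
f_renamed <- f_default
levels(f_renamed) <- c("A", "B", "C")
print(all(as.character(f_renamed) == as.character(f_default)))

# To keep the order of first appearance, pass the levels explicitly
f_ordered <- factor(shape_labels, levels = unique(shape_labels))
print(levels(f_ordered))

# The label attached to each sample is the same either way
print(all(as.character(f_ordered) == shape_labels))
```

Comparing the legend against table(f_default) and table(f_ordered) may help show whether shapes follow the level order or the data order.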