mixOmics::cimDiablo problems with group annotations

Has anyone reported a case where, when running mixOmics::cimDiablo, the categorical outcome appears to have become mismatched with the samples (colnames)?

In the picture below, if you look up any sample ‘RUSxxx’ from the heat map and compare it with the corresponding category in the Y vector, they are mismatched.

What is the general guidance for passing the list of blocks X, and Y, into cim, to make sure that the call to pheatmap (I assume it's using pheatmap) keeps the samples and their categories aligned?

Here is a matrix generated from mixOmics::cimDiablo

You will notice that the H/L category columns do not match the Y vector. Here is the Y vector that is consumed by the function. It is a named vector (an ordered factor, per the Levels line at the end of the printout); I ordered it here so it's easier to see what I mean:

RUS002 RUS003 RUS004 RUS005 RUS006 RUS008
H L L L L H
RUS010 RUS012 RUS014 RUS015 RUS019 RUS020
L H H L L L
RUS022 RUS023 RUS024 RUS025 RUS026 RUS028
L L H H H H
RUS030 RUS031 RUS032 RUS033 RUS034 RUS035
L H L L H L
RUS036 RUS040 RUS041 RUS042 RUS043 RUS044
L H L H L L
RUS045 RUS047 RUS048 RUS050 RUS051 RUS052
H H H L L L
RUS056 RUS057 RUS059 RUS060 RUS062 RUS065
L H L H L H
RUS068 RUS069 RUS071 RUS072 RUS075 RUS078
H L H L L L
RUS081 RUS083 RUS084 RUS086 RUS087 RUS089
L H H H H L
RUS090 RUS091 RUS092 RUS093 RUS094 RUS097
L L L H L L
RUS099 RUS103 RUS104 RUS106 RUS107 RUS109
L L H H L L
RUS111 RUS113 RUS115 RUS116 RUS117 RUS119
L H L L H H
RUS120 RUS123 RUS125 RUS126 RUS127 RUS128
L H L H L L
RUS129 RUS133 RUS134 RUS137 RUS138 RUS139
H H L L L L
RUS141 RUS143 RUS144 RUS145 RUS146 RUS147
H L L L L H
RUS150 RUS151
L L
Levels: L < H

There are 3 blocks going in. Here is one of them, as consumed by mixOmics::cimDiablo, along with the actual Y vector that goes in. Together they show that the row names (samples) are in the same order going into block.plsda:

head(X$glycansInd)
A2 A2B A2F A2FB G1S1 G1FS1 A1
RUS059 0.02061 0.002315 0.023218 0.01444 0.0001752 0.01851 0.03701
RUS126 0.02322 0.003875 0.027167 0.01851 0.0005451 0.05058 0.02061
RUS083 0.05058 0.003185 0.020609 0.01612 0.0007398 0.02717 0.02322
RUS031 0.01612 0.001387 0.020609 0.01851 0.0003504 0.02717 0.01444
RUS094 0.02061 0.001063 0.018508 0.01444 0.0010634 0.02717 0.01612
RUS125 0.01732 0.001387 0.007052 0.01732 0.0003504 0.03701 0.02717
A1p A1F G0 A1FB G0B G0F G1
RUS059 0.0013870 0.07643 0.0007398 0.01612 0.0031846 0.2482 0.007052
RUS126 0.0023154 0.09956 0.0005451 0.01612 0.0005451 0.1362 0.007052
RUS083 0.0007398 0.07643 0.0007398 0.01444 0.0007398 0.2482 0.011724
RUS031 0.0031846 0.07643 0.0003504 0.02322 0.0045650 0.2482 0.002315
RUS094 0.0010634 0.07643 0.0010634 0.02322 0.0045650 0.2482 0.001063
RUS125 0.0003504 0.02322 0.0023154 0.02061 0.0045650 0.2482 0.003185
G0FB G1B G1F G1Fp G1FB G2 G2F
RUS059 0.02717 0.0001752 0.2005 0.09956 0.05058 0.004565 0.13625
RUS126 0.01444 0.0005451 0.2482 0.07643 0.03701 0.003875 0.20055
RUS083 0.01851 0.0007398 0.2005 0.09956 0.03701 0.004565 0.13625
RUS031 0.05058 0.0003504 0.2005 0.13625 0.03701 0.007052 0.09956
RUS094 0.05058 0.0010634 0.2005 0.13625 0.03701 0.007052 0.09956
RUS125 0.13625 0.0003504 0.2005 0.09956 0.07643 0.011724 0.05058
G2FB
RUS059 0.011724
RUS126 0.011724
RUS083 0.007052
RUS031 0.011724
RUS094 0.011724
RUS125 0.014442
head(Yvec)
RUS059 RUS126 RUS083 RUS031 RUS094 RUS125
L H H H L L
Levels: L < H

The call to block.plsda is:

diablo.plsda <- block.plsda(X = X, Y = Yvec, ncomp = 5, design = design)

Then the call to cim is:

mixOmics::cimDiablo(diablo.final, transpose = TRUE,
                    color.blocks = c('darkorchid', 'brown1', 'lightgreen'),
                    comp = 1, margin = c(8, 20), legend.position = 'topright',
                    size.legend = 1
)

When I inspect the diablo.final object, everything appears OK before it goes into cim.

Again, has anyone encountered this? I assume that something in the block.plsda step is throwing the cim function off. And what is the general guidance for inputting the list X and Y? I make sure that the row names of each block match, and that they match the names of Y.
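The row-name check described above can be sketched in base R as follows. This is only a minimal illustration on made-up toy data: `X_toy`, `Y_toy` and the helper `check_alignment()` are hypothetical names, not part of mixOmics, and the check assumes the blocks are matrices (or data frames) with sample IDs as row names.

```r
# Toy data standing in for the real blocks; all values are made up.
samples <- c("RUS059", "RUS126", "RUS083", "RUS031")
X_toy <- list(
  blockA = matrix(1:8, nrow = 4, dimnames = list(samples, c("f1", "f2"))),
  blockB = matrix(8:1, nrow = 4, dimnames = list(samples, c("g1", "g2")))
)
Y_toy <- factor(c("L", "H", "H", "H"), levels = c("L", "H"), ordered = TRUE)
names(Y_toy) <- samples

# Hypothetical helper: TRUE only if every block has exactly the same
# row names, in the same order, as names(Y).
check_alignment <- function(X, Y) {
  all(vapply(X, function(block) identical(rownames(block), names(Y)),
             logical(1)))
}

aligned <- check_alignment(X_toy, Y_toy)

# A Y vector with its first two samples swapped should fail the check.
misaligned <- check_alignment(X_toy, Y_toy[c(2, 1, 3, 4)])
```

Running the same kind of check on the real `X` and `Yvec` before the call to block.plsda would rule out an ordering problem in the inputs.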

Peter

There is nothing wrong with the function. Your Y vector is of length 92, while the x axis in the cimDiablo plot has 46 tick labels, meaning that only every second sample name is being shown.

The first x label in the plot is RUS123. The Y vector states this is H, and the column colour in the plot denotes the second sample as H. The second x label is RUS071. The Y vector states this is H, and the column colour in the plot denotes the fourth sample as H. The third x label is RUS125. The Y vector states this is L, and the sixth column colour is orange, i.e. L.

If you continue this procedure, you’ll see that there isn’t any problem with the function. The first x label refers to the second sample (after clustering), the second x label refers to the fourth sample (after clustering), and so on.
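To make the counting argument concrete, here is a small base R sketch on simulated data. It does not call cimDiablo; the sample IDs, the random group labels and the shuffled `col_order` (a stand-in for the column order after clustering) are all assumptions made for illustration. The point is only that, when every second label is drawn, the k-th visible label belongs to column 2k, and the group should be looked up by sample name rather than by label position.

```r
set.seed(1)  # toy example only
n <- 92

# Hypothetical sample IDs and random group labels.
sample_ids <- sprintf("S%03d", seq_len(n))
Y_toy <- factor(sample(c("L", "H"), n, replace = TRUE), levels = c("L", "H"))
names(Y_toy) <- sample_ids

# Stand-in for the column order produced by clustering.
col_order <- sample(sample_ids)

# If only every second label is drawn, the k-th visible label
# sits over column 2k of the heat map.
visible_pos <- seq(2, n, by = 2)
visible_labels <- col_order[visible_pos]

# Look up each visible label by name, not by its position among the labels.
lookup <- data.frame(
  column = visible_pos,
  sample = visible_labels,
  group  = as.character(Y_toy[visible_labels])
)
head(lookup, 3)
```

With 92 columns this gives 46 visible labels, matching the tick count in the plot discussed above.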

Let me know if this doesn’t make sense.