Response vector/matrix in mint.spls?

Hello,
We are looking to run mixOmics on RNAseq and proteomics data we have from an experiment. We have n=6 RNAseq (3 ctrl, 3 treatment) and n=10 proteomics (5 ctrl, 5 treatment). As I understand, P-integration over shared features is suitable where there are different numbers of samples.

We are looking to do an unsupervised analysis to determine correlated variables (ultimately we want to compare ctrl vs treatment over both datasets). Is it right that, as the mixmint page suggests, MINT with canonical analysis is best in this case?

Both mint.pls and mint.spls require a continuous Y response vector/matrix, so if X = rbind(RNAseq, proteomics), how should Y be specified?
The help page for mint.spls shows an example that creates an artificial Y by taking the first gene. Does it matter which gene this is, or is there a better way to construct the response variable?

I did test X=RNAseq and Y=proteomics but I got an error due to unequal rows in X and Y (6 vs 10). Any advice appreciated, thank you in advance!

Hi @g.eisel858,

I don't think MINT is suited to your case: MINT assumes the same features are measured in every study, but you have RNA-seq and proteomics?
The setup would be a concatenated matrix X (e.g. different studies, same genes), and Y would be the outcome of interest (ctrl / trt). This is for mint.splsda.

For mint.block.splsda you would need the same N samples across the omics, so I don't think this is appropriate in your case.

Kim-Anh

Hi Kim-Anh, thanks for your reply,
Yes, I was thinking of subsetting both RNAseq and proteomics to the set of shared features (about 6000), since the sample numbers do not match.
I will try mint.splsda as you describe and see how it looks. Am I correct in assuming that mixOmics has no other models that would be more appropriate, given the data we have?
Thank you again
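
For anyone following along, the setup discussed above (shared features, stacked samples, a ctrl/trt outcome and a study label) might be laid out roughly as below. This is only a minimal sketch on simulated data, not a recommendation: Kim-Anh's caveat that MINT may not suit RNA-seq plus proteomics still applies. The object names (`rna`, `prot`, `shared`), the feature IDs, the assumption that both datasets already use a common identifier, the sample order (ctrl first, then treatment) and the `ncomp` / `keepX` values are all placeholders rather than anything from the real experiment.

```r
# Simulated stand-ins for the real data (samples in rows, features in columns).
set.seed(1)
rna  <- matrix(rnorm(6 * 8), nrow = 6,
               dimnames = list(paste0("rna_", 1:6), paste0("g", 1:8)))
prot <- matrix(rnorm(10 * 7), nrow = 10,
               dimnames = list(paste0("prot_", 1:10), paste0("g", 3:9)))

# Keep only the features measured in both datasets, in the same column order.
# Assumes gene and protein IDs were already mapped to a common identifier.
shared <- intersect(colnames(rna), colnames(prot))

# Stack the samples: 6 + 10 = 16 rows over the shared features.
X <- rbind(rna[, shared], prot[, shared])

# Outcome and study membership, one entry per row of X.
# Assumes ctrl samples come first, then treatment, within each dataset.
Y     <- factor(c(rep(c("ctrl", "trt"), each = 3),
                  rep(c("ctrl", "trt"), each = 5)))
study <- factor(c(rep("rnaseq", 6), rep("proteomics", 10)))

# Fit only if mixOmics is installed; ncomp and keepX are placeholder values.
if (requireNamespace("mixOmics", quietly = TRUE)) {
  fit <- mixOmics::mint.splsda(X = X, Y = Y, study = study,
                               ncomp = 2, keepX = c(5, 5))
}
```

Note that Y here is a factor (the class labels), which is what the discriminant variant expects, so no artificial continuous response is needed.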

Dear @g.eisel858,

I confirm that most of our integrative tools require either the same samples, or the same features. You can have a look at the concept of mosaic integration developed by others, but it requires more datasets and samples.

Kim-Anh