Integration with genotype data

Hello,

I would like to use mixOmics, in particular DIABLO, on my data. One of my data types is genotypes. In the paper you wrote:

Genotype data, such as bi-allelic Single Nucleotide Polymorphism coded as counts of the minor allele can also fit in our framework, by implicitly considering an additive model.

It is not clear to me how to recode my data. Could you please explain?

Thanks

Joanna

Dear Joanna,

We do not consider SNP data as categories, but rather as count data (counting the number of reference alleles for each SNP). This means that if you code your SNPs as {0,1,2}, you make the implicit assumption of an additive genetic model (there is a uniform, linear increase in risk for each copy of the reference allele).
So far our models have not been very successful in selecting relevant SNPs, simply because each SNP has a small effect, so don't hesitate to select a large number of them, in line with a polygenic model.
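
As a rough illustration of the {0,1,2} coding described above, here is a minimal base R sketch. The toy genotype table, the helper count_allele() and the choice of which allele to count for each SNP are all made up for the example; in practice you would count the allele your pipeline defines as the minor (or reference) allele.

```r
# Hypothetical toy data: rows are samples, columns are SNPs,
# entries are two-letter genotype strings.
geno <- data.frame(
  snp1 = c("AA", "AG", "GG", "AG"),
  snp2 = c("CC", "CC", "CT", "TT"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of copies of `allele` in each genotype string,
# which gives a value in {0, 1, 2}.
count_allele <- function(genotypes, allele) {
  vapply(strsplit(genotypes, ""), function(a) sum(a == allele), numeric(1))
}

# Assumed allele to count for each SNP (illustration only).
counted <- c(snp1 = "G", snp2 = "T")

# Numeric samples x SNPs matrix, usable as one block of X.
X_snp <- sapply(names(counted), function(s) count_allele(geno[[s]], counted[[s]]))
rownames(X_snp) <- paste0("sample", seq_len(nrow(geno)))
```

The resulting numeric matrix can then be passed as one of the blocks in X, alongside the other data types.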

Hope that helps,

Regards,
Kim-Anh

Hi Kim Anh,

I just wanted to follow up on this post because I am also trying to use DIABLO to integrate spectral data and genotypes (50k) to predict the fertility of dairy cows via block.splsda.
The model ran fine but it produced the following warning:

fert.model = block.splsda(X = train.dat, Y = y.train, ncomp = 15,
                          design = design)

Design matrix has changed to include Y; each block will be
linked to Y.
Warning message:
In cor(A[[k]], variates.A[[k]]) : the standard deviation is zero

Is this anything to worry about?

Thanks,
Phuong

Dear @ph78,

The warning message seems to indicate that the residual matrices are becoming empty. Your ncomp value is very high and probably not needed (usually ncomp ~ K-1, where K is the number of categories in Y). With this kind of method, it is also best to keep ncomp small for easier interpretation.
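
A small base R sketch of the K-1 rule of thumb follows. The objects below are made-up stand-ins for train.dat and y.train, and the two class labels are invented. The check for constant columns is an extra assumption rather than a confirmed cause here: cor() gives the same "standard deviation is zero" warning whenever a column has no variation, for example a monomorphic SNP.

```r
# Hypothetical stand-ins for the objects used in the question.
y.train <- factor(rep(c("fertile", "subfertile"), each = 5))
train.dat <- list(
  spectra  = matrix(seq_len(20), nrow = 10),
  genotype = cbind(snp1 = c(0, 1, 2, 1, 0, 2, 1, 0, 1, 2),
                   snp2 = rep(0, 10))   # monomorphic SNP: no variation
)

# Rule of thumb from the reply: ncomp ~ K - 1, with K the number of classes in Y.
K <- nlevels(y.train)
ncomp <- max(1, K - 1)

# Assumption: count the columns with zero standard deviation in each block,
# since these trigger the same warning from cor().
n_constant <- sapply(train.dat, function(block) sum(apply(block, 2, sd) == 0))

# The model would then be refitted with the smaller ncomp, e.g.
# fert.model <- block.splsda(X = train.dat, Y = y.train, ncomp = ncomp,
#                            design = design)
```

If n_constant is non-zero for a block, removing those columns before fitting is one option to try.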

Kim-Anh