Dear forum members,
I am new to mixOmics and I am wondering whether it is good practice to use continuous data wherever possible or whether, on the contrary, it is better to keep categorical/binary data as it is.
I mean, I have some data that would normally be coded as categorical (or binary) values but that can also be coded as continuous values. This is the case for genomic variants, which can be coded as 1 or 0 depending on whether the gene of interest is mutated or not, but can also be modeled as allele frequencies ranging between 0 and 1.
Another example is copy number variations (CNV), which can take values such as 0 (absence), 1 (one copy), 2 (diploid, so the normal state in humans), 3 (one extra copy), 4 (two extra copies), and so on. These values can be transformed into log2 ratios, making them continuous too.
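To make the CNV case concrete, here is a rough sketch of the kind of conversion I have in mind. It is written in Python purely for illustration (it is not mixOmics code), and it assumes a diploid reference of 2 copies and a pseudocount of 0.5 to stand in for zero copies, since log2(0) is undefined. The function name and both constants are my own placeholders; in practice log2 ratios usually come from read depth or probe intensity rather than from integer calls.

```python
import math

# Assumption: the reference (normal) state is diploid, i.e. 2 copies.
REFERENCE_COPIES = 2
# Assumption: zero copies is replaced by this value to avoid log2(0).
PSEUDOCOUNT = 0.5


def copy_number_to_log2_ratio(copies, reference=REFERENCE_COPIES,
                              pseudocount=PSEUDOCOUNT):
    """Convert an integer copy number call into a log2 ratio vs. the reference."""
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    # Substitute the pseudocount only when the segment is fully deleted.
    adjusted = copies if copies > 0 else pseudocount
    return math.log2(adjusted / reference)


if __name__ == "__main__":
    # Categorical coding next to the continuous coding for the same calls.
    for copies in [0, 1, 2, 3, 4]:
        print(copies, round(copy_number_to_log2_ratio(copies), 3))
```

With this coding the diploid state maps to 0, losses become negative, and gains become positive, so the values are centered on the normal state rather than being ordered categories.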
So here’s my question: what’s more desirable? Is there any specific type of data modality that we should avoid?
Thank you!