Using DESeq on CLR-transformed counts


I want to perform differential abundance analysis on my CLR-transformed relative abundance table (MetaPhlAn output from shotgun data). At first I was just extracting the top taxa based on the PERMANOVA coefficients, but I was also considering DESeq, since this is what I usually use for differential abundance analysis. However, I have always used it with untransformed raw counts, and for this data I only have relative abundance information. Does it make sense to offset the CLR-transformed values by the minimum negative value so that they are all positive, and then feed them to DESeq as if they were raw counts? I tried this and it returned more or less the same taxa I got with PERMANOVA, but I wanted to know whether this is something you would advise.

Thank you!


In retrospect, I realize that my question is answered in this paper: "Microbiome Datasets Are Compositional: And This Is Not Optional" (Frontiers in Microbiology), and that using DESeq for differential abundance is not appropriate for CLR-transformed counts.


hi @ange,
The CLR transformation addresses the compositionality of the data, a characteristic that DESeq ignores. Adding an offset just to meet DESeq's input requirements sounds a bit odd to me, although from what you say it does not seem to change the results drastically compared to PERMANOVA. It would be difficult to justify in a paper.
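To make the point about the offset concrete, here is a minimal Python sketch of a CLR transform followed by the "shift by the minimum" step. The toy sample and the pseudocount used to handle zeros are assumptions for illustration only, and the shift is applied to a single sample rather than a whole table.

```python
import math

def clr(sample, pseudocount=1e-6):
    """Centred log-ratio of one sample (a list of relative abundances).

    The pseudocount is an illustrative choice for handling zeros,
    not a recommended value.
    """
    logs = [math.log(x + pseudocount) for x in sample]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical relative abundances for one sample (four taxa).
sample = [0.60, 0.30, 0.10, 0.0]

clr_values = clr(sample)

# The offset from the question: subtract the minimum so nothing is negative.
shifted = [v - min(clr_values) for v in clr_values]

print("CLR values:", [round(v, 3) for v in clr_values])
print("Sum of CLR values:", round(sum(clr_values), 9))
print("Shifted values:", [round(v, 3) for v in shifted])
```

The CLR values are real-valued log-ratios that sum to zero within a sample. After the shift they are non-negative but still on a log-ratio scale and not integers, so they do not behave like the raw counts that DESeq expects.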

The best way would be to have access to the raw shotgun data, but then you would be comparing CLR + PERMANOVA against raw counts + DESeq, which is not a fair comparison. The other option would be to apply a linear or generalised linear model to the CLR data. I am not sure DESeq is really suitable for microbiome data (we show in the supplementary material how to use a linear model here; maybe that is useful, although there it is used to include covariates).
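As a rough illustration of the linear-model option, the sketch below fits an ordinary least squares model per taxon on CLR values, with a single binary group variable as the predictor. It is not the method from the supplementary material mentioned above. The toy table, the taxon names, the group labels, the pseudocount, and the helper names `clr` and `fit_taxon` are all hypothetical.

```python
import math

def clr(sample, pseudocount=1e-6):
    """Centred log-ratio of one sample; the pseudocount is an assumption."""
    logs = [math.log(x + pseudocount) for x in sample]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def fit_taxon(x, y):
    """Simple OLS of y on one predictor x; returns (slope, t statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    se = math.sqrt(sigma2 / sxx)                      # standard error of slope
    t_stat = slope / se if se > 0 else float("inf")
    return slope, t_stat

# Hypothetical relative abundance table: rows are samples, columns are taxa.
taxa = ["taxon_A", "taxon_B", "taxon_C"]
abundances = [
    [0.50, 0.30, 0.20],
    [0.55, 0.25, 0.20],
    [0.45, 0.35, 0.20],
    [0.20, 0.60, 0.20],
    [0.25, 0.55, 0.20],
    [0.15, 0.65, 0.20],
]
group = [0, 0, 0, 1, 1, 1]  # hypothetical binary condition per sample

clr_table = [clr(row) for row in abundances]

# One model per taxon: CLR value ~ group.
results = {
    name: fit_taxon(group, [row[j] for row in clr_table])
    for j, name in enumerate(taxa)
}

for name, (slope, t_stat) in results.items():
    print(f"{name}: slope={slope:.3f}, t={t_stat:.2f}")
```

With a binary predictor, the slope is simply the difference in mean CLR value between the two groups. A real analysis would use a statistics package to get p-values, add covariates as further predictors, and correct for multiple testing across taxa.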