Question on CLR transformation of composition


I have a naive question regarding the CLR transformation for compositional data, more specifically for microbiome relative abundance data.

I have normalised the raw counts of my OTUs (aggregated by genus) with TSS (total sum scaling), and I would now like to apply a CLR (centered log-ratio) transformation to them.

In this OTU dataset, I have an “Unclassified” column corresponding to genera with no taxonomic assignment. I am not interested in this column, so I am wondering whether I should perform the CLR before or after removing it.

My feeling is that I should perform the CLR on the “complete” compositions, but I wonder whether doing it on a sub-composition would affect the analysis, since a proper CoDA (compositional data analysis) should be sub-compositionally coherent…

Thank you for your help :slight_smile:
Florent Guinot

Hi @florent_guinot,

This is a tough call: if you want to respect subcompositional coherence, then you need to keep the taxa that have not been filtered out. What we do is stay at the OTU level, to avoid this issue of ‘missing out’. I would advise you to keep this Unclassified column. If that column is selected later on, then you should probably question your assumption that it is not ‘interesting’! If it is not selected, then business as usual.
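
To illustrate why the order matters, here is a minimal sketch in plain Python (not mixOmics code). The taxa names and counts are made up, and the `clr` helper is a hypothetical function written for this example. It compares "CLR, then drop Unclassified" with "drop Unclassified, then CLR" for a single sample:

```python
import math

def clr(values):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

# Hypothetical genus-level counts for one sample (all strictly positive).
# The last entry plays the role of the "Unclassified" column.
taxa = ["GenusA", "GenusB", "GenusC", "Unclassified"]
counts = [120, 30, 45, 300]

clr_then_drop = clr(counts)[:-1]   # CLR on the full composition, then drop the column
drop_then_clr = clr(counts[:-1])   # drop the column first, then CLR on the sub-composition

# Per-taxon difference between the two orders of operations.
shifts = [a - b for a, b in zip(clr_then_drop, drop_then_clr)]

# The two results differ, but only by the same constant for every retained taxon.
constant_shift = all(math.isclose(s, shifts[0], abs_tol=1e-12) for s in shifts)
values_differ = not math.isclose(shifts[0], 0.0, abs_tol=1e-12)

for name, a, b in zip(taxa, clr_then_drop, drop_then_clr):
    print(f"{name}: full={a:.4f} sub={b:.4f}")
print("constant shift within the sample:", constant_shift)
print("CLR values differ between the two orders:", values_differ)
```

Within one sample the log-ratios between the retained genera are unchanged, but the CLR values themselves are shifted by a constant that depends on that sample's Unclassified count. Since this shift differs from sample to sample, any analysis run on the CLR values across samples can give different results depending on which order you choose.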

Also note that if you apply the CLR, you do not need to apply TSS first. The CLR is unchanged when a sample is rescaled by a constant, so TSS is a redundant step. We will update our website tutorial to reflect this.
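
A quick way to check this is the following sketch in plain Python (not mixOmics code). The counts are made up, and `tss` and `clr` are hypothetical helpers written for this example. It assumes all counts are strictly positive, so any offset for zeros would have to be added to the counts beforehand:

```python
import math

def tss(counts):
    """Total sum scaling: divide each count by the sample total."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(values):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

# Hypothetical counts for one sample (no zeros).
counts = [120, 30, 45, 5]

clr_raw = clr(counts)        # CLR applied directly to the counts
clr_tss = clr(tss(counts))   # TSS first, then CLR

# Dividing by the sample total subtracts log(total) from every log value,
# and the same constant is removed again when the mean log is subtracted.
same = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(clr_raw, clr_tss))
print("CLR(counts) equals CLR(TSS(counts)):", same)
```

Both routes give the same values up to floating-point rounding, which is why the TSS step can be skipped when the CLR follows.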


Thank you Kim-Anh for this clear answer :slight_smile:

Hi @florent_guinot,
We have updated the website accordingly!