Hi mixOmics team!
I’m using the sPCA function on a microbiota dataset consisting of 153 genera and 158 samples. It runs successfully with 2 components of interest, but I wanted to double-check that I’m interpreting the elements of the final.spca output correctly:
- the variable ‘X’ represents each sample’s original (mean-centred?) abundance of each genus
- the variable ‘loadings’ represents the weights (or coefficients) assigned to each of the genera to determine their contribution to each of the given components
I’m following a paper’s methods for generating scores on PCs which states that:
“a participant’s score for PC1 = (their loading of genus 1 on PC1 x their relative abundance of genus 1) + (their loading of genus 2 on PC1 x their relative abundance of genus 2) + etc.” - and then the whole process is repeated for the other PCs. I suspect the wording is a little misleading: “their loading of genus 1 on PC1” makes it sound as though every sample has its own loading, which I don’t think is the case. Rather, I think it’s “the” loading of genus 1 on PC1 (one value per genus, shared by all samples), which is then multiplied by each individual sample’s relative abundance of that genus. Does that sound right?
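To make my reading of the formula concrete, here is a tiny sketch in Python with made-up numbers. The names (loadings_pc1, abundances, pc_score) are hypothetical and none of this is actual mixOmics output; it only shows the weighted sum as I understand it, with one loading per genus applied to every sample's abundances.

```python
# Toy illustration only: names and numbers are made up, not mixOmics output.
# One loading per genus for PC1, shared by all samples.
loadings_pc1 = [0.5, 0.0, -0.5]  # genus 1, genus 2, genus 3

# Rows are samples, columns are genera (assumed already centred if needed).
abundances = [
    [2.0, 1.0, 0.0],  # sample A
    [0.0, 3.0, 4.0],  # sample B
]

def pc_score(sample_row, loadings):
    """Sum over genera of (loading of the genus) x (sample's abundance of it)."""
    return sum(w * x for w, x in zip(loadings, sample_row))

# One score per sample for PC1; repeat with another loading vector for PC2.
scores_pc1 = [pc_score(row, loadings_pc1) for row in abundances]
print(scores_pc1)
```

If this reading is right, a genus with a loading of zero (as sPCA can produce) simply drops out of every sample's score.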
I’m also unsure what the variable ‘variates’ is for. The glossary states that “variates are essentially synonymous with components” in the context of CCA. Does this mean I should ignore this variable since I’m running sPCA? I also noticed that the values in ‘variates’ are not the same as those in ‘loadings’, so I just wanted to check what each variable represents in this case.
I hope all of my questions make sense!
Thanks so much in advance.
DJ