Which counts to use for RNA-seq heatmap and PCA?


Hi,

I have RNA-seq data that I would like to visualise with a PCA plot and a heatmap. I am wondering whether I should use normalised counts or log-transformed normalised counts for this.

I have generated TMM-normalised counts per million (CPM) in edgeR as follows:

y <- calcNormFactors(y)
tmm <- edgeR::cpm(y)

I have also generated log2-transformed TMM-normalised CPMs:

tmm_log <- edgeR::cpm(y, log = TRUE, prior.count = 1)

Is it better to use the plain normalised CPMs or the log-transformed normalised CPMs as input for the PCA plot and the heatmap? The plots look better when I use the log-transformed values, but I am not sure whether this is the correct approach.
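To make the comparison concrete, here is a minimal base R sketch of how the two inputs could be put side by side. It is an illustration under stated assumptions, not the analysis from this post: the counts are simulated, all object names (`counts`, `cpm_plain`, `cpm_log`, `var_share`, `top_genes`) are made up, and the hand-computed CPM and `log2(CPM + 1)` are simplified stand-ins for `edgeR::cpm(y)` and `edgeR::cpm(y, log = TRUE, prior.count = 1)` (no TMM factors are applied here).

```r
# Simulated data only: 200 genes x 6 samples in two hypothetical groups.
set.seed(1)
n_genes <- 200
group <- rep(c("A", "B"), each = 3)
base_mean <- 2^runif(n_genes, 0, 12)            # wide range of expression levels
mu <- matrix(base_mean, n_genes, 6)
mu[1:20, group == "B"] <- mu[1:20, group == "B"] * 4   # 20 genes differ between groups
counts <- matrix(rnbinom(n_genes * 6, mu = mu, size = 10), n_genes, 6,
                 dimnames = list(paste0("gene", 1:n_genes), paste0(group, 1:3)))

# Simplified stand-ins for the two matrices in the post (assumption: no TMM scaling).
cpm_plain <- t(t(counts) / colSums(counts)) * 1e6
cpm_log   <- log2(cpm_plain + 1)

# Share of the total per-gene variance contributed by the single most variable gene.
var_share <- function(m) {
  v <- apply(m, 1, var)
  max(v) / sum(v)
}
var_share(cpm_plain)   # on the raw scale a few highly expressed genes dominate
var_share(cpm_log)     # on the log scale the variance is spread over more genes

# PCA with samples as rows; prcomp centres each gene by default.
pca_plain <- prcomp(t(cpm_plain))
pca_log   <- prcomp(t(cpm_log))
summary(pca_plain)$importance[2, 1:2]   # proportion of variance for PC1 and PC2
summary(pca_log)$importance[2, 1:2]

# Genes for a heatmap: the 30 most variable on the log scale.
top_genes <- order(apply(cpm_log, 1, var), decreasing = TRUE)[1:30]

if (interactive()) {
  plot(pca_log$x[, 1:2], col = factor(group), pch = 19)
  heatmap(cpm_log[top_genes, ], scale = "row")
}
```

With real data, `tmm` and `tmm_log` from the code above would take the place of `cpm_plain` and `cpm_log`. Comparing `var_share()` for the two matrices gives a rough indication of how strongly a handful of highly expressed genes drives the PCA and the heatmap clustering on each scale.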

Could someone please explain why you would/would not want to use log counts?

Many thanks,

Lucy

