r – RNA-Seq Data Heatmap: Is it necessary to do a log2 transformation of RPKM values before doing the Z-score standardisation?

I am making a heatmap using RNA-Seq data in R. The heatmap shows gene expression values (RPKM) in different brain regions. I have the following code:

library(tidyverse)
library(pheatmap)
library(matrixStats)

read_csv("prenatal_heatmap_data.csv") -> all_data

all_data %>%
   column_to_rownames("Brain Region") -> heatmap_data

heatmap_data %>%
   pheatmap()

This generates the following heatmap:

[Image: heatmap of the raw RPKM values]

I want to log2-transform the RPKM values before applying the Z-score standardisation. I log2-transformed the values using the following code:

heatmap_data %>%
   log2() -> heatmap_data_log2

heatmap_data_log2 %>%
   pheatmap()

However, when I try to create a heatmap using the log2-transformed values I get the following error:

[Image: error message]

I have looked at the log2-transformed data, and the error occurs because some of the RPKM values in the original dataset are 0. When these zeros are log-transformed, they become -Inf:

[Image: original data]

[Image: log2-transformed data]
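The -Inf problem can be reproduced on a small made-up matrix. The values and the names `toy_rpkm`, `toy_log2` and `toy_log2_pc` below are hypothetical and not taken from my real data. Adding a pseudocount of 1 before the log is shown only as one commonly used workaround, and I am assuming it would be acceptable here; I do not know whether it is the right choice for this dataset.

```r
# Toy RPKM matrix (made-up values): rows = brain regions, columns = genes
toy_rpkm <- matrix(
  c(0, 4, 8,
    2, 0, 16),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("RegionA", "RegionB"), c("Gene1", "Gene2", "Gene3"))
)

# Plain log2: every zero becomes -Inf
toy_log2 <- log2(toy_rpkm)
print(any(is.infinite(toy_log2)))

# Assumed workaround: add a pseudocount of 1 so that 0 maps to log2(1) = 0
toy_log2_pc <- log2(toy_rpkm + 1)
print(all(is.finite(toy_log2_pc)))
```

With the pseudocount, every value stays finite, so the transformed matrix no longer contains the -Inf entries that appear in my log2-transformed data.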

I am not sure how to overcome this issue. Is it necessary to log2-transform the RPKM values before doing the Z-score standardisation? I have seen that it is conventional to log2-transform the data first.
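To make concrete what I mean by Z-score standardisation after the log transform, here is a minimal sketch on made-up values. The matrix, the pseudocount of 1 and the variable names are all assumptions for illustration. It standardises each gene (column) across brain regions with base R `scale()`; as far as I understand, `pheatmap()` also has a `scale` argument that does the same thing internally.

```r
# Toy RPKM matrix (made-up values): rows = brain regions, columns = genes
toy_rpkm <- matrix(
  c(0, 4, 8,
    2, 0, 16,
    6, 2, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("RegionA", "RegionB", "RegionC"),
                  c("Gene1", "Gene2", "Gene3"))
)

# Assumed pseudocount of 1 to avoid log2(0) = -Inf
toy_log2_pc <- log2(toy_rpkm + 1)

# scale() centres each column to mean 0 and scales it to standard deviation 1,
# i.e. a Z-score per gene across brain regions
toy_z <- scale(toy_log2_pc)

print(round(colMeans(toy_z), 10))  # column means after standardisation
print(apply(toy_z, 2, sd))         # column standard deviations

# Alternative (not run here): let pheatmap standardise the columns itself
# pheatmap(toy_log2_pc, scale = "column")
```

Note that a gene with identical values in every region has a standard deviation of 0, so its Z-scores would be NaN; such columns would need to be removed first.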

Any advice is appreciated.
