What is the best way to clean bulk RNA-seq data?


As far as I know, there isn’t a universally agreed-upon threshold or approach for cleaning bulk RNA-seq data. I want to remove the genes that don’t contribute (in other words, the noise genes) BEFORE I normalize the data with CPM, TPM, or any other approach.

I picked the threshold arbitrarily, and I tried not to set it too high so that I don’t delete important genes that might have informative value. This is my code:

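  # TRUE where a gene's value in a sample is above 0.5
  thresh = data > 0.5
  # rowSums counts samples, so >= 1.5 is the same as "at least 2 samples"
  keep = rowSums(thresh) >= 1.5
  data = data[keep, ]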
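To make the effect of this filter concrete, here is a minimal sketch on a toy count matrix. Everything in it is made up for illustration: the matrix values, the row named `rest` (a single row standing in for the remainder of a roughly 20-million-read library), and the names `counts`, `lib_size`, and `cpm_vals`. It also includes a CPM-scale variant of the same rule, on the assumption that a cutoff on raw counts means different things at different sequencing depths. CPM is used here only to decide which rows to keep, so the retained data are still raw counts.

```r
# Toy count matrix (made-up values): 4 rows x 3 samples.
# "rest" stands in for all other reads in a ~20M-read library.
counts = matrix(
  c(0,   0,   1,
    0,   3,   0,
    5,   6,   0,
    2e7, 2e7, 2e7),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("gene1", "gene2", "gene3", "rest"),
                  c("s1", "s2", "s3"))
)

# Filter from the post: value > 0.5 in at least 2 samples
keep_raw = rowSums(counts > 0.5) >= 2
filtered_raw = counts[keep_raw, , drop = FALSE]

# Same rule on the CPM scale: divide each column by its library size
lib_size = colSums(counts)
cpm_vals = t(t(counts) / lib_size) * 1e6
keep_cpm = rowSums(cpm_vals > 0.5) >= 2
filtered_cpm = counts[keep_cpm, , drop = FALSE]  # still raw counts

print(rownames(filtered_raw))  # gene3 and rest pass the raw-count rule
print(rownames(filtered_cpm))  # only rest passes the CPM rule
```

On raw counts, `gene3` passes with only 5 and 6 reads, while on the CPM scale the same 0.5 cutoff corresponds to about 10 reads in a library of this size, so `gene3` is dropped.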

What do you think? Thanks!


Tags: normalization, TPM, r
