What cutoff is used to define high or low gene expression for survival analysis?


Hi everyone

In RNA-seq analysis, we need to separate samples into two groups for survival analysis. How can I define a high or low expression level for a gene from counts or FPKM? Should I use the median, the average, or a quantile?

In TCGA or Oncomine, how do they define the cutoff for a gene?




There’s no definitive answer to your question. I would not advise going by the median or average. Quantiles are a reasonable idea, for example tertiles, with the top third being regarded as “high expression”.
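As a rough illustration of the tertile idea, here is a minimal Python sketch using only the standard library. The sample IDs and expression values are made up, the function name `tertile_groups` is hypothetical, and the choice to treat values exactly on a cut point as the lower group is an assumption, not something TCGA or Oncomine prescribes.

```python
import statistics

def tertile_groups(expression):
    """Label each sample 'low', 'mid', or 'high' by expression tertile.

    `expression` maps sample ID -> expression value for ONE gene
    (e.g. FPKM or normalised counts; hypothetical input format).
    """
    # n=3 returns the two cut points that split the data into thirds.
    lower_cut, upper_cut = statistics.quantiles(list(expression.values()), n=3)
    groups = {}
    for sample, value in expression.items():
        if value > upper_cut:
            groups[sample] = "high"   # top third
        elif value <= lower_cut:
            groups[sample] = "low"    # bottom third
        else:
            groups[sample] = "mid"    # middle third
    return groups

# Made-up expression values for a single gene across six samples.
example_expression = {
    "S1": 2.1, "S2": 5.4, "S3": 0.8,
    "S4": 9.7, "S5": 3.3, "S6": 7.2,
}
example_groups = tertile_groups(example_expression)
print(example_groups)
```

For a two-group survival comparison, one option is to compare "high" against "low" and leave the middle third out, or to merge "mid" with "low"; which of these is appropriate depends on the study.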

An even better idea would be to convert your data to the Z scale, i.e., standard deviations from the mean, and then try an absolute Z-score of 3, 4, 5, or 6 (that many standard deviations above or below the mean) as potential cut-offs. I trust that you have QC’d your data already and that low-count transcripts have been removed.
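The Z-scale approach could be sketched as follows, again with only the Python standard library. The function names, the labels, and the example values are hypothetical; using the sample standard deviation and feeding in log-scale values are assumptions on my part, since the thread does not specify either.

```python
import statistics

def z_scores(values):
    """Convert values to Z-scores: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    return [(v - mean) / sd for v in values]

def classify_by_z(expression, cutoff=3.0):
    """Label samples 'high' (z >= cutoff), 'low' (z <= -cutoff), else 'neither'.

    `expression` maps sample ID -> expression value for ONE gene,
    assumed here to be on a log scale (hypothetical input format).
    """
    samples = list(expression)
    scores = z_scores([expression[s] for s in samples])
    labels = {}
    for sample, z in zip(samples, scores):
        if z >= cutoff:
            labels[sample] = "high"
        elif z <= -cutoff:
            labels[sample] = "low"
        else:
            labels[sample] = "neither"
    return labels

# Made-up data: twenty samples near 5.0-5.2 plus one strongly expressed sample.
example_values = {f"S{i}": 5.0 + 0.1 * (i % 3) for i in range(20)}
example_values["S_out"] = 15.0
example_labels = classify_by_z(example_values, cutoff=3.0)
print([s for s, label in example_labels.items() if label == "high"])
```

Note that a strict cutoff such as |Z| of 3 or more can leave very few samples in the "high" or "low" group, and with fewer than about a dozen samples no value can reach it at all, so it is worth checking group sizes before running the survival analysis.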
