TCGAbiolinks: which normalization before differential expression analysis (legacy=TRUE vs. legacy=FALSE)

Dear All,

I am following the TCGAbiolinks tutorial for differential expression analysis of TCGA data (the “TCGAanalyze: Analyze data from TCGA” section), and I have two questions about it.

1) I don’t understand the following. For legacy=TRUE data (platform = "Illumina HiSeq", file.type = "results"), the tutorial normalizes for gene length (TCGAanalyze_Normalization with its default parameters). For legacy=FALSE data (workflow.type = "HTSeq - Counts"), it normalizes for GC content instead (TCGAanalyze_Normalization with method = "gcContent"). What is the reason for this difference?
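To make the comparison concrete, the two normalization calls I am asking about look roughly like this. It is a sketch rather than the tutorial's exact code: `counts` is a hypothetical genes x samples matrix of raw counts, the two wrapper functions are my own names, and I am assuming that `geneInfo` and `geneInfoHT` are the annotation tables shipped with the installed TCGAbiolinks version.

```r
library(TCGAbiolinks)

# Hypothetical wrapper for legacy=TRUE data: default method, i.e. gene-length correction.
# `geneInfo` is assumed to be the annotation table bundled with TCGAbiolinks.
normalize_legacy <- function(counts) {
  TCGAanalyze_Normalization(
    tabDF = counts,
    geneInfo = geneInfo,
    method = "geneLength"
  )
}

# Hypothetical wrapper for legacy=FALSE (HTSeq - Counts) data: GC-content correction.
# `geneInfoHT` is assumed to be the annotation table for harmonized data.
normalize_harmonized <- function(counts) {
  TCGAanalyze_Normalization(
    tabDF = counts,
    geneInfo = geneInfoHT,
    method = "gcContent"
  )
}
```

Nothing is downloaded or computed until one of the functions is called with a real count matrix.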

2) If I want to use the TCGAanalyze_DEA function with pipeline = "limma", should I use the same normalization method as for pipeline = "edgeR"? If not, which one should I use for legacy=FALSE and for legacy=TRUE data, respectively?
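For this second question, the call I have in mind is sketched below. The wrapper `run_dea`, the matrices `normal_mat` and `tumor_mat`, and the condition labels are hypothetical placeholders. All other arguments of TCGAanalyze_DEA are left at their defaults, which may not be appropriate for limma and should be checked against the installed version's documentation.

```r
library(TCGAbiolinks)

# Hypothetical wrapper: runs the same comparison with either pipeline.
# `normal_mat` and `tumor_mat` are assumed to be normalized and filtered
# genes x samples matrices for the two conditions.
run_dea <- function(normal_mat, tumor_mat, pipeline = c("edgeR", "limma")) {
  pipeline <- match.arg(pipeline)  # rejects anything other than the two options
  TCGAanalyze_DEA(
    mat1 = normal_mat,
    mat2 = tumor_mat,
    Cond1type = "Normal",
    Cond2type = "Tumor",
    pipeline = pipeline
  )
}
```

Calling `run_dea(normal_mat, tumor_mat, "limma")` and `run_dea(normal_mat, tumor_mat, "edgeR")` on the same input would then differ only in the pipeline argument.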

I hope you can help. Thanks in advance!

Erica


TCGAbiolinks


limma


TCGA


RNA-seq


normalization
