Batch effect removal



Hi! I’m new to bioinformatics and I’m working with 6 different RNA-seq (high-throughput) studies from GEO. 3 of the GSEs contain gene expression for tumors and the other 3 contain healthy tissue.

I’m going to do batch correction, and I’m wondering: do I merge all datasets first and then do normalization and batch correction on everything together? Or do I merge the 3 tumor GSEs and do normalization and batch correction on that merged dataset, do the same separately for the 3 healthy-tissue GSEs, and then merge the two results, if that makes sense?





I’m going to do batch correction

No, you don’t. You cannot randomly collect datasets and expect to run some stats magic that makes them comparable. You need identical wet-lab processing for a fair comparison; otherwise batch effects obscure the results. You cannot correct for them here because each batch (= each study) is nested within a condition (tumor/normal): every study contains only tumor or only normal samples, so any difference between conditions is indistinguishable from a difference between studies. This is a very common problem, and the only way around it is to either find a study that produced cases and controls in one go, or generate the data yourself with a proper study design. With the data above you have a fully confounded design, and there is nothing you can do about it.
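To make the confounding concrete, here is a minimal sketch (not a batch-correction method) that checks whether condition can be separated from batch at all. It builds the design matrix a linear model would use and compares ranks: if adding the condition column does not raise the rank, condition is a linear combination of the batch columns. The study labels `GSE_A` to `GSE_F`, the two-samples-per-study layout, and the helper names `one_hot` and `is_confounded` are all made up for illustration.

```python
import numpy as np

def one_hot(labels):
    """Indicator matrix with one column per distinct label (sorted order)."""
    levels = sorted(set(labels))
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels]
                     for lab in labels])

def is_confounded(batch, condition):
    """True if the condition columns add no rank on top of intercept + batch."""
    n = len(batch)
    intercept = np.ones((n, 1))
    batch_cols = one_hot(batch)[:, 1:]          # first level is the reference
    condition_cols = one_hot(condition)[:, 1:]  # first level is the reference
    batch_only = np.hstack([intercept, batch_cols])
    full = np.hstack([batch_only, condition_cols])
    return np.linalg.matrix_rank(full) == np.linalg.matrix_rank(batch_only)

# Hypothetical layout matching the question: 6 studies, 2 samples each,
# three studies are tumor-only and three are normal-only.
studies = ["GSE_A", "GSE_B", "GSE_C", "GSE_D", "GSE_E", "GSE_F"]
batch = [s for s in studies for _ in range(2)]
condition = ["tumor"] * 6 + ["normal"] * 6
nested = is_confounded(batch, condition)

# Hypothetical well-designed alternative: every study has both conditions.
batch_ok = ["GSE_A", "GSE_A", "GSE_B", "GSE_B"]
condition_ok = ["tumor", "normal", "tumor", "normal"]
balanced = is_confounded(batch_ok, condition_ok)

print("nested design confounded:", nested)
print("balanced design confounded:", balanced)
```

In the nested layout the tumor indicator is just the sum of the three tumor-study indicators, so no model can estimate a tumor-vs-normal effect once study is accounted for. In the second layout each study contributes both conditions, which is what makes the condition effect estimable.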

Oh, I see you asked this before and the answer was the same:

Batch effects

Difference between dataset analysis
