DGE/DEG Analysis for comparing multiple cell lines

Hello community,

I’m relatively new to differential gene expression (DGE/DEG) analysis of RNA-Seq data, and I’ve seen that DESeq2 is one of the go-to tools for it. I am a bit confused about the list of genes I am obtaining and about which normalization/transformation method is best to use (variance stabilizing vs rlog).

My experimental design consists of two cell lines. Each cell line has a control group and a treated group (the same treatment for both lines), with three replicates per group.

Experimental Design

sample    condition
NALM6     control (x3)
NALM6     treatment (x3)
SEM       control (x3)
SEM       treatment (x3)

Code for Experimental Design

expDesign <- data.frame(
  row.names = colnames(geneCounts),
  sample = rep(c("NALM6", "SEM"), each = 6),
  condition = c(rep("control", 3), rep("treatment", 3), rep("control", 3), rep("treatment", 3))
)

Code for Running DESeq2

# Constructing the DESeq2 object
dds <- DESeqDataSetFromMatrix(countData = geneCountsMat, 
                              colData = expDesign, 
                              design = ~ condition)

# Running DESeq on the dds object created above
dds <- DESeq(dds)
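
Regarding the vst vs rlog part of my question: as far as I understand from the DESeq2 vignette, both are transformations meant for visualization and clustering (PCA, heatmaps), while DESeq() itself works on the raw counts. A minimal sketch of how I would compare them, assuming the `dds` object from above (the names `vsd` and `rld` are just my own placeholders):

```r
library(DESeq2)

# Transformed values are for exploration only; they are not fed back into DESeq().
# blind = FALSE lets the transformation use the design stored in dds.
vsd <- vst(dds, blind = FALSE)   # variance stabilizing transformation
rld <- rlog(dds, blind = FALSE)  # regularized log, slower on larger datasets

# PCA coloured by cell line and condition, to see which factor dominates.
plotPCA(vsd, intgroup = c("sample", "condition"))
plotPCA(rld, intgroup = c("sample", "condition"))
```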

Results Example

DataFrame with 10 rows and 2 columns
                log2FoldChange         padj
                     <numeric>    <numeric>
ENSG00000196230       -2.31206  0.00000e+00
ENSG00000112972       -1.96868  0.00000e+00
ENSG00000182831        1.94195  0.00000e+00
ENSG00000116830       -1.35854 4.06128e-141
ENSG00000111602       -2.61308 5.67284e-136

Questions

  1. Does DESeq2 distinguish between the cell lines with this design, or should I run DESeq2 separately for each cell line (control vs treatment)?
  2. How can I find out which genes are most differentially expressed in response to treatment within each cell line?
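
One option I have seen described in the DESeq2 vignette (the section on interactions) is to combine cell line and condition into a single grouping factor and then pull out one contrast per cell line. Below is a rough sketch of what I think that would look like with my objects. The column name `group` and the result names `resNALM6` and `resSEM` are my own placeholders, and I am not sure this is the recommended approach for my case:

```r
library(DESeq2)

# Assumes dds was built from geneCountsMat and expDesign as shown above,
# so its colData has the columns `sample` (cell line) and `condition`.

# Combine cell line and condition into one factor with four levels:
# NALM6_control, NALM6_treatment, SEM_control, SEM_treatment.
dds$group <- factor(paste(dds$sample, dds$condition, sep = "_"))
design(dds) <- ~ group
dds <- DESeq(dds)

# Treatment vs control within each cell line
# (contrast = c(factor, numerator level, denominator level)).
resNALM6 <- results(dds, contrast = c("group", "NALM6_treatment", "NALM6_control"))
resSEM   <- results(dds, contrast = c("group", "SEM_treatment", "SEM_control"))

# Rank by adjusted p-value to list the strongest hits per cell line.
head(resNALM6[order(resNALM6$padj), c("log2FoldChange", "padj")], 10)
head(resSEM[order(resSEM$padj), c("log2FoldChange", "padj")], 10)
```

With the current `design = ~ condition`, my understanding is that the cell line is not part of the model at all, so the results above are a single treatment vs control comparison pooled over both lines.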
