Multifactorial Design for DESeq2

Hello community,

I would like to run a differential expression analysis to identify the genes that respond to treatment within each cell line. In other words, I want to estimate the treatment effect separately for each cell line and extract the corresponding DEGs.


Experimental Design:
This consists of a two-factorial design with factors: cell line (A vs B) and treatment (control vs treatment).

Research Question:
Does the treatment have a different effect in each cell line?

I want to compare the effect of treatment between the cell lines (A vs B).

Sample Table:

sample     cell_line  condition
A_Ctr_1    A          control
A_Ctr_2    A          control
A_Ctr_3    A          control
A_Met_1    A          treatment
A_Met_2    A          treatment
A_Met_3    A          treatment
B_Ctr_1    B          control
B_Ctr_2    B          control
B_Ctr_3    B          control
B_Met_1    B          treatment
B_Met_2    B          treatment
B_Met_3    B          treatment
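
For reference, the sample table can be written as an R data frame and passed to model.matrix to see which coefficients each design would estimate. This is only a sketch: the name expDesign matches the code in this post, but the factor levels and their order (A and control as reference) are assumptions based on the table.

```r
# Rebuild the sample table as a data frame (row names are the sample IDs).
# Reference levels are assumed to be "A" and "control".
expDesign <- data.frame(
  cell_line = factor(rep(c("A", "B"), each = 6), levels = c("A", "B")),
  condition = factor(rep(rep(c("control", "treatment"), each = 3), times = 2),
                     levels = c("control", "treatment")),
  row.names = c(paste0("A_Ctr_", 1:3), paste0("A_Met_", 1:3),
                paste0("B_Ctr_", 1:3), paste0("B_Met_", 1:3))
)

# Additive model: a single treatment effect shared by both cell lines.
mm_additive <- model.matrix(~ cell_line + condition, expDesign)

# Interaction model: the treatment effect is allowed to differ between cell lines.
mm_interaction <- model.matrix(~ cell_line + condition + cell_line:condition,
                               expDesign)

print(colnames(mm_additive))
print(colnames(mm_interaction))
```

The interaction model has one extra column compared with the additive model; that extra coefficient is what captures a cell-line-specific treatment effect.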

DESeq2 Analysis

library(DESeq2)

# Check the model matrix for the design used below (main effects plus interaction)
print(model.matrix(~ cell_line + condition + cell_line:condition, expDesign))

# Construct the DESeq2 object (two factors plus their interaction)
dds <- DESeqDataSetFromMatrix(countData = geneCountsMat, 
                              colData = expDesign, 
                              design = ~ cell_line + condition + cell_line:condition)

# Filter out lowly expressed genes:
# keep rows with at least 10 reads in total across all samples
dds <- dds[rowSums(counts(dds)) >= 10, ]

# Select the reference level for comparing cell lines (set the factor level)
#dds$cell_line <- relevel(dds$cell_line, ref = "A")

message("Running DESeq")
# Estimate size factors and dispersions, then fit the model
dds <- DESeq(dds)

# List the available coefficients (cell line, condition, and their interaction)
resultsNames(dds)


  1. Is this design sufficient, and how can I obtain the DEGs for each cell line?
    Would this be done with contrasts in results()?
  2. Do I need to relevel the baseline for each cell line in this case?
  3. Should I instead use the interaction term to obtain the genes per cell line?
  4. Is vst normalization also necessary here?
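
To make questions 1 to 3 concrete, here is a minimal sketch of how per-cell-line treatment effects might be pulled out of the interaction design with results(). It runs on simulated counts from makeExampleDESeqDataSet as a stand-in for geneCountsMat, and it assumes the default alphabetical reference levels (A and control), so the coefficient names may differ on real data.

```r
library(DESeq2)

# Stand-in data: simulated counts, NOT the geneCountsMat from this post.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Overwrite the simulated annotation with the 2 x 2 layout from the sample table.
dds$cell_line <- factor(rep(c("A", "B"), each = 6))
dds$condition <- factor(rep(rep(c("control", "treatment"), each = 3), times = 2))
design(dds) <- ~ cell_line + condition + cell_line:condition

dds <- DESeq(dds)
print(resultsNames(dds))

# Treatment effect in cell line A (the reference level): the condition main effect.
res_A <- results(dds, name = "condition_treatment_vs_control")

# Treatment effect in cell line B: main effect plus the interaction term.
res_B <- results(dds, contrast = list(c("condition_treatment_vs_control",
                                        "cell_lineB.conditiontreatment")))

# Is the treatment effect different in B compared with A? The interaction term alone.
res_interaction <- results(dds, name = "cell_lineB.conditiontreatment")
```

With this setup no releveling should be needed as long as A and control are already the reference levels. On question 4, vst is generally used for visualization and clustering (PCA, heatmaps) rather than for the test itself, since DESeq() works on the raw counts.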



