I have a count matrix from an RNA-seq experiment that I’d like to normalize with DESeq2 and then use for differential expression (DE) analysis. My code is below:
library(DESeq2)

# cts: raw count matrix (genes x samples); coldata: sample table (see below)
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ condition)
My experiment ran over two time periods: week 1 (treated vs untreated) and week 2 (untreated vs untreated). Samples were collected at the end of week 1 and at the end of week 2, without replacement. So essentially, in week 2 we should see a reversal of any genes that were upregulated in week 1 (and the data do cluster this way).
I have two possible coldata files:
coldata1
sample_id condition week
treated1 treated 1
treated2 treated 1
treated3 treated 1
untreated1 untreated 1
untreated2 untreated 1
untreated3 untreated 1
treated4 treated 2
treated5 treated 2
treated6 treated 2
untreated4 untreated 2
untreated5 untreated 2
untreated6 untreated 2
coldata2
sample_id condition week
treated1 treatedA 1
treated2 treatedA 1
treated3 treatedA 1
untreated1 untreated 1
untreated2 untreated 1
untreated3 untreated 1
treated4 treatedB 2
treated5 treatedB 2
treated6 treatedB 2
untreated4 untreated 2
untreated5 untreated 2
untreated6 untreated 2
So coldata2 would have three condition levels instead of two. I’m a bit lost on which is better, and on the best way to fill in the design argument. I was thinking about treating it as a time series, but since the treatment was reversed in week 2, I’m not sure that’s appropriate.
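To make the two options concrete, here is a minimal sketch in base R (no DESeq2 needed) that rebuilds both sample tables and checks the design matrices they would produce. The formulas `~ week * condition` and `~ week + condition` are only candidates I am assuming for illustration, not a confirmed recommendation.

```r
# Sketch only: the design formulas below are assumptions for comparison.
sample_id <- c(paste0("treated", 1:3), paste0("untreated", 1:3),
               paste0("treated", 4:6), paste0("untreated", 4:6))
week <- factor(rep(c(1, 2), each = 6))

# coldata1: two condition levels, with week as a separate factor.
coldata1 <- data.frame(
  condition = factor(rep(c("treated", "untreated"), each = 3, times = 2),
                     levels = c("untreated", "treated")),
  week = week,
  row.names = sample_id
)

# coldata2: the week 1 and week 2 "treated" samples get separate labels.
coldata2 <- data.frame(
  condition = factor(rep(c("treatedA", "untreated", "treatedB", "untreated"),
                         each = 3),
                     levels = c("untreated", "treatedA", "treatedB")),
  week = week,
  row.names = sample_id
)

# Candidate 1: interaction model on coldata1 (four coefficients, four groups).
mm1 <- model.matrix(~ week * condition, data = coldata1)

# Candidate 2: additive model on coldata2; it spans the same four groups.
mm2 <- model.matrix(~ week + condition, data = coldata2)

# An interaction model on coldata2 has more columns than groups, because
# treatedA never occurs in week 2 and treatedB never occurs in week 1.
mm3 <- model.matrix(~ week * condition, data = coldata2)

# Compare the number of columns with the matrix rank for each design.
summary_tbl <- data.frame(
  design  = c("coldata1: ~ week * condition",
              "coldata2: ~ week + condition",
              "coldata2: ~ week * condition"),
  columns = c(ncol(mm1), ncol(mm2), ncol(mm3)),
  rank    = c(qr(mm1)$rank, qr(mm2)$rank, qr(mm3)$rank)
)
print(summary_tbl)
```

If I am reading this correctly, the first two designs describe the same four groups with four coefficients each, while the third has more columns than its rank, so DESeq2 would reject it as not full rank.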
Any help would be greatly appreciated! Apologies if this is not clear; please let me know and I’ll try to re-explain.
Edit, for clarification:
During week 1: treated vs untreated samples. End of week 1: harvested half of the samples, isolated RNA, etc.
During week 2: untreated (treated in week 1) vs untreated (untreated in week 1). End of week 2: harvested the rest of the samples and terminated the experiment.
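Given this layout, one option described in the DESeq2 vignette for designs with two factors is to combine them into a single `group` factor and compare groups directly. Below is a rough sketch of what that could look like for my samples. The level names (`w1_treated`, etc.) are hypothetical labels I made up, and the DESeq2 part only runs if the package is installed and a count matrix `cts` exists with columns in the same order as the rows of `coldata`.

```r
# Rebuild the sample table from coldata1 and add a combined group factor.
coldata <- data.frame(
  condition = rep(c("treated", "untreated"), each = 3, times = 2),
  week = rep(1:2, each = 6),
  row.names = c(paste0("treated", 1:3), paste0("untreated", 1:3),
                paste0("treated", 4:6), paste0("untreated", 4:6))
)

# Hypothetical level names. Note that "w2_treated" means treated during
# week 1 and then left untreated during week 2.
coldata$group <- factor(
  paste0("w", coldata$week, "_", coldata$condition),
  levels = c("w1_untreated", "w1_treated", "w2_untreated", "w2_treated")
)

# Each of the four groups should contain three samples.
print(table(coldata$group))

# Design matrix for ~ group: an intercept plus one column per non-reference group.
mm <- model.matrix(~ group, data = coldata)

# Guarded so the sketch still runs without DESeq2 or without the counts.
if (requireNamespace("DESeq2", quietly = TRUE) && exists("cts")) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = cts,
                                        colData = coldata,
                                        design = ~ group)
  dds <- DESeq2::DESeq(dds)
  # Treatment effect at the end of week 1.
  res_week1 <- DESeq2::results(
    dds, contrast = c("group", "w1_treated", "w1_untreated"))
  # Residual difference at the end of week 2, after treatment stopped.
  res_week2 <- DESeq2::results(
    dds, contrast = c("group", "w2_treated", "w2_untreated"))
}
```

With this setup, the week 1 contrast would capture the treatment effect and the week 2 contrast would show how much of it remains, which seems to match the "reversal" I am expecting, though I am not sure this is the best approach.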