Combining RNA-seq data from 2 experiments (DESeq2)

Hello,

I have two RNA-seq datasets generated on an Illumina NovaSeq (same experimental design but different sequencing depths: 25M and 15M reads/sample for Run1 and Run2, respectively).

The dataset looks like this:

Samples         Condition        Run
Sample_1            A             R1
Sample_2            B             R1
Sample_3            A             R1
Sample_4            B             R1
Sample_5            A             R1
Sample_6            B             R1
Sample_7            A             R2
Sample_8            B             R2

I want to do a DE analysis using DESeq2. Since I have to analyze these samples together, I added a run factor ($run) to my colData and tried to use the collapseReplicates() function to “collapse my technical replicates”:

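library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = count, colData = coldata, design = ~ Condition)
# sum the counts of columns that share a Condition level; `run` records which runs were merged
ddsColl <- collapseReplicates(dds, groupby = dds$Condition, run = dds$run)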
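
For reference, collapseReplicates() sums the counts of all columns that share a level of the grouping factor. The sketch below imitates that summing step in base R on a made-up matrix, so it runs without DESeq2; the matrix values and the `group` vector are hypothetical and are not my real data.

```r
# Toy count matrix (made-up values): 3 genes x 4 samples
counts <- matrix(c(5, 1, 4, 2,
                   0, 3, 1, 1,
                   7, 2, 2, 6),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene_", 1:3),
                                 paste0("S", 1:4)))

# Hypothetical grouping factor, one entry per column of `counts`
group <- factor(c("A", "B", "A", "B"))

# rowsum() sums rows by group, so transpose, sum, and transpose back
# to get one summed column per level of `group`
collapsed <- t(rowsum(t(counts), group))
collapsed
```

Each column of `collapsed` is the per-gene sum over the samples in that group, which is why any NA in the input would matter before this step.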

However, after merging the two datasets (R1 and R2) by gene ID, there are NAs in my count matrix because some genes are present in only one of the two runs. For example, Sample_5 and Sample_6 are from Run1, and Sample_7 and Sample_8 are from Run2:

gene.id         Sample_5   Sample_6   Sample_7   Sample_8
gene_1          2          6          2          0
gene_2          3          0          0          0
gene_3          2          3          NA         NA
gene_4          NA         NA         1          2
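
To make the situation concrete, here is a minimal base R sketch of how such NAs can arise from a full merge by gene ID. The two per-run tables (`run1`, `run2`) are toy data frames that mirror the example above; their names and the use of merge() are assumptions for illustration, not necessarily how my real matrix was built.

```r
# Toy per-run count tables keyed by gene ID (values mirror the example table)
run1 <- data.frame(gene.id  = c("gene_1", "gene_2", "gene_3"),
                   Sample_5 = c(2, 3, 2),
                   Sample_6 = c(6, 0, 3))
run2 <- data.frame(gene.id  = c("gene_1", "gene_2", "gene_4"),
                   Sample_7 = c(2, 0, 1),
                   Sample_8 = c(0, 0, 2))

# Full outer join: genes missing from one run get NA in that run's columns
merged <- merge(run1, run2, by = "gene.id", all = TRUE)

# Count NAs per gene to see which genes are affected
na_per_gene <- rowSums(is.na(merged[, -1]))
names(na_per_gene) <- merged$gene.id

# For comparison only: an inner join keeps just the genes present in both runs
shared <- merge(run1, run2, by = "gene.id")
```

Here gene_3 and gene_4 each end up with two NAs in `merged`, while `shared` contains only gene_1 and gene_2.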

My question is: what should I do with these NAs? Is it appropriate to convert them to 0, given that I will then run collapseReplicates()?

Please correct me if I am wrong on any point.
Many thanks, Nicole
