How to perform DE analysis on a data set in which biological replicates have high variance?
Hello guys! I would really appreciate it if someone could help me with DE analysis. This is my challenge:
I have four conditions with five biological replicates each. In a PCA (DESeq2), most of the replicates don’t cluster together by condition. My first question: can I simply analyze them with DESeq2 or edgeR, or would that be wrong given how the replicates behave? Second, is there a way to filter out the genes whose replicates are inconsistent and keep only the genes that are consistent among the replicates of the same condition before running the DE analysis?
Thanks for the help!
If the same 5 replicates are repeated in each condition (e.g. 5 cell lines), you can do a paired t-test, even if the PCA doesn’t look nice. The paired t-test will correct for the individual bias of each replicate. Alternatively, in that same setup, you can perform batch correction with the replicate as the batch. As a last resort, you can use SVA (surrogate variable analysis) to correct for any batch effects.
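To make the pairing argument concrete, here is a minimal sketch for a single gene. The numbers are made up for illustration (they are not from the poster's data), and the names `control`, `treated`, `paired_t` and `welch_t` are hypothetical. It assumes log-scale expression values and that replicate i in one condition is the same cell line as replicate i in the other. In a count-based tool such as DESeq2, the analogous move would be to add the replicate as a term in the design formula rather than running t-tests gene by gene.

```python
import math
import statistics

# Made-up log-expression values for ONE gene in 5 cell lines (illustration only).
# Each cell line has a very different baseline, which is what spreads the PCA.
control = [2.0, 5.0, 8.0, 11.0, 14.0]
treated = [3.1, 5.9, 9.2, 11.8, 15.0]


def paired_t(a, b):
    """t statistic computed on the per-replicate differences b - a."""
    diffs = [y - x for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def welch_t(a, b):
    """Unpaired (Welch) t statistic for mean(b) - mean(a)."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(b) - statistics.mean(a)) / standard_error


t_paired = paired_t(control, treated)
t_unpaired = welch_t(control, treated)
print(f"paired t   = {t_paired:.2f}")
print(f"unpaired t = {t_unpaired:.2f}")
```

In this toy case the shift between conditions is small compared with the spread between cell lines, so the unpaired statistic stays small, while the paired statistic is large because the per-cell-line baseline cancels out in the differences.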
I would hazard that the ground truth of your experiment is that your conditions do not affect RNA expression much. You can do DESeq2, but I predict it will return very few genes as DE.
You certainly cannot just throw away genes that don’t agree with what you think the experiment should look like.
You might want to investigate and see if there is any experimental reason driving PC1.
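One rough way to check what drives PC1 is to compute the PC1 score of each sample and line the scores up against whatever you recorded about the samples. The sketch below assumes a small samples-by-genes matrix of already transformed expression values; the `expr` values and the `batch` labels (standing in for something like extraction day) are invented for illustration, and `pc1_scores` is a hypothetical helper, not a DESeq2 function. It uses plain power iteration so that it runs without extra libraries.

```python
import math

# Invented example: 6 samples x 4 genes, plus a recorded covariate per sample.
expr = [
    [5.0, 6.1, 4.9, 7.0],
    [5.2, 5.9, 5.1, 7.1],
    [4.9, 6.0, 5.0, 6.9],
    [8.1, 9.0, 8.0, 7.0],
    [7.9, 9.2, 8.1, 7.2],
    [8.0, 8.9, 7.9, 6.8],
]
batch = ["A", "A", "A", "B", "B", "B"]


def pc1_scores(matrix, iterations=200):
    """Return the PC1 score of each sample (row) via power iteration."""
    n_samples = len(matrix)
    n_genes = len(matrix[0])
    # Center every gene (column) so PCA describes variation around the mean.
    means = [sum(row[j] for row in matrix) / n_samples for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in matrix]
    # Sample-by-sample Gram matrix; its top eigenvector gives the PC1 scores
    # up to a scale factor.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered] for ri in centered]
    # Start from a unit vector (an all-ones start would be annihilated by centering).
    vec = [1.0] + [0.0] * (n_samples - 1)
    for _ in range(iterations):
        nxt = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient = top eigenvalue; scores are eigenvector * sqrt(eigenvalue).
    eigenvalue = sum(
        v * sum(g * w for g, w in zip(row, vec)) for row, v in zip(gram, vec)
    )
    return [v * math.sqrt(eigenvalue) for v in vec]


scores = pc1_scores(expr)
by_batch = {}
for score, label in zip(scores, batch):
    by_batch.setdefault(label, []).append(score)
for label, values in by_batch.items():
    print(label, [round(v, 2) for v in values])
```

If the scores split cleanly by a recorded covariate, as they do by the invented `batch` label here, that covariate is a candidate explanation for PC1 and worth accounting for in the model.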