Batch effect consideration (re-seq the same sample twice)

Hello,

I would like to know how you address batch effects when the same samples are re-sequenced (FASTQ files).

Our client targeted 20 million reads for each of her samples. However, the first run produced fewer than 20 million reads for a few samples (sample_2, sample_3 and sample_7), so we re-sequenced those samples.

For the 1st run

| sample_id | obtained_reads (millions) |
| --------- | ------------------------- |
| sample_1  | 21.4                      |
| sample_2  | 11                        |
| sample_3  | 12                        |
| sample_4  | 35.5                      |
| sample_5  | 23.8                      |
| sample_6  | 29.4                      |
| sample_7  | 10                        |
| sample_8  | 23.8                      |
| sample_9  | 24.3                      |
| sample_10 | 18.6                      |

For the 2nd run

| sample_id | obtained_reads (millions) |
| --------- | ------------------------- |
| sample_2  | 9                         |
| sample_3  | 8                         |
| sample_7  | 10                        |
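
For context, here is a quick sketch of the combined depth each re-sequenced sample would reach if both runs were merged. It uses the read counts from the two tables above and assumes they are in millions of reads; the variable names are just for illustration.

```python
# Read counts copied from the two tables above (assumed to be in millions).
run1 = {"sample_2": 11, "sample_3": 12, "sample_7": 10}
run2 = {"sample_2": 9, "sample_3": 8, "sample_7": 10}
TARGET = 20  # client's target depth, in millions of reads

# Sum the two runs per sample and compare against the target.
combined = {sample: run1[sample] + run2[sample] for sample in run1}
for sample, total in combined.items():
    status = "meets target" if total >= TARGET else "below target"
    print(f"{sample}: {total} M reads ({status})")
```

With these numbers, all three samples would land exactly on the 20 million target only if the two runs are combined.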

When it comes to downstream analysis, how would you handle those samples (sample_2, sample_3 and sample_7)? Would you just merge them? E.g.

cat sample_2.fastq.gz (from the 1st run) sample_2.fastq.gz (from the 2nd run) > sample_2.merged.fastq.gz ?
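
As far as I understand, concatenating gzip files byte for byte gives a valid multi-member gzip file, which is why the `cat` approach works on `.fastq.gz` directly. Below is a minimal Python sketch of the same idea with a record-count sanity check. The file names and the tiny FASTQ records are made up for illustration, and the helper functions are my own, not from any pipeline.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def concat_gzip(inputs, output):
    """Byte-level concatenation, equivalent to `cat a.gz b.gz > out.gz`."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def count_fastq_records(path):
    """Count records, assuming the usual 4 lines per FASTQ record."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


# Tiny made-up FASTQ files standing in for the two runs (hypothetical data).
tmp = Path(tempfile.mkdtemp())
run1 = tmp / "run1_sample_2.fastq.gz"
run2 = tmp / "run2_sample_2.fastq.gz"
with gzip.open(run1, "wt") as fh:
    fh.write("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
with gzip.open(run2, "wt") as fh:
    fh.write("@r3\nGGCA\n+\nIIII\n")

merged = tmp / "sample_2.merged.fastq.gz"
concat_gzip([run1, run2], merged)

# The merged file should contain every record from both runs.
n_merged = count_fastq_records(merged)
n_expected = count_fastq_records(run1) + count_fastq_records(run2)
print(n_merged, n_expected)
```

Checking that the merged record count equals the sum of the inputs is a cheap way to catch a truncated or mixed-up file before alignment.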

Or would you run PCA or hierarchical clustering first to see whether the two runs of each sample cluster together, and then decide whether to drop or merge the reads from the 2nd run?
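
To make the second option concrete, here is a rough sketch of the kind of check I have in mind: keep the runs separate, normalize for depth, run PCA, and see whether each re-sequenced library sits next to its first-run counterpart. The counts are purely synthetic (simulated with NumPy, not our real data), and the library labels, depths and gene number are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 200

# Synthetic expression profiles for three hypothetical samples.
profiles = {
    name: rng.lognormal(mean=0.0, sigma=1.0, size=n_genes)
    for name in ("sample_1", "sample_2", "sample_3")
}


def sequence(profile, depth):
    """Simulate one sequencing run as a multinomial draw of `depth` reads."""
    return rng.multinomial(depth, profile / profile.sum())


# sample_2 is sequenced twice, at different depths, from the same profile.
labels = ["sample_1_run1", "sample_2_run1", "sample_2_run2", "sample_3_run1"]
counts = np.array([
    sequence(profiles["sample_1"], 100_000),
    sequence(profiles["sample_2"], 55_000),
    sequence(profiles["sample_2"], 45_000),
    sequence(profiles["sample_3"], 100_000),
])

# Depth-normalize to log2 counts per million, with a pseudocount of 1.
cpm = counts / counts.sum(axis=1, keepdims=True) * 1e6
log_cpm = np.log2(cpm + 1.0)

# PCA via SVD of the gene-centered matrix; keep the first two components.
centered = log_cpm - log_cpm.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
scores = u[:, :2] * s[:2]
explained = s**2 / np.sum(s**2)

# For each library, find its nearest neighbour in PC1/PC2 space.
dist = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
np.fill_diagonal(dist, np.inf)
nearest = {labels[i]: labels[int(j)] for i, j in enumerate(dist.argmin(axis=1))}
print(nearest)
```

In this simulation there is no batch effect by construction, so the two sample_2 runs end up as each other's nearest neighbours. With real data, a re-sequenced library that lands far from its first run would be the signal to look closer before merging.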

