Moderate Mapping percentage

Hi all,
I received my sequenced transcriptome and genomic data from my service provider and have started working with it. Both the DNA and RNA data passed quality metrics after trimming, but the mapping percentage comes out to 90% for the DNA reads using Bowtie and 85% for the RNA reads using HISAT2. I tried both the hg38 and hg19 reference genomes, and the issue persists. I'll attach the QC metrics here. Kindly let me know where I am making an error. The data is paired-end, 150 bp.

[Image: QC metrics]

Have you tried looking at the unmapped reads? For example, you could BLAST a sample of them to see whether they are even human.
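As a rough sketch of how to pull those reads out for BLAST: the snippet below reads SAM-formatted alignment lines, keeps records with the "segment unmapped" flag bit (0x4) set, and writes a small FASTA sample. The function name `unmapped_reads_to_fasta`, the `max_reads` cap, and the file names in the usage note are my own placeholders, not anything from your pipeline, and it assumes your aligner was told to keep unmapped reads in its output.

```python
import random


def unmapped_reads_to_fasta(sam_lines, max_reads=100, seed=0):
    """Return FASTA text for (a sample of) the unmapped reads in SAM lines.

    Hypothetical helper for illustration only.
    """
    records = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & 0x4 and seq != "*":  # 0x4 = this read is unmapped
            # 0x40 = first in pair, 0x80 = second in pair
            mate = "/1" if flag & 0x40 else "/2" if flag & 0x80 else ""
            records.append((name + mate, seq))
    # Only subsample when there are more unmapped reads than requested.
    if len(records) > max_reads:
        records = random.Random(seed).sample(records, max_reads)
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


demo = [
    "@HD\tVN:1.6",
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r1\t141\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "r2\t99\tchr1\t100\t60\t4M\t=\t200\t104\tGGCC\tIIII",
]
print(unmapped_reads_to_fasta(demo), end="")
```

On real data you would pass an open file instead of the demo list, e.g. `unmapped_reads_to_fasta(open("aligned.sam"))` (file name assumed), and paste or upload the resulting FASTA to BLAST. A few hundred reads is usually plenty to see whether the hits are human, bacterial, or something else.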

You could use something like Kraken2 to check for contamination. It does require downloading and setting up large databases, but it is a good QC step, especially if you have dealt with contamination issues in the past.
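If you do run Kraken2 with its `--report` option, a few lines of Python are enough to list the dominant species. This is only a sketch: it assumes the standard six-column, tab-separated report (percentage, clade reads, direct reads, rank code, taxid, indented name) without the extra minimizer columns, and `top_taxa` is a name I made up. The demo numbers are invented purely to show the format.

```python
def top_taxa(report_lines, rank="S", n=5):
    """Return the n most abundant taxa at one rank from a Kraken2 report.

    Hypothetical helper; expects the standard six-column report layout.
    """
    hits = []
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        percent = float(fields[0])    # % of reads in this clade
        clade_reads = int(fields[1])  # reads assigned to this clade
        rank_code = fields[3]         # e.g. U, R, D, P, C, O, F, G, S
        name = fields[5].strip()      # names are indented by tree depth
        if rank_code == rank:
            hits.append((name, percent, clade_reads))
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits[:n]


# Made-up example lines, only to illustrate the layout.
demo_report = [
    " 10.00\t100\t100\tU\t0\tunclassified",
    " 90.00\t900\t0\tR\t1\troot",
    " 85.00\t850\t850\tS\t9606\t        Homo sapiens",
    "  5.00\t50\t50\tS\t562\t        Escherichia coli",
]
for name, percent, reads in top_taxa(demo_report):
    print(f"{name}\t{percent:.2f}%\t{reads} reads")
```

Running it on the unmapped reads alone is often more informative than on the whole library, since any non-human species that shows up there points to where the missing 10-15% went.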

Also, what is the taxonomic relationship between the human genome assemblies you are using and the individual/population you have sequenced? The differences could be real, showing up as lower-than-expected mapping rates because of fixed differences between your samples and the reference genome.
