I’m trying to calculate the depth of coverage from my WXS data.
Using Qualimap, I first used as the feature file the Gencode human genome (release 38) .gtf file associated with the genome I aligned to:
feat="gencode.v38.primary_assembly.annotation.gtf"
for ea in *bam
do $qualimap bamqc
--java-mem-size=20G
-bam $ea
--feature-file $feat
done;
… and the coverage was ~6.8.
I then subset the .gtf file to exonic regions only:
grep 'transcript_type "protein_coding"' gencode.v38.primary_assembly.annotation.gtf |
awk '($3=="exon") {printf("%st%st%sn",$1,int($4)-1,$5);}' |
sort -T . -k1,1 -k2,2n |
bedtools merge |
awk 'BEGIN{OFS="t"}{print $1,$2,$3,$4,0,"."}'
> gencode.v38.primary_assembly.annotation_exome.bed
feat="gencode.v38.primary_assembly.annotation_exome.bed"
for ea in *bam
do $qualimap bamqc
-bam $ea
--feature-file $feat
done;
…and the coverage jumped to ~68.
Is the latter the correct coverage given that my target is the human exome?
Thanks in advance.
Read more here: Source link