Identify Mapped reads and Unmaped Mate pairs
Hey, I have a sorted bam file which i got by mapping with reference. I want to identify two things from this bam file.
- How many reads mapped to each gene/contig in my bam file.
- As this was paired-end data, i want to know how many genes/contigs just got a single read out of a pair mapped. like Forward read mapped to contig but reverse didn’t.
I’m trying to find the answers using Pyhton library Pysam.
My Bam file headers are
@HD VN:1.5 SO:coordinate
@SQ SN:TRINITY_DN30590_c0_g2 LN:223
@SQ SN:TRINITY_DN30590_c0_g3 LN:331
and bam file looks like this.
HWI-1KL178:42:c39jdacxx:3:2308:16762:5814/1 16 TRINITY_DN30590_c0_g2 1 9 19S55M27S * 0 0AGGTTTAATTGCAGAAGCTGAGCCTTTCATCTCCATCTTGGGGGTTTCCAGTTCTGCTAGCTGTGCCTCCAAAACACTGGCCTTACTGTATAATTCATCAG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFIIFIIIIIIIFIIIIFFFFFFFFFFBBB NM:i:0 ms:i:110 AS:i:110 nn:i:0 tp:A:P cm:i:9 s1:i:46 s2:i:0 de:f:0 rl:i:0
I’m trying this Python code:
import pysam
file = pysam.AlignmentFile("AT165.bam", "rb")
for read in file.fetch():
print(read)
file.close()
Any help will be great. Thank you.
• 27 views
Read more here: Source link