Check if BAM is derived from pair-end or single-end reads?
I’m automating a pipeline and currently a user inputs as a parameter if the input BAM is from pair-end or single-end reads. I want to automatically check this. How can do I this?
• 16k views
Use the sam flag and samtools view -c -f 1 in.bam
to test if your bam contains paired reads. ( 1= “read paired” See here )
You may also use samtools view -H
. It will print you the header of the bam, including the command that was used for the alignment. Usually, paired-end data are separated into two fastq files, one for forward/reverse reads. Hence, if the alignment command shows two fastq files to be used, it was a paired-end alignment.
An example where BWA mem was used:
samtools view -H my.bam
@PG ID:bwa PN:bwa VN:0.7.12-r1044 CL:/opt/bwa-0.7.12/bwa mem -M -t 24 /Genomes/hg19/hg19.fasta in_1.fastq in_2.fastq
-----------------------------------------------------------------------------------------------------^---------^
where the general BWA mem paired-end command is (similar in Bowtie etc.):
./bwa mem -M -t *threads* indexedRefGenome.fasta in_1.fastq in_2.fastq > out.sam
-----------------------------------------------------^---------^
Both methods suggested by Pierre and Alexander will work great for 99.99% of cases, but things get a little messy if you want to detect single-end-reads in a mixed paired & single BAM file, or singletons in a previously paired-end BAM file.
Detecting singles in a mixed file is easy, just look for samtools view -c -F 1 in.bam
Detecting singletons is a bit trickier though… since the filtering step might have just removed the read and left the mate totally unaffected.
Perhaps you could use the read names (although people have a habit of putting junk like /1 and /2 on the end of the read names)
Or maybe one of Pierre’s tools can detect orphaned singletons 🙂
Alexander’s method would solve this very quickly and easily if all BAM files coming in are totally under your pipeline’s control (you mapped/filtered them yourself at some point). But external “mystery BAMs” may have issues.
Traffic: 2727 users visited in the last hour
Read more here: Source link