The samtools view command outputs information from SAM and BAM files in SAM format. You can find a description of the SAM format here: samtools.github.io/hts-specs/SAMv1.pdf
Section 1.4 deals with the meaning of each of the mandatory columns. It includes the following table:
| Col | Field | Type | Regexp/Range | Brief description |
|---|---|---|---|---|
| 1 | QNAME | String | `[!-?A-~]{1,254}` | Query template NAME |
| 2 | FLAG | Int | [0, 2^16 − 1] | bitwise FLAG |
| 3 | RNAME | String | `*\|[:rname:^*=][:rname:]*` | Reference sequence NAME |
| 4 | POS | Int | [0, 2^31 − 1] | 1-based leftmost mapping POSition |
| 5 | MAPQ | Int | [0, 2^8 − 1] | MAPping Quality |
| 6 | CIGAR | String | `*\|([0-9]+[MIDNSHPX=])+` | CIGAR string |
| 7 | RNEXT | String | `*\|=\|[:rname:^*=][:rname:]*` | Reference name of the mate/next read |
| 8 | PNEXT | Int | [0, 2^31 − 1] | Position of the mate/next read |
| 9 | TLEN | Int | [−2^31 + 1, 2^31 − 1] | observed Template LENgth |
| 10 | SEQ | String | `*\|[A-Za-z=.]+` | segment SEQuence |
| 11 | QUAL | String | `[!-~]+` | ASCII of Phred-scaled base QUALity+33 |
Columns 12 onward contain a tab-separated list of optional informational tags about the read, each in TAG:TYPE:VALUE form.
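To make the column layout concrete, here is a minimal Python sketch that splits one line of samtools view output into the eleven mandatory fields plus the optional tags. It assumes tab-separated fields, as the SAM specification describes; the function name `parse_sam_record` and the example record are made up for illustration and are not part of samtools.

```python
# Names of the 11 mandatory SAM columns, in order.
MANDATORY = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
             "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_record(line):
    """Split one SAM alignment line into a dict of named fields (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    record = {}
    for name, value in zip(MANDATORY, fields):
        record[name] = int(value) if name in INT_FIELDS else value
    # Everything after column 11 is an optional TAG:TYPE:VALUE field.
    tags = {}
    for item in fields[11:]:
        tag, type_code, value = item.split(":", 2)
        tags[tag] = (type_code, value)
    record["TAGS"] = tags
    return record


# Made-up record, for illustration only.
example = "\t".join(["read1", "99", "contig_1", "100", "60", "10M", "=",
                     "250", "160", "ACGTACGTAC", "IIIIIIIIII",
                     "NM:i:0", "AS:i:10"])
rec = parse_sam_record(example)
print(rec["RNAME"], rec["FLAG"], rec["TAGS"])
```

For real work a library such as pysam is probably a better choice; the sketch is only meant to show which column is which.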
We can see that the first column is, as you guessed, the name of the read (or query name). The second column is the FLAG: a bitwise flag that encodes information about the status of the alignment, such as whether the read aligned successfully, whether its mate is mapped, and whether it is read 1 or read 2 of a pair.
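Because the FLAG is a sum of individual bits, it can be decoded by testing each bit in turn. The sketch below uses the bit values defined in the SAM specification; the short labels and the function name `decode_flag` are my own shorthand rather than official names.

```python
# FLAG bits from the SAM specification; the labels are informal shorthand.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


print(decode_flag(99))  # ['paired', 'proper_pair', 'mate_reverse', 'read1']
```

So a FLAG of 99 (1 + 2 + 32 + 64) describes a read 1 from a properly paired template whose mate is on the reverse strand.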
Finally, the third column is the reference sequence name (i.e. the name of the contig the read is aligned to). This is the column you are interested in if you want to know which contig a read is aligned to. And you are correct that a read can align to more than one contig (depending on the configuration of the aligner).
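As a rough sketch of how you might use the third column, the following Python collects, for each read name, the set of contigs it has alignments on. It assumes you feed it the text lines produced by samtools view; the function name `contigs_per_read` and the example records are invented for illustration. Note that both mates of a pair share the same QNAME, so this groups by template rather than by individual mate.

```python
from collections import defaultdict


def contigs_per_read(sam_lines):
    """Map each read name to the set of contigs it aligns to (hypothetical helper)."""
    hits = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines, if present
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":
            continue  # unmapped read: no contig to record
        hits[qname].add(rname)
    return dict(hits)


# Made-up records, for illustration only.
lines = [
    "@SQ\tSN:contig_1\tLN:1000",
    "readA\t0\tcontig_1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "readA\t256\tcontig_2\t40\t0\t5M\t*\t0\t0\tACGTA\tIIIII",
    "readB\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
hits = contigs_per_read(lines)
print(hits)
```

Here readA is reported on two contigs (a primary and a secondary alignment), while the unmapped readB does not appear at all.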
The 7th column, which you note sometimes does and sometimes doesn't contain contig information, tells us which contig the mate of this read aligns to. It only contains a contig name if the mate aligns to a different contig. If the mate aligns to the same contig, this column will contain =. If mate information is unavailable (for example, the mate is unaligned), it will contain *.
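The rules for the 7th column can be summarised in a few lines of Python. This is a minimal sketch assuming a tab-separated line from samtools view; the function name `mate_contig` and the example records are hypothetical.

```python
def mate_contig(sam_line):
    """Return the contig the mate aligns to, or None if unavailable (hypothetical helper)."""
    fields = sam_line.rstrip("\n").split("\t")
    rname, rnext = fields[2], fields[6]
    if rnext == "=":
        return rname  # mate is on the same contig as this read
    if rnext == "*":
        return None   # no mate information available
    return rnext      # mate is on a different, explicitly named contig


# Made-up records, for illustration only.
same = "r1\t99\tcontig_1\t100\t60\t5M\t=\t250\t155\tACGTA\tIIIII"
other = "r2\t65\tcontig_1\t100\t60\t5M\tcontig_7\t30\t0\tACGTA\tIIIII"
none = "r3\t0\tcontig_1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII"
for rec in (same, other, none):
    print(mate_contig(rec))
```

The three example records cover the three cases: same contig, a different contig, and no mate information.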