SAM Validation Error with CleanSam (Alignment start must be <= reference seq length)


I ran several BAM files through a pipeline of CleanSam, SortSam, and MarkDuplicates without a problem.

However, one of the input files gave me the following error with CleanSam:

ERROR: Record 2106053, Read name A00187:414:HMYCYDSXY:3:1426:13367:11083, Alignment start   (21157039) must be <= reference sequence length (21154825) on reference 7

Because all of the BAM files were generated from libraries in the same dataset, with the same pipeline, and aligned to the same reference genome, I’m having difficulty knowing where to begin troubleshooting this error. The Picard command that I used is:

"java -Xmx" . $mem . "g`pwd`/tmp -jar " . $picard . "CleanSam.jar INPUT=" . $BFile[$i] . ".bam OUTPUT= " . $BFile[$i] . "clean.bam";

Here $BFile holds the file-name prefixes taken from a glob list of the input BAM files, S1.bam … S8.bam.

Any suggestions on where to start? Since I’m using the same reference genome here as for the alignment, I don’t understand how it’s possible to get coordinates beyond the length of the reference sequence.
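One possible first check, offered only as a sketch: compare each record's alignment start against the reference length declared in the file's own @SQ header lines, which is the same comparison the error message reports (21157039 against 21154825). The Python below assumes the BAM has already been converted to SAM text with its header, for example with `samtools view -h`. The function name `find_out_of_range` and the two sample records are hypothetical, built from the numbers in the error above rather than from the real file.

```python
def find_out_of_range(sam_lines):
    """Yield (read_name, ref_name, pos, ref_len) for every record whose
    alignment start (POS) is greater than the LN value of its @SQ line."""
    ref_len = {}
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line: only @SQ lines carry reference names and lengths.
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
                if "SN" in tags and "LN" in tags:
                    ref_len[tags["SN"]] = int(tags["LN"])
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a complete alignment record
        qname, rname, pos = fields[0], fields[2], int(fields[3])
        # Skip unmapped records ("*") and references missing from the header.
        if rname in ref_len and pos > ref_len[rname]:
            yield qname, rname, pos, ref_len[rname]


# Hypothetical in-memory example using the values from the error message.
example = [
    "@HD\tVN:1.6\n",
    "@SQ\tSN:7\tLN:21154825\n",
    "readA\t0\t7\t21157039\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\n",
    "readB\t0\t7\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\n",
]
for hit in find_out_of_range(example):
    print("\t".join(map(str, hit)))
```

On a real file, the same function could be fed `sys.stdin` with the SAM text piped in. If the failing file's @SQ lengths differ from those of the files that passed, that would suggest it was aligned against a different build of the reference; if they match, the offending records themselves are the next thing to inspect.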



