I’m using blastn (anaconda.org/bioconda/blast) to find sequences similar to a target sequence in a FASTA file. My read is quite short (68 bases), and blastn doesn’t report any hit for it, even though a manual check shows there is actually a very good one in the FASTA file.
Here is the query sequence:
$ cat query.fa
>query read
GGAGTAGGCGCGAGCGGCAGGAGGCGGGCAGGCGGAGGGCGAGGCAGGGAGGCGCCGCCTGGAGCGCA
And this is the “good one” that I found manually (I extracted it from the original FASTA file, saved it as a single-sequence FASTA file, and created the BLAST database from it using makeblastdb):
$ cat db.fa
>database read
GAGTAGGCGCGAGCTAAGCAGGAGGCGGAGGCGGAGGCGGAGGGCGAGGGGCGGGGAGCGCCGCCTGGAGCGCGGCAG
And the command that I used is:
$ blastn -db db.fa -query query.fa -outfmt "6 qseqid sseqid evalue length pident bitscore ppos"
(no hits were reported)
But the two sequences actually align very well:
database  1 -GAGTAGGCGCGAGCTAAGCAGGAGGCGGAGGCGGAGGCGGAGGGCGAGG 49
             ||||||||||||||  .|||||||||||    |.|||||||||||||
query     1 GGAGTAGGCGCGAGC--GGCAGGAGGCGG----GCAGGCGGAGGGCGA-- 42

database 50 GGCGGGGA-GCGCCGCCTGGAGCGCGGCAG 78
            |||.|||| ||||||||||||||||.
query    43 GGCAGGGAGGCGCCGCCTGGAGCGCA---- 68
(The above is the alignment produced by needle.)
Is there any blastn parameter that I can adjust to relax the cutoff so that this hit is reported?
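A possible explanation, offered as a sketch rather than a confirmed diagnosis: blastn runs the megablast task by default, which needs an exact seed of 28 bases (its default word size) before it attempts an alignment. The snippet below is plain Python (no BLAST required) that finds the longest exact stretch shared by the two sequences on the plus strand and compares it with the default word sizes of three blastn tasks. The helper name longest_common_substring is made up for this illustration.

```python
# Sequences copied from query.fa and db.fa above.
QUERY = "GGAGTAGGCGCGAGCGGCAGGAGGCGGGCAGGCGGAGGGCGAGGCAGGGAGGCGCCGCCTGGAGCGCA"
DB = "GAGTAGGCGCGAGCTAAGCAGGAGGCGGAGGCGGAGGCGGAGGGCGAGGGGCGGGGAGCGCCGCCTGGAGCGCGGCAG"

# Default word sizes of three blastn tasks (megablast is the default task).
WORD_SIZES = {"megablast": 28, "blastn": 11, "blastn-short": 7}


def longest_common_substring(a, b):
    """Return the longest exact stretch shared by a and b (plus strand only)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths ending at the previous base of a
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


seed = longest_common_substring(QUERY, DB)
print(f"longest exact match: {seed} ({len(seed)} bases)")
for task, word_size in WORD_SIZES.items():
    verdict = "long enough" if len(seed) >= word_size else "too short"
    print(f"{task}: word_size={word_size}, longest exact match is {verdict}")
```

If that reasoning holds, adding -task blastn-short (or -task blastn, optionally with a smaller -word_size) to the command above would be the first thing to try. Other defaults, such as low-complexity filtering and the E-value threshold, may also affect whether the hit is reported.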