NcbiblastpCommandline alignment results are different from blast webpage

What you are trying to do is fairly simple, and you are complicating it by: 1) not providing your sequences so that someone can reproduce your attempt; 2) giving a result in a form that is impossible to read. Be honest, can you make any sense of the result you posted above?

Here are my two sequences:

::::::::::::::
p13_hs.fas
::::::::::::::
>NP_001018025.1 protein p13 MTCP-1 [Homo sapiens]
MAGEDVGAPPDHLWVHQEGIYRDEYQRTWVAVVEEETSFLRARVQQIQVPLGDAARPSHLLTSQLPLMWQ
LYPEERYMDNNSRLWQIQHHLMVRGVQELLLKLLPDD
::::::::::::::
p13_mi.fas
::::::::::::::
>XP_019483182.1 PREDICTED: protein p13 MTCP-1 [Hipposideros armiger]
MSGEDVGPPPDHLWVHQEGIYRDEYQRTWVAVLEEDTNFLRARVQQVQVPLGDAARPSHLLTSQLPLMWQ
LYPEERYMDNNSRLWQIQHHLMVRGVQELLLKLLPDD

After I paste them in, I select all the lines I just pasted, and click on the button above that says 101/010 and they get formatted in a way that is legible.

Here is my code, and I will also format it as above so it can be read:

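from Bio.Blast.Applications import NcbiblastpCommandline
from Bio import SeqIO

fasta_file1 = 'p13_hs.fas'
fasta_file2 = 'p13_mi.fas'

# SeqIO.read() raises ValueError unless the file holds exactly one record,
# so these two calls double as a format check on the inputs.
seq1 = SeqIO.read(fasta_file1, "fasta")
seq2 = SeqIO.read(fasta_file2, "fasta")

# Calling the command-line object runs blastp and returns (stdout, stderr).
stdout, stderr = NcbiblastpCommandline(query=fasta_file1, subject=fasta_file2, outfmt=0)()
print(stdout)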

And finally here is the output from running the above script:

BLASTP 2.11.0+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: User specified sequence set (Input: p13_mi.fas).
           1 sequences; 107 total letters



Query= NP_001018025.1 protein p13 MTCP-1 [Homo sapiens]

Length=107
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

XP_019483182.1 PREDICTED: protein p13 MTCP-1 [Hipposideros armiger]   210     1e-77


> XP_019483182.1 PREDICTED: protein p13 MTCP-1 [Hipposideros armiger]
Length=107

 Score = 210 bits (535),  Expect = 1e-77, Method: Compositional matrix adjust.
 Identities = 101/107 (94%), Positives = 106/107 (99%), Gaps = 0/107 (0%)

Query  1    MAGEDVGAPPDHLWVHQEGIYRDEYQRTWVAVVEEETSFLRARVQQIQVPLGDAARPSHL  60
            M+GEDVG PPDHLWVHQEGIYRDEYQRTWVAV+EE+T+FLRARVQQ+QVPLGDAARPSHL
Sbjct  1    MSGEDVGPPPDHLWVHQEGIYRDEYQRTWVAVLEEDTNFLRARVQQVQVPLGDAARPSHL  60

Query  61   LTSQLPLMWQLYPEERYMDNNSRLWQIQHHLMVRGVQELLLKLLPDD  107
            LTSQLPLMWQLYPEERYMDNNSRLWQIQHHLMVRGVQELLLKLLPDD
Sbjct  61   LTSQLPLMWQLYPEERYMDNNSRLWQIQHHLMVRGVQELLLKLLPDD  107



Lambda      K        H        a         alpha
   0.321    0.136    0.431    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 9025


  Database: User specified sequence set (Input: p13_mi.fas).
    Posted date:  Unknown
  Number of letters in database: 107
  Number of sequences in database:  1



Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40
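
For reference, the same query-versus-subject run shown above can also be launched without the Biopython wrapper class, since newer Biopython releases discourage the command-line wrappers in favour of calling the tools through subprocess. This is only a minimal sketch: the helper names build_blastp_cmd and run_blastp are made up for illustration, and it assumes the NCBI BLAST+ blastp executable is on your PATH.

```python
import shutil
import subprocess


def build_blastp_cmd(query, subject, outfmt=0):
    """Return the argument list for a pairwise blastp run (query vs. subject).

    -query, -subject and -outfmt are standard BLAST+ options; the function
    name itself is a hypothetical helper, not part of any library.
    """
    return [
        "blastp",
        "-query", str(query),
        "-subject", str(subject),
        "-outfmt", str(outfmt),
    ]


def run_blastp(query, subject, outfmt=0):
    """Run blastp and return its standard output as text."""
    if shutil.which("blastp") is None:
        raise FileNotFoundError("blastp not found on PATH; install NCBI BLAST+ first")
    result = subprocess.run(
        build_blastp_cmd(query, subject, outfmt),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Calling print(run_blastp('p13_hs.fas', 'p13_mi.fas')) should give the same kind of pairwise report as the script above, provided both files exist in the working directory.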

So if something like this doesn’t work for you, it is either because your sequences are not formatted properly (we don’t know, because you never showed them) or because your Biopython version is not recent (we don’t know, because you never told us). Other than that, the script should do what is expected.
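
Both of those possibilities can be checked quickly before posting. The following is a rough sketch rather than a complete validator: check_protein_fasta and biopython_version are hypothetical helper names, and the set of accepted residue letters is an assumption (the 20 standard amino acids plus the usual ambiguity codes, '*' and '-').

```python
# Letters accepted in a protein sequence line (an assumption for this sketch).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUOJ*-")


def check_protein_fasta(path):
    """Return (record_count, problems) for a protein FASTA file."""
    problems = []
    records = 0
    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue  # blank lines are ignored
            if line.startswith(">"):
                records += 1  # each '>' header starts a new record
                continue
            if records == 0:
                problems.append(f"line {lineno}: sequence data before any '>' header")
            bad = sorted(set(line.upper()) - VALID_RESIDUES)
            if bad:
                problems.append(f"line {lineno}: unexpected characters {bad}")
    if records == 0:
        problems.append("no '>' header line found")
    return records, problems


def biopython_version():
    """Return the installed Biopython version string, or None if it is missing."""
    try:
        import Bio
    except ImportError:
        return None
    return Bio.__version__
```

For a pairwise comparison like the one above, each input file should report exactly one record and an empty list of problems. Including the value returned by biopython_version() in your question, along with the first line of the BLAST output (which gives the BLAST+ version), lets others reproduce your setup.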
