Python calculation of protein multiple sequence alignment


Dear all 🙂

I am trying to compute the conservation score of each position of a protein multiple sequence alignment.
I already tried Shannon entropy, but I am not satisfied with it because it only accounts for identity, not similarity.
So I thought it might be a good idea to use a substitution matrix. I tried to implement two methods:

  1. Protein–Protein Interfaces: Analysis of Amino Acid Conservation in Homodimers (;2-O)
  2. the “sum-of-pairs” method from AL2CO (

The first method gives me wrong results (maybe because I used BLOSUM62 instead of the PET91 matrix used in the article…).
The second method (AL2CO) doesn’t give me satisfactory results.
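
To make the question concrete, here is a minimal sketch of the kind of sum-of-pairs column score I mean. It is not my exact code and I do not claim it reproduces AL2CO. The function name `sum_of_pairs`, the normalization S(a,b) / sqrt(S(a,a) * S(b,b)), and the tiny `TOY_MATRIX` (a few BLOSUM62 entries typed in by hand) are assumptions for illustration. In a real run the full matrix could be loaded with Biopython's `Bio.Align.substitution_matrices.load("BLOSUM62")`.

```python
from itertools import combinations
from math import sqrt

# Hand-typed excerpt of BLOSUM62, for illustration only (assumption: the
# real workflow would use the full 20x20 matrix).
TOY_MATRIX = {
    ("L", "L"): 4, ("I", "I"): 4, ("D", "D"): 6,
    ("L", "I"): 2, ("L", "D"): -4, ("I", "D"): -3,
}

def sum_of_pairs(column, matrix, gap="-"):
    """Average normalized substitution score over all residue pairs in a column.

    Gaps are ignored. Each pair score is divided by sqrt(S(a,a) * S(b,b)),
    so identical pairs score exactly 1 and dissimilar pairs can be negative.
    """
    residues = [r for r in column if r != gap]
    if len(residues) < 2:
        return 0.0  # not enough residues to form a pair

    def normalized(a, b):
        # The matrix is symmetric, so look the pair up in either order.
        raw = matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]
        return raw / sqrt(matrix[(a, a)] * matrix[(b, b)])

    pairs = list(combinations(residues, 2))
    return sum(normalized(a, b) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    for col in ("LLLL", "LILI", "LD-D"):
        print(col, round(sum_of_pairs(col, TOY_MATRIX), 3))
```

Note that this score lives in roughly [-1, 1] rather than [0, 1], and every sequence counts equally, so redundant sequences dominate the average.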

Ideally, I would like a score in [0,1] that is somewhat sensitive to sequence redundancy.
I have a workflow in Python that processes my alignment and calculates properties, so I am trying to avoid external tools as much as possible…
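
As an illustration of what "a score in [0,1] with sensitivity to redundancy" could look like in pure Python, here is a rough sketch that down-weights redundant sequences with position-based weights (in the style of Henikoff and Henikoff) and then computes a weighted entropy score per column. The function names, the choice to treat gaps as an ordinary symbol when computing weights, and the default `alphabet_size=20` are all assumptions. It is still identity-based, so it only addresses the redundancy part of my problem.

```python
from collections import Counter
from math import log

def henikoff_weights(seqs):
    """Position-based sequence weights, normalized to sum to 1.

    In each column a sequence receives 1 / (k * n), where k is the number of
    distinct symbols in the column and n is how many sequences share its
    symbol. Assumption: gaps are counted like any other symbol.
    """
    weights = [0.0] * len(seqs)
    for column in zip(*seqs):
        counts = Counter(column)
        k = len(counts)
        for i, symbol in enumerate(column):
            weights[i] += 1.0 / (k * counts[symbol])
    total = sum(weights)
    return [w / total for w in weights]

def weighted_conservation(seqs, alphabet_size=20, gap="-"):
    """Per-column score 1 - H / log(alphabet_size) using weighted frequencies.

    1.0 means fully conserved; all-gap columns are given 0.0.
    """
    weights = henikoff_weights(seqs)
    scores = []
    for column in zip(*seqs):
        freq = Counter()
        for w, symbol in zip(weights, column):
            if symbol != gap:
                freq[symbol] += w
        total = sum(freq.values())
        if total == 0:
            scores.append(0.0)
            continue
        entropy = -sum((f / total) * log(f / total) for f in freq.values())
        scores.append(1.0 - entropy / log(alphabet_size))
    return scores

if __name__ == "__main__":
    alignment = ["ACD", "ACE", "ACE", "AGF"]  # toy alignment, equal lengths
    print(henikoff_weights(alignment))
    print(weighted_conservation(alignment))
```

In this toy alignment the two identical sequences get the smallest weights and the most divergent one gets the largest. The same weights could in principle replace the uniform pair counts in a substitution-matrix score.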

Do you have any advice, or maybe a hidden magic package that I haven’t found :-)?





