Protein name in GenBank

Hi everyone!

I am writing a Python program that downloads genome sequences from GenBank and extracts the protein sequences. It should collect the sequences of the same protein from different organisms into one file. The problem is that the same protein can have different names in GenBank, so my program can't recognise those sequences as belonging to the same protein. For example: “photosystem II cytochrome b559 alpha subunit” and “photosystem II cytochrome b559 subunit alpha”. To a human these are nearly identical, but to a Python string comparison they are different. Does anybody have an idea how to merge the same protein from different organisms? Maybe by using some of the feature.qualifiers?
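To make the kind of matching concrete, here is a minimal sketch of one possible approach: build an order-insensitive key from the product name, so that names differing only in word order fall into the same group. The function names `normalize_product` and `group_by_product`, the organism labels, and the short sequences are hypothetical placeholders. With Biopython, the name would typically come from the `product` entry of a CDS feature's `qualifiers`. This is only an assumption about what might work; it would not catch real synonyms or abbreviations.

```python
import re
from collections import defaultdict


def normalize_product(name):
    """Return an order-insensitive key for a protein product name.

    Lowercases the name, splits it into alphanumeric tokens, and sorts them,
    so "alpha subunit" and "subunit alpha" produce the same key.
    """
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(tokens))


def group_by_product(entries):
    """Group (organism, product, sequence) tuples by normalized product name."""
    groups = defaultdict(list)
    for organism, product, sequence in entries:
        groups[normalize_product(product)].append((organism, product, sequence))
    return dict(groups)


# Hypothetical placeholder data; real entries would be read from GenBank records.
entries = [
    ("Organism A", "photosystem II cytochrome b559 alpha subunit", "MSG"),
    ("Organism B", "photosystem II cytochrome b559 subunit alpha", "MSG"),
    ("Organism C", "photosystem II cytochrome b559 beta subunit", "MTI"),
]

groups = group_by_product(entries)
for key, members in groups.items():
    print(key, [organism for organism, _, _ in members])
```

Here the two alpha subunit names end up in one group and the beta subunit in another. Where a `gene` qualifier is present (for example a gene symbol), grouping on that instead may be more robust, although it is not guaranteed to be filled in or consistent across records.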

Thanks for any help!

genbank python biopython
