You can use this (save the script below as uniprot2go.py):
wget --quiet -O- ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.dat.gz | zcat | python uniprot2go.py GO:0070337 GO:0016021 > out.fasta
#!/usr/bin/env python
"""Fetch UniProt entries for the given GO terms."""
import sys
from Bio import SwissProt

# load GO terms from the command line
gos = set(sys.argv[1:])
sys.stderr.write("Looking for %s GO term(s): %s\n" % (len(gos), " ".join(gos)))

# parse the Swiss-Prot dump from stdin
k = 0
sys.stderr.write("Parsing...\n")
for i, r in enumerate(SwissProt.parse(sys.stdin)):
    # progress counter, overwritten in place on stderr
    sys.stderr.write(" %9i\r" % (i + 1,))
    # scan cross-references; each is a tuple starting with (database, id)
    for ex_db_data in r.cross_references:
        extdb, extid = ex_db_data[:2]
        if extdb == "GO" and extid in gos:
            k += 1
            # FASTA header: first accession plus the matching GO term
            sys.stdout.write(">%s %s\n%s\n" % (r.accessions[0], extid, r.sequence))
sys.stderr.write("Reported %s entries\n" % k)
For me, it takes less than 6 minutes to parse the latest Swiss-Prot dump (this depends on your internet connection).
Of course, if you plan to run it multiple times, it is better to download the dump once and run the script on the local copy.
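For the local-copy case, something along these lines should work. It is only a sketch: the function names (matching_go_ids, fasta_entry, main), the script name uniprot2go_local.py, and the local file name uniprot_sprot.dat.gz are all my own assumptions, not part of the original script. It reads the gzipped dump directly with Python's gzip module instead of piping it through zcat, and needs Python 3 plus Biopython.

```python
#!/usr/bin/env python
"""Sketch: report entries for given GO terms from a local gzipped Swiss-Prot dump."""
import gzip
import sys


def matching_go_ids(cross_references, gos):
    """Return the GO ids of one record that are in the requested set.

    Each cross-reference is a tuple whose first two items are
    (database name, identifier), as in the script above.
    """
    return [ref[1] for ref in cross_references
            if len(ref) >= 2 and ref[0] == "GO" and ref[1] in gos]


def fasta_entry(accession, go_id, sequence):
    """Format one FASTA entry with the same header layout as the script above."""
    return ">%s %s\n%s\n" % (accession, go_id, sequence)


def main(path, gos):
    # Imported here so the helpers above can be used without Biopython.
    from Bio import SwissProt

    k = 0
    # "rt" opens the gzipped file in text mode, which is what the parser reads.
    with gzip.open(path, "rt") as handle:
        for record in SwissProt.parse(handle):
            for go_id in matching_go_ids(record.cross_references, gos):
                k += 1
                sys.stdout.write(
                    fasta_entry(record.accessions[0], go_id, record.sequence))
    sys.stderr.write("Reported %s entries\n" % k)


if __name__ == "__main__":
    if len(sys.argv) < 3:
        sys.stderr.write("usage: uniprot2go_local.py DUMP.dat.gz GO:... [GO:...]\n")
    else:
        main(sys.argv[1], set(sys.argv[2:]))
```

Usage, assuming the dump was saved next to the script:
python uniprot2go_local.py uniprot_sprot.dat.gz GO:0070337 GO:0016021 > out.fasta

As in the original, an entry annotated with several of the requested GO terms is written once per matching term.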