ncbi – How to use Biopython Entrez efetch to get a GenBank file from the “gene” database

I am trying to programmatically retrieve whole genes (with the intron and exon structure as defined by the CDS features) using the Biopython Entrez esearch and efetch utilities.

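from Bio import Entrez

Entrez.email = "myemail@gmail.com"

# Search the gene database and keep the list of matching gene IDs
handle = Entrez.esearch(db="gene", retmax="10", term="P53 AND Homo Sapiens [organism]")
record = Entrez.read(handle)
handle.close()

# Fetch the text report for the first hit; read from the efetch handle, not the esearch one
handle_first_record = Entrez.efetch(db="gene", id=record["IdList"][0], rettype="gb", retmode="text")
info = handle_first_record.read()
handle_first_record.close()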

# Is there a more direct way of getting the start and stop from the Annotation field?

# Line 6 of the text report is the "Annotation:" line
annot = info.split("\n")[6].split()
chrom = annot[3]
start_stop = annot[4].split("..")
start = start_stop[0][1:]
stop = start_stop[1][:-1]
print(f"Chromid: {chrom} Start: {start} Stop: {stop}")
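# efetch selects a sub-range with seq_start/seq_stop; rettype="gb" requests GenBank flat-file text
gbfile_handle = Entrez.efetch(db="nuccore", id=chrom, seq_start=start, seq_stop=stop, rettype="gb", retmode="text")
# Need to figure out how to parse this record to get a GenBank file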

A typical annotation is given below:

1. TP53
Official Symbol: TP53 and Name: tumor protein p53 [Homo sapiens (human)]
Other Aliases: BCC7, BMFS5, LFS1, P53, TRP53
Other Designations: cellular tumor antigen p53; antigen NY-CO-13; mutant tumor protein 53; p53 tumor suppressor; phosphoprotein p53; transformation-related protein 53; tumor protein 53; tumor supressor p53
Chromosome: 17; Location: 17p13.1
Annotation: Chromosome 17 NC_000017.11 (7668421..7687490, complement)
MIM: 191170
ID: 7157  
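
Since the report above has a fixed "Annotation:" line, one possible way to avoid relying on line and token positions is a regular expression. The following is only a sketch: `parse_annotation` and `ANNOTATION_RE` are hypothetical names, not part of Biopython, and the pattern assumes the accession and coordinates always look like the TP53 example shown here.

```python
import re

# Hypothetical helper (not a Biopython function): extract the accession,
# coordinates and strand from the "Annotation:" line of the gene text report.
# Assumes the layout shown in the TP53 example above.
ANNOTATION_RE = re.compile(
    r"^Annotation:[ \t]+.*?(?P<acc>[A-Z]{2}_\d+\.\d+)[ \t]+"
    r"\((?P<start>\d+)\.\.(?P<stop>\d+)(?P<comp>,\s*complement)?\)",
    re.MULTILINE,
)


def parse_annotation(report):
    """Return accession, start, stop and strand flag from a gene text report."""
    match = ANNOTATION_RE.search(report)
    if match is None:
        raise ValueError("no Annotation line with coordinates found")
    return {
        "accession": match["acc"],
        "start": int(match["start"]),
        "stop": int(match["stop"]),
        "complement": match["comp"] is not None,
    }


report = """
1. TP53
Chromosome: 17; Location: 17p13.1
Annotation: Chromosome 17 NC_000017.11 (7668421..7687490, complement)
MIM: 191170
ID: 7157
"""
print(parse_annotation(report))
```

Unlike the index-based split, this also records whether the gene lies on the complement strand, which matters when fetching the sequence.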

Is there an easier way in Biopython to get the GenBank file for a human gene, starting from the name of the gene, than the approach above?
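
For the last step, fetching the region as a GenBank record, a minimal sketch might look like the following. It assumes the E-utilities `seq_start`, `seq_stop` and `strand` parameters (with `strand=2` for the complement strand) are passed through `Entrez.efetch` as keywords, and that `rettype="gb"` returns a parsable flat file for the region; if the returned record lacks sequence, `rettype="gbwithparts"` may be worth trying. `build_efetch_args` and `fetch_gene_region` are hypothetical helper names.

```python
def build_efetch_args(accession, start, stop, complement=False):
    """Keyword arguments for an efetch call on one region of a nuccore record."""
    return {
        "db": "nuccore",
        "id": accession,
        "seq_start": start,
        "seq_stop": stop,
        "strand": 2 if complement else 1,  # E-utilities: 1 = plus, 2 = minus
        "rettype": "gb",
        "retmode": "text",
    }


def fetch_gene_region(accession, start, stop, complement=False,
                      email="myemail@gmail.com"):
    """Fetch the region and parse it into a SeqRecord (needs network access)."""
    # Imported here so build_efetch_args can be used without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email
    handle = Entrez.efetch(**build_efetch_args(accession, start, stop, complement))
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()


# Coordinates taken from the TP53 annotation in the question.
args = build_efetch_args("NC_000017.11", 7668421, 7687490, complement=True)
print(args)

# With network access, the record could then be fetched and saved:
# from Bio import SeqIO
# record = fetch_gene_region("NC_000017.11", 7668421, 7687490, complement=True)
# SeqIO.write(record, "TP53.gb", "genbank")
```

The parsed record exposes its annotated features, including CDS entries, through `record.features`.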
