Accessing the reference genome from the NCBI Genome database with Biopython
Hello all,
I would like to get the RefSeq UID of the reference genome for a given taxonomy ID, using the Genome database through Biopython.
I will try to explain what I mean. I search the Genome database using a taxonomy ID. It returns a single result, and I then click on the “Reference genome” link.
Then I scroll to the bottom of the page and read the RefSeq reference genome UID for the given taxonomy ID.
Is it possible to achieve this using Biopython?
Using Entrez Direct (output truncated to save space):
$ esearch -db taxonomy -query "1005566 [taxID]" | elink -target nuccore | efetch -format docsum | xtract -pattern DocumentSummary -if SourceDb -contains refseq -element Caption,Title,SourceDb
NZ_AMUP00000000 Escherichia coli 07798, whole genome shotgun sequencing project refseq
NZ_JH964525 Escherichia coli 07798 strain 7798 E07798.contig.252, whole genome shotgun sequence refseq
NZ_JH964524 Escherichia coli 07798 strain 7798 E07798.contig.251, whole genome shotgun sequence refseq
NZ_JH964523 Escherichia coli 07798 strain 7798 E07798.contig.249, whole genome shotgun sequence refseq
NZ_JH964522 Escherichia coli 07798 strain 7798 E07798.contig.248, whole genome shotgun sequence refseq
NZ_JH964521 Escherichia coli 07798 strain 7798 E07798.contig.247, whole genome shotgun sequence refseq
NZ_JH964520 Escherichia coli 07798 strain 7798 E07798.contig.246, whole genome shotgun sequence refseq
NZ_JH964519 Escherichia coli 07798 strain 7798 E07798.contig.245, whole genome shotgun sequence refseq
NZ_JH964518 Escherichia coli 07798 strain 7798 E07798.contig.244, whole genome shotgun sequence refseq
NZ_JH964517 Escherichia coli 07798 strain 7798 E07798.contig.241, whole genome shotgun sequence refseq
If you only want NC* accessions, then:
$ esearch -db taxonomy -query "511145 [taxID]" | elink -target nuccore | efetch -format docsum | xtract -pattern DocumentSummary -if SourceDb -contains refseq -element Caption,Title,SourceDb | grep NC
NC_000913 Escherichia coli str. K-12 substr. MG1655, complete genome refseq
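Since the question asks for Biopython, here is a rough sketch of the same taxonomy-to-nuccore lookup with Bio.Entrez. It is not a tested drop-in replacement for the pipeline above. The function names, the batch size and the email placeholder are made up for illustration. One deliberate difference: instead of filtering on SourceDb, it keeps records whose accession has the RefSeq shape (two letters and an underscore, such as NC_ or NZ_), on the assumption that the default Biopython document summary exposes the accession as "Caption" and the description as "Title".

```python
import re

# RefSeq accessions start with two capital letters and an underscore
# (NC_, NZ_, NM_, ...); GenBank accessions such as U00096 do not.
REFSEQ_ACCESSION = re.compile(r"^[A-Z]{2}_")


def filter_refseq(summaries, prefix=None):
    """Return (accession, title) pairs for records that look like RefSeq.

    `summaries` is any iterable of dict-like records with "Caption" and
    "Title" keys. If `prefix` is given (for example "NC_"), only
    accessions starting with that prefix are kept.
    """
    kept = []
    for doc in summaries:
        accession = str(doc["Caption"])
        if not REFSEQ_ACCESSION.match(accession):
            continue
        if prefix and not accession.startswith(prefix):
            continue
        kept.append((accession, str(doc["Title"])))
    return kept


def fetch_nuccore_summaries(taxid, email, batch_size=200):
    """Follow taxonomy -> nuccore links and return the document summaries.

    Hypothetical helper: needs Biopython and network access to NCBI.
    """
    # Imported here so that filter_refseq can be used without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address

    # A taxonomy UID is the taxonomy ID itself, so no esearch step is needed.
    handle = Entrez.elink(dbfrom="taxonomy", db="nuccore", id=str(taxid))
    linksets = Entrez.read(handle)
    handle.close()

    # Collect the linked nuccore UIDs, dropping duplicates but keeping order.
    ids = list(dict.fromkeys(
        link["Id"]
        for linkset in linksets
        for linkdb in linkset.get("LinkSetDb", [])
        for link in linkdb["Link"]
    ))

    # Fetch summaries in batches to keep each request small.
    summaries = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        handle = Entrez.esummary(db="nuccore", id=",".join(batch))
        summaries.extend(Entrez.read(handle))
        handle.close()
    return summaries
```

To get only the NC_ records, as the grep step above does, you could call filter_refseq(fetch_nuccore_summaries(511145, email="you@example.org"), prefix="NC_") and print the resulting pairs. Replace the placeholder email with your own address.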