Tag: locus_tag

How to identify locus_tag by using RefSeq protein info (WP_*)

Hi, I would like to know the locus tag of a protein annotated with RefSeq (WP_*). For example, I would like to identify the genomic location of a protein (WP_073031595.1) and also know its adjacent proteins. The GenBank file…


Can a GFF2 reference be used in htseq-count?

Dear all, we have recently been working with an E. coli plasmid and tried to summarize the gene counts from our RNA-Seq samples. The short reads were mapped to the E. coli plasmid using TopHat, which generated BAM files accordingly. However, we were unable to obtain a GFF3 version of our target plasmid genome, the…


Parsing GenBank file: get locus tag vs product

As your sample GenBank file was incomplete, I went online to find a sample file that could be used in an example, and I found this file. I parsed it using this code and the Bio::GenBankParser module, guessing which parts of the structure you were after. In this case, “features”…


The meaning of the greater-than character (>) in gene positions in GenBank files

Hello. This character caused some issues when I used the contents of GenBank files. Here is an example of '>' usage in a GenBank file: gene 957467..>957886 /locus_tag="BME_RS04610" /old_locus_tag="BMEI0926" I couldn't find what '>' signifies. Does anyone know?
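For context, in the INSDC feature table format used by GenBank, a `>` before the end coordinate (or a `<` before the start coordinate) marks a partial feature whose true boundary lies beyond the stated position. The following is a minimal sketch of one way to tolerate these markers when reading simple `start..end` locations. The function name `parse_location` is hypothetical, and the sketch assumes a plain location with no `complement(...)` or `join(...)` wrapper.

```python
import re

# Matches simple locations such as "957467..>957886" or "<1..200".
# complement(...) and join(...) are deliberately not handled in this sketch.
LOCATION_RE = re.compile(r"^(<?)(\d+)\.\.(>?)(\d+)$")


def parse_location(location):
    """Return (start, end, start_partial, end_partial) for a simple location."""
    match = LOCATION_RE.match(location.strip())
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    start_flag, start, end_flag, end = match.groups()
    return int(start), int(end), start_flag == "<", end_flag == ">"


print(parse_location("957467..>957886"))  # (957467, 957886, False, True)
```

Keeping the partial flags separate from the integer coordinates lets downstream code do arithmetic on the positions while still knowing the feature is incomplete at that end.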


How to extract two genomic location numbers within the following fasta header?

I am wondering how to extract the two numbers within the location field of the following FASTA header: >lcl|CP033719.1_cds_AYW77996.1_1542 [locus_tag=EGX94_07890] [protein=copper oxidase] [protein_id=AYW77996.1] [location=1885267..1887939] [gbkey=CDS]
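One possible approach, sketched here in Python rather than bash, is a regular expression anchored on the `[location=...]` field. The function name `extract_location` is hypothetical, and the pattern assumes a plain `start..end` location (optionally with `<` or `>` partial markers); headers whose location is wrapped in `complement(...)` or `join(...)` would need a more permissive pattern.

```python
import re

# Assumes a plain "start..end" location inside the [location=...] field.
LOCATION_FIELD_RE = re.compile(r"\[location=<?(\d+)\.\.>?(\d+)\]")


def extract_location(header):
    """Return (start, end) as integers, or None if no simple location is found."""
    match = LOCATION_FIELD_RE.search(header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


header = (
    ">lcl|CP033719.1_cds_AYW77996.1_1542 [locus_tag=EGX94_07890] "
    "[protein=copper oxidase] [protein_id=AYW77996.1] "
    "[location=1885267..1887939] [gbkey=CDS]"
)
print(extract_location(header))  # (1885267, 1887939)
```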


How to extract genomic upstream region of a protein identified by its NCBI accession number?

I have a list of NCBI protein accession numbers. I would like to extract the upstream genomic region of the corresponding gene’s nucleotide sequence. I will be thankful to you if you can…


does not contain a ‘gene’ attribute

Dear BIOSTAR community, I’m trying to make a count matrix with htseq-count: htseq-count -s yes -t gene -i gene 01.sorted.sam annotation_cattle.gff > 01.txt Even with --idattr=gene, it returns an error: Error processing GFF file (line 1864255 of file annotation_cattle.gff): Feature gene-D1Y31_gp1…
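With `-t gene -i gene`, htseq-count looks up the `gene` attribute on every feature whose type column is `gene`, so a single feature lacking that attribute is enough to stop the run. A minimal sketch for locating such features before counting is shown below. The helper name `features_missing_attribute` and the example lines are hypothetical, and the sketch assumes GFF3-style `key=value` attributes in the ninth column.

```python
def features_missing_attribute(gff_lines, feature_type, attribute):
    """Yield (line_number, attributes_column) for features lacking `attribute`."""
    for line_number, line in enumerate(gff_lines, start=1):
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != feature_type:
            continue
        # GFF3 attributes are semicolon-separated key=value pairs.
        keys = {
            pair.split("=", 1)[0].strip()
            for pair in columns[8].split(";")
            if pair.strip()
        }
        if attribute not in keys:
            yield line_number, columns[8]


# Made-up example lines, for illustration only.
example = [
    "##gff-version 3\n",
    "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=gene-A;gene=A\n",
    "chr1\tsrc\tgene\t200\t300\t.\t+\t.\tID=gene-B;locus_tag=B\n",
]
for line_number, attributes in features_missing_attribute(example, "gene", "gene"):
    print(line_number, attributes)
```

For a real annotation, the same function can be given an open file handle, for example `open("annotation_cattle.gff")`, in place of the list. If the reported features carry another attribute that is present on every `gene` feature (such as `ID`), passing that attribute name to `-i` is one option worth checking.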


Download nucleotide sequence with locus_tag

I have a list of locus_tag values. My idea was to download the sequences using esearch, but the downloaded file is not the desired gene; instead, the nucleotide sequence of the entire contig is downloaded. In this example, the gene I want to download has 830…
