Tag: gff3

How to sort gff3 according to chromosome order?

Hello, I am curious how to sort a GFF3 file by chromosome while keeping each parent (gene) and its child features (mRNA, CDS and exon) intact. Input example: Chr6 EVM gene 212579245 212580018 . + . ID=evm.TU.Chr6.3631;Name=EVM prediction Chr6.3631…

featurecounts not working on mirbase annotation file

Hello, I am trying to analyze miRNA-seq data but I am having problems with the mapping. I always get pretty much 0 counts with the built-in annotation file, so I got one from miRBase. However, I always get an error when…

MacOS Quicklook plugin for gtf and gff3 files?

Does anyone know of any macOS Quick Look plugins that can handle GTF and/or GFF3 files? Google searches are turning up nothing.

ChIP-Seq

Input Data (Reference Feature). LiftOver option: We provide on-the-fly lift-over of reference data sets between different genome assemblies for broader comparison among annotations. Upload custom Data, File Format: All ChIP-seq tools use SGA (Simplified Genome Annotation) files as an internal working format. SGA input…

Perl debugging help – miRWoods

Hello, I was wondering if anyone with Perl experience could help me debug miRWoods? I tried reaching out to the authors via e-mail with no response, and issues on GitHub are turned off, so I’d be super grateful if anyone could provide any insight. When I run miRWoods I get…

Extract transcript ID and gene ID from ITAG4.1_gene_models.gff

Hello all, I was hoping to extract the transcript ID and corresponding gene ID from ITAG4.1_gene_models.gff (downloaded from solgenomics.net/ftp/genomes/Solanum_lycopersicum/annotation/ITAG4.1_release/) using R. I have tried different methods. First method: List <- tr2g_gff3(file = directory, write_tr2g = FALSE, get_transcriptome = FALSE, save_filtered_gff =…

IGV custom tracks from gff3 files; how to customize feature blocks “shape”?

Hi, in IGV I am using GFF3 files to visualize the genomic location of features that I identified through my experiments. The features are visualized as rectangular “boxes” with strand direction shown as arrowheads. I would…

SnpEff Error

Hello guys, I ran this command: snpEff Prunus_armeniaca_cv_Stella.gff3.gz output.vcf > output.txt and I am getting this error. Could you please help me with this issue? java.lang.RuntimeException: Property: ‘Prunus_armeniaca_cv_Stella.gff3.gz.genome’ not found at org.snpeff.interval.Genome.<init>(Genome.java:104) at org.snpeff.snpEffect.Config.readGenomeConfig(Config.java:784) at org.snpeff.snpEffect.Config.readConfig(Config.java:751) at org.snpeff.snpEffect.Config.init(Config.java:529) at org.snpeff.snpEffect.Config.<init>(Config.java:116) at org.snpeff.SnpEff.loadConfig(SnpEff.java:429) at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:889) at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:875) at…

How to extract summary statistics from GFF3 /GTF file?

Hi! You could try using the gffutils Python library as an alternative to the AGAT toolkit for extracting summary statistics from GFF3/GTF files. gffutils is a flexible and efficient library for working with GFF and GTF files in a variety of formats. Here’s an example of how to use gffutils…

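
The excerpt above recommends gffutils but is cut off before its example, so the gffutils calls themselves are not reproduced here. As a rough stand-in, the following is a minimal standard-library sketch of the same idea (per-feature-type summary statistics). The function name `summarize_gff` and the sample records are hypothetical and only for illustration.

```python
from collections import Counter


def summarize_gff(lines):
    """Return (feature counts, summed feature lengths) keyed by feature type.

    Hypothetical helper: it reads tab-separated GFF3/GTF lines and ignores
    comments, directives, and malformed rows.
    """
    counts = Counter()
    lengths = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        feature_type = cols[2]
        start, end = int(cols[3]), int(cols[4])
        counts[feature_type] += 1
        # GFF coordinates are 1-based and inclusive.
        lengths[feature_type] += end - start + 1
    return counts, lengths


# Made-up records, for illustration only.
example = [
    "##gff-version 3",
    "Chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1",
    "Chr1\tsrc\tmRNA\t1\t100\t.\t+\t.\tID=t1;Parent=g1",
    "Chr1\tsrc\texon\t1\t40\t.\t+\t.\tParent=t1",
    "Chr1\tsrc\texon\t61\t100\t.\t+\t.\tParent=t1",
]
counts, lengths = summarize_gff(example)
```

For a real file, passing an open file handle instead of the list should work the same way, since the function only iterates over lines.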

Homer detailed annotation

Dear all, I used HOMER annotatePeaks.pl to annotate my peaks. Here is the format of my command: annotatePeaks.pl peak.bed ref.fa -gff3 ref.gff3 > PeakAnno.txt. However, I don’t know why the columns “Focus Ratio/Region Size” and “Detailed Annotation” are “NA”. I am more interested in…

Issue about generating EMBL Flat file for ENA submission

Hello all! I am trying to generate an EMBL flat file to submit an annotated assembly to ENA. I am using EMBLmyGFF3 to generate the flat file from the whole genome FASTA file and the GFF3 file. I am getting…

Detection of Burkholderia pseudomallei with CRISPR-Cas12a based on specific sequence tags

1. Introduction. Melioidosis is a tropical disease caused by the aerobic, Gram-negative motile bacillus Burkholderia pseudomallei, which is classified as a category B biological agent by the Centers for Disease Control and Prevention (CDC) of America (1, 2). It is a highly pathogenic endemic zoonotic disease in many tropical countries, particularly in…

snpeff not recognizes Gff3 file

I built a database with a different genome version, Zea_mays B73v4, and provided a GFF3 annotation file of the same version. When I run the command with the snpEff database, an output is generated. The genes text file contains the gene IDs of the…

How to convert GTF output of TSEBRA to gff3 file as an input for EVM ?

Hello, I am curious whether anyone has experience using the TSEBRA GTF output in EVM. The GTF file generated by TSEBRA gives an error while converting to GFF3 format to be used as an input for…

ANNOVAR – Bioinformatics DB

ANNOVAR is a software tool that annotates single nucleotide variants (SNVs) and insertions/deletions. This tool is particularly useful in the field of genetics research, where high-throughput sequencing platforms generate massive amounts of genetic variation data. However, it can be a challenge to pinpoint a small subset of functionally essential variants…


FeatureCounts tool

Can we use an annotation file in GFF3 format with the featureCounts tool? (Posted by Vikram.)

Maker Gff3 file issues

Hi community, this is really a technical question; I hope it is OK to post it here… I am trying to import the GFF3 file from MAKER into my JBrowse to view the annotations. I am using the maker2jbrowse script and getting constant errors. There…

Antismash on Fasta files

Hello, you can provide FASTA files to it ########### antiSMASH 6.1.1 ############# usage: antismash [--taxon {bacteria,fungi}] [--output-dir OUTPUT_DIR] [--output-basename OUTPUT_BASENAME] [--reuse-results PATH] [--limit LIMIT] [--minlength MINLENGTH] [--start START] [--end END] [--databases PATH] [--write-config-file PATH] [--without-fimo] [--executable-paths EXECUTABLE=PATH,EXECUTABLE2=PATH2,…] [--allow-long-headers] [-v] [-d] [--logfile PATH] [--list-plugins] [--check-prereqs] [--limit-to-record RECORD_ID] [-V] [--profiling] [--skip-sanitisation] [--skip-zip-file]…

Can’t install Transdecoder –

I was trying to install TransDecoder to do transcriptome annotation, but when I run make test, this shows up. I’ve tried to install the module but it is not working. Is there any way around it? Can’t locate URI/Escape.pm in @INC (you may need to…

gff3 – Extracting animo acid and nucleotide sequences from KofamScan output and codon alignment

I want to extract the amino acid sequences from KofamScan output, and my workflow is as attached in the picture. For the analysis I am doing, I need to get the amino acid sequences, align them, and do codon alignment with the corresponding nucleotide sequences, so that I can get…

Convert Abricate output tsv file to gff3 format

Here’s one way using awk, that I think fulfills the requirements. It adds each of the column names (on the first line) to an array to make accessing each of the fields a bit easier. This approach isn’t strictly necessary, but it does make for a more readable solution in…

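
The answer above is an awk solution that the excerpt truncates, so it is not reproduced here. The header-name lookup it describes can be sketched in Python with `csv.DictReader`, which lets each field be accessed by column name instead of by position. The column names used (`SEQUENCE`, `START`, `END`, `STRAND`, `GENE`, `%IDENTITY`) are assumed to match the Abricate header, and the choice of feature type and attributes is hypothetical.

```python
import csv
import io


def abricate_tsv_to_gff3(tsv_text, source="abricate"):
    """Convert Abricate-style TSV text to GFF3 text (illustrative sketch).

    Assumes a tab-separated header row containing SEQUENCE, START, END,
    STRAND, GENE and %IDENTITY. Attribute values are not percent-encoded.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = ["##gff-version 3"]
    for row in reader:
        # Fields are looked up by header name, not by column index.
        attributes = "Name={};identity={}".format(row["GENE"], row["%IDENTITY"])
        out.append("\t".join([
            row["SEQUENCE"], source, "gene",
            row["START"], row["END"],
            ".", row["STRAND"], ".", attributes,
        ]))
    return "\n".join(out)


# Made-up input, for illustration only.
tsv = (
    "#FILE\tSEQUENCE\tSTART\tEND\tSTRAND\tGENE\t%IDENTITY\n"
    "a.fna\tcontig1\t10\t200\t+\tgeneA\t99.5\n"
)
gff3 = abricate_tsv_to_gff3(tsv)
```

Looking fields up by name keeps the conversion working if the column order changes, which is the readability benefit the answer mentions.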

Improving conversion of abricate tsv file to gff3 file

Since such a neat solution (abricate tsv to gff3) was provided by Steve, here are a few other steps that I am looking to add so that the script matures enough to be usable by many others. I have two files – (1) fasta file with .fna extension, and…

list of old gene name in C. elegans

Hi, a gene of C. elegans may have two different names. For instance, WBGene00006993 has two locus names: (1) “zyg-8” and (2) the old/other name “apo-1” (www.wormbase.org/species/c_elegans/gene/WBGene00006993#0-9e-3). I have compiled (1) WB id, (2) locus, and (3) cosmid id of all genes from C. elegans GFF3…

Converting Abricate output (.tsv) to gff3 format

Hello everyone, I have a TSV file generated by abricate (github.com/tseemann/abricate). I need to convert it to GFF3 format with certain columns retained, certain columns reordered, and other columns deleted. We are trying to use these gff3 files for downstream applications and…

Converting an output de-novo transcriptome assembled with Trinity to a .gff3 file

Hello! I’ve de novo assembled a transcriptome with Trinity, resulting in Trinity.fasta, whose headers look like this: >TRINITY_DN29256_c0_g1_i1 len=323 path=[0:0-322] followed, on the next line, by the sequence. To run an external downstream analysis with an R script,…

org.biojava.nbio.core.sequence.CDSSequence.getSequenceAsString java code examples | Tabnine

/** * A CDS sequence if negative stranded needs to be reverse complement * to represent the actual coding sequence. When getting a ProteinSequence * from a TranscriptSequence this method is callled for each CDSSequence * {@link www.sequenceontology.org/gff3.shtml} * {@link biowiki.org/~yam/bioe131/GFF.ppt} * @return coding sequence */ public String getCodingSequence() {…


How to perform synteny alignments and plots only with a gene?

Hi everyone, I’m trying to perform synteny alignments and plots for a gene of interest and its exons. I have two genomes in FASTA format and their corresponding annotations in GFF3 format. Does anyone know some software that…

Error while converting GFF file to GTF using AGAT

Hi, I am trying to convert a GFF file to a GTF file that I want to use for STAR. I tried AGAT (latest version) for the conversion, but it gives me a series of errors (mainly two types). I have attached the error…

gff3 file format

Can I use a GFF3 file as a reference genome? I added a screenshot; how can I find a reference genome in this picture?

In addition to the chado, are there other biological database schemas?

I would like to know what other biological database schemas exist in addition to Chado. Edit: I’m participating in a project, and they asked me to create a database for plants that uses ontologies, a…

gff format to genome annotation

I am mapping RNA-seq transcripts against a genome to annotate it. I am looking at Spaln and GMAP, and they both offer two types of GFF output (GFF3 gene format and GFF3 match format). Which one is better for proceeding with annotation?…

TRF output to .gff file

Hello, Biostars! I’m trying to get a .gff file from Tandem Repeats Finder output. Since TRF can’t do that, I’ve found the TRAP tool, which can create .gff files. But TRAP creates as many .gff files as there are contigs (OK, there is the ‘cat’ command). The…

Detection of Streptococcus pyogenes M1UK in Australia and characterization of the mutation driving enhanced expression of superantigen SpeA

Walker, M. J. et al. Disease manifestations and pathogenic mechanisms of Group A Streptococcus. Clin. Microbiol. Rev. 27, 264–301 (2014). Carapetis, J. R., Steer, A. C., Mulholland, E. K. & Weber, M. The global burden of group A streptococcal diseases. Lancet Infect. Dis. 5,…

How to use chado after installation?

Hello, this is my first contact with Chado and Perl. After some problems I managed to install it; however, I don’t know how to continue. GMOD provides documentation for converting gff files to gff3 and other data. However, I am…

Genome data visualization

Hi, I need help producing visualizations of genomic DNA regions such as those seen in these figures, which I obtained from a publication. The other image also shows the regions of a chromosome by color. I just need information on the right tools (not IGV) that…

PROKKA.gff file is not compatible with featureCounts

Hi all, I am trying to count the number of reads that map to each gene using featureCounts (RNA-Seq PE, Linux). My inputs: a GFF file generated using Prokka, a GTF file generated by NCBI annotation, and sorted BAM files generated by bowtie2 and samtools. When I used the GTF file generated by NCBI, featureCounts ran without…

Sort gff3 on chromosome, position and then featuretype (gene, mRNA, exon, CDS)

Is it possible to sort a GFF3 file on chromosome, position and then feature type (gene, mRNA, exon, CDS)? The order of the feature types is important when converting a GFF file to a GTF file with gffread. If the…
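
The ordering asked for here (chromosome, then start position, then feature type) can be expressed as a single sort key. The sketch below is one possible approach, not the answer from the original thread; `FEATURE_RANK`, `natural_key` and `sort_gff3` are hypothetical names, and the natural ordering of chromosome names (Chr2 before Chr10) is an added assumption.

```python
import re

# Assumed priority: parents before children at the same start coordinate.
FEATURE_RANK = {"gene": 0, "mRNA": 1, "exon": 2, "CDS": 3}


def natural_key(seqid):
    """Split 'Chr10' into ['Chr', 10, ''] so that Chr2 sorts before Chr10."""
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", seqid)]


def sort_gff3(lines):
    """Keep comment/directive lines first, then sort the feature lines."""
    header = [l for l in lines if l.startswith("#")]
    features = [l for l in lines if l.strip() and not l.startswith("#")]

    def key(line):
        cols = line.split("\t")
        rank = FEATURE_RANK.get(cols[2], len(FEATURE_RANK))
        return (natural_key(cols[0]), int(cols[3]), rank)

    return header + sorted(features, key=key)


# Made-up records, for illustration only.
records = [
    "##gff-version 3",
    "Chr10\t.\tgene\t1\t50\t.\t+\t.\tID=g2",
    "Chr2\t.\texon\t5\t20\t.\t+\t.\tParent=t1",
    "Chr2\t.\tmRNA\t5\t90\t.\t+\t.\tID=t1;Parent=g1",
    "Chr2\t.\tgene\t5\t90\t.\t+\t.\tID=g1",
]
sorted_records = sort_gff3(records)
```

A purely positional sort like this can interleave the features of overlapping genes, so it may not be enough when each gene's children must stay grouped under their parent.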

Converting GFF3 and FASTA files to GenBank format – Job in Data Science And Analytics

Posted at – Feb 6, 2023. I have GFF3 files (annotation) for my bacterial genomes. I want a script that can be used to convert a GFF3 file and its FASTA file into a GenBank file. Thanks.…

mVISTA annotation

Hello Biostars, I have been trying to use mVISTA to compare chloroplast DNA. For this purpose I used the NCBI references as input and downloaded the annotation in GFF3 format for Arabidopsis thaliana (MZ323108) as a reference sequence. However, the result does not show the…

Seqlengths of x contains NA values!

Hello, I would like to use ORFik to determine the coverage of the different ORFs across the maize genome. I have ribo-seq data, the latest annotation file (a GFF3), and the v5 genome fasta file for B73. After running my code, three Large CompressedGRangesLists are created and none of them…


gff file from NCBI RefSeq GCF dataset has an invalid format

Thank you for noticing this. It is indeed an issue in the GFF3 file. The root of the problem is it’s a gene that is impossible to correctly represent in GFF3 because it incorporates sequence from both strands via trans_splicing. The complexity of this gene can be seen on the…


Retrieve specific fasta sequences from a group of assemblies

Hi all, sorry if this question has been addressed before, but I haven’t been able to find a solution. I have a lot of assemblies (around 800) and I would like to retrieve the fasta sequence for a…

error making Txdb from GTF and fasta files

Hello, I would like to use ORFik to map Ribo-reads to different ORFs in the maize genome. The latest version of the genome is Zm-B73-REFERENCE-NAM-5.0.fa. The annotation file is a GFF3. I have the genome fasta file, the fasta fai file, and the GFF3 file. The ORFik package uses GTF…


How to convert VCF (with possible predicted gene effects) to protein fasta/MSA

How can I convert a VCF (possibly with predicted gene effects) with multiple samples to protein FASTA/MSA? Input: VCF (possibly with gene/protein effects already predicted via e.g. SnpEff); GFF3 (for the reference protein sequence and maybe to predict effects)…

genbank sequence format

This document is an overview of the Entrez databases, with general information on… If you are not sure that the “Save” option in your program will do this for you, use “Save As”. In Excel, select “Save As” from the File menu. …optimizations to reduce memory…

can gff2 reference used in htseq-count?

Dear all, we have recently been working with an E. coli plasmid and tried to summarize the gene counts from our RNA-Seq samples. The short reads were mapped to the E. coli plasmid using TopHat, which generated BAM files accordingly. However, we were unable to obtain a gff3 version of our target plasmid genome, the…

Use RSEM and Bowtie2 to align paired-end sequences

I want to use rsem-calculate-expression and the bowtie2 aligner to align paired-end sequences under the following conditions: 2 processors; generate a BAM file; very fast bowtie2 sensitivity; append gene/transcript names. My code: rsem-refseq-extract-primary-assembly GCF_000001405.31_GRCh38.p5_genomic.fna GCF_000001405.31_GRCh38.p5_genomic.primary_assembly.fna rsem-prepare-reference --gff3 GCF_000001405.31_GRCh38.p5_genomic.gff --bowtie2 --bowtie2-path /bowtie2-2.4.5-py39hd2f7db1_2 --trusted-sources…

Htseq is giving me 0 counts using the GFF3 of miRBase

Hello! I am trying to annotate a miRNA-seq so that it gives me mature miRNAs where I already have 5p and 3p. For this, I have used the index mm10.fa and the miRBase mmu.gff3. I have aligned with HISAT2 and am trying to count with HTSeq, however I get 0…


genbank to GTF in galaxy

Hi all, I am working in Galaxy and have a genome file in GenBank format. To use featureCounts for my RNA-seq data, I need to convert the GenBank file to GTF, because that’s the format the featureCounts tool in Galaxy expects. Now, I…

What is RNAcentral? | RNAcentral

RNAcentral is a database of non-coding RNA sequences that aggregates ncRNA data from over 40 member resources known as Expert Databases. Non-coding RNAs: Similar to mRNAs, non-coding RNAs (ncRNAs) are transcribed from DNA but are not translated into proteins. NcRNAs are found in all organisms and have a broad range…

computeMatrix in deeptool is Running with no result

Hi all, I wonder if someone can help me by explaining what to pass to the -R <bed file> argument of the command below: computeMatrix scale-regions -S <bigwig file(s)> -R <bed file> -b 1000. What I did, for example: I download…

Indexing with STAR

Hello, I am working with RNA-seq data and creating an index of the Gossypium hirsutum reference genome using STAR. STAR asks for the GTF annotation format, while my file is GFF3. According to the literature, in order to run with a GFF file I need to remove --sjdbOverhang 50 and…

Senior Bioinformatics Scientist II/ Staff Bioinformatics Scientist

Inscripta was founded in 2015 and recently launched the world’s first benchtop Digital Genome Engineering platform. The company is growing aggressively, investing in its leadership, team, and technology with a recent $150mm financing round led by Fidelity and T. Rowe Price. The company’s advanced CRISPR-based platform, consisting of an instrument, reagents,…

Convertion Of Gff3 To Gtf

How do I convert a GFF file to a GTF file? Is there any tool available? The easiest way is to use the gffread program that comes with the Cufflinks software suite (Tuxedo): gffread my.gff3 -T -o my.gtf. See gffread…

Adding repeats in a genome fasta at a particular location without messing up the annotations?

I want to add a bunch of expanded repeats to a genome fasta file, e.g. 100 ATTs at a particular location such as Chr1-1:2. How do I do that and at the same time update…

biopython – Updating the GFF3 + Fasta to GeneBank code

I’m trying to convert gff3 and fasta into a gbk file for usage in Mauve. I’ve found a solution but the code is outdated: """Convert a GFF and associated FASTA file into GenBank format. Usage: gff_to_genbank.py <GFF annotation file> <FASTA sequence file> """ import sys import os from Bio import…

Change separator just between specific columns

I am trying to change the separator only between columns 1 and 9. After that, I would like to keep the original separator. These are the first lines of my file, both when reading it directly and when od -c file is executed: #description: evidence-based annotation of the human genome (GRCh38),…

How to assess structural variation in your genome, and identify jumping transposons

Prerequisites. Data: an annotated genome, long reads, repeat annotation. Software: minimap2, samtools, bedtools (for comparisons only), tabix (for visualization only). Installation: /work/gif/remkv6/USDA/04_TEJumper; conda create -n svim_env --channel bioconda svim; source activate svim_env. Map your long reads to your genome with minimap. My directory locale…

EXOM-seq counting

Hi everyone, does anyone know where to download human genome annotation files in GFF3 or GTF format? I want to apply featureCounts to quantify read counts in the BAM file on the command line: featureCounts -t exon -g gene_id -a annotation.gtf -o counts.txt mapping_results_SE.bam Best, AD

counting EXOM reads using subread featureCounts

Does anyone know where to download genome annotation files in GFF3 or GTF format? My idea is to download one of these files on the command line, or any other way, and to apply featureCounts to quantify read counts in the BAM file.…

How GenomicFeatures cdsBy() accounts for the frame info in the gff to get the CDS?

The info in the 8th GFF field (m.ensembl.org/info/website/upload/gff.html): frame: One of ‘0’, ‘1’ or ‘2’. ‘0’ indicates that the first base of the feature is the first base of a codon, ‘1’ that…
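
To make the frame definition concrete, here is a minimal sketch of how a frame value can be applied to a CDS segment. It assumes the usual reading of the truncated definition, in which frame N means the first N bases of the feature belong to the previous codon, and it assumes the sequence is already in transcript orientation. It does not describe how `cdsBy()` itself treats this field; the function names are hypothetical.

```python
def trim_to_codon_start(cds_seq, frame):
    """Drop the leading bases that complete a codon from the previous segment."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    return cds_seq[frame:]


def codons(cds_seq, frame):
    """Split a CDS segment into complete codons, honoring the frame value."""
    in_frame = trim_to_codon_start(cds_seq, frame)
    usable = len(in_frame) - len(in_frame) % 3  # ignore a trailing partial codon
    return [in_frame[i:i + 3] for i in range(0, usable, 3)]


# Frame 1: the second base of the segment is the first base of a codon.
example_codons = codons("GATGGCCTAA", 1)
```

With frame 0 the segment is read from its first base, so `codons("ATGGC", 0)` yields a single complete codon and ignores the two trailing bases.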

RSEM gff ‘parent’ field

Hi Biostars, when quantifying RNA-seq expression with RSEM using a gff3 annotation file, does RSEM take into consideration the ‘locus_tag’ or ‘Parent’ field from the gff3? For example, if the 3-5 processed rRNA/tRNA transcripts within an rDNA locus have the same ‘Parent’ or ‘locus_tag’, are…

Building Snpeff Database

I just went through figuring this out and I thought I would add my process, including the FASTA component, using Vibrio phage VP882 as my example and utilizing the gff strategy you mentioned in a comment to the other answer. Here is everything I did using an established snpEff installation….


Why does featurecounts give me an output file with only 0s?

Hello, I’m trying to run featureCounts on my .bam files, but the resulting file contains only 0s in every row and column. Here are the steps I have taken so far: (de novo) assembled 40 transcripts from RNA-seq…

PASA pipeline updating annotation results in no changes

I’m running these commands, leaving the numbers in the config at their defaults, with transcripts already cleaned. module load Miniconda3/4.9.2 # create database $PASAHOME/scripts/create_sqlite_cdnaassembly_db.dbi -c alignAssembly.config -S /home/data/pest_genomics/pasa/opt/pasa-2.4.1/schema/cdna_alignment_sqliteschema # Upload annotations $PASAHOME/scripts/Load_Current_Gene_Annotations.dbi -c alignAssembly.config -g Chilo_suppressalis_v2_genome_220620_correct2.fasta -P chilo_transfer2_corrected2.gff3 module load SAMtools/1.12-GCC-10.2.0 $PASAHOME/Launch_PASA_pipeline.pl -c…

mm39 genePred file

Hello, I need a gene annotation file for the mm39 mouse genome in the genePred format. I found that there is a utility which can convert the information from the GTF format. However, where I would download the GTF file it says that it was created…

Transform a GTF file into a data frame in R

Hi, I would like to analyse the content of a GTF file. I am fairly proficient with R and dplyr, so I would like to transform my GTF file into a data frame to facilitate my analysis. Does anybody…

How to identify corresponding chromosomes and coordinates of a species for query genes from a another species

I have a list of genes from species A and the reference genome and gff3 of species B. I want to find the homologs of the species A genes in species B. I am…

Splice sequence indexing failed with err =127

I’ve been trying to map my RNA-seq results onto an entire genome, and I’ve encountered a problem with splices. The script.pbs I submitted to the cluster servers is: #PBS -N tophat_cufflinks_1 #PBS -o tophat_cufflinks_1_out.txt #PBS -e tophat_cufflinks_1_error_out.txt #PBS -l nodes=cu01:ppn=24 export…

Extracting GeneID from Dbxref section in GFF file while using featureCounts

Hi all, I’m trying to generate feature count files for the DESeq2 pipeline, but I’ve run into an issue while using featureCounts. I see that the gene IDs that I need aren’t in the same format at…
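
One way to approach this outside featureCounts is to pull the GeneID out of column 9 before counting. The sketch below assumes an NCBI-style attribute layout such as `Dbxref=GeneID:12345,OtherDB:X1`, which the truncated excerpt does not show; the function name and the sample attribute string are hypothetical.

```python
def gene_id_from_attributes(attributes):
    """Return the GeneID value from a GFF3 attribute string, or None.

    Assumes semicolon-separated key=value pairs and a Dbxref value made of
    comma-separated DB:accession entries.
    """
    for field in attributes.strip().split(";"):
        key, _, value = field.partition("=")
        if key.strip() != "Dbxref":
            continue
        for xref in value.split(","):
            db, _, accession = xref.partition(":")
            if db.strip() == "GeneID":
                return accession
    return None


# Made-up attribute string, for illustration only.
example_id = gene_id_from_attributes("ID=gene0;Dbxref=GeneID:12345,OtherDB:X1;Name=abc")
```

The extracted value could then be written into a new attribute (or a GTF `gene_id`) so that a counting tool can group features by it.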

Given set of genomic sequences find potentially enriched genes?

I’m not sure there is any one tool that will do all of this for you. Perhaps some of the following might help. After downloading genes for mm10, construct a list of windows upstream or centered on TSSs, and overlap them or associate them with your coordinates: $ wget -qO-…


gmod post processing not working

In principle it is working. on gmod possessor mod. Found inside – Page 241The problem is that a postprocessing vertex for argument y in the call to Add from A is included in A’s procedure … summary information consists of the following sets , which are computed for each procedure…


The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests

Sequencing of Shorea leprosula genome. Sample collection: Leaf samples of S. leprosula were obtained from a reproductively mature (diameter at breast height, 50 cm) diploid tree B1_19 (DNA ID 214) grown in the Dipterocarp Arboretum, Forest Research Institute Malaysia (FRIM). DNA extraction: Genomic DNA was extracted from leaf samples using the…

Any tools converting Genbank format to GFF3 format?

Dear all, as my title describes, I am asking for help converting GenBank format to GFF format. By googling, I found a Perl script (bp_genbank2gff3.pl), which has many BioPerl dependencies. Could anyone point me to other easy-to-use tools? Any suggestions will be…

Extracting exon level read coverage of a specific gene

HTSeq – Extracting exon level read coverage of a specific gene 1 Dear all, I am trying to quantify RNASeq reads at the “exon level” using HTSeq, to achieve a quantitative exon comparison. I am using ENCODE mouse data, which is Illumina reads aligned to GENCODE M27 (GRCm39) using STAR…

Continue Reading Extracting exon level read coverage of a specific gene

Why rvtest skipped all genes in analysis?

Why rvtest skipped all genes in analysis? 0 Sorry that I’m not a native English speaker, so maybe I did not make myself clear somewhere. If so, please forgive me and welcome to ask. I tried to do gene-level association analysis with Rvtests tools ( github.com/zhanxw/rvtests ). But rvtest detect…

Continue Reading Why rvtest skipped all genes in analysis?

Line length limit on input FASTA file: 65,536 characters (limit imposed by bioperl)

Hello, I’m trying to run the following command: agat_sp_extract_sequences.pl -g JU2526_Y39G10AR.22.gff -f JU2526*_region.fa -p And it throws the following error: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Each line of the file must be less than 65,536 characters. Line 2 is 67824 chars. STACK: Error::throw STACK: Bio::Root::Root::throw /home/lgs6452/.conda/envs/exonerate_env/lib/site_perl/5.26.2/Bio/Root/Root.pm:447 STACK: Bio::DB::IndexedBase::_check_linelength /home/lgs6452/.conda/envs/exonerate_env/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm:757 STACK:…

Continue Reading Line length limit on input FASTA file: 65,536 characters (limit imposed by bioperl)
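
The error above complains that line 2 of the FASTA file is 67824 characters long, which suggests the sequence is stored on a single line. One common workaround is to re-wrap sequence lines to a fixed width before indexing. The sketch below assumes that re-wrapping is acceptable for the file in question; the function name and the 60-character default are illustrative choices, not something stated in the post.

```python
def wrap_fasta(lines, width=60):
    """Yield FASTA lines with every sequence line split into chunks of `width`.

    Header lines (starting with '>') pass through unchanged; blank lines are dropped.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith(">"):
            yield line
        else:
            for i in range(0, len(line), width):
                yield line[i:i + width]


# Tiny invented record; a width of 3 keeps the demonstration readable.
for out in wrap_fasta([">seq1\n", "ACGTACGT\n"], width=3):
    print(out)
```

For a real file, the generator can be fed an open file handle and its output written to a new file, leaving the original untouched.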

How to identify real isomers in mirge3.0’s output files.

How to identify real isomers in mirge3.0’s output files. 0 How do you distinguish/extract ‘real’ isomirnas from the exhaustive output of mirge3.0? I’m trying to do a differential expression analysis on the isomers of miRNA in my dataset. I’m using mirge3.0 with the -gff and other outputs (basically all of…

Continue Reading How to identify real isomers in mirge3.0’s output files.

How to write gffutils.feature.Feature object to file

How to write gffutils.feature.Feature object to file 0 How do you most efficiently write a collection of gffutils.feature.Feature objects to file, so that you can create a gff3 file from a collection of Feature objects? I am trying to create a gff3 file without the ##FASTA part at the bottom,…

Continue Reading How to write gffutils.feature.Feature object to file
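
For the question above, a simple way to get a GFF3 file without a `##FASTA` section is to write the header yourself and then one line per feature. The sketch below relies on the assumption that `str(feature)` produces a tab-delimited GFF line, which is how gffutils `Feature` objects render to the best of my knowledge; the function name `write_gff3` is hypothetical, and plain strings work as input too.

```python
def write_gff3(features, path):
    """Write features to `path` as GFF3, one line each, with no ##FASTA section.

    Each item only needs a str() representation that is a single GFF line.
    """
    with open(path, "w") as handle:
        handle.write("##gff-version 3\n")
        for feature in features:
            # Normalise line endings so each feature ends with exactly one newline.
            handle.write(str(feature).rstrip("\n") + "\n")
```

Because nothing beyond the version header and the feature lines is written, any sequence data attached to the source file is left out.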

Extracting exons and transcripts from gff3/gtf

I was just doing something similar about a week ago. You may be able to accomplish this using the GenomicFeatures R package. First load up the following in R: library(GenomicFeatures) library(GenomicRanges) library(rtracklayer) Then you will need to get the chromosome sizes file, which you can generate with directions from this…

Continue Reading Extracting exons and transcripts from gff3/gtf

Determining LOC coordinate from GFF3 start column

Determining LOC coordinate from GFF3 start column 0 Hi all, total noob question: I have a GFF3 file of a pepper (C. annuum) plant genome that looks like this: seqid src type start end chr01 PROTEIN gene 29119 37617 . - . ID=CA.PGAv.1.6.scaffold567.122 chr01 PROTEIN mRNA 29119 37617 . -…

Continue Reading Determining LOC coordinate from GFF3 start column

Where is the annotation file if using the GtRNAdb (tRNA SE analysis) for mapping to RNAseq libraries?

Where is the annotation file if using the GtRNAdb (tRNA SE analysis) for mapping to RNAseq libraries? 0 Hi all, On the GtRNAdb (tRNA-SE analysis) website there is a file containing fasta sequences of different tRNA genes. gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi38/ I aligned this GtRNAdb database with RNAseq libraries using bowtie2 and got…

Continue Reading Where is the annotation file if using the GtRNAdb (tRNA SE analysis) for mapping to RNAseq libraries?

won’t recognize the gtf or gff3 files (runtime exception)

snpeff : won’t recognize the gtf or gff3 files (runtime exception) 1 Hi, I am trying to build a custom database for snpeff. As instructed both in the forum and snpeff instructions, I did the following; Then I added the following into snpEff.config file # BG94_1 BG94_1.genome : BG94_1 Then…

Continue Reading won’t recognize the gtf or gff3 files (runtime exception)

featureCounts for WGS instead of RNA-seq

featureCounts for WGS instead of RNA-seq 1 Hello all, I have done whole genome sequencing and aligned reads on a reference genome. I have some bam files. I want to get the number of reads mapped to specific regions defined in a gff3 file. I have used featureCounts for RNA-seq…

Continue Reading featureCounts for WGS instead of RNA-seq

Are there any alternatives to Liftoff

Are there any alternatives to Liftoff – Mapping annotations (GFF/GTF) between assemblies 2 Hi, I am annotating closely related accession (varieties) using reference assembly (please note that I am using only a region, so that is the reason why you don’t see chromosome info). I really liked liftoff (ver 1.6.1:…

Continue Reading Are there any alternatives to Liftoff

Blank output When converting GFF3 file to GTF using either gffread or AGAT

Blank output When converting GFF3 file to GTF using either gffread or AGAT 1 Hi, I am trying to convert gff3 file (please see below) to GTF. I used two tools suggested here gffread and agat here. #gff-version 3 Bg_94-1_CX35|chr01_10700000_16500000 Liftoff gene 1 1345 . + . ID=gene_1;Name=Os01g0293800 gene;coverage=0.997;sequence_ID=0.982;extra_copy_number=0;copy_num_ID=gene_1_0 Bg_94-1_CX35|chr01_10700000_16500000…

Continue Reading Blank output When converting GFF3 file to GTF using either gffread or AGAT

How to align and visualize data with .fasta and .gff3 files in IGV?

How to align and visualize data with .fasta and .gff3 files in IGV? 1 Hi everyone, I have an issue in aligning and visualizing my data in IGV. As I read in the IGV manual, to align and visualize data, I need to prepare .BAM/.SAM or other input format…

Continue Reading How to align and visualize data with .fasta and .gff3 files in IGV?

local variable ‘feature_db’ referenced before assignment

Hi, I want to map annotations from the rich gencode human gtf or gff3 to great apes’ genomes. I tried to run liftoff (github.com/agshumate/Liftoff) but it returns the error below. Following the github issue’s post, I confirmed there’s no _db file before running, the annotation file is .gff3, the permission of…

Continue Reading local variable ‘feature_db’ referenced before assignment

Extracting variations in the gene regions and from 100 bp of gene boundary from multiple VCF files

Extracting variations in the gene regions and from 100 bp of gene boundary from multiple VCF files 0 Hi, I sincerely hope that I am not repeating an already answered question. I couldn’t find the answer to my exact problem. I have three VCF files derived using bcftools (isec). Those…

Continue Reading Extracting variations in the gene regions and from 100 bp of gene boundary from multiple VCF files
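
The core of the question above is an interval test: keep a variant if it falls inside a gene or within 100 bp of either gene boundary. The sketch below shows that test only, assuming 1-based inclusive coordinates as used by GFF3 and VCF; the function names and example records are invented, and reading the actual VCF and annotation files is left out.

```python
def flanked_interval(start, end, flank=100):
    """Extend a 1-based inclusive interval by `flank` bp on both sides."""
    return max(1, start - flank), end + flank


def variants_in_genes(variants, genes, flank=100):
    """Return (chrom, pos, gene_id) for variants inside a gene or its flanks.

    variants: list of (chrom, pos)
    genes:    list of (chrom, start, end, gene_id)
    """
    hits = []
    for chrom, pos in variants:
        for g_chrom, g_start, g_end, gene_id in genes:
            low, high = flanked_interval(g_start, g_end, flank)
            if chrom == g_chrom and low <= pos <= high:
                hits.append((chrom, pos, gene_id))
    return hits


# Hypothetical records, invented for illustration.
genes = [("chr1", 1000, 2000, "g1")]
variants = [("chr1", 950), ("chr1", 899), ("chr1", 2100), ("chr2", 1500)]
print(variants_in_genes(variants, genes))
```

Running the same function over the records of each VCF file in turn would give one hit list per file.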

Answer: PopGenome – VCF, fasta, GTF and codons still missing

Dear Maciek Hopefully you were able to solve these problems already. I cannot comment on the main set of issues you reported. However, I also encountered the error: `Error in START[!REV, 3] : incorrect number of dimensions` following certain instances of `set.synnonsyn` which I also noticed occurred for genes which…

Continue Reading Answer: PopGenome – VCF, fasta, GTF and codons still missing

MAKER genome annotation error with SNAP ab initio prediction

I am trying to do a second round of maker genome annotation with ab initio prediction by snap. The error I am getting is as follows: error: unknown command “genome.hmm”, see ‘snap help’. ERROR: Snap failed --> rank=NA, hostname=bioinformatics ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2…

Continue Reading MAKER genome annotation error with SNAP ab initio prediction

How to trim a GFF3 file based on specific coordinates?

How to trim a GFF3 file based on specific coordinates? 0 Hi, I would like to create a GFF3 file containing information only for specific coordinates from the chromosome level GFF3 file. I know how to extract gene and CDS info separately but don’t know how to do trimming based…

Continue Reading How to trim a GFF3 file based on specific coordinates?
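
Trimming a GFF3 file to a coordinate range, as asked above, comes down to filtering on columns 1, 4, and 5. The sketch below keeps only features that lie entirely inside the region; whether partially overlapping features should be kept or clipped is a decision the excerpt does not settle, so treat that rule, the function name, and the example lines as assumptions. Coordinates are not shifted, and comment or directive lines are passed through.

```python
def trim_gff3(lines, seqid, region_start, region_end):
    """Keep GFF3 feature lines fully contained in seqid:region_start-region_end.

    Coordinates are 1-based inclusive. Lines starting with '#' are kept as-is;
    lines that do not have nine tab-separated columns are dropped.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue
        start, end = int(fields[3]), int(fields[4])
        if fields[0] == seqid and start >= region_start and end <= region_end:
            kept.append(line)
    return kept


# Hypothetical lines, invented for illustration.
gff_lines = [
    "##gff-version 3\n",
    "chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1\n",
    "chr1\tsrc\tgene\t150\t400\t.\t+\t.\tID=g2\n",
    "chr2\tsrc\tgene\t100\t200\t.\t+\t.\tID=g3\n",
]
for kept_line in trim_gff3(gff_lines, "chr1", 50, 300):
    print(kept_line)
```

Because parent and child features are filtered independently, a gene that extends past the region is dropped while its fully contained exons are kept; filtering on the gene first and carrying its children along is an alternative if intact gene models matter.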

STAR rna-seq for bacterial genomes

Hi, I would like to use STAR for bacterial genomes. I wanted to ask if this is strongly discouraged or if there is a way to manage the main challenges of mapping reads to prokaryotes. (I know there are specific tools for this purpose, i.e. EdgePro, but I’m a beginner in…

Continue Reading STAR rna-seq for bacterial genomes