Tag: gff3

GenBank to GTF in Galaxy

Hi all, I am working in Galaxy and have a genome file in GenBank format. To use featureCounts for my RNA-seq data, I need to convert the GenBank file to GTF, because that is the format the featureCounts tool in Galaxy expects. Now, I…

What is RNAcentral? | RNAcentral

RNAcentral is a database of non-coding RNA sequences that aggregates ncRNA data from over 40 member resources known as Expert Databases. Non-coding RNAs: Similar to mRNAs, non-coding RNAs (ncRNAs) are transcribed from DNA but are not translated into proteins. ncRNAs are found in all organisms and have a broad range…

computeMatrix in deepTools is running with no result

Hi all, I wonder if someone can help me by explaining what to pass to the -R <bed file> argument of the command below? computeMatrix scale-regions -S <bigwig file(s)> -R <bed file> -b 1000 What I did, for example: I download…

Indexing with STAR

Hello, I am working with RNA-seq data and creating an index of the Gossypium hirsutum reference genome using STAR. STAR asks for a GTF annotation, while my file is GFF3. According to the literature, in order to use a GFF file I need to remove --sjdbOverhang 50 and…

Senior Bioinformatics Scientist II/ Staff Bioinformatics Scientist

Inscripta was founded in 2015 and recently launched the world’s first benchtop Digital Genome Engineering platform. The company is growing aggressively, investing in its leadership, team, and technology with a recent $150 million financing round led by Fidelity and T. Rowe Price. The company’s advanced CRISPR-based platform, consisting of an instrument, reagents,…

Conversion of GFF3 to GTF

How do I convert a GFF file to a GTF file? Is there any tool available? The easiest way is to use the gffread program that comes with the Cufflinks software suite (Tuxedo): gffread my.gff3 -T -o my.gtf See gffread…
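
For anyone scripting this conversion, a minimal Python sketch is given here. It only builds the gffread call quoted in the answer (gffread my.gff3 -T -o my.gtf). The helper name gffread_command is hypothetical, and actually running the command assumes gffread is installed and on the PATH.

```python
import shlex


def gffread_command(gff3_path, gtf_path):
    """Build the gffread call from the answer: gffread <in.gff3> -T -o <out.gtf>."""
    return ["gffread", gff3_path, "-T", "-o", gtf_path]


cmd = gffread_command("my.gff3", "my.gtf")
print(shlex.join(cmd))

# To execute it (assumes gffread is available on this machine):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.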

Adding repeats in a genome fasta at a particular location without messing up the annotations?

I want to add a bunch of expanded repeats to a genome FASTA file, e.g. 100 ATTs at a particular location such as Chr1-1:2. How do I do that and at the same time update…

biopython – Updating the GFF3 + FASTA to GenBank code

I’m trying to convert GFF3 and FASTA into a gbk file for use in Mauve. I’ve found a solution, but the code is outdated: """Convert a GFF and associated FASTA file into GenBank format. Usage: gff_to_genbank.py <GFF annotation file> <FASTA sequence file> """ import sys import os from Bio import…

Change separator just between specific columns

I am trying to change the separator just between columns 1 and 9. After that, I would like to keep the original separator. These are the first lines of my file, both as read directly and as shown by od -c file: #description: evidence-based annotation of the human genome (GRCh38),…
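
One way to read this request is sketched below in Python. It assumes the file is tab-delimited (as GTF normally is), that the new separator is a comma, and that header lines starting with # should pass through unchanged. None of this is confirmed by the excerpt, and the function name is hypothetical.

```python
def change_leading_separator(line, old="\t", new=",", n_fields=9):
    """Replace the separators between the first n_fields columns only.

    Everything after the start of column n_fields is left untouched,
    including any further occurrences of the old separator.
    """
    if line.startswith("#"):
        # Header/comment lines are passed through unchanged (assumption).
        return line
    # Splitting at most n_fields - 1 times keeps the remainder in one piece.
    parts = line.split(old, n_fields - 1)
    return new.join(parts)


# Made-up example line, not taken from the original file.
example = 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tgene_id "G1"; note "a\tb";'
print(change_leading_separator(example))
```

Limiting the number of splits is the key design choice: the ninth column and anything inside it keep their original separators.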

How to assess structural variation in your genome, and identify jumping transposons

Prerequisites. Data: an annotated genome, long reads, repeat annotation. Software: minimap2, samtools, bedtools (for comparisons only), tabix (for visualization only). Installation: /work/gif/remkv6/USDA/04_TEJumper conda create -n svim_env --channel bioconda svim source activate svim_env Map your long reads to your genome with minimap. My directory locale…

EXOM-seq counting

Hi everyone, does anyone know where to download human genome annotation files in GFF3 or GTF format? I want to apply featureCounts to quantify read counts in the BAM file on the command line: featureCounts -t exon -g gene_id -a annotation.gtf -o counts.txt mapping_results_SE.bam Best, AD…

counting EXOM reads using subread featureCounts

Does anyone know where to download genome annotation files in GFF3 or GTF format? My idea is to download one of these files on the command line or in any other way and to apply featureCounts to quantify read counts in the BAM file…

How GenomicFeatures cdsBy() accounts for the frame info in the gff to get the CDS?

The info in the 8th GFF field (m.ensembl.org/info/website/upload/gff.html): frame – One of ‘0’, ‘1’ or ‘2’. ‘0’ indicates that the first base of the feature is the first base of a codon, ‘1’ that…
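
To make the frame field concrete, here is a small Python sketch. It is not how cdsBy() works internally, which the excerpt does not show. It assumes the common GFF reading of the field as the number of leading bases to skip before the first complete codon, and it handles a plus-strand segment only. The function name and the sequences are made up for illustration.

```python
def codons_from_cds_segment(seq, frame):
    """Split a plus-strand CDS segment into complete codons.

    frame is taken as the number of leading bases to skip (0, 1 or 2)
    before the first full codon starts (assumed convention).
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    trimmed = seq[frame:]
    # Ignore a trailing partial codon, which would continue in the next segment.
    usable = len(trimmed) - len(trimmed) % 3
    return [trimmed[i:i + 3] for i in range(0, usable, 3)]


print(codons_from_cds_segment("ATGAAA", 0))
print(codons_from_cds_segment("GATGAAATT", 1))
```

For a minus-strand feature the segment would first have to be reverse-complemented, which this sketch leaves out.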

RSEM gff ‘parent’ field

Hi Biostars, when quantifying RNA-seq expression with RSEM using a GFF3 annotation file, does RSEM take into consideration the ‘locus_tag’ or ‘Parent’ field from the GFF3? For example, if the 3-5 processed rRNA/tRNA transcripts within an rDNA locus have the same ‘Parent’ or ‘locus_tag’, are…

Building Snpeff Database

I just went through figuring this out and I thought I would add my process, including the FASTA component, using Vibrio phage VP882 as my example and utilizing the gff strategy you mentioned in a comment to the other answer. Here is everything I did using an established snpEff installation….

Why does featurecounts give me an output file with only 0s?

Hello, I’m trying to run featureCounts on my .bam files, but the resulting file contains only 0s in every row and column. Here are the steps I have taken so far: (de novo) Assembled 40 transcripts from RNA-seq…

PASA pipeline updating annotation results in no changes

I’m running these commands with the default values in the config and with already cleaned transcripts. module load Miniconda3/4.9.2 # create database $PASAHOME/scripts/create_sqlite_cdnaassembly_db.dbi -c alignAssembly.config -S /home/data/pest_genomics/pasa/opt/pasa-2.4.1/schema/cdna_alignment_sqliteschema # Upload annotations $PASAHOME/scripts/Load_Current_Gene_Annotations.dbi -c alignAssembly.config -g Chilo_suppressalis_v2_genome_220620_correct2.fasta -P chilo_transfer2_corrected2.gff3 module load SAMtools/1.12-GCC-10.2.0 $PASAHOME/Launch_PASA_pipeline.pl -c…

mm39 genePred file

Hello, I need a gene annotation file for the mm39 mouse genome in genePred format. I found that there is a utility which can convert the information from GTF format. However, where I would download the GTF file it says that it was created…

Transform a GTF file into a data frame in R

Hi, I would like to analyse the content of a GTF file. I am fairly proficient with R and dplyr, so I would like to transform my GTF file into a data frame to facilitate my analysis. Does anybody…

How to identify corresponding chromosomes and coordinates of a species for query genes from another species

I have a list of genes from species A and the reference genome and GFF3 of species B. I want to find the homologs of the species A genes in species B. I am…

Splice sequence indexing failed with err =127

I’ve been trying to map my RNA-seq results onto an entire genome with TopHat2, and I’ve encountered a problem with splices. The script.pbs I submitted to the cluster is: #PBS -N tophat_cufflinks_1 #PBS -o tophat_cufflinks_1_out.txt #PBS -e tophat_cufflinks_1_error_out.txt #PBS -l nodes=cu01:ppn=24 export…

Extracting GeneID from Dbxref section in GFF file while using featureCounts

Hi all, I’m trying to generate feature count files for the DESeq2 pipeline, but I’ve run into an issue while using featureCounts. I see that the gene IDs that I need aren’t in the same format at…
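
As a rough illustration of the extraction step, the Python sketch below pulls a GeneID out of a GFF3 attributes column. It assumes the attribute looks like Dbxref=GeneID:12345,OtherDB:xyz, which the excerpt does not show. The function name and the example string are hypothetical.

```python
def gene_id_from_attributes(attributes):
    """Return the GeneID accession from a GFF3 attributes string, or None.

    Assumes key=value pairs separated by ';' and a Dbxref value holding
    comma-separated DB:accession entries.
    """
    for field in attributes.strip().split(";"):
        key, _, value = field.partition("=")
        if key.strip() != "Dbxref":
            continue
        for xref in value.split(","):
            db, _, accession = xref.partition(":")
            if db == "GeneID":
                return accession
    return None


# Made-up attributes column for illustration.
print(gene_id_from_attributes("ID=gene-abc;Dbxref=GeneID:12345,OtherDB:X:1;Name=abc"))
```

The extracted value could then be written back as its own attribute so that a counting tool can group features by it.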

Given set of genomic sequences find potentially enriched genes?

I’m not sure there is any one tool that will do all of this for you. Perhaps some of the following might help. After downloading genes for mm10, construct a list of windows upstream or centered on TSSs, and overlap them or associate them with your coordinates: $ wget -qO-…

gmod post processing not working

In principle it is working. On gmod possessor mod. The problem is that a postprocessing vertex for argument y in the call to Add from A is included in A’s procedure … summary information consists of the following sets, which are computed for each procedure…

The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests

Sequencing of Shorea leprosula genome. Sample collection: Leaf samples of S. leprosula were obtained from a reproductively mature (diameter at breast height, 50 cm) diploid tree B1_19 (DNA ID 214) grown in the Dipterocarp Arboretum, Forest Research Institute Malaysia (FRIM). DNA extraction: Genomic DNA was extracted from leaf samples using the…

Any tools converting Genbank format to GFF3 format?

Dear all, as my title describes, I am asking for help converting GenBank format to GFF format. By googling, I found a Perl script (bp_genbank2gff3.pl), which has many BioPerl dependencies. Could anyone point me to other easy-to-use tools? Any suggestions will be…

Extracting exon level read coverage of a specific gene

Dear all, I am trying to quantify RNA-seq reads at the “exon level” using HTSeq, to achieve a quantitative exon comparison. I am using ENCODE mouse data, which is Illumina reads aligned to GENCODE M27 (GRCm39) using STAR…

Why rvtest skipped all genes in analysis?

Sorry, I’m not a native English speaker, so I may not have made myself clear somewhere. If so, please forgive me and feel free to ask. I tried to do gene-level association analysis with Rvtests (github.com/zhanxw/rvtests). But rvtest detect…

Line length limit on input FASTA file: 65,536 characters (limit imposed by bioperl)

Hello, I’m trying to run the following command: agat_sp_extract_sequences.pl -g JU2526_Y39G10AR.22.gff -f JU2526*_region.fa -p And it throws the following error: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Each line of the file must be less than 65,536 characters. Line 2 is 67824 chars. STACK: Error::throw STACK: Bio::Root::Root::throw /home/lgs6452/.conda/envs/exonerate_env/lib/site_perl/5.26.2/Bio/Root/Root.pm:447 STACK: Bio::DB::IndexedBase::_check_linelength /home/lgs6452/.conda/envs/exonerate_env/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm:757 STACK:…

How to identify real isomers in miRge3.0’s output files

How do you distinguish/extract ‘real’ isomiRNAs from the exhaustive output of miRge3.0? I’m trying to do a differential expression analysis on the isomers of miRNA in my dataset. I’m using miRge3.0 with the -gff and other outputs (basically all of…

How to write gffutils.feature.Feature object to file

How do you most efficiently write a collection of gffutils.feature.Feature objects to a file, so that you can create a GFF3 file from them? I am trying to create a GFF3 file without the ##FASTA part at the bottom,…

Extracting exons and transcripts from gff3/gtf

I was just doing something similar about a week ago. You may be able to accomplish this using the GenomicFeatures R package. First load up the following in R: library(GenomicFeatures) library(GenomicRanges) library(rtracklayer) Then you will need to get the chromosome sizes file, which you can generate with directions from this…

Determining LOC coordinate from GFF3 start column

Hi all, total noob question: I have a GFF3 file of a pepper (C. annuum) genome that looks like this: seqid src type start end chr01 PROTEIN gene 29119 37617 . - . ID=CA.PGAv.1.6.scaffold567.122 chr01 PROTEIN mRNA 29119 37617 . -…

Where is the annotation file if using the GtRNAdb (tRNA SE analysis) for mapping to RNAseq libraries?

Hi all, on the GtRNAdb (tRNA-SE analysis) website there is a file containing FASTA sequences of different tRNA genes: gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi38/ I aligned this GtRNAdb database with RNA-seq libraries using bowtie2 and got…

snpEff won’t recognize the GTF or GFF3 files (runtime exception)

Hi, I am trying to build a custom database for snpEff. As instructed both in the forum and in the snpEff instructions, I did the following; Then I added the following to the snpEff.config file: # BG94_1 BG94_1.genome : BG94_1 Then…

featureCounts for WGS instead of RNA-seq

Hello all, I have done whole genome sequencing and aligned the reads to a reference genome, so I have some BAM files. I want to get the number of reads mapped to specific regions defined in a GFF3 file. I have used featureCounts for RNA-seq…

Are there any alternatives to Liftoff

Mapping annotations (GFF/GTF) between assemblies. Hi, I am annotating closely related accessions (varieties) using a reference assembly (please note that I am using only a region, which is why you don’t see chromosome info). I really liked Liftoff (ver 1.6.1:…

Blank output When converting GFF3 file to GTF using either gffread or AGAT

Hi, I am trying to convert a GFF3 file (please see below) to GTF. I used two tools suggested here, gffread and AGAT. #gff-version 3 Bg_94-1_CX35|chr01_10700000_16500000 Liftoff gene 1 1345 . + . ID=gene_1;Name=Os01g0293800 gene;coverage=0.997;sequence_ID=0.982;extra_copy_number=0;copy_num_ID=gene_1_0 Bg_94-1_CX35|chr01_10700000_16500000…

How to align and visualize data with .fasta and .gff3 files in IGV?

Hi everyone, I have an issue aligning and visualizing my data in IGV. As I read in the IGV manual, to align and visualize data, I need to prepare .BAM/.SAM or other input format…

local variable ‘feature_db’ referenced before assignment

Hi, I want to map annotations from the rich GENCODE human GTF or GFF3 to great apes’ genomes. I tried to run Liftoff (github.com/agshumate/Liftoff) but it returns the error below. Following the GitHub issue’s post, I confirmed there’s no _db file before running, the annotation file is .gff3, the permission of…

Extracting variations in the gene regions and from 100 bp of gene boundary from multiple VCF files

Hi, I sincerely hope that I am not repeating an already answered question. I couldn’t find the answer to my exact problem. I have three VCF files derived using bcftools (isec). Those…

Answer: PopGenome – VCF, fasta, GTF and codons still missing

Dear Maciek Hopefully you were able to solve these problems already. I cannot comment on the main set of issues you reported. However, I also encountered the error: `Error in START[!REV, 3] : incorrect number of dimensions` following certain instances of `set.synnonsyn` which I also noticed occurred for genes which…

MAKER genome annotation error with SNAP ab initio prediction

I am trying to do a second round of MAKER genome annotation with ab initio prediction by SNAP. The error I am getting is as follows: error: unknown command “genome.hmm”, see ‘snap help’. ERROR: Snap failed --> rank=NA, hostname=bioinformatics ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2…

How to trim a GFF3 file based on specific coordinates?

Hi, I would like to create a GFF3 file containing information only for specific coordinates from the chromosome-level GFF3 file. I know how to extract gene and CDS info separately but don’t know how to do trimming based…
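
A minimal Python sketch of one possible approach follows. It keeps only features that lie entirely inside a given window on one sequence and leaves the coordinates unshifted. The function name, the choice to keep the ##gff-version line, and the example lines are assumptions for illustration, not part of the question.

```python
def trim_gff3(lines, seqid, start, end):
    """Keep GFF3 features on seqid that lie fully within [start, end] (1-based, inclusive)."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##gff-version"):
            kept.append(line)  # keep the version pragma (assumption)
            continue
        if not line or line.startswith("#"):
            continue  # drop other comments and blank lines
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a well-formed feature line
        if cols[0] == seqid and int(cols[3]) >= start and int(cols[4]) <= end:
            kept.append(line)
    return kept


# Made-up records for illustration.
records = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1",
    "chr1\tsrc\tgene\t150\t900\t.\t+\t.\tID=g2",
    "chr2\tsrc\tgene\t100\t200\t.\t+\t.\tID=g3",
]
print("\n".join(trim_gff3(records, "chr1", 50, 500)))
```

Features that only partly overlap the window are dropped here; keeping or clipping them would be a different design choice, and child features could lose their parents either way.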

STAR rna-seq for bacterial genomes

Hi, I’d like to use STAR for bacterial genomes. I wanted to ask whether this is strongly advised against, or whether there is a way to manage the main challenges of mapping reads to prokaryotes. (I know there are specific tools for this purpose, e.g. EdgePro, but I’m a beginner in…