Tag: gff3
How to sort gff3 according to chromosome order?
Hello, I am curious how to sort a gff3 file by chromosome while keeping each parent (gene) and its child features (mRNA, CDS and exon) intact. Input example: Chr6 EVM gene 212579245 212580018 . + . ID=evm.TU.Chr6.3631;Name=EVM prediction Chr6.3631…
featurecounts not working on mirbase annotation file
Hello, I am trying to analyze miRNA-seq data but I am having problems with the mapping. I always get pretty much 0 counts with the built-in annotation file, so I got one from miRBase. However, I always get an error when…
MacOS Quicklook plugin for gtf and gff3 files?
Does anyone know of any MacOS Quicklook plugins that can handle gtf and/or gff3 files? Google searches are turning up nothing…
ChIP-Seq
ChIP-Seq Input Data (Reference Feature). LiftOver option: We provide on-the-fly lift-over of reference data sets between different genome assemblies for broader comparison among annotations. Upload custom Data, File Format: All ChIP-seq tools use SGA (Simplified Genome Annotation) files as an internal working format. SGA input…
Perl debugging help – miRWoods
Hello, I was wondering if anyone with Perl experience could help me debug miRWoods? I tried reaching out to the authors via e-mail with no response, and issues on GitHub are turned off, so I’d be super grateful if anyone could provide any insight. When I run miRWoods I get…
Extract transcript ID and gene ID from ITAG4.1_gene_models.gff
Hello all, I was hoping to extract the transcript ID and corresponding gene ID from ITAG4.1_gene_models.gff (downloaded from solgenomics.net/ftp/genomes/Solanum_lycopersicum/annotation/ITAG4.1_release/) using R. I have tried different methods. First method: List <- tr2g_gff3(file = directory, write_tr2g = FALSE, get_transcriptome = FALSE, save_filtered_gff =…
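For readers who want to see the idea independent of any R package, here is a minimal sketch in Python (standard library only, not the R route asked about above). It assumes transcript rows have type mRNA and carry both ID= and Parent= attributes in column 9; the function names are hypothetical, and the real ITAG4.1 file may need a different feature type or attribute key.

```python
def parse_attributes(column9):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    pairs = (item.split("=", 1) for item in column9.strip().split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def transcript_to_gene(gff3_lines, transcript_types=("mRNA",)):
    """Return (transcript_id, gene_id) pairs from an iterable of GFF3 lines."""
    pairs = []
    for line in gff3_lines:
        # Skip directives, comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A feature row has exactly nine tab-separated columns.
        if len(fields) != 9 or fields[2] not in transcript_types:
            continue
        attrs = parse_attributes(fields[8])
        if "ID" in attrs and "Parent" in attrs:
            pairs.append((attrs["ID"], attrs["Parent"]))
    return pairs
```

Passing an open file handle works as well as a list of lines, since the function only iterates over its input.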
IGV custom tracks from gff3 files; how to customize feature blocks “shape”?
Hi, in IGV I am using gff3 files to visualize the genomic location of features that I identified through my experiments. The features are visualized as rectangular “boxes” with strand direction shown as arrow heads. I would…
SnpEff Error
Hello guys, I ran this command: snpEff Prunus_armeniaca_cv_Stella.gff3.gz output.vcf > output.txt and I am getting this error. Could you please help me with this issue? java.lang.RuntimeException: Property: ‘Prunus_armeniaca_cv_Stella.gff3.gz.genome’ not found at org.snpeff.interval.Genome.<init>(Genome.java:104) at org.snpeff.snpEffect.Config.readGenomeConfig(Config.java:784) at org.snpeff.snpEffect.Config.readConfig(Config.java:751) at org.snpeff.snpEffect.Config.init(Config.java:529) at org.snpeff.snpEffect.Config.<init>(Config.java:116) at org.snpeff.SnpEff.loadConfig(SnpEff.java:429) at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:889) at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:875) at…
How to extract summary statistics from GFF3 /GTF file?
Hi! You could try using the gffutils Python library as an alternative to the AGAT toolkit for extracting summary statistics from GFF3/GTF files. gffutils is a flexible and efficient library for working with GFF and GTF files in a variety of formats. Here’s an example of how to use gffutils…
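The gffutils example itself is cut off above, so the following is not that code. It is a hedged, standard-library sketch of what "summary statistics" can mean for a GFF3/GTF file: the number of features per type and the total bases they span. The function name is hypothetical.

```python
from collections import Counter


def feature_summary(gff_lines):
    """Count features and summed feature length (bp) per feature type."""
    counts = Counter()
    total_bp = Counter()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Lines with too few columns (e.g. an embedded FASTA section) are skipped.
        if len(fields) < 5:
            continue
        start, end = int(fields[3]), int(fields[4])
        counts[fields[2]] += 1
        # GFF coordinates are 1-based and inclusive, hence the +1.
        total_bp[fields[2]] += end - start + 1
    return counts, total_bp
```

Overlapping features are counted independently here, so the per-type base total is a sum of lengths rather than a count of unique covered positions.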
Homer detailed annotation
Dear all, I used HOMER annotatePeaks.pl to annotate my peaks. Here is the format of my command: annotatePeaks.pl peak.bed ref.fa -gff3 ref.gff3 > PeakAnno.txt. But I don’t know why the columns “Focus Ratio/Region Size” and “Detailed Annotation” are “NA”. I am more interested in…
Issue about generating EMBL Flat file for ENA submission
Hello all! I am trying to generate an EMBL flat file to submit an annotated assembly to ENA. I am using EMBLmyGFF3 to generate the flat file from the whole genome FASTA file and the GFF3 file. I am getting…
Detection of Burkholderia pseudomallei with CRISPR-Cas12a based on specific sequence tags
1. Introduction Melioidosis is a tropical disease caused by the aerobic, Gram-negative motile bacillus Burkholderia pseudomallei, which is classified as a category B biological agent by the US Centers for Disease Control and Prevention (CDC) (1, 2). It is a highly pathogenic endemic zoonotic disease in many tropical countries, particularly in…
snpeff not recognizes Gff3 file
I made a database with a different genome version, Zea_mays B73v4, and provided a gff3 annotation file of the same version, but when I run the command against the snpEff database, an output is generated. The Genes text file contains the gene IDs of the…
How to convert GTF output of TSEBRA to gff3 file as an input for EVM ?
Hello, I am curious whether anyone has experience using the TSEBRA GTF output in EVM. The GTF file generated by TSEBRA gives an error while being converted to GFF3 format for use as an input for…
ANNOVAR – Bioinformatics DB
ANNOVAR is a software tool that annotates single nucleotide variants (SNVs) and insertions/deletions. This tool is particularly useful in the field of genetics research, where high-throughput sequencing platforms generate massive amounts of genetic variation data. However, it can be a challenge to pinpoint a small subset of functionally essential variants…
FeatureCounts tool
Can we use an annotation file in GFF3 format with the FeatureCounts tool? (Asked by Vikram.)…
Maker Gff3 file issues
Hi community, this is really a technical question, I hope it is OK to post it here… I am trying to import the gff3 file from Maker to my Jbrowse to view the annotations. I am using the maker2jbrowse script and getting constant errors. There…
Antismash on Fasta files
Hello, you can provide FASTA files to it. ########### antiSMASH 6.1.1 ############# usage: antismash [--taxon {bacteria,fungi}] [--output-dir OUTPUT_DIR] [--output-basename OUTPUT_BASENAME] [--reuse-results PATH] [--limit LIMIT] [--minlength MINLENGTH] [--start START] [--end END] [--databases PATH] [--write-config-file PATH] [--without-fimo] [--executable-paths EXECUTABLE=PATH,EXECUTABLE2=PATH2,…] [--allow-long-headers] [-v] [-d] [--logfile PATH] [--list-plugins] [--check-prereqs] [--limit-to-record RECORD_ID] [-V] [--profiling] [--skip-sanitisation] [--skip-zip-file]…
Can’t install Transdecoder –
I was trying to install TransDecoder to do transcriptome annotation, but when I run make test, this shows up. I’ve tried to install the module but it is not working. Is there any way around it? Can’t locate URI/Escape.pm in @INC (you may need to…
gff3 – Extracting amino acid and nucleotide sequences from KofamScan output and codon alignment
I want to extract the amino acid sequences from KofamScan output, and my workflow is as attached in the picture: For the analysis I am doing, I need to get the amino acid sequences, align them, and do codon alignment with the corresponding nucleotide sequences, so that I can get…
Convert Abricate output tsv file to gff3 format
Here’s one way using awk, that I think fulfills the requirements. It adds each of the column names (on the first line) to an array to make accessing each of the fields a bit easier. This approach isn’t strictly necessary, but it does make for a more readable solution in…
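The awk script is not shown in this excerpt, so the following is only an analogous sketch in Python of the same idea: look fields up by header name instead of by position. The column names (SEQUENCE, START, END, STRAND, GENE) are assumed from typical Abricate output, and the choice of which columns to keep, the source label and the feature type are illustrative assumptions, not the requirements from the original question.

```python
import csv
import io
from urllib.parse import quote


def abricate_tsv_to_gff3(tsv_text, source="abricate", feature_type="gene"):
    """Convert Abricate-style TSV text to GFF3 text, addressing columns by name."""
    # DictReader uses the first line as the header, so each row is a dict
    # keyed by column name (the counterpart of the awk name-to-index array).
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = ["##gff-version 3"]
    for row in reader:
        # Percent-encode characters that are reserved in GFF3 attributes.
        attributes = "Name=" + quote(row["GENE"], safe="")
        out.append("\t".join([
            row["SEQUENCE"], source, feature_type,
            row["START"], row["END"], ".", row["STRAND"], ".", attributes,
        ]))
    return "\n".join(out) + "\n"
```

Because fields are addressed by name, reordering or dropping columns in the input does not break the conversion as long as the named columns are present.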
Improving conversion of abricate tsv file to gff3 file
Since such a neat solution (abricate tsv to gff3) was provided by Steve, here are a few other steps that I am looking to add so that the script matures enough to be usable by many others. I have two files – (1) fasta file with .fna extension, and…
list of old gene name in C. elegans
Hi, a C. elegans gene may have two different names. For instance, WBGene00006993 has two locus names: (1) “zyg-8” and (2) the old/other name “apo-1” (www.wormbase.org/species/c_elegans/gene/WBGene00006993#0-9e-3). I have compiled (1) WB id, (2) locus, and (3) cosmid id of all genes from C. elegans GFF3…
Converting Abricate output (.tsv) to gff3 format
Hello everyone, I have a tsv file generated by abricate (github.com/tseemann/abricate). I need to convert it to gff3 format with certain columns retained, certain columns reordered, and other columns deleted. We are trying to use these gff3 files for downstream applications and…
Converting an output de-novo transcriptome assembled with Trinity to a .gff3 file
Hello! I’ve de novo assembled a transcriptome with Trinity, resulting in Trinity.fasta, whose headers look like this: >TRINITY_DN29256_c0_g1_i1 len=323 path=[0:0-322] followed, on the next line, by the sequence. To run an external downstream analysis with an R script,…
org.biojava.nbio.core.sequence.CDSSequence.getSequenceAsString java code examples | Tabnine
/** * A CDS sequence if negative stranded needs to be reverse complement * to represent the actual coding sequence. When getting a ProteinSequence * from a TranscriptSequence this method is called for each CDSSequence * {@link www.sequenceontology.org/gff3.shtml} * {@link biowiki.org/~yam/bioe131/GFF.ppt} * @return coding sequence */ public String getCodingSequence() {…
How to perform synteny alignments and plots only with a gene?
Hi everyone, I’m trying to perform synteny alignments and plots for a gene of interest and its exons. I have two genomes in FASTA format and their corresponding annotations in GFF3 format. Does anyone know some software that…
Error while converting GFF file to GTF using AGAT
Hi, I am trying to convert a gff file to a gtf file which I want to use for STAR. I tried AGAT (latest version) to convert it, but it gives me a series of errors (mainly two types). I have attached the error…
gff3 file format
Can I use a gff3 format file as a reference genome? I added a screenshot; how can I find a reference genome in this picture?…
In addition to the chado, are there other biological database schemas?
I would like to know what other biological database schemas exist in addition to chado. Edit: I’m participating in a project, and they asked me to create a database for plants that uses ontologies, a…
gff format to genome annotation
I am mapping RNAseq transcripts against a genome to annotate it. I am looking at Spaln and GMAP, and they both have two types of gff files as output (GFF3 gene format and GFF3 match format). Which one is better for proceeding with annotation?…
TRF output to .gff file
Hello, Biostars! I’m trying to get a .gff file from Tandem Repeat Finder output. Since TRF can’t do that, I’ve found the TRAP tool, which can create .gff. But TRAP creates as many .gff files as there are contigs (ok, there is the ‘cat’ command). The…
Detection of Streptococcus pyogenes M1UK in Australia and characterization of the mutation driving enhanced expression of superantigen SpeA
Walker, M. J. et al. Disease manifestations and pathogenic mechanisms of Group A Streptococcus. Clin. Microbiol. Rev. 27, 264–301 (2014). Carapetis, J. R., Steer, A. C., Mulholland, E. K. & Weber, M. The global burden of group A streptococcal diseases. Lancet Infect. Dis. 5,…
How to use chado after installation?
Hello, this is the first time I’m working with chado and Perl. After some problems I managed to install it; however, I don’t know how to continue. GMOD provides documentation for converting gff files to gff3 and other data. However, I am…
Genome data visualization
Hi, I need help producing visualizations of genomic DNA regions such as those seen in these figures I obtained from a publication: The other image also shows the regions of a chromosome by color. I just need information on the right tools (not IGV) that…
PROKKA.gff file is not compatible with featureCounts
Hi all, I am trying to count the number of reads that map to each gene using featureCounts (RNA-Seq PE, Linux). My input: a GFF file generated using Prokka, a GTF file generated by NCBI annotation, and sorted BAM files generated by bowtie2 and samtools. When I used the GTF file generated by NCBI, featureCounts ran without…
Sort gff3 on chromosome, position and then featuretype (gene, mRNA, exon, CDS)
Is it possible to sort a gff3 on chromosome, position and then feature type (gene, mRNA, exon, CDS)? The order of the feature types is important when converting a gff file to a gtf file with gffread. If the…
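A minimal sketch of the three-level sort described in the question, using a composite key of (seqid, start, feature-type rank). The rank table and function name are illustrative assumptions; feature types not listed sort after the listed ones.

```python
# Lower rank sorts first when chromosome and start position are equal.
FEATURE_RANK = {"gene": 0, "mRNA": 1, "exon": 2, "CDS": 3}


def sort_gff3(lines):
    """Sort GFF3 feature lines by seqid, start, then feature-type rank."""
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if line.strip() and not line.startswith("#")]

    def sort_key(line):
        fields = line.split("\t")
        rank = FEATURE_RANK.get(fields[2], len(FEATURE_RANK))
        return (fields[0], int(fields[3]), rank)

    # All '#' lines are moved to the top; the feature records follow, sorted.
    return header + sorted(records, key=sort_key)
```

Two caveats: seqids are compared as plain strings, so Chr10 sorts before Chr2 unless a natural-sort key is substituted, and a purely positional sort does not by itself keep every child directly under its own parent when genes overlap.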
Converting GFF3 and FASTA files to GenBank format – Job in Data Science And Analytics
Posted Feb 6, 2023. I have GFF3 files (annotation) for my bacterial genomes. I want a script that can be used to convert a GFF3 file and its FASTA file into a GenBank file. Thanks….
mVISTA annotation
Hello Biostars, I have been trying to use mVISTA to compare chloroplast DNA. For this purpose I used the NCBI references as input and downloaded the annotation in GFF3 format for Arabidopsis thaliana (MZ323108) as a reference sequence. However, the result does not show the…
Seqlengths of x contains NA values!
Hello, I would like to use ORFik to determine the coverage of the different ORFs across the maize genome. I have ribo-seq data, the latest annotation file (a GFF3), and the v5 genome fasta file for B73. After running my code, three Large CompressedGRangesLists are created and none of them…
gff file from NCBI RefSeq GCF dataset has an invalid format
Thank you for noticing this. It is indeed an issue in the GFF3 file. The root of the problem is it’s a gene that is impossible to correctly represent in GFF3 because it incorporates sequence from both strands via trans_splicing. The complexity of this gene can be seen on the…
Retrieve specific fasta sequences from a group of assemblies
Hi all, sorry if this question has been addressed before but I haven’t been able to find a solution to this. I have a lot of assemblies (around 800) and I would like to retrieve the fasta sequence for a…
error making Txdb from GTF and fasta files
Hello, I would like to use ORFik to map Ribo-reads to different ORFs in the maize genome. The latest version of the genome is Zm-B73-REFERENCE-NAM-5.0.fa. The annotation file is a GFF3. I have the genome fasta file, the fasta fai file, and the GFF3 file. The ORFik package uses GTF…
How to convert VCF (with possible predicted gene effects) to protein fasta/MSA
How can I convert a VCF (possibly with predicted gene effects) containing multiple samples to protein fasta/MSA? Input: VCF (possibly with gene/protein effects already predicted via e.g. SnpEff); GFF3 (for the reference protein sequence and maybe to predict effects)…
genbank sequence format
This document is an overview of the Entrez databases, with general information on If you are not sure that the “Save” option in your program will do this for you, use “Save As”, In Excel, select “Save As” from the File menu. optimizations to reduce memory…
can gff2 reference used in htseq-count?
Dear all, we have recently been working with an E. coli plasmid and tried to summarize the gene counts from our RNA-Seq samples. The short reads were mapped to the E. coli plasmid using tophat, which generated bam files accordingly. However, we were unable to obtain a gff3 version of our target plasmid genome, the…
Use RSEM and Bowtie2 to align paired-end sequences
I want to use rsem-calculate-expression and the bowtie2 aligner to align paired-end sequences under the following conditions: 2 processors; generate a BAM file; very fast bowtie2 sensitivity; append gene/transcript name. My code: rsem-refseq-extract-primary-assembly GCF_000001405.31_GRCh38.p5_genomic.fna GCF_000001405.31_GRCh38.p5_genomic.primary_assembly.fna rsem-prepare-reference --gff3 GCF_000001405.31_GRCh38.p5_genomic.gff --bowtie2 --bowtie2-path /bowtie2-2.4.5-py39hd2f7db1_2 --trusted-sources…
Htseq is giving me 0 counts using the GFF3 of miRBase
Hello! I am trying to annotate a miRNA-seq so that it gives me mature miRNAs where I already have 5p and 3p. For this, I have used the index mm10.fa and the miRBase mmu.gff3. I have aligned with HISAT2 and am trying to count with HTSeq, however I get 0…
genbank to GTF in galaxy
Hi all, I am working on Galaxy and have a genome file in GenBank format. To use featureCounts for my RNAseq, I need to convert the GenBank format to GTF because that’s the format the featureCounts tool in Galaxy expects. Now, I…
What is RNAcentral? | RNAcentral
RNAcentral is a database of non-coding RNA sequences that aggregates ncRNA data from over 40 member resources known as Expert Databases.1 Non-coding RNAs Similar to mRNAs, non-coding RNAs (ncRNAs) are transcribed from DNA but are not translated into proteins. NcRNAs are found in all organisms and have a broad range…
computeMatrix in deeptool is Running with no result
Hi all, I wonder if someone can help me by explaining what to pass to the -R <bed file> argument of the command below? computeMatrix scale-regions -S <bigwig file(s)> -R <bed file> -b 1000. What I did, for example: I download…
Indexing with STAR
Hello, I am working with RNA-seq data and creating an index of the Gossypium hirsutum reference genome using STAR. STAR asks for the GTF annotation format while my file is GFF3. According to the literature, in order to run with a GFF file I need to remove --sjdbOverhang 50 and…
Senior Bioinformatics Scientist II/ Staff Bioinformatics Scientist
Inscripta was founded in 2015 and recently launched the world’s first benchtop Digital Genome Engineering platform. The company is growing aggressively, investing in its leadership, team, and technology with a recent $150mm financing round led by Fidelity and T. Rowe Price. The company’s advanced CRISPR-based platform, consisting of an instrument, reagents,…
Conversion Of Gff3 To Gtf
How do I convert a GFF file to a GTF file? Is there any tool available? Answer: The easiest way is to use the gffread program that comes with the Cufflinks software suite (Tuxedo): gffread my.gff3 -T -o my.gtf See gffread…
Adding repeats in a genome fasta at a particular location without messing up the annotations?
I want to add a bunch of expanded repeats to a genome fasta file, e.g. 100 ATTs at a particular location such as Chr1-1:2. How do I do that and at the same time update…
biopython – Updating the GFF3 + Fasta to GenBank code
I’m trying to convert gff3 and fasta into a gbk file for usage in Mauve. I’ve found a solution but the code is outdated: """Convert a GFF and associated FASTA file into GenBank format. Usage: gff_to_genbank.py <GFF annotation file> <FASTA sequence file> """ import sys import os from Bio import…
Change separator just between specific columns
I am trying to change the separator just between columns 1 and 9. After that, I would like to maintain the original separator. Those are first lines of my file both when directly reading it and when od -c file is executed: #description: evidence-based annotation of the human genome (GRCh38),…
How to assess structural variation in your genome, and identify jumping transposons
Prerequisites. Data: an annotated genome, long reads, repeat annotation. Software: minimap2, samtools, bedtools (for comparisons only), tabix (for visualization only). Installation: /work/gif/remkv6/USDA/04_TEJumper; conda create -n svim_env --channel bioconda svim; source activate svim_env. Map your long reads to your genome with minimap. My directory locale…
EXOM-seq counting
Hi everyone, does anyone know where to download human genome annotation files in GFF3 or GTF format? I want to apply featureCounts to quantify read counts in the bam file on the command line. featureCounts -t exon -g gene_id -a annotation.gtf -o counts.txt mapping_results_SE.bam Best, AD…
counting EXOM reads using subread featureCounts
Does anyone know where to download genome annotation files in GFF3 or GTF format? My idea is to download one of these files on the command line or any other way and to apply featureCounts to quantify read counts in the bam file….
How GenomicFeatures cdsBy() accounts for the frame info in the gff to get the CDS?
The info in the 8th gff field (m.ensembl.org/info/website/upload/gff.html): frame – One of ‘0’, ‘1’ or ‘2’. ‘0’ indicates that the first base of the feature is the first base of a codon, ‘1’ that…
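To make the meaning of that 8th column concrete, here is a small sketch that applies the value to a single CDS segment: it drops that many leading bases so the remainder starts on a codon boundary. It illustrates the field definition only and makes no claim about how cdsBy() handles it. The function name is hypothetical, and the sequence is assumed to be already in coding orientation (reverse-complemented for minus-strand features).

```python
def trim_to_codon_start(cds_sequence, phase):
    """Drop `phase` leading bases and any incomplete trailing codon."""
    phase = int(phase)
    if phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1 or 2")
    # Phase counts the bases to skip before the first complete codon.
    in_frame = cds_sequence[phase:]
    # Keep only whole codons so the result can be translated directly.
    usable = len(in_frame) - len(in_frame) % 3
    return in_frame[:usable]
```

For example, with a value of 1 the first base of the segment belongs to a codon that began in the previous segment, so translation of this segment starts at its second base.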
RSEM gff ‘parent’ field
Hi Biostars, when quantifying RNAseq expression with RSEM using a gff3 annotation file, does RSEM take into consideration the ‘locus_tag’ or ‘Parent’ field from the gff3? For example, if the 3-5 processed rRNA/tRNA transcripts within an rDNA locus have the same ‘Parent’ or ‘locus_tag’, are…
Building Snpeff Database
I just went through figuring this out and I thought I would add my process, including the FASTA component, using Vibrio phage VP882 as my example and utilizing the gff strategy you mentioned in a comment to the other answer. Here is everything I did using an established snpEff installation….
Why does featurecounts give me an output file with only 0s?
Hello, I’m trying to run featureCounts on my .bam files, but the resulting file yields only 0s in every row and column. Here are the steps I have taken so far: (de novo) assembled 40 transcripts from RNASeq…
PASA pipeline updating annotation results in no changes
I’m running these commands, leaving the numbers in the config at their defaults, with transcripts already cleaned. module load Miniconda3/4.9.2 # create database $PASAHOME/scripts/create_sqlite_cdnaassembly_db.dbi -c alignAssembly.config -S /home/data/pest_genomics/pasa/opt/pasa-2.4.1/schema/cdna_alignment_sqliteschema # Upload annotations $PASAHOME/scripts/Load_Current_Gene_Annotations.dbi -c alignAssembly.config -g Chilo_suppressalis_v2_genome_220620_correct2.fasta -P chilo_transfer2_corrected2.gff3 module load SAMtools/1.12-GCC-10.2.0 $PASAHOME/Launch_PASA_pipeline.pl -c…
mm39 genePred file
Hello, I need a gene annotation file for the mm39 mouse genome in the genePred format. I found that there is a utility which can convert the information from the gtf format. However, where I would download the gtf file it says that it was created…
Transform a GTF file into a data frame in R
Hi, I would like to analyse the content of a GTF file. I am fairly proficient with R and dplyr, so I would like to transform my GTF file into a data frame to facilitate my analysis. Does anybody…
How to identify corresponding chromosomes and coordinates of a species for query genes from a another species
I have a list of genes from species A and the reference genome and gff3 of species B. I want to find the homologs of the species A genes in species B. I am…
Splice sequence indexing failed with err =127
I’ve been trying to map my RNA-seq results onto an entire genome with Tophat2, and I’ve encountered a problem with splices. The script.pbs I submitted to the cluster servers is: #PBS -N tophat_cufflinks_1 #PBS -o tophat_cufflinks_1_out.txt #PBS -e tophat_cufflinks_1_error_out.txt #PBS -l nodes=cu01:ppn=24 export…
Extracting GeneID from Dbxref section in GFF file while using featureCounts
Hi all, I’m trying to generate feature count files for the DESeq2 pipeline, but I’ve run into an issue while using featureCounts. I see that the gene IDs that I need aren’t in the same format at…
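One way to approach this before counting is to pull the GeneID out of column 9 and rewrite it as its own attribute. The sketch below only shows the extraction step and assumes the NCBI-style layout Dbxref=GeneID:<number>,<other cross-references>; the function name is hypothetical.

```python
def gene_id_from_dbxref(column9):
    """Return the GeneID accession from a GFF3 attribute string, or None."""
    for item in column9.strip().split(";"):
        key, _, value = item.partition("=")
        if key.strip() != "Dbxref":
            continue
        # Dbxref can hold several comma-separated database:accession pairs.
        for xref in value.split(","):
            database, _, accession = xref.strip().partition(":")
            if database == "GeneID":
                return accession
    return None
```

The returned value could then be appended to each line as a separate attribute (for instance a hypothetical gene_id=...), which is easier to point a counting tool at than a composite Dbxref field.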
Given set of genomic sequences find potentially enriched genes?
I’m not sure there is any one tool that will do all of this for you. Perhaps some of the following might help. After downloading genes for mm10, construct a list of windows upstream or centered on TSSs, and overlap them or associate them with your coordinates: $ wget -qO-…
gmod post processing not working
In principle it is working. on gmod possessor mod. Found inside – Page 241The problem is that a postprocessing vertex for argument y in the call to Add from A is included in A’s procedure … summary information consists of the following sets , which are computed for each procedure…
The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests
Sequencing of Shorea leprosula genome Sample collection Leaf samples of S. leprosula were obtained from a reproductively mature (diameter at breast height, 50 cm) diploid tree B1_19 (DNA ID 214) grown in the Dipterocarp Arboretum, Forest Research Institute Malaysia (FRIM). DNA extraction Genomic DNA was extracted from leaf samples using the…
Any tools converting Genbank format to GFF3 format?
Dear all, as my title describes, I am asking for help converting GenBank format to GFF format. By googling, I found a Perl script (bp_genbank2gff3.pl), which has many BioPerl dependencies. Could anyone point me to other easy-to-use tools? Any suggestions will be…
Extracting exon level read coverage of a specific gene
Dear all, I am trying to quantify RNASeq reads at the “exon level” using HTSeq, to achieve a quantitative exon comparison. I am using ENCODE mouse data, which is Illumina reads aligned to GENCODE M27 (GRCm39) using STAR…
Why rvtest skipped all genes in analysis?
Sorry, I’m not a native English speaker, so I may not have made myself clear somewhere. If so, please forgive me and feel free to ask. I tried to do gene-level association analysis with the Rvtests tools (github.com/zhanxw/rvtests). But rvtest detect…
Line length limit on input FASTA file: 65,536 characters (limit imposed by bioperl)
Hello, I’m trying to run the following command: agat_sp_extract_sequences.pl -g JU2526_Y39G10AR.22.gff -f JU2526*_region.fa -p And it throws the following error: ————- EXCEPTION: Bio::Root::Exception ————- MSG: Each line of the file must be less than 65,536 characters. Line 2 is 67824 chars. STACK: Error::throw STACK: Bio::Root::Root::throw /home/lgs6452/.conda/envs/exonerate_env/lib/site_perl/5.26.2/Bio/Root/Root.pm:447 STACK: Bio::DB::IndexedBase::_check_linelength /home/lgs6452/.conda/envs/exonerate_env/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm:757 STACK:…
how to identify real isomers in mirge3.0’s output files.
How do you distinguish/extract ‘real’ isomiRNAs from the exhaustive output of mirge3.0? I’m trying to do a differential expression analysis on the isomers of miRNA in my dataset. I’m using mirge3.0 with the -gff and other outputs (basically all of…
How to write gffutils.feature.Feature object to file
How do you most efficiently write a collection of gffutils.feature.Feature objects to file, so that you can create a gff3 file from a collection of Feature objects? I am trying to create a gff3 file without the ##FASTA part at the bottom,…
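If the only goal is a copy of the file without the embedded sequence, one alternative that avoids Feature objects altogether is to copy lines until the ##FASTA directive. This is a plain-text sketch with a hypothetical function name, not the gffutils approach the question asks about.

```python
def strip_fasta_section(lines):
    """Return the lines that precede the ##FASTA directive."""
    kept = []
    for line in lines:
        # Everything from the ##FASTA directive onward is sequence data.
        if line.strip() == "##FASTA":
            break
        kept.append(line)
    return kept
```

Used on an open file handle, the returned list can be written straight to a new file with writelines().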
Extracting exons and transcripts from gff3/gtf
I was just doing something similar about a week ago. You may be able to accomplish this using the GenomicFeatures R package. First load up the following in R: library(GenomicFeatures) library(GenomicRanges) library(rtracklayer) Then you will need to get the chromosome sizes file, which you can generate with directions from this…
Determining LOC coordinate from GFF3 start column
Hi all, total noob question: I have a GFF3 file of a pepper (C. annuum) plant genome that looks like this: seqid src type start end chr01 PROTEIN gene 29119 37617 . - . ID=CA.PGAv.1.6.scaffold567.122 chr01 PROTEIN mRNA 29119 37617 . -…
Where is the annotation file if using the GtRNAdb (tRNA SE analysis) for mapping to RNAseq libraries?
Hi all, on the GtRNAdb (tRNA-SE analysis) website there is a file containing fasta sequences of different tRNA genes. gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi38/ I aligned this GtRNAdb database with RNAseq libraries using bowtie2 and got…
wont recognize the gtf or gff3 files (runtime exception)
Hi, I am trying to build a custom database for snpEff. As instructed both in the forum and in the snpEff instructions, I did the following; then I added the following to the snpEff.config file: # BG94_1 BG94_1.genome : BG94_1 Then…
featureCounts for WGS instead of RNA-seq
Hello all, I have done whole genome sequencing and aligned reads to a reference genome. I have some bam files. I want to get the number of reads mapped to specific regions defined in a gff3 file. I have used featureCounts for RNA-seq…
Are there any alternatives to Liftoff – Mapping annotations (GFF/GTF) between assemblies
Hi, I am annotating closely related accessions (varieties) using a reference assembly (please note that I am using only a region, which is why you don’t see chromosome info). I really liked Liftoff (ver 1.6.1:…
Blank output When converting GFF3 file to GTF using either gffread or AGAT
Hi, I am trying to convert a gff3 file (please see below) to GTF. I used the two tools suggested here, gffread and AGAT. #gff-version 3 Bg_94-1_CX35|chr01_10700000_16500000 Liftoff gene 1 1345 . + . ID=gene_1;Name=Os01g0293800 gene;coverage=0.997;sequence_ID=0.982;extra_copy_number=0;copy_num_ID=gene_1_0 Bg_94-1_CX35|chr01_10700000_16500000…
How to align and visualize data with .fasta and .gff3 files in IGV?
Hi everyone, I have an issue aligning and visualizing my data in IGV. As I read in the IGV manual, to align and visualize data I need to prepare .BAM/.SAM or another input format…
local variable ‘feature_db’ referenced before assignment
Hi, I want to map annotations from the rich GENCODE human gtf or gff3 to great apes’ genomes. I tried to run Liftoff (github.com/agshumate/Liftoff) but it returns the error below. Following the GitHub issue’s post, I confirmed there’s no _db file before running, the annotation file is .gff3, the permission of…
Extracting variations in the gene regions and from 100 bp of gene boundary from multiple VCF files
Hi, I sincerely hope that I am not repeating an already answered question. I couldn’t find the answer to my exact problem. I have three VCF files derived using bcftools (isec). Those…
Answer: PopGenome – VCF, fasta, GTF and codons still missing
Dear Maciek Hopefully you were able to solve these problems already. I cannot comment on the main set of issues you reported. However, I also encountered the error: `Error in START[!REV, 3] : incorrect number of dimensions` following certain instances of `set.synnonsyn` which I also noticed occurred for genes which…
MAKER genome annotation error with SNAP ab initio prediction
I am trying to do a second round of maker genome annotation with ab initio prediction by snap. The error I am getting is as follows: error: unknown command “genome.hmm”, see ‘snap help’. ERROR: Snap failed –> rank=NA, hostname=bioinformatics ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2…
How to trim a GFF3 file based on specific coordinates?
Hi, I would like to create a GFF3 file containing information only for specific coordinates from the chromosome-level GFF3 file. I know how to extract gene and CDS info separately but don’t know how to do trimming based…
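A minimal sketch of coordinate-based trimming, assuming "trimming" means keeping only features that lie entirely inside one region of one sequence. The function name and the containment rule are assumptions; an overlap rule or coordinate shifting may be what is actually needed.

```python
def features_in_region(gff_lines, seqid, region_start, region_end):
    """Keep feature lines on `seqid` fully inside [region_start, region_end]."""
    kept = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[0] != seqid:
            continue
        start, end = int(fields[3]), int(fields[4])
        # Coordinates are 1-based inclusive; require full containment.
        if start >= region_start and end <= region_end:
            kept.append(line)
    return kept
```

The kept lines retain their original chromosome coordinates. If the region is later extracted as its own FASTA record, each start and end would also need region_start - 1 subtracted.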
STAR rna-seq for bacterial genomes
Hi, I would like to use STAR for bacterial genomes. I wanted to ask if this is strongly discouraged or if there is a way to manage the main challenges of mapping reads to prokaryotes. (I know there are specific tools for this purpose, i.e. EdgePro, but I’m a beginner in…