Tag: GFF

How to label columns in HTSeq output

I’ve been working to process RNA-seq data and I’ve used hisat2 to align my reads to the reference genome. When I take those output files and put them into HTSeq-count using the code below, I get a count matrix but the columns…

Indexing with STAR

Hello, I am working with RNA-seq data and creating an index of the Gossypium hirsutum reference genome using STAR. STAR asks for the GTF annotation format, while my file is GFF3. According to the literature, in order to run a GFF file I need to remove --sjdbOverhang 50 and…

For Differential Gene Expression, which indexing format is better: GFF or GTF?

Hello, I am working on DGE and wish to create a reference index for mapping. Two file formats are used for it: GFF and GTF. My question is: What is the major difference between GTF and GFF?…

How to retrieve fasta sequence after local blast?

Hello, I have created a BLAST database using a reference genome. Then I performed a local BLAST search on the command line using a gene of interest. I obtained some hits with the usual BLAST information. Now, I want to…

Convertion Of Gff3 To Gtf

How do I convert a GFF file to a GTF file? Is there any tool available? The easiest way is to use the gffread program that comes with the Cufflinks software suite (Tuxedo): gffread my.gff3 -T -o my.gtf See gffread…
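
The gffread invocation quoted in this excerpt can also be driven from a script. The following is a minimal Python sketch, assuming gffread is installed and on the PATH; the helper names gffread_gtf_command and convert_gff3_to_gtf are hypothetical and only wrap the command shown in the excerpt.

```python
import shlex
import shutil
import subprocess


def gffread_gtf_command(gff_path, gtf_path):
    """Build the argument list for the gffread call quoted in the excerpt."""
    # -T and -o are the two options used in the excerpt's command.
    return ["gffread", gff_path, "-T", "-o", gtf_path]


def convert_gff3_to_gtf(gff_path, gtf_path):
    """Run gffread; raise if the program is missing or exits with an error."""
    if shutil.which("gffread") is None:
        raise RuntimeError("gffread was not found on PATH")
    subprocess.run(gffread_gtf_command(gff_path, gtf_path), check=True)


if __name__ == "__main__":
    # Print the command line instead of running it, so the sketch is safe to try.
    print(shlex.join(gffread_gtf_command("my.gff3", "my.gtf")))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.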

how to identify real isomers in mirge3.0’s output files.

How do you distinguish/extract ‘real’ isomirnas from the exhaustive output of mirge3.0? I’m trying to do a differential expression analysis on the isomers of miRNA in my dataset. I’m using mirge3.0 with the -gff and other outputs (basically all of…

Submit sequence data to NCBI

Data provision and standards. GEO sequence submission procedures are designed to encourage provision of MINSEQE elements: Thorough descriptions of the biological samples under investigation, and procedures to which they were subjected. Thorough descriptions of the protocols used to generate and process the data. Request updates to accessioned records per the…

How to get genome feature annotation through NCBI API?

Hi, I want to get the whole-genome annotation results with information such as transcript, exon, gene, etc. As we know, NCBI provides a GFF file containing this information, but I want to get the latest content from NCBI…

Refseq annotation for processed/unprocessed Pseudogenes

Hi, I have extracted the pseudogenes from the RefSeq annotation file. However, there is no information in the GFF file about whether a pseudogene is processed or unprocessed. On the other hand, Ensembl/GENCODE GFF files do have this type of information. The problem is not…

How to write gffutils.feature.Feature object to file

How do you most efficiently write a collection of gffutils.feature.Feature objects to file, so that you can create a gff3 file from a collection of Feature objects? I am trying to create a gff3 file without the ##FASTA part at the bottom,…

read count to gene

I am using this command to assign read counts to genes with bedtools intersect: samtools view -Shu -q10 -@ 20 UE-2955-CMLib12_sorted.bam | bedtools intersect -c -a GCA_900659725.1_ASM90065972v1_genomic.gff -b stdin > UE-2955-CMLib{i}_intersect_counts2.bed The command works for other files but not for one file. Which…

fetch out common/conserved genes from a bunch of bacteria species

Hi all, I am having difficulty determining and fetching the common/conserved regulator genes from a bunch of species. I fetched all the regulator genes from each bacterial species according to the GFF annotation. I would like…

gffread error

Hello, I am currently trying to do RNA-seq using public data for Brassica juncea. To use htseq-count to make a count table, I have to convert the GFF file, which I downloaded from the Brassica database, to a GTF file. So I used gffread to convert the GFF file with the command below: gffread Bju.genome.gff -T -o…

Incubator for useful bioinformatics code, primarily in Python and R

Collection of useful code related to biological analysis. Much of this is discussed with examples at Blue collar bioinformatics. All code, images and documents in this repository are freely available for all uses. Code is available under the MIT license and images, documentations and talks under the Creative Commons No…

wont recognize the gtf or gff3 files (runtime exception)

snpeff: wont recognize the gtf or gff3 files (runtime exception). Hi, I am trying to build a custom database for snpEff. As instructed both in the forum and in the snpEff instructions, I did the following; Then I added the following to the snpEff.config file: # BG94_1 BG94_1.genome : BG94_1 Then…

Are there any alternatives to Liftoff

Are there any alternatives to Liftoff – Mapping annotations (GFF/GTF) between assemblies. Hi, I am annotating closely related accessions (varieties) using a reference assembly (please note that I am using only a region, which is why you don’t see chromosome info). I really liked Liftoff (ver 1.6.1:…

convert genomic bigWig file to transcriptome space

Hi all, is anyone aware of a function to convert a bw file mapped to a genome so that it maps to a transcriptome (of said genome), where the input would be the genomic bw file and a gff/gtf/bed annotation and the output a single ‘transcriptomic’…

Blank output When converting GFF3 file to GTF using either gffread or AGAT

Hi, I am trying to convert a GFF3 file (please see below) to GTF. I used two suggested tools, gffread and AGAT. #gff-version 3 Bg_94-1_CX35|chr01_10700000_16500000 Liftoff gene 1 1345 . + . ID=gene_1;Name=Os01g0293800 gene;coverage=0.997;sequence_ID=0.982;extra_copy_number=0;copy_num_ID=gene_1_0 Bg_94-1_CX35|chr01_10700000_16500000…

How to download the Homo_sapiens.GRCh38.100.gtf and Homo_sapiens.GRCh38.dna.primary_assembly.fa files for my analysis?

I am trying to perform STAR alignment and I need the reference files for indexing. I would like to know how to download the Homo_sapiens.GRCh38.100.gtf and Homo_sapiens.GRCh38.dna.primary_assembly.fa files so that I can use the following code for indexing…

Handy online tool for genomic analysis and data visualization

Previously, I recommended two powerful online tools for genomic analysis and data visualization here. I want to share with you other handy tools that I found recently. iTOL is perfect for beautifying genomic data. Circos is useful for displaying the relationships between objects and positions. You could discover their…

How to align and visualize data with .fasta and .gff3 files in IGV?

Hi everyone, I have an issue aligning and visualizing my data in IGV. As I read in the IGV manual, to align and visualize data, I need to prepare .BAM/.SAM or other input format…

does not contain a ‘gene’ attribute

htseq-count returns: does not contain a ‘gene’ attribute. Dear BIOSTAR community, I’m trying to make a count matrix with htseq-count: htseq-count -s yes -t gene -i gene 01.sorted.sam annotation_cattle.gff > 01.txt Even with --idattr=gene, it returns an error: Error processing GFF file (line 1864255 of file annotation_cattle.gff): Feature gene-D1Y31_gp1…

Bio-DB-HTS installation and ensembl-vep

I want to use ensembl-vep with custom annotation. In order to use a GFF file I need to have the Bio-DB-HTS library installed. I downloaded Bio-DB-HTS and ran Build.PL with no errors. When I try to install ensembl-vep it still gives an error asking for the Bio-DB-HTS library….

GRCh37 GFF filter transcript isoforms by RefSeq Select tag or longest

Dear all, I tried to filter the “RefSeq Select” transcript isoforms in the GRCh37.p13 human genome annotation GFF (GCF_000001405.25_GRCh37.p13_genomic.gff.gz). Specifically, my goal is to retain for each gene a transcript isoform with a tag=RefSeq Select attribute if it exists,…

MiRBase miRNA analysis with STAR

Hi all, I am using the latest mouse reference genome (GRCm39) for small RNA-seq/miRNA-seq analysis. The MiRBase database doesn’t have any GFF/GTF file for the mouse mature-miRNA/loop-miRNA. I just have mature-miRNA and loop-miRNA fasta sequences from MiRBase. How can I use the STAR tool to…

Answer: PopGenome – VCF, fasta, GTF and codons still missing

Dear Maciek, hopefully you were able to solve these problems already. I cannot comment on the main set of issues you reported. However, I also encountered the error `Error in START[!REV, 3] : incorrect number of dimensions` following certain instances of `set.synnonsyn`, which I also noticed occurred for genes which…

MAKER genome annotation error with SNAP ab initio prediction

I am trying to do a second round of MAKER genome annotation with ab initio prediction by SNAP. The error I am getting is as follows: error: unknown command “genome.hmm”, see ‘snap help’. ERROR: Snap failed --> rank=NA, hostname=bioinformatics ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2…

How to trim a GFF3 file based on specific coordinates?

Hi, I would like to create a GFF3 file containing information only for specific coordinates from the chromosome-level GFF3 file. I know how to extract gene and CDS info separately but don’t know how to do trimming based…

STAR rna-seq for bacterial genomes

Hi, I’d like to use STAR for bacterial genomes. I wanted to ask if this is strongly discouraged or if there is a way to manage the main challenges of mapping reads to prokaryotes. (I know there are specific tools for this purpose, i.e. EdgePro, but I’m a beginner in…

hisat2 compatibility for long read

Hi, I am trying to align PacBio transcriptome reads against the genome to count the gene number. For paired-end reads I used the following workflow: # convert gff to gtf /home/software/cufflinks-2.2.1/gffread xxx.gff -T -o xxx.gtf # build index /home/software/hisat2-2.2.1/hisat2_extract_exons.py xxx.gtf > xxx.exon /home/software/hisat2-2.2.1/hisat2_extract_splice_sites.py…

How to identify mutations from FASTA sequences?

I have two full genome sequences (in FASTA format) plus an annotation file (in GFF format) from the same organism. One sequence is the reference genome and the other is my test sequence. Would you please suggest some pipelines or tools (preferably R…
