snpEff: won't recognize the GTF or GFF3 files (runtime exception)

Hi,

I am trying to build a custom database for snpEff.
As instructed both on the forum and in the snpEff documentation, I did the following:

Then I added the following to the snpEff.config file:

# BG94_1
BG94_1.genome : BG94_1

Then I added a GFF3 file (I tried with GTF too) to the path/to/snpeff-5.0.1/data/BG94_1 folder, together with BG94_1.fa (both gzipped).
Then I ran the following command (please note that I am using the Bioconda installation of snpEff):

snpEff build -gff3 -v BG94_1

I am getting the following error:

00:00:00 SnpEff version SnpEff 5.0e (build 2021-03-09 06:01), by Pablo Cingolani
00:00:00    Command: 'build'
00:00:00    Building database for 'BG94_1'
00:00:00    Reading configuration file 'snpEff.config'. Genome: 'BG94_1'
00:00:00    Reading config file: /Users/venura/miniconda3/pkgs/snpeff-5.0-hdfd78af_1/share/snpeff-5.0-1/data/BG94_1/snpEff.config
00:00:00    Reading config file: /Users/venura/miniconda3/envs/py38/share/snpeff-5.0-1/snpEff.config
00:00:00    done
Reading GFF3 data file  : '/Users/venura/miniconda3/envs/py38/share/snpeff-5.0-1/./data/BG94_1/genes.gff'
java.lang.RuntimeException: File not found '/Users/venura/miniconda3/envs/py38/share/snpeff-5.0-1/./data/BG94_1/genes.gff'
    at org.snpeff.util.Gpr.reader(Gpr.java:536)
    at org.snpeff.util.Gpr.reader(Gpr.java:507)
    at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryGff.readGff(SnpEffPredictorFactoryGff.java:488)
    at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryGff.create(SnpEffPredictorFactoryGff.java:341)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:370)
    at org.snpeff.SnpEff.run(SnpEff.java:1188)
    at org.snpeff.SnpEff.main(SnpEff.java:168)
java.lang.RuntimeException: Error reading file '/Users/venura/miniconda3/envs/py38/share/snpeff-5.0-1/./data/BG94_1/genes.gff'
java.lang.RuntimeException: File not found '/Users/venura/miniconda3/envs/py38/share/snpeff-5.0-1/./data/BG94_1/genes.gff'
    at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryGff.create(SnpEffPredictorFactoryGff.java:357)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:370)
    at org.snpeff.SnpEff.run(SnpEff.java:1188)
    at org.snpeff.SnpEff.main(SnpEff.java:168)
00:00:00    Logging
00:00:01    Checking for updates...
00:00:03    Done.

Here are some lines from my GFF3 file:

#gff-version 3
    Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    1   1345    .   +   .   ID=gene_1;Name=Os01g0293800 gene;coverage=0.997;sequence_ID=0.982;extra_copy_number=0;copy_num_ID=gene_1_0
    Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    1623    3128    .   -   .   ID=gene_6;Name=Os01g0293900 gene;coverage=0.999;sequence_ID=0.968;extra_copy_number=0;copy_num_ID=gene_6_0
    Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    20379   21605   .   -   .   ID=gene_7;Name=Os01g0294500 gene;coverage=0.999;sequence_ID=0.995;extra_copy_number=0;copy_num_ID=gene_7_0
    Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    48673   50214   .   -   .   ID=gene_5;Name=Os01g0294700 gene;coverage=1.0;sequence_ID=0.995;extra_copy_number=0;copy_num_ID=gene_5_0
    Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    102125  104501  .   -   .   ID=gene_4;Name=Os01g0295600 gene;coverage=1.0;sequence_ID=0.992;extra_copy_number=0;copy_num_ID=gene_4_0
    Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    105502  108051  .   -   .   ID=gene_3;Name=Os01g0295700 gene;coverage=0.996;sequence_ID=0.991;extra_copy_number=0;copy_num_ID=gene_3_0

I am wondering why this is happening.



I finally managed to build the custom database. I am adding the steps here in case someone else needs them.

First, I added my database entry to the snpEff.config file.

# BG94_1
BG94_1.genome : BG94_1
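
For anyone scripting this step, here is a minimal sketch of adding the entry only when it is not already there. The function name `add_genome_entry` is hypothetical (it is not part of snpEff), and the sketch assumes the config file is a plain text file that you are allowed to write to.

```shell
# Hypothetical helper (not part of snpEff): append a genome entry to a
# snpEff.config file unless an entry for that genome is already present.
add_genome_entry() {
    config="$1"   # path to snpEff.config
    genome="$2"   # genome ID, e.g. BG94_1
    if grep -q "^${genome}\.genome[[:space:]]*:" "$config" 2>/dev/null; then
        echo "entry for ${genome} already present in ${config}"
    else
        # Same two lines as shown above: a comment, then '<ID>.genome : <name>'.
        printf '\n# %s\n%s.genome : %s\n' "$genome" "$genome" "$genome" >> "$config"
        echo "added entry for ${genome} to ${config}"
    fi
}
```

It would be called as `add_genome_entry /path/to/snpEff.config BG94_1`. Running it twice leaves a single entry in the file.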

Since my genes.gff3 file kept causing trouble, I used @Juke34's AGAT gff2gtf script to convert it to .gtf (the GTF version matters) with the following command.

agat_convert_sp_gff2gtf.pl -gff genes.gff3  --gtf_version 2.2 -o genes.gtf

Then I placed the annotation file (genes.gtf) together with the sequence file (sequences.fa) inside the predefined folder for the database (in my case, /Users/venura/miniconda3/pkgs/snpeff-5.0-hdfd78af_1/share/snpeff-5.0-1/data/BG94_1).
Then I ran the following command inside the snpeff-5.0-1 folder to build the database:

 java -jar snpEff.jar build -gtf22 -v BG94_1
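
The log in the question shows snpEff looking for a file named exactly genes.gff inside data/BG94_1, so the file names in that folder matter. Below is a rough sketch of a pre-build check, assuming the names used in this thread (genes.gtf or genes.gff for the annotation, sequences.fa for the sequence). The function name `check_build_inputs` is hypothetical, and treating a gzipped copy (name plus .gz) as acceptable is my assumption, not something confirmed in this thread.

```shell
# Hypothetical helper (not part of snpEff): report whether a data folder
# contains the annotation and sequence files under the names used in this thread.
check_build_inputs() {
    dir="$1"          # e.g. /path/to/snpEff/data/BG94_1
    annotation="$2"   # genes.gtf for -gtf22, genes.gff for -gff3
    status=0
    for f in "$annotation" sequences.fa; do
        # Assumption: a gzipped copy (name + .gz) is also acceptable.
        if [ -f "$dir/$f" ] || [ -f "$dir/$f.gz" ]; then
            echo "found:   $dir/$f"
        else
            echo "missing: $dir/$f"
            status=1
        fi
    done
    # Non-zero exit status if anything was missing.
    return "$status"
}
```

For example, `check_build_inputs data/BG94_1 genes.gtf && java -jar snpEff.jar build -gtf22 -v BG94_1` would only start the build when both files are in place.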

