Tag: HaplotypeCaller

variant – Error running gatk HaplotypeCaller with allele specific annotations

I’ve got HaplotypeCaller working nicely in standard mode, like so:
# Run HaplotypeCaller
gatk --java-options "-Xmx4g" HaplotypeCaller --intervals "$INTERVALS" -R "$REF" -I "$OUT"/results/alignment/${SN}_sorted_marked_recalibrated.bam -O "$OUT"/results/variants/${SN}_g.vcf.gz -ERC GVCF
But when I try in allele-specific mode, I get the following error. All I’ve done is add the -G annotations at the end,…

Do VQSR for HaplotypeCaller calls – Sarek

Expected Behavior: Filter the calls from HaplotypeCaller with Variant Quality Score Recalibration according to GATK best practices (tools VariantRecalibrator and ApplyRecalibration; see gatkforums.broadinstitute.org/gatk/discussion/39/variant-quality-score-recalibration-vqsr or a more recent version). Current Behavior: Variant quality score recalibration is currently not included. Asked Jan 26 ’18 at 08:25 by malinlarsson. 1 Answer: Keep in mind that you’d…

Running samtools view on bam affects the number of variants called by both haplotypecaller and deepvariant – C samtools

Thanks for getting back to me Valeriu. As you suggested, I used the latest commit from the develop branch in my pipeline, and the results look good. I was able to replicate the numbers from samtools v1.10.2 and v1.11 for both variant callers. FYI $ docker run scilifelabram/htslib:dev_proper /opt/samtools/samtools version…

GATK GenotypeGVCFs changes HET to REF_ALT

Dear all, I’ve been using GATK HaplotypeCaller / GenotypeGVCFs (v4.2.3.0) for a while but recently found something strange. There is a position (7063) with 8 reads (3T + 5A) that, even though HaplotypeCaller calls it as a HET (see image, lower track): NC_046966.1 7063 . T A,<NON_REF> 177.64 . BaseQRankSum=0.887;DP=8;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=2.369;RAW_MQandDP=16885,8;ReadPosRankSum=1.345 GT:AD:DP:GQ:PL:SB…
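To make the record quoted in this excerpt easier to read, here is a minimal Python sketch that splits its INFO column into key/value pairs. The helper name `parse_info` is hypothetical (it is not part of GATK or any library), and the sketch only assumes the usual VCF convention of `;` between entries and `,` between multiple values; all values are kept as strings.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dict (hypothetical helper, not a GATK API).

    Entries are separated by ';'. Values containing ',' become lists of
    strings; keys without '=' are treated as flags and mapped to True.
    """
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        if not sep:
            fields[key] = True  # flag-type key with no value
        elif "," in value:
            fields[key] = value.split(",")
        else:
            fields[key] = value
    return fields


# INFO column copied from the gVCF record quoted in the post.
INFO = (
    "BaseQRankSum=0.887;DP=8;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;"
    "MQRankSum=2.369;RAW_MQandDP=16885,8;ReadPosRankSum=1.345"
)

info = parse_info(INFO)
print(info["DP"], info["MLEAC"])
```

Read this way, DP=8 matches the 8 reads mentioned in the post, and MLEAC lists one value per ALT allele (A and <NON_REF>).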

Benchmarking the NVIDIA Clara Parabricks germline pipeline on AWS

This blog post was contributed by Ankit Sethia, PhD, and Timothy Harkins, PhD, at NVIDIA Parabricks, and Olivia Choudhury, PhD, Sujaya Srinivasan, and Aniket Deshpande at AWS. This blog provides an overview of NVIDIA’s Clara Parabricks along with a guide on how to use Parabricks within the AWS Marketplace. It…

Padding out a GVCF file with 1000G exomes to get gatk VariantRecalibrator working with a small sample

I’ve got sequencing data for a small 500 bp amplicon from a few samples. GATK best practices suggest running VariantRecalibrator on the GVCF files I generate. I’m trying to get this working, but I get an error about “Found annotations with zero variances”. Reading the gatk manual and other posts…

Large-scale genome-wide study reveals climate adaptive variability in a cosmopolitan pest

Genomic data: The foundational resource for this study was a dataset of 40,107,925 nuclear SNPs sequenced from a worldwide sample of 532 DBM individuals collected in 114 different sites based on our previous project (15). DNA was extracted from each of the 532 individuals using DNeasy Blood and Tissue Kit (Qiagen,…

Why invariant blocks in GATK consistently have very low quality scores (but not variant sites)

I am using the latest GATK 4.1.2.0 to do variant calling on insect samples with a reference genome of a closely related species. The heterozygosity is approximately 0.02. I followed the standard pipeline of “HaplotypeCaller --> GenomicsDBImport --> GenotypeGVCFs” to get my unfiltered VCFs; however, although my variant sites have…

No quality in non-variant sites GATK

Heys, I am doing SNP calling with HaplotypeCaller BP_RESOLUTION, CombineGVCFs with convert-to-base-pair-resolution, and GenotypeGVCFs with include-non-variant-sites in GATK, and when I get my VCF file, the non-variant sites do not have any quality at all: #CHROM POS ID REF ALT QUAL FILTER…

Parallel genomic responses to historical climate change and high elevation in East Asian songbirds

Extreme environments present profound physiological stress. The adaptation of closely related species to these environments is likely to invoke congruent genetic responses resulting in similar physiological and/or morphological adaptations, a process termed “parallel evolution” (1). Existing evidence shows that parallel evolution is more common at the phenotypic level than at…

Germline variant calling pipeline using Snakemake

Hello everybody, as part of a project, I had to write an in-house pipeline to call germline mutations for ~100 patients. For that I used Snakemake and GATK’s best practice guidelines. Steps that take a long time (HaplotypeCaller or BaseQualityScoreRecalibration) are automatically parallelized…

Parallelization in GATK 4

Hi all, I’m trying (and failing) to multi-thread HaplotypeCaller in GATK 4. I read in a few places online that multi-threading in GATK 4 has been made trickier, maybe even unfeasible, but all the places where I read that seem to be more than…

GATK HaplotypeCaller – Shutting down engine

00:32:48.224 INFO  HaplotypeCaller - Shutting down engine
[September 17, 2021 12:32:48 AM CST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=2398617600
java.nio.BufferUnderflowException
        at java.nio.ByteBuffer.get(ByteBuffer.java:688)
        at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:285)
        at java.nio.ByteBuffer.get(ByteBuffer.java:715)
        at htsjdk.samtools.MemoryMappedFileBuffer.readBytes(MemoryMappedFileBuffer.java:34)
        at…

missing genotype ./. even with many reads under AD and DP

Hi All, I am trying to troubleshoot all the missing genotypes in my VCF. I don’t quite understand why I get missing genotypes (./.) when there are plenty of reads under AD and DP. I think it’s because…

HaplotypeCaller Memory Optimization

When using HaplotypeCaller in GATK, is there a fixed amount of memory that works well for the java -Xmx input, or does it scale with the size of the input BAM? E.g., if I have a 50 GB file, do I need to set -Xmx…

ABRF Study Benchmarks NGS Platforms on Human, Microbial Samples, Provides Peek at Genapsys Data

NEW YORK – The results of a major, core facilities-driven benchmarking study for next-generation sequencing platforms are in, and just about every major player in the field can claim a victory of some sort. The data support longstanding advantages touted by market leader Illumina, while also providing a sneak peek…

Speeding up HaplotypeCaller analysis

How can I speed up the HaplotypeCaller command? The input BAM file is about 16G and the running time using the command below is about 15 hours. java -Xmx64G -jar GenomeAnalysisTK.jar -nt 1 -nct 34 -T HaplotypeCaller -R Renamed.fasta -I realigned.bam -o raw_variants.g.vcf.gz -ERC GVCF GATK…

Use of GenotypeGVCFs in population genetic studies

I have 16 whole-genome-sequenced samples from two populations (8 per population). My goal is to detect signatures of selection and introgression. I performed read cleaning, mapping to the reference, and duplicate marking. SNP calling was performed using HaplotypeCaller in GATK for…

How to pass custom software specific variables to nf-core/sarek nextflow pipeline?

I’m attempting to call whole-genome variants using the nf-core/sarek nextflow pipeline. In the QC step there is an option that invokes trim_galore quality trimming, but I don’t know how to pass my custom adapters to be cut as well…

How to filter GATK vcf file using other programs

Hi everyone, I called variants for a WGS project using GATK (HaplotypeCaller). Now, when I try to filter that VCF file with the VariantFiltration command in GATK, the following error message appears: java.lang.NumberFormatException: For input string: "10.90". I asked my…

gatk, ref and alt percentages

Hello everyone, I need some info on how to get the percentage of REF and ALT nucleotides in my data. I am using gatk and am currently not getting REF and ALT percentages. The command I am using for the gatk VCF file…

Consolidate gVCF calling

Hi. I am running genotyping with HaplotypeCaller and GenotypeGVCFs. After that, in the genotype information for some samples in my vcf I found some calls containing multiple genotypes (e.g. 0|0:8,0:11:99:0|1:10777_AGGCGCGGAGG_A:102,126,462:). What could be the issue? Thank you! Here is the full line: chr10 10787 . G GGGCGCGCAGCGCCGGCGCA 356.99 PASS AC=1;AF=0.014;AN=18;BaseQRankSum=-1.762;DP=4023;Ex…

no positional argument is defined for this tool.

Hello, hope all are doing well. I am running the HaplotypeCaller command to generate the variant file, giving multiple input BAM files in a single command. python3 gatk --java-options -Xmx7g HaplotypeCaller --reference ref.fasta --input file1.bam…

Calling variants on reads with MAPQ=0 on HaplotypeCaller or bcftools mpileup

I am working with about 500 samples of human exome data. I used hg19 to align my reads and ran a standard best-practices GATK workflow, only to realise later that a small 1 Mb locus has not mapped properly due…

Error when Phasing with Beagle 5.2

I’m having trouble phasing a multi-sample (9 samples) VCF file produced by gatk HaplotypeCaller with Beagle 5.2. I do not have a genetic map or reference panel. I am working with a very heterozygous group of organisms (sea urchins). When I run beagle with…

So many variants detected.

Dear All, I have done variant calling on germline data with a single sample per individual and two genes. I followed the steps below, but after checking the results I found too many variants. After HaplotypeCaller (step 6) I found 140900 known variants, and the…
