Tag: VCF

Extract species-specific SNPs from VCF files

Hi all, I have polymorphic sites for each species from different accessions in separate VCF files: ~4-5 million SNPs for each of the 5 closely related species. I need to extract species-specific SNPs from this data and was wondering if there are any…

convert beagle genotypes to vcf

Hi, I have a phased Beagle file which I generated with ANGSD v0.935. I would like to use the Beagle utility program beagle2vcf.jar. However, I keep getting this error: java -jar beagle2vcf.jar rs markers bgl_comb ? > vcf Exception in thread "main" java.lang.IllegalArgumentException: Alleles…

Extract different variants of a vcf file comparing to vcf files

I have a treatment VCF file and three control VCF files, generated from somatic variant calling on RNA-seq data. I want to extract variants that are in the treatment sample but not in the control group. To do so, I first normalized them using the bcftools norm -m -any command, then…
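
The "in treatment but not in any control" step can be sketched as a set difference on (CHROM, POS, REF, ALT) keys. This is only a minimal illustration in plain Python, not the poster's method: the function names are hypothetical, the inputs are assumed to be uncompressed, tab-separated VCF lines, and multiallelic records are assumed to have been split already (as bcftools norm -m -any does).

```python
def variant_keys(vcf_lines):
    """Collect (CHROM, POS, REF, ALT) keys from VCF text lines, skipping headers."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        keys.add((chrom, int(pos), ref, alt))
    return keys


def private_to_treatment(treatment_lines, control_line_sets):
    """Return treatment variant keys that are absent from every control."""
    seen_in_controls = set()
    for control_lines in control_line_sets:
        seen_in_controls |= variant_keys(control_lines)
    return variant_keys(treatment_lines) - seen_in_controls
```

Each argument can be an open file handle, for example private_to_treatment(open("treatment.vcf"), [open("control1.vcf")]), where the file names are placeholders.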

intersect VCF files

Dear all, I would appreciate a quick recommendation about the best way to intersect 2 VCF files. Many thanks, Bogdan. bcftools isec is…

Error: ##fileformat=VCFv4.2 does not exist

Hello everybody, I am using PharmCAT to preprocess my VCF file, and for this I am running this command: python3 pharmcat_vcf_preprocessor.py -vcf NA12801.VCF But I am getting this error: Error: ##fileformat=VCFv4.2 does not exist. I generated my VCF file using GATK HaplotypeCaller…

imputing haplotypes from variants in a VCF file

Hi there, I am trying to get my PHG imputation pipeline running. In the configuration file example for “imputing haplotypes from variants in a VCF file”, one parameter is indexFile=/phg/outputDir/vcfIndexFile. Where should I get this vcfIndexFile from? Or is it generated by…

Range-wide whole-genome resequencing of the brown bear reveals drivers of intraspecies divergence

Sample collection. We obtained the short read sequences for 33 brown bear genomes, four polar bears (Ursus maritimus) and two American black bears (Ursus americanus), publicly available from NCBI’s SRA repository (Table S1 and Fig. 1a) [12,13,15,16,40,51,65]. Next, we selected from our private collections a total of 95 additional samples for sequencing, among…

How to choose --mind value for plink SNPs filtering

I am trying to filter SNPs after converting a VCF to plink format, keeping only biallelic SNPs, so I tried the following steps: plink --bfile idfilled_data --geno 0.1 --maf 0.1 --allow-extra-chr --make-bed --out maf_filtered_data plink2 --bfile maf_filtered_data --allow-extra-chr --indep-pairwise 200kb 1 0.15 --out…

snakemake Unexpected keyword bam in rule definition

I am trying to automate GVF calling via DeepVariant using Snakemake with this file, based on Snakemake's own documentation on using DeepVariant via a wrapper (snakemake-wrappers.readthedocs.io/en/stable/wrappers/deepvariant.html#deepvariant): # check if logfile exists or make new if it doesn't import os.path if not os.path.exists("slurm_logs"): os.mkdir("slurm_logs")…

SNP calling

Hello, I created a VCF file from the BAM files of 83 samples with HaplotypeCaller, then filtered it with VariantFiltration. After that, I extracted “GT” with the vcfR package in R, but I have many no-calls (./.). I want to remove the no-calls. I also used gatk HaplotypeCaller -R reference.fasta…

Does the heterozygosity flag (--het) in vcftools calculate observed and expected heterozygosity?

Hi all, I am still a beginner in this field and just want to make sure about what I am doing. I have used vcftools to calculate the heterozygosity of my VCF file, which contains one population, please see…

VCF to Plink files

Hello, I am hoping somebody with experience with plink could help. I am trying to generate plink .bim, .fam and .bed files from a .vcf (one with variants filtered out and one that keeps the variants) and have toyed around with a couple of different commands that I found on…


Manual Polygenic Risk Score calculation

Hi all, I am attempting to calculate a PRS manually, and I’m very close to obtaining a score. To recap what has been done: I have an individual patient whose VCF I annotated with rsIDs. From there, I went to the PGS Catalog…
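
For orientation, a manual score of this kind is usually a weighted sum: each variant's effect weight multiplied by the number of effect alleles the individual carries. The sketch below is a minimal illustration under assumptions, not the poster's workflow: both inputs are hypothetical dictionaries keyed by rsID, and it assumes the dosages already count the same effect allele the weights refer to.

```python
def polygenic_score(weights, dosages):
    """Sum effect_weight * effect-allele dosage over variants present in both inputs.

    weights: rsID -> effect weight (hypothetical layout, e.g. parsed from a scoring file)
    dosages: rsID -> number of effect alleles carried (0, 1 or 2)
    Returns (score, number of variants used).
    """
    score = 0.0
    used = 0
    for rsid, weight in weights.items():
        if rsid in dosages:
            score += weight * dosages[rsid]
            used += 1
    return score, used
```

Returning the number of matched variants alongside the score makes it easy to see how much of the scoring file was actually covered by the annotated VCF.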

How To Install r-bioc-annotationhub on Kali Linux

In this tutorial we learn how to install r-bioc-annotationhub on Kali Linux. r-bioc-annotationhub is a GNU R client to access AnnotationHub resources. What is r-bioc-annotationhub? This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub…

Median depth across samples from multi-sample VCF

Hi folks, I am trying to extract median DP values, across samples, from each line of a multi-sample VCF. (The DP for each individual sample is given in the FORMAT columns. There are ~400,000 samples for most sites.) I know I can…
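
One way to approach the per-line median is to find the position of DP in the FORMAT column and read that sub-field from every sample column. The function below is a minimal sketch with a hypothetical name, assuming uncompressed, tab-separated VCF data lines and integer DP values. With hundreds of thousands of samples per line, a streaming tool may be more practical than plain Python.

```python
from statistics import median


def median_dp(vcf_record):
    """Median per-sample DP for one tab-separated VCF data line, or None if unavailable."""
    fields = vcf_record.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    if "DP" not in format_keys:
        return None
    dp_index = format_keys.index("DP")
    depths = []
    for sample in fields[9:]:
        values = sample.split(":")
        # Trailing FORMAT fields may be dropped, and missing values are written as "."
        if dp_index < len(values) and values[dp_index] not in (".", ""):
            depths.append(int(values[dp_index]))
    return median(depths) if depths else None
```

Samples with a missing DP are skipped rather than counted as zero. That is a design choice, and counting them as zero would lower the median.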

Analyze Amplicon Seq results for variants and mutation sites

Hello, I have several paired-end amplicon sequencing datasets. The amplicon is 222 base pairs and 2×250 sequencing was done, so there is heavy overlap. I already aligned these sequences to the reference amplicon…

Visualize variants and percentage of variants from one sample of Amplicon Seq data?

Hello, we are studying viral evolution by analyzing the mutations present in a specific genomic location and how they change over time. We are performing amplicon sequencing of a specific region that is 222-228 bp at intervals…

Issue with VCF format while using Pharmcat

Hello everybody, I am using the PharmCAT tool’s preprocessor feature to preprocess my VCF file with the command > python3 pharmcat_vcf_preprocessor.py -vcf sample.vcf But I think there is some issue with my VCF file, as this command outputs an error > Reading samples from sample.vcf … Saving output to . > >…

Plink duplicate ID

Hi, I’ve converted the Reich dataset to plink format, along with the VCF file from my full genome. I merged the two together, which led to an error and two output files: .fam and .missnp. Now it tried…

merging and annotating bcf files for variant calling

Hello, I need to merge all my BCF (binary VCF) files for filtering, but it gave me this error. Note: because I have 100 samples, I decided to split them by chromosome. bcftools merge -o merged_samples_chr1.bcf --file-list bcf_list_chr1 Error: WARNING: Environment variable LD_PRELOAD already has value [], will…

drop duplicate insertion deletions in VCF at same position while keeping one

I am normalizing some GWAS summary statistics to gnomAD. gnomAD has some entries like this that seem to be duplicated indels: chr21 13405435 rs140129927 G GT . PASS AC=2962;AN=148224;AF=0.0199833;popmax=afr;faf95_popmax=0.0636127;AC_non_v2_XX=1118;AN_non_v2_XX=59420> chr21 13405435 rs140129927 GT G . PASS AC=40946;AN=148190;AF=0.276307;popmax=amr;faf95_popmax=0.419202;AC_non_v2_XX=16812;AN_non_v2_XX=59400…

Vcf-fix-newlines Command – Laramatic

vcf-fix-newlines: collection of tools to work with VCF files. Maintainer: Debian Med Packaging Team. Section: science. Install vcf-fix-newlines: Debian: apt-get install vcftools; Ubuntu: apt-get install vcftools; Kali Linux: apt-get install vcftools; Fedora: dnf install vcftools; Raspbian: apt-get install…

generate 1 maf for 2 vcf files

I know that vcf2maf can be used to generate one MAF file per VCF file, but I was wondering if it can also be used to generate 1 MAF file for matched samples. I have a VCF for tumour and…

Critical criteria for filtering variants by VariantFiltration

Hi all. I ran GATK VariantFiltration with the following parameters (according to the GATK recommendations) to find robust variants. Then I used them as input for annotating variants. Do you have any suggestions for better filtration? Is it recommended to run VariantFiltration…

find tandem repeats in DNA

I want to find tandem repeats in DNA. I have access to the CRAM file and the VCF file. I initially tried to get the insertions from the VCF file, but I am not sure if the…

bgzip error 4

Hi there, I am trying to combine all the separate per-chromosome VCF files into one VCF file using Picard, and found out that the chr1 gz file was corrupted, so I removed it and tried to make a new gz file. However, I’m having this error and I couldn’t find…

find tandem repeats in DNA from CRAM/VCF file

I want to find tandem repeats in DNA. I have access to the CRAM file and the VCF file. I initially tried to get the insertions from the VCF file, but I am not sure if the variant caller has included all…

Whole genome sequencing revealed genetic diversity, population structure, and selective signature of Panou Tibetan sheep | BMC Genomics

Zhao E, Yu Q, Zhang N, Kong D, Zhao Y. Mitochondrial DNA diversity and the origin of Chinese indigenous sheep. Trop Anim Health Prod. 2013;45(8):1715-22. Liu J, Ding X, Zeng Y, Yue Y, Guo X, Guo T, et al. Genetic diversity and phylogenetic evolution of Tibetan sheep based on mtDNA D-loop…


How to Calculate Allele Frequency from a VCF File?

I have a VCF file with 200 samples (mitochondrial genome of Plasmodium falciparum). Here are a few relevant lines from the actual file: ##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed"> ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT…
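
The AC and AF definitions quoted in the header can be recomputed directly from the genotype columns: count each ALT allele among the called alleles, then divide by the total number of called alleles (AN). The function below is a minimal sketch with a hypothetical name. It assumes uncompressed, tab-separated data lines with a GT field, and it handles haploid calls such as 0 or 1 as well as diploid ones.

```python
import re


def alt_allele_frequencies(vcf_record):
    """Return (AC list, AN, AF list) computed from the GT fields of one VCF data line."""
    fields = vcf_record.rstrip("\n").split("\t")
    n_alt = len(fields[4].split(","))
    gt_index = fields[8].split(":").index("GT")
    counts = [0] * (n_alt + 1)  # index 0 is REF, 1..n are the ALT alleles in order
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        for allele in re.split(r"[/|]", gt):
            if allele != ".":  # missing alleles do not contribute to AN
                counts[int(allele)] += 1
    an = sum(counts)
    ac = counts[1:]
    af = [c / an for c in ac] if an else [None] * n_alt
    return ac, an, af
```

If the file already carries AC and AN in INFO, comparing them with these recomputed values is a quick sanity check after subsetting samples.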

Please supply a reference FASTA/GBK/EMBL file with --reference

Hi, I am trying to run snippy on multiple genomes, but it gives the following error: Please supply a reference FASTA/GBK/EMBL file with --reference, even after I provide the reference file. I don’t understand why this happens, and here is…

PHG Load haplotype and create consensus

Here are my PHG scripts, config, and wgs_keyfile. 1. Create valid intervals: docker run --name test_assemblies --rm -v /DATA/jysong/PHG/ver1.0_phg/:/phg/ -t maizegenetics/phg:1.0 /tassel-5-standalone/run_pipeline.pl -Xmx100G -debug -configParameters /phg/Masterconfig.txt -CreateValidIntervalsFilePlugin -intervalsFile /phg/inputDir/reference/glyma.Wm82.gnm4.ann1.T8TQ.gene_models_main.bed -referenceFasta /phg/inputDir/reference/glyma.Wm82.gnm4.4PTR.genome_main.fixed.fna.gz -mergeOverlaps true -generatedFile /phg/validBedFile.bed -endPlugin &> Log/1.Create_validinterval.txt & 2. Create initial DB: docker run --name create_initial_db --rm -v /DATA/jysong/PHG/ver1.0_phg/:/phg/ -t…

High ref mismatch rate after liftOver from 23andme hg19 to hg38

I lifted some 23andMe files from hg19 to hg38 using the following workflow in R, calling samtools, plink and liftOver: library(tidyverse) #set working directory to data directory trio_wd <- str_glue(here::here(),'/trio/K/') #create file list for raw data file_list <- str_c(trio_wd,dir(trio_wd)) %>% str_extract('genome.+\\d.txt') %>% str_extract('^(?:(?!admix).)+$') %>% unique() %>% {.[!is.na(.)]} %>% str_c(trio_wd,.) #liftover loop…

How to convert VCF (with possible predicted gene effects) to protein fasta/MSA

How to convert a VCF (with possible predicted gene effects) and multiple samples to protein FASTA/MSA? Input: VCF (possibly with gene/protein effects already predicted via e.g. SnpEff); GFF3 (for the reference protein sequence and maybe to predict effects)…

Standards, Regulation, Funding Move Bioinformatics in 2022, But Hurdles to Precision Medicine Remain

CHICAGO – Although the US Food and Drug Administration (FDA) provided some long-sought clarity in 2022 on how it would regulate clinical decision support and in vitro diagnostic software, technology developers and healthcare organizations still struggled with how to integrate genomics data into clinical practice. It will likely take more…


Finding common genes by taxa on Genbank? : bioinformatics

Hmm, I am assuming that since you are referring to Canidae, you want “family level” information. I know you can do this bioinformatically by getting a large list of different species that are also from different genera (one taxon from each known species/genus). Add in a few taxa…

dbSNP and indels

I am working on WGS of Bos indicus using the ARS-UCD1.2 reference genome. However, I have no idea where to download known sites (indels and dbSNP variants in VCF format) for base quality recalibration in GATK. Could anyone suggest a link? Thank you…

Availability of information on genes in Gnomad VCF data

Hi, I’m new to gnomAD and genetics in general, and I was wondering whether the gnomAD genome data that is downloaded in VCF format contains information on the nearest gene for each variant, and is the genomic…

What VCF file to use when using crossed mouse strains?

Hi, I am new to working with mouse data. I am analyzing mouse RNA-Seq data from mice which are crosses between the FVB and CAST strains (one parent is FVB and one is CAST). When doing base quality recalibration (I…

vcftools --weir-fst-pop returns -nan

I am trying to calculate per-site Fst for two samples in a VCF file, but I am getting -nan in the output for the mean Fst estimate and for every site. This is what I ran: vcftools --gzvcf ${VCF} --weir-fst-pop DBFCU --weir-fst-pop BBMCU --out ./cu_pops…

AWS launches Amazon Omics for precision medicine

To enhance clinical insights at the point of care and help identify the best treatment or prevention options for patients, Amazon Web Services has launched a service that utilizes artificial intelligence (AI), machine learning, and other AWS and partner products and services to run IT-heavy bioinformatics workflows.  WHY IT MATTERS…


Where to find vcf of dbsnp build 144?

Hi everyone, I have zipped VCF files that I would like to annotate using hg19 dbSNP 144. I have BED files for each chromosome but, based on other Biostars answers (How to add rsIDs to VCF?), it seems it is easier…

Datasets | TogoVar

Variant frequencies for which you can apply for use of individual-level data (*1) to the NBDC human databases (*2). Click the links under “Included controlled-access datasets” to apply for use of individual-level data. *1: fastq/bam/cel files and/or lists of genotype data etc. *2: Japanese Genotype-phenotype Archive (JGA) / AMED Genome group sharing Database (AGD)…

Scatter Gather principle by chromosome on Gatk

Hi all, on a quest to optimize my GATK pipeline, I came across the scatter-gather principle, so I did the following: pids= for chr in chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20…

encode gt of hg38 to machine learning

Hi, I’m new to the field. I have a large VCF file that has many variants across many samples. I extracted the GT for each sample/variant to get a matrix for a machine learning algorithm. Now, I need to…
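
A common numeric encoding for such a matrix is the ALT-allele count per genotype: 0 for 0/0, 1 for 0/1 and 2 for 1/1. The sketch below illustrates that additive coding only; it is an assumption about what the poster needs, the function names are hypothetical, and any non-reference allele is counted as ALT. Missing calls are returned as None so that imputation or filtering can be decided separately.

```python
import re


def encode_gt(gt):
    """Encode a GT string as its ALT-allele count; None if any allele is missing."""
    alleles = re.split(r"[/|]", gt)
    if "." in alleles:
        return None
    return sum(1 for allele in alleles if allele != "0")


def genotype_matrix(gt_rows):
    """Turn rows of GT strings (variants x samples) into rows of ALT-allele counts."""
    return [[encode_gt(gt) for gt in row] for row in gt_rows]
```

Most machine learning libraries expect samples as rows, so the resulting matrix usually needs transposing before fitting a model.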

bcftools view remove (.) id

Hello, I have a txt file that consists of CHROM, ID, POS, REF and ALT (48 variants). I want to use it to subset the original VCF into a new VCF. I tried bcftools with this query: bcftools view -T variants.txt mydata.vcf > variant1.vcf, but the problem…

Randomly pick variants from VCF file for 10000 iteration

Hi, I have a multi-sample VCF file containing nearly 6k variants. I want to randomly pick 1 variant at each iteration, for a total of 10000 iterations, and check whether this variant is present in two other VCF files. If it’s…
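
The sampling loop described here can be sketched with the standard library alone. This is an illustration under assumptions rather than a finished solution: variants are represented as hashable keys such as (CHROM, POS, REF, ALT) tuples, the two other VCF files have already been loaded into sets, sampling is with replacement, and the function name is hypothetical.

```python
import random


def sample_and_check(variants, other_a, other_b, iterations=10000, seed=None):
    """Draw one variant per iteration (with replacement) and tally where it is found.

    variants must be a sequence (for example a list); other_a and other_b are sets.
    """
    rng = random.Random(seed)
    counts = {"both": 0, "only_a": 0, "only_b": 0, "neither": 0}
    for _ in range(iterations):
        picked = rng.choice(variants)
        in_a = picked in other_a
        in_b = picked in other_b
        if in_a and in_b:
            counts["both"] += 1
        elif in_a:
            counts["only_a"] += 1
        elif in_b:
            counts["only_b"] += 1
        else:
            counts["neither"] += 1
    return counts
```

Passing a fixed seed makes a run reproducible, which helps when the tallies feed into a permutation-style comparison.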

How to add reference as new sample to vcf?

Hello, does anyone know how to add a new sample made from the reference genome to a VCF file? I have a VCF file with 200 samples and 2,000 SNPs. My SNPs were called against a reference genome, and I want to…

What sequencing/alignment artifact is this?

I’m calling mitochondrial variants with Mutect2, and one variant looks like an artifact, but I don’t understand what could be the cause. It looks from IGV (picture below) like this variant is always at the same position on forward and reverse reads. Also…

Detecting de novo SNV with vcftools

Hi all. I have raw whole-genome sequence data for a fish trio: father, mother and offspring. I would like to know how many SNV loci there are in the child but not in the parents (i.e. de novo SNV…
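
At the genotype level, the "in the child but not in the parents" test can be written as a simple allele comparison. The sketch below is only a first-pass filter with hypothetical function names: it assumes a joint-called trio VCF whose GT strings have been extracted per site, and it flags candidates. Sequencing errors and low parental coverage produce many false positives, so depth and quality filters are still needed.

```python
import re


def alleles(gt):
    """Split a GT string such as '0/1' or '1|0' into a set of allele codes."""
    return set(re.split(r"[/|]", gt))


def is_candidate_de_novo(child_gt, father_gt, mother_gt):
    """True if the child carries a called allele that neither parent carries.

    Sites where any of the three genotypes has a missing allele are skipped.
    """
    child, father, mother = alleles(child_gt), alleles(father_gt), alleles(mother_gt)
    if "." in child | father | mother:
        return False
    return bool(child - (father | mother))
```

Counting the sites where this returns True gives an upper bound on de novo candidates before any quality filtering.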

Contrasting levels of hybridization across the two contact zones between two hedgehog species revealed by genome-wide SNP data

Ai H, Fang X, Yang B, Huang Z, Chen H, Mao L et al. (2015) Adaptation and possible ancient interspecies introgression in pigs identified by whole-genome sequencing. Nat Genet 47:217–225. Alexander DH, Lange K (2011) Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC…

Bedtools Bam To Bed With Code Examples

With this article, we’ll look at some examples of how to address the Bedtools Bam To Bed problem. bedtools bamtobed [OPTIONS] -i <BAM> As we have seen, a large number of examples were utilised in order to solve the Bedtools Bam To…

As of July 2015, the VCFtools project has been moved to github! Please visit the new website here: vcftools.github.io/man_0112a.html

NAME: VCFtools v0.1.12a − Utilities for the variant call format (VCF) and binary variant call format (BCF). SYNOPSIS: vcftools [ --vcf FILE | --gzvcf FILE | --bcf FILE]…

Job – Principal Biostatistician/Bioinformatics job at Kenya Medical Research

Vacancy title: Principal Biostatistician/Bioinformatics [Type: FULL TIME, Industry: Research, Category: Research]. Jobs at: Kenya Medical Research – KEMRI. Deadline of this Job: 06 October 2022. Duty Station: Within Kenya, Kisumu, East Africa. Summary. Date Posted: Tuesday, September 20, 2022, Base Salary: Not Disclosed…

Tool that can merge 2 VCF files while taking “representational ambiguity” of (multi-allelic) variants into account

Is there a tool that can merge 2 VCF files while taking “representational ambiguity” of multi-allelic variants into account? By: replaying all variant alleles from the 2 VCF files into the reference genome…

Bioinformatics Scientist in Pittsburgh, PA

Description. Purpose: The scientist works independently using a robust math toolbox to discover solutions for a diverse portfolio of interesting and challenging problems. The scientist develops, implements, and monitors advanced analytic, medical informatics, and predictive modeling tools for health care programs at the UPMC. The scientist normally works Monday through Friday…

A7993 – YFull YTree Info

R-A7993 – YFull YTree Info SNPs currently defining R-A7993 A7993     Sample ID Country / Language Info Ref File Testing company Statistics Status YF063745 —— R-A7993 R-A7993*, R-FGC59783* Hg38 .BAM FTDNA (Y700) 30X, 18.6 Mbp, 151 bp YF015291 Germany (Rheinland-Pfalz) R-A7993 R-A7993*, R-FGC59783* Hg38 .BAM FTDNA (Y500) 28X, 12.1 Mbp,…


Joint variant calling on DeepVariant GVCFs using GATK GenotypeGVCFs

Hi everyone, I have a bunch of GVCF files generated by DeepVariant, but I want to use GATK’s GenotypeGVCFs for joint variant calling on them (I don’t want to use GLnexus). But GATK requires a genotype likelihood field produced by…

Using a phenotype file with several phenotype columns- PLINK2

Hi all, I have created a TSV file (phenotypes.tsv) that includes the phenotypes I am using in a plink command with the --phenom flag. The first column is the #IID column, with sample names that match the names…

Index of /~psgendb/birchhomedir/public_html/doc/pkg/samtools-1.7/htslib-1.7/htslib

Name Last modified Size Description Parent Directory   –   bgzf.h 2018-01-10 07:45 14K   cram.h 2015-09-25 05:36 15K   faidx.h 2017-02-07 11:06 5.6K   hfile.h 2018-01-26 05:33 9.6K   hts.h 2017-11-24 09:46 29K   hts_defs.h 2017-08-10 11:07 3.3K   hts_endian.h 2017-09-27 10:40 11K   hts_log.h 2017-06-03 15:45 3.8K  …


How To Install libhts-dev on Kali Linux

In this tutorial we learn how to install libhts-dev on Kali Linux. libhts-dev provides the development files for HTSlib. What is libhts-dev? HTSlib is an implementation of a unified C library for accessing common file formats, such…

Freebayes-parallel with large bam file – individual threads running for >6 days

Context: I’m trying to call variants on a sequencing project using pooled genotyping-by-sequencing. Pools consist of 94 samples each, alongside a number of individuals. Sequence data was demultiplexed and then aligned to a reference genome using hisat2, and the resultant bams were merged with samtools merge. The problem bam is…


Samtools Htslib Issues

Issue Title State Comments Created Date Updated Date How to get a specific chromosome open 1 2022-07-14 2022-07-18 tabix returns row from VCF file multiple times open 4 2022-07-11 2022-07-18 Modified base parsing failure failure closed 0 2022-07-01 2022-07-18 extract genotype information open 1 2022-06-24 2022-07-18 sam_hdr_remove_lines is inefficient if…


Senior Scientist Applied Bioinformatics Job In San Francisco, CA 94103| TechCareers

At Bristol Myers Squibb, we are inspired by a single vision – transforming patients’ lives through science. In oncology, hematology, immunology and cardiovascular disease – and one of the most diverse and promising pipelines in the industry – each of our passionate colleagues contribute to innovations that drive meaningful change….


Detecting heterogeneous X chromosome counts in XXY individual

Hi, I have a WGS of an individual with XXY DNA. I’d like to analyze their X calls to see what percentage are heterogeneous vs homogenous. I don’t know what tool is the best for this. Any suggestions would be really…

How can I keep INFO value when converting bgen to VCF by using plink2?

I am working on file handling for GWAS. When I converted bgen to VCF using plink2 with the commands below, all INFO (and also FILTER) columns became “.” in the output VCF files. A…

Unexpected Ensembl-vep results

Hi. I got a VCF from an individual who shows symptoms of a known disease with known mutations. I ran it through Ensembl VEP, expecting to find some of those mutations in the results, yet all the consequences in the results are “intergenic-variant”. The command I used was: --cache…

Lh3 Minimap2 Issues

Issue Title State Comments Created Date Updated Date Mapping reads against multi references. Any proposition? open 0 2022-06-28 2022-06-30 Inversion between tandem repeats yields misalignment closed 1 2022-06-21 2022-06-30 use minimap2 to extract mitochondrial reads from genome assembly open 0 2022-06-20 2022-06-30 Asking for #301 to be reopened closed 0…


How to modify VCF file?

Hi community, I have a question: the SNP positions in my VCF file are from GRCh37/hg19, and I need to change them to GRCh38. So I used UCSC liftOver to replace the hg19 positions with GRCh38 positions and deleted some SNPs, then sorted by position and saved to a new vcf…

python – Matching two files (vcf to maf) using dictionaries, and appending the contents

annotation_file
##INFO=<ID=ClinVar_CLNSIG,Number=.,xxx
##INFO=<ID=ClinVar_CLNREVSTAT,Number=.,yyy
##INFO=<ID=ClinVar_CLNDN,Number=.zzz
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 10145 . AAC A 101.83 . AC=2;AF=0.067;AN=30;aaa
chr1 10146 . AC A 98.25 . AC=2;AF=0.083;AN=24;bbb
chr1 10146 . AC * 79.25 . AC=2;AF=0.083;AN=24;ccc
chr1 10439 . AC A 81.33 . AC=1;AF=0.008333;AN=120;ddd
chr1 10450 . T G 53.09…

YP5260 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status I7021 Mongolia (Bulgan) C-F15910 C-F15910*, C-Y507 Hg19 .BAM Ancient 3X, 20.2 Mbp, 40 bp NEO249 Russia (Chukotskiy avtonomnyy okrug) C-F15910* —— Hg19 .BAM Ancient 1X, 7.2 Mbp, 81 bp I11696 Mongolia (Bulgan) C-Y507 —— Hg19 .BAM Ancient 2X,…


08 compare visualization results of different annotation software

In the first two sections, we compared the use of different VCF annotation software and converted the annotated results into MAF file format. Because the SnpEff annotation results cannot be converted to MAF, we will next compare the results of ANNOVAR, VEP and GATK Funcotator…

Annotating with CADD, gnomad, Clinvar & dbNSFP on UKB RAP – Feature Requests

dint, May 9, 2022, 1:33pm: I’m just wondering if you can specify CADD, gnomAD, ClinVar and dbNSFP options when annotating with Hail on dxjupyterlab_spark_cluster on the UKB RAP? From the Hail website, the following command can be used on your matrix file to annotate with these features: db =…

YP3952 – YFull YTree Info

Q-YP3952 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF073154 Russia (Chechenskaya Respublika) / Chechen Q-YP3952* —— Hg38 .BAM FTDNA (Y700) 33X, 18.2 Mbp, 151 bp YF092378 Russia (Chechenskaya Respublika) / Chechen Q-BZ87 —— Hg38 .BAM FTDNA (Y700) 55X, 18.5 Mbp, 151…

Continue Reading YP3952 – YFull YTree Info

how to predict gene expression from genotype file using already developed elastic net model

Hello everyone, I want to predict gene expression from a genotype file and an already developed elastic net model. My model file looks like this: GENE RSID1 RSID2 VALUE ENSG00000107937.18 rs7475652 rs7475652 0.531316876443232 ENSG00000107937.18 rs7475652 rs7918643 -0.1434806647803035…

Continue Reading how to predict gene expression from genotype file using already developed elastic net model

Biostar Project

Answer: merging VCF files, by geweloy594. To merge multiple VCF files into a single VCF file, you can use VCF Merger software. This tool helps to merge numerous VCF data files and t……

Continue Reading Biostar Project

Bcftools equivalent of vcftools conversion to ped & map

I am converting a VCF to ped and map files in vcftools with vcftools --gzvcf ZZZZZTYT.vcf.gz --plink --out ZZZZZTYT, which works fine. However, after a lot of searching I still cannot tell: can bcftools do the same with a BCF?…

Continue Reading Bcftools equivalent of vcftools conversion to ped & map

Z697 – YFull YTree Info

R-Z697 – YFull YTree Info SNPs currently defining R-Z697 Z697     Sample ID Country / Language Info Ref File Testing company Statistics Status YF009397 Sweden (Västra Götalands län) R-Z697* —— Hg19 .BAM FTDNA (Y500) 81X, 14.4 Mbp, 165 bp YF084333 Italy (Chieti) R-FT285492 —— Hg38 .BAM Dante Labs 14X, 23.4…

Continue Reading Z697 – YFull YTree Info

difficulty filtering vcf file with vcftools

I have a large VCF file named “common_known_variants.vcf”, which contains all known human variants, downloaded from ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/00-common_all.vcf.gz -O common_known_variants.vcf.gz. I’m trying to extract the known variants from only chromosomes 1, 2, 3, 9, 22, and X and write them to a new VCF file with the…

Continue Reading difficulty filtering vcf file with vcftools
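For the chromosome-subsetting question above, the idea can be sketched in plain Python without vcftools. This is only an illustration: the function name `keep_chromosomes` is hypothetical, and it assumes the chromosome labels you pass match the CHROM column exactly (for example `1` rather than `chr1`).

```python
def keep_chromosomes(vcf_lines, chroms):
    """Yield header lines plus records whose CHROM is in `chroms`.

    `vcf_lines` is any iterable of VCF text lines (an open file works).
    Chromosome names must match the CHROM column exactly (assumption).
    """
    wanted = set(chroms)
    for line in vcf_lines:
        if line.startswith("#"):
            # Meta-information and the #CHROM header are always kept,
            # so the output is still a valid VCF.
            yield line
            continue
        chrom = line.split("\t", 1)[0]
        if chrom in wanted:
            yield line


# Small in-memory demonstration with made-up records.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t100\t.\tA\tG\t.\t.\t.\n",
    "4\t200\t.\tC\tT\t.\t.\t.\n",
    "X\t300\t.\tG\tA\t.\t.\t.\n",
]
subset = list(keep_chromosomes(demo, ["1", "2", "3", "9", "22", "X"]))
```

For a gzipped file, the same generator can be fed from `gzip.open(path, "rt")` and its output written line by line to a new file.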

Error in BAFFromGVCFs – GenotypeGVCFs

Bug Report
Affected module(s) or script(s): Module00c/BAFFromGVCFs/GenotypeGVCFs
Affected version(s):
Description: I’m running GATKSVPipelineBatch and I got the following error in the GenotypeGVCFs task: A USER ERROR has occurred: Input /tmp/scratch/bean-resources/broad-references/v0/Homo_sapiens_assembly38.dbsnp138.vcf must support random access to enable queries by interval. If it’s a file, please index it using the bundled tool…

Continue Reading Error in BAFFromGVCFs – GenotypeGVCFs

Latest dbSNP VCF

This is the directory you’re looking for: ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/
curl -s ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.39.gz | zcat | head
##fileformat=VCFv4.2
##fileDate=20210513
##source=dbSNP
##dbSNP_BUILD_ID=155
##reference=GRCh38.p13
##phasing=partial
##INFO=<ID=RS,Number=1,Type=Integer,Description="dbSNP ID (i.e. rs number)">
##INFO=<ID=GENEINFO,Number=1,Type=String,Description="Pairs each of gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a…

Continue Reading Latest dbSNP VCF

Missing data per site

Hi, I want to calculate per-site missing-data statistics for my VCF file. Using vcftools --missing-site gives wrong stats for several sites. Is there any other way to calculate it? Thank you! I have 36 samples and here is an example of the vcftools --missing-site output…

Continue Reading Missing data per site
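One way to cross-check per-site missingness independently of vcftools is to count missing genotypes directly. The sketch below is a rough illustration, not the vcftools algorithm: the function name `site_missingness` is hypothetical, it assumes GT is the first FORMAT subfield, and it treats a genotype as missing if any of its alleles is `.` (so `./1` counts as missing, which may differ from how other tools count).

```python
def site_missingness(vcf_lines):
    """Yield (chrom, pos, n_missing, n_samples, fraction_missing) per record."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        samples = fields[9:]  # sample columns follow the FORMAT column
        n_missing = 0
        for sample in samples:
            gt = sample.split(":", 1)[0]  # assumes GT comes first
            alleles = gt.replace("|", "/").split("/")
            if "." in alleles:
                n_missing += 1
        n = len(samples)
        yield chrom, pos, n_missing, n, (n_missing / n if n else 0.0)


# Made-up example with three samples, one of them uncalled.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n",
    "chr1\t100\t.\tA\tG\t.\t.\t.\tGT:DP\t0/1:12\t./.:0\t1|1:30\n",
    "chr1\t200\t.\tC\tT\t.\t.\t.\tGT\t0/0\t0/1\t1/1\n",
]
stats = list(site_missingness(demo))
```

Comparing these counts with the vcftools output for the suspicious sites should show whether the discrepancy comes from how partially missing genotypes are handled.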

bedtools interset doesn’t return a VCF file?

I am filtering a VCF file with a BED file using bedtools. I have done this successfully with bedtools intersect -wb -a myVCF.vcf -b myBEDfile.bed > output.txt However, what I want is to get a VCF file with the metadata and…

Continue Reading bedtools interset doesn’t return a VCF file?
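For the bedtools question above, bedtools intersect has a -header option that prints the header of the -a file, and -wa reports the original -a records, so the output can stay a VCF. The same filtering logic can also be sketched in Python, as below. The function names are hypothetical; the sketch assumes BED coordinates are 0-based half-open, VCF POS is 1-based, and a record spans the length of its REF allele.

```python
from collections import defaultdict


def load_bed(bed_lines):
    """Read BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    intervals = defaultdict(list)
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals[chrom].append((int(start), int(end)))
    return intervals


def vcf_in_bed(vcf_lines, intervals):
    """Yield all header lines plus records overlapping any BED interval."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep metadata so the result is still a VCF
            continue
        fields = line.split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        v_start = pos - 1            # convert 1-based POS to 0-based start
        v_end = v_start + len(ref)   # record covers the REF allele
        if any(s < v_end and v_start < e for s, e in intervals.get(chrom, ())):
            yield line


# Made-up example: only the record at POS 100 falls inside chr1:99-100.
bed = load_bed(["chr1\t99\t100\n"])
vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t.\t.\t.\n",
    "chr1\t101\t.\tC\tT\t.\t.\t.\n",
]
filtered = list(vcf_in_bed(vcf, bed))
```

The linear scan over intervals is fine for small BED files; for large ones a sorted or indexed lookup would be more appropriate.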

Hard filtering on GATK HaplotypeCaller giving multiple warnings

I’m using this pipeline for deriving variants from RNA sequencing data: github.com/modupeore/VAP which uses specific versions of various tools, including HaplotypeCaller from GATK (v3.8-0-ge9d806836). The final step is a set of hard filters on the called variants (applied using VariantFilter), but looking at the log files, there are a lot…

Continue Reading Hard filtering on GATK HaplotypeCaller giving multiple warnings

How Can I Merge VCF File ?

The most secure and reliable way to merge several VCF files into a single VCF is to use an effective VCF Merge Tool. One of my colleagues recently used the VCF Merge Tool, which allowed him to merge multiple VCF files while maintaining high data integrity….

Continue Reading How Can I Merge VCF File ?

snp – Reference variant detected as altered one in bam file

I received several .bam files from the manufacturer and used four callers (samtools, freebayes, HaplotypeCaller, DeepVariant) to find sequence variants. In the resulting .vcf files, I took a closer look at some calls. I found an interesting homozygous one, rs477033 (C/G Ref/Alt), with the flag ‘COMMON=0’ and a very low MAF. I also…

Continue Reading snp – Reference variant detected as altered one in bam file

Bioinformatics Scientist for Whole Genome and Whole Exome Sequencing

The NeuroGenomics and Informatics (NGI) Center, led by Dr. Carlos Cruchaga at Washington University School of Medicine, is recruiting a Bioinformatics Scientist to work on Whole Genome and Whole Exome Sequencing. We are seeking an experienced, self-motivated, self-driven scientist…

Continue Reading Bioinformatics Scientist for Whole Genome and Whole Exome Sequencing

Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes

Sequencing data: We used publicly available sequencing data from the GIAB consortium [45], 1000 Genomes Project high-coverage data [46] and the Human Genome Structural Variation Consortium (HGSVC) [4]. All datasets include only samples consented for public dissemination of the full genomes. Statistics and reproducibility: For generating the assemblies, we used all 14 samples for…

Continue Reading Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes

how to extract unique variants from GVCF

[note: cross-posted on GATK forum – still awaiting a response] I have a GVCF (generated using GATK’s HaplotypeCaller w/ -ERC GVCF parameter) of 36 related samples and would like to determine the (potentially de novo) variants that are unique to each sample….

Continue Reading how to extract unique variants from GVCF
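The "unique to each sample" idea from the post above can be sketched as follows. This is an assumption-laden illustration rather than a GATK workflow: it expects a jointly genotyped multi-sample VCF (not the raw GVCF with reference blocks), the function name `private_variants` is hypothetical, and sites where any sample is uncalled are skipped because uniqueness cannot be established there.

```python
def private_variants(vcf_lines):
    """Return {sample: [(chrom, pos, ref, alt), ...]} for variants carried
    by exactly one sample while every other sample is called and non-carrier."""
    samples = []
    private = {}
    for line in vcf_lines:
        if line.startswith("##"):
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            private = {name: [] for name in samples}
            continue
        carriers = []
        all_called = True
        for name, column in zip(samples, fields[9:]):
            # GT is assumed to be the first FORMAT subfield.
            alleles = column.split(":", 1)[0].replace("|", "/").split("/")
            if "." in alleles:
                all_called = False
            elif any(allele != "0" for allele in alleles):
                carriers.append(name)
        if all_called and len(carriers) == 1:
            record = (fields[0], int(fields[1]), fields[3], fields[4])
            private[carriers[0]].append(record)
    return private


# Made-up three-sample example.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB\tC\n",
    "chr1\t10\t.\tA\tG\t.\t.\t.\tGT\t0/1\t0/0\t0/0\n",
    "chr1\t20\t.\tC\tT\t.\t.\t.\tGT\t0/1\t0/1\t0/0\n",
    "chr1\t30\t.\tG\tA\t.\t.\t.\tGT\t0/0\t./.\t1/1\n",
]
result = private_variants(demo)
```

Candidate de novo calls found this way would still need filtering on depth and genotype quality, since a low-coverage non-carrier can make a shared variant look private.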

wrong number of fields ?

Error occurrence after merging files with bcftools: wrong number of fields? I have multiple VCFs of CASES and CONTROLS variations annotated by VEP, SnpEff, and SnpSift. First pair of VCFs -> only variations | CASES and CONTROLS. Second pair of VCFs -> variations + SnpEff | CASES and CONTROLS. Third pair of VCFs ->…

Continue Reading wrong number of fields ?

L1193 – YFull YTree Info

I-L1193 – YFull YTree Info SNPs currently defining I-L1193 L1193     FGC87558     Y72031     Sample ID Country / Language Info Ref File Testing company Statistics Status ASH1 Ireland (Tipperary) I-L1193* —— Hg19 .BAM Ancient 1X, 10.5 Mbp, 101 bp PB581 Ireland (Clare) I-L1193* —— Hg19 .BAM Ancient 2X, 15.8…

Continue Reading L1193 – YFull YTree Info

Y18411 – YFull YTree Info

J-Y18411 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF072520 Albania J-BY111710 —— Hg19 .BAM Dante Labs 10X, 22.8 Mbp, 151 bp YF067307 Palestine (Nablus) J-BY111710 —— Hg38 .BAM FTDNA (Y700) 34X, 18.7 Mbp, 151 bp NA20827 Italy (Firenze) J-CTS3330 —— Hg19…

Continue Reading Y18411 – YFull YTree Info

How to Merge VCF files in Windows 10

Many organizations working with VCF files face the task of collecting and combining emails. Hiring technicians increases the data management cost. Besides that disadvantage, downtime is a big issue because it hampers work. Technicians often try to fix the problem manually. It is a time-consuming process, so trusting a vcf merge application is…

Continue Reading How to Merge VCF files in Windows 10

Variant quality and filters on GATK HaplotypeCaller generated VCFs

Hi, I am analysing human WGS data to diagnose rare inherited diseases. I followed the GATK Best Practices Guidelines for “Germline short variants discovery” for single-sample data to generate a VCF using HaplotypeCaller. The guidelines then point to the use…

Continue Reading Variant quality and filters on GATK HaplotypeCaller generated VCFs

Merge only bim files with plink

Hello, for the same dataset they provide a single BED file and a single FAM file for all the chromosomes. However, the BIM files are split by chromosome. I would like to generate the VCF file with the genotype calls for all chromosomes but I need…

Continue Reading Merge only bim files with plink

BioInformatics Product Manager at Helix (remote)

You + Helix Helix is a place where innovators and doers gather in order to drive significant progress in population genomics. We have come together to work at the intersection of clinical care, research, and genomics.   If you’re excited by the idea of making a meaningful impact and joining a…

Continue Reading BioInformatics Product Manager at Helix (remote)

rna seq – RNAseq SNP discovery: deciding upon filters and dealing with allele expression bias

I am working with non-model plant RNA samples which have been deep sequenced and aligned with the STAR aligner under default parameters. Aim: We would like to conduct SNP discovery on these samples. Objective: Our ultimate goal with this genotypic data is to search for variants (both SNPs and indels)…

Continue Reading rna seq – RNAseq SNP discovery: deciding upon filters and dealing with allele expression bias

Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages

Díaz, S. et al. Summary for Policymakers of the Global Assessment Report on Biodiversity and Ecosystem Services of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES, 2019). Fisher, R. A. The correlation between relatives on the supposition of Mendelian inheritance. Earth Environ. Sci. Trans. R. Soc. Edinb. 52,…

Continue Reading Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages

using ANNOVAR annotation clinvar database out wrong position

Hello Biostars, I was trying to annotate a VCF using ANNOVAR, but I get the wrong output; it seems my ClinVar database is not suitable. bcftools_callCommand=call -m -v -o /project/plantform/20220316PCR/03.amplify/L2107973CFD7G5kxT1/L2107973CFD7G5kxT1.variation.vcf /project/plantform/20220316PCR/03.amplify/L2107973CFD7G5kxT1/L2107973CFD7G5kxT1.mpileup.vcf

Continue Reading using ANNOVAR annotation clinvar database out wrong position

M8498 – YFull YTree Info

B-M8498 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF004283 Saudi Arabia B-M8498* —— Hg19 .BAM FTDNA (Y500) 43X, 13.7 Mbp, 165 bp HGDP00992 Namibia B-M7650* —— Hg38 .BAM Scientific 18X, 23.5 Mbp, 151 bp YF013963 —— B-Y82361 —— Hg38 .BAM FTDNA…

Continue Reading M8498 – YFull YTree Info

FGC15109 – YFull YTree Info

I-FGC15109 – YFull YTree Info SNPs currently defining I-FGC15109 FGC15109     Sample ID Country / Language Info Ref File Testing company Statistics Status SZ43 Hungary (Somogy) I-BY138* —— Hg19 .BAM Ancient 8X, 22.8 Mbp, 32 bp YF010533 —— I-BY138* —— Hg19 .BAM FTDNA (Y500) 73X, 14.9 Mbp, 165 bp YF019250…

Continue Reading FGC15109 – YFull YTree Info