Tag: VCF

Job – Principal Biostatistician/Bioinformatics job at Kenya Medical Research

Vacancy title: Principal Biostatistician/Bioinformatics [Type: Full time, Industry: Research, Category: Research] Jobs at: Kenya Medical Research – KEMRI. Deadline of this job: 06 October 2022. Duty station: Within Kenya, Kisumu, East Africa. Summary: Date posted: Tuesday, September 20, 2022, Base salary: Not disclosed…

Continue Reading Job – Principal Biostatistician/Bioinformatics job at Kenya Medical Research

Tool that can merge 2 VCF files while taking “representational ambiguity” of (multi-allelic) variants into account

Is there a tool that can merge 2 VCF files while taking “representational ambiguity” of multi-allelic variants into account? By: replaying all variant alleles from the 2 VCF files into the reference genome…

Continue Reading Tool that can merge 2 VCF files while taking “representational ambiguity” of (multi-allelic) variants into account
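The "replaying" idea in the question can be sketched as follows. This is a minimal illustration, not an existing merge tool: the function name `apply_variants`, the toy reference `TCACAG`, and the two records are assumptions made up for the example. It shows that two differently written deletions produce the same haplotype once applied to the reference, which is the sense in which they are representationally ambiguous.

```python
def apply_variants(reference, variants):
    """Apply VCF-style (pos, ref, alt) variants to a reference string.

    pos is 1-based, as in VCF. Variants must not overlap.
    (Hypothetical helper for illustration only.)
    """
    pieces = []
    cursor = 0  # 0-based index of the next reference base not yet consumed
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            raise ValueError("overlapping variants are not supported")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele does not match reference at {pos}")
        pieces.append(reference[cursor:start])  # untouched reference bases
        pieces.append(alt)                      # replayed alternate allele
        cursor = start + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces)


reference = "TCACAG"                 # toy reference with a CA repeat
left_aligned = [(1, "TCA", "T")]     # deletion of "CA" written at position 1
right_shifted = [(3, "ACA", "A")]    # the same deletion written at position 3

hap_a = apply_variants(reference, left_aligned)
hap_b = apply_variants(reference, right_shifted)
same_haplotype = hap_a == hap_b
```

Comparing replayed haplotypes rather than raw records sidesteps the representation problem, at the cost of needing the reference sequence for every compared region.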

Bioinformatics Scientist in Pittsburgh, PA

Description. Purpose: The scientist works independently using a robust math toolbox to discover solutions for a diverse portfolio of interesting and challenging problems. The scientist develops, implements, and monitors advanced analytic, medical informatics, and predictive modeling tools for health care programs at the UPMC. The scientist normally works Monday through Friday…

Continue Reading Bioinformatics Scientist in Pittsburgh, PA

A7993 – YFull YTree Info

R-A7993 – YFull YTree Info SNPs currently defining R-A7993 A7993     Sample ID Country / Language Info Ref File Testing company Statistics Status YF063745 —— R-A7993 R-A7993*, R-FGC59783* Hg38 .BAM FTDNA (Y700) 30X, 18.6 Mbp, 151 bp YF015291 Germany (Rheinland-Pfalz) R-A7993 R-A7993*, R-FGC59783* Hg38 .BAM FTDNA (Y500) 28X, 12.1 Mbp,…

Continue Reading A7993 – YFull YTree Info

Joint variant calling on DeepVariant GVCFs using GATK GenotypeGVCFs

Hi everyone, I have a bunch of GVCF files generated by DeepVariant, but I want to use GATK’s GenotypeGVCFs for joint variant calling on them (I don’t want to use GLnexus). But GATK requires a genotype likelihood field produced by…

Continue Reading Joint variant calling on DeepVariant GVCFs using GATK GenotypeGVCFs

Using a phenotype file with several phenotype columns- PLINK2

Hi all, I have created a TSV file (phenotypes.tsv) that includes phenotypes that I am using for a plink command with the --pheno flag. The first column is the #IID column with sample names that match the names…

Continue Reading Using a phenotype file with several phenotype columns- PLINK2

Index of /~psgendb/birchhomedir/public_html/doc/pkg/samtools-1.7/htslib-1.7/htslib

Name Last modified Size Description Parent Directory   –   bgzf.h 2018-01-10 07:45 14K   cram.h 2015-09-25 05:36 15K   faidx.h 2017-02-07 11:06 5.6K   hfile.h 2018-01-26 05:33 9.6K   hts.h 2017-11-24 09:46 29K   hts_defs.h 2017-08-10 11:07 3.3K   hts_endian.h 2017-09-27 10:40 11K   hts_log.h 2017-06-03 15:45 3.8K  …

Continue Reading Index of /~psgendb/birchhomedir/public_html/doc/pkg/samtools-1.7/htslib-1.7/htslib

How To Install libhts-dev on Kali Linux

In this tutorial we learn how to install libhts-dev on Kali Linux. libhts-dev provides the development files for HTSlib. What is libhts-dev? HTSlib is an implementation of a unified C library for accessing common file formats, such…

Continue Reading How To Install libhts-dev on Kali Linux

Freebayes-parallel with large bam file – individual threads running for >6 days

Context: I’m trying to call variants on a sequencing project using pooled genotyping-by-sequencing. Pools consist of 94 samples each, alongside a number of individuals. Sequence data was demultiplexed and then aligned to a reference genome using hisat2, and the resultant bams were merged with samtools merge. The problem bam is…

Continue Reading Freebayes-parallel with large bam file – individual threads running for >6 days

Samtools Htslib Issues

Issue Title State Comments Created Date Updated Date How to get a specific chromosome open 1 2022-07-14 2022-07-18 tabix returns row from VCF file multiple times open 4 2022-07-11 2022-07-18 Modified base parsing failure failure closed 0 2022-07-01 2022-07-18 extract genotype information open 1 2022-06-24 2022-07-18 sam_hdr_remove_lines is inefficient if…

Continue Reading Samtools Htslib Issues

Senior Scientist Applied Bioinformatics Job In San Francisco, CA 94103| TechCareers

At Bristol Myers Squibb, we are inspired by a single vision – transforming patients’ lives through science. In oncology, hematology, immunology and cardiovascular disease – and one of the most diverse and promising pipelines in the industry – each of our passionate colleagues contributes to innovations that drive meaningful change….

Continue Reading Senior Scientist Applied Bioinformatics Job In San Francisco, CA 94103| TechCareers

Detecting heterogeneous X chromosome counts in XXY individual

Hi, I have a WGS of an individual with XXY DNA. I’d like to analyze their X calls to see what percentage are heterogeneous vs homogenous. I don’t know what tool is the best for this. Any suggestions would be really…

Continue Reading Detecting heterogeneous X chromosome counts in XXY individual
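Assuming the question is about heterozygous versus homozygous genotype calls on chrX, the percentage can be sketched from the GT strings alone. The function names and the example genotypes below are hypothetical, and homozygous here includes homozygous-reference calls; a real analysis would also filter on quality and handle the pseudoautosomal regions.

```python
import re


def classify_genotype(gt):
    """Classify a VCF GT string as 'het', 'hom', or 'missing'."""
    alleles = re.split(r"[/|]", gt)
    if "." in alleles:
        return "missing"
    return "hom" if len(set(alleles)) == 1 else "het"


def het_fraction(genotypes):
    """Fraction of called genotypes that are heterozygous."""
    counts = {"het": 0, "hom": 0, "missing": 0}
    for gt in genotypes:
        counts[classify_genotype(gt)] += 1
    called = counts["het"] + counts["hom"]
    return counts["het"] / called if called else float("nan")


# Made-up GT values standing in for the chrX calls of one sample.
x_calls = ["0/1", "1/1", "0|1", "./.", "1/2"]
fraction = het_fraction(x_calls)
```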

How can I keep INFO value when convert bgen to VCF by using plink2?

I am working on file handling for GWAS. When I converted bgen to VCF by using plink2 with the command below, all INFO (and also FILTER) columns became “.” in the output VCF files. A…

Continue Reading How can I keep INFO value when convert bgen to VCF by using plink2?

Unexpected Ensembl-vep results

Hi. I got a VCF from an individual who shows symptoms of a known disease with known mutations. I ran it through Ensembl-vep, expecting to find some of those mutations in the results, yet all the consequences in the results are “intergenic-variant”. The command I used was: --cache…

Continue Reading Unexpected Ensembl-vep results

Lh3 Minimap2 Issues

Issue Title State Comments Created Date Updated Date Mapping reads against multi references. Any proposition? open 0 2022-06-28 2022-06-30 Inversion between tandem repeats yields misalignment closed 1 2022-06-21 2022-06-30 use minimap2 to extract mitochondrial reads from genome assembly open 0 2022-06-20 2022-06-30 Asking for #301 to be reopened closed 0…

Continue Reading Lh3 Minimap2 Issues

How to modify VCF file?

Hi community, I have a question: the SNP positions in my VCF file are from GRCh37/hg19, and I need to change them to GRCh38. So I used UCSC liftOver to replace the hg19 positions with GRCh38 positions and deleted some SNPs, then sorted by position and saved to a new vcf…

Continue Reading How to modify VCF file?

python – Matching two files(vcf to maf) using a dictionaries, and appending the contents

annotation_file
##INFO=<ID=ClinVar_CLNSIG,Number=.,xxx
##INFO=<ID=ClinVar_CLNREVSTAT,Number=.,yyy
##INFO=<ID=ClinVar_CLNDN,Number=.zzz
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 10145 . AAC A 101.83 . AC=2;AF=0.067;AN=30;aaa
chr1 10146 . AC A 98.25 . AC=2;AF=0.083;AN=24;bbb
chr1 10146 . AC * 79.25 . AC=2;AF=0.083;AN=24;ccc
chr1 10439 . AC A 81.33 . AC=1;AF=0.008333;AN=120;ddd
chr1 10450 . T G 53.09…

Continue Reading python – Matching two files(vcf to maf) using a dictionaries, and appending the contents
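A dictionary-based match of the kind the title describes might look like the sketch below. The MAF rows are simplified stand-ins with hypothetical column names (`chrom`, `pos`, `ref`, `alt`); real MAF files use different column names and a different indel convention than VCF, so a coordinate conversion step would be needed before keys can be compared.

```python
def variant_key(chrom, pos, ref, alt):
    """Key that identifies one allele at one site."""
    return (chrom, int(pos), ref, alt)


def index_vcf(lines):
    """Map (chrom, pos, ref, alt) -> INFO string for every VCF record."""
    index = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alt, _qual, _filter, info = line.split()[:8]
        index[variant_key(chrom, pos, ref, alt)] = info
    return index


def annotate(rows, vcf_index, missing="NA"):
    """Return copies of rows with an INFO column appended."""
    annotated = []
    for row in rows:
        key = variant_key(row["chrom"], row["pos"], row["ref"], row["alt"])
        annotated.append(dict(row, INFO=vcf_index.get(key, missing)))
    return annotated


vcf_lines = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO",
    "chr1 10145 . AAC A 101.83 . AC=2;AF=0.067;AN=30;aaa",
    "chr1 10146 . AC A 98.25 . AC=2;AF=0.083;AN=24;bbb",
]
# Hypothetical, simplified MAF-like rows.
maf_rows = [
    {"chrom": "chr1", "pos": "10145", "ref": "AAC", "alt": "A"},
    {"chrom": "chr1", "pos": "99999", "ref": "T", "alt": "G"},
]
annotated = annotate(maf_rows, index_vcf(vcf_lines))
```

Keying on the full (chrom, pos, ref, alt) tuple keeps the two records at chr1:10146 in the excerpt distinct, which a position-only key would not.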

YP5260 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status I7021 Mongolia (Bulgan) C-F15910 C-F15910*, C-Y507 Hg19 .BAM Ancient 3X, 20.2 Mbp, 40 bp NEO249 Russia (Chukotskiy avtonomnyy okrug) C-F15910* —— Hg19 .BAM Ancient 1X, 7.2 Mbp, 81 bp I11696 Mongolia (Bulgan) C-Y507 —— Hg19 .BAM Ancient 2X,…

Continue Reading YP5260 – YFull YTree Info

08 compare visualization results of different annotation software

In the first two sections, we compared the use of different VCF annotation software and converted the annotated results into MAF file format. Because the snpEff annotation result cannot be converted to MAF, we will compare the results of ANNOVAR, VEP, and GATK Funcotator…

Continue Reading 08 compare visualization results of different annotation software

Annotating with CADD, gnomad, Clinvar & dbNSFP on UKB RAP – Feature Requests

dint May 9, 2022, 1:33pm #1 I’m just wondering if you can specify CADD, gnomAD, ClinVar and dbNSFP options when annotating with hail on dxjupyterlab_spark_cluster on the UKB RAP? From the hail website, the following command can be used on your matrix file to annotate with these features: db =…

Continue Reading Annotating with CADD, gnomad, Clinvar & dbNSFP on UKB RAP – Feature Requests

YP3952 – YFull YTree Info

Q-YP3952 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF073154 Russia (Chechenskaya Respublika) / Chechen Q-YP3952* —— Hg38 .BAM FTDNA (Y700) 33X, 18.2 Mbp, 151 bp YF092378 Russia (Chechenskaya Respublika) / Chechen Q-BZ87 —— Hg38 .BAM FTDNA (Y700) 55X, 18.5 Mbp, 151…

Continue Reading YP3952 – YFull YTree Info

how to predict gene expression from genotype file using already developed elastic net model

Hello everyone, I want to predict gene expression from a genotype file and an already developed elastic net model. My model file looks like this: GENE RSID1 RSID2 VALUE ENSG00000107937.18 rs7475652 rs7475652 0.531316876443232 ENSG00000107937.18 rs7475652 rs7918643 -0.1434806647803035…

Continue Reading how to predict gene expression from genotype file using already developed elastic net model

Biostar Project

Answer: merging VCF files, by geweloy594. To merge multiple VCF files into a single VCF file, you can use VCF Merger software. This tool helps to merge numerous VCF data files and t……

Continue Reading Biostar Project

Bcftools equivalent of vcftools conversion to ped & map

I am converting a VCF to ped & map in vcftools like this: vcftools --gzvcf ZZZZZTYT.vcf.gz --plink --out ZZZZZTYT, which works fine. However, I have been searching and searching: can bcftools do the same with a BCF?…

Continue Reading Bcftools equivalent of vcftools conversion to ped & map

Z697 – YFull YTree Info

R-Z697 – YFull YTree Info SNPs currently defining R-Z697 Z697     Sample ID Country / Language Info Ref File Testing company Statistics Status YF009397 Sweden (Västra Götalands län) R-Z697* —— Hg19 .BAM FTDNA (Y500) 81X, 14.4 Mbp, 165 bp YF084333 Italy (Chieti) R-FT285492 —— Hg38 .BAM Dante Labs 14X, 23.4…

Continue Reading Z697 – YFull YTree Info

difficulty filtering vcf file with vcftools

I had a large VCF file named “common_known_variants.vcf” which contains all known human variants downloaded from ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/00-common_all.vcf.gz -O common_known_variants.vcf.gz I’m trying to extract the known variants from only chromosomes 1,2,3,9,22, and X and write them in a new vcf file with the…

Continue Reading difficulty filtering vcf file with vcftools
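Independently of vcftools, the chromosome subset described above can be sketched as a plain line filter. This is an illustration under assumptions: the function name is made up, the example records are invented, and the chromosome names are assumed to be written without a `chr` prefix, which should be checked against the actual file.

```python
def filter_by_chromosomes(lines, keep):
    """Yield header lines plus records whose CHROM is in keep."""
    keep = set(keep)
    for line in lines:
        if line.startswith("#"):
            yield line                      # always keep the header
        elif line.split("\t", 1)[0] in keep:
            yield line                      # first column is CHROM


wanted = {"1", "2", "3", "9", "22", "X"}
# Invented records for illustration.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t10177\trs1\tA\tAC\t.\t.\t.",
    "4\t12345\trs2\tG\tT\t.\t.\t.",
    "X\t60020\trs3\tT\tC\t.\t.\t.",
]
filtered = list(filter_by_chromosomes(vcf_lines, wanted))
```

For a compressed file, the same generator can be fed from `gzip.open(path, "rt")`.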

Error in BAFFromGVCFs – GenotypeGVCFs

Bug Report Affected module(s) or script(s) Module00c/BAFFromGVCFs/GenotypeGVCFs Affected version(s) Description I’m running GATKSVPipelineBatch and I got the following error in the GenotypeGVCFs task: A USER ERROR has occurred: Input /tmp/scratch/bean-resources/broad-references/v0/Homo_sapiens_assembly38.dbsnp138.vcf must support random access to enable queries by interval. If it’s a file, please index it using the bundled tool…

Continue Reading Error in BAFFromGVCFs – GenotypeGVCFs

Latest dbSNP VCF

This is the directory you’re looking for: ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/ curl -s ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.39.gz | zcat | head ##fileformat=VCFv4.2 ##fileDate=20210513 ##source=dbSNP ##dbSNP_BUILD_ID=155 ##reference=GRCh38.p13 ##phasing=partial ##INFO=<ID=RS,Number=1,Type=Integer,Description=”dbSNP ID (i.e. rs number)”> ##INFO=<ID=GENEINFO,Number=1,Type=String,Description=”Pairs each of gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a…

Continue Reading Latest dbSNP VCF

Missing data per site

Hi, I want to calculate statistics of missing data for each site in my VCF file. Using vcftools --missing-site gives wrong stats for several sites. Is there any other way to calculate it? Thank you! I have 36 samples and here is an example of the vcftools --missing-site output…

Continue Reading Missing data per site
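As a cross-check on per-site missingness, the fraction of samples with a missing genotype can be counted directly from the GT field. The sketch below is an assumption-laden illustration: the function name and records are invented, a genotype is counted as missing if any of its alleles is `.`, and the result is per sample, which may differ from tools that count per allele.

```python
def missing_fraction_per_site(lines):
    """Return (chrom, pos, fraction of samples with missing GT) per record."""
    results = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        gt_index = fields[8].split(":").index("GT")  # FORMAT column
        samples = fields[9:]
        n_missing = 0
        for sample in samples:
            parts = sample.split(":")
            gt = parts[gt_index] if gt_index < len(parts) else "."
            # Treat "./.", ".|.", "." and partially missing calls as missing.
            if "." in gt.replace("/", "|").split("|"):
                n_missing += 1
        results.append((fields[0], int(fields[1]), n_missing / len(samples)))
    return results


# Invented four-sample records for illustration.
vcf_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT:DP\t0/1:10\t./.:0\t1/1:8\t0/0:5",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1\t0|1\t1/1",
]
missingness = missing_fraction_per_site(vcf_lines)
```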

bedtools intersect doesn’t return a VCF file?

I am filtering a VCF file with a bed file using Bedtools. I have carried out this successfully with bedtools intersect -wb -a myVCF.vcf -b myBEDfile.bed > output.txt However, what I want is to get a VCF file with the metadata and…

Continue Reading bedtools intersect doesn’t return a VCF file?
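The desired result, a VCF that keeps its header and only the records inside the BED intervals, can be sketched in plain Python. This is an illustration rather than a replacement for bedtools: the function names and data are invented, only the POS base of each record is tested (not the full REF span), and the linear scan over intervals would be slow for large BED files.

```python
def load_bed(lines):
    """Map chrom -> list of (start, end) from BED lines (0-based, half-open)."""
    intervals = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.setdefault(chrom, []).append((int(start), int(end)))
    return intervals


def vcf_records_in_bed(vcf_lines, intervals):
    """Yield all header lines and the records whose POS falls in an interval."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        pos0 = int(pos) - 1  # VCF is 1-based, BED is 0-based
        if any(start <= pos0 < end for start, end in intervals.get(chrom, [])):
            yield line


bed_lines = ["chr1\t99\t100"]
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t101\t.\tC\tT\t.\tPASS\t.",
]
kept = list(vcf_records_in_bed(vcf_lines, load_bed(bed_lines)))
```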

Hard filtering on GATK HaplotypeCaller giving multiple warnings

I’m using this pipeline for deriving variants from RNA sequencing data: github.com/modupeore/VAP which uses specific versions of various tools, including HaplotypeCaller from GATK (v3.8-0-ge9d806836). The final step is a set of hard filters on the called variants (applied using VariantFilter), but looking at the log files, there are a lot…

Continue Reading Hard filtering on GATK HaplotypeCaller giving multiple warnings

How Can I Merge VCF File ?

The most secure and trustworthy way to merge several VCF files into a single VCF is to use an effective VCF Merge Tool. One of my colleagues has just used the VCF Merge Tool, which let him merge multiple VCF files while maintaining high data integrity….

Continue Reading How Can I Merge VCF File ?

snp – Reference variant detected as altered one in bam file

I received (from the manufacturer) several .bam files and I used four callers (samtools, freebayes, haplotypecaller, deepvariant) to find some sequence variants. In the resulting .vcf files, I took a closer look at some calls. I found an interesting homozygous one, rs477033 (C/G Ref/Alt), with flag ‘COMMON=0’ and very low MAF. I also…

Continue Reading snp – Reference variant detected as altered one in bam file

Bioinformatics Scientist for Whole Genome and Whole Exome Sequencing

The NeuroGenomics and Informatics (NGI) Center led by Dr. Carlos Cruchaga at Washington University School of Medicine is recruiting a Bioinformatics Scientist to work on Whole Genome and Whole Exome Sequencing. We are seeking an experienced, self-motivated, self-driven scientist…

Continue Reading Bioinformatics Scientist for Whole Genome and Whole Exome Sequencing

Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes

Sequencing data: We used publicly available sequencing data from the GIAB consortium [45], 1000 Genomes Project high-coverage data [46] and Human Genome Structural Variation Consortium (HGSVC) [4]. All datasets include only samples consented for public dissemination of the full genomes. Statistics and reproducibility: For generating the assemblies, we used all 14 samples for…

Continue Reading Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes

how to extract unique variants from GVCF

[note: cross-posted on GATK forum – still awaiting a response] I have a GVCF (generated using GATK’s HaplotypeCaller w/ -ERC GVCF parameter) of 36 related samples and would like to determine the (potentially de novo) variants that are unique to each sample….

Continue Reading how to extract unique variants from GVCF
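One way to think about "unique to each sample" is to list sites where exactly one sample carries a non-reference allele. The sketch below assumes a joint-genotyped multi-sample VCF rather than the raw GVCF, uses invented sample names and records, and treats missing genotypes as non-carriers, which can overstate uniqueness when coverage is uneven.

```python
def carrier_indices(fields):
    """Indices of samples whose GT contains at least one non-reference allele."""
    gt_index = fields[8].split(":").index("GT")
    carriers = []
    for i, sample in enumerate(fields[9:]):
        alleles = sample.split(":")[gt_index].replace("|", "/").split("/")
        if any(a not in ("0", ".") for a in alleles):
            carriers.append(i)
    return carriers


def private_variants(vcf_lines):
    """Map sample name -> list of (chrom, pos, ref, alt) carried by it alone."""
    names, private = [], {}
    for line in vcf_lines:
        if line.startswith("##") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            names = fields[9:]
            private = {name: [] for name in names}
            continue
        carriers = carrier_indices(fields)
        if len(carriers) == 1:
            site = (fields[0], int(fields[1]), fields[3], fields[4])
            private[names[carriers[0]]].append(site)
    return private


# Invented three-sample example.
vcf_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t0/0\t0/0",
    "chr1\t200\t.\tC\tT\t.\t.\t.\tGT\t0/1\t0/1\t0/0",
    "chr1\t300\t.\tG\tA\t.\t.\t.\tGT\t0/0\t./.\t1|1",
]
unique_sites = private_variants(vcf_lines)
```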

wrong number of fields ?

Error occurrence after merging files with bcftools: wrong number of fields? I have multiple VCFs of CASES and CONTROLS variations annotated by VEP, SnpEff, SnpSift. first pair vcf -> only variations | CASES and CONTROLS second pair vcf -> variations + SnpEff | CASES and CONTROLS third pair vcf->…

Continue Reading wrong number of fields ?

L1193 – YFull YTree Info

I-L1193 – YFull YTree Info SNPs currently defining I-L1193 L1193     FGC87558     Y72031     Sample ID Country / Language Info Ref File Testing company Statistics Status ASH1 Ireland (Tipperary) I-L1193* —— Hg19 .BAM Ancient 1X, 10.5 Mbp, 101 bp PB581 Ireland (Clare) I-L1193* —— Hg19 .BAM Ancient 2X, 15.8…

Continue Reading L1193 – YFull YTree Info

Y18411 – YFull YTree Info

J-Y18411 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF072520 Albania J-BY111710 —— Hg19 .BAM Dante Labs 10X, 22.8 Mbp, 151 bp YF067307 Palestine (Nablus) J-BY111710 —— Hg38 .BAM FTDNA (Y700) 34X, 18.7 Mbp, 151 bp NA20827 Italy (Firenze) J-CTS3330 —— Hg19…

Continue Reading Y18411 – YFull YTree Info

How to Merge VCF files in Windows 10

Many organizations working on VCF have to face collecting and combining emails. Hiring technicians increases the data management cost. Along with that disadvantage, downtime is a big issue. It hampers work. Technicians often try to fix the problem manually. It is a time-consuming process, so trusting a vcf merge application is…

Continue Reading How to Merge VCF files in Windows 10

Variant quality and filters on GATK HaplotypeCaller generated VCFs

Hi, I am analysing human WGS data to diagnose rare inherited diseases. I followed the GATK Best Practices Guidelines for “Germline short variants discovery” for single-sample data to generate a VCF using HaplotypeCaller. The guidelines then point to the use…

Continue Reading Variant quality and filters on GATK HaplotypeCaller generated VCFs

Merge only bim files with plink

Hello, for the same dataset they provide single BED and FAM files for all the chromosomes. However, the BIM files are split by chromosome. I would like to generate the VCF file with the genotyping calls of all chromosomes but I need…

Continue Reading Merge only bim files with plink

BioInformatics Product Manager at Helix (remote)

You + Helix Helix is a place where innovators and doers gather in order to drive significant progress in population genomics. We have come together to work at the intersection of clinical care, research, and genomics.   If you’re excited by the idea of making a meaningful impact and joining a…

Continue Reading BioInformatics Product Manager at Helix (remote)

rna seq – RNAseq SNP discovery: deciding upon filters and dealing with allele expression bias

I am working with non-model plant RNA samples which have been deep sequenced and analysed using the STAR aligner under default parameters. Aim: We would like to conduct SNP discovery of these samples. Objective: Our ultimate goal with this genotypic data is to search for variants (both SNPs and indels)…

Continue Reading rna seq – RNAseq SNP discovery: deciding upon filters and dealing with allele expression bias

Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages

Díaz, S. et al. Summary for Policymakers of the Global Assessment Report on Biodiversity and Ecosystem Services of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES, 2019). Fisher, R. A. The correlation between relatives on the supposition of Mendelian inheritance. Earth Environ. Sci. Trans. R. Soc. Edinb. 52,…

Continue Reading Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages

using ANNOVAR annotation clinvar database out wrong position

Hello Biostars, I was trying to annotate a VCF using ANNOVAR, but I get wrong output; it seems my ClinVar database is not suitable. bcftools_callCommand=call -m -v -o /project/plantform/20220316PCR/03.amplify/L2107973CFD7G5kxT1/L2107973CFD7G5kxT1.variation.vcf /project/plantform/20220316PCR/03.amplify/L2107973CFD7G5kxT1/L2107973CFD7G5kxT1.mpileup.vcf

Continue Reading using ANNOVAR annotation clinvar database out wrong position

M8498 – YFull YTree Info

B-M8498 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF004283 Saudi Arabia B-M8498* —— Hg19 .BAM FTDNA (Y500) 43X, 13.7 Mbp, 165 bp HGDP00992 Namibia B-M7650* —— Hg38 .BAM Scientific 18X, 23.5 Mbp, 151 bp YF013963 —— B-Y82361 —— Hg38 .BAM FTDNA…

Continue Reading M8498 – YFull YTree Info

FGC15109 – YFull YTree Info

I-FGC15109 – YFull YTree Info SNPs currently defining I-FGC15109 FGC15109     Sample ID Country / Language Info Ref File Testing company Statistics Status SZ43 Hungary (Somogy) I-BY138* —— Hg19 .BAM Ancient 8X, 22.8 Mbp, 32 bp YF010533 —— I-BY138* —— Hg19 .BAM FTDNA (Y500) 73X, 14.9 Mbp, 165 bp YF019250…

Continue Reading FGC15109 – YFull YTree Info

bedtools -u not giving unique files

The following are the steps I’m following: First step to extract sample using bed file is this (here the bedfile is input bedfile converted to Hg38): tabix -h -R Hg19_to_Hg38_sorted.bed.gz gnomad.genomes.v{g_version}.hgdp_tgp.chr{chr}.vcf.bgz | perl {vcftools} -c {sample_name} > {sample_name}_out.vcf’ output({sample_name}_out.vcf’) chr2 113982416 rs56177103 TATAAAATAAAATAAA…

Continue Reading bedtools -u not giving unique files

bam – Detect mutation context in a read of a sam file

That kind of custom fiddling with reads and variants is very cumbersome, non-standard and also error-prone. Run a standard variant calling pipeline and then filter for the mutations that you want. Then extract the variant position (so the coordinates) and get the variant context from the reference genome. Using individual…

Continue Reading bam – Detect mutation context in a read of a sam file

vcf – Why does GATK produce both 0/1 and 1/0 genotypes in the same file? Are the two not equivalent?

I have always thought that 1/0 and 0/1 in VCF genotype fields are equivalent. And yet, GATK uses both. For example, these are two variants called in the same sample and the same run of GATK 4.1.4.0: chr7 117120317 . ATTCATTGTTTTGAAAGAAAGATGGAAGAATGAACTGAAG A 748.97 . AC=1;AF=0.5;AN=2;DP=64;ExcessHet=3.0103;FS=0;MLEAC=1;MLEAF=0.5;MQ=60;QD=11.89;SOR=7.223 GT:AD:DP:GQ:PL:SB 1/0:0,36:63:99:2294,1042,933:0,0,0,36 chr7 117120306 ….

Continue Reading vcf – Why does GATK produce both 0/1 and 1/0 genotypes in the same file? Are the two not equivalent?
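In the VCF format, `/` separates alleles of an unphased genotype and `|` separates alleles of a phased one, so `0/1` and `1/0` describe the same unordered pair while `0|1` and `1|0` do not. If a downstream tool compares GT strings literally, one option is to canonicalize unphased calls first. The helper below is a hypothetical sketch; it leaves phased and partially missing genotypes untouched.

```python
def normalize_unphased(gt):
    """Sort the allele indices of an unphased GT; leave other GTs unchanged."""
    if "|" in gt:
        return gt              # phased: allele order carries information
    alleles = gt.split("/")
    if not all(a.isdigit() for a in alleles):
        return gt              # missing or unexpected allele: do not touch
    return "/".join(sorted(alleles, key=int))


examples = ["1/0", "0/1", "2/1", "1|0", "./1"]
normalized = [normalize_unphased(gt) for gt in examples]
```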

split gtex genotype data by chromosomes.

Hello, I edited the command line to use --vcf to import a VCF file. I used these commands: for chr in $(seq 1 22); do plink --vcf /dbGAP/GTEx_Analysis_2017-06-05_v8_WholeExomeSeq_979Indiv_VEP_annot.vcf.gz --chr $chr --recode --out…

Continue Reading split gtex genotype data by chromosomes.

Understanding the number of intersection in bedtools jaccard

Hello, I am using bedtools jaccard to compare two vcf files, as: bedtools jaccard -a ancestors.calls.norm.snp.vcf.gz -b GC078310.calls.norm.snp.vcf.gz
intersection union-intersection jaccard n_intersections
1606899 1806667 0.889427 1536700
What I do not get is why n_intersections is equal to 1536700. Especially, the difference…

Continue Reading Understanding the number of intersection in bedtools jaccard
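The first three columns of the output above are mutually consistent: the reported jaccard value is the intersection divided by the union-intersection column, both of which are measured in base pairs. The variable names below are made up for the check. Since n_intersections is smaller than the intersection in bp, it appears to count intersecting intervals rather than bases, which is worth confirming against the bedtools documentation.

```python
# Values copied from the bedtools jaccard output in the question.
intersection_bp = 1606899
union_minus_intersection_bp = 1806667
reported_jaccard = 0.889427

# Jaccard statistic recomputed from the two base-pair columns.
jaccard = intersection_bp / union_minus_intersection_bp
matches_report = abs(jaccard - reported_jaccard) < 1e-6
```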

FGC19851 – YFull YTree Info

R-FGC19851 – YFull YTree Info SNPs currently defining R-FGC19851 FGC19851     Sample ID Country / Language Info Ref File Testing company Statistics Status YF072967 United States (Georgia) R-FGC19851* —— Hg38 .BAM FTDNA (Y700) 34X, 18.7 Mbp, 151 bp YF009427 —— R-FGC65264* —— Hg19 .BAM FTDNA (Y500) 38X, 12.8 Mbp, 165…

Continue Reading FGC19851 – YFull YTree Info

Re: Quick Way to Merge Multiple VCF Files into One

vCard files(.vcf) are essential for professional, personal, and even home purposes. Users need to merge multiple vCard files when they do not manage multiple vCard files. Another reason for users to combine multiple VCF files is security concerns. Here are some of the reasons:   Organize your address book contacts…

Continue Reading Re: Quick Way to Merge Multiple VCF Files into One

YP4024 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status ERS2478532 Turkmenistan Q-YP4024* —— Hg19 .BAM Scientific 17X, 16.7 Mbp, 151 bp YF006625 Russia (Tomskaya oblast’) / Selkup Q-YP4024* —— Hg19 .BAM FTDNA (Y500) 67X, 14.8 Mbp, 165 bp DA162 Russia (Severnaya Osetiya-Alaniya, Respublika) Q-BZ5214* —— Hg19 .BAM…

Continue Reading YP4024 – YFull YTree Info

HRJOB7442 Bioinformatics Scientist 2 (Various Locations) in Nether Alderley, Macclesfield (SK10) | Almac Group (Uk) Ltd

Bioinformatics Scientist 2 Hours: 37.5 hours per week Salary: Competitive Ref No: HRJOB7442 Business Unit: Diagnostic Services Location: Craigavon or Manchester Open To: Internal and External Applicants The Company Almac Diagnostic Services is a leading stratified medicine business, specialising in biomarker-driven clinical trials. We are incredibly proud to be involved…

Continue Reading HRJOB7442 Bioinformatics Scientist 2 (Various Locations) in Nether Alderley, Macclesfield (SK10) | Almac Group (Uk) Ltd

Genomic variation from an extinct species is retained in the extant radiation following speciation reversal

Vamosi, J. C., Magallon, S., Mayrose, I., Otto, S. P. & Sauquet, H. Macroevolutionary patterns of flowering plant speciation and extinction. Annu. Rev. Plant Biol. 69, 685–706 (2018). CAS  PubMed  Google Scholar  Rhymer, J. M. & Simberloff, D. Extinction by hybridization and introgression. Annu. Rev. Ecol. Syst. 27, 83–109 (1996)….

Continue Reading Genomic variation from an extinct species is retained in the extant radiation following speciation reversal

How to calculate r2 for IMPUTE2

Hi all, with all the help I was finally able to remove some SNPs from the VCF file and then run it through IMPUTE2. This means I have the original VCF and the imputed VCF. How do I run an r2 analysis? Is there a…

Continue Reading How to calculate r2 for IMPUTE2
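One common reading of "r2" in this setting is the squared Pearson correlation between the masked true genotypes (coded 0, 1, 2) and the imputed dosages at the same SNP across samples. That interpretation is an assumption, and the values below are invented. It is a different quantity from the quality metric that the imputation software reports itself.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic site: correlation is undefined
    return (sxy * sxy) / (sxx * syy)


# Invented data for one SNP across four samples.
true_genotypes = [0, 1, 2, 1]
imputed_dosages = [0.1, 0.9, 1.8, 1.2]
r2 = pearson_r2(true_genotypes, imputed_dosages)
```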

Y570 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status AF2 —— Q-Y570 Q-Y570*, Q-F746* Hg19 .BAM Ancient 1X, 1.3 Mbp, 94 bp YF093124 —— Q-M120* —— Hg38 .BAM Nebula Genomics 57X, 23.6 Mbp, 150 bp Kolyma1 Russia (Sakha, Respublika [Yakutiya]) Q-Y222276* —— Hg19 .BAM Ancient 7X, 13.4…

Continue Reading Y570 – YFull YTree Info

How to apply vcftools –diff and extract only the different variants

Hello, I am trying to apply vcftools --diff in order to extract the different variants between two VCF files. vcftools --vcf marked_IO002_tumor-pe.vcf --diff marked_IO002_normal-pe.vcf --diff-site --out t_v_n I am getting this as result : VCFtools – 0.1.16 (C)…

Continue Reading How to apply vcftools –diff and extract only the different variants

PF6747 – YFull YTree Info

E-PF6747 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF010216 Azerbaijan (Qəbələ) E-PF6747* —— Hg19 .BAM FTDNA (Y500) 50X, 13.7 Mbp, 165 bp YF064736 Egypt (Al Minūfīyah) E-FT97857* —— Hg38 .BAM FTDNA (Y700) 35X, 18.5 Mbp, 151 bp YF093064 Yemen (Tā’izz) E-Y280593…

Continue Reading PF6747 – YFull YTree Info

PostDoc Plant Bioinformatics job with SKOLKOVO INSTITUTE OF SCIENCE AND TECHNOLOGY

Want to participate in the outstanding new area of agro-genomics? To put into practice how genetic diversity and genome-assisted breeding in crops contribute to providing healthy and high quality food in a sustainable way to humankind? Strong in bioinformatics and interested in working with very large datasets…

Continue Reading PostDoc Plant Bioinformatics job with SKOLKOVO INSTITUTE OF SCIENCE AND TECHNOLOGY

java – GATK: HaplotypeCaller IntelPairHmm only detecting 1 thread

I can’t seem to get GATK to recognise the number of available threads. I am running GATK (4.2.4.1) in a conda environment which is part of a nextflow (v20.10.0) pipeline I’m writing. For whatever reason, I cannot get GATK to see there is more than one thread. I’ve tried different…

Continue Reading java – GATK: HaplotypeCaller IntelPairHmm only detecting 1 thread

Z2039 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status YF003382 Finland (Länsi-Suomen lääni) I-Z2040* —— Hg19 .BAM FTDNA (Y500) 47X, 13.3 Mbp, 165 bp YF067917 Ireland I-FGC69701* —— Hg19 .BAM Dante Labs 9X, 22.9 Mbp, 151 bp YF078735 Belarus (Vicebskaja voblasc’) / Polish I-FGC69702 —— Hg38 .VCF…

Continue Reading Z2039 – YFull YTree Info

BY7447 – YFull YTree Info

E-BY7447 – YFull YTree Info SNPs currently defining E-BY7447 BY7447     Sample ID Country / Language Info Ref File Testing company Statistics Status YF075635 Yemen (Al Bayḑā’) E-FT183181 —— Hg38 .BAM FTDNA (Y700) 39X, 18.2 Mbp, 151 bp YF067501 Yemen (Şan’ā’) E-FT183181 —— Hg38 .BAM FTDNA (Y700) 44X, 18.8 Mbp,…

Continue Reading BY7447 – YFull YTree Info

Ensembl VEP gnomAD annotated allele frequencies different from gnomAD browser

I’ve annotated some variants using VEP, and was looking at the minor allele frequencies. Some of the variants had very different MAFs in the annotation than I expected (I expected MAF < 1%, whereas some annotated MAFs were >50%). I looked up the same variants on the gnomAD v3 browser,…

Continue Reading Ensembl VEP gnomAD annotated allele frequencies different from gnomAD browser

Bioconductor on Microsoft Azure – Microsoft Tech Community

Co-authored by: Nitesh Turaga – Scientist at Dana Farber/Harvard, Bioconductor Core Team Erdal Cosgun – Sr. Data Scientist at Microsoft Biomedical Platforms and Genomics team Vincent Carey – Professor at Harvard Medical School, Bioconductor Core Team   Introduction   The Bioconductor project promotes the statistical analysis and comprehension of current and emerging…

Continue Reading Bioconductor on Microsoft Azure – Microsoft Tech Community

DF109 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status YF016926 Ireland R-DF109 R-DF109*, R-A18726* Hg38 .BAM FTDNA (Y500) 27X, 12.7 Mbp, 165 bp YF016394 United States (Ohio) R-DF109 R-DF109*, R-A18726* Hg38 .BAM FTDNA (Y500) 34X, 11.9 Mbp, 151 bp YF011566 Ireland (Mayo) R-DF109 R-DF109*, R-A18726*, R-FGC23742* Hg38…

Continue Reading DF109 – YFull YTree Info

Errors when compiling older version **samtools**

I have downloaded a BCF file from the ricevarmap website. In order to “view” this old BCF format and convert it to a newer one, it’s said that I have to install samtools-0.1.17, which has an older version of bcftools in it. When I make…

Continue Reading Errors when compiling older version **samtools**

GATK HaplotypeCaller with interval list

I am trying to use the -L option of GATK HaplotypeCaller to call SNPs and short InDels within an interval list. My interval list file (top8snp.interval_list) content is as follows: 12 33029845 33030845 + rs24767598 13 40586682 40587682 + rs24748362 18 24373857 24374857 + rs8856159 21 50381146 50382146 +…

Split multiallelic SNPs to biallelic from vcf

Dear all, I have a particular vcf file like this:
chrX 29 . G A,T . PASS AC=1,1;AN=3 GT:DP:HF:CILOW:CIUP:SDP 0/1/2:4839:0.003,0.001:0.002,0.0:0.005,0.003:14;0,4;2
I tried various tools to split this, but I get the following results, in which the FORMAT and INFO fields are identical across the split records:
chrX 29 . G A . PASS AC=1,1;AN=3;OLD_MULTIALLELIC=chrM:899:G/A/T GT:DP:HF:CILOW:CIUP:SDP…

ZP77 – YFull YTree Info

R-ZP77 – YFull YTree Info
SNPs currently defining R-ZP77: ZP77 / FGC6562
Sample ID Country / Language Info Ref File Testing company Statistics Status
YF008362 —— R-ZP77* —— Hg19 .BAM FTDNA (Y500) 41X, 13.8 Mbp, 165 bp
YF067652 Unknown R-BY40744 —— Hg38 .BAM FTDNA (Y700) 36X, 18.7 Mbp, 151…

python – How can I fix the dash bio error: devtools cannot load source map dashbio@1.0.1 bundle.js.map?

I am implementing a website in Python with Django framework and using django-plotly-dash to display data. I am trying to use dash_bio’s IGV feature to display some chromosome data, but when I attempt to call the functionality, I receive the following errors and the callback that returns ‘dashbio.igv’ is unable…

bcftools merged vcf file assigns all variants to one sample

I’ve made one vcf file for each of three samples. I then combined them using bcftools, like so:
# Make a list of vcf files to merge
cat "${OUT}/results/variants/vcf_list"
/mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/data/test/manual/results/variants/3a7a-10.vcf.gz
/mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/data/test/manual/results/variants/MF3.vcf.gz
/mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/data/test/manual/results/variants/R507H-FB_S355_L001.vcf.gz
Then merge the list:
bcftools merge -l…

variant – Where should you put your cache for ensembl-vep using conda

I’ve installed vep in conda like so:
conda install ensembl-vep=105.0-0
And then I installed the human cache like so:
vep_install -a cf -s homo_sapiens -y GRCh38 -c /mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/refs/vep --CONVERT
But when I try and run vep I get an error:
vep --dir_cache /mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/refs/vep -i /mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/data/test/manual/results/variants/cohort.norm_recalibrated.vcf -o /mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/data/test/manual/results/variants/cohort.norm_recalibrated_vep.vcf
Am I doing…

variant – Error running gatk HaplotypeCaller with allele specific annotations

I’ve got HaplotypeCaller working nicely in standard mode, like so:
# Run HaplotypeCaller
gatk --java-options "-Xmx4g" HaplotypeCaller --intervals "$INTERVALS" -R "$REF" -I "$OUT"/results/alignment/${SN}_sorted_marked_recalibrated.bam -O "$OUT"/results/variants/${SN}_g.vcf.gz -ERC GVCF
But when I try in allele-specific mode, I get the following error. All I’ve done is add the -G annotations at the end,…

linux – How to fix Perl from anaconda not installing bioperl? Bailing out the installation for BioPerl

vep -i examples/homo_sapiens_GRCh38.vcf --database
Can't locate Bio/PrimarySeqI.pm in @INC (you may need to install the Bio::PrimarySeqI module) (@INC contains: /home/youssef/anaconda3/envs/ngs1/share/ensembl-vep-88.9-0/modules /home/youssef/anaconda3/envs/ngs1/share/ensembl-vep-88.9-0 /home/youssef/anaconda3/envs/ngs1/lib/site_perl/5.26.2/x86_64-linux-thread-multi /home/youssef/anaconda3/envs/ngs1/lib/site_perl/5.26.2 /home/youssef/anaconda3/envs/ngs1/lib/5.26.2/x86_64-linux-thread-multi /home/youssef/anaconda3/envs/ngs1/lib/5.26.2 .) at /home/youssef/anaconda3/envs/ngs1/share/ensembl-vep-88.9-0/Bio/EnsEMBL/Slice.pm line 75.
BEGIN failed--compilation aborted at /home/youssef/anaconda3/envs/ngs1/share/ensembl-vep-88.9-0/Bio/EnsEMBL/Slice.pm line 75.
Compilation failed in require at /home/youssef/anaconda3/envs/ngs1/share/ensembl-vep-88.9-0/Bio/EnsEMBL/Feature.pm line 84.
BEGIN failed--compilation aborted at /home/youssef/anaconda3/envs/ngs1/share/ensembl-vep-88.9-0/Bio/EnsEMBL/Feature.pm…

Variant calls of published already assembled genomes

I have a set of short-read sequencing data for the 172 kb Epstein-Barr virus genome. We successfully called our variants against a reference genome using GATK. A publication linked below from a different population compared variants (also from short-read sequencing) to…

why my VCF file generated with manta is missing genotype information

Hi, everybody, I am pretty new to coding and bioinformatics. I am using Manta to infer somatic structural variants (SVs) from a paired tumor/normal sample call. However, my somaticSV.vcf.gz file does not contain information about the genotype or the genotype quality (there is a dot instead of…

bedtools intersect error: Invalid record in file

Hello to all, I am trying to run bedtools intersect with a vcf file and a bed file (my goal is to add the depth data to my VCF). I get an error running this command:
bedtools intersect -a depth.bed -b fish.vcf -wa -wb > $out
The error: “Error: Invalid record…

What file type does "PLINK --block" accept as input?

Hi, I have a set of SNPs (distributed over all the chromosomes) and I am trying to do some haplotype block estimation to identify whether some of them are part of the same haplotype block, etc. It seems like "PLINK --blocks"…

VEP issue: ERROR: Cache assembly version (GRCh37) and database or selected assembly version (GRCh38) do not match

Describe the issue: VEP gives errors even though my query and reference have the same assembly version.
Command: ./vep -i examples/homo_sapiens_GRCh37.vcf --cache --refseq
Cache reference details while running install.pl:
? 458
NB: Remember to use --refseq when running the VEP with this cache!
downloading ftp.ensembl.org/pub/release-104/variation/indexed_vep_cache/homo_sapiens_refseq_vep_104_GRCh37.tar.gz
unpacking homo_sapiens_refseq_vep_104_GRCh37.tar.gz
converting cache, this may…

dbSNP specific to C57BL6J

Hi, is it possible to obtain a dbSNP file that is specific to a strain, e.g. C57BL6J? I tried looking for it on the NCBI, MGI and Jackson websites, but I can’t seem to find a strain-specific vcf. Thanks…

Failed to instantiate plugin dbNSFP in VEP

Hi Team, My VEP (version 105, installed by perl INSTALL.pl) works well. But I am having problems using the dbNSFP plugin (also installed by perl INSTALL.pl) with VEP. My dbNSFP version 4.2a was installed by the following code without any warning…

help with CrossMap

Hello all, I would really appreciate your help, as I am new to working with different genome builds and have hit a setback lifting a vcf file from build hg38 to hg19. In essence, when using CrossMap the chromosome value gets altered. For example, below is the…

Variant physical position must be monotonically increasing

I want to calculate XPEHH for each SNP position. When I run the following command
selscan --xpehh --vcf B10_beagle.vcf --vcf-ref D6_beagle.vcf --map MAP.map --threads 8 --out B10vsD6
I get this error
ERROR: Variant physical position must be monotonically increasing Ch2:66 66…
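One way to locate the records behind a message like this is to scan the VCF data lines for positions that fail to increase within a chromosome. The sketch below is a generic check, not part of selscan. It treats a repeated position as a violation, which is an assumption based on the "66 66" pair in the quoted error. The function name is hypothetical.

```python
def find_position_violations(vcf_lines):
    """Report records whose POS does not strictly increase per chromosome.

    Returns a list of (line_number, chrom, previous_pos, pos) tuples.
    Treating an equal position as a violation is an assumption.
    """
    last_pos = {}
    violations = []
    for line_number, line in enumerate(vcf_lines, start=1):
        # Header lines and blank lines carry no variant position.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t", 2)[:2]
        pos = int(pos)
        if chrom in last_pos and pos <= last_pos[chrom]:
            violations.append((line_number, chrom, last_pos[chrom], pos))
        last_pos[chrom] = pos
    return violations


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT",
        "Ch2\t50\t.\tA\tG",
        "Ch2\t66\t.\tC\tT",
        "Ch2\t66\t.\tC\tA",
        "Ch2\t70\t.\tG\tA",
    ]
    for hit in find_position_violations(demo):
        print(hit)
```

The same check could be applied to the physical-position column of the map file, since the excerpt does not say which input triggered the error.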

sniffles failed to detect SV on minimap2 alignments

When I use ngmlr, sniffles works. The coverage is more than 90%. The code I posted on GitHub is exactly what it generated; I don’t think there is any error.

Benchmarking the NVIDIA Clara Parabricks germline pipeline on AWS

This blog post was contributed by Ankit Sethia, PhD, and Timothy Harkins, PhD, at NVIDIA Parabricks, and Olivia Choudhury, PhD,  Sujaya Srinivasan, and Aniket Deshpande at AWS. This blog provides an overview of NVIDIA’s Clara Parabricks along with a guide on how to use Parabricks within the AWS Marketplace. It…

rust-bio-tools 0.35.0 – Docs.rs

rust-bio-tools-0.35.0 is not a library. A set of ultra fast and robust command line utilities for bioinformatics tasks based on Rust-Bio. Rust-Bio-Tools provides a command rbt, which currently supports the following operations:
a linear time implementation for fuzzy matching of two vcf/bcf files (rbt vcf-match)
a vcf/bcf to txt converter,…

bcftools merge of over 9000+ vcf files

Hi all, I have around 9000+ vcf files that I’m trying to merge using bcftools merge. They are all located in their own folder so essentially I have a folder containing 9000+ separate folders, each containing one vcf.gz file. I have tried out the following code via this tutorial bcftools…
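For a layout like the one described (one vcf.gz per sub-folder), the file list that bcftools merge reads through its -l / --file-list option, one file name per line, can be generated rather than typed. A minimal sketch follows, assuming every file to merge ends in .vcf.gz somewhere under a single root folder. The function name and the folder and file names in the usage note are hypothetical.

```python
from pathlib import Path


def write_vcf_list(root, list_path):
    """Collect every *.vcf.gz under root and write one path per line.

    Returns the sorted list of paths that was written.
    """
    # rglob descends into every sub-folder of root.
    paths = sorted(str(p) for p in Path(root).rglob("*.vcf.gz"))
    Path(list_path).write_text("".join(p + "\n" for p in paths))
    return paths
```

Calling write_vcf_list("samples", "vcf_list.txt") would produce a list that can then be passed as bcftools merge -l vcf_list.txt. Sorting keeps the sample order reproducible between runs.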

Sort vcf file based on Satsuma synteny output

Hi all, I have been using satsuma synteny to assign scaffolds (from the genome of my study species) to the chromosomes of a closely related species. I now have a tab delimited file that lists these scaffolds in the order that…

Dragen-gatk for trio

Hi everyone, the Dragen GATK pipeline works great for a single sample. However, I would like to know if anyone has used this pipeline for a trio. If so, how did you do it? It is recommended to do hard filtering based on QUAL, but how…

Reference panel data to be used for GCTA-COJO

I performed a genome-wide meta-analysis based on summary statistics from the four cohorts to identify significant loci. Next, I would like to perform a conditional analysis using GCTA-COJO to search for SNPs independent of significant lead SNPs. I know that GCTA…

Blast command line pipeline not working

Hello, I am now running a local BLAST pipeline on macOS. The goal here is to take the intervals of the 5 best hits and then extract the SNP variants from multiple vcf.gz files. But I am facing an error which I cannot solve…

Padding out a GVCF file with 1000G exomes to get gatk VariantRecalibrator working with a small sample

I’ve got sequencing data for a small 500 bp amplicon from a few samples. The GATK Best Practices suggest running VariantRecalibrator on the GVCF files I generate. I’m trying to get this working, but I get an error about “Found annotations with zero variances”. Reading the gatk manual and other posts…

Large-scale genome-wide study reveals climate adaptive variability in a cosmopolitan pest

Genomic data: The foundational resource for this study was a dataset of 40,107,925 nuclear SNPs sequenced from a worldwide sample of 532 DBM individuals collected in 114 different sites based on our previous project [15]. DNA was extracted from each of the 532 individuals using DNeasy Blood and Tissue Kit (Qiagen,…

how to add reference alleles to VCF?

I’m converting gVCFs to VCF, but the reference alleles are missing. An example below:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 180525_FD02929177
1 97547947 . T . . . DP=31 GT:DP:RGQ 0/0:31:81
1 97915614 . C . . . DP=40…

gatk VariantRecalibrator positional argument error

I’m trying to recalibrate my vcf using gatk VariantRecalibrator, but keep getting an error “Illegal argument value: Positional arguments were provided”. I don’t know what this means, or how to correct it! Here’s my call:
gatk VariantRecalibrator -R "/Volumes/Seagate Expansion Drive/refs/hg38/gatk download/Homo_sapiens_assembly38.fasta" -V "$OUT"/results/variants/"$SN".norm.vcf.gz -AS --resource hapmap,known=false,training=true,truth=true,prior=15.0: "/Volumes/Seagate…

Senior Bioinformatics Scientist II/ Staff Bioinformatics Scientist

Inscripta was founded in 2015 and recently launched the world’s first benchtop Digital Genome Engineering platform. The company is growing aggressively, investing in its leadership, team, and technology with a recent $150 million financing round led by Fidelity and T. Rowe Price. The company’s advanced CRISPR-based platform, consisting of an instrument, reagents,…
