Tag: GRCH37

The Evolution from HG19 to HG38

Welcome to another blog post! Reference genomes are essential benchmarks of a species’ genome that facilitate the accurate comparison of individual genomes and are crucial tools for identifying genetic variants and diagnosing rare diseases. Here, we will explore the evolution of the human reference genome, focusing on the transition…

A Benchmark of Genetic Variant Calling Pipelines Using Metagenomic Short-Read Sequencing

Introduction Short-read metagenomic sequencing is the technique most widely used to explore the natural habitat of millions of bacteria. In comparison with 16S rRNA sequencing, shotgun metagenomic sequencing (MGS) provides sequence information of the whole genomes, which can be used to identify different genes present in an individual bacterium and…

An FGFR2 mutation as the potential cause of a new phenotype including early-onset osteoporosis and bone fractures: a case report | BMC Medical Genomics

Anamnesis vitae A 13-year-old male was born as a result of the VII pregnancy, from unrelated parents. Other pregnancies resulted in: I-II silent miscarriage in the second trimester; III – female, born in 2003 (III-3 Fig. 1) who has the following phenotypic features: genu valgum, hip dysplasia, combined thoracolumbar scoliosis,…

Archaic Introgression Shaped Human Circadian Traits | Genome Biology and Evolution

Abstract When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely, Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultraviolet radiation and increased variation in…

bcftools=1.18 not filtering correcting MAF

Hi, I have encountered some issues when using bcftools v.1.11, v.1.14 or v.1.18. I want to filter MAF<=0.01 & 'F_MISSING<0.1' for rare-variant analysis. I have a VCF file mapped to GRCh37, left-aligned, and multi-allelic split. bcftools view -q 0.01:minor test1.vcf > test2.vcf…
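
The two thresholds named in this question can be illustrated outside bcftools. What follows is a minimal Python sketch, assuming biallelic sites and made-up allele counts; the function names are hypothetical and this is not how bcftools implements its filters. As far as the bcftools documentation describes it, -q sets a minimum allele frequency and -Q a maximum, so a MAF<=0.01 filter would normally call for the latter.

```python
def minor_allele_frequency(alt_count: int, total_alleles: int) -> float:
    """MAF for a biallelic site: the rarer of the REF and ALT alleles."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    minor_count = min(alt_count, total_alleles - alt_count)
    return minor_count / total_alleles


def keep_rare_site(alt_count: int, total_alleles: int,
                   n_missing: int, n_samples: int,
                   max_maf: float = 0.01, max_missing: float = 0.1) -> bool:
    """Keep a site when MAF <= max_maf and the missing fraction < max_missing."""
    maf = minor_allele_frequency(alt_count, total_alleles)
    f_missing = n_missing / n_samples
    return maf <= max_maf and f_missing < max_missing


# Hypothetical sites: (ALT allele count, called alleles, missing samples, samples)
sites = [(10, 1000, 0, 500), (200, 1000, 0, 500), (5, 900, 50, 500)]
kept = [keep_rare_site(*site) for site in sites]
```

The first site is rare and well genotyped, the second is too common, and the third is rare but sits exactly at the missingness limit, so only the first one passes.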

kallisto index build difference according to version

Hi all, I’m trying to implement kallisto for a dataset of single-end RNA-seq data, and obviously started with building an index (the files were downloaded from Ensembl): Homo_sapiens.GRCh37.ncrna.fa.gz Homo_sapiens.GRCh37.cdna.all.fa.gz using the command kallisto index -i index.idx Homo_sapiens.GRCh37.ncrna.fa.gz Homo_sapiens.GRCh37.cdna.all.fa.gz And although this wasn’t…

East Asian-specific and cross-ancestry genome-wide meta-analyses provide mechanistic insights into peptic ulcer disease

We conducted a three-stage genome-wide analysis of PUD and its subtypes. An overview of the workflow is provided in Fig. 1 and Supplementary Fig. 1. PUD cases in the east Asian populations were obtained by combining individuals with any of the two major PUD subtypes (DU and GU), which were…

DNA methylation change in blood cells of FB and CFS patients

Introduction Fibromyalgia (FM) and Chronic Fatigue Syndrome (CFS) are characterized by chronic pain, fatigue, and weakness. Patients with these symptoms also suffer from sleep abnormalities and report affected cognitive processes such as memory. The diagnosis of these two syndromes is challenging and is based on questionnaires that make the diagnosis…

How to overlap patient VCF with ClinVar database annotation using bedtools?

Hello, I’m trying to help a colleague who is trying to add the ClinVar database’s clinical significance column to VCF samples that she analysed. More specifically, we are trying to add overlapping/common variant annotation so that if the variant…

How to perform liftover from 38 to 37 in R?

I have some GWAS summary statistics in GRCh38 that I want to lift to GRCh37. I am trying to liftover in R using this code: library(tidyverse) library(magrittr) library(data.table) library(rtracklayer) library(GenomicRanges) rm(list=ls()) gwas_data <- fread("/gwas_sumstats_allchr.txt") chain_file <- "/chain_files/hg38ToHg19.over.chain" chain <- import.chain(chain_file) # Convert to GRanges object (assuming GENPOS is 1-based) gwas_ranges…
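
The comment about 1-based positions in this question touches on a common source of off-by-one errors when preparing liftover input. Below is a small Python sketch of that conversion, assuming the positions are 1-based SNP coordinates and that the target is a 0-based, half-open BED-style interval with UCSC-style "chr" names (the naming used in UCSC chain files). The function name is hypothetical and the code is independent of the R code in the post.

```python
from typing import Tuple, Union


def to_ucsc_bed(chrom: Union[int, str], pos_1based: int) -> Tuple[str, int, int]:
    """Turn a 1-based SNP position into a 0-based, half-open BED interval."""
    if pos_1based < 1:
        raise ValueError("1-based positions start at 1")
    name = str(chrom)
    if not name.startswith("chr"):
        name = "chr" + name  # Ensembl-style "1" becomes UCSC-style "chr1"
    if name == "chrMT":
        name = "chrM"  # UCSC names the mitochondrial sequence chrM
    return name, pos_1based - 1, pos_1based


# Example with a position quoted elsewhere on this page (1:6522052 on GRCh37)
interval = to_ucsc_bed(1, 6522052)
```

A single base at 1-based position p maps to the BED interval [p-1, p), which is why the start is decremented and the end is left unchanged.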

ILIAD: a suite of automated Snakemake workflows for processing genomic data for downstream applications | BMC Bioinformatics

Pipeline architecture and configuration file Genomic data processing poses a challenge for genetic research studies because it involves multiple program dependency installations, vast numbers of samples with raw data from various next-generation sequencing (NGS) platforms, and inconsistent genetic variant ID and/or positions among datasets. The Iliad suite of genomic data…

Clonal Hematopoiesis and Cardiovascular Disease in Patients With Multiple Myeloma Undergoing Hematopoietic Cell Transplant | Cardiology | JAMA Cardiology

Key Points Question  Is clonal hematopoiesis of indeterminate potential (CHIP) detected at the time of hematopoietic stem transplant (HCT) associated with increased rates of cardiovascular disease (CVD) among patients with multiple myeloma (MM) following HCT? Finding  In this cohort study of patients with MM undergoing HCT, CHIP was highly prevalent…

human genome – How many Ns and ns in GRCh37 / GRCh38 per ‘canonical’ chromosome?

This is kind of pedantic, but I’m not sure where to look… For GRCh38 (and a lot of work…) I have the following…

Chr   Length       Ns          ns
chr1  248,956,422  18,475,229  181
chr2  242,193,529  1,645,291   10
chr3  198,295,559  195,420     4
chr4  190,214,555  461,888     0
chr5  181,538,259  272,881     0
chr6  170,805,979  727,255…
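
Per-chromosome counts like these can be recomputed from a FASTA file. The sketch below is one simple way to do it in Python, assuming plain (uncompressed) FASTA text supplied line by line; the function name and the toy input are hypothetical, and the numbers in the post were not produced with this code.

```python
from typing import Dict, Iterable, List, Tuple


def count_n_per_sequence(lines: Iterable[str]) -> Dict[str, Tuple[int, int, int]]:
    """Return {sequence name: (length, count of 'N', count of 'n')}."""
    totals: Dict[str, List[int]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the name is the first word after '>'
            name = line[1:].split()[0]
            totals[name] = [0, 0, 0]
        elif name is not None:
            totals[name][0] += len(line)
            totals[name][1] += line.count("N")  # hard-masked or gap bases
            totals[name][2] += line.count("n")  # soft-masked unknown bases
    return {key: (v[0], v[1], v[2]) for key, v in totals.items()}


# Toy input standing in for a real reference FASTA
example = [">chrA demo", "ACGTNNNN", "nnACgt", ">chrB", "NNNN"]
counts = count_n_per_sequence(example)
```

For a real reference, the same function can be passed an open file handle, since it reads one line at a time and never holds a whole chromosome in memory.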

Quantification of rare somatic single nucleotide variants by droplet digital PCR using SuperSelective primers

Primary samples and nucleic acid extraction A cohort of 48 patients diagnosed with advanced adenoma (AAD), defined by size > 20 mm, or colorectal carcinoma (CRC) were collected between 2013 and 2016. The study was approved by the institutional ethics committee of Hospital General Universitario de Alicante (Ref. CEICPI2013/01), and written informed consent…

Extracting the near amino acid number from an essential splice site variant

Hi all, I have some essential splice site variants, and I am trying to find a systematic way to derive the nearest amino acid number to the variants. For example, I have 1:6522052:A:G (GRCh37), and it’s HGVS…

Comparative Analysis of Structural Variant Callers on Short-Read Whole-Genome Sequencing Data

Pang, A.W., MacDonald, J.R., Pinto, D., et al., Towards a comprehensive structural variation map of an individual human genome, Genome Biol., 2010, vol. 11, no. 5, p. R52. doi.org/10.1186/gb-2010-11-5-r52 The International HapMap Consortium, The international HapMap project, Nature, 2003, pp. 789–796. doi.org/10.1038/nature02168 Sudmant,…

Inactive S. aureus Cas9 downregulates alpha-synuclein and reduces mtDNA damage and oxidative stress levels in human stem cell model of Parkinson’s disease

Cloning of CRISPR/sgRNA lentiviral constructs with fluorescent selection markers A tetracycline-inducible promoter (TRE3G) was used to control the expression of S. aureus dCas9 in a lentiviral vector. To facilitate selection of cells by FACS, pHR:TRE3G-SadCas9-2xKRAB-p2a-tdTomato (Addgene ID #209298) was subcloned from a pHR:TRE3G-SadCas9-2xKRAB-p2a-zeo (A gift from Professor Stanley Qi), where zeocin…

map Ensembl gene ID from hg19 to hg38

Hello! I would like to convert Ensembl gene ID from hg19 to hg38 with R. I tried with this code: ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl", host= "grch37.ensembl.org") ensembl_ids <- c("ENSG00000183878", "ENSG00000146083") converted_ids <- getLDS(attributes = c("ensembl_gene_id"), filters = "ensembl_gene_id", values…

public databases – Converting VCF format to text for use with PLINK and understanding column mapping

I successfully completed Nature PRS tutorial, which is based on PLINK. Turning to my real data, I downloaded ukb-d-20544_1.vcf.gz. Now I’m facing the problem that I seem to be unable to use it in PLINK or find the correct data format to download at all, and I am a bit…

LOC127815786 H3K27ac-H3K4me1 hESC enhancer GRCh37_chr9:124216027-124216526 [Homo sapiens (human)] – Gene

RefSeqs maintained independently of Annotated Genomes: These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by…

Genotyping, sequencing and analysis of 140,000 adults from Mexico City

Recruitment of study participants The MCPS was established in the late 1990s following discussions between Mexican scientists at the National Autonomous University of Mexico (UNAM) and British scientists at the University of Oxford about how best to measure the changing health effects of tobacco in Mexico. These discussions evolved into…

Distribution tendencies of pathogens causing LRTI

Introduction Lower respiratory tract infection (LRTI) remains one of the leading causes of death worldwide.1 Several well-known pathogens, including Streptococcus pneumoniae, Pseudomonas aeruginosa, Klebsiella pneumoniae, Candida, Herpesvirus, and others, have been identified as significant causes of infection.2 Nonetheless, nearly half of the cases still have an undetermined etiology,3,4 despite the…

Picard Liftover MismatchedRefAllele PsychArray

New to using liftOver and working with VCF files generally: I ran liftOver on data gathered from the PsychChip array to lift over from GRCh37 to GRCh38, and got only about 50% of variants lifted over. Most of the rejected ones had “MismatchedRefAllele” as their…

AlphaMissense Plugin VEP

I’ve installed the AlphaMissense plugin in VEP, but I can’t use it. I’ve downloaded the requested files and run the tabix command before using it. Then I launched the command but got this error: WARNING: Failed to instantiate plugin AlphaMissense: ERROR: No file specified Try using…

Progress and challenges in completing the human gene catalogue

In a recent review published in Nature, a group of authors reviewed the progress and challenges in annotating the human genome, including protein-coding genes, isoforms, and non-coding ribonucleic acids (RNAs), and advocated for a universal annotation standard for clinical use. Study: The status of the human gene catalogue. Image Credit:…

Bioconductor – SNPlocs.Hsapiens.dbSNP142.GRCh37

DOI: 10.18129/B9.bioc.SNPlocs.Hsapiens.dbSNP142.GRCh37 This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see SNPlocs.Hsapiens.dbSNP142.GRCh37. SNP locations for Homo sapiens (dbSNP Build 142) Bioconductor version: 3.13 SNP locations and alleles for Homo sapiens extracted from NCBI dbSNP Build 142. The source data files used for…

KCNQ potassium channels modulate Wnt activity in gastro-oesophageal adenocarcinomas

Introduction The KCNQ (potassium voltage-gated channel subfamily Q) family of ion channels encode potassium transporters (1). KCNQ proteins typically repolarise the plasma membrane of a cell after depolarisation by allowing the export of potassium ions, and are therefore involved in wide-ranging biological functions including cardiac action potentials (2), neural excitability…

bcftools error merging two VCFs: REF prefixes differ

Hi all, I am trying to merge two VCF files using bcftools merge. However, my command bcftools merge -m id VCF_d.vcf.gz VCF_p.vcf.gz -o merged.vcf.gz --force-samples returns the following: The REF prefixes differ: TG vs GA (2,2) Failed to merge alleles at 18:786377 in VCF_d.vcf.gz These are the entries in the…

Diagnostic genome sequencing improves diagnostic yield: a prospective single-centre study in 1000 patients with inherited eye diseases

Introduction Although protein-coding regions represent only 1–2% of the human genome, they harbour an estimated 85% of annotated pathogenic variants.1 2 Despite these numbers, genome sequencing (GS) usually achieves a higher diagnostic yield than sequencing approaches that focus on exonic regions, not least because of its more homogeneous coverage3 4…

KidneyGPS: a user-friendly web application to help prioritize kidney function genes and variants based on evidence from genome-wide association studies | BMC Bioinformatics

User interface The user interface of KidneyGPS is organized into five tabs: Three tabs enable the specific search for genes, variants and regions (underlying data structure shown in Additional file 1: Fig. S4): (1) “gene search” tab: search for genes using their gene names (synonyms automatically mapped to their official HGNC…

Cell-free chromatin immunoprecipitation to detect molecular pathways in heart transplantation

Abstract Existing monitoring approaches in heart transplantation lack the sensitivity to provide deep molecular assessments to guide management, or require endomyocardial biopsy, an invasive and blind procedure that lacks the precision to reliably obtain biopsy samples from diseased sites. This study examined plasma cell-free DNA chromatin immunoprecipitation sequencing (cfChIP-seq) as…

Idat raw data conversion

Hello everybody, we have just started genotyping with GSA and generated our first idat files. For QC we were advised to use GenomeStudio, and for that we downloaded all the necessary files: bpm, egt, imap files from the Illumina website. Yet when we performed analysis in GenomeStudio,…

Liftover GRCh37 to hg38 1kg/GATK.

I need to liftover a few variants from GRCh37 to hg38 1kg/GATK. UCSC liftOver does not have this reference genome version available. I have tried with the standard hg38 but the conversions are wrong. Where can I find GRCh37 to hg38 1kg/GATK chain files or…

Managing your data (BAM, VCF, sample, phenotype) with RDF and SPARQL.

13 years after “How Do You Manage Your Files & Directories For Your Projects?”, I wrote a tutorial about how I now manage my data: BAM, VCF, sample, phenotype, reference etc… how to link everything…

GRCh37/38 reference genotype AF wrong ?

Dear Colleagues, I am new to variant calling and started to analyse my VCF generated from WES BAM files to isolate clinically relevant germline variations. The VCF was generated using GRCh38 as reference sequence. Now I stumbled over the fact that a huge…

SNPs that have the same position and alleles, which rsnumber to pick?

When trying to match SNPs to rs numbers based on position I came across this problem. There are multiple SNPs at the same position with the same alleles and they are not synonyms or merged into each…

Challenging interpretation of germline TP53 variants based on the experience of a national comprehensive cancer centre

Subjects TP53 gene was investigated in 880 consecutive oncology patients referred for molecular genetic testing at our national centres (Department of Laboratory Medicine, Semmelweis University and Department of Molecular Genetics, National Institute of Oncology) between 2021 and 2022. This cohort consisted of patients with potential hereditary tumour predisposition. Their genetic…

Checking a SNP as common SNP or not using UCSC genome browser

Hi all, I want to know how I can tell if a variant I got is a common SNP or not using the UCSC genome browser. For example, if I got 1:115258683-A>A/C on GRCh37, how can I check…

Multivariate Analysis of Transcript Splicing (MATS)

Install rMATS: Add the Python directory to the $PATH environment variable. Add the bowtie and tophat directories to the $PATH environment variable. Add the samtools directory to the $PATH environment variable. Obtain a bowtie index for the genome in either of the following two ways: build your own bowtie index using bowtie-build from…

GATK AnnotateVcfWithBamDepth returns zero DP for all variants in VCF

Dear all, I am using GATK (v4.1.9.0) AnnotateVcfWithBamDepth to get the DP for all variants in ClinVar VCF in a retina RNA-seq BAM file. However, the tool returns zero depth for all variants in the VCF, even though I checked multiple variants in IGV and I saw that they are…

Identification of two novel variants of the DMD gene

Introduction Duchenne muscular dystrophy (DMD, OMIM#310200) is a severe X-linked recessive, inherited neuromuscular disorder, characterized by rapidly progressive muscle weakness and muscle wasting throughout the body.1 DMD is more common in males than females, with an incidence rate of 1:5000 and 1:50,000,000, respectively.2 Female heterozygotes theoretically have 50% normal cells…

Nuclear genetic control of mtDNA copy number and heteroplasmy in humans

Overview of mtSwirl Here we develop mtSwirl, a scalable pipeline for mtCN and variant calling which makes calls relative to an internally generated per-sample consensus sequence before mapping all calls back to GRCh38. In addition to GRCh38 reference files and WGS data, the mtSwirl pipeline takes as input nuclear genome…

Long-molecule scars of backup DNA repair in BRCA1- and BRCA2-deficient cancers

Pan-cancer WGS data sources GRCh37/hg19 BAM alignments for 2,489 primary tumour and matched normal whole-genome sequencing data were obtained as previously described18. In brief, 989 tumour–normal (T/N) pairs were obtained from The Cancer Genome Atlas (TCGA) Research Network (Genomic Data Commons at portal.gdc.cancer.gov/, accession: phs000178.v11.p8). Additional WGS data were obtained for 874 T/N pairs…

Bioconductor – gwascat (development version)

DOI: 10.18129/B9.bioc.gwascat This is the development version of gwascat; for the stable release version, see gwascat. Representing and modeling data in the EMBL-EBI GWAS catalog. Bioconductor version: Development (3.18) Represent and model data in the EMBL-EBI GWAS catalog. Author: VJ Carey <stvjc at channing.harvard.edu> Maintainer: VJ Carey <stvjc at…

Dissecting human population variation in single-cell responses to SARS-CoV-2

Sample collection The individuals of self-reported African (AFB) and European (EUB) descent studied are part of the EVOIMMUNOPOP cohort18. In brief, 390 healthy male donors (188 AFB and 202 EUB) were recruited between 2012 and 2013 in Ghent (Belgium), thus, before the COVID-19 pandemic. Blood was obtained from the healthy…

jannovar download problem

I am trying to convert some HGVS to chrom:pos:ref:alt format. I was thinking of using jannovar. As per the documentation I run: jannovar download -d hg19/refseq which gives me this: Options JannovarDownloadOptions [downloadDir=data, getDataSourceFiles()=[bundle:///default_sources.ini], isReportProgress()=true, getHttpProxy()=null, getHttpsProxy()=null, getFtpProxy()=null, geneIdentifiers=[], outputFile=] Downloading/parsing for data source "hg19/refseq" INFO…

Nextflow files not referenced correctly when using wildcard in a for loop

Hi, I’m having some problems with my nextflow workflow when I use wildcards (*) to call in files. The files are created fine, (using process augment below) but when it is used by process snarls, it calls them as follows: CH-A2504_1.aug.gam -> workdir/2c/ce66a6417872a428111b7c2a5995d4/CH-A2504_01.aug.gam CH-A2504_1.aug.pg -> workdir/2c/ce66a6417872a428111b7c2a5995d4/CH-A2504_01.aug.pg … … CH-A2504_23.aug.gam ->…

Evolutionary histories of breast cancer and related clones

Data reporting No statistical methods were used to determine the sample size. The experiments were not randomized. Pathologists were blinded to the genetic alterations in each sample during histopathological evaluation. Participants and materials We enrolled 207 female patients with breast cancer who underwent surgery at the Kyoto University Hospital and…

How to Detect Presence of HPV-33 in Sample Data (FASTQ, BAM, VCF on GRCh37)

Hello everyone, I have a tissue sample for which I have sequencing data available in several formats – FASTQ, BAM, and VCF. The alignment has been done against the GRCh37 reference genome. I am interested…

Differentially expressed gene analysis in Python with omicverse

An important task in bulk RNA-seq analysis is differential expression analysis, which we can perform with omicverse. For differential expression analysis, ov first changes the gene_id of the matrix to gene_name. When our dataset has a batch effect, we can use the SizeFactors of DESeq2 to normalize it, and use…

A framework for individualized splice-switching oligonucleotide therapy

Patients The WGS and clinical data of 235 patients with A-T were provided by the Global A-T Family Data Platform of ATCP. Our access to the data was approved by the Data Access Committee of ATCP. Selected patients with A-T enrolled at the Manton Center for Orphan Disease Research under…

R could not could not find function “makeTxDbFromGFF” after loading (GenomicFeatures)

I am new to R (R version 4.3.0) and RNAseq. As a tutorial, I am using the paper ‘RNA-Seq workflow: gene-level exploratory analysis and differential…

Now Available! Access to Historical Human Transcript Alignments

Do you need to work with variant data mapped to historical human RefSeq transcript versions? To make it easier to map your data to the current GRCh38 reference genome and MANE transcripts, we’re now providing a collection of RefSeq transcript alignments including both the latest versions in the GCF_000001405.40-RS_2023_03 annotation…

Noncoding variants alter GATA2 expression in rhombomere 4 motor neurons and cause dominant hereditary congenital facial paresis

Tandem duplications and noncoding SNVs at the HCFP1 locus We enrolled families and simplex cases with nonsyndromic congenital facial paresis (CFP, cohort 1 US-based study) and performed genome-wide single-nucleotide polymorphism (SNP) analysis and whole-exome sequencing (WES) in two large dominant pedigrees, family 1 (Fam1) and family 9 (Fam9; Fig. 1a)….

LOC127890413 H3K27ac hESC enhancers GRCh37_chr19:10362245-10362818 and GRCh37_chr19:10362819-10363392 [Homo sapiens (human)] – Gene

RefSeqs maintained independently of Annotated Genomes: These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by…

Find gene regions (START and END) using gene IDs

I have a list of gene IDs. I would like to know if there is a way to find the gene regions (START-END) on the GRCh37 build? TIA

Bioconductor – SNPlocs.Hsapiens.dbSNP.20100427 (development version)

This is the development version of SNPlocs.Hsapiens.dbSNP.20100427; for the stable release version, see SNPlocs.Hsapiens.dbSNP.20100427. SNP locations for Homo sapiens (dbSNP BUILD 131) Bioconductor version: Development (3.5) SNP locations and alleles for Homo sapiens extracted from dbSNP BUILD 131:human_9606 (based on GRCh37). The source data files used for…

A genotype-to-phenotype approach suggests under-reporting of single nucleotide variants in nephrocystin-1 (NPHP1) related disease (UK 100,000 Genomes Project)

Konrad, M. et al. Large homozygous deletions of the 2q13 region are a major cause of juvenile nephronophthisis. Hum. Mol. Genet. 5, 367–371 (1996). Hildebrandt, F. et al. A novel gene encoding an SH3 domain protein is mutated in nephronophthisis type 1. Nat. Genet. 17,…

LOC127888533 H3K27ac-H3K4me1 hESC enhancer GRCh37_chr17:80054421-80055336 [Homo sapiens (human)] – Gene

RefSeqs maintained independently of Annotated Genomes: These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by…

Building Dict File for GATK

I’m going through the instructions page on gatkforums.broadinstitute.org/gatk/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference Specifically, the command I don’t see how to do is: java -jar CreateSequenceDictionary.jar R= Homo_sapiens_assembly18.fasta O= Homo_sapiens_assembly18.dict [Fri Jun 19 14:09:11 EDT 2009] net.sf.picard.sam.CreateSequenceDictionary R= Homo_sapiens_assembly18.fasta O= Homo_sapiens_assembly18.dict [Fri Jun 19 14:09:58 EDT 2009] net.sf.picard.sam.CreateSequenceDictionary done….

LOC127271744 H3K27ac-H3K4me1 hESC enhancer GRCh37_chr1:226270255-226271122 [Homo sapiens (human)] – Gene

RefSeqs maintained independently of Annotated Genomes: These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by…

How to download iGenomes from S3

Hi all, I ran a pipeline on HPC that got an error when pulling iGenomes from S3, so I am trying to download it to my cluster but don’t know how. Would you have a suggestion? Thank you so much. ewels.github.io/AWS-iGenomes/

Continue Reading How to download iGenomes from S3

Error in Adding 1000Genomes Ancestral Allele info: Using VCF tools fill-aa

Hi, I am trying to add ancestral alleles to the 1000 Genomes Phase 3 VCF files. I used the “human_ancestor_GRCh37_e59.tar.bz2” archive as the ancestral allele input. The steps I used are: cat human_ancestor_3.fa | sed 's,^>.*,>1,' | bgzip…

Continue Reading Error in Adding 1000Genomes Ancestral Allele info: Using VCF tools fill-aa

BAMboozle

Hi, I am running BAMboozle to anonymize variant sequences using the GRCh37 human reference genome on my BAM files. My BAM files are originally 2-3 GB, but the output BAM file I get from BAMboozle is 500-600 KB. Does BAMboozle decrease the size of the bam…

Continue Reading BAMboozle

Decoy In Reference Assembly

I am using 1000 Genomes data in my new project. When inspecting the reference assembly they used, I found that it contains a “decoy” contig. The 1000 Genomes FAQ says: For the final round of alignments the sequence data will be mapped…

Continue Reading Decoy In Reference Assembly

Reconstruction of the personal information from human genome reads in gut metagenome sequencing data

Subject participation. The study protocol was approved by the ethics committees of Osaka University and related medical institutions as well as the Translational Health Science and Technology Institute (Faridabad). Japanese individuals (n = 343) for whom gut metagenome shotgun sequencing was performed in previous studies were included in this study [46,47,48]. Among these…

Continue Reading Reconstruction of the personal information from human genome reads in gut metagenome sequencing data

Prevalence of BRCA homopolymeric indels in an ION Torrent-based tumour-to-germline testing workflow in high-grade ovarian carcinoma

Patients cohort Among consecutive patients who underwent BRCA tumour testing through ION Torrent-based sequencing between August 2017 and February 2022, we retrospectively selected 222 high-grade ovarian cancer (HGOC) patients with the following histological subtypes: 203 serous (HGSOC), seven endometrioid, five clear-cell and seven with mixed histotypes. Since NGS BRCA1/2 tumour…

Continue Reading Prevalence of BRCA homopolymeric indels in an ION Torrent-based tumour-to-germline testing workflow in high-grade ovarian carcinoma

Could not get first alignment from target

Can you share some of the image as text for easier understanding? It seems like there might be an issue with your BAM file or the region you are trying to call variants on. To help diagnose the issue, please follow these steps: 1. Check if your BAM file is…

Continue Reading Could not get first alignment from target

Novel intronic mutations of SLC12A3 gene, Gitelman syndrome

Introduction. Gitelman syndrome (GS) is an autosomal recessive disease, characterized by hypokalemic alkalosis, accompanied by hypomagnesaemia, hypocalciuria, low blood pressure, and hypocalcemia, first described by Gitelman in 1966 [1]. It is caused by mutations in the SLC12A3 gene, which is located on the long arm of chromosome 16 (16q13) and encodes the…

Continue Reading Novel intronic mutations of SLC12A3 gene, Gitelman syndrome

Targeting Poly(ADP)ribose polymerase in BCR/ABL1-positive cells

Cells and cell culture KOPN30, BV173, and K562 are BCR/ABL1-positive leukemia cell lines. All leukemia cell lines, as well as Ba/F3 cells, were maintained in RPMI-1640 medium supplemented with 15% fetal bovine serum (FBS) and penicillin–streptomycin (100 U/mL) at 37 °C in an atmosphere containing 5% CO2. KOPN30 cells were obtained…

Continue Reading Targeting Poly(ADP)ribose polymerase in BCR/ABL1-positive cells

Assembly Table.

Assembly Table. A. mellifera (Apr 2011 Amel_4.5/amel5) A. carolinensis (May 2010 AnoCar2.0/anoCar2) A. thaliana (Feb 2011 TAIR10/araTha1) B. taurus (Aug 2006 Btau_3.1/bosTau3) B. taurus (Nov 2014 Bos_taurus_UMD_3.1.1/bosTau8) C. familiaris (May 2005 CanFam2.0/canFam2) C. familiaris (Sep 2011 CanFam3.1/canFam3) C. porcellus (Feb 2008 Cavpor3.0/cavPor3) C. elegans (Oct 2010 WBcel215/ce10) C. elegans (Feb…

Continue Reading Assembly Table.

Parallel sequencing of extrachromosomal circular DNAs and transcriptomes in single cancer cells

scEC&T sequencing. A detailed, step-by-step protocol of scEC&T-seq is available on the Nature Protocol Exchange [46] and is described below. The duration of the protocol is approximately 8 days per 96-well plate. Cell culture. Human tumor cell lines were obtained from ATCC (CHP-212) or were provided by J. J. Molenaar (TR14; Princess…

Continue Reading Parallel sequencing of extrachromosomal circular DNAs and transcriptomes in single cancer cells

Where do I get a large reference VCF?

I would like to download a large .vcf file containing many (hundreds or thousands of) samples. Ideally, I would download different population-specific .vcf files, but the ability to sort/filter by ancestry group is fine. Where do I get such a file?…

Continue Reading Where do I get a large reference VCF?

In vitro erythrocyte production using human-induced pluripotent stem cells: determining the best hematopoietic stem cell sources | Stem Cell Research & Therapy

Materials The materials used for cell cultures and characterization are listed in Additional file 1: Table S1. Cell sources After getting informed consent, PB was drawn from three healthy O, Rh D-positive donors. CB was collected from three healthy newborn babies at the Department of Obstetrics and Gynecology at Severance…

Continue Reading In vitro erythrocyte production using human-induced pluripotent stem cells: determining the best hematopoietic stem cell sources | Stem Cell Research & Therapy

Struggling with protein context of Annovar output

Hi, I’m having some trouble extracting the protein sequences of missense mutations from Annovar output files. I would like to create all of the possible neopeptides arising from missense mutations of TCGA tumor samples. For this I used Annovar to get the…

Continue Reading Struggling with protein context of Annovar output

Issue With CRAM -> BAM -> FASTQ Conversion

Please help! I am trying to obtain FASTQ files from the GDSC; all we have in the lab are CRAM files. Unfortunately, the reference genome does not seem to exist when pulled from an online source. I have attempted to use the…

Continue Reading Issue With CRAM -> BAM -> FASTQ Conversion

STRavinsky STR database and PGTailor PGT tool demonstrate superiority of CHM13-T2T over hg38 and hg19 for STR-based applications

doi: 10.1038/s41431-023-01352-6. Online ahead of print. Affiliations: 1 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel. 2 Genetics Institute, Soroka Medical Center, Beer Sheva, Israel. 3 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty…

Continue Reading STRavinsky STR database and PGTailor PGT tool demonstrate superiority of CHM13-T2T over hg38 and hg19 for STR-based applications

Index of /pub/clinvar

Name                   Last modified      Size
Parent Directory                          -
ClinGen/               2018-12-14 09:17   -
document_archives/     2014-04-24 08:19   -
presentations/         2021-06-23 17:39   -
release_notes/         2023-04-06 10:38   -
submission_examples/   2020-08-03 13:46   -
submission_templates/  2023-02-17 13:23   -
tab_delimited/         2023-04-10 15:02   -
temp/                  2022-12-20 16:01   -
vcf_GRCh37/            2023-04-10 14:53   -
vcf_GRCh38/            2023-04-10 14:53   -
xml/                   2023-04-10 15:02…

Continue Reading Index of /pub/clinvar

segmentation fault error

Hi there, I have a lot of BAM files and I tried counting them using featureCounts. All the files work great, but these few files throw this error. I’m using the same annotation file all the time, so I guess that’s not the problem. I also…

Continue Reading segmentation fault error

illegal reference to local variable array

Hi all, I am using Juicer to analyze Hi-C data. After mapping the paired-end FASTQ files to the genome, I got the SAM file, but the next step, chimeric_sam.awk, reports an error:
(-: Align of /home/jib79/hic/2019-NG/juicer/splits/SRR9822212.fastq.sam done successfully
awk: /home/jib79/hic/2019-NG/juicer/scripts/scripts/common/chimeric_sam.awk: line 50: illegal reference to local variable array
awk: /home/jib79/hic/2019-NG/juicer/scripts/scripts/common/chimeric_sam.awk: line 51: illegal…

Continue Reading illegal reference to local variable array

Can’t call subsampled bam file with GATK Haplotypecaller with --disable-tool-default-read-filters

I want to simulate variant calling on an ultra-low-coverage (>0.005x) BAM file. I subsampled reads from the HG02024 sample of the 1KG phase 3 dataset. My R code to do so is the following (bam and reference are just path extensions, file is the initial BAM file): cov_rate <-…

Continue Reading Can’t call subsampled bam file with GATK Haplotypecaller with --disable-tool-default-read-filters

rs3750846 RefSNP Report – dbSNP

ALFA Allele Frequency. The ALFA project provides aggregate allele frequencies from dbGaP. More information is available on the project page, including descriptions, data access, and terms of use. Release Version: 20201027095038. The Frequency tab displays a table of the reference and alternate allele frequencies reported by various studies and populations. Table lines,…

Continue Reading rs3750846 RefSNP Report – dbSNP

VEP-like tool for sequence ontology and HGVS annotation of VCF files

Mehari is a software package for annotating VCF files with variant effect/consequence. The program uses hgvs-rs for projecting genomic variants to transcripts and proteins and thus has high prediction quality. Other popular tools offering variant effect/consequence prediction include: Mehari offers predictions that aim to mirror VariantValidator, the gold standard for…

Continue Reading VEP-like tool for sequence ontology and HGVS annotation of VCF files

LD correlation matrix reference file, where can I find it?

I am searching for a website where I can download LD correlation (r^2) matrices for (any) European population. My interest is in SNPs (preferably with rsIDs as indices; if not, genomic locations for GRCh37/GRCh38). The data can be divided by…

Continue Reading LD correlation matrix reference file, where can I find it?

Dante Genomics launches Avanti Software for a plug-and-play genomic interpretation that takes minutes instead of hours

NEW YORK, March 20, 2023 /PRNewswire/ -- Dante Genomics, a global leader in genomics and precision medicine, launched today the beta version of Avanti, the Company’s proprietary B2B software for variant interpretation and report writing at scale. Avanti provides clinicians, geneticists and researchers with a plug-and-play web-based…

Continue Reading Dante Genomics launches Avanti Software for a plug-and-play genomic interpretation that takes minutes instead of hours

Obtain number of base pairs in a genome

Hi! This may be a stupid question since I’m not in any way related to bioinformatics: I’m interested in how I can obtain the number of base pairs in my genome sample. I’m trying to reproduce the experiment that was made…

Continue Reading Obtain number of base pairs in a genome
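
One common way to answer the question in the excerpt above, assuming the genome is available as a FASTA file, is simply to tally the sequence characters. The Python sketch below is a minimal illustration under that assumption; the function names are hypothetical, and whether N (unknown) bases should be counted depends on what the original experiment measured.

```python
from collections import Counter
from typing import Iterable


def count_bases(lines: Iterable[str]) -> Counter:
    """Tally every sequence character in FASTA lines, skipping headers."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # blank line or record header
        counts.update(line.upper())  # treat soft-masked bases like the rest
    return counts


def genome_size(counts: Counter, include_n: bool = True) -> int:
    """Total number of bases, optionally excluding unknown (N) positions."""
    total = sum(counts.values())
    return total if include_n else total - counts["N"]


if __name__ == "__main__":
    demo = [">c1", "ACGTNN", "acg", ">c2", "TT"]
    tally = count_bases(demo)
    print(genome_size(tally), genome_size(tally, include_n=False))
```

Passing an open file handle instead of a list keeps memory use flat even for a full human reference.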

The Clinical Diagnostic Utility of Array CGH in Children with Syndromic Microcephaly

Abstract Background: A prospective study using array CGH in children with Syndromic microcephaly from a tertiary pediatric healthcare centre in India. Aim: To identify the copy number variations causative of microcephaly detected through chromosomal array CGH. Patients and Methods: Of the 60 patients, 33 (55%) males and 27 (45%) females…

Continue Reading The Clinical Diagnostic Utility of Array CGH in Children with Syndromic Microcephaly

Bowtie2 which reference is best?

Hello, I am trying to learn Bowtie2. When I compared the overall alignment rates reported by Bowtie2, there was a significant difference between the results for the GRCh37 index and the GRCh38 index. The overall alignment rate to GRCh37 is 98%, but that to GRCh38 is…

Continue Reading Bowtie2 which reference is best?

Automated dbSNP lookup by rsID position, plus genome build liftover

Hola, just passing by to say ‘hi’. Please post bugs / suggestions as comments to this tutorial. rsID to position GRCh38: cat rsids.list rs1296488112 rs1226262848 rs1225501837 rs1484860612 rs1235553513 rs1424506967 cat rsids.list | while read rsid ; do pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS\>.{0,15}' |…

Continue Reading Automated dbSNP lookup by rsID position, plus genome build liftover

IJMS | Free Full-Text | Endothelial Differentiation of CCM1 Knockout iPSCs Triggers the Establishment of a Specific Gene Expression Signature

1. Introduction Cerebral cavernous malformations (CCMs) are capillary–venous lesions which are primarily found in the brain and spinal cord [1]. The familial form of this neurovascular disorder is inherited in an autosomal dominant manner with incomplete penetrance. Pathogenic variants in the CCM1 gene (also known as KRIT1) can be identified…

Continue Reading IJMS | Free Full-Text | Endothelial Differentiation of CCM1 Knockout iPSCs Triggers the Establishment of a Specific Gene Expression Signature

Can’t liftover vcf file from hg19 to hg38

Hello everyone, I have a VCF file that I’m trying to convert from hg19 to hg38. For that I’m using the bcftools +liftover command. I previously tried to use Picard VCF but the memory cost was too much from…

Continue Reading Can’t liftover vcf file from hg19 to hg38

get build 37 positions from dbSNP rsIDs

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select chrom,chromStart,chromEnd,name from snp147 where name in ("rs371194064","rs779258992","rs26","rs25")'
+-------+------------+----------+-------------+
| chrom | chromStart | chromEnd | name        |
+-------+------------+----------+-------------+
| chr7  | 11584141   | 11584142 | rs25        |
| chr7  | 11583470   | 11583471…

Continue Reading get build 37 positions from dbSNP rsIDs
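
A note on reading the query output above: UCSC tables store intervals with a 0-based start and an exclusive end, so for a single-base SNP the familiar 1-based position equals chromEnd (or chromStart + 1). The Python sketch below makes that conversion explicit; the SnpRow type and function name are hypothetical helpers, and the example row is the rs25 row shown in the excerpt.

```python
from typing import NamedTuple, Tuple


class SnpRow(NamedTuple):
    """One row of the UCSC snp147 query: 0-based start, exclusive end."""
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str


def to_one_based(row: SnpRow) -> Tuple[str, int, int]:
    """Convert a 0-based half-open interval to 1-based inclusive coordinates."""
    return row.chrom, row.chrom_start + 1, row.chrom_end


if __name__ == "__main__":
    rs25 = SnpRow("chr7", 11584141, 11584142, "rs25")
    print(to_one_based(rs25))
```

For insertions and deletions the start and end differ by something other than one, so the same conversion yields a range rather than a single position.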

Looking for LDL GWAS summary stats in hg38

Hi All, I think last time I posted on here was nearly 10 years ago (!) I’m looking for a way to get summary statistics for a GWAS on LDL levels, where the statistics are in hg38. I found a study titled “Genome-wide study for circulating metabolites identifies 62 loci…

Continue Reading Looking for LDL GWAS summary stats in hg38

microRNAs not available in TxDb.Hsapiens.UCSC.hg38.knownGene

I was looking at some examples and could retrieve the microRNAs of the hg19 transcriptome, but not from the hg38 transcript annotation. I realized this might be because TxDb.Hsapiens.UCSC.hg38.knownGene doesn’t have a miRBase build ID,…

Continue Reading microRNAs not available in TxDb.Hsapiens.UCSC.hg38.knownGene

CNVKit does not output all the accessible regions in the targets bed file

Hello everybody, I am using CNVkit on my data with hg38 as the reference. The command that I am using is the following: cnvkit.py batch sample.bam -n control.bam -m wgs -f reference.fasta --target-avg-size 1000 --output-dir results/ So,…

Continue Reading CNVKit does not output all the accessible regions in the targets bed file

Bioconductor – SNPlocs.Hsapiens.dbSNP155.GRCh37 (development version)

DOI: 10.18129/B9.bioc.SNPlocs.Hsapiens.dbSNP155.GRCh37. This is the development version of SNPlocs.Hsapiens.dbSNP155.GRCh37; to use it, please install the devel version of Bioconductor. Human SNP locations and alleles extracted from dbSNP Build 155 and placed on the GRCh37/hg19 assembly. Bioconductor version: Development (3.16). The 929,496,192 SNPs in this package were extracted from…

Continue Reading Bioconductor – SNPlocs.Hsapiens.dbSNP155.GRCh37 (development version)

Ensembl ID mapping GRCh37 vs GRCh38

I currently have a large list of Ensembl protein IDs (ENSP) that are from GRCh37. I need to map these IDs to the entry name listed on the UniProt website (e.g. ‘CASPE_HUMAN’). I am having trouble doing this using the UniProt dataset…

Continue Reading Ensembl ID mapping GRCh37 vs GRCh38

How to modify VCF file?

Hi community, I have a question: the SNP position in vcf file is from GRCh37/hg19, I need to change the position to GRCh38. So, I used UCSC liftover to replace the hg19 pos by GRCh38 pos and deleted some SNPs, then sorted the pos and saved to a new vcf…

Continue Reading How to modify VCF file?

Obtain equivalent variant ids (chr-pos-ref-alt) for GRCh37 and GRCh38

Hi all, I want to obtain the GRCh37 equivalent of a GRCh38 variant ID (chr-pos-ref-alt). This is to deal with some variants that are poorly lifted over. For example, see the variant gnomad.broadinstitute.org/variant/10-17838942-A-G?dataset=gnomad_r3, which has two equivalents in GRCh37. I want to…

Continue Reading Obtain equivalent variant ids (chr-pos-ref-alt) for GRCh37 and GRCh38
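
Whatever tool ends up doing the cross-build mapping, it helps to treat the chr-pos-ref-alt identifier from the excerpt above as structured data and not as an opaque string. The Python sketch below only parses and re-formats such IDs; it does not perform any liftover, and the Variant type and function names are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position on the stated genome build
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> Variant:
    """Split a gnomAD-style 'chr-pos-ref-alt' identifier into its fields."""
    chrom, pos, ref, alt = variant_id.split("-")
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]  # accept both '10' and 'chr10'
    return Variant(chrom, int(pos), ref, alt)


def format_variant_id(variant: Variant) -> str:
    """Inverse of parse_variant_id, without a 'chr' prefix."""
    return f"{variant.chrom}-{variant.pos}-{variant.ref}-{variant.alt}"


if __name__ == "__main__":
    v = parse_variant_id("10-17838942-A-G")
    print(v, format_variant_id(v))
```

Keeping the position as an integer makes it straightforward to key a lookup table of lifted-over coordinates by (chrom, pos, ref, alt).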