Tag: hg19

Moderate Mapping percentage

Hi all, I received my sequenced transcriptome and genomic data from my service provider and started working with it. Both the DNA and RNA data passed quality metrics after trimming, but the mapping percentage comes out to 90% for the DNA data with bowtie and 85% for the RNA data with HISAT2. I…

The Evolution from HG19 to HG38

Welcome to another blog post! Reference genomes are essential benchmarks of a species’ genome that facilitate the accurate comparison of individual genomes and are crucial tools for identifying genetic variants and diagnosing rare diseases. Here, we will explore the evolution of the human reference genome, focusing on the transition…

A Benchmark of Genetic Variant Calling Pipelines Using Metagenomic Short-Read Sequencing

Introduction Short-read metagenomic sequencing is the technique most widely used to explore the natural habitat of millions of bacteria. In comparison with 16S rRNA sequencing, shotgun metagenomic sequencing (MGS) provides sequence information of the whole genomes, which can be used to identify different genes present in an individual bacterium and…

Analysis of sepsis combined with pulmonary infection by mNGS

Introduction Sepsis is one of the major diseases that poses a serious threat to human health, and its incidence and in-hospital mortality rates remain high despite the continuous updating of sepsis guidelines.1 Its main clinical manifestations are elevated body temperature, chills, and rapid heart rate, and it is most common…

Randomized phase II study of preoperative afatinib in untreated head and neck cancers: predictive and pharmacodynamic biomarkers of activity

Study objectives and endpoints The main objective consisted in identifying predictive biomarkers of efficacy by exploring correlation between baseline potential biomarkers and radiological and metabolic responses to afatinib. Secondary objectives were to identify potential pharmacodynamic biomarkers, to evaluate the efficacy and safety of afatinib and to assess the metabolic and…

Genomic hypomethylation in cell-free DNA predicts responses to checkpoint blockade in lung and breast cancer

Lung cancer ICB cohort Advanced non-small cell lung carcinoma patients who were treated with anti-PD-1/PD-L1 monotherapy at Samsung Medical Center, Seoul, Republic of Korea were enrolled for this study. The present study has been reviewed and approved by the Institutional Review Board (IRB) of the Samsung Medical Center (IRB no….

ftbfs and autopkgtest regression with htslib 1.19

Source: cyvcf2; Version: 0.30.22-1; Severity: important; Tags: ftbfs upstream. With the introduction of htslib 1.19 in experimental, cyvcf2 is experiencing test failures at package build time and autopkgtest time. The relevant part of the error looks like: cyvcf2/tests/test_reader.py …………………Fatal Python error: Aborted Current thread 0x00007fa7874de040 (most recent call first): File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.11_cyvcf2/build/cyvcf2/tests/test_reader.py", line 285…

Convert bed file from hg19 to GRCH38

Hello everyone! I have a list of over 500,000 rs IDs and I would like to obtain their coordinates (as a BED file) on the GRCh38 reference genome. I am using the UCSC Table Browser tool, but unfortunately it does not find 90,000 of them, and since…

Archaic Introgression Shaped Human Circadian Traits | Genome Biology and Evolution

Abstract When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely, Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultraviolet radiation and increased variation in…

Beyond the exome: utility of long-read whole genome sequencing in exome-negative autosomal recessive diseases | Genome Medicine

Our cohort comprises 34 families in which a presumably autosomal recessive disease defied molecular diagnosis by clinical exome sequencing (short-read sequencing-based) and reanalysis performed on the index individual for each family (Fig. 1). The index patient in each family was subjected to an average of 10 × depth lrWGS except for Family F8602…

Single-cell analysis of chromatin accessibility in the adult mouse brain

Tissue preparation and nucleus isolation All experimental procedures using live animals were approved by the SALK Institute Animal Care and Use Committee under protocol number 18-00006. Adult C57BL/6J male mice were purchased from Jackson Laboratories. Brains were extracted from 56–63-day-old mice and sectioned into 600 µm coronal sections along the anterior–posterior…

Bioactive glycans in a microbiome-directed food for children with malnutrition

Collection and handling of biospecimens obtained from participants in the randomized controlled clinical study of the efficacy of MDCF-2 The human study entitled ‘Community-based clinical trial with microbiota-directed complementary foods (MDCFs) made of locally available food ingredients for the management of children with primary moderate acute malnutrition (MAM)’ was approved…

Methylation Analysis Tutorial in R_part1

The code and approaches that I share here are those I am using to analyze TCGA methylation data. At the bottom of the page, you can find references used to make this tutorial. If you are coming from a computer background, please bear with a geneticist who tried to code…

DNA polymerases in precise and predictable CRISPR/Cas9-mediated chromosomal rearrangements | BMC Biology

Cell culture The human endometrial carcinoma HEC-1-B cells were cultured in the modified Eagle’s medium (MEM) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin at 37°C in a 5% (v/v) CO2 incubator. The human embryonic kidney HEK293T cells were cultured in the Dulbecco’s modified Eagle’s medium (DMEM) supplemented…

Metagenomic next-gene sequencing for respiratory infections

Introduction Respiratory tract infections are common and occur frequently. Rapid and accurate microbial detection is essential for timely and appropriate treatment. Traditional microbial detection methods have some limitations such as dependence on morphology, long duration, low sensitivity, and high variability.1,2 Metagenomic next-generation sequencing (mNGS) is a new detection technology characterized…

sam – Discrepancy in Read Counts Between FastQ and BAM Files in Adapter-Trimmed Pipeline

In a FastQ to BAM pipeline where only adapter trimming is performed, I’ve noticed a potential discrepancy in read counts between the initial FastQ files and their resulting BAM file. Specifically, I’m seeking clarification on whether the following statement holds true: “Total number of reads in R1 and R2 FastQ…
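One common reason the two counts can differ is that a BAM file may hold more than one record per read: aligners can emit secondary (flag bit 0x100) and supplementary (flag bit 0x800) alignments in addition to the primary one. The following is a minimal Python sketch of that comparison, not the poster's pipeline. The function names are hypothetical, and it assumes the FASTQ is the plain 4-line form and that the SAM flag values have already been extracted (for example, from column 2 of SAM text).

```python
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def count_fastq_records(lines):
    """Count reads in 4-line FASTQ text (assumes sequences are not wrapped)."""
    n = sum(1 for line in lines if line.strip())
    if n % 4 != 0:
        raise ValueError("FASTQ line count is not a multiple of 4")
    return n // 4


def count_primary_records(flags):
    """Count SAM/BAM records that are neither secondary nor supplementary."""
    return sum(1 for f in flags if not f & (SECONDARY | SUPPLEMENTARY))


if __name__ == "__main__":
    fastq = ["@r1", "ACGT", "+", "IIII", "@r2", "TTGA", "+", "IIII"]
    # Two mapped mates, one secondary, one supplementary, two unmapped mates.
    flags = [99, 147, 355, 2113, 77, 141]
    print(count_fastq_records(fastq), count_primary_records(flags))
```

Comparing the FASTQ read total against the primary-record count, rather than against the raw number of BAM records, is usually the fairer check when no reads are filtered out.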

Human hg38 chr6:31,165,200-31,165,800 UCSC Genome Browser v457

Custom Tracks: ac4C-RIP-seq peaks, hESC CTL-1; ac4C-RIP-seq peaks, hESC CTL-2; ac4C-RIP-seq peaks, hESC NAT10-KD-1; ac4C-RIP-seq peaks, hESC NAT10-KD-2. Mapping and Sequencing: Base Position; p14 Fix Patches; p14 Alt Haplotypes; Assembly; Centromeres; Chromosome Band; Clone Ends; Exome Probesets; FISH Clones; Gap; GC Percent; GRC Contigs; GRC Incident; Hg19…

ASEReadCounter output wrong number of coverage

Hi, I am using ASEReadCounter to count the number of reads per variant in a BAM file. For some positions, it reports 1 read of coverage (1 refCount or 1 altCount), although no read covers those positions when I check them in…

Java error while running HiCDC overview code

Hey folks! I am trying to run the code provided in the HiCDC overview (github.com/mervesa/HiCDCPlus#diff_int). Everything runs as expected (as shown in the Github code) but at the moment of trying to save the output of HiCDCPlus_parallel() into a .hic file with hicdc2hic() I run into an error I am…

Identification of constrained sequence elements across 239 primate genomes

De novo assembly and repeat-masking To maximize the species diversity of primates in our analyses, we newly sequenced and assembled the genomes of 187 different primate species, initially presented in refs. 11,23, for which no other reference genome assembly was available. In brief, each individual was sequenced with 150 bp paired…

East Asian-specific and cross-ancestry genome-wide meta-analyses provide mechanistic insights into peptic ulcer disease

We conducted a three-stage genome-wide analysis of PUD and its subtypes. An overview of the workflow is provided in Fig. 1 and Supplementary Fig. 1. PUD cases in the east Asian populations were obtained by combining individuals with any of the two major PUD subtypes (DU and GU), which were…

DNA methylation change in blood cells of FB and CFS patients

Introduction Fibromyalgia (FM) and Chronic Fatigue Syndrome (CFS) are characterized by chronic pain, fatigue, and weakness. Patients with these symptoms also suffer from sleep abnormalities and report affected cognitive processes such as memory. The diagnosis of these two syndromes is challenging and is based on questionnaires that make the diagnosis…

missing region in the process of annotation

Hi. I am analyzing TCGA methylation data from TCGAbiolinks and ran into a problem during annotation with annotatr. The TCGA data cover a gene on chromosome 19, but the annotated result does not contain one region on chromosome 19. I…

H101 for cervical cancer | DDDT

Introduction Patients with persistent, recurrent, or metastatic (P/R/M) cervical carcinoma respond poorly to treatment despite the best available therapeutic regimens, with a 5-year survival of 17%.1 Most of them are heavily pretreated with chemotherapy and/or radiotherapy, and many patients experience complications related to treatment or advanced disease, which exclude them…

Evaluating 17 methods incorporating biological function with GWAS summary statistics to accelerate discovery demonstrates a tradeoff between high sensitivity and high positive predictive value

Method selection We reviewed the published literature through February 2020 to identify methods that met the following criteria: i. Descriptively categorized as (a) annotation-based; (b) pleiotropy-based; or (c) eQTL-based. ii. Utilized GWAS summary statistics, as opposed to individual-level genotype data. iii. Implemented using freely-available software or packages. iv. Provided either…

selection of reference genome

Hello everyone, I have a VCF file with variants called using hg38 as the reference genome. I wonder what would happen if I used hg19 as the reference genome to annotate these variants. Would that be OK, or would the results be wrong? Thanks!
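Because coordinates differ between the two assemblies, annotation resources generally need to be on the same build as the calls. One quick sanity check is to look at the chr1 length declared in the VCF header. Below is a minimal Python sketch of that idea, assuming the header carries `##contig` lines with a `length` field; `guess_assembly` is a hypothetical helper, and headers written by some tools may carry lengths that match neither build, in which case it simply reports "unknown".

```python
import re

# Published chr1 lengths for the two human assemblies.
CHR1_LENGTHS = {
    249250621: "hg19/GRCh37",
    248956422: "hg38/GRCh38",
}

# Matches the chr1 contig line in either naming style ("1" or "chr1").
CONTIG_RE = re.compile(r"^##contig=<ID=(?:chr)?1,.*?length=(\d+)")


def guess_assembly(header_lines):
    """Return a build label based on the declared chr1 length, if any."""
    for line in header_lines:
        match = CONTIG_RE.match(line)
        if match:
            return CHR1_LENGTHS.get(int(match.group(1)), "unknown")
    return "unknown"


if __name__ == "__main__":
    header = [
        "##fileformat=VCFv4.2",
        "##contig=<ID=chr1,length=248956422>",
    ]
    print(guess_assembly(header))
```

This only identifies the build; moving variants between builds still requires a proper liftover step.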

BBMap : NH:i:1 and XT:A:R

Hi, I aligned some paired-end reads to hg19 and found something strange (at least to me). Some reads have NH:i:1, i.e. the read aligned at one position, together with XT:A:R, a flag indicating that the second read is repetitive. I don’t understand how…

How to overlap patient VCF with ClinVar database annotation using bedtools?

Hello, I’m trying to help a colleague add the clinical significance column from the ClinVar database to the VCF samples that she analysed. More specifically, we are trying to add overlapping/common variant annotation so that if the variant…

How to obtain data on the coordinates of the Exon region from UCSC

UCSC is accessible via MySQL (genome.ucsc.edu/goldenPath/help/mysql.html) and I’ve always found it very useful to browse tables this way to see exactly what is contained in a specific table within a specific database. I use Sequel Pro…

Application of CNV-seq technology | IJWH

Introduction Ultrasound soft markers refer to small nonspecific variations in foetal structure found in prenatal ultrasound that are often associated with abnormal chromosome number or pathogenic copy number variations (CNVs).1,2 Common ultrasound soft markers include nuchal translucency (NT) thickness, nuchal fold (NF) thickness, nasal bone dysplasia, choroid plexus cyst, intracardiac…

Allelic hierarchy for USH2A influences auditory and visual phenotypes in South Korean patients

Genotypes of USH2A-related disorders We identified 14 biallelic variants in USH2A, either homozygous or compound heterozygous, in a trans configuration. The segregation of these variants was confirmed by Sanger sequencing. Overall, 18 mutant alleles were implicated in the diagnosis of USH2A-related phenotypes, including c.251G>A:p.Cys84Tyr, c.2209C>T:p.Arg737*, c.2802T>G:p.Cys934Trp, c.4372C>T:p.Arg1578Cys, c.4858C>T:p.Gln1620*, c.7120+1475A>G, c.8232G>C:p.Trp2744Cys,…

LncRNA INHEG promotes glioma stem cell maintenance and tumorigenicity through regulating rRNA 2’-O-methylation

Ethics statement All mice procedures in this study were performed under an animal protocol approved by the Institutional Animal Care and Use Committee guidelines of Westlake University. The procedures and protocols for glioma patients were approved by the institutional review board of Beijing Tiantan Hospital. Informed consent was obtained from…

Primate-specific ZNF808 is essential for pancreatic development in humans

Subjects The study was conducted in accordance with the Declaration of Helsinki and all subjects or their parents/guardian gave informed written consent for genetic testing. DNA testing and storage in the Beta Cell Research Bank was approved by the Wales Research Ethics Committee 5 Bangor (REC 17/WA/0327, IRAS project ID…

How to change "CompressedGRangesList" to "GRangesList"

Hi, I am trying to run an A/B compartment analysis with minfi, but I got the following error: `Error in { : task 1 failed - "is(object, "SummarizedExperiment") is not TRUE"`. Since I want to use data with hg38 annotation but the `makeGenomicRatioSetFromMatrix` function has only `ilmn12.hg19`, I ran `makeGenomicRatioSetFromMatrix` with…

A Cre-dependent massively parallel reporter assay allows for cell-type specific assessment of the functional effects of non-coding elements in vivo

Animal models All procedures involving animals were approved by the Institutional Animal Care and Use Committee (IACUC) at Washington University in St. Louis, MO. Veterinary care and housing was provided by the veterinarians and veterinary technicians of Washington University School of Medicine under Dougherty lab’s approved IACUC protocol. All protocols…

Understanding GISTIC 2.0

Hello, I ran my analysis on GISTIC_2.0 version 6.15.30 on Gene Pattern. I have 58 samples of whole genome sequencing data with the seg copy number file. I prepared the input for gistic as stated in the literature and the forum: a txt file with 6 columns: sample, chromosome, start,…

Bioconductor – GenomicRanges

    This package is for version 2.14 of Bioconductor; for the stable, up-to-date release version, see GenomicRanges. Representation and manipulation of genomic intervals Bioconductor version: 2.14 The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyze high-throughput sequencing…

Bioconductor – BSgenome.Hsapiens.UCSC.hg19.masked

    This package is for version 3.3 of Bioconductor; for the stable, up-to-date release version, see BSgenome.Hsapiens.UCSC.hg19.masked. Full masked genome sequences for Homo sapiens (UCSC version hg19) Bioconductor version: 3.3 Full genome sequences for Homo sapiens (Human) as provided by UCSC (hg19, Feb. 2009) and stored in Biostrings objects….

Whole genome sequencing in high-grade cervical intraepitheli… : Medicine

1. Introduction Cervical cancer (CC) is the third most common cancer in women worldwide and has a high mortality rate among women. In 2008, CC was responsible for 275,000 deaths, thereby being the fourth leading cause of cancer death in females worldwide.[1,2] In China, CC is the second most…

Bioconductor – rtracklayer

DOI: 10.18129/B9.bioc.rtracklayer     R interface to genome annotation files and the UCSC genome browser Bioconductor version: Release (3.6) Extensible framework for interacting with multiple genome browsers (currently UCSC built-in) and manipulating annotation tracks in various formats (currently GFF, BED, bedGraph, BED15, WIG, BigWig and 2bit built-in). The user may…

Biological and genetic characterization of a newly established human external auditory canal carcinoma cell line, SCEACono2

Ethic statement The Clinical Research Ethics Review Committee of Kyushu University Hospital approved the study (permit no. 29-43, 30-268, and 700-00). Written informed consent for the current research project was obtained before the tumor tissue, and a blood sample were harvested. This study was also conducted according to the principles…

Invalid indirect expansion error on Slurm

In attempting to run juicer.sh on a Slurm cluster, I am met with an “indirect expansion” error. Is there any quick fix? Here are the steps taken: (1) downloaded the source code of the latest stable release, v.1.6; (2) added a command within the master script activating a conda environment with the…

Bioconductor – AnnotationHub

DOI: 10.18129/B9.bioc.AnnotationHub     Client to access AnnotationHub resources Bioconductor version: Release (3.6) This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be…

Clonal Hematopoiesis and Cardiovascular Disease in Patients With Multiple Myeloma Undergoing Hematopoietic Cell Transplant | Cardiology | JAMA Cardiology

Key Points. Question: Is clonal hematopoiesis of indeterminate potential (CHIP) detected at the time of hematopoietic stem transplant (HCT) associated with increased rates of cardiovascular disease (CVD) among patients with multiple myeloma (MM) following HCT? Finding: In this cohort study of patients with MM undergoing HCT, CHIP was highly prevalent…

About the item name of UCSC GWAS catalog

hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gwasCatalog.sql
`bin` smallint(5) unsigned NOT NULL,
`chrom` varchar(255) NOT NULL,
`chromStart` int(10) unsigned NOT NULL,
`chromEnd` int(10) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`pubMedID` int(10) unsigned NOT NULL,
`author` varchar(255) NOT NULL,
`pubDate` varchar(255) NOT NULL,
`journal` varchar(255) NOT NULL,
`title` varchar(1024) NOT NULL,
`trait` varchar(255) NOT NULL,
`initSample`…

Structural Variants in gnomAD v4

Today, we are thrilled to announce the release of genome-wide structural variants (SVs) for 63,046 unrelated samples with genome sequencing (GS) data. All site-level information for 1,199,117 high-quality SVs discovered in these samples is browsable in the gnomAD browser (gnomAD SV v4) and downloadable from the gnomAD downloads page. For…

Single-nucleus DNA sequencing reveals hidden somatic loss-of-heterozygosity in Cerebral Cavernous Malformations

Ethical statement Our research complies with all relevant ethical regulations, including the Declaration of Helsinki and has been approved by the Institutional Review Boards of University of Chicago, Duke University and the Alliance to Cure Cavernous Malformations. Cerebral cavernous malformation lesions All human CCM tissue specimens have been previously reported18,19…

Error in Gviz (actually, rtracklayer)

When I run the code `iTrack <- IdeogramTrack(genome = "hg19", chromosome = "chr2", name = "")`, I get the error `Error: failed to load external entity "http://genome.ucsc.edu/FAQ/FAQreleases"`. Did someone else encounter this…

Hey guys, I’m having a problem when using GATK4 BQSR. This dbSNP VCF file has chromosomes notated as 1, 2, … but my reference contigs are chr1, chr2, …: incompatibility in contigs

anilkumar@ak-omen-laptop:~/NGStools/gatk-4.4.0.0$ gatk --java-options "-DGATK_STACKTRACE_ON_USER_EXCEPTION=true" BaseRecalibrator -I "/media/anilkumar/My Passport/CRC/fastq/C_4_mkdp.bam" -R "/media/anilkumar/My Passport/CRC/fastq/hg19.fa" --known-sites "/media/anilkumar/My Passport/CRC/fastq/dbsnp_138.b37.vcf" --known-sites "/media/anilkumar/My Passport/CRC/fastq/Mills_and_1000G_gold_standard.indels.b37.vcf" --known-sites "/media/anilkumar/My Passport/CRC/fastq/1000G_phase1.indels.b37.vcf" -O "/media/anilkumar/My Passport/CRC/fastq/C_4_bqsr.table"
Using GATK jar /home/anilkumar/NGStools/gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar
Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -DGATK_STACKTRACE_ON_USER_EXCEPTION=true -jar /home/anilkumar/NGStools/gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar BaseRecalibrator -I /media/anilkumar/My Passport/CRC/fastq/C_4_mkdp.bam -R /media/anilkumar/My Passport/CRC/fastq/hg19.fa --known-sites /media/anilkumar/My Passport/CRC/fastq/dbsnp_138.b37.vcf --known-sites /media/anilkumar/My Passport/CRC/fastq/Mills_and_1000G_gold_standard.indels.b37.vcf --known-sites…
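The mismatch in this command is between a UCSC-style reference (hg19.fa, contigs named chr1, chr2, …) and b37-style known-sites files (contigs named 1, 2, …). As an illustration of the naming difference only, here is a minimal Python sketch that rewrites contig names on VCF lines. The helper names are hypothetical, and it deliberately renames only the main chromosomes. Renaming does not make the two builds identical (for instance, the mitochondrial sequence in UCSC hg19 differs from the one in b37), so using resource files built for the same reference is usually the safer route.

```python
import re

# Main chromosomes only; other contigs (e.g. GL000191.1) have no simple
# prefix-based UCSC equivalent, so this sketch leaves them untouched.
MAIN_CHROMS = {str(i) for i in range(1, 23)} | {"X", "Y"}


def to_ucsc_contig(name):
    """Map a b37-style contig name to UCSC style (1 -> chr1, MT -> chrM)."""
    if name in MAIN_CHROMS:
        return "chr" + name
    if name == "MT":
        return "chrM"
    return name


def rename_vcf_line(line):
    """Rename the contig in a ##contig header line or in a data line."""
    if line.startswith("##contig=<ID="):
        return re.sub(
            r"^(##contig=<ID=)([^,>]+)",
            lambda m: m.group(1) + to_ucsc_contig(m.group(2)),
            line,
        )
    if line.startswith("#"):
        return line  # other header lines carry no contig name to rewrite
    chrom, sep, rest = line.partition("\t")
    return to_ucsc_contig(chrom) + sep + rest


if __name__ == "__main__":
    print(rename_vcf_line("##contig=<ID=MT,length=16569>"))
    print(rename_vcf_line("1\t10177\t.\tA\tAC"))
```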

Bioconductor – regioneR (development version)

DOI: 10.18129/B9.bioc.regioneR   This is the development version of regioneR; for the stable release version, see regioneR. Association analysis of genomic regions based on permutation tests Bioconductor version: Development (3.19) regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other…

Comparative Analysis of Structural Variant Callers on Short-Read Whole-Genome Sequencing Data

Pang, A.W., MacDonald, J.R., Pinto, D., et al., Towards a comprehensive structural variation map of an individual human genome, Genome Biol., 2010, vol. 11, no. 5, p. R52. doi.org/10.1186/gb-2010-11-5-r52 The International HapMap Consortium, The international HapMap project, Nature, 2003, pp. 789–796. doi.org/10.1038/nature02168 Sudmant,…

coverage of dnase-seq narrow peak file of genome

First, generate intervals for hg19 (perhaps stripping out non-nuclear and mitochondrial chromosomes):
$ fetchChromSizes hg19 | awk -v FS="\t" -v OFS="\t" '{ print $1, "0", $2; }' | grep -v "[_*_|MT]" | sort-bed - > hg19.nuc.bed
To calculate coverage:
$ bedmap --skip-unmapped --delim '\t' --echo --bases-uniq --echo-ref-size --bases-uniq-f hg19.nuc.bed <(sort-bed…
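Conceptually, the coverage step reports, for each chromosome-wide interval, how many distinct bases are covered by peaks and what fraction of the chromosome that represents. The following minimal Python sketch shows the same idea without BEDOPS: it is not the bedmap implementation, the function names are hypothetical, and it assumes half-open BED-style coordinates that lie within the chromosome bounds.

```python
from collections import defaultdict


def unique_bases(intervals):
    """Sum bases covered at least once by half-open (start, end) intervals."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the current merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered


def coverage_fraction(peaks, chrom_sizes):
    """Return {chrom: fraction of the chromosome covered by any peak}."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    return {
        chrom: unique_bases(by_chrom.get(chrom, [])) / size
        for chrom, size in chrom_sizes.items()
    }


if __name__ == "__main__":
    peaks = [("chr1", 0, 10), ("chr1", 5, 20), ("chr1", 30, 40), ("chr2", 0, 5)]
    sizes = {"chr1": 100, "chr2": 50}
    print(coverage_fraction(peaks, sizes))
```

Merging before summing is what keeps overlapping peaks from being counted twice.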

Bioconductor – regionalpcs (development version)

DOI: 10.18129/B9.bioc.regionalpcs   This is the development version of regionalpcs; for the stable release version, see regionalpcs. Summarizing Regional Methylation with Regional Principal Components Analysis Bioconductor version: Development (3.19) Functions to summarize DNA methylation data using regional principal components. Regional principal components are computed using principal components analysis within genomic…

No valid chromosomes found! on Michigan Imputation Server

I have two VCF files – one hg19 and one hg38, analysing data from the same participants on two slightly different SNP platforms. Both files have been through the pre-imputation checks. The header (and the first line) of the hg38 version looks like:
##fileformat=VCFv4.3
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=20230906
##source=PLINKv2.00
##contig=<ID=chr1,length=248917420>…

plotting gene structure in R

I have a list of amplified genes with genomic intervals and would like to visualise the specific part of each gene that is affected. Is there a way to plot a gene and highlight the region of interest? I was thinking of a plot with introns…

Pre-imputation checks using 1000G data (hg19) for a hg38 VCF

I’m trying to use the pre-imputation checks here www.well.ox.ac.uk/~wrayner/tools/ to check a VCF (on the hg38 assembly) against the 1000G phase 3 v5 data, which is hg19, before imputing using the MIS. Obviously, very few of the variants in…

GoM DE: interpreting structure in sequence count data with differential expression analysis allowing for grades of membership | Genome Biology

Models for single-cell ATAC-seq data In single-cell ATAC-seq data, \(x_{ij}\) is the number of unique reads mapping to peak or region j in cell i. Although \(x_{ij}\) can take non-negative integer values, it is common to “binarize” the accessibility data (e.g., [19, 74, 133,134,135]), meaning that \(x_{ij} = 1\) when…

Chromatin compartmentalization regulates the response to DNA damage

Cell culture and treatments DIvA (AsiSI-ER-U2OS)19, AID-DIvA (AID-AsiSI-ER-U2OS)23 and 53BP1-GFP DIvA20 cells were developed in U2OS (ATCC HTB-96) cells and were previously described. Authentication of the U2OS cell line was performed by the provider ATCC, which uses morphology and short tandem repeat profiling to confirm the identity of human cell…

Inactive S. aureus Cas9 downregulates alpha-synuclein and reduces mtDNA damage and oxidative stress levels in human stem cell model of Parkinson’s disease

Cloning of CRISPR/sgRNA lentiviral constructs with fluorescent selection markers A tetracycline-inducible promoter (TRE3G) was used to control the expression of S. aureus dCas9 in a lentiviral vector. To facilitate selection of cells by FACS, pHR:TRE3G-SadCas9-2xKRAB-p2a-tdTomato (Addgene ID #209298) was subcloned from a pHR:TRE3G-SadCas9-2xKRAB-p2a-zeo (A gift from Professor Stanley Qi), where zeocin…

R: Getting browser views

R: Getting browser views browserView-methods {rtracklayer} R Documentation Getting browser views Description Methods for creating and getting browser views. Usage browserView(object, range, track, …) Arguments object The object from which to get the views. range The GRanges or RangesList to display. If there are multiple elements, a view is created…

map Ensembl gene ID from hg19 to hg38

Hello! I would like to convert Ensembl gene IDs from hg19 to hg38 with R. I tried with this code: ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl", host = "grch37.ensembl.org") ensembl_ids <- c("ENSG00000183878", "ENSG00000146083") converted_ids <- getLDS(attributes = c("ensembl_gene_id"), filters = "ensembl_gene_id", values…

Continue Reading map Ensembl gene ID from hg19 to hg38

public databases – Converting VCF format to text for use with PLINK and understanding column mapping

I successfully completed Nature PRS tutorial, which is based on PLINK. Turning to my real data, I downloaded ukb-d-20544_1.vcf.gz. Now I’m facing the problem that I seem to be unable to use it in PLINK or find the correct data format to download at all, and I am a bit…

Continue Reading public databases – Converting VCF format to text for use with PLINK and understanding column mapping

Ultra-fast deep-learned CNS tumour classification during surgery

Data simulation Short nanopore sequencing runs yield sparse and random coverage of the genome. To enable model training, we generate simulated sparse nanopore runs based on microarray data. To this end, N simulated reads are randomly sampled from the read length distribution (D) and assigned a start mapping position in…

Continue Reading Ultra-fast deep-learned CNS tumour classification during surgery
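
The sampling step described in the excerpt above can be sketched roughly as below. This is only an illustration under stated assumptions: the excerpt is truncated before it says where start positions are assigned, so uniform placement along a genome of a given length is assumed, the read length distribution D is represented by a plain list of observed lengths, and the names `simulate_reads`, `read_lengths` and `genome_length` are hypothetical.

```python
import random


def simulate_reads(n_reads, read_lengths, genome_length, seed=0):
    """Draw n_reads (start, end) intervals.

    Each read length is sampled from the empirical list `read_lengths`
    (standing in for the distribution D), and each start position is
    drawn uniformly so the read fits inside [0, genome_length).
    Uniform placement is an assumption, not taken from the source.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        # Cap the length so a read can never exceed the genome.
        length = min(rng.choice(read_lengths), genome_length)
        start = rng.randrange(0, genome_length - length + 1)
        reads.append((start, start + length))
    return reads


# Five simulated reads on a toy 10 kb "genome"; all values are made up.
reads = simulate_reads(5, [100, 200, 300], 10_000, seed=1)
```

Fixing the seed makes a simulated run reproducible, which helps when comparing models trained on the same synthetic data.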

How to obtain data on the specific location of the segmental duplication.

I know it can be viewed from the Repeats "Segmental Dups" track at the bottom of the UCSC browser, but I would like to view it in IGV, not UCSC. So I looked for the golden path…

Continue Reading How to obtain data on the specific location of the segmental duplication.

Distribution tendencies of pathogens causing LRTI

Introduction Lower respiratory tract infection (LRTI) remains one of the leading causes of death worldwide.1 Several well-known pathogens, including Streptococcus pneumoniae, Pseudomonas aeruginosa, Klebsiella pneumoniae, Candida, Herpesvirus, and others, have been identified as significant causes of infection.2 Nonetheless, nearly half of the cases still have an undetermined etiology,3,4 despite the…

Continue Reading Distribution tendencies of pathogens causing LRTI

Bioconductor – GreyListChIP

DOI: 10.18129/B9.bioc.GreyListChIP     This package is for version 3.11 of Bioconductor; for the stable, up-to-date release version, see GreyListChIP. Grey Lists — Mask Artefact Regions Based on ChIP Inputs Bioconductor version: 3.11 Identify regions of ChIP experiments with high signal in the input, that lead to spurious peaks during…

Continue Reading Bioconductor – GreyListChIP

AlphaMissense Plugin VEP

I’ve installed the AlphaMissense plugin in VEP, but I can’t use it. I downloaded the required files and ran the tabix command before using it. Then I launched the command, but I got this error: WARNING: Failed to instantiate plugin AlphaMissense: ERROR: No file specified Try using…

Continue Reading AlphaMissense Plugin VEP

Progress and challenges in completing the human gene catalogue

In a recent review published in Nature, a group of authors reviewed the progress and challenges in annotating the human genome, including protein-coding genes, isoforms, and non-coding ribonucleic acids (RNAs), and advocated for a universal annotation standard for clinical use. Study: The status of the human gene catalogue. Image Credit:…

Continue Reading Progress and challenges in completing the human gene catalogue

How to choose LiftOver chain file

I am trying to lift over an hg38 whole-genome-sequenced VCF to hg19, and I am planning to use GATK Picard for this. However, I am not sure which liftover chain file to use from this path: …

Continue Reading How to choose LiftOver chain file
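
For the question above, it may help to note that UCSC names chain files by direction, source assembly first and target assembly second with its first letter capitalized. The small sketch below builds such a name. The helper `chain_file_name` is hypothetical, and whether the resulting file is among those at the path the poster refers to cannot be confirmed from the excerpt.

```python
def chain_file_name(source, target):
    """Build a UCSC-style chain file name for source -> target.

    Follows the naming pattern <source>To<Target>.over.chain.gz,
    where the first letter of the target assembly is upper-cased.
    """
    return f"{source}To{target[0].upper()}{target[1:]}.over.chain.gz"


# Lifting hg38 coordinates to hg19 uses the hg38 -> hg19 direction.
print(chain_file_name("hg38", "hg19"))  # hg38ToHg19.over.chain.gz
```

The direction matters: a chain file for hg19 to hg38 cannot be used to go from hg38 to hg19.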

Bioconductor – RSVSim

DOI: 10.18129/B9.bioc.RSVSim     RSVSim: an R/Bioconductor package for the simulation of structural variations Bioconductor version: Release (3.11) RSVSim is a package for the simulation of deletions, insertions, inversion, tandem-duplications and translocations of various sizes in any genome available as FASTA-file or BSgenome data package. SV breakpoints can be placed…

Continue Reading Bioconductor – RSVSim

Bioconductor – SNPlocs.Hsapiens.dbSNP142.GRCh37

DOI: 10.18129/B9.bioc.SNPlocs.Hsapiens.dbSNP142.GRCh37     This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see SNPlocs.Hsapiens.dbSNP142.GRCh37. SNP locations for Homo sapiens (dbSNP Build 142) Bioconductor version: 3.13 SNP locations and alleles for Homo sapiens extracted from NCBI dbSNP Build 142. The source data files used for…

Continue Reading Bioconductor – SNPlocs.Hsapiens.dbSNP142.GRCh37

Multitissue H3K27ac profiling of GTEx samples links epigenomic variation to disease

Samples for H3K27ac ChIP–seq Samples were collected by the GTEx Consortium. The donor enrollment and consent, informed consent approval, histopathological review procedures, and biospecimen procurement methods and fixation were the same as previously described22. No compensation was provided to the families of participants. Massachusetts Institute of Technology Committee on the…

Continue Reading Multitissue H3K27ac profiling of GTEx samples links epigenomic variation to disease

Troubles launch IGV on Linux(Debian)

I am trying to run IGV on Debian. I have followed these steps: wget data.broadinstitute.org/igv/projects/downloads/2.16/IGV_Linux_2.16.2_WithJava.zip unzip IGV_Linux_2.16.2_WithJava.zip My@machine:~/software/IGV_Linux_2.16.2$ ./igv.sh And this is the output I got: WARNING: package com.sun.java.swing.plaf.windows not in java.desktop WARNING: package sun.awt.windows not in java.desktop openjdk version "11.0.13" 2021-10-19 OpenJDK Runtime…

Continue Reading Troubles launch IGV on Linux(Debian)

Diagnostic genome sequencing improves diagnostic yield: a prospective single-centre study in 1000 patients with inherited eye diseases

Introduction Although protein-coding regions represent only 1–2% of the human genome, they harbour an estimated 85% of annotated pathogenic variants.1 2 Despite these numbers, genome sequencing (GS) usually achieves a higher diagnostic yield than sequencing approaches that focus on exonic regions, not least because of its more homogeneous coverage3 4…

Continue Reading Diagnostic genome sequencing improves diagnostic yield: a prospective single-centre study in 1000 patients with inherited eye diseases

Cell-free chromatin immunoprecipitation to detect molecular pathways in heart transplantation

Abstract Existing monitoring approaches in heart transplantation lack the sensitivity to provide deep molecular assessments to guide management, or require endomyocardial biopsy, an invasive and blind procedure that lacks the precision to reliably obtain biopsy samples from diseased sites. This study examined plasma cell-free DNA chromatin immunoprecipitation sequencing (cfChIP-seq) as…

Continue Reading Cell-free chromatin immunoprecipitation to detect molecular pathways in heart transplantation

Using ExomeDepth for GRCh38 processed samples to call CNVs

The only difference would be the annotations: instead of using the bed frames from data(genes.hg19) and data(exons.hg19) in ExomeDepth, I got them from the UCSC Table Browser for hg38 (genome.ucsc.edu/cgi-bin/hgTables). The only columns they contain are chromosome, start, end and name. Then run as before. Change bed.frame = exons.hg19 to the exon…

Continue Reading Using ExomeDepth for GRCh38 processed samples to call CNVs
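
The four-column layout described in the excerpt above (chromosome, start, end, name) can be prepared from a tab-separated BED export with a few lines of Python before loading it into R. This is a minimal sketch, not the poster's method: the function name `bed_to_exon_rows` and the sample exon names are hypothetical, and it assumes the export has at least four tab-separated columns in BED order.

```python
def bed_to_exon_rows(bed_text):
    """Parse BED text into rows with chromosome, start, end and name.

    Assumes tab-separated input with at least four columns in BED
    order; header-like lines (track, browser, #) are skipped.
    """
    rows = []
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, name = line.split("\t")[:4]
        rows.append(
            {"chromosome": chrom, "start": int(start), "end": int(end), "name": name}
        )
    return rows


# Made-up records, for illustration only.
example = "#chrom\tstart\tend\tname\nchr1\t100\t200\tGENE_A_exon1\nchr1\t300\t450\tGENE_A_exon2\n"
exon_rows = bed_to_exon_rows(example)
```

Whether chromosome names need a "chr" prefix depends on how the BAM files were aligned, so the names in the exon table should be checked against the BAM headers before running the analysis.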

FGC21024 – YFull YTree Info

R-FGC21024 – YFull YTree Info SNPs currently defining R-FGC21024 FGC85126     FGC20988     V5770 / FGC21024     Sample ID Country / Language Info Ref File Testing company Statistics Status YF075661 —— R-FGC20980* —— Hg38 .BAM FTDNA (Y700) 41X, 18.7 Mbp, 151 bp HG01947 new —— R-Y34349 HG01947_old T2T .BAM Scientific…

Continue Reading FGC21024 – YFull YTree Info

Connection timing out when downloading hg19, mm10

Hi, we are trying to set up a mirror of the UCSC browser. The install went fine, but when trying to download data we keep getting: root@genome:/usr/install# bash browserSetup.sh mirror hg19 mm10 | | Downloading databases hg19 mm10 plus hgFixed/proteome/go from the UCSC download server | | Determining download file…

Continue Reading Connection timing out when downloading hg19, mm10

YP311 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status YF122695 new Austria (Oberösterreich) / German R-YP311* —— T2T .BAM Nebula Genomics 17X, 45.1 Mbp, 150 bp YF067341 Russia (Chuvashskaya Respublika) / Chuvash R-YP311* —— Hg19 .BAM Dante Labs 13X, 23.0 Mbp, 151 bp YF016502 Italy (Palermo) R-YP311*…

Continue Reading YP311 – YFull YTree Info

Liftover GRCh37 to hg38 1kg/GATK.

I need to lift over a few variants from GRCh37 to hg38 1kg/GATK. UCSC liftOver does not have this reference genome version available. I have tried with the standard hg38, but the conversions are wrong. Where can I find GRCh37 to hg38 1kg/GATK chain files or…

Continue Reading Liftover GRCh37 to hg38 1kg/GATK.

Managing your data (BAM, VCF, sample, phenotype) with RDF and SPARQL.

13 years after "How Do You Manage Your Files & Directories For Your Projects?", I wrote a tutorial about how I now manage my data (BAM, VCF, sample, phenotype, reference, etc.) and how to link everything…

Continue Reading Managing your data (BAM, VCF, sample, phenotype) with RDF and SPARQL.

How To Get Bed File Containing Exons Of Canonical Transcripts And Their Corresponding Gene Symbols

Download a BED file for the canonical transcripts using the UCSC Table Browser: track: UCSC Genes; table: knownCanonical; output format: select fields from primary and related tables; press "get output"; select fields from hg19.knownCanonical: chrom, chromStart, chromEnd, transcript; select fields from hg19.kgXref: geneSymbol; press "get output". The file UCSC_canonical.bed looks like:…

Continue Reading How To Get Bed File Containing Exons Of Canonical Transcripts And Their Corresponding Gene Symbols

RNA-sequencing and bioinformatics analysis | COPD

Introduction COPD, a common preventable and treatable disease characterized by persistent airflow limitation and respiratory symptoms, is associated with exposure to harmful environments. COPD is currently the third leading cause of death globally. The high incidence and mortality of COPD, which seriously threaten human health, represent a public health problem…

Continue Reading RNA-sequencing and bioinformatics analysis | COPD

Z5989 – YFull YTree Info

E-Z5989 – YFull YTree Info SNPs currently defining E-Z5989 Z5989(H)     H Sample ID Country / Language Info Ref File Testing company Statistics Status HG02461 Gambia, The (Western) / Mandinka E-Z5989* —— Hg19 .BAM Scientific —— GMJOL5309977 Gambia, The (Western) E-Z5990* —— Hg38 .BAM Scientific 5X, 23.1 Mbp, 100 bp…

Continue Reading Z5989 – YFull YTree Info

over-represented sequence in negative control

Hello everyone, I have this sequence in the negative control of a miRNA-seq run: TGGTAATACGACGTACTTAGTGT. It did not map to any reference (miRNA, tRNA, rRNA, piRNA, mRNA, hg19, bacteria). I use QIAseq miRNA Library and NextSeq 550. Any idea what it might be?…

Continue Reading over-represented sequence in negative control

Clinical efficiency of mNGS in sputum for pathogen detection

Introduction Lower respiratory infections (LRIs) are the world’s most deadly communicable disease and ranks fourth as the primary cause of death globally according to the World Health Organization (WHO) 2019 report.1,2 LRIs include hospital-acquired pneumonia (HAP), community-acquired pneumonia (CAP), bronchiolitis, bronchitis, and tracheitis.3,4 Immunocompromised patients have a higher risk of…

Continue Reading Clinical efficiency of mNGS in sputum for pathogen detection

RnBeads Differential Methylation

Hi, I am trying to run a Differential methylation analysis with RnBeads as I have done previously on a similar fashion. This time I am just running it on my HPC and it seems to run just fine until it arrives to the differential methylation step. See the error output…

Continue Reading RnBeads Differential Methylation

WES CNV analysis

Hi, I am new to CNV analysis and a beginner in R. I am trying to call germline CNVs from exome data using ExomeDepth. I only have the raw data with the hg38 reference. If you have ExomeDepth scripts that run on the hg38 reference, kindly share…

Continue Reading WES CNV analysis

Annovar doesn’t output CADD scores

Hi, I followed the Annovar tutorial with the default dataset (avsnp147, ExAC and dbnsfp30a). The tutorial can be found here: annovar.openbioinformatics.org/en/latest/user-guide/startup/ The resulting vcf contained all the expected format and data, including CADD scores. Then, I decided to repeat this using gnomad211_exome,avsnp150, and dbnsfp42c datasets instead of those above, but…

Continue Reading Annovar doesn’t output CADD scores

Gut microbial carbohydrate metabolism contributes to insulin resistance

Study participants and data collection The study participants were recruited from 2014 to 2016 during their annual health check-ups at the University of Tokyo Hospital. The individuals included both male and female Japanese individuals aged from 20 to 75 years. The exclusion criteria were as follows: established diagnosis of diabetes,…

Continue Reading Gut microbial carbohydrate metabolism contributes to insulin resistance

Multivariate Analysis of Transcript Splicing (MATS)

Install rMATS: add the Python directory to the $PATH environment variable; add the bowtie and tophat directories to the $PATH environment variable; add the samtools directory to the $PATH environment variable; obtain a bowtie index for the genome in either of the following two ways: build your own bowtie index using bowtie-build from…

Continue Reading Multivariate Analysis of Transcript Splicing (MATS)

GATK AnnotateVcfWithBamDepth returns zero DP for all variants in VCF

Dear all, I am using GATK (v4.1.9.0) AnnotateVcfWithBamDepth to get the DP for all variants in ClinVar VCF in a retina RNA-seq BAM file. However, the tool returns zero depth for all variants in the VCF, even though I checked multiple variants in IGV and I saw that they are…

Continue Reading GATK AnnotateVcfWithBamDepth returns zero DP for all variants in VCF

illuminahumanmethylation450k annotation for hg38

I am new to R. I am trying to do a methylation analysis for CRC samples (.idat files) downloaded from GDC. Here’s what I am doing: query_met <- GDCquery(project = c("TCGA-COAD", "TCGA-READ"), data.category = "DNA Methylation", data.type = "Masked Intensities", platform = "Illumina Human Methylation…

Continue Reading illuminahumanmethylation450k annotation for hg38

Bioconductor – wavClusteR

DOI: 10.18129/B9.bioc.wavClusteR     This package is for version 3.12 of Bioconductor; for the stable, up-to-date release version, see wavClusteR. Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data Bioconductor version: 3.12 The package provides an integrated pipeline for the analysis of PAR-CLIP data. PAR-CLIP-induced transitions are…

Continue Reading Bioconductor – wavClusteR

Bioconductor – vulcan

DOI: 10.18129/B9.bioc.vulcan     This package is for version 3.15 of Bioconductor; for the stable, up-to-date release version, see vulcan. VirtUaL ChIP-Seq data Analysis using Networks Bioconductor version: 3.15 Vulcan (VirtUaL ChIP-Seq Analysis through Networks) is a package that interrogates gene regulatory networks to infer cofactors significantly enriched in a…

Continue Reading Bioconductor – vulcan

Advances in methylation analysis of liquid biopsy in early cancer detection of colorectal and lung cancer

Study participants Whole blood samples were collected from 327 participants consisting of 102 with colorectal cancer, 99 with lung cancer, and 126 healthy controls. After excluding 6 patients who withdrew consent to participate and two patients with QC-failed samples, the final analysis included 96 patients with colorectal cancer, 95 with…

Continue Reading Advances in methylation analysis of liquid biopsy in early cancer detection of colorectal and lung cancer

Long-molecule scars of backup DNA repair in BRCA1- and BRCA2-deficient cancers

Pan-cancer WGS data sources GRCh37/hg19 BAM alignments for 2,489 primary tumour and matched normal whole-genome sequencing data were obtained as previously described18. In brief, 989 tumour–normal (T/N) pairs were obtained from The Cancer Genome Atlas (TCGA) Research Network (Genomic Data Commons at portal.gdc.cancer.gov/, accession: phs000178.v11.p8). Additional WGS data were obtained for 874 T/N pairs…

Continue Reading Long-molecule scars of backup DNA repair in BRCA1- and BRCA2-deficient cancers

Alternatives To Liftover

Has anybody had any success with any tools other than LiftOver or NCBI’s Genome Remapping Service for mapping/translating reference genome positions? My experience with LiftOver has been less than satisfactory, and NCBI does not seem to offer any local version of their tool. Other than BLASTing…

Continue Reading Alternatives To Liftover

Bioconductor – gwascat (development version)

DOI: 10.18129/B9.bioc.gwascat   This is the development version of gwascat; for the stable release version, see gwascat. representing and modeling data in the EMBL-EBI GWAS catalog Bioconductor version: Development (3.18) Represent and model data in the EMBL-EBI GWAS catalog. Author: VJ Carey <stvjc at channing.harvard.edu> Maintainer: VJ Carey <stvjc at…

Continue Reading Bioconductor – gwascat (development version)

how to compare mapping of WES samples to human pangenome?

Hi, I’m still trying to wrap my head around the new human pangenome reference and would like some advice on how to go about analyzing some of the WES data (hg38/hg19 baits) that I currently have. How should I calculate…

Continue Reading how to compare mapping of WES samples to human pangenome?