Tag: hg19
Moderate Mapping percentage
Hi all, I received my sequenced transcriptome and genomic data from my service provider and started working with it. Both the DNA and RNA data passed quality metrics post trimming. But the mapping percentage comes out to be 90% using bowtie-DNA and 85% using Hisat2-RNA. I…
The Evolution from HG19 to HG38
Welcome to another blog post! Reference genomes are essential benchmarks of a species’ genome that facilitate the accurate comparison of individual genomes and are crucial tools for identifying genetic variants and diagnosing rare diseases. Here, we will explore the evolution of the human reference genome, focusing on the transition…
A Benchmark of Genetic Variant Calling Pipelines Using Metagenomic Short-Read Sequencing
Introduction Short-read metagenomic sequencing is the technique most widely used to explore the natural habitat of millions of bacteria. In comparison with 16S rRNA sequencing, shotgun metagenomic sequencing (MGS) provides sequence information of the whole genomes, which can be used to identify different genes present in an individual bacterium and…
Analysis of sepsis combined with pulmonary infection by mNGS
Introduction Sepsis is one of the major diseases that poses a serious threat to human health, and its incidence and in-hospital mortality rates remain high despite the continuous updating of sepsis guidelines.1 Its main clinical manifestations are elevated body temperature, chills, and rapid heart rate, and it is most common…
Randomized phase II study of preoperative afatinib in untreated head and neck cancers: predictive and pharmacodynamic biomarkers of activity
Study objectives and endpoints The main objective consisted in identifying predictive biomarkers of efficacy by exploring correlation between baseline potential biomarkers and radiological and metabolic responses to afatinib. Secondary objectives were to identify potential pharmacodynamic biomarkers, to evaluate the efficacy and safety of afatinib and to assess the metabolic and…
Genomic hypomethylation in cell-free DNA predicts responses to checkpoint blockade in lung and breast cancer
Lung cancer ICB cohort Advanced non-small cell lung carcinoma patients who were treated with anti-PD-1/PD-L1 monotherapy at Samsung Medical Center, Seoul, Republic of Korea were enrolled for this study. The present study has been reviewed and approved by the Institutional Review Board (IRB) of the Samsung Medical Center (IRB no….
ftbfs and autopkgtest regression with htslib 1.19
Source: cyvcf2
Version: 0.30.22-1
Severity: important
Tags: ftbfs upstream
With the introduction of htslib 1.19 in experimental, cyvcf2 is experiencing test failures at package build time and autopkgtest time. The relevant part of the error looks like: cyvcf2/tests/test_reader.py …………………Fatal Python error: Aborted Current thread 0x00007fa7874de040 (most recent call first): File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.11_cyvcf2/build/cyvcf2/tests/test_reader.py", line 285…
Convert bed file from hg19 to GRCH38
Hello everyone! I have a list of over 500,000 rs IDs and I would like to obtain their coordinates (BED file) on the GRCh38 reference genome. I am using the UCSC Table Browser tool, but unfortunately, it doesn’t find 90,000 of the rs IDs, and since…
Archaic Introgression Shaped Human Circadian Traits | Genome Biology and Evolution
Abstract When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely, Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultraviolet radiation and increased variation in…
Beyond the exome: utility of long-read whole genome sequencing in exome-negative autosomal recessive diseases | Genome Medicine
Our cohort comprises 34 families in which a presumably autosomal recessive disease defied molecular diagnosis by clinical exome sequencing (short-read sequencing-based) and reanalysis performed on the index individual for each family (Fig. 1). The index patient in each family was subjected to an average of 10 × depth lrWGS except for Family F8602…
Single-cell analysis of chromatin accessibility in the adult mouse brain
Tissue preparation and nucleus isolation All experimental procedures using live animals were approved by the SALK Institute Animal Care and Use Committee under protocol number 18-00006. Adult C57BL/6J male mice were purchased from Jackson Laboratories. Brains were extracted from 56–63-day-old mice and sectioned into 600 µm coronal sections along the anterior–posterior…
Bioactive glycans in a microbiome-directed food for children with malnutrition
Collection and handling of biospecimens obtained from participants in the randomized controlled clinical study of the efficacy of MDCF-2 The human study entitled ‘Community-based clinical trial with microbiota-directed complementary foods (MDCFs) made of locally available food ingredients for the management of children with primary moderate acute malnutrition (MAM)’ was approved…
Methylation Analysis Tutorial in R_part1
The code and approaches that I share here are those I am using to analyze TCGA methylation data. At the bottom of the page, you can find references used to make this tutorial. If you are coming from a computer background, please bear with a geneticist who tried to code…
DNA polymerases in precise and predictable CRISPR/Cas9-mediated chromosomal rearrangements | BMC Biology
Cell culture The human endometrial carcinoma HEC-1-B cells were cultured in the modified Eagle’s medium (MEM) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin at 37°C in a 5% (v/v) CO2 incubator. The human embryonic kidney HEK293T cells were cultured in the Dulbecco’s modified Eagle’s medium (DMEM) supplemented…
Metagenomic next-gene sequencing for respiratory infections
Introduction Respiratory tract infections are common and occur frequently. Rapid and accurate microbial detection is essential for timely and appropriate treatment. Traditional microbial detection methods have some limitations such as dependence on morphology, long duration, low sensitivity, and high variability.1,2 Metagenomic next-generation sequencing (mNGS) is a new detection technology characterized…
sam – Discrepancy in Read Counts Between FastQ and BAM Files in Adapter-Trimmed Pipeline
In a FastQ to BAM pipeline where only adapter trimming is performed, I’ve noticed a potential discrepancy in read counts between the initial FastQ files and their resulting BAM file. Specifically, I’m seeking clarification on whether the following statement holds true: “Total number of reads in R1 and R2 FastQ…
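One way to test a statement like this is to count records on the FastQ side and compare them with the BAM. The following is a minimal Python sketch, assuming plain or gzip-compressed FastQ files with exactly four lines per record; the function name `count_fastq_reads` and the demo file name are hypothetical and do not come from the original post.

```python
import gzip
import os
import tempfile


def count_fastq_reads(path):
    """Count reads in a FastQ file, assuming exactly four lines per record."""
    # Pick the opener from the file extension (assumption: .gz means gzip).
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


# Small self-contained demonstration with a two-read FastQ file.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo_R1.fastq")
    with open(demo, "w") as out:
        out.write("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
    demo_count = count_fastq_reads(demo)
```

When comparing against the BAM, note that a raw record count can exceed R1 + R2 because aligners may emit secondary (flag 0x100) and supplementary (flag 0x800) records; `samtools view -c -F 0x900` counts records with those excluded. A trimmer that discards reads can push the count the other way.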
Human hg38 chr6:31,165,200-31,165,800 UCSC Genome Browser v457
Custom Tracks: ac4C-RIP-seq peaks, hESC CTL-1; ac4C-RIP-seq peaks, hESC CTL-2; ac4C-RIP-seq peaks, hESC NAT10-KD-1; ac4C-RIP-seq peaks, hESC NAT10-KD-2. Mapping and Sequencing: Base Position; p14 Fix Patches; p14 Alt Haplotypes; Assembly; Centromeres; Chromosome Band; Clone Ends; Exome Probesets; FISH Clones; Gap; GC Percent; GRC Contigs; GRC Incident; Hg19…
ASEReadCounter output wrong number of coverage
Hi, I am using ASEReadCounter to count the number of reads per variant in a BAM file. For some positions, it reports 1 read of coverage (1 refCount or 1 altCount), although no read covers those positions when checking it in…
Java error while running HiCDC overview code
Hey folks! I am trying to run the code provided in the HiCDC overview (github.com/mervesa/HiCDCPlus#diff_int). Everything runs as expected (as shown in the Github code) but at the moment of trying to save the output of HiCDCPlus_parallel() into a .hic file with hicdc2hic() I run into an error I am…
Identification of constrained sequence elements across 239 primate genomes
De novo assembly and repeat-masking To maximize the species diversity of primates in our analyses, we newly sequenced and assembled the genomes of 187 different primate species, initially presented in refs. 11,23, for which no other reference genome assembly was available. In brief, each individual was sequenced with 150 bp paired…
East Asian-specific and cross-ancestry genome-wide meta-analyses provide mechanistic insights into peptic ulcer disease
We conducted a three-stage genome-wide analysis of PUD and its subtypes. An overview of the workflow is provided in Fig. 1 and Supplementary Fig. 1. PUD cases in the east Asian populations were obtained by combining individuals with any of the two major PUD subtypes (DU and GU), which were…
DNA methylation change in blood cells of FB and CFS patients
Introduction Fibromyalgia (FM) and Chronic Fatigue Syndrome (CFS) are characterized by chronic pain, fatigue, and weakness. Patients with these symptoms also suffer from sleep abnormalities and report affected cognitive processes such as memory. The diagnosis of these two syndromes is challenging and is based on questionnaires that make the diagnosis…
missing region in the process of annotation
Hi. I am analyzing TCGA methylation data from TCGAbiolinks and ran into a problem during the annotation process with annotatr. The TCGA data covers a gene on chromosome 19, but the annotated result does not contain one region on chromosome 19. I…
H101 for cervical cancer | DDDT
Introduction Patients with persistent, recurrent, or metastatic (P/R/M) cervical carcinoma respond poorly to treatment despite the best available therapeutic regimens, with a 5-year survival of 17%.1 Most of them are heavily pretreated with chemotherapy and/or radiotherapy, and many patients experience complications related to treatment or advanced disease, which exclude them…
Evaluating 17 methods incorporating biological function with GWAS summary statistics to accelerate discovery demonstrates a tradeoff between high sensitivity and high positive predictive value
Method selection We reviewed the published literature through February 2020 to identify methods that met the following criteria: i. Descriptively categorized as (a) annotation-based; (b) pleiotropy-based; or (c) eQTL-based. ii. Utilized GWAS summary statistics, as opposed to individual-level genotype data. iii. Implemented using freely-available software or packages. iv. Provided either…
selection of reference genome
Hello everyone, I have a VCF file with variants called using hg38 as the reference genome. I wonder what would happen if I used hg19 as the reference genome to annotate these variants. Would that be OK, or would the results be wrong? Thanks!
BBMap : NH:i:1 and XT:A:R
Hi, I aligned some paired-end reads on hg19 and found some strange stuff (at least for me). Some reads have NH:i:1, i.e. the read aligned at one position, and also XT:A:R, a flag that indicates that the second read is repetitive. I don’t understand how…
How to overlap patient VCF with ClinVar database annotation using bedtools?
Hello, I’m trying to help a colleague who wants to add the ClinVar database's clinical significance column to VCF samples that she analysed. More specifically, we are trying to add overlapping/common variant annotation so that if the variant…
How to obtain data on the coordinates of the Exon region from UCSC
UCSC is accessible via MySQL (genome.ucsc.edu/goldenPath/help/mysql.html) and I’ve always found it very useful to browse tables this way to see exactly what is contained in a specific table within a specific database. I use Sequel Pro…
Application of CNV-seq technology | IJWH
Introduction Ultrasound soft markers refer to small nonspecific variations in foetal structure found in prenatal ultrasound that are often associated with abnormal chromosome number or pathogenic copy number variations (CNVs).1,2 Common ultrasound soft markers include nuchal translucency (NT) thickness, nuchal fold (NF) thickness, nasal bone dysplasia, choroid plexus cyst, intracardiac…
Allelic hierarchy for USH2A influences auditory and visual phenotypes in South Korean patients
Genotypes of USH2A-related disorders We identified 14 biallelic variants in USH2A, either homozygous or compound heterozygous, in a trans configuration. The segregation of these variants was confirmed by Sanger sequencing. Overall, 18 mutant alleles were implicated in the diagnosis of USH2A-related phenotypes, including c.251G > A:p.Cys84Tyr, c.2209C > T:p.Arg737*, c.2802 T > G:p.Cys934Trp, c.4372C > T:p.Arg1578Cys, c.4858C > T:p.Gln1620*, c.7120 + 1475A > G, c.8232G > C:p.Trp2744Cys,…
LncRNA INHEG promotes glioma stem cell maintenance and tumorigenicity through regulating rRNA 2’-O-methylation
Ethics statement All mice procedures in this study were performed under an animal protocol approved by the Institutional Animal Care and Use Committee guidelines of Westlake University. The procedures and protocols for glioma patients were approved by the institutional review board of Beijing Tiantan Hospital. Informed consent was obtained from…
Primate-specific ZNF808 is essential for pancreatic development in humans
Subjects The study was conducted in accordance with the Declaration of Helsinki and all subjects or their parents/guardian gave informed written consent for genetic testing. DNA testing and storage in the Beta Cell Research Bank was approved by the Wales Research Ethics Committee 5 Bangor (REC 17/WA/0327, IRAS project ID…
How to change "CompressedGRangesList" to "GRangesList"
Hi, I am trying to do A/B compartment analysis with minfi, but I got the following error.
```r
Error in { : task 1 failed - "is(object, "SummarizedExperiment") is not TRUE"
```
Since I want to use data with hg38 annotation but the `makeGenomicRatioSetFromMatrix` function has only `ilmn12.hg19`, I ran the `makeGenomicRatioSetFromMatrix` function with…
A Cre-dependent massively parallel reporter assay allows for cell-type specific assessment of the functional effects of non-coding elements in vivo
Animal models All procedures involving animals were approved by the Institutional Animal Care and Use Committee (IACUC) at Washington University in St. Louis, MO. Veterinary care and housing was provided by the veterinarians and veterinary technicians of Washington University School of Medicine under Dougherty lab’s approved IACUC protocol. All protocols…
Understanding GISTIC 2.0
Hello, I ran my analysis on GISTIC_2.0 version 6.15.30 on Gene Pattern. I have 58 samples of whole genome sequencing data with the seg copy number file. I prepared the input for gistic as stated in the literature and the forum: a txt file with 6 columns: sample, chromosome, start,…
Bioconductor – GenomicRanges
This package is for version 2.14 of Bioconductor; for the stable, up-to-date release version, see GenomicRanges. Representation and manipulation of genomic intervals Bioconductor version: 2.14 The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyze high-throughput sequencing…
Bioconductor – BSgenome.Hsapiens.UCSC.hg19.masked
This package is for version 3.3 of Bioconductor; for the stable, up-to-date release version, see BSgenome.Hsapiens.UCSC.hg19.masked. Full masked genome sequences for Homo sapiens (UCSC version hg19) Bioconductor version: 3.3 Full genome sequences for Homo sapiens (Human) as provided by UCSC (hg19, Feb. 2009) and stored in Biostrings objects….
Whole genome sequencing in high-grade cervical intraepitheli… : Medicine
1. Introduction Cervical cancer (CC) is the third most common cancer in women worldwide and has a high mortality rate among women. In 2008, CC was responsible for 275,000 deaths, thereby being the fourth leading cause of cancer death in females worldwide.[1,2] In China, CC is the second most…
Bioconductor – rtracklayer
DOI: 10.18129/B9.bioc.rtracklayer R interface to genome annotation files and the UCSC genome browser Bioconductor version: Release (3.6) Extensible framework for interacting with multiple genome browsers (currently UCSC built-in) and manipulating annotation tracks in various formats (currently GFF, BED, bedGraph, BED15, WIG, BigWig and 2bit built-in). The user may…
Biological and genetic characterization of a newly established human external auditory canal carcinoma cell line, SCEACono2
Ethics statement The Clinical Research Ethics Review Committee of Kyushu University Hospital approved the study (permit no. 29-43, 30-268, and 700-00). Written informed consent for the current research project was obtained before the tumor tissue and a blood sample were harvested. This study was also conducted according to the principles…
Invalid indirect expansion error on Slurm
In attempting to run juicer.sh on a Slurm cluster, I am met with an “indirect expansion” error. Is there any quick fix? Here are the steps taken: downloaded the source code of the latest stable release, v.1.6; added a command within the master script activating a conda environment with the…
Bioconductor – AnnotationHub
DOI: 10.18129/B9.bioc.AnnotationHub Client to access AnnotationHub resources Bioconductor version: Release (3.6) This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be…
Clonal Hematopoiesis and Cardiovascular Disease in Patients With Multiple Myeloma Undergoing Hematopoietic Cell Transplant | Cardiology | JAMA Cardiology
Key Points Question Is clonal hematopoiesis of indeterminate potential (CHIP) detected at the time of hematopoietic stem transplant (HCT) associated with increased rates of cardiovascular disease (CVD) among patients with multiple myeloma (MM) following HCT? Finding In this cohort study of patients with MM undergoing HCT, CHIP was highly prevalent…
About the item name of UCSC GWAS catalog
hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gwasCatalog.sql `bin` smallint(5) unsigned NOT NULL, `chrom` varchar(255) NOT NULL, `chromStart` int(10) unsigned NOT NULL, `chromEnd` int(10) unsigned NOT NULL, `name` varchar(255) NOT NULL, `pubMedID` int(10) unsigned NOT NULL, `author` varchar(255) NOT NULL, `pubDate` varchar(255) NOT NULL, `journal` varchar(255) NOT NULL, `title` varchar(1024) NOT NULL, `trait` varchar(255) NOT NULL, `initSample`…
Structural Variants in gnomAD v4
Today, we are thrilled to announce the release of genome-wide structural variants (SVs) for 63,046 unrelated samples with genome sequencing (GS) data. All site-level information for 1,199,117 high-quality SVs discovered in these samples is browsable in the gnomAD browser (gnomAD SV v4) and downloadable from the gnomAD downloads page. For…
Single-nucleus DNA sequencing reveals hidden somatic loss-of-heterozygosity in Cerebral Cavernous Malformations
Ethical statement Our research complies with all relevant ethical regulations, including the Declaration of Helsinki and has been approved by the Institutional Review Boards of University of Chicago, Duke University and the Alliance to Cure Cavernous Malformations. Cerebral cavernous malformation lesions All human CCM tissue specimens have been previously reported18,19…
Error in Gviz (actually, rtracklayer)
When I run this code (below) `iTrack <- IdeogramTrack(genome = "hg19", chromosome = "chr2", name = "")` then I get the error `Error: failed to load external entity "http://genome.ucsc.edu/FAQ/FAQreleases"` Did someone else encounter this…
Hey guys, I’m having a problem when using GATK4 BQSR. This dbSNP VCF file has chromosomes named 1, 2, … but my reference contigs are chr1, chr2, …, so the contigs are incompatible
anilkumar@ak-omen-laptop:~/NGStools/gatk-4.4.0.0$ gatk --java-options "-DGATK_STACKTRACE_ON_USER_EXCEPTION=true" BaseRecalibrator -I "/media/anilkumar/My Passport/CRC/fastq/C_4_mkdp.bam" -R "/media/anilkumar/My Passport/CRC/fastq/hg19.fa" --known-sites "/media/anilkumar/My Passport/CRC/fastq/dbsnp_138.b37.vcf" --known-sites "/media/anilkumar/My Passport/CRC/fastq/Mills_and_1000G_gold_standard.indels.b37.vcf" --known-sites "/media/anilkumar/My Passport/CRC/fastq/1000G_phase1.indels.b37.vcf" -O "/media/anilkumar/My Passport/CRC/fastq/C_4_bqsr.table"
Using GATK jar /home/anilkumar/NGStools/gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar
Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -DGATK_STACKTRACE_ON_USER_EXCEPTION=true -jar /home/anilkumar/NGStools/gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar BaseRecalibrator -I /media/anilkumar/My Passport/CRC/fastq/C_4_mkdp.bam -R /media/anilkumar/My Passport/CRC/fastq/hg19.fa --known-sites /media/anilkumar/My Passport/CRC/fastq/dbsnp_138.b37.vcf --known-sites /media/anilkumar/My Passport/CRC/fastq/Mills_and_1000G_gold_standard.indels.b37.vcf --known-sites…
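The mismatch in this excerpt is between b37-style contig names (1, 2, …, MT) in the known-sites VCFs and UCSC-style names (chr1, chr2, …, chrM) in hg19.fa. A minimal Python sketch of the name mapping follows, assuming only the primary contigs need renaming; the function names `b37_to_ucsc_contig` and `rename_vcf_line` are hypothetical and are not part of GATK.

```python
def b37_to_ucsc_contig(name):
    """Map a b37-style contig name to a UCSC-style name (primary contigs only)."""
    primary = {str(i) for i in range(1, 23)} | {"X", "Y"}
    if name in primary:
        return "chr" + name
    if name == "MT":
        return "chrM"
    # Unplaced contigs (e.g. GL000207.1) follow no simple prefix rule:
    # they are left unchanged here (assumption, handle them separately).
    return name


def rename_vcf_line(line):
    """Rename the contig in a ##contig header line or in a VCF data line."""
    prefix = "##contig=<ID="
    if line.startswith(prefix):
        rest = line[len(prefix):]
        # The ID ends at the first ',' or '>' after the prefix.
        ends = [i for i in (rest.find(","), rest.find(">")) if i != -1]
        end = min(ends) if ends else len(rest)
        return prefix + b37_to_ucsc_contig(rest[:end]) + rest[end:]
    if line.startswith("#"):
        return line  # other header lines are passed through untouched
    chrom, sep, tail = line.partition("\t")
    return b37_to_ucsc_contig(chrom) + sep + tail
```

This only changes names. In practice `bcftools annotate --rename-chrs` with a two-column mapping file does the same job on compressed VCFs, and using known-sites files built against the same reference as the BAM avoids the problem altogether. It is also worth checking the mitochondrial contig separately, since renaming MT to chrM does not guarantee that the underlying sequences match.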
Bioconductor – regioneR (development version)
DOI: 10.18129/B9.bioc.regioneR This is the development version of regioneR; for the stable release version, see regioneR. Association analysis of genomic regions based on permutation tests Bioconductor version: Development (3.19) regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other…
Comparative Analysis of Structural Variant Callers on Short-Read Whole-Genome Sequencing Data
Pang, A.W., MacDonald, J.R., Pinto, D., et al., Towards a comprehensive structural variation map of an individual human genome, Genome Biol., 2010, vol. 11, no. 5, p. R52. doi.org/10.1186/gb-2010-11-5-r52 Article CAS PubMed PubMed Central Google Scholar The International HapMap Consortium, The international HapMap project, Nature, 2003, pp. 789—796. doi.org/10.1038/nature02168 Sudmant,…
coverage of dnase-seq narrow peak file of genome
First, generate intervals for hg19 (perhaps stripping out non-nuclear and mitochondrial chromosomes):
$ fetchChromSizes hg19 | awk -v FS="\t" -v OFS="\t" '{ print $1, "0", $2; }' | grep -v "[_*_|MT]" | sort-bed - > hg19.nuc.bed
To calculate coverage:
$ bedmap --skip-unmapped --delim '\t' --echo --bases-uniq --echo-ref-size --bases-uniq-f hg19.nuc.bed <(sort-bed…
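To make the quantity being computed concrete, here is a small pure-Python sketch of "unique bases covered" for the peaks on one chromosome. It is an illustration of the idea only, not how bedmap is implemented, and it assumes BED-style zero-based, half-open intervals; the function names `unique_bases_covered` and `coverage_fraction` are hypothetical.

```python
def unique_bases_covered(intervals):
    """Count bases covered by at least one half-open [start, end) interval."""
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered


def coverage_fraction(peaks, chrom_size):
    """Fraction of a chromosome of length chrom_size covered by the peaks."""
    return unique_bases_covered(peaks) / chrom_size
```

Merging before counting is what keeps overlapping peaks from being counted twice, which is the difference between unique-base coverage and a plain sum of peak lengths.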
Bioconductor – regionalpcs (development version)
DOI: 10.18129/B9.bioc.regionalpcs This is the development version of regionalpcs; for the stable release version, see regionalpcs. Summarizing Regional Methylation with Regional Principal Components Analysis Bioconductor version: Development (3.19) Functions to summarize DNA methylation data using regional principal components. Regional principal components are computed using principal components analysis within genomic…
No valid chromosomes found! on Michigan Imputation Server
I have two VCF files, one hg19 and one hg38, analysing data from the same participants on two slightly different SNP platforms. Both files have been through the pre-imputation checks. The header (and the first line) of the hg38 version looks like:
##fileformat=VCFv4.3
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=20230906
##source=PLINKv2.00
##contig=<ID=chr1,length=248917420>…
plotting gene structure in R
I have a list of amplified genes with genomic intervals and would like to visualise the specific part of gene that is affected. Is there a way to plot a gene and highlight the region of interest? I was thinking a plot with introns…
Pre-imputation checks using 1000G data (hg19) for a hg38 VCF
I’m trying to use the pre-imputation checks here www.well.ox.ac.uk/~wrayner/tools/ to check a vcf (on the hg38 assembly) on the 1000G phase 3 v5 data, which is hg19, before imputing using the MIS. Obviously, very few of the variants in…
GoM DE: interpreting structure in sequence count data with differential expression analysis allowing for grades of membership | Genome Biology
Models for single-cell ATAC-seq data In single-cell ATAC-seq data, \(x_{ij}\) is the number of unique reads mapping to peak or region j in cell i. Although \(x_{ij}\) can take non-negative integer values, it is common to “binarize” the accessibility data (e.g., [19, 74, 133,134,135]), meaning that \(x_{ij} = 1\) when…
Chromatin compartmentalization regulates the response to DNA damage
Cell culture and treatments DIvA (AsiSI-ER-U2OS)19, AID-DIvA (AID-AsiSI-ER-U2OS)23 and 53BP1-GFP DIvA20 cells were developed in U2OS (ATCC HTB-96) cells and were previously described. Authentication of the U2OS cell line was performed by the provider ATCC, which uses morphology and short tandem repeat profiling to confirm the identity of human cell…
Inactive S. aureus Cas9 downregulates alpha-synuclein and reduces mtDNA damage and oxidative stress levels in human stem cell model of Parkinson’s disease
Cloning of CRISPR/sgRNA lentiviral constructs with fluorescent selection markers A tetracycline-inducible promoter (TRE3G) was used to control the expression of S. aureus dCas9 in a lentiviral vector. To facilitate selection of cells by FACS, pHR:TRE3G-SadCas9-2xKRAB-p2a-tdTomato (Addgene ID #209298) was subcloned from a pHR:TRE3G-SadCas9-2xKRAB-p2a-zeo (A gift from Professor Stanley Qi), where zeocin…
R: Getting browser views
browserView-methods {rtracklayer}, R Documentation
Description: Methods for creating and getting browser views.
Usage: browserView(object, range, track, ...)
Arguments:
object: The object from which to get the views.
range: The GRanges or RangesList to display. If there are multiple elements, a view is created…
map Ensembl gene ID from hg19 to hg38
Hello! I would like to convert Ensembl gene ID from hg19 to hg38 with R. I tried with this code: ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl", host = "grch37.ensembl.org") ensembl_ids <- c("ENSG00000183878", "ENSG00000146083") converted_ids <- getLDS(attributes = c("ensembl_gene_id"), filters = "ensembl_gene_id", values…
public databases – Converting VCF format to text for use with PLINK and understanding column mapping
I successfully completed Nature PRS tutorial, which is based on PLINK. Turning to my real data, I downloaded ukb-d-20544_1.vcf.gz. Now I’m facing the problem that I seem to be unable to use it in PLINK or find the correct data format to download at all, and I am a bit…
Ultra-fast deep-learned CNS tumour classification during surgery
Data simulation Short nanopore sequencing runs yield sparse and random coverage of the genome. To enable model training, we generate simulated sparse nanopore runs based on microarray data. To this end, N simulated reads are randomly sampled from the read length distribution (D) and assigned a start mapping position in…
How to obtain data on the specific location of the segmental duplication.
I know it can be viewed from the Segmental Dups track in the Repeats group at the bottom of the UCSC browser, but I would like to view it in IGV, not UCSC. So I looked for the golden path…
Distribution tendencies of pathogens causing LRTI
Introduction Lower respiratory tract infection (LRTI) remains one of the leading causes of death worldwide.1 Several well-known pathogens, including Streptococcus pneumoniae, Pseudomonas aeruginosa, Klebsiella pneumoniae, Candida, Herpesvirus, and others, have been identified as significant causes of infection.2 Nonetheless, nearly half of the cases still have an undetermined etiology,3,4 despite the…
Bioconductor – GreyListChIP
DOI: 10.18129/B9.bioc.GreyListChIP This package is for version 3.11 of Bioconductor; for the stable, up-to-date release version, see GreyListChIP. Grey Lists — Mask Artefact Regions Based on ChIP Inputs Bioconductor version: 3.11 Identify regions of ChIP experiments with high signal in the input, that lead to spurious peaks during…
AlphaMissense Plugin VEP
I’ve installed the AlphaMissense plugin in VEP, but I can’t use it. I downloaded the requested files and ran the tabix command before using it. Then I launched the command, but I got this error: WARNING: Failed to instantiate plugin AlphaMissense: ERROR: No file specified Try using…
Progress and challenges in completing the human gene catalogue
In a recent review published in Nature, a group of authors reviewed the progress and challenges in annotating the human genome, including protein-coding genes, isoforms, and non-coding ribonucleic acids (RNAs), and advocated for a universal annotation standard for clinical use. Study: The status of the human gene catalogue. Image Credit:…
How to choose LiftOver chain file
I am trying to liftover a hg38 Whole Genome Sequenced VCF to hg19 VCF. Planning to use GATK Picard for this. However not sure which liftover chain file to use from this path: …
Bioconductor – RSVSim
DOI: 10.18129/B9.bioc.RSVSim RSVSim: an R/Bioconductor package for the simulation of structural variations Bioconductor version: Release (3.11) RSVSim is a package for the simulation of deletions, insertions, inversion, tandem-duplications and translocations of various sizes in any genome available as FASTA-file or BSgenome data package. SV breakpoints can be placed…
Bioconductor – SNPlocs.Hsapiens.dbSNP142.GRCh37
DOI: 10.18129/B9.bioc.SNPlocs.Hsapiens.dbSNP142.GRCh37 This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see SNPlocs.Hsapiens.dbSNP142.GRCh37. SNP locations for Homo sapiens (dbSNP Build 142) Bioconductor version: 3.13 SNP locations and alleles for Homo sapiens extracted from NCBI dbSNP Build 142. The source data files used for…
Multitissue H3K27ac profiling of GTEx samples links epigenomic variation to disease
Samples for H3K27ac ChIP–seq Samples were collected by the GTEx Consortium. The donor enrollment and consent, informed consent approval, histopathological review procedures, and biospecimen procurement methods and fixation were the same as previously described22. No compensation was provided to the families of participants. Massachusetts Institute of Technology Committee on the…
Troubles launch IGV on Linux(Debian)
I am trying to run IGV on Debian. I have followed these steps:
wget data.broadinstitute.org/igv/projects/downloads/2.16/IGV_Linux_2.16.2_WithJava.zip
unzip IGV_Linux_2.16.2_WithJava.zip
My@machine:~/software/IGV_Linux_2.16.2$ ./igv.sh
And this is the output I got:
WARNING: package com.sun.java.swing.plaf.windows not in java.desktop
WARNING: package sun.awt.windows not in java.desktop
openjdk version "11.0.13" 2021-10-19 OpenJDK Runtime…
Diagnostic genome sequencing improves diagnostic yield: a prospective single-centre study in 1000 patients with inherited eye diseases
Introduction Although protein-coding regions represent only 1–2% of the human genome, they harbour an estimated 85% of annotated pathogenic variants.1 2 Despite these numbers, genome sequencing (GS) usually achieves a higher diagnostic yield than sequencing approaches that focus on exonic regions, not least because of its more homogeneous coverage3 4…
Cell-free chromatin immunoprecipitation to detect molecular pathways in heart transplantation
Abstract Existing monitoring approaches in heart transplantation lack the sensitivity to provide deep molecular assessments to guide management, or require endomyocardial biopsy, an invasive and blind procedure that lacks the precision to reliably obtain biopsy samples from diseased sites. This study examined plasma cell-free DNA chromatin immunoprecipitation sequencing (cfChIP-seq) as…
Using ExomeDepth for GRCh38 processed samples to call CNVs
The only difference would be the annotations: instead of using the bed frames from data(genes.hg19) and data(exons.hg19) in ExomeDepth, I got them from the UCSC Table Browser for hg38 (genome.ucsc.edu/cgi-bin/hgTables). The only columns they contain are chromosome, start, end and name. Then run as before. Change bed.frame = exons.hg19 to the exon…
FGC21024 – YFull YTree Info
R-FGC21024 – YFull YTree Info SNPs currently defining R-FGC21024 FGC85126 FGC20988 V5770 / FGC21024 Sample ID Country / Language Info Ref File Testing company Statistics Status YF075661 —— R-FGC20980* —— Hg38 .BAM FTDNA (Y700) 41X, 18.7 Mbp, 151 bp HG01947 new —— R-Y34349 HG01947_old T2T .BAM Scientific…
Connection timing out when downloading hg19, mm10
Hi, We are trying to set up a mirror of the UCSC browser. The install went fine, but when trying to download data we keep getting: root@genome:/usr/install# bash browserSetup.sh mirror hg19 mm10 | | Downloading databases hg19 mm10 plus hgFixed/proteome/go from the UCSC download server | | Determining download file…
YP311 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status YF122695 new Austria (Oberösterreich) / German R-YP311* —— T2T .BAM Nebula Genomics 17X, 45.1 Mbp, 150 bp YF067341 Russia (Chuvashskaya Respublika) / Chuvash R-YP311* —— Hg19 .BAM Dante Labs 13X, 23.0 Mbp, 151 bp YF016502 Italy (Palermo) R-YP311*…
Liftover GRCh37 to hg38 1kg/GATK.
I need to lift over a few variants from GRCh37 to hg38 1kg/GATK. UCSC liftOver does not have this reference genome version available. I have tried with the standard hg38 but the conversions are wrong. Where can I find GRCh37 to hg38 1kg/GATK chain files or…
Managing your data (BAM, VCF, sample, phenotype) with RDF and SPARQL.
Tutorial: 13 years after "How Do You Manage Your Files & Directories For Your Projects?", I wrote a tutorial about how I now manage my data (BAM, VCF, sample, phenotype, reference, etc.) and how to link everything…
How To Get Bed File Containing Exons Of Canonical Transcripts And Their Corresponding Gene Symbols
Download a BED file for the canonical transcripts using the UCSC Table Browser. Set track: UCSC Genes; table: knownCanonical; output format: "select fields from primary and related tables"; then press "get output". Select fields from hg19.knownCanonical: chrom, chromStart, chromEnd, transcript. Select fields from hg19.kgXref: geneSymbol. Press "get output". The file UCSC_canonical.bed looks like:…
RNA-sequencing and bioinformatics analysis | COPD
Introduction COPD, a common preventable and treatable disease characterized by persistent airflow limitation and respiratory symptoms, is associated with exposure to harmful environments. COPD is currently the third leading cause of death globally. The high incidence and mortality of COPD, which seriously threaten human health, represent a public health problem…
Z5989 – YFull YTree Info
E-Z5989 – YFull YTree Info SNPs currently defining E-Z5989 Z5989(H) H Sample ID Country / Language Info Ref File Testing company Statistics Status HG02461 Gambia, The (Western) / Mandinka E-Z5989* —— Hg19 .BAM Scientific —— GMJOL5309977 Gambia, The (Western) E-Z5990* —— Hg38 .BAM Scientific 5X, 23.1 Mbp, 100 bp…
over-presented sequence in negative control
Hello, everyone. I have this sequence in the negative control of my miRNA-seq run: TGGTAATACGACGTACTTAGTGT. It did not map to any references (miRNA, tRNA, rRNA, piRNA, mRNA, hg19, bacteria). I use the QIAseq miRNA Library kit and a NextSeq 550. Any idea what it might be???…
Clinical efficiency of mNGS in sputum for pathogen detection
Introduction Lower respiratory infections (LRIs) are the world’s most deadly communicable disease and rank fourth among the leading causes of death globally according to the World Health Organization (WHO) 2019 report.1,2 LRIs include hospital-acquired pneumonia (HAP), community-acquired pneumonia (CAP), bronchiolitis, bronchitis, and tracheitis.3,4 Immunocompromised patients have a higher risk of…
RnBeads Differential Methylation
Hi, I am trying to run a differential methylation analysis with RnBeads, as I have done previously in a similar fashion. This time I am running it on my HPC, and it seems to run fine until it reaches the differential methylation step. See the error output…
WES CNV analysis
Hi, I am new to CNV analysis and a beginner in R. I am trying to call germline CNVs from exome data using ExomeDepth. I only have the raw data with the hg38 reference. If you have ExomeDepth scripts that run on the hg38 reference, kindly share…
Annovar doesn't output CADD scores
Hi, I followed the Annovar tutorial with the default datasets (avsnp147, ExAC and dbnsfp30a). The tutorial can be found here: annovar.openbioinformatics.org/en/latest/user-guide/startup/ The resulting VCF contained all the expected format and data, including CADD scores. Then I decided to repeat this using the gnomad211_exome, avsnp150, and dbnsfp42c datasets instead of those above, but…
Gut microbial carbohydrate metabolism contributes to insulin resistance
Study participants and data collection The study participants were recruited from 2014 to 2016 during their annual health check-ups at the University of Tokyo Hospital. The individuals included both male and female Japanese individuals aged from 20 to 75 years. The exclusion criteria were as follows: established diagnosis of diabetes,…
Multivariate Analysis of Transcript Splicing (MATS)
Install rMATS: Add the Python directory to the $PATH environment variable. Add the bowtie and tophat directories to the $PATH environment variable. Add the samtools directory to the $PATH environment variable. Obtain a bowtie index for the genome in either of the following two ways: build your own bowtie index using bowtie-build from…
GATK AnnotateVcfWithBamDepth returns zero DP for all variants in VCF
Dear all, I am using GATK (v4.1.9.0) AnnotateVcfWithBamDepth to get the DP for all variants in ClinVar VCF in a retina RNA-seq BAM file. However, the tool returns zero depth for all variants in the VCF, even though I checked multiple variants in IGV and I saw that they are…
illuminahumanmethylation450k annotation for hg38
I am new to R. I am trying to do a methylation analysis for CRC samples (.idat files) downloaded from GDC. Here's what I am doing: query_met <- GDCquery(project = c("TCGA-COAD", "TCGA-READ"), data.category = "DNA Methylation", data.type = "Masked Intensities", platform = "Illumina Human Methylation…
Bioconductor – wavClusteR
DOI: 10.18129/B9.bioc.wavClusteR This package is for version 3.12 of Bioconductor; for the stable, up-to-date release version, see wavClusteR. Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data Bioconductor version: 3.12 The package provides an integrated pipeline for the analysis of PAR-CLIP data. PAR-CLIP-induced transitions are…
Bioconductor – vulcan
DOI: 10.18129/B9.bioc.vulcan This package is for version 3.15 of Bioconductor; for the stable, up-to-date release version, see vulcan. VirtUaL ChIP-Seq data Analysis using Networks Bioconductor version: 3.15 Vulcan (VirtUaL ChIP-Seq Analysis through Networks) is a package that interrogates gene regulatory networks to infer cofactors significantly enriched in a…
Advances in methylation analysis of liquid biopsy in early cancer detection of colorectal and lung cancer
Study participants Whole blood samples were collected from 327 participants consisting of 102 with colorectal cancer, 99 with lung cancer, and 126 healthy controls. After excluding 6 patients who withdrew consent to participate and two patients with QC-failed samples, the final analysis included 96 patients with colorectal cancer, 95 with…
Long-molecule scars of backup DNA repair in BRCA1- and BRCA2-deficient cancers
Pan-cancer WGS data sources GRCh37/hg19 BAM alignments for 2,489 primary tumour and matched normal whole-genome sequencing data were obtained as previously described18. In brief, 989 tumour–normal (T/N) pairs were obtained from The Cancer Genome Atlas (TCGA) Research Network (Genomic Data Commons at portal.gdc.cancer.gov/, accession: phs000178.v11.p8). Additional WGS data were obtained for 874 T/N pairs…
Alternatives To Liftover
Has anybody had any success with any tools other than LiftOver or NCBI’s Genome Remapping Service for mapping/translating reference genome positions? My experience with LiftOver has been less than satisfactory and NCBI does not seem to offer any local version of their tool. Other than BLASTing…
Bioconductor – gwascat (development version)
DOI: 10.18129/B9.bioc.gwascat This is the development version of gwascat; for the stable release version, see gwascat. representing and modeling data in the EMBL-EBI GWAS catalog Bioconductor version: Development (3.18) Represent and model data in the EMBL-EBI GWAS catalog. Author: VJ Carey <stvjc at channing.harvard.edu> Maintainer: VJ Carey <stvjc at…
how to compare mapping of WES samples to human pangenome?
Hi, I’m still trying to wrap my head around the new human pangenome reference and would like some advice on how to go about analyzing some of the WES data (hg38/hg19 baits) that I currently have. How should I calculate…