Tag: hg38

where to obtain hg19 reference transcriptome in gencode

Hi, I want to use salmon to generate an index and later use it again to get a table of counts that I can use for differential expression analysis with the hg19 version. For the hg38 reference genome I have had…

Chain file for Homo_sapiens_assembly38.fasta to liftover Nebula vcf

Which chain file is needed to convert positions in a VCF aligned to Homo_sapiens_assembly38.fasta into positions aligned to hg19.fa? The obvious solution, hg38ToHg19.over.chain.gz, did not work. I couldn’t find any information in the MEGABOLT bioinformatician guide from…

Heritable transcriptional defects from aberrations of nuclear architecture

Cell culture and cell line construction Cells were cultured at 37 °C in 5% CO2 atmosphere with 100% humidity. Telomerase-immortalized RPE-1 retinal pigment epithelium cells (CRL-4000, American Type Culture Collection), U2OS osteosarcoma cells (HTB-96, American Type Culture Collection) and derivative cell lines were grown in DMEM/F12 (1:1) medium without phenol red…

Differences in GTF files hg19 and hg38

Hello, I have a script for gene annotation with hg19. Now I’m trying to annotate using hg38. The problem is that I need the level column (level 1, level 2), and I can’t find it in the hg38 GTF file. Do you have any…

Gviz Coverage Plots

I have two BAM files from single-cell RNA sequencing, mapped to the reference genome using CellRanger. I can view them in IGV, and in one particular region the pattern of mapped reads differs between the two BAM files, but when I try the…

Help with running ATAC-seq using Encode pipeline

I am very new to the ATAC-seq pipeline from ENCODE and am trying to run it on an HPC, but I am not sure of the steps after reading their instructions. github.com/ENCODE-DCC/atac-seq-pipeline Here is my json file: { "atac.title" : "atac)", "atac.description" : "Encode", "atac.pipeline_type" : "atac", "atac.align_only" : false, "atac.true_rep_only" : false, "atac.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v4/hg38.tsv",…

Is hg38 on the multiz 30-way alignment inaccurate?

Hello, I have a general question: Is it possible that the multiz 30-way alignment is inaccurate? I am looking for specific hg38 sequences in the blocks (I am using BLAT-retrieved coordinates, with PHAST to parse the MAF blocks), and the…

hisat2 Error 137

Hi everybody, I am running hisat2 with the command below: hisat2 --dta -x /home/genetics/apps/Proj/Index/hg38/genom -1 SRR11573854_1P -2 SRR11573854_2P -S SRR11573854.sam -p 4 and I get this error: Killed (ERR): hisat2-align exited with value 137 Could you help me solve this problem? Thank you very much…
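
An exit value above 128 conventionally means the process was terminated by signal number (value - 128), so 137 would correspond to signal 9 (SIGKILL). On Linux this is often, though not always, the kernel's out-of-memory killer, which would fit the "Killed" message. A minimal sketch for decoding such a status; the function name `describe_exit_status` is hypothetical and not part of hisat2:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Explain a shell-style exit status such as the 137 reported above."""
    if status == 0:
        return "success"
    if status > 128:
        # Shell convention: 128 + N means "terminated by signal N".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error code {status}"


print(describe_exit_status(137))
```

If the status does decode to SIGKILL, checking the memory available to the job against the size of the index is a reasonable first step.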

Difference between USCS exon coordinates and ensembl

Hello, I am trying to extract the coordinates of the exons in numerous genes, for example APC, using the MANE transcript NM_000038.6. When downloading just the exons for hg38 from the UCSC Table Browser, I get these coordinates (example, exon 1): chr5 112737884 112737925 However, when I look…
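
One common source of off-by-one differences of this kind is the coordinate convention: BED-style output uses 0-based, half-open intervals, whereas Ensembl displays 1-based, inclusive coordinates. The excerpt is truncated before the Ensembl values, so the sketch below only assumes that the UCSC numbers came from BED output; the `Interval` type and function names are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def bed_to_one_based(iv: Interval) -> Interval:
    """0-based half-open (BED) -> 1-based inclusive: only the start shifts."""
    return Interval(iv.chrom, iv.start + 1, iv.end)


def one_based_to_bed(iv: Interval) -> Interval:
    """1-based inclusive -> 0-based half-open (BED)."""
    return Interval(iv.chrom, iv.start - 1, iv.end)


# Exon 1 coordinates as quoted in the post, assumed to be BED-style.
exon1_bed = Interval("chr5", 112737884, 112737925)
print(bed_to_one_based(exon1_bed))
# -> Interval(chrom='chr5', start=112737885, end=112737925)
```

The interval length is unchanged by the conversion: end - start in BED equals end - start + 1 in 1-based coordinates.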

Bioconductor – REMP

DOI: 10.18129/B9.bioc.REMP. This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see REMP. Repetitive Element Methylation Prediction. Bioconductor version: 3.13. Machine learning-based tools to predict DNA methylation of locus-specific repetitive elements (RE) by learning surrounding genetic and epigenetic information. These tools provide genomewide…

How to determine the exact version of hg38 if I have only the FASTA file

I have a FASTA file which contains the hg38 assembly. It contains the primary contigs, alt contigs, decoy, HLA, and mito. How do I determine the exact version of hg38 based on the FASTA? Here some…
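
One way to approach this kind of question is to fingerprint the FASTA (contig names, lengths, and per-sequence MD5 checksums) and compare the result against the values published for candidate releases, for example the M5 tags of a sequence dictionary, which are defined over the uppercased sequence. A minimal sketch, not a definitive method; the function name `fasta_fingerprint` is hypothetical:

```python
import hashlib
from typing import Dict, Iterable, Tuple


def fasta_fingerprint(lines: Iterable[str]) -> Dict[str, Tuple[int, str]]:
    """Map each sequence name to (length, MD5 of the uppercased sequence)."""
    result: Dict[str, Tuple[int, str]] = {}
    name = None
    length = 0
    digest = hashlib.md5()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                result[name] = (length, digest.hexdigest())
            # Keep only the first word of the header as the contig name.
            name = line[1:].split()[0]
            length = 0
            digest = hashlib.md5()
        elif name is not None:
            seq = line.upper()
            length += len(seq)
            digest.update(seq.encode("ascii"))
    if name is not None:
        result[name] = (length, digest.hexdigest())
    return result


# Usage with a file on disk (the path is a placeholder):
# with open("genome.fa") as handle:
#     for contig, (size, md5) in fasta_fingerprint(handle).items():
#         print(contig, size, md5)
```

Because the file is streamed line by line, this works on a full genome without loading it into memory.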

Bioconductor – gpart

DOI: 10.18129/B9.bioc.gpart. This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see gpart. Human genome partitioning of dense sequencing data by identifying haplotype blocks. Bioconductor version: 3.13. We provide a new SNP sequence partitioning method which partitions the whole SNP sequence based on…

How to extract phastcons score for all protein coding genes and lncRNAs using GenomicScores?

There are multiple ways in which you can do this; I’ll show you one. First, you need to fetch the gene annotations you want. You’ve mentioned Gencode. One version of Gencode annotations is available in the TxDb.Hsapiens.UCSC.hg38.knownGene annotation package: library(TxDb.Hsapiens.UCSC.hg38.knownGene) txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene Next, you need to fetch the coordinates of the genes. You haven’t mentioned whether you want to calculate the average conservation across gene boundaries or, for example, the average conservation of spliced exons. I will assume you want the latter: exonsbygene <- exonsBy(txdb, by="gene") class(exonsbygene) [1] "CompressedGRangesList" attr(,"package") [1]…

DanMAC5: a browser of aggregated sequence variants from 8,671 whole genome sequenced Danish individuals | BMC Genomic Data

Demographics Data from three studies were included: Dan-NICAD: 1,649 individuals with symptoms of obstructive coronary artery disease, predominantly chest pain, undergoing coronary computed tomography angiography. In total, 52% were females, the mean age was 57 years (+/- 9 SD), median coronary artery calcium score were 0 [0–82] and 24% of…

Circulating miRNA expression in long-standing type 1 diabetes mellitus

Participants This is an observational case–control study, carried out in adult patients who attended the Endocrinology and Nutrition Service of the Central University Hospital of Asturias, between June 2019 and December 2021. Written informed consent was obtained from all participants and the study was conducted in accordance with the principles…

Deferentially expressed gene with high log2foldchange by DESeq2; but not meaningful at the individual level

Hi all, I am working with human RNA-Seq data (24 cases, 20 controls) to find differentially expressed genes. My RNA-Seq data is unstranded. Here are the commands that I used to align the fastq files: ls *_1P.fastq.gz | parallel --bar -j8 'R2=$(echo {} | sed s/_1/_2/) && out=$(echo {} |…

Chipseq data peak calling issue

Hi, I’m trying to analyse ChIP-seq data. I have 3 samples: sample1, sample2 and input. I have done QC and then alignment using Bowtie. After that I used samtools to get BAM files. Then I used Picard for duplicate removal. Now I…

Genes’ fpkm values through cufflink

Hi, I am a newbie to RNA-seq data analysis. I have to identify differentially expressed genes (DEGs) between human and chimpanzee in a tissue type. I have comparable RNA-seq experiment data (reads/fastq) for the two species. Each species has 2 biological replicates(each with three technical replicates) so six runs per…

Error in .Call2(“C_solve_user_SEW”, refwidths, start, end, width, translate.negative.coord

Dear guys, Any solutions for this error when trying to construct the trinucleotideMatrix? test_maf.tnm = trinucleotideMatrix(maf = test_maf, ref_genome = "BSgenome.Hsapiens.UCSC.hg38") ## some site/region cause the error -Extracting 5' and 3' adjacent bases -Extracting +/- 20bp around mutated bases for…

bash – How to use for-loops and if-else statements to iteratively run a software on command line

For all the .mcool files, I extract the id, which is the string before the .mcool extension. If the substring of id after the last / is either 5000, 10000, or 50000 (i.e., 5k, 10k, or 50k), I want to run predictSV from EagleC. For a single .mcool file, as…
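
The selection rule described above can be sketched as follows, in Python rather than bash. This only illustrates the filtering logic; the helper names `mcool_id` and `should_run` are hypothetical, and the actual predictSV invocation is left out because its arguments are not shown in the excerpt.

```python
from typing import Optional

# Resolutions named in the post: 5k, 10k and 50k.
ALLOWED_RESOLUTIONS = {"5000", "10000", "50000"}


def mcool_id(path: str) -> Optional[str]:
    """Return the part of the path before the .mcool extension, or None."""
    suffix = ".mcool"
    if not path.endswith(suffix):
        return None
    return path[: -len(suffix)]


def should_run(path: str) -> bool:
    """True if the substring of the id after the last '/' is an allowed resolution."""
    file_id = mcool_id(path)
    if file_id is None:
        return False
    return file_id.rsplit("/", 1)[-1] in ALLOWED_RESOLUTIONS


for candidate in ["data/5000.mcool", "data/25000.mcool", "data/10000.txt"]:
    print(candidate, should_run(candidate))
```

Each selected path could then be handed to the external tool, for instance through `subprocess.run`.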

Instructions Associated questions refer to a gene in

Transcribed image text: Instructions Associated questions refer to a gene in the human genome. The screenshot below should help you find the location of the gene in the human genome (hg38). Navigate to this gene in the UCSC Genome Browser (not GEP mirror site) to answer attached questions. Epo-last exon…

open-cravat: variant annotation tool

open-cravat is an open-source platform for rapidly developing, using, and disseminating variant annotation tools. It can handle an unlimited number of variants in VCF format input files as well as its own input format, and produces tab-separated text output files and Excel spreadsheets. It is command-line-based…

Perl debugging help – miRWoods

Hello, I was wondering if anyone with Perl experience could help me debug miRWoods? I tried reaching out to the authors via e-mail with no response, and issues on GitHub are turned off, so I’d be super grateful if anyone could provide any insight. When I run miRWoods I get…

r – Plotting infercnv results

I’m working with matched single cell data, where we have treated and untreated samples for the same patient. I ran CNV analysis using the infercnv package. I’ve followed the tutorial: # data matrix counts_matrix <- scData@assays$RNA@counts meta = data.frame(labels = Idents(scData), row.names = names(Idents(scData))) unique(meta$labels) # check the cell labels…

Bioconductor – SNPlocs.Hsapiens.dbSNP149.GRCh38

DOI: 10.18129/B9.bioc.SNPlocs.Hsapiens.dbSNP149.GRCh38. This package is for version 3.15 of Bioconductor; for the stable, up-to-date release version, see SNPlocs.Hsapiens.dbSNP149.GRCh38. SNP locations for Homo sapiens (dbSNP Build 149). Bioconductor version: 3.15. SNP locations and alleles for Homo sapiens extracted from NCBI dbSNP Build 149. The source data files used for…

how can I generate a VCF (in hg38 coords) of differences between hg38 and CHM13?

I downloaded s3-us-west-2.amazonaws.com/human-pangenomics/pangenomes/freeze/freeze1/minigraph/hprc-v1.0-minigraph-grch38.gfa.gz which contains hg38, chm13, and other assemblies, and now am trying to use vg to generate a VCF with the variants in CHM13 relative to hg38. After converting to vg format, by running vg convert <(gunzip -c hprc-v1.0-minigraph-grch38.gfa.gz) > hprc-v1.0-minigraph-grch38.vg, I tried a few different variations of…

multi-mapping reads settings in Rsubread or Rsubjunc

Hi All, I am using Rsubjunc to process my RNA-seq data for DESeq2 and differential splicing analysis. I have a question about how to set multi-mapping read alignment in the Rsubjunc R package. The command I used is attached to the end…

Methanol fixation is the method of choice for droplet-based single-cell transcriptomics of neural cells

hiPSC cell culture and differentiation hiPSCs were maintained on 1:40 matrigel (Corning, #354277) coated dishes in supplemented mTeSR-1 medium (StemCell Technologies, #85850) with 500 U ml−1 penicillin and 500 mg ml−1 streptomycin (Gibco, #15140122). For the differentiation of cortical neurons the protocol described previously21 was followed with slight modifications. Briefly, hiPSC colonies were seeded…

how to input bedpe file in IGV

Hello, I have a bedpe file with interacting regions and their interaction scores. I want to view this bedpe file in IGV, but it shows an error when loading. The bedpe file looks like this: When I input the file in IGV: File>…

bash – CNV Kit ` from . import commands ImportError: cannot import name ‘commands’ from ‘__main__’`

I am trying to run some code for my colleague in bash: /path/to/cnvkit.py batch /path/to/my/folder/with/bams/*.bam \ --normal --targets ${bed_file} \ --fasta path/to/my/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta \ --output-reference /path/to/my/CD_BATCH1_reference.cnn \ --output-dir /path/to/my/Group_1 --scatter However, I keep getting this truly peculiar error: Traceback (most recent call last): File "/path/to/cnvkit.py", line 4, in <module> from ….

Removing multi-variant records from vcf file

I am using gatk ASEReadCounter to get the read counts per allele. To do so, I used the following command: gatk ASEReadCounter -R /path_to_genome/hg38_genome/GRCh38.p13.genome.fa -I sample.sorted.bam -V sample.vcf.gz -O output.table I used GATK4, but I realized that in my VCF at position chr1:1574033, there…

Help with error in GATK variant calling

Hi all, I tried some reference genomes such as Homo_sapiens_assembly38.fasta and Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa, but I still got the error below. Would you have a suggestion? Thank you so much. The link in the error message doesn’t work. gatk BaseRecalibrator -I Library_1Aligned.out.sorted.bam -R…

Assembly Table.

A. mellifera (Apr 2011 Amel_4.5/amel5); A. carolinensis (May 2010 AnoCar2.0/anoCar2); A. thaliana (Feb 2011 TAIR10/araTha1); B. taurus (Aug 2006 Btau_3.1/bosTau3); B. taurus (Nov 2014 Bos_taurus_UMD_3.1.1/bosTau8); C. familiaris (May 2005 CanFam2.0/canFam2); C. familiaris (Sep 2011 CanFam3.1/canFam3); C. porcellus (Feb 2008 Cavpor3.0/cavPor3); C. elegans (Oct 2010 WBcel215/ce10); C. elegans (Feb…

how to look at interacting SNP for a gene using Hic contact map

Hello everyone, I have one Hi-C contact map file corresponding to a particular cell type (brain frontal cortex). I want to look at how many SNPs are located within 40 kb of this gene and are interacting…

feature count command error

Hi everyone, I ran this featureCounts command: featureCounts -T 4 -p -a gencode.v43.basic.annotation.gtf -o featurecount.txt *.bam which gave me this error: Process BAM file UI_E2_sorted.bam.bam… || || Paired-end reads are included. || || The reads are assigned on the single-end mode. || || Total alignments…

Help wanted for a struggling bioinfmatician! GATK Variantrecalibration

Hello, an answer to any of the following questions will be much appreciated! I’m playing around with GATK’s VariantRecalibration tool. I have a few questions for which I can’t find information that I understand. 1) My tranches plot has no False Positives (see provided image); did something go wrong…

Convert hg19 coordinates to hg38 coordinates

Hello everyone! I would like to convert hg19 coordinates to hg38 coordinates. The problem is that I don’t really have much information to do so. For example: for the gene KLHL7 in hg19, the start is 23 145 353 and the end…

Low SNP Overlap with Michigan 1KG and TopMed reference panel

I extracted three samples (HG02024 – HG02026) from the 1000 Genomes Project’s 30x alignment files, employing the Genome Analysis Toolkit (GATK) best practice pipeline. This process involved performing base quality score recalibration, identifying and removing duplicate reads, utilizing the HaplotypeCaller to generate a genomic VCF (gVCF) file, and calling variants…

Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits

ARG-Needle and ASMC-clust algorithms We introduce two algorithms to construct the ARG of a set of samples, called ARG-Needle and ASMC-clust. Both approaches leverage output from the ASMC algorithm11, which takes as input a pair of genotyping array or sequencing samples and outputs a posterior distribution of the TMRCA across…

Using the 27-primates UCSC Multiz Alignment

Hello, I am attempting to use the 27-primates UCSC multiz alignment data so that I can find an orthogonal sequence (across different mammals) to a human reference sequence (whether an ORF, a transcript, etc). I downloaded all the data in the maf folders…

Bioconductor – OutSplice (development version)

DOI: 10.18129/B9.bioc.OutSplice. This is the development version of OutSplice; to use it, please install the devel version of Bioconductor. Comparison of Splicing Events between Tumor and Normal Samples. Bioconductor version: Development (3.17). An easy to use tool that can compare splicing events in tumor and normal tissue samples using…

Bioconductor – Ularcirc

DOI: 10.18129/B9.bioc.Ularcirc. This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see Ularcirc. Shiny app for canonical and back splicing analysis (i.e. circular and mRNA analysis). Bioconductor version: 3.13. Ularcirc reads in STAR aligned splice junction files and provides visualisation and analysis tools…

Download the promoter, enchancer, TSS , 3 prime, exon and intron positions of all hg38 genes

Hi All, I would like to download the promoter, enhancer, exon, intron, 3-prime and 5-prime positions of all genes from the human genome hg38 version. I have seen a couple of pieces of information in the…

Human hg38 chr10:21,513,475-21,525,682 UCSC Genome Browser v446

Seq2science ChIP-seq hub: ChIP-seq. Mapping and Sequencing tracks: Base Position, p14 Fix Patches (updated), p14 Alt Haplotypes (updated), Assembly, Centromeres, Chromosome Band, Clone Ends, Exome Probesets, FISH Clones, Gap, GC Percent, GRC Contigs, GRC Incident, Hg19 Diff, INSDC, LiftOver & ReMap, LRG Regions, Mappability, Problematic Regions, Recomb… (new)

Exon vs Transcript in featurecounts for RNA Seq

I am counting reads with featureCounts after aligning RNA-seq data to hg38 using STAR. When I run featureCounts with "-t exon" (i.e. using the exon feature type in the 3rd column of the GTF file), I get generally poor results, with less than 50% being assigned and greater than 50% being…

excluderanges: exclusion sets for T2T-CHM13, GRCm39, and other genome assemblies

doi: 10.1093/bioinformatics/btad198. Online ahead of print. Affiliations: 1 Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, 23298, USA. 2 Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC 27514, USA. 3 Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill,…

Help with error makeTxDbFromGFF in GenomicFeatures package

Hi all, Would you have a suggestion for the error below? Thank you so much. txdb <- makeTxDbFromGFF(opt$gtf) Error in .detect_file_format(file) : Invalid ‘file’. Must be a path to a file, or an URL, or a connection object, or a GFF3File or GTFFile object. opt $cores [1] 1 $help [1]…

Ex vivo prime editing of patient haematopoietic stem cells rescues sickle-cell disease phenotypes after engraftment in mice

Optimizing prime editing systems for HSPCs We previously reported the use of prime editing to correct HBBS by plasmid transfection in HEK293T cells containing the SCD mutation, reaching up to 58% efficiency (Fig. 1a)24. In contrast to HEK293T cells, HSPCs are difficult to transfect with plasmid DNA but are amenable…

The Biostar Herald for Monday, April 17, 2023

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…

Issue With CRAM -> BAM -> FASTQ Conversion

Please help! I am trying to obtain fastq files from the GDSC; all we have in the lab is CRAM files. Unfortunately, the reference genome seems not to exist when pulled from an online source. I have attempted to use the…

STRavinsky STR database and PGTailor PGT tool demonstrate superiority of CHM13-T2T over hg38 and hg19 for STR-based applications

doi: 10.1038/s41431-023-01352-6. Online ahead of print. Affiliations: 1 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel. 2 Genetics Institute, Soroka Medical Center, Beer Sheva, Israel. 3 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty…

Pick a human gene of your interest. Using

Transcribed image text: Pick a human gene of your interest. Using bioinformatics tools on the UCSC Genome Browser (select GRCh38/hg38 assembly), identify regions that can be regulated through epigenetics mechanisms (DNA methylation, histone marks, etc.). Highlight these regions and explain mechanisms that may be important in the regulation of this…

Annovar file

Hello, my avinput.exonic variant function file and avinput.hg38_cytoband file in ANNOVAR are 0 bytes. What is the reason for this? Please help; what should I do? I’m learning to use the ANNOVAR tool. Can you show the first 10 lines of your new.avinput file?…

use ROSE to identify super enhancer

Hey everyone, I want to use ROSE to identify super enhancers and to see whether they differ after some treatment in a lung cancer cell line. I see that this is the typical use: [user@cn3107 ~]$ ROSE_main.py -h Usage:…

Specific recognition of an FGFR2 fusion by tumor infiltrating lymphocytes from a patient with metastatic cholangiocarcinoma

Introduction Cholangiocarcinoma (CC) is a form of gastrointestinal cancer that originates from the epithelium of either intrahepatic or extrahepatic bile ducts. It accounts for approximately 3% of all gastrointestinal cancers, with reported incidence of one to two cases per 100,000 persons per year in the USA (and much higher incidence…

hg38 Ig regions

Hi, I’m looking for the immunoglobulin region coordinates in the hg38 assembly. I want to exclude them from my CNV analysis. I know the hg19 regions, but I do not want to just lift them over. Many thanks. You can use Ensembl’s BioMart…

SureSelect Clinical Research Exome V1 hg38

Hi, I’m inheriting an old project where whole exome sequencing was done using hg19, and now they want hg38. The capture kit was Agilent SureSelect Clinical Research Exome V1. I noticed this post, Bed For Agilent Sureselect All Exon Kits, but I could only…

Index of /goldenPath/hg38/vsTurTru2

This directory contains alignments of the following assemblies: target/reference: Human (hg38, Dec. 2013 (GRCh38/hg38), GRCh38 Genome Reference Consortium Human Reference 38 (GCA_000001405.15)); query: Dolphin (turTru2, Oct. 2011 (Baylor Ttru_1.4/turTru2), Baylor College of Medicine Ttru_1.4 (NCBI project 20365, GCA_000151865.2, WGS ABRN02)). Files included in this directory:…

PF4088 – YFull YTree Info

Sample ID Country / Language Info Ref File Testing company Statistics Status HGDP01069 Italy (Cagliari) / Sardinian I-PF4088* —— Hg38 .BAM Scientific 18X, 23.6 Mbp, 151 bp YF065974 Spain (Valencia / València) I-Y137878* —— Hg38 .BAM FTDNA (Y700) 39X, 18.4 Mbp, 151 bp YF072915 United Kingdom (Hampshire) I-Y25699 —— Hg38…

How to add attributes (i.e. gene_id, transcipt_id, exon_id, etc.) annotation from .bed file onto VCF?

I’m trying to annotate genes onto a VCF file with bcftools. My annotation file is a .bed file that originally was a hg38 UCSC knownGene gtf file, converted by BEDOPS: hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/ Original GTF file: chr1 11868 12227 . . + knownGene exon . gene_id “ENST00000456328.2”; transcript_id “ENST00000456328.2”; exon_number “1”; exon_id…

Download stats for workflow package BSgenome.Hsapiens.UCSC.hg38

This page was generated on 2023-04-01 01:19:13 -0400 (Sat, 01 Apr 2023). BSgenome.Hsapiens.UCSC.hg38 home page: release version, devel version. Number of downloads for workflow package BSgenome.Hsapiens.UCSC.hg38, year by year, from 2023 back to 2015 (years with no downloads are omitted): 2023 Month Nb of distinct IPs Nb of downloads…

Integration of RNA seq data aligned to different reference genome versions

Hi all, I would like to integrate a bulk RNA-seq and an scRNA-seq dataset. The data was already provided as count matrices, but unfortunately, the bulk RNA-seq data was aligned to hg38 whereas the scRNA…

Correct script for featurecounts in Rsubread

I am new to R and RStudio but have been trying to work through different examples using Rsubread for my data. I have tried reading vignettes and manuals prior to posting here but I am stuck and could really use some advice. I have 7 paired-end, fastq files from Illumina…

1000 genomes hg38 with dbSNP rsid

Hi, Does anyone know where I can download the latest version of 1000 Genomes, on build hg38, in VCF format (or PLINK format), that also contains the dbSNP rsID in the VCF ID field? I looked at the IGSR website, dbSNP, UCSC, etc. So…

RBFOX2 modulates a metastatic signature of alternative splicing in pancreatic cancer

Patient samples and RNA-seq analysis RNA-seq data from 395 patients with PDA was obtained from the University Health Network (Toronto), Sunnybrook Health Sciences Centre (Toronto), Kingston General Hospital (Kingston), McGill University (Montreal), Mayo Clinic (Rochester), Massachusetts General Hospital (Boston) and Sheba Medical Center (Tel Aviv) and has been described previously1,12,13,14….

TxDB.Hsapiens.UCSC.hg38.knownGene with locateVariants() identifying SNPs from various chromosome being part of the same gene

I am trying to annotate a list of SNPs using the hg38 genome (knownGene) and locateVariants(). The program is able to successfully run and provide “GeneIDs” for several of the loci. However, some GeneIDs are applied to SNPs in completely different regions and on completely different chromosomes. When I cross…

Ensembl Hg38 dna, dna_rm, and dna_sm

Hello, I want to ask a basic question. I read the readme on the Ensembl FTP for the hg38 reference, and it seems there are several types of DNA file: dna, dna_sm, and dna_rm. If I want to…

No BSgenome for Human HG19 or HG38 with R version 4.2.2 and Bioconductor 3.16

Hello, I’m currently trying to install BSgenome.Hsapiens.UCSC.hg19 and BSgenome.Hsapiens.UCSC.hg38 through Bioconductor on my desktop (Windows). My current R version is 4.2.2 and Bioconductor is 3.16. My command is this…

Continue Reading No BSgenome for Human HG19 or HG38 with R version 4.2.2 and Bioconductor 3.16

What’s the correct way to map to hg38 with alternative contigs?

I’m trying to map to hg38, and I’m a bit confused about how to handle alternative (random, fix, …) contigs. From the bwa documentation there seems to be a proposed way (github.com/lh3/bwa/blob/master/README-alt.md), but there are some obstacles. 1. If you use hg38 directly (I downloaded it from hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/latest/hg38.fa.gz). The log from bwa shows…

Continue Reading What’s the correct way to map to hg38 with alternative contigs?

query UCSC db for CDS coordinates (Gencode)

Hi, I’m not sure how to obtain CDS coordinates from GENCODE using mysql on UCSC. Specifically, I’d like to obtain one bed per record of CDS like from the website. I tried to query …using the following command but the db only…

Continue Reading query UCSC db for CDS coordinates (Gencode)

Bioconductor – crisprDesign (development version)

DOI: 10.18129/B9.bioc.crisprDesign. This is the development version of crisprDesign; for the stable release version, see crisprDesign. Comprehensive design of CRISPR gRNAs for nucleases and base editors. Bioconductor version: Development (3.17). Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and…

Continue Reading Bioconductor – crisprDesign (development version)

BigWig format (Cut&Run)

I generated summit.bed files from my CUT&RUN experiment, and I followed the steps below to get and visualize the bigwig files. 1) I used awk to get a bedgraph: awk '{printf "%s\t%d\t%d\t%2.3f\n", $1,$2,$3,$5}' myBed.bed > myFile.bedgraph 2) Sorting the bed files: sort -k1,1 -k2,2n myFile.bedgraph > myFile_sorted.bedgraph 3) Chrom.size:…

Continue Reading BigWig format (Cut&Run)
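
The awk and sort steps quoted in the BigWig excerpt above can be sketched in Python roughly as follows. This is only an illustration, not the poster's pipeline: the function name `bed_to_bedgraph` is hypothetical, the score is assumed to sit in column 5 (as in the quoted awk command), and the sort uses plain byte-order chromosome names, which corresponds to `sort` under `LC_ALL=C`.

```python
def bed_to_bedgraph(bed_lines):
    """Keep columns 1, 2, 3 and 5 of each BED row, sorted by chromosome then start.

    Hypothetical helper mirroring the quoted awk + sort steps.
    """
    rows = []
    for line in bed_lines:
        fields = line.split()  # whitespace split, like awk's default
        # Skip blank lines and UCSC-style header lines.
        if not fields or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        chrom = fields[0]
        start, end = int(fields[1]), int(fields[2])
        score = float(fields[4])  # assumption: score is in column 5
        rows.append((chrom, start, end, score))
    # Equivalent of: sort -k1,1 -k2,2n (with byte-order chromosome names)
    rows.sort(key=lambda r: (r[0], r[1]))
    return [f"{c}\t{s}\t{e}\t{v:.3f}" for c, s, e, v in rows]
```

The returned lines could then be written to a file and passed, together with a chromosome sizes file, to a bedGraph-to-bigWig converter.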

STAR solo segmentation fault after ‘started Solo counting’

Dear all, I have some 10x v3 single cell rna seq fastq files that I am trying to map to hg38 human genome using STAR aligner and generate read counts. However, I am getting the following error and hope that some…

Continue Reading STAR solo segmentation fault after ‘started Solo counting’

Bioconductor – phastCons7way.UCSC.hg38

DOI: 10.18129/B9.bioc.phastCons7way.UCSC.hg38. This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see phastCons7way.UCSC.hg38. UCSC phastCons conservation scores for hg38. Bioconductor version: 3.13. Store UCSC phastCons conservation scores for the human genome (hg38) calculated from multiple alignments with 6 other vertebrate species. Author: Robert…

Continue Reading Bioconductor – phastCons7way.UCSC.hg38

Index of /goldenPath/hg38/vsMicMur3/reciprocalBest

This directory contains reciprocal-best netted chains for hg38-micMur3. – hg38.micMur3.rbest.net.gz: hg38-referenced recip.best net to micMur3. – hg38.micMur3.rbest.chain.gz: chains extracted from the recip.best net. These can be passed to the liftOver program to translate coords from hg38 to micMur3 through the recip.best net. – micMur3.hg38.rbest.net.gz: micMur3-referenced recip.best net….

Continue Reading Index of /goldenPath/hg38/vsMicMur3/reciprocalBest

Can not get bcftools norm to join biallelics into a multiallelic.

Is this the right way to use bcftools to join/merge biallelic records into a multiallelic? If so, it is not working. No errors, but it gives me the same file with my command added to the headers. Example…

Continue Reading Can not get bcftools norm to join biallelics into a multiallelic.

Very high coverage Nanopore alignment Hg38

Hi, I aligned my first Nanopore reads on the hg38 reference. Then, I used bedtools genomecov to get an idea about the mean coverage over the genome. I noticed some bases have a very high coverage (>20,000X), whereas the mean coverage is…

Continue Reading Very high coverage Nanopore alignment Hg38

Running accurate, comprehensive, and efficient genomics workflows on AWS using Illumina DRAGEN v4.0

Introduction The reduced cost of DNA sequencing technology has led to an exponential growth of raw sequencing data. To keep pace with this development, secondary analysis tools that can provide fast and accurate results in a cost-effective manner are needed to extract actionable genomic insights. Illumina’s DRAGEN™ (Dynamic Read Analysis for GENomics) addresses…

Continue Reading Running accurate, comprehensive, and efficient genomics workflows on AWS using Illumina DRAGEN v4.0

IJMS | Free Full-Text | Transcriptomic Analysis of CRISPR/Cas9-Mediated PARP1-Knockout Cells under the Influence of Topotecan and TDP1 Inhibitor

1. Introduction The synthesis of poly(ADP-ribose) (PAR) is an immediate response of cells to DNA damage catalyzed by poly(ADP-ribose) polymerases, which transfer ADP-ribose units from NAD+ onto target molecules [1]. PAR is a linear and branched polymer up to 200 units long that is covalently attached to the target proteins,…

Continue Reading IJMS | Free Full-Text | Transcriptomic Analysis of CRISPR/Cas9-Mediated PARP1-Knockout Cells under the Influence of Topotecan and TDP1 Inhibitor

Custom Annotaion file

Hi everyone, can anyone please guide me on how to generate an annotation file for the 5′ and 3′ UTRs and CDS (all in one GFF/GTF file) from an existing hg38 annotation file? I downloaded an annotation file from genome.ucsc.edu/cgi-bin/hgTables, but firstly it's in BED…

Continue Reading Custom Annotaion file

Does Parabricks support the GRCh38 RefSeq? – Parabricks

Hi there, I would like to know if Parabricks supports the GRCh38 reference sequence, as the GRCh38 RefSeq contains not only ATCG+N but also B, K, M, R, S, W, Y bases. I could not find any relevant information in the documentation, and the Homo_sapiens_assembly38.fasta provided by NVIDIA uses UCSC…

Continue Reading Does Parabricks support the GRCh38 RefSeq? – Parabricks

HTSeqGenie run error

Hi, I am running HTSeqGenie on both macOS and Linux with the test TP53 samples. Both gave me an error when reading the fastq files. It seems to have problems reading the fastq.gz files in each parallel process. Could anyone help me with this please? The errors are below: checkConfig.R/checkConfig.template:…

Continue Reading HTSeqGenie run error

interval_list for hg38

I was wondering if anyone could help me with this. I need to run the picard command CollectRnaSeqMetrics below: java -jar picard.jar CollectRnaSeqMetrics \ I=input.bam \ O=output.RNA_Metrics \ REF_FLAT=ref_flat.txt \ STRAND=SECOND_READ_TRANSCRIPTION_STRAND \ RIBOSOMAL_INTERVALS=ribosomal.interval_list and I need to create the interval_list for hg38. Is there a list…

Continue Reading interval_list for hg38

Can not generate tranches plot after VQSR

Hello! I have encountered a strange issue when trying to analyze VariantRecalibrator output. I cannot generate or find an R script file for the *.tranche output. At the same time I have no problems with scatter-plot generation and further import into RStudio….

Continue Reading Can not generate tranches plot after VQSR

The GDC Legacy Archive is retiring soon.

Attention GDC Users: The GDC Legacy Archive is retiring soon. SOME FILES WILL NO LONGER BE AVAILABLE! Please download any needed files as soon as possible. This will not impact data in the current GDC data portal (hg38), but will affect old…

Continue Reading The GDC Legacy Archive is retiring soon.

How should reference genome fasta files be distributed by UCSC?

Dear genomics-tools-users and Istvan Albert I work at UCSC and have a question on how to weigh consistency of links versus data updates. TLDR: Should UCSC change the main {hg19,hg38}.fa.gz when the GRC releases a new patch? Traditionally, the hg19 and hg38 fasta files were distributed at hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/ and hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/…

Continue Reading How should reference genome fasta files be distributed by UCSC?

making a BIGWIG from BAM file

Hello everyone, I have 50 BAM files, some of them single-end and some of them paired-end. Well, I want to make a single bigwig file by combining reads from all of these bam files. For this, I merged all bam files to a…

Continue Reading making a BIGWIG from BAM file

Using LiftOver to change genomic build

Hi, all – Two questions about using LiftOver: The .bed file changes after using LiftOver. Correct me if I’m wrong, but I can just use the .bim and .fam file from before LiftOver as those do not change? I have used LiftOver to…

Continue Reading Using LiftOver to change genomic build

Build/check report for BioC 3.14 annotations

All results for package EpiTxDb.Hs.hg38. This page was generated on 2022-04-13 06:00:07 -0400 (Wed, 13 Apr 2022). Hostname: nebbiolo2; OS: Linux (Ubuntu 20.04.4 LTS); Arch (*): x86_64; R version: 4.1.3 (2022-03-10) -- “One Push-Up”; Installed pkgs: 4324. Click on any hostname to see more info about the system (e.g. compilers) (*) as…

Continue Reading Build/check report for BioC 3.14 annotations

addGeneAnnotation.pl: not found

Hey community, I am facing a problem related to the quantification step of RNA-seq analysis. I ran this command: /home/aarmich/Documents/000TOOLS/homer/bin/analyzeRepeats.pl rna hg38 -count genes -d /home/aarmich/Documents/000TOOLS/homer/A549ctrl_RNAseq_hg38 /home/aarmich/Documents/000TOOLS/homer/A549TGFB_RNAseq_hg38 -noadj > lastt.txt and the terminal responds like this: missing NM_003718… missing NM_152604… missing NM_001114132… missing NR_149079……

Continue Reading addGeneAnnotation.pl: not found

Solved below code is giving error . please assist. To

The code below is giving an error. Please assist. To obtain the human protein sequences in multiple FASTA format, you can use the following script. I have written the code in Python: # Load necessary modules from Bio import SeqIO import gzip # Read in human genome file genome_file="hg38.fa.gz" with gzip.open(genome_file,…

Continue Reading Solved below code is giving error . please assist. To
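
The quoted script is cut off right after `gzip.open(genome_file,…`, so its actual continuation is unknown. As a minimal sketch of just the reading step, assuming a gzip-compressed FASTA such as `hg38.fa.gz` and using only the standard library instead of Biopython, something like the following could work. The function names `read_fasta` and `read_gzipped_fasta` are hypothetical and are not taken from the original post.

```python
import gzip


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open text handle in FASTA format."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first whitespace-delimited token as the record name.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def read_gzipped_fasta(path):
    """Open a .fa.gz file in text mode ("rt") and stream its records."""
    with gzip.open(path, "rt") as handle:
        yield from read_fasta(handle)
```

Opening the file in text mode (`"rt"`) matters here: in the default binary mode, `gzip.open` yields bytes, which a text-oriented FASTA parser will not handle. Each chromosome is held in memory as one string, so a whole-genome file needs several gigabytes of RAM if all records are kept.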

New Products Posted to GenomeWeb: Molecular Instruments, Bio-Rad, Telesis Bio, Molecular Health

Molecular Instruments HCR RNA-CISH Molecular Instruments (MI), a biotech spinout of the California Institute of Technology, has launched HCR RNA-CISH to enhance automated chromogenic in situ hybridization (ISH) workflows that depend on RNAscope. The HCR RNA-CISH kits provide better performance than any existing chromogenic in situ hybridization approach with double the turnaround…

Continue Reading New Products Posted to GenomeWeb: Molecular Instruments, Bio-Rad, Telesis Bio, Molecular Health

VQSR first step do not generating plots

Hello. After running my code for the first step of VQSR, I receive all files (including the R script for plots) except the PDF file that contains the plots. I do not get any errors from VQSR. My code: java -jar gatk-package-4.3.0.0-local.jar VariantRecalibrator \ -O /home/yousef/Desktop/VQSR_hg38/haplotype_hg38_SNP_Recal.vcf \ --resource:hapmap,known=false,training=true,truth=true,prior=15.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/Coverted_to_VCF/Coverted_broad_hg38_v0_hapmap_3.3.hg38.vcf' \ --resource:omni,known=false,training=true,truth=false,prior=12.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/Coverted_to_VCF/resources_broad_hg38_v0_1000G_omni2.5.hg38.vcf' \…

Continue Reading VQSR first step do not generating plots

Automated dbSNP lookup by rsID position, plus genome build liftover

Hola, just passing by to say 'hi'. Please post bugs / suggestions as comments to this tutorial. rsID to position GRCh38 cat rsids.list rs1296488112 rs1226262848 rs1225501837 rs1484860612 rs1235553513 rs1424506967 cat rsids.list | while read rsid ; do pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS\>.{0,15}' |…

Continue Reading Automated dbSNP lookup by rsID position, plus genome build liftover

How to install `BETA` on linux server?

Hello: I am trying to install BETA on HPC: cistrome.org/BETA/index.html#inst I create virtual environment using conda: conda create -n BETA python=2.7 conda activate BETA unzip BETA_1.0.7.zip cd BETA_1.0.7 python setup.py install cc -Wall -g motif.c misp.c -o misp -O3 -lz -lm In file included from motif.c:9: motif.h:18:10: fatal error: zlib.h:…

Continue Reading How to install `BETA` on linux server?

Error in VQSR first step

Hello. After running the following code for the first step of VQSR: java -jar gatk-package-4.3.0.0-local.jar VariantRecalibrator -O /home/yousef/Desktop/haplotype_hg38_SNP_Recal.vcf --resource:hapmap,known=false,training=true,truth=true,prior=15.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_hapmap_3.3.hg38.vcf' --resource:omni,known=false,training=true,truth=false,prior=12.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_1000G_omni2.5.hg38.vcf' --resource:1000G,known=false,training=true,truth=false,prior=10.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf' --resource:dbsnp,known=true,training=false,truth=false,prior=2.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_Homo_sapiens_assembly38.dbsnp138.vcf' --tranches-file /home/yousef/Desktop/Tranches.txt -an QD -an SOR -an MQ -an FS -an SOR -an ReadPosRankSum -an MQRankSum -V /home/yousef/Desktop/haplotype.hg38.vcf --max-gaussians 4 -R…

Continue Reading Error in VQSR first step

GATK HaplotypeCaller combine info from two BAM into one line in vcf (not divide into samples column)

Hi, I ran the GATK HaplotypeCaller and hope to get a file where each sample will have a column. My bam file list looks like this: input_bam/SRR8859080.bam input_bam/ENCFF477JTA_new.bam This is my GATK command: allele_chunk_file=rs_coord.vcf gatk_run_line="../bin/gatk-4.1.2.0/gatk" outfile=wgs_test_out.genotypes.vcf bam_file=wgs_test.bam.list genome_seq="../hg38.fa" intervals=wgs_test.bed $gatk_run_line \ HaplotypeCaller \ --reference $genome_seq \ --input $bam_file \ --genotyping-mode GENOTYPE_GIVEN_ALLELES \…

Continue Reading GATK HaplotypeCaller combine info from two BAM into one line in vcf (not divide into samples column)

Gene coverage issue in HG38

Hi, recently I checked the coverage of the gene OCA2, using both genome HG37 and hg38. Surprisingly, it was not covered (very poorly mapped) in HG38, and fully covered with good coverage in HG37 with all capture kits used. We are not able to…

Continue Reading Gene coverage issue in HG38

(1): download the human genome below is the open

(1): Download the human genome. Below is the open-source site for obtaining human genome data for learning purposes, so there is no issue accessing and downloading the data. The sample data is linked below; anyone can access it, and it is open source with no copyright, hence sharing: hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz (2): Use the…

Continue Reading (1): download the human genome below is the open

Solved Problem Statement: This question asks to write a

Problem Statement: This question asks to write a script to obtain all protein sequences coded in the human genome in the multiple FASTA format, using the RefSeq table obtained from the UCSC Table Browser and the human genome obtained from the given URL. The ID of each sequence should be…

Continue Reading Solved Problem Statement: This question asks to write a