Tag: hg38
where to obtain hg19 reference transcriptome in gencode
Hi, I want to use salmon to generate an index and later use it again to get a table of counts that I can use for differential expression analysis with the hg19 version. For the hg38 reference genome I have had…
Chain file for Homo_sapiens_assembly38.fasta to liftover Nebula vcf
Which chain file is needed to convert from positions in a vcf aligned to Homo_sapiens_assembly38.fasta to a vcf with positions aligned to hg19.fa? The obvious solution, hg38ToHg19.over.chain.gz, did not work. I couldn’t find any information in the MEGABOLT bioinformatician guide from…
Heritable transcriptional defects from aberrations of nuclear architecture
Cell culture and cell line construction Cells were cultured at 37 °C in 5% CO2 atmosphere with 100% humidity. Telomerase-immortalized RPE-1 retinal pigment epithelium cells (CRL-4000, American Type Culture Collection), U2OS osteosarcoma cells (HTB-96, American Type Culture Collection) and derivative cell lines were grown in DMEM/F12 (1:1) medium without phenol red…
Differences in GTF files hg19 and hg38
Hello, I have a script for gene annotation with hg19. Now I’m trying to annotate using hg38. The problem is that I need the level column (level 1, level 2), and I can’t find it in the hg38 GTF file. Do you have any…
Gviz Coverage Plots
I have two BAM files from single-cell RNA sequencing, mapped to the reference genome using CellRanger. I can view them in IGV, and there is a particular region where the pattern of mapped reads differs between the two BAM files, but when I try the…
Help with running ATAC-seq using Encode pipeline
I am very new to the ATAC-seq pipeline from ENCODE and am trying to run it on an HPC, but I am not sure of the steps after reading their instructions: github.com/ENCODE-DCC/atac-seq-pipeline Here is my json file: { "atac.title" : "atac)", "atac.description" : "Encode", "atac.pipeline_type" : "atac", "atac.align_only" : false, "atac.true_rep_only" : false, "atac.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v4/hg38.tsv",…
Is hg38 on the multiz 30-way alignment inaccurate?
Hello, I have a general question: Is it possible that the multiz 30-way alignment is inaccurate? I am looking for specific hg38 sequences in the blocks (I am using BLAT retrieved coordinates, with PHAST to parse the maf blocks), and the…
hisat2 Error 137
Hi everybody, I am running hisat2 with the command below: hisat2 --dta -x /home/genetics/apps/Proj/Index/hg38/genom -1 SRR11573854_1P -2 SRR11573854_2P -S SRR11573854.sam -p 4 and I get this error: Killed (ERR): hisat2-align exited with value 137 Could you help me solve this problem? Thank you very much…
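By shell convention, an exit value above 128 usually means the process was terminated by the signal numbered (value minus 128). A minimal Python sketch of that decoding, assuming a POSIX system; the function name `describe_exit_status` is hypothetical and is not part of hisat2:

```python
import signal

def describe_exit_status(status: int) -> str:
    """Map a shell-style exit status to a signal name when it encodes one."""
    if status > 128:
        try:
            # Statuses above 128 conventionally encode 128 + signal number.
            return signal.Signals(status - 128).name
        except ValueError:
            return f"unknown signal {status - 128}"
    return f"exit code {status}"

print(describe_exit_status(137))
```

On Linux, 137 decodes to SIGKILL. That signal is often, though not always, sent when the system runs out of memory, so checking the RAM available to the job is a reasonable first step.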
Difference between UCSC exon coordinates and Ensembl
Hello, I am trying to extract the coordinates for the exons in numerous genes. For example, APC using the MANE transcript: NM_000038.6 When downloading just the exons for hg38 from the UCSC Table Browser, I get these coordinates for the exons (example exon 1): chr5 112737884 112737925 However, when I look…
Bioconductor – REMP
DOI: 10.18129/B9.bioc.REMP This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see REMP. Repetitive Element Methylation Prediction Bioconductor version: 3.13 Machine learning-based tools to predict DNA methylation of locus-specific repetitive elements (RE) by learning surrounding genetic and epigenetic information. These tools provide genomewide…
How to determine the exact version of hg38 if I have only the FASTA file
I have a FASTA file which contains the hg38 assembly. It contains the primary contigs, alt contigs, decoy, HLA, mito. How do I determine the exact version of hg38 based on the FASTA? Here some…
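One possible starting point, sketched here in Python as an assumption rather than an established procedure: list each contig's name, length, and MD5 checksum (computed over the uppercase sequence, as in the `M5` tag of SAM `@SQ` headers), then compare them with the values published for candidate releases. The function name `contig_checksums` is hypothetical.

```python
import hashlib
from typing import Dict, Iterable, Tuple

def contig_checksums(lines: Iterable[str]) -> Dict[str, Tuple[int, str]]:
    """Return {contig_name: (length, md5_of_uppercase_sequence)} from FASTA lines."""
    result: Dict[str, Tuple[int, str]] = {}
    name = None
    length = 0
    digest = hashlib.md5()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if name is not None:
                result[name] = (length, digest.hexdigest())
            name = line[1:].split()[0]  # contig name is the first word of the header
            length = 0
            digest = hashlib.md5()
        elif name is not None:
            seq = line.upper()
            length += len(seq)
            digest.update(seq.encode("ascii"))
    if name is not None:
        result[name] = (length, digest.hexdigest())
    return result

# Tiny made-up example; a real run would pass an open file handle instead.
example = [">chrTest description", "acgt", "NNAC", ">chrM", "GATTACA"]
for contig, (size, md5) in contig_checksums(example).items():
    print(contig, size, md5)
```

For a real file, an open text handle can be passed directly (for a gzipped FASTA, `gzip.open(path, "rt")`), since the function reads line by line.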
Bioconductor – gpart
DOI: 10.18129/B9.bioc.gpart This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see gpart. Human genome partitioning of dense sequencing data by identifying haplotype blocks Bioconductor version: 3.13 We provide a new SNP sequence partitioning method which partitions the whole SNP sequence based on…
How to extract phastcons score for all protein coding genes and lncRNAs using GenomicScores?
There are multiple ways in which you can do this; I’ll show you one. First, you need to fetch the gene annotations you want. You’ve mentioned Gencode. One version of Gencode annotations is available in the TxDb.Hsapiens.UCSC.hg38.knownGene annotation package: library(TxDb.Hsapiens.UCSC.hg38.knownGene) txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene Next, you need to fetch the coordinates of the genes. You haven’t mentioned whether you want to calculate average conservation across gene boundaries or, for example, average conservation over spliced exons. I will assume you want the latter: exonsbygene <- exonsBy(txdb, by = "gene") class(exonsbygene) [1] "CompressedGRangesList" attr(,"package") [1]…
DanMAC5: a browser of aggregated sequence variants from 8,671 whole genome sequenced Danish individuals | BMC Genomic Data
Demographics Data from three studies were included: Dan-NICAD: 1,649 individuals with symptoms of obstructive coronary artery disease, predominantly chest pain, undergoing coronary computed tomography angiography. In total, 52% were females, the mean age was 57 years (+/- 9 SD), median coronary artery calcium score was 0 [0–82] and 24% of…
Circulating miRNA expression in long-standing type 1 diabetes mellitus
Participants This is an observational case–control study, carried out in adult patients who attended the Endocrinology and Nutrition Service of the Central University Hospital of Asturias, between June 2019 and December 2021. Written informed consent was obtained from all participants and the study was conducted in accordance with the principles…
Differentially expressed gene with high log2foldchange by DESeq2; but not meaningful at the individual level
Hi all, I am working with human RNA-Seq data (24 cases, 20 controls) to find differentially expressed genes. My RNA-Seq data is unstranded. Here are the commands that I used to align the fastq files: ls *_1P.fastq.gz | parallel --bar -j8 'R2=$(echo {} | sed s/_1/_2/) && out=$(echo {} |…
Chipseq data peak calling issue
Hi, I’m trying to analyze ChIP-seq data. I have 3 samples: Sample1, Sample2 and input. I have done QC and then alignment using Bowtie. After that I used samtools to get BAM files. Then I used Picard for duplicate removal. Now I…
Genes’ FPKM values through Cufflinks
Hi, I am a newbie to RNA-seq data analysis. I have to identify differentially expressed genes (DEGs) between human and chimpanzee in a tissue type. I have comparable RNA-seq experiment data (reads/fastq) for the two species. Each species has 2 biological replicates (each with three technical replicates), so six runs per…
Error in .Call2(“C_solve_user_SEW”, refwidths, start, end, width, translate.negative.coord
Dear guys, are there any solutions for the error when trying to construct the trinucleotideMatrix? test_maf.tnm = trinucleotideMatrix(maf = test_maf, ref_genome = "BSgenome.Hsapiens.UCSC.hg38") ## some site/region cause the error -Extracting 5′ and 3′ adjacent bases -Extracting +/- 20bp around mutated bases for…
bash – How to use for-loops and if-else statements to iteratively run a software on command line
For all the .mcool files, I extract the id, which is the string before the .mcool extension. If the substring of id after the last / is either 5000, 10000, or 50000 (i.e., 5k, 10k, or 50k), I want to run predictSV from EagleC. For a single .mcool file, as…
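The selection rule described above can be sketched in Python (used here only for illustration; the question itself is about bash). The function name `wanted_resolution` and the example paths are hypothetical, and the predictSV call is deliberately left out because its arguments are not shown in the excerpt.

```python
import os

WANTED = {"5000", "10000", "50000"}  # 5k, 10k, 50k

def wanted_resolution(path: str) -> bool:
    """True if the path ends in <resolution>.mcool with a wanted resolution."""
    if not path.endswith(".mcool"):
        return False
    file_id = path[: -len(".mcool")]            # string before the extension
    return os.path.basename(file_id) in WANTED  # substring after the last '/'

# Made-up paths for illustration only.
paths = ["sampleA/5000.mcool", "sampleA/25000.mcool", "sampleB/50000.mcool"]
selected = [p for p in paths if wanted_resolution(p)]
print(selected)
```

Each path in `selected` would then be passed to the tool, for example through `subprocess.run` with whatever arguments the single-file command uses.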
Instructions Associated questions refer to a gene in
Instructions Associated questions refer to a gene in the human genome. The screenshot below should help you find the location of the gene in the human genome (hg38). Navigate to this gene in the UCSC Genome Browser (not GEP mirror site) to answer attached questions. Epo-last exon…
open-cravat: variant annotation tool
open-cravat is an open-source platform for rapidly developing, using, and disseminating variant annotation tools. It can handle an unlimited number of variants in VCF format input files as well as its own input format and produce tab-separated text output files and Excel spreadsheets. It is command-line-based…
Perl debugging help – miRWoods
Hello, I was wondering if anyone with Perl experience could help me debug miRWoods? I tried reaching out to the authors via e-mail with no response, and issues on GitHub are turned off, so I’d be super grateful if anyone could provide any insight. When I run miRWoods I get…
r – Plotting infercnv results
I’m working with matched single cell data, where we have treated and untreated samples for the same patient. I ran CNV analysis using the infercnv package. I’ve followed the tutorial: # data matrix counts_matrix <- scData@assays$RNA@counts meta = data.frame(labels = Idents(scData), row.names = names(Idents(scData))) unique(meta$labels) # check the cell labels…
Bioconductor – SNPlocs.Hsapiens.dbSNP149.GRCh38
DOI: 10.18129/B9.bioc.SNPlocs.Hsapiens.dbSNP149.GRCh38 This package is for version 3.15 of Bioconductor; for the stable, up-to-date release version, see SNPlocs.Hsapiens.dbSNP149.GRCh38. SNP locations for Homo sapiens (dbSNP Build 149) Bioconductor version: 3.15 SNP locations and alleles for Homo sapiens extracted from NCBI dbSNP Build 149. The source data files used for…
how can I generate a VCF (in hg38 coords) of differences between hg38 and CHM13?
I downloaded s3-us-west-2.amazonaws.com/human-pangenomics/pangenomes/freeze/freeze1/minigraph/hprc-v1.0-minigraph-grch38.gfa.gz which contains hg38, chm13, and other assemblies, and now am trying to use vg to generate a VCF with the variants in CHM13 relative to hg38. After converting to vg format, by running vg convert <(gunzip -c hprc-v1.0-minigraph-grch38.gfa.gz) > hprc-v1.0-minigraph-grch38.vg, I tried a few different variations of…
multi-mapping reads settings in Rsubread or Rsubjunc
Hi All, I am using Rsubjunc to process my RNA seq data for DESeq2 and differential splicing analysis. I have a question about how to set multi-mapping read alignment in the Rsubjunc R package. The command I used is attached to the end…
Methanol fixation is the method of choice for droplet-based single-cell transcriptomics of neural cells
hiPSC cell culture and differentiation hiPSCs were maintained on 1:40 matrigel (Corning, #354277) coated dishes in supplemented mTeSR-1 medium (StemCell Technologies, #85850) with 500 U ml−1 penicillin and 500 mg ml−1 streptomycin (Gibco, #15140122). For the differentiation of cortical neurons the protocol described previously21 was followed with slight modifications. Briefly, hiPSC colonies were seeded…
how to input bedpe file in IGV
Hello, I have a bedpe file with interacting regions and their interaction scores. I want to view this bedpe file in IGV, but it shows an error on input. The file (bedpe) looks like this: When I input the file in IGV: File>…
bash – CNV Kit ` from . import commands ImportError: cannot import name ‘commands’ from ‘__main__’`
I am trying to run some code for my colleague in bash /path/to/cnvkit.py batch /path/to/my/folder/with/bams/*.bam \ --normal --targets ${bed_file} \ --fasta path/to/my/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta \ --output-reference /path/to/my/CD_BATCH1_reference.cnn \ --output-dir /path/to/my/Group_1 --scatter However, I keep getting this truly peculiar error Traceback (most recent call last): File "/path/to/cnvkit.py", line 4, in <module> from ….
Removing multi-variant records from vcf file
I am using gatk ASEReadCounter to get the read counts per allele. To do so, I used the following command: gatk ASEReadCounter -R /path_to_genome/hg38_genome/GRCh38.p13.genome.fa -I sample.sorted.bam -V sample.vcf.gz -O output.table I used GATK4, but I realized that in my VCF at position chr1:1574033, there…
Help with error in GATK variant calling
Hi all, I tried several reference genomes, such as Homo_sapiens_assembly38.fasta and Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa, but I still get the error below. Do you have any suggestions? Thank you so much. The link in the error message doesn’t work. gatk BaseRecalibrator -I Library_1Aligned.out.sorted.bam -R…
Assembly Table.
A. mellifera (Apr 2011 Amel_4.5/amel5) A. carolinensis (May 2010 AnoCar2.0/anoCar2) A. thaliana (Feb 2011 TAIR10/araTha1) B. taurus (Aug 2006 Btau_3.1/bosTau3) B. taurus (Nov 2014 Bos_taurus_UMD_3.1.1/bosTau8) C. familiaris (May 2005 CanFam2.0/canFam2) C. familiaris (Sep 2011 CanFam3.1/canFam3) C. porcellus (Feb 2008 Cavpor3.0/cavPor3) C. elegans (Oct 2010 WBcel215/ce10) C. elegans (Feb…
how to look at interacting SNP for a gene using Hic contact map
Hello everyone, I have one Hi-C contact map file corresponding to particular cells (brain frontal cortex). I want to look at how many SNPs are located within 40 kb of this gene and are interacting…
feature count command error
Hi everyone, I ran this featureCounts command: featureCounts -T 4 -p -a gencode.v43.basic.annotation.gtf -o featurecount.txt *.bam and it gave me this error: Process BAM file UI_E2_sorted.bam.bam… || || Paired-end reads are included. || || The reads are assigned on the single-end mode. || || Total alignments…
Help wanted for a struggling bioinformatician! GATK Variantrecalibration
Hello, an answer to any of the following questions will be much appreciated! I’m playing around with GATK’s VariantRecalibration tool. I have a few questions for which I can’t find information that I understand. 1) My tranches plot has no False Positives (see provided image); did something go wrong…
Convert hg19 coordinates to hg38 coordinates
Hello everyone! I would like to convert hg19 coordinates to hg38 coordinates. The problem is that I don’t really have much information to do so. For example: for the gene KLHL7 in hg19, the start is 23 145 353 and the end…
Low SNP Overlap with Michigan 1KG and TopMed reference panel
I extracted three samples (HG02024 – HG02026) from the 1000 Genomes Project’s 30x alignment files, employing the Genome Analysis Toolkit (GATK) best practice pipeline. This process involved performing base quality score recalibration, identifying and removing duplicate reads, utilizing the HaplotypeCaller to generate a genomic VCF (gVCF) file, and calling variants…
Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits
ARG-Needle and ASMC-clust algorithms We introduce two algorithms to construct the ARG of a set of samples, called ARG-Needle and ASMC-clust. Both approaches leverage output from the ASMC algorithm11, which takes as input a pair of genotyping array or sequencing samples and outputs a posterior distribution of the TMRCA across…
Using the 27-primates UCSC Multiz Alignment
Hello, I am attempting to use the 27-primates UCSC multiz alignment data so that I can find an orthologous sequence (across different mammals) to a human reference sequence (whether an ORF, a transcript, etc). I downloaded all the data in the maf folders…
Bioconductor – OutSplice (development version)
DOI: 10.18129/B9.bioc.OutSplice This is the development version of OutSplice; to use it, please install the devel version of Bioconductor. Comparison of Splicing Events between Tumor and Normal Samples Bioconductor version: Development (3.17) An easy to use tool that can compare splicing events in tumor and normal tissue samples using…
Bioconductor – Ularcirc
DOI: 10.18129/B9.bioc.Ularcirc This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see Ularcirc. Shiny app for canonical and back splicing analysis (i.e. circular and mRNA analysis) Bioconductor version: 3.13 Ularcirc reads in STAR aligned splice junction files and provides visualisation and analysis tools…
Download the promoter, enhancer, TSS, 3 prime, exon and intron positions of all hg38 genes
Hi All, I would like to download the promoter, enhancer, exon, intron, 3′ and 5′ positions of all genes from the human genome, hg38 version. I have seen some information in the…
Human hg38 chr10:21,513,475-21,525,682 UCSC Genome Browser v446
Seq2science ChIP-seq hub: ChIP-seq. Mapping and Sequencing tracks: Base Position, p14 Fix Patches, p14 Alt Haplotypes, Assembly, Centromeres, Chromosome Band, Clone Ends, Exome Probesets, FISH Clones, Gap, GC Percent, GRC Contigs, GRC Incident, Hg19 Diff, INSDC, LiftOver & ReMap, LRG Regions, Mappability, Problematic Regions, Recomb…
Exon vs Transcript in featurecounts for RNA Seq
I am counting reads with featurecounts after aligning RNA-seq data to Hg38 using STAR. When I run featurecounts with “-t exon” (i.e. using the exon flag in the 3rd column of the GTF file), I get generally poor results, with less than 50% being assigned and greater than 50% being…
excluderanges: exclusion sets for T2T-CHM13, GRCm39, and other genome assemblies
doi: 10.1093/bioinformatics/btad198. Online ahead of print. Affiliations: 1 Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, 23298, USA. 2 Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC 27514, USA. 3 Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill,…
Help with error makeTxDbFromGFF in GenomicFeatures package
Hi all, Would you have a suggestion for the error below? Thank you so much. txdb <- makeTxDbFromGFF(opt$gtf) Error in .detect_file_format(file) : Invalid ‘file’. Must be a path to a file, or an URL, or a connection object, or a GFF3File or GTFFile object. opt $cores [1] 1 $help [1]…
Ex vivo prime editing of patient haematopoietic stem cells rescues sickle-cell disease phenotypes after engraftment in mice
Optimizing prime editing systems for HSPCs We previously reported the use of prime editing to correct HBBS by plasmid transfection in HEK293T cells containing the SCD mutation, reaching up to 58% efficiency (Fig. 1a)24. In contrast to HEK293T cells, HSPCs are difficult to transfect with plasmid DNA but are amenable…
The Biostar Herald for Monday, April 17, 2023
The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…
Issue With CRAM -> BAM -> FASTQ Conversion
Please help! I am trying to obtain fastq files from the GDSC, all we have in the lab is CRAM files. Unfortunately, the reference genome seems to not exist when pulled from an online source. I have attempted to use the…
STRavinsky STR database and PGTailor PGT tool demonstrate superiority of CHM13-T2T over hg38 and hg19 for STR-based applications
doi: 10.1038/s41431-023-01352-6. Online ahead of print. Affiliations: 1 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel. 2 Genetics Institute, Soroka Medical Center, Beer Sheva, Israel. 3 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty…
Pick a human gene of your interest. Using
Pick a human gene of your interest. Using bioinformatics tools on the UCSC Genome Browser (select GRCh38/hg38 assembly), identify regions that can be regulated through epigenetic mechanisms (DNA methylation, histone marks, etc.). Highlight these regions and explain mechanisms that may be important in the regulation of this…
Annovar file
Hello. My avinput.exonic variant function file and avinput.hg38_cytoband file in ANNOVAR are 0 bytes. What is the reason for this? Please help; what should I do? I’m learning to use the ANNOVAR tool. Can you show the first 10 lines of your new.avinput file?…
use ROSE to identify super enhancer
Hey everyone, I want to use ROSE to identify super enhancers and to see whether they differ after some treatment in a lung cancer cell line. I see that this is the typical use: [user@cn3107 ~]$ ROSE_main.py -h Usage:…
Specific recognition of an FGFR2 fusion by tumor infiltrating lymphocytes from a patient with metastatic cholangiocarcinoma
Introduction Cholangiocarcinoma (CC) is a form of gastrointestinal cancer that originates from the epithelium of either intrahepatic or extrahepatic bile ducts. It accounts for approximately 3% of all gastrointestinal cancers, with reported incidence of one to two cases per 100,000 persons per year in the USA (and much higher incidence…
hg38 Ig regions
Hi, I’m looking for the coordinates of the immunoglobulin regions in the hg38 assembly. I want to exclude them from my CNV analysis. I know the hg19 regions but I do not want to just liftover them. Many thanks. You can use Ensembl’s BioMart…
SureSelect Clinical Research Exome V1 hg38
Hi, I’m inheriting an old project where whole exome was done using hg19 and now they want hg38. The capture kit was Agilent SureSelect Clinical Research Exome V1. I noticed this post Bed For Agilent Sureselect All Exon Kits but I could only…
Index of /goldenPath/hg38/vsTurTru2
This directory contains alignments of the following assemblies: – target/reference: Human (hg38, Dec. 2013 (GRCh38/hg38), GRCh38 Genome Reference Consortium Human Reference 38 (GCA_000001405.15)) – query: Dolphin (turTru2, Oct. 2011 (Baylor Ttru_1.4/turTru2), Baylor College of Medicine Ttru_1.4 (NCBI project 20365, GCA_000151865.2, WGS ABRN02)) Files included in this directory:…
PF4088 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status HGDP01069 Italy (Cagliari) / Sardinian I-PF4088* —— Hg38 .BAM Scientific 18X, 23.6 Mbp, 151 bp YF065974 Spain (Valencia / València) I-Y137878* —— Hg38 .BAM FTDNA (Y700) 39X, 18.4 Mbp, 151 bp YF072915 United Kingdom (Hampshire) I-Y25699 —— Hg38…
How to add attributes (i.e. gene_id, transcript_id, exon_id, etc.) annotation from .bed file onto VCF?
I’m trying to annotate genes onto a VCF file with bcftools. My annotation file is a .bed file that originally was a hg38 UCSC knownGene gtf file, converted by BEDOPS: hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/ Original GTF file: chr1 11868 12227 . . + knownGene exon . gene_id “ENST00000456328.2”; transcript_id “ENST00000456328.2”; exon_number “1”; exon_id…
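A minimal sketch of pulling the key/value attributes out of a GTF attribute column such as the one quoted above, written in Python under the assumption that attributes follow the usual `key "value";` layout with straight double quotes. The function name `parse_gtf_attributes` is hypothetical and is not part of bcftools or BEDOPS.

```python
import re
from typing import Dict

# One attribute looks like: key "value";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"\s*;')

def parse_gtf_attributes(column: str) -> Dict[str, str]:
    """Parse a GTF attribute column into a dict (a repeated key keeps its last value)."""
    return {key: value for key, value in ATTR_RE.findall(column)}

attrs = parse_gtf_attributes(
    'gene_id "ENST00000456328.2"; transcript_id "ENST00000456328.2"; exon_number "1";'
)
print(attrs["gene_id"], attrs["exon_number"])
```

The parsed values could then be written out as separate tab-delimited columns, which is generally easier to feed to annotation tools than the packed attribute string.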
Download stats for workflow package BSgenome.Hsapiens.UCSC.hg38
This page was generated on 2023-04-01 01:19:13 -0400 (Sat, 01 Apr 2023). BSgenome.Hsapiens.UCSC.hg38 home page: release version, devel version. Number of downloads for workflow package BSgenome.Hsapiens.UCSC.hg38, year by year, from 2023 back to 2015 (years with no downloads are omitted): 2023 Month Nb of distinct IPs Nb of downloads…
Integration of RNA seq data aligned to different reference genome versions
Hi all, I would like to integrate a bulk RNA seq and an scRNA seq dataset. The data was already provided as count matrices, but unfortunately, the bulk RNA seq data was aligned to hg38 whereas the scRNA…
Correct script for featurecounts in Rsubread
I am new to R and RStudio but have been trying to work through different examples using Rsubread for my data. I have tried reading vignettes and manuals prior to posting here but I am stuck and could really use some advice. I have 7 paired-end, fastq files from Illumina…
1000 genomes hg38 with dbSNP rsid
Hi, does anyone know where I can download the latest version of 1000 Genomes, on build hg38, in VCF format (or PLINK format), that ALSO contains the dbSNP rsID in the VCF ID field? I looked at the IGSR website, dbSNP, UCSC, etc. So…
RBFOX2 modulates a metastatic signature of alternative splicing in pancreatic cancer
Patient samples and RNA-seq analysis RNA-seq data from 395 patients with PDA was obtained from the University Health Network (Toronto), Sunnybrook Health Sciences Centre (Toronto), Kingston General Hospital (Kingston), McGill University (Montreal), Mayo Clinic (Rochester), Massachusetts General Hospital (Boston) and Sheba Medical Center (Tel Aviv) and has been described previously1,12,13,14….
TxDB.Hsapiens.UCSC.hg38.knownGene with locateVariants() identifying SNPs from various chromosome being part of the same gene
I am trying to annotate a list of SNPs using the hg38 genome (knownGene) and locateVariants(). The program is able to successfully run and provide “GeneIDs” for several of the loci. However, some GeneIDs are applied to SNPs in completely different regions and on completely different chromosomes. When I cross…
Ensembl Hg38 dna, dna_rm, and dna_sm
Hello, I want to ask a basic question. I read the README in the Ensembl FTP for the hg38 reference, and it seems there are several types of DNA file: dna, dna_sm, and dna_rm. If I want to…
No BSgenome for Human HG19 or HG38 with R version 4.2.2 and Bioconductor 3.16
Hello, I’m currently trying to install BSgenome.Hsapiens.UCSC.hg19 and BSgenome.Hsapiens.UCSC.hg38 through Bioconductor on my desktop (Windows). My current R version is 4.2.2 and Bioconductor is 3.16. My command is this…
What’s the correct way to map to hg38 with alternative contigs?
I’m trying to map to hg38, and I’m a bit confused about how to handle alternative (random, fix, …) contigs. From the bwa documentation there seems to be a proposed way (github.com/lh3/bwa/blob/master/README-alt.md), but there are some obstacles. 1. If you use hg38 directly (I downloaded it from hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/latest/hg38.fa.gz). The log from bwa shows…
query UCSC db for CDS coordinates (Gencode)
Hi, I’m not sure how to obtain CDS coordinates from GENCODE using mysql on UCSC. Specifically, I’d like to obtain one bed per record of CDS like from the website. I tried to query …using the following command but the db only…
Bioconductor – crisprDesign (development version)
DOI: 10.18129/B9.bioc.crisprDesign This is the development version of crisprDesign; for the stable release version, see crisprDesign. Comprehensive design of CRISPR gRNAs for nucleases and base editors Bioconductor version: Development (3.17) Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNAs) sequences. This includes on- and…
BigWig format (Cut&Run)
I generated summit.bed files from my CUT&RUN experiment, and I followed the steps below to get and visualize the bigwig files. 1) I used awk to get a bedgraph: awk '{printf "%s\t%d\t%d\t%2.3f\n" , $1,$2,$3,$5}' myBed.bed > myFile.bedgraph 2) Sorting the bed files: sort -k1,1 -k2,2n myFile.bedgraph > myFile_sorted.bedgraph 3) Chrom.size:…
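Steps 1 and 2 can also be sketched in Python, assuming tab-separated input whose columns 1 to 3 are chrom/start/end and whose column 5 holds the score (as in the awk command above). The function name `bed_to_sorted_bedgraph` and the example rows are hypothetical. Chromosomes are ordered by plain string comparison, which should match `sort -k1,1` under the C locale but may differ under other locales.

```python
from typing import Iterable, List, Tuple

def bed_to_sorted_bedgraph(lines: Iterable[str]) -> List[str]:
    """Keep chrom, start, end and column 5 as the value; sort by chrom, then start."""
    rows: List[Tuple[str, int, int, float]] = []
    for line in lines:
        if line.startswith(("#", "track", "browser")):
            continue  # skip header-style lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not enough columns to hold a score
        rows.append((fields[0], int(fields[1]), int(fields[2]), float(fields[4])))
    rows.sort(key=lambda r: (r[0], r[1]))
    return [f"{c}\t{s}\t{e}\t{v:.3f}" for c, s, e, v in rows]

# Made-up rows for illustration only.
bed = [
    "chr2\t10\t20\tpeak1\t5.5",
    "chr1\t300\t400\tpeak2\t1",
    "chr1\t100\t200\tpeak3\t2.25",
]
for row in bed_to_sorted_bedgraph(bed):
    print(row)
```

Overlapping intervals are not merged or resolved here, so summits that overlap would still need handling before conversion to bigWig.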
STAR solo segmentation fault after ‘started Solo counting’
Dear all, I have some 10x v3 single cell rna seq fastq files that I am trying to map to hg38 human genome using STAR aligner and generate read counts. However, I am getting the following error and hope that some…
Bioconductor – phastCons7way.UCSC.hg38
DOI: 10.18129/B9.bioc.phastCons7way.UCSC.hg38 This package is for version 3.13 of Bioconductor; for the stable, up-to-date release version, see phastCons7way.UCSC.hg38. UCSC phastCons conservation scores for hg38 Bioconductor version: 3.13 Store UCSC phastCons conservation scores for the human genome (hg38) calculated from multiple alignments with 6 other vertebrate species. Author: Robert…
Index of /goldenPath/hg38/vsMicMur3/reciprocalBest
This directory contains reciprocal-best netted chains for hg38-micMur3. – hg38.micMur3.rbest.net.gz: hg38-referenced recip.best net to micMur3. – hg38.micMur3.rbest.chain.gz: chains extracted from the recip.best net. These can be passed to the liftOver program to translate coords from hg38 to micMur3 through the recip.best net. – micMur3.hg38.rbest.net.gz: micMur3-referenced recip.best net….
Can not get bcftools norm to join biallelics into a multiallelic.
Is this the right way to use bcftools to join/merge biallelic records into a multiallelic? If so, it is not working. No errors but it gives me the same file with my command added to the headers. Example…
Very high coverage Nanopore alignment Hg38
Hi, I aligned my first Nanopore reads on the hg38 reference. Then, I used bedtools genomecov to get an idea about the mean coverage over the genome. I noticed some bases have a very high coverage (>20,000X), whereas the mean coverage is…
Running accurate, comprehensive, and efficient genomics workflows on AWS using Illumina DRAGEN v4.0
Introduction The reduced cost of DNA sequencing technology has led to an exponential growth of raw sequencing data. To keep pace with this development, secondary analysis tools that can provide fast and accurate results in a cost-effective manner are needed to extract actionable genomic insights. Illumina’s DRAGEN™ (Dynamic Read Analysis for GENomics) addresses…
IJMS | Free Full-Text | Transcriptomic Analysis of CRISPR/Cas9-Mediated PARP1-Knockout Cells under the Influence of Topotecan and TDP1 Inhibitor
1. Introduction The synthesis of poly(ADP-ribose) (PAR) is an immediate response of cells to DNA damage catalyzed by poly(ADP-ribose) polymerases, which transfer ADP-ribose units from NAD+ onto target molecules [1]. PAR is a linear and branched polymer up to 200 units long that is covalently attached to the target proteins,…
Custom Annotation file
Hi Everyone, Can anyone please guide me on how to generate an annotation file for (5′ and 3′) UTR and CDS (all in one GFF/GTF file) from an already existing hg38 annotation file? I downloaded an annotation file from genome.ucsc.edu/cgi-bin/hgTables but firstly it’s in BED…
Does Parabricks support the GRCh38 RefSeq? – Parabricks
Hi there, I would like to know if Parabricks supports the GRCh38 reference sequence, as the GRCh38 RefSeq contains not only ATCG+N but also B, K, M, R, S, W, Y bases. I could not find any relevant information in the documentation, and the Homo_sapiens_assembly38.fasta provided by NVIDIA uses UCSC…
HTSeqGenie run error
Hi, I am running HTSeqGenie on both MacOS and Linux with the test TP53 samples. Both gave me an error when reading the fastq files. It seems to have problems reading the fastq.gz files in each parallel process. Could anyone help me with this please? The errors are below: checkConfig.R/checkConfig.template:…
interval_list for hg38
interval_list for hg38 2 I was wondering if anyone could help me with this. I need to run the picard command CollectRnaSeqMetrics below java -jar picard.jar CollectRnaSeqMetrics \ I=input.bam \ O=output.RNA_Metrics \ REF_FLAT=ref_flat.txt \ STRAND=SECOND_READ_TRANSCRIPTION_STRAND \ RIBOSOMAL_INTERVALS=ribosomal.interval_list and I need to create the interval_list for hg38. Is there a list…
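A Picard interval_list is a SAM-style header (`@HD` and `@SQ` lines) followed by tab-separated rows of sequence name, start, end, strand, and interval name, using 1-based inclusive coordinates. A minimal sketch of building one from gene records in a GTF is shown below. It assumes the GTF marks ribosomal genes with a `gene_type` or `gene_biotype` attribute equal to `rRNA`; the function name and the `contigs` argument are hypothetical, and the contig names and lengths must match the header of the BAM being analysed.

```python
import re
from typing import Iterable, List, Sequence, Tuple


def rrna_interval_list(
    gtf_lines: Iterable[str],
    contigs: Sequence[Tuple[str, int]],
    biotypes: Sequence[str] = ("rRNA",),
) -> List[str]:
    """Return interval_list lines for genes whose biotype is in `biotypes`."""
    out = ["@HD\tVN:1.5"]
    for name, length in contigs:
        out.append(f"@SQ\tSN:{name}\tLN:{length}")
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 9 or f[2] != "gene":
            continue
        # Attribute name differs between annotation sources (assumption).
        biotype = re.search(r'gene_(?:type|biotype) "([^"]+)"', f[8])
        if not biotype or biotype.group(1) not in biotypes:
            continue
        label = re.search(r'gene_name "([^"]+)"', f[8]) or re.search(r'gene_id "([^"]+)"', f[8])
        name = label.group(1) if label else "unknown"
        # GTF and interval_list both use 1-based inclusive coordinates.
        out.append(f"{f[0]}\t{f[3]}\t{f[4]}\t{f[6]}\t{name}")
    return out
```

The contig list can be taken from the reference's sequence dictionary or from the `@SQ` lines of the BAM header, so that the file passed to `RIBOSOMAL_INTERVALS` agrees with the alignments.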
Can not generate tranches plot after VQSR
Can not generate tranches plot after VQSR 0 Hello! I have encountered a strange issue when trying to analyze VariantRecalibrator output. I cannot generate or find an R script file for the *.tranche output. At the same time, I have no problems with scatter-plot generation and further import into RStudio…
The GDC Legacy Archive is retiring soon.
News: The GDC Legacy Archive is retiring soon. 0 Attention GDC Users: The GDC Legacy Archive is retiring soon. SOME FILES WILL NO LONGER BE AVAILABLE! Please download any needed files as soon as possible. This will not impact data in the current GDC data portal (hg38), but will affect old…
How should reference genome fasta files be distributed by UCSC?
Dear genomics-tools-users and Istvan Albert I work at UCSC and have a question on how to weigh consistency of links versus data updates. TLDR: Should UCSC change the main {hg19,hg38}.fa.gz when the GRC releases a new patch? Traditionally, the hg19 and hg38 fasta files were distributed at hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/ and hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/…
making a BIGWIG from BAM file
making a BIGWIG from BAM file 1 Hello everyone, I have 50 BAM files, some of them single-end and some of them paired-end. Well, I want to make a single bigwig file by combining reads from all of these bam files. For this, I merged all bam files to a…
Using LiftOver to change genomic build
Using LiftOver to change genomic build 0 Hi, all – Two questions about using LiftOver: The .bed file changes after using LiftOver. Correct me if I’m wrong, but I can just use the .bim and .fam file from before LiftOver as those do not change? I have used LiftOver to…
Build/check report for BioC 3.14 annotations
Build/check report for BioC 3.14 annotations – All results for package EpiTxDb.Hs.hg38 This page was generated on 2022-04-13 06:00:07 -0400 (Wed, 13 Apr 2022). Hostname OS Arch (*) R version Installed pkgs nebbiolo2 Linux (Ubuntu 20.04.4 LTS) x86_64 4.1.3 (2022-03-10) — “One Push-Up” 4324 Click on any hostname to see more info about the system (e.g. compilers) (*) as…
addGeneAnnotation.pl: not found
addGeneAnnotation.pl: not found 2 hey community, I am facing a problem related to the quantification step of RNAseq analysis: I have run this command ‘ /home/aarmich/Documents/000TOOLS/homer/bin/analyzeRepeats.pl rna hg38 -count genes -d /home/aarmich/Documents/000TOOLS/homer/A549ctrl_RNAseq_hg38 /home/aarmich/Documents/000TOOLS/homer/A549TGFB_RNAseq_hg38 -noadj > lastt.txt and terminal is responding like this: missing NM_003718… missing NM_152604… missing NM_001114132… missing NR_149079……
Solved below code is giving error . please assist. To
The code below is giving an error. Please assist. To obtain the human protein sequences in multiple FASTA format, you can use the following script: I have written the code in Python: # Load necessary modules from Bio import SeqIO import gzip # Read in human genome file genome_file="hg38.fa.gz" with gzip.open(genome_file,…
New Products Posted to GenomeWeb: Molecular Instruments, Bio-Rad, Telesis Bio, Molecular Health
Molecular Instruments HCR RNA-CISH Molecular Instruments (MI), a biotech spinout of the California Institute of Technology, has launched HCR RNA-CISH to enhance automated chromogenic in situ hybridization (ISH) workflows that depend on RNAscope. The HCR RNA-CISH kits provide better performance than any existing chromogenic in situ hybridization approach with double the turnaround…
VQSR first step do not generating plots
Hello. After running my code for VQSR’s first step, I receive all files (including the R script for plots) except the PDF file that contains the plots. I do not get any errors from VQSR. My code: java -jar gatk-package-4.3.0.0-local.jar VariantRecalibrator \ -O /home/yousef/Desktop/VQSR_hg38/haplotype_hg38_SNP_Recal.vcf \ --resource:hapmap,known=false,training=true,truth=true,prior=15.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/Coverted_to_VCF/Coverted_broad_hg38_v0_hapmap_3.3.hg38.vcf' \ --resource:omni,known=false,training=true,truth=false,prior=12.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/Coverted_to_VCF/resources_broad_hg38_v0_1000G_omni2.5.hg38.vcf' \…
Automated dbSNP lookup by rsID position, plus genome build liftover
Hola, just passing by to say ‘hi’. Please post bugs / suggestions as comments to this tutorial. rsID to position GRCh38 cat rsids.list rs1296488112 rs1226262848 rs1225501837 rs1484860612 rs1235553513 rs1424506967 cat rsids.list | while read rsid ; do pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS\>.{0,15}' |…
How to install `BETA` on linux server?
Hello: I am trying to install BETA on HPC: cistrome.org/BETA/index.html#inst I create virtual environment using conda: conda create -n BETA python=2.7 conda activate BETA unzip BETA_1.0.7.zip cd BETA_1.0.7 python setup.py install cc -Wall -g motif.c misp.c -o misp -O3 -lz -lm In file included from motif.c:9: motif.h:18:10: fatal error: zlib.h:…
Error in VQSR first step
Error in VQSR first step 0 Hello. After running the following code for the VQSR first step: java -jar gatk-package-4.3.0.0-local.jar VariantRecalibrator -O /home/yousef/Desktop/haplotype_hg38_SNP_Recal.vcf --resource:hapmap,known=false,training=true,truth=true,prior=15.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_hapmap_3.3.hg38.vcf' --resource:omni,known=false,training=true,truth=false,prior=12.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_1000G_omni2.5.hg38.vcf' --resource:1000G,known=false,training=true,truth=false,prior=10.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf' --resource:dbsnp,known=true,training=false,truth=false,prior=2.0 '/media/yousef/EEFC0BDBFC0B9CC9/Sequencing/Next_generation_sequencing/Index_files/VQSR_data/resources_broad_hg38_v0_Homo_sapiens_assembly38.dbsnp138.vcf' --tranches-file /home/yousef/Desktop/Tranches.txt -an QD -an SOR -an MQ -an FS -an SOR -an ReadPosRankSum -an MQRankSum -V /home/yousef/Desktop/haplotype.hg38.vcf --max-gaussians 4 -R…
GATK HaplotypeCaller combine info from two BAM into one line in vcf (not divide into samples column)
Hi, I run the GATK HaplotypeCaller and hope to get a file where each sample will have a column. My BAM file list looks like this: input_bam/SRR8859080.bam input_bam/ENCFF477JTA_new.bam This is my GATK command: allele_chunk_file=rs_coord.vcf gatk_run_line="../bin/gatk-4.1.2.0/gatk" outfile=wgs_test_out.genotypes.vcf bam_file=wgs_test.bam.list genome_seq="../hg38.fa" intervals=wgs_test.bed $gatk_run_line \ HaplotypeCaller\ --reference $genome_seq \ --input $bam_file \ --genotyping-mode GENOTYPE_GIVEN_ALLELES \…
Gene coverage issue in HG38
Gene coverage issue in HG38 0 Hi, Recently I checked the coverage of the gene OCA2 using both genomes, HG37 and hg38. Surprisingly, it was not covered (very poorly mapped) in HG38, and fully covered with good coverage in HG37 with all capture kits used. We are not able to…
(1): download the human genome below is the open
(1): Download the human genome. Below is the open-source site for obtaining human genome data for learning purposes, so there is no issue accessing and downloading the genome data. The sample data is linked below; anyone can access it, as it is open source with no copyright, hence sharing: hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz (2): Use the…
Solved Problem Statement: This question asks to write a
Problem Statement: This question asks to write a script to obtain all protein sequences coded in the human genome in the multiple FASTA format, using the RefSeq table obtained from the UCSC Table Browser and the human genome obtained from the given URL. The ID of each sequence should be…
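The core of such a script is stitching the coding part of each transcript together from its exons and translating it. The sketch below is one possible approach, not the assignment's reference solution. It assumes UCSC-style table fields (0-based, half-open `cdsStart`/`cdsEnd` and exon coordinates sorted in ascending order, already parsed into integer lists) and a genome held in memory as a dict of chromosome name to sequence; all function names are hypothetical.

```python
from typing import Dict, List

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def cds_sequence(
    genome: Dict[str, str],
    chrom: str,
    strand: str,
    cds_start: int,
    cds_end: int,
    exon_starts: List[int],
    exon_ends: List[int],
) -> str:
    """Concatenate the coding portion of each exon; reverse-complement on '-'."""
    pieces = []
    for start, end in zip(exon_starts, exon_ends):
        lo, hi = max(start, cds_start), min(end, cds_end)
        if lo < hi:  # exon overlaps the coding region
            pieces.append(genome[chrom][lo:hi])
    seq = "".join(pieces).upper()  # UCSC fasta soft-masks repeats in lowercase
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq


def translate(cds: str) -> str:
    """Translate up to the first stop codon; unknown codons become 'X'."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

Each translated sequence can then be written as a FASTA record under whatever ID the assignment specifies. Non-coding entries, where `cdsStart` equals `cdsEnd`, yield an empty string and can be skipped.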