Tag: GRCH37

A genotype-to-phenotype approach suggests under-reporting of single nucleotide variants in nephrocystin-1 (NPHP1) related disease (UK 100,000 Genomes Project)

Konrad, M. et al. Large homozygous deletions of the 2q13 region are a major cause of juvenile nephronophthisis. Hum. Mol. Genet. 5, 367–371 (1996). Hildebrandt, F. et al. A novel gene encoding an SH3 domain protein is mutated in nephronophthisis type 1. Nat. Genet. 17,…


LOC127888533 H3K27ac-H3K4me1 hESC enhancer GRCh37_chr17:80054421-80055336 [Homo sapiens (human)] – Gene

RefSeqs maintained independently of Annotated Genomes: These reference sequences exist independently of genome builds. They are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by…


Building Dict File for GATK

I’m going through the instructions page on gatkforums.broadinstitute.org/gatk/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference. Specifically, the command I don’t see how to do is:

java -jar CreateSequenceDictionary.jar R= Homo_sapiens_assembly18.fasta O= Homo_sapiens_assembly18.dict
[Fri Jun 19 14:09:11 EDT 2009] net.sf.picard.sam.CreateSequenceDictionary R= Homo_sapiens_assembly18.fasta O= Homo_sapiens_assembly18.dict
[Fri Jun 19 14:09:58 EDT 2009] net.sf.picard.sam.CreateSequenceDictionary done….
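For readers wondering what a sequence dictionary actually holds, here is a minimal Python sketch (not Picard itself) that derives the per-sequence fields from a FASTA stream. The function names `fasta_records` and `format_dict` are invented for illustration, and the `@HD` line is an assumption; the real tool also writes a `UR` field and may differ in other details.

```python
import hashlib
import io


def fasta_records(handle):
    """Yield (name, length, md5_hex) for every sequence in a FASTA stream."""
    name, length, digest = None, 0, None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length, digest.hexdigest()
            # The sequence name is the header text up to the first whitespace.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length, digest = 0, hashlib.md5()
        elif name is None:
            raise ValueError("sequence data found before the first FASTA header")
        else:
            # The SAM specification defines M5 over the upper-cased sequence
            # with all whitespace removed.
            seq = "".join(line.split()).upper()
            length += len(seq)
            digest.update(seq.encode("ascii"))
    if name is not None:
        yield name, length, digest.hexdigest()


def format_dict(records):
    """Render records as SAM-style header text (header line is an assumption)."""
    lines = ["@HD\tVN:1.6\tSO:unsorted"]
    for name, length, md5_hex in records:
        lines.append(f"@SQ\tSN:{name}\tLN:{length}\tM5:{md5_hex}")
    return "\n".join(lines) + "\n"


# Toy input standing in for a real reference FASTA.
toy_fasta = io.StringIO(">chr1 toy contig\nacgt\nAC\n>chr2\nNNNN\n")
print(format_dict(fasta_records(toy_fasta)), end="")
```

For a real reference you would pass an open file handle instead of the toy `StringIO`; in practice the dedicated tool remains the safer choice because downstream programs expect its exact output.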


LOC127271744 H3K27ac-H3K4me1 hESC enhancer GRCh37_chr1:226270255-226271122 [Homo sapiens (human)] – Gene

RefSeqs maintained independently of Annotated Genomes: These reference sequences exist independently of genome builds. They are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by…


How to download iGenomes from S3

Hi all, I run a pipeline on an HPC that failed while pulling iGenomes from S3, so I am trying to download them to my cluster but don’t know how. Would you have a suggestion? Thank you so much. ewels.github.io/AWS-iGenomes/…


Error in Adding 1000Genomes Ancestral Allele info: Using VCF tools fill-aa

Hi, I am trying to add ancestral alleles to 1000 Genomes Phase 3 VCF files. I have used the “human_ancestor_GRCh37_e59.tar.bz2” files as the ancestral allele input. The steps I have used are: cat human_ancestor_3.fa | sed 's,^>.*,>1,' | bgzip…


BAMboozle

Hi, I am running BAMboozle on my BAM files to anonymize variant sequences using the GRCh37 human reference genome. My BAM files are originally 2-3 GB, but the output BAM file from BAMboozle is 500-600 KB. Does BAMboozle decrease the size of the bam…


Decoy In Reference Assembly

I am using 1000 Genomes data in my new project. While inspecting the reference assembly they used, I found that it contains a “decoy” contig. The 1000 Genomes FAQ says: For the final round of alignments the sequence data will be mapped…



Reconstruction of the personal information from human genome reads in gut metagenome sequencing data

Subject participation: The study protocol was approved by the ethics committees of Osaka University and related medical institutions as well as the Translational Health Science and Technology Institute (Faridabad). Japanese individuals (n = 343) for whom gut metagenome shotgun sequencing was performed in previous studies were included in this study46,47,48. Among these…


Prevalence of BRCA homopolymeric indels in an ION Torrent-based tumour-to-germline testing workflow in high-grade ovarian carcinoma

Patients cohort: Among consecutive patients who underwent BRCA tumour testing through ION Torrent-based sequencing between August 2017 and February 2022, we retrospectively selected 222 high-grade ovarian cancer (HGOC) patients with the following histological subtypes: 203 serous (HGSOC), seven endometrioid, five clear-cell and seven with mixed histotypes. Since NGS BRCA1/2 tumour…


Could not get first alignment from target

Can you share some of the image as text for easier understanding? It seems like there might be an issue with your BAM file or the region you are trying to call variants on. To help diagnose the issue, please follow these steps: 1. Check if your BAM file is…


Novel intronic mutations of SLC12A3 gene, Gitelman syndrome

Introduction: Gitelman syndrome (GS) is an autosomal recessive disease, characterized by hypokalemic alkalosis, accompanied by hypomagnesaemia, hypocalciuria, low blood pressure, and hypocalcemia, first described by Gitelman in 1966.1 It is caused by mutations in the SLC12A3 gene, which is located on the long arm of chromosome 16 (16q13) and encodes the…


Targeting Poly(ADP)ribose polymerase in BCR/ABL1-positive cells

Cells and cell culture: KOPN30, BV173, and K562 are BCR/ABL1-positive leukemia cell lines. All leukemia cell lines, as well as Ba/F3 cells, were maintained in RPMI-1640 medium supplemented with 15% fetal bovine serum (FBS) and penicillin–streptomycin (100 U/mL) at 37 °C in an atmosphere containing 5% CO2. KOPN30 cells were obtained…


Assembly Table.

A. mellifera (Apr 2011 Amel_4.5/amel5)
A. carolinensis (May 2010 AnoCar2.0/anoCar2)
A. thaliana (Feb 2011 TAIR10/araTha1)
B. taurus (Aug 2006 Btau_3.1/bosTau3)
B. taurus (Nov 2014 Bos_taurus_UMD_3.1.1/bosTau8)
C. familiaris (May 2005 CanFam2.0/canFam2)
C. familiaris (Sep 2011 CanFam3.1/canFam3)
C. porcellus (Feb 2008 Cavpor3.0/cavPor3)
C. elegans (Oct 2010 WBcel215/ce10)
C. elegans (Feb…


Parallel sequencing of extrachromosomal circular DNAs and transcriptomes in single cancer cells

scEC&T sequencing: A detailed, step-by-step protocol of scEC&T-seq is available on the Nature Protocol Exchange46 and is described below. The duration of the protocol is approximately 8 days per 96-well plate. Cell culture: Human tumor cell lines were obtained from ATCC (CHP-212) or were provided by J. J. Molenaar (TR14; Princess…


Where do I get a large reference VCF?

I would like to download a large .vcf file containing many (hundreds or thousands of) samples. Ideally, I would download different population-specific .vcf files, but the ability to sort/filter by ancestry group is fine. Where do I get such a file?…


In vitro erythrocyte production using human-induced pluripotent stem cells: determining the best hematopoietic stem cell sources | Stem Cell Research & Therapy

Materials: The materials used for cell cultures and characterization are listed in Additional file 1: Table S1. Cell sources: After getting informed consent, PB was drawn from three healthy O, Rh D-positive donors. CB was collected from three healthy newborn babies at the Department of Obstetrics and Gynecology at Severance…


Struggling with protein context of Annovar output

Hi, I’m having some trouble extracting the protein sequences of missense mutations from Annovar output files. I would like to create all of the possible neopeptides arising from missense mutations of TCGA tumor samples. For this I used Annovar to get the…


Issue With CRAM -> BAM -> FASTQ Conversion

Please help! I am trying to obtain fastq files from the GDSC, all we have in the lab is CRAM files. Unfortunately, the reference genome seems to not exist when pulled from an online source. I have attempted to use the…


STRavinsky STR database and PGTailor PGT tool demonstrate superiority of CHM13-T2T over hg38 and hg19 for STR-based applications

doi: 10.1038/s41431-023-01352-6. Online ahead of print. Affiliations: 1 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel. 2 Genetics Institute, Soroka Medical Center, Beer Sheva, Israel. 3 Morris Kahn Laboratory of Human Genetics, NIBN and Faculty…


Index of /pub/clinvar

Name                   Last modified     Size
Parent Directory                         -
ClinGen/               2018-12-14 09:17  -
document_archives/     2014-04-24 08:19  -
presentations/         2021-06-23 17:39  -
release_notes/         2023-04-06 10:38  -
submission_examples/   2020-08-03 13:46  -
submission_templates/  2023-02-17 13:23  -
tab_delimited/         2023-04-10 15:02  -
temp/                  2022-12-20 16:01  -
vcf_GRCh37/            2023-04-10 14:53  -
vcf_GRCh38/            2023-04-10 14:53  -
xml/                   2023-04-10 15:02…


segmentation fault error

Hi there, I have a lot of BAM files and I tried counting them using featureCounts. Most of the files work fine, but a few files throw this error. I’m using the same annotation file all the time so I guess that’s not the problem, I also…


illegal reference to local variable array

Hi all, I am using Juicer to analyze Hi-C data. After mapping a paired-end fastq file to the genome, I got the sam file, but the next step, chimeric_sam.awk, reports an error:

(-:  Align of /home/jib79/hic/2019-NG/juicer/splits/SRR9822212.fastq.sam done successfully
awk: /home/jib79/hic/2019-NG/juicer/scripts/scripts/common/chimeric_sam.awk: line 50: illegal reference to local variable array
awk: /home/jib79/hic/2019-NG/juicer/scripts/scripts/common/chimeric_sam.awk: line 51: illegal…


Can’t call subsampled bam file with GATK Haplotypecaller with --disable-tool-default-read-filters

I want to simulate variant calling of an ultra-low-coverage >0.005x bam file. I subsampled reads from the (HG02024) sample of the 1KG phase 3 dataset. My code in R to do so is the following (bam and reference are just path extensions, file is the initial bam file): cov_rate <-…


rs3750846 RefSNP Report – dbSNP

ALFA Allele Frequency: The ALFA project provides aggregate allele frequency from dbGaP. More information is available on the project page including descriptions, data access, and terms of use. Release Version: 20201027095038. The Frequency tab displays a table of the reference and alternate allele frequencies reported by various studies and populations. Table lines,…


VEP-like tool for sequence ontology and HGVS annotation of VCF files

Mehari is a software package for annotating VCF files with variant effect/consequence. The program uses hgvs-rs for projecting genomic variants to transcripts and proteins and thus has high prediction quality. Other popular tools offering variant effect/consequence prediction include: Mehari offers predictions that aim to mirror VariantValidator, the gold standard for…


LD correlation matrix reference file, where can I find it?

I am searching for a website where I can download LD correlation (r^2) matrices for (any) European population. My interest is in SNPs (preferably with rsIDs as indices; if not, genomic locations for GRCh37/GRCh38). The data can be divided by…


Dante Genomics launches Avanti Software for a plug-and-play genomic interpretation that takes minutes instead of hours

  NEW YORK, March 20, 2023 /PRNewswire/ — Dante Genomics, a global leader in genomics and precision medicine, launched today the beta version of Avanti, the Company’s proprietary B2B software for variant interpretation and report writing at scale. Avanti provides clinicians, geneticists and researchers with a plug-and-play web-based…


Obtain number of base pairs in a genome

Hi! This may be a basic question, since I have no background in bioinformatics: how can I obtain the number of base pairs in my genome sample? I’m trying to reproduce the experiment that was made…
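If the genome sample is available as a FASTA file, one simple way to approach the question is to tally the sequence characters. The sketch below is only an illustration in Python; `count_bases` and `summarize` are made-up names, and the toy input stands in for a real file.

```python
import io
from collections import Counter


def count_bases(handle):
    """Tally every sequence character in a FASTA stream, ignoring headers."""
    counts = Counter()
    for line in handle:
        if line.startswith(">"):
            continue
        counts.update(line.strip().upper())
    return counts


def summarize(counts):
    """Return (total bases, bases that are A/C/G/T rather than N or other codes)."""
    total = sum(counts.values())
    called = sum(counts[base] for base in "ACGT")
    return total, called


# Toy FASTA standing in for a real genome file.
toy_fasta = io.StringIO(">seq1\nACGTN\nnnac\n>seq2\nGG\n")
total, called = summarize(count_bases(toy_fasta))
print(f"total bases: {total}, unambiguous bases: {called}")
```

For a compressed reference, `gzip.open(path, "rt")` from the standard library can supply the handle; the path itself would be whatever file you actually have. Note that a reference assembly contains runs of N, so the total and the count of unambiguous bases can differ considerably.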


The Clinical Diagnostic Utility of Array CGH in Children with Syndromic Microcephaly

Abstract Background: A prospective study using array CGH in children with Syndromic microcephaly from a tertiary pediatric healthcare centre in India. Aim: To identify the copy number variations causative of microcephaly detected through chromosomal array CGH. Patients and Methods: Of the 60 patients, 33 (55%) males and 27 (45%) females…


Bowtie2 which reference is best ?

Hello, I am trying to learn Bowtie2. When I compared the overall alignment rate by bowtie2, there is a significant difference between the result of GRCh37 index and GRCh38 index. The overall alignment rate to GRCh37 is 98%, but that to GRCh38 is…


Automated dbSNP lookup by rsID position, plus genome build liftover

Hola, just passing by to say ‘hi’. Please post bugs / suggestions as comments to this tutorial.

rsID to position GRCh38

cat rsids.list
rs1296488112
rs1226262848
rs1225501837
rs1484860612
rs1235553513
rs1424506967

cat rsids.list | while read rsid ; do pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS\>.{0,15}' |…


IJMS | Free Full-Text | Endothelial Differentiation of CCM1 Knockout iPSCs Triggers the Establishment of a Specific Gene Expression Signature

1. Introduction: Cerebral cavernous malformations (CCMs) are capillary–venous lesions which are primarily found in the brain and spinal cord [1]. The familial form of this neurovascular disorder is inherited in an autosomal dominant manner with incomplete penetrance. Pathogenic variants in the CCM1 gene (also known as KRIT1) can be identified…


Can’t liftover vcf file from hg19 to hg38

Hello everyone, I have a vcf file that I’m trying to convert from hg19 to hg38. For that I’m using the bcftools +liftover command from here. I previously tried to use picard VCF but the memory cost was too much from…


get build 37 positions from dbSNP rsIDs

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select chrom,chromStart,chromEnd,name from snp147 where name in ("rs371194064","rs779258992","rs26","rs25")'
+-------+------------+----------+-------------+
| chrom | chromStart | chromEnd | name        |
+-------+------------+----------+-------------+
| chr7  |   11584141 | 11584142 | rs25        |
| chr7  |   11583470 | 11583471…


Looking for LDL GWAS summary stats in hg38

Hi All, I think last time I posted on here was nearly 10 years ago (!) I’m looking for a way to get summary statistics for a GWAS on LDL levels, where the statistics are in hg38. I found a study titled “Genome-wide study for circulating metabolites identifies 62 loci…


microRNAs not available in TxDb.Hsapiens.UCSC.hg38.knownGene

I was looking at some examples and I could retrieve the microRNAs of the hg19 transcriptome, but not from the hg38 transcript annotation. I realized this might be because TxDb.Hsapiens.UCSC.hg38.knownGene doesn’t have a miRBase build ID,…


CNVKit does not output all the accessible regions in the targets bed file

Hello everybody, I am using CNVkit on my data using hg38 as reference. The command that I am using is the following: cnvkit.py batch sample.bam -n control.bam -m wgs -f reference.fasta --target-avg-size 1000 --output-dir results/ So,…


Bioconductor – SNPlocs.Hsapiens.dbSNP155.GRCh37 (development version)

DOI: 10.18129/B9.bioc.SNPlocs.Hsapiens.dbSNP155.GRCh37 This is the development version of SNPlocs.Hsapiens.dbSNP155.GRCh37; to use it, please install the devel version of Bioconductor. Human SNP locations and alleles extracted from dbSNP Build 155 and placed on the GRCh37/hg19 assembly. Bioconductor version: Development (3.16). The 929,496,192 SNPs in this package were extracted from…


Ensembl ID mapping GRCh37 vs GRCh38

I currently have a large list of Ensembl protein IDs (ENSP) that are from GRCh37. I need to map these IDs to the entry name listed on the UniProt website (e.g. ‘CASPE_HUMAN’ ). I am having trouble doing this using the UniProt dataset…


How to modify VCF file?

Hi community, I have a question: the SNP positions in my vcf file are from GRCh37/hg19, and I need to change them to GRCh38. So, I used UCSC liftover to replace the hg19 pos by GRCh38 pos and deleted some SNPs, then sorted the pos and saved to a new vcf…


Obtain equivalent variant ids (chr-pos-ref-alt) for GRCh37 and GRCh38

Hi all, I want to obtain the equivalent variant id (chr-pos-ref-alt) from GRCh38 in GRCh37. This is to deal with some variants poorly lifted over. To exemplify, see the variant gnomad.broadinstitute.org/variant/10-17838942-A-G?dataset=gnomad_r3 It has two equivalents in GRCh37. I want to…


Genetic and chemotherapeutic influences on germline hypermutation

DNM filtering in 100,000 Genomes Project: We analysed DNMs called in 13,949 parent–offspring trios from 12,609 families from the rare disease programme of the 100,000 Genomes Project. The rare disease cohort includes individuals with a wide array of diseases, including neurodevelopmental disorders, cardiovascular disorders, renal and urinary tract disorders, ophthalmological…


On a reference pan-genome model (Part II)

12 July 2019: I wrote a blog post on a potential reference pan-genome model. I had more thoughts in my mind. I didn’t write about them because they are immature. Nonetheless, a few readers raised questions related to my immature thoughts, so I decided to add this “Part II” as…


Using Rsubread buildindex with GRCh37.p13.genome.fa.gz gives me an error

Hi, I am trying to build the human index using ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/GRCh37.p13.genome.fa.gz. I am using Rsubread 2.4.3 and it gives me the following error:

//================================= Running ==================================\
|| ||
|| Check the integrity of…


BTG2 gene predicts poor outcome in PT-DLBCL

Introduction: Primary testicular diffuse large B-cell lymphoma (PT-DLBCL) is a rare and aggressive form of mature B-cell lymphoma.1–3 PT-DLBCL was the most common type of testicular tumor in men aged over 60 and characterized by painless uni- or bilateral testicular masses with infrequent constitutional symptoms.4–6 PT-DLBCL shows significant extranodal tropism,…


rs532111960 RefSNP Report – dbSNP

The Variant Details tab shows known variant placements on genomic sequences: chromosomes (NC_), RefSeqGene, pseudogenes or genomic regions (NG_), and in a separate table: on transcripts (NM_) and protein sequences (NP_). The corresponding transcript and protein locations are listed in adjacent lines, along with molecular consequences from Sequence Ontology. When…


Use the TCGAbiolinks package to download TCGA data

For downloading TCGA data, the RTCGA package is probably the easier one to use, and because its data are already downloaded, it is relatively stable. For the same reason, however, there is no guarantee that the data are up to date. The TCGAbiolinks package is…


rs9789283 RefSNP Report – dbSNP

The Variant Details tab shows known variant placements on genomic sequences: chromosomes (NC_), RefSeqGene, pseudogenes or genomic regions (NG_), and in a separate table: on transcripts (NM_) and protein sequences (NP_). The corresponding transcript and protein locations are listed in adjacent lines, along with molecular consequences from Sequence Ontology. When…


links to Ensembl GRCh37 – gitmetadata

Open Targets Genetics reports GRCh38 coordinates but the “External references” section points to GRCh37 (grch37.ensembl.org) rather than GRCh38 (www.ensembl.org): genetics.opentargets.org/variant/8_102432699_T_C Was this a deliberate decision (e.g. we don’t have the rsID in GRCh38 for some reason, other)? If so, we need to make this clear in the docs. If not, we…


Failure to detect mutations in U2AF1 due to changes in the GRCh38 reference sequence

Materials and Methods Genomic data was collected as part of the MDS National History Study or The Cancer Genome Atlas project and consented appropriately under those protocols 8 Sekeres M.A. Gore S.D. Stablein D.M. DiFronzo N. Abel G.A. DeZern A.E. Troy J.D. Rollison D.E. Thomas J.W. Waclawiw M.A. Liu J.J….


VEP issue: ERROR: Cache assembly version (GRCh37) and database or selected assembly version (GRCh38) do not match

Describe the issue: VEP gives errors even though my query and reference have the same assembly version.
Command: $ ./vep -i examples/homo_sapiens_GRCh37.vcf --cache --refseq
Cache reference details while running install.pl:
? 458
NB: Remember to use --refseq when running the VEP with this cache!
downloading ftp.ensembl.org/pub/release-104/variation/indexed_vep_cache/homo_sapiens_refseq_vep_104_GRCh37.tar.gz
unpacking homo_sapiens_refseq_vep_104_GRCh37.tar.gz
converting cache, this may…


Failed to instantiate plugin dbNSFP in VEP

Hi Team, My VEP (version 105, installed by perl INSTALL.pl) works well, but I have problems using the dbNSFP plugin (also installed by perl INSTALL.pl) with VEP. My dbNSFP version 4.2a was installed by the following code without any warning…


SNP2TFBS

Viewing variants that affect TF binding: Results

SNP identifier: rs1800629 (dbSNP)
Chrom id (Feb 2009 GRCh37/hg19): NC_000006.11 (chr6)
SNP position: 31543031
NB. of TF factors: 1

TF name, PWM score on Ref, PWM score on Alt, Score difference, Low Score Thr, High Score Thr
MZF1_1-4, 1024, ….


Bioconductor – BSgenome.Hsapiens.UCSC.hg19

This package is for version 3.2 of Bioconductor; for the stable, up-to-date release version, see BSgenome.Hsapiens.UCSC.hg19. Full genome sequences for Homo sapiens (UCSC version hg19). Bioconductor version: 3.2. Full genome sequences for Homo sapiens (Human) as provided by UCSC (hg19, Feb. 2009) and stored in Biostrings objects. Author:…


Convert SNP IDs as chr:pos:effect allele:ref allele to rsIDs

I have a set of 58000 SNPs for which the SNP ID is in the format of: chr:pos:effect allele:ref allele (GRCh37 build), but I need to convert this to rsID where one is available for the SNP. I’ve tried using…
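One way to think about this conversion is as a keyed lookup in which allele order does not matter, since the IDs give effect:ref rather than ref:alt. The Python sketch below is illustrative only: the function names are invented, the lookup table is a stand-in for one you would build yourself from a GRCh37 dbSNP resource, and matching on the unordered allele pair ignores strand flips and multi-allelic sites.

```python
from typing import NamedTuple, Optional


class SnpId(NamedTuple):
    chrom: str
    pos: int
    effect: str
    ref: str


def parse_snp_id(snp_id: str) -> SnpId:
    """Split 'chr:pos:effect:ref' into typed fields, dropping any 'chr' prefix."""
    chrom, pos, effect, ref = snp_id.strip().split(":")
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return SnpId(chrom, int(pos), effect.upper(), ref.upper())


def lookup_key(snp: SnpId):
    """Key on position plus the unordered allele pair."""
    return (snp.chrom, snp.pos, frozenset((snp.effect, snp.ref)))


def to_rsid(snp_id: str, table: dict) -> Optional[str]:
    """Return the rsID for an ID string, or None when the table has no match."""
    return table.get(lookup_key(parse_snp_id(snp_id)))


# Illustrative single-entry table; a real one would hold every site of interest.
table = {("6", 31543031, frozenset(("A", "G"))): "rs1800629"}
print(to_rsid("6:31543031:A:G", table))
print(to_rsid("chr6:31543031:G:A", table))
```

With 58000 IDs a dictionary lookup like this is fast; the expensive part is building the table for the right genome build.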


GEMINI ISSUE

Using gemini found at: /usr/local/bin/gemini
/usr/local/share/gemini/anaconda/lib/python2.7/site-packages/gemini/config.py:61: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read msg.pyyaml.org/load for full details.
config = yaml.load(in_handle)
CADD scores are being loaded (to skip use:--skip-cadd).
GERP per bp is being loaded (to skip use:--skip-gerp-bp).
Traceback (most recent call last):…


Gene coordinates for hg19

Hi, is there a list which gives for each gene its starting coordinate (chr:pos) and its ending one with respect to the hg19 reference genome? I have a list of positions on hg19 expressed as chr:pos and I have to assign each one to the…


Alternate nucleotide is more frequent than reference nucleotide. OMG I’m dizzy. How do I stop the twirl?

This is due to the fact that the very reference genomes that we use for re-alignment are themselves based on individuals who carry rare risk alleles. Thus, when we call variants against these genomes, we are, at many loci, comparing against rare disease risk alleles. As the best/worst example (depending…


snpEFF not able to download GRCH38 ?

Hi, why is snpEff not able to download GRCh38? It always shows an error, but it works well with the GRCh37 reference. Thanks for your comments.

likithreddy@Curium:~/Downloads/snpEff_latest_core/snpEff$ java -jar snpEff.jar download GRCh38.76
java.lang.RuntimeException: Property: 'GRCh38.76.genome' not found
at org.snpeff.interval.Genome.<init>(Genome.java:106)
at org.snpeff.snpEffect.Config.readGenomeConfig(Config.java:681)
at org.snpeff.snpEffect.Config.readConfig(Config.java:649)
at…


Phasing with SHAPEIT

Edit June 7, 2020: The code below is for pre-phasing with SHAPEIT2. For phased imputation using the output of SHAPEIT2 and ultimate production of phased VCFs, see my answer here: A: ERROR: You must specify a valid interval for imputation using the -int argument, So, the steps are usually: pre-phasing…


Picard CalculateHsMetrics perTargetCoverage for Novaseq bams

Hello, I would like to use Picard’s CalculateHsMetrics to calculate per target coverage for Novaseq bam files. It seems that the tool is not able to calculate mean/normalized coverage for Novaseq bams but works well with Hiseq bams. Novaseq bams report quality scores…


Produce PCA bi-plot for 1000 Genomes Phase III

Note1 – Previous version: Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old) Note2 – this data is for hg19 / GRCh37 Note3 – GRCh38 data is available HERE The tutorial has been updated based on the 1000 Genomes Phase III imputed genotypes. The original tutorial was…


UCSC liftover

Hi, I’m using UCSC liftover to convert hg19 to hg38, and I don’t understand the result.

Feb. 2009 (GRCh37/hg19) → Dec. 2013 (GRCh38/hg38): chr1:120904787 → chr1:143905854
Dec. 2013 (GRCh38/hg38) → Feb. 2009 (GRCh37/hg19): chr1:143905854 → chr1:149400430

(I didn’t check “Allow multiple output regions”.)…
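The situation in this question, where lifting a position forward and then back does not return the starting coordinate, can be detected with a simple round-trip check. The Python sketch below uses plain dictionaries as stand-ins for real liftover calls, filled with the coordinates quoted in the post; `round_trip` and `is_reciprocal` are illustrative names, not part of any liftover tool.

```python
def round_trip(position, forward, backward):
    """Lift a position forward, then back; either step may be None if unmapped."""
    lifted = forward.get(position)
    if lifted is None:
        return None, None
    return lifted, backward.get(lifted)


def is_reciprocal(position, forward, backward):
    """True only when the forward lift exists and lifting back returns the start."""
    lifted, back = round_trip(position, forward, backward)
    return lifted is not None and back == position


# Stand-in mappings built from the coordinates in the question.
hg19_to_hg38 = {("chr1", 120904787): ("chr1", 143905854)}
hg38_to_hg19 = {("chr1", 143905854): ("chr1", 149400430)}

start = ("chr1", 120904787)
print(round_trip(start, hg19_to_hg38, hg38_to_hg19))
print(is_reciprocal(start, hg19_to_hg38, hg38_to_hg19))
```

Positions that fail such a check are candidates for manual inspection or exclusion rather than proof that either conversion is wrong.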


Bioconductor – GGtools

DOI: 10.18129/B9.bioc.GGtools This package is for version 3.12 of Bioconductor. This package has been removed from Bioconductor. For the last stable, up-to-date release version, see GGtools. Software and data for analyses in genetics of gene expression. Bioconductor version: 3.12. Software and data for analyses in genetics of gene…


Pericentromeric noncoding RNA changes DNA binding of CTCF and inflammatory gene expression in senescence and cancer

Significance: During the aging process, senescent cells secrete inflammatory factors, causing various age-related pathologies. Thus, controlling the senescence-associated secretory phenotype (SASP) can tremendously benefit human health. Although SASP seems to be induced by the alteration of chromosomal organization, its underlying mechanism remains unclear. Here, it has been revealed that noncoding…


Need suggestions about pathogenicity prediction of gdc level 3 SNV file

Hi, I am trying to figure out which tool is most accurate in terms of pathogenicity prediction of TCGA SNVs level 3 data. TCGA offers SIFT, PolyPhen, and IMPACT scores for different kinds of mutations. SIFT, and PolyPhen cover mainly “Missense Mutation”, while IMPACT categorizes every kind of mutation into…


GRCh37 GFF filter transcript isoforms by RefSeq Select tag or longest

Dear all, I tried to filter the “RefSeq Select” transcript isoforms in the GRCh37.p13 human genome annotation gff (GCF_000001405.25_GRCh37.p13_genomic.gff.gz). Specifically my goal is to retain for each gene a transcript isoform with a tag=RefSeq Select attribute if one exists,…
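The selection rule described here (prefer the tagged isoform, otherwise take the longest) could be sketched in Python roughly as follows. Several things are assumptions: the function names, the restriction to `mRNA` features, that the gene is given by the `Parent` attribute, and the use of genomic span as a proxy for length (a stricter version would sum exon lengths).

```python
import io


def parse_attributes(field):
    """Turn a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def pick_transcripts(handle, transcript_types=("mRNA",)):
    """Return {gene: transcript ID}, preferring 'RefSeq Select', then longest span."""
    best = {}  # gene -> (is_select, span, transcript_id)
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in transcript_types:
            continue
        attrs = parse_attributes(cols[8])
        gene, tx_id = attrs.get("Parent"), attrs.get("ID")
        if gene is None or tx_id is None:
            continue
        is_select = "RefSeq Select" in attrs.get("tag", "").split(",")
        span = int(cols[4]) - int(cols[3]) + 1  # genomic span, not spliced length
        candidate = (is_select, span, tx_id)
        # Tuples compare element-wise: the tag wins first, then the larger span.
        if gene not in best or candidate[:2] > best[gene][:2]:
            best[gene] = candidate
    return {gene: tx_id for gene, (_, _, tx_id) in best.items()}


def row(start, end, attributes):
    """Build one toy mRNA feature line (all values here are made up)."""
    return "\t".join(["chrT", "toy", "mRNA", str(start), str(end), ".", "+", ".", attributes])


toy_gff = io.StringIO("\n".join([
    "##gff-version 3",
    row(1, 1000, "ID=tx1;Parent=geneA"),
    row(1, 500, "ID=tx2;Parent=geneA;tag=RefSeq Select"),
    row(1, 300, "ID=tx3;Parent=geneB"),
    row(1, 800, "ID=tx4;Parent=geneB"),
]) + "\n")
print(pick_transcripts(toy_gff))
```

The returned transcript IDs can then be used to filter the original file, keeping each chosen transcript together with its child exon and CDS features.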


What is the difference between GRCh37 and hs37? And hg19?

This is what I have found so far. Please correct me if I am wrong. GRCh37 w/o patches includes the primary assembly (22 autosomes, X, Y, and non-chromosomal supercontigs) and alternate scaffolds, but not a reference mitogenome. Non-chromosomal supercontigs are the unlocalized and unplaced scaffolds. The rCRS reference mitogenome in…


Inquiry related to vcf file and formatting

Hello everyone, I am trying to run the PrediXcan software, but it reports a segmentation fault, implying that there is something wrong with my vcf files. Here is the header of the vcf file:

##fileformat=VCFv4.1
##INFO=<ID=LDAF,Number=1,Type=Float,Description="MLE Allele Frequency Accounting for LD">
##INFO=<ID=AVGPOST,Number=1,Type=Float,Description="Average posterior probability from MaCH/Thunder">
##INFO=<ID=RSQ,Number=1,Type=Float,Description="Genotype imputation quality from…


AnnotationHub::mapIds() cannot find existing ENSG (GEO supplemental data cross-referenced with ensembl.org)

Anyone know why I’m not getting ENSG ids for some of these symbols? The example below retrieves `NA` for multiple symbols, including AAED1 [whose ENSG is ENSG00000158122][1].

> library(AnnotationHub)
> library(org.Hs.eg.db)
> library(GEOquery)
> temp download.file(getGEO("GSM4430459")@header$supplementary_file_1,temp)
> genes unlink(temp)
> ensids = mapIds(org.Hs.eg.db, keys=genes, column="ENSEMBL", keytype="SYMBOL", multiVals="first")
> ensids["AAED1"]…
