Tag: GRCH37

links to Ensembl GRCh37 – gitmetadata

Open Targets Genetics reports GRCh38 coordinates but ‘External references” section points to GRCh37 (grch37.ensembl.org) rather than GRCh38 (www.ensembl.org): genetics.opentargets.org/variant/8_102432699_T_C Was this a deliberate decision (e.g. we don’t have the rsID in GRCh38 for some reason, other)? If so, we need to make this clear in the docs. If not, we…

Continue Reading links to Ensembl GRCh37 – gitmetadata

Failure to detect mutations in U2AF1 due to changes in the GRCh38 reference sequence

Materials and Methods Genomic data was collected as part of the MDS National History Study or The Cancer Genome Atlas project and consented appropriately under those protocols 8 Sekeres M.A. Gore S.D. Stablein D.M. DiFronzo N. Abel G.A. DeZern A.E. Troy J.D. Rollison D.E. Thomas J.W. Waclawiw M.A. Liu J.J….

Continue Reading Failure to detect mutations in U2AF1 due to changes in the GRCh38 reference sequence

VEP issue: ERROR: Cache assembly version (GRCh37) and database or selected assembly version (GRCh38) do not match

Describe the issue VEP give errors even my query and reference has same assembly version Command :$: ./vep -i examples/homo_sapiens_GRCh37.vcf –cache –refseq cache reference details while running install.pl ? 458 NB: Remember to use –refseq when running the VEP with this cache! downloading ftp.ensembl.org/pub/release-104/variation/indexed_vep_cache/homo_sapiens_refseq_vep_104_GRCh37.tar.gz unpacking homo_sapiens_refseq_vep_104_GRCh37.tar.gz converting cache, this may…

Continue Reading VEP issue: ERROR: Cache assembly version (GRCh37) and database or selected assembly version (GRCh38) do not match

Failed to instantiate plugin dbNSFP in VEP

Failed to instantiate plugin dbNSFP in VEP 0 Hi Team, My VEP (version 105, installed by perl INSTALL.pl) works well. But I face some problems to use dbNSFP plugin (also installed by perl INSTALL.pl) with VEP tool. My dbNSFP version 4.2a was installed by the following code without any warning…

Continue Reading Failed to instantiate plugin dbNSFP in VEP

SNP2TFBS

SNP2TFBS Viewing variants that affect TF binding – Results – SNP identifier Chrom id (Feb 2009 GRCh37/hg19) SNP position NB. of TF factors rs1800629   dbSNP NC_000006.11 (chr6) 31543031 1 TF name  PWM score on Ref PWM score on Alt Score difference Low Score Thr High Score Thr MZF1_1-4  1024  ….

Continue Reading SNP2TFBS

Bioconductor – BSgenome.Hsapiens.UCSC.hg19

    This package is for version 3.2 of Bioconductor; for the stable, up-to-date release version, see BSgenome.Hsapiens.UCSC.hg19. Full genome sequences for Homo sapiens (UCSC version hg19) Bioconductor version: 3.2 Full genome sequences for Homo sapiens (Human) as provided by UCSC (hg19, Feb. 2009) and stored in Biostrings objects. Author:…

Continue Reading Bioconductor – BSgenome.Hsapiens.UCSC.hg19

Convert SNP IDs as chr:pos:effect allele:ref allele to rsIDs

Convert SNP IDs as chr:pos:effect allele:ref allele to rsIDs 0 I have a set of 58000 SNPs for which the SNP ID is in the format of: chr:pos:effect allele:ref allele (Grch37 build), but I need to convert this to rsID where one is available for the SNP. I’ve tried using…

Continue Reading Convert SNP IDs as chr:pos:effect allele:ref allele to rsIDs

GEMINI ISSUE

Using gemini found at: /usr/local/bin/gemini /usr/local/share/gemini/anaconda/lib/python2.7/site-packages/gemini/config.py:61: YAMLLoadWarning: calling yaml.load() without Loader=… is deprecated, as the default Loader is unsafe. Please read msg.pyyaml.org/load for full details. config = yaml.load(in_handle) CADD scores are being loaded (to skip use:–skip-cadd). GERP per bp is being loaded (to skip use:–skip-gerp-bp). Traceback (most recent call last):…

Continue Reading GEMINI ISSUE

Gene coordinates for hg19

Gene coordinates for hg19 0 Hi, is there a list which gives for each gene its starting coordinate (chr:pos) and its ending one with respect to the hg19 reference genome? I have a list of positions on hg19 expressed as chr:pos and I have to assign each one to the…

Continue Reading Gene coordinates for hg19

Alternate nucleotide is more frequent than reference nucleotide. OMG I’m dizzy. How do I stop the twirl?

This is due to the fact that the very reference genomes that we use for re-alignment are themselves based on individuals who carry rare risk alleles. Thus, when we call variants against these genomes, we are, at many loci, comparing against rare disease risk alleles. As the best/worst example (depending…

Continue Reading Alternate nucleotide is more frequent than reference nucleotide. OMG I’m dizzy. How do I stop the twirl?

snpEFF not able to download GRCH38 ?

snpEFF not able to download GRCH38 ? 2 HI Why snpEff not able to download GRCH38 ? Always its showing error, But its work well with GRCH37 reference. Thanks for your comments. likithreddy@Curium:~/Downloads/snpEff_latest_core/snpEff$ java -jar snpEff.jar download GRCh38.76 java.lang.RuntimeException: Property: ‘GRCh38.76.genome’ not found at org.snpeff.interval.Genome.<init>(Genome.java:106) at org.snpeff.snpEffect.Config.readGenomeConfig(Config.java:681) at org.snpeff.snpEffect.Config.readConfig(Config.java:649) at…

Continue Reading snpEFF not able to download GRCH38 ?

Phasing with SHAPEIT

Edit June 7, 2020: The code below is for pre-phasing with SHAPEIT2. For phased imputation using the output of SHAPEIT2 and ultimate production of phased VCFs, see my answer here: A: ERROR: You must specify a valid interval for imputation using the -int argument, So, the steps are usually: pre-phasing…

Continue Reading Phasing with SHAPEIT

Picard CalculateHsMetrics perTargetCoverage for Novaseq bams

Picard CalculateHsMetrics perTargetCoverage for Novaseq bams 0 Hello, I would like to use Picard’s CalculateHsMetrics to calculate per target coverage for Novaseq bam files. It seems that the tool is not able to calculate mean/normalized coverage for Novaseq bams but works well with Hiseq bams. Novaseq bams report quality scores…

Continue Reading Picard CalculateHsMetrics perTargetCoverage for Novaseq bams

Produce PCA bi-plot for 1000 Genomes Phase III

Note1 – Previous version: Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old) Note2 – this data is for hg19 / GRCh37 Note3 – GRCh38 data is available HERE The tutorial has been updated based on the 1000 Genomes Phase III imputed genotypes. The original tutorial was…

Continue Reading Produce PCA bi-plot for 1000 Genomes Phase III

UCSC liftover

UCSC liftover 2 Hi, I’m using UCSC liftover to convert hg19 to hg38. The result came out that I don’t understand. Feb. 2009 (GRCh37/hg19) → Dec. 2013 (GRCh38/hg38) – chr1:120904787 → chr1:143905854 Dec. 2013 (GRCh38/hg38) → Feb. 2009 (GRCh37/hg19) – chr1:143905854 → chr1:149400430 (I didn’t check “Allow multiple output regions”.)…

Continue Reading UCSC liftover

Bioconductor – GGtools

DOI: 10.18129/B9.bioc.GGtools     This package is for version 3.12 of Bioconductor. This package has been removed from Bioconductor. For the last stable, up-to-date release version, see GGtools. software and data for analyses in genetics of gene expression Bioconductor version: 3.12 software and data for analyses in genetics of gene…

Continue Reading Bioconductor – GGtools

Pericentromeric noncoding RNA changes DNA binding of CTCF and inflammatory gene expression in senescence and cancer

Significance During the aging process, senescent cells secrete inflammatory factors, causing various age-related pathologies. Thus, controlling the senescence-associated secretory phenotype (SASP) can tremendously benefit human health. Although SASP seems to be induced by the alteration of chromosomal organization, its underlying mechanism remains unclear. Here, it has been revealed that noncoding…

Continue Reading Pericentromeric noncoding RNA changes DNA binding of CTCF and inflammatory gene expression in senescence and cancer

Need suggestions about pathogenicity prediction of gdc level 3 SNV file

Hi, I am trying to figure out which tool is most accurate in terms of pathogenicity prediction of TCGA SNVs level 3 data. TCGA offers SIFT, PolyPhen, and IMPACT scores for different kinds of mutations. SIFT, and PolyPhen cover mainly “Missense Mutation”, while IMPACT categorizes every kind of mutation into…

Continue Reading Need suggestions about pathogenicity prediction of gdc level 3 SNV file

GRCh37 GFF filter transcript isoforms by RefSeq Select tag or longest

GRCh37 GFF filter transcript isoforms by RefSeq Select tag or longest 0 Dear all, I tried to filter the “RefSeq Select” transcript isoforms in the GRCh37.p13 human genome annotation gff (GCF_000001405.25_GRCh37.p13_genomic.gff.gz). Specifically my goal is to retain for each gene a transcript isoform with a tag=RefSeq Select attribute if exists,…

Continue Reading GRCh37 GFF filter transcript isoforms by RefSeq Select tag or longest

What is the difference between GRCh37 and hs37? And hg19?

This is what I have found so far. Please correct me if I am wrong. GRCh37 w/o patches includes the primary assembly (22 autosomal, X. Y, and non-chromosomal supecontigs) and alternate scaffolds, but not a reference mitogenome. Non-chromosomal supercontigs are the unlocalized and unplaced scaffolds. The rCRS reference mitogenome in…

Continue Reading What is the difference between GRCh37 and hs37? And hg19?

Inquiry related to vcf file and formatting

Hello everyone, I am trying to run predixcan software. But its showing error as segmentation fault implying that there is something wrong with my vcf files. I am sharing the header of vcf file. ##fileformat=VCFv4.1 ##INFO=<ID=LDAF,Number=1,Type=Float,Description=”MLE Allele Frequency Accounting for LD”> ##INFO=<ID=AVGPOST,Number=1,Type=Float,Description=”Average posterior probability from MaCH/Thunder”> ##INFO=<ID=RSQ,Number=1,Type=Float,Description=”Genotype imputation quality from…

Continue Reading Inquiry related to vcf file and formatting

AnnotationHub::mapIds() cannot find existing ENSG (GEO supplemental data cross-referenced with ensembl.org)

Anyone know why I’m not getting ENSG ids for some of these symbols? The example below retrieves `NA` for multiple symbols, including AAED1 [whose ENSG is ENSG00000158122][1]. “` > library(AnnotationHub) > library(org.Hs.eg.db) > library(GEOquery) > temp download.file(getGEO(“GSM4430459″)@header$supplementary_file_1,temp) > genes unlink(temp) > ensids = mapIds(org.Hs.eg.db, keys=genes, column=”ENSEMBL”, keytype=”SYMBOL”, multiVals=”first”) > ensids[“AAED1”]…

Continue Reading AnnotationHub::mapIds() cannot find existing ENSG (GEO supplemental data cross-referenced with ensembl.org)