Tag: rsIDs

get gene name from rsID

I’ve got a list of rsIDs in xlsx format and need to get the gene name for each rsID. When I use this command, I get the gene name: esearch -db snp -query "rs573455" | esummary | xtract -pattern GENE_E -element NAME |…

Evaluating 17 methods incorporating biological function with GWAS summary statistics to accelerate discovery demonstrates a tradeoff between high sensitivity and high positive predictive value

Method selection: We reviewed the published literature through February 2020 to identify methods that met the following criteria: (i) descriptively categorized as (a) annotation-based, (b) pleiotropy-based, or (c) eQTL-based; (ii) utilized GWAS summary statistics, as opposed to individual-level genotype data; (iii) implemented using freely available software or packages; (iv) provided either…

Genetic data QC prior to imputation

Hi there, should SNPs that have this sort of name, ‘exm_…’, be removed from genetic data at the QC stage? Not necessarily. They used this ID because it was part of their ExomeSNP array, probably because there was no rsID at the time, for example this one: www.ncbi.nlm.nih.gov/projects/SNP/snp_ss.cgi?subsnp_id=ss1958317049 Should SNPs…

ILIAD: a suite of automated Snakemake workflows for processing genomic data for downstream applications | BMC Bioinformatics

Pipeline architecture and configuration file: Genomic data processing poses a challenge for genetic research studies because it involves multiple program dependency installations, vast numbers of samples with raw data from various next-generation sequencing (NGS) platforms, and inconsistent genetic variant IDs and/or positions among datasets. The Iliad suite of genomic data…

Locally annotating SNP IDs and Gene names of called variants

I have GWAS results after variant calling. The VCF file only had CHR (1:22) and POS (12345678 etc.) information, but the ID column is all ".", i.e. it contains no rsIDs. After GWAS analysis I have a list of…

Mapping SNP rsIds/ positions to genes coding for proteins

I have started a new project and am a beginner in bioinformatics. I have a summary statistics file from a GWAS study. The file has information such as the chromosome number and the position of each SNP. The rsIDs are not included. Now…

How to add rsIDs to VCF?

Hey, it’s been quite some time, but if anyone else is having this problem, I just wanted to say that the following command worked for me: bcftools annotate -a /data/references/hg19/pipe/dbsnp138/00-All.vcf.gz -c ID -o samtools_annotated.vcf.gz samtools.vcf.gz The thing to look out for is, I think it…

Retrieve rs IDs from chromosome location info on hg19 build

Hello, community! I am wondering if it is possible to obtain the rs IDs of variants when the information I have looks like this: chr1:123456 123456 123456 A G The column names are “variant_id”, “start_hg19”, “end_hg19”, “ref”, “alt”, respectively. I…

Bioconductor Rsids

Comment: Sample size estimation in R, by Nana: Alright. I have also posted in biostars and waiting for a response. I just looked up as ssize and it seems to be exclusive to Microarray d… Comment: Sample size estimation in R, by James W. MacDonald: Oh wait….

bcftools – How can I retrieve the GRCh38 coordinates of a list of rsids?

I have a list of about 100,000 rsids and I want to get their genomic coordinates on the GRCh38 genome build. Is there a command line tool that allows me to do this? If yes, which one? I have tried bcftools but, given the error message I got, I believe…

how to plot heatmap in r for all the gene across tissues to see distribution of snps

I have a dataframe which looks like this: dput(chr22_gene[,1:49]) structure(c(0.0531016390078734, -0.00413407782001034, -0.035434632568444, 0.00968736935965742, 0, 0, 0, 0, 0, 0, 0, -0.1983546, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…

Why aren’t authors specific about patch version?

I am having difficulty finding the exact assembly version (e.g. patch version) of GRCh38 used for major databases. For instance, gnomAD says “GRCh38”. But the only information on the version, for v3.1, comes from here, which says it “uses an updated version of Variant Effect Predictor (VEP) based on the…

rs3750846 RefSNP Report – dbSNP

ALFA Allele Frequency: The ALFA project provides aggregate allele frequency from dbGaP. More information is available on the project page, including descriptions, data access, and terms of use. Release Version: 20201027095038. The Frequency tab displays a table of the reference and alternate allele frequencies reported by various studies and populations. Table lines,…

Use GraphQL to query multiple rsIDs from GWAS study in Open Targets

I have a list of rsIDs from a GWAS dataset. I was wondering: a) if and how I can use the GraphQL API from Open Targets to query…

1000 genomes hg38 with dbSNP rsid

Hi, does anyone know where I can download the latest version of 1000 Genomes, on build hg38, in VCF format (or PLINK format), that ALSO contains the dbSNP rsID in the VCF ID field? I looked at the IGSR website, dbSNP, UCSC, etc. So…

PGx question regarding RSIDs with more than one variant associated with it

Hi all, I am attempting to build a database of rsIDs that can be used to predict the function of certain drugs via the metabolism levels of a gene, and then compare it to my WES data. I…

Automated dbSNP lookup by rsID position, plus genome build liftover

Hola, just passing by to say ‘hi’. Please post bugs / suggestions as comments to this tutorial.
rsID to position, GRCh38:
cat rsids.list
rs1296488112
rs1226262848
rs1225501837
rs1484860612
rs1235553513
rs1424506967
cat rsids.list | while read rsid ; do pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS\>.{0,15}' |…

Visualise a variant position in a gene

Hi there, I have a set of three rsIDs from the same gene and I would like to visualise their positions relative to the gene. The closest software I found to what I want to make is Biodalliance, but I…

Finding dbSNP 129 rsIDs for lifting over hg18 sumstats to hg38

Hi all, I’m currently working on lifting a summary statistics file from the genome build hg18 to hg38. The file format for the sumstats is as follows: marker Allele1 Allele2 beta1 SE pValue chr Bp chr10:100004799 a c…

Manual Polygenic Risk Score calculation

Hi all, I am attempting to calculate a PRS manually, and I’m very close to obtaining a score. To recap what has been done: I have an individual patient whose VCF I annotated with rsIDs. From there, I went to PGS catalog…
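
As a rough illustration of what a manual calculation usually amounts to, the sketch below computes a score as the sum of effect weight times effect-allele dosage. It is not the poster's method: the function name, the rsIDs, the weights, and the choice to skip variants missing from the individual are all assumptions made for the example.

```python
from typing import Dict


def polygenic_score(weights: Dict[str, float], dosages: Dict[str, int]) -> float:
    """Sum effect weight x effect-allele dosage over variants found in both inputs."""
    score = 0.0
    for rsid, weight in weights.items():
        dosage = dosages.get(rsid)
        if dosage is None:
            # Assumption: variants absent from the individual's data are skipped.
            continue
        score += weight * dosage
    return score


# Hypothetical values, not taken from any real scoring file or genotype.
weights = {"rs0000001": 0.5, "rs0000002": -0.25, "rs0000003": 0.125}
dosages = {"rs0000001": 2, "rs0000002": 1}  # copies of the effect allele (0, 1 or 2)

print(polygenic_score(weights, dosages))
```

The dosage has to count the same allele that the weight refers to, so the effect allele in the scoring file should be checked against the REF/ALT alleles in the VCF before summing.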

get build 37 positions from dbSNP rsIDs

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select chrom,chromStart,chromEnd,name from snp147 where name in ("rs371194064","rs779258992","rs26","rs25")'
+-------+------------+----------+-------------+
| chrom | chromStart | chromEnd | name        |
+-------+------------+----------+-------------+
| chr7  | 11584141   | 11584142 | rs25        |
| chr7  | 11583470   | 11583471 …
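
UCSC tables store chromStart as 0-based and chromEnd as 1-based (half-open intervals), so for a single-base variant the conventional 1-based position equals chromEnd. A minimal sketch of that conversion follows; the function name is hypothetical and it deliberately handles only single-base intervals.

```python
def ucsc_snv_position(chrom_start: int, chrom_end: int) -> int:
    """Return the 1-based position of a single-base variant given UCSC
    0-based, half-open (chromStart, chromEnd) coordinates."""
    if chrom_end - chrom_start != 1:
        # Insertions, deletions and multi-base variants need separate handling.
        raise ValueError("not a single-base interval")
    return chrom_start + 1


# The rs25 row from the query output: chr7, chromStart 11584141, chromEnd 11584142.
print(ucsc_snv_position(11584141, 11584142))  # 11584142
```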

Where to download the MAF database from 1000 genomes? ANNOVAR looks incomplete

I need the MAF for a list of SNPs. I’m using the latest 1000 Genomes variant file from ANNOVAR, available here: www.openbioinformatics.org/annovar/download/hg19_1000g2015aug.zip But this list only has some of the rsIDs included in NCBI’s dbSNP. I need…

Where to find vcf of dbsnp build 144 ?

Hi everyone, I have zipped VCF files that I would like to annotate using hg19 dbSNP 144. I have BED files for each chromosome but, based on other Biostars answers (How to add rsIDs to VCF?), it seems it is easier…

As of July 2015, the VCFtools project has been moved to github! Please visit the new website here: vcftools.github.io/man_0112a.html

Contents: NAME, SYNOPSIS, DESCRIPTION, EXAMPLES, BASIC OPTIONS, SITE FILTERING OPTIONS, INDIVIDUAL FILTERING OPTIONS, GENOTYPE FILTERING OPTIONS, OUTPUT OPTIONS, COMPARISON OPTIONS, AUTHOR
NAME: VCFtools v0.1.12a - Utilities for the variant call format (VCF) and binary variant call format (BCF)
SYNOPSIS: vcftools [ --vcf FILE | --gzvcf FILE | --bcf FILE]…

Updating hg 18 .bim file with lifted .map and .bed file

Hello, I am trying to update rsIDs in an hg18 .bim file with an hg38.bed and hg38.map file. I’ve tried the following: system("./plink --file plink_hg38 --make-just-bim --out newBim --allow-extra-chr") but got the error: Error: Failed to open plink_hg38.ped….

Using QCTOOL v2 to process UK Biobank .bgen files

Why so slow? I’m currently using QCTOOL v2 to process imputed .bgen files from UK Biobank, but they seem to be processing very slowly. Is this normal? My command is pretty basic; I’m filtering out a list of SNPs…

rs532111960 RefSNP Report – dbSNP

Help: The Variant Details tab shows known variant placements on genomic sequences: chromosomes (NC_), RefSeqGene, pseudogenes or genomic regions (NG_), and in a separate table: on transcripts (NM_) and protein sequences (NP_). The corresponding transcript and protein locations are listed in adjacent lines, along with molecular consequences from Sequence Ontology. When…

rs9789283 RefSNP Report – dbSNP

Help: The Variant Details tab shows known variant placements on genomic sequences: chromosomes (NC_), RefSeqGene, pseudogenes or genomic regions (NG_), and in a separate table: on transcripts (NM_) and protein sequences (NP_). The corresponding transcript and protein locations are listed in adjacent lines, along with molecular consequences from Sequence Ontology. When…

qctool to merge two bgen file fails with no clear reason to

Hi, I am trying to merge two bgen files using qctool as explained here. I am using qctool_v2.2.0. The command works but ends with an error: ❱ qctool -g bug/in2.bgen -s bug/in2.sample -merge-in bug/in1.bgen bug/in1.sample -og bla.bgen -os bla.sample Welcome to qctool (version: 2.2.0, revision: unknown) (C) 2009-2020 University of…

Convert SNP IDs as chr:pos:effect allele:ref allele to rsIDs

I have a set of 58,000 SNPs for which the SNP ID is in the format chr:pos:effect allele:ref allele (GRCh37 build), but I need to convert this to an rsID where one is available for the SNP. I’ve tried using…
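
One way to approach such a conversion is to split each ID into its parts and look the variant up in a position-keyed table built from a dbSNP file on the same build. The sketch below assumes such a table already exists as an in-memory dictionary; the function names, the dictionary layout, and the rsID in it are hypothetical. Alleles are compared as an unordered pair so the effect/ref order does not matter, which is an assumption that ignores strand flips.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, int, frozenset]


def parse_variant_id(variant_id: str) -> Key:
    """Split 'chr:pos:effect:ref' into (chromosome, position, unordered allele pair)."""
    chrom, pos, effect, ref = variant_id.split(":")
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]  # accept both '1' and 'chr1'
    return (chrom, int(pos), frozenset((effect.upper(), ref.upper())))


def to_rsid(variant_id: str, lookup: Dict[Key, str]) -> Optional[str]:
    """Return the rsID for a variant ID, or None if the lookup table has no match."""
    return lookup.get(parse_variant_id(variant_id))


# Hypothetical lookup table; in practice it would be built from a dbSNP VCF (GRCh37).
lookup = {("1", 123456, frozenset(("A", "G"))): "rs0000001"}

print(to_rsid("chr1:123456:G:A", lookup))
print(to_rsid("1:999:A:G", lookup))
```

For tens of thousands of variants an in-memory dictionary restricted to the positions of interest is usually enough; the full dbSNP file would be filtered down before loading.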

How to search dbSNP using a list of SNPs and retrieve Gene name (hgnc symbol if existing, otherwise just whatever is in there)

I have a list of 500,000 SNPs for which I want to obtain the gene name. I tried searching with biomaRt: library(data.table) library(biomaRt) rs <-…

Calculate LD matrix from bgen file

Formatting error: Hello, I am new to plink and am learning as I go. I am trying to calculate an LD matrix for a list of variants while using a bgen file as my reference population. See the command below: ./plink2/plink2 --r2 bin…
