Category: plink

Best tool for genotype – phenotype correlation

Hello, I need to perform genotype – phenotype correlation analysis. I know PLINK could be used for this purpose, but with PLINK many file preparation steps need to be done before running the actual step. I have VEP-annotated VCFs. Maybe other…

Why does write.ped remove the first locus?

In order to get a VCF file from a genind object, I am going through the hierfstat function write.ped() and then converting the result to VCF with plink. This is my code (apologies, but I cannot provide reproducible data for this particular scenario):…

deflated QQ plot but lambda >1

Dear all, what might be the reason for a deflated QQ plot when lambda is > 1? The GWAS (case-control, glm logistic regression adjusting for PC1-PC3 and three covariates) was done in plink2.0, and the QQ plot was made with the qqman package. TIA…
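
For readers comparing a QQ plot against the reported lambda, here is a minimal sketch of how the genomic inflation factor is commonly computed from p-values. It assumes two-sided p-values from 1-degree-of-freedom tests; the function name `genomic_inflation` is illustrative and not part of PLINK or qqman.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Return lambda_GC: median observed 1-df chi-square / expected median."""
    nd = NormalDist()
    # A two-sided p-value corresponds to |z| = -inv_cdf(p / 2); chi2 = z^2.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values if 0.0 < p < 1.0]
    if not chi2:
        raise ValueError("no usable p-values in (0, 1)")
    # Median of a 1-df chi-square: square of the 75th percentile of N(0, 1),
    # roughly 0.4549.
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median
```

Because lambda is driven by the median of the test statistics, it can exceed 1 even when the tail of the QQ plot sits below the diagonal, so the two summaries need not agree.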

Population stratification with PCA

Hi all! I have a genotype dataset in plink format. Now I want to correct for population structure with PCA in association analysis. I split my dataset into training and testing datasets. I want to do the PCA only in the training dataset and use…

How to convert GEN or .gen format from impute.me to vcf on windows 10?

I tried for days to convert a gen file to vcf but it did not work. I am a beginner so I don’t know what is in vcf files and gen files or how they…

How to import dosage information to plink binary files?

Hi All, I recently converted very large TOPMed-imputed VCF files into plink format. The command I used to convert this VCF was plink1.9 --vcf ${VCF} --make-bed --out ${VCF}_binary. Additionally, I also spent a significant amount of time…

Problems Imputing X Chromosome with TOPMed

I have a large dataset whose autosomes I was able to successfully phase and impute using TOPMed. I have tried doing the same with the X chromosome but keep running into issues. Before trying to impute with TOPMed, I did per-individual QC and per-marker QC, then ran checkVCF, and corrected…

Plink1.9 error -chr not recognised

Hello, I am trying to run PLINK on my tped file (SNP data set) to test for linkage disequilibrium, but it appears that I need to set some chromosome options because I get an error at some line. My data is not mapped….

Convert plink files from

I have plink files where the .bim file is in the following format, and I want to convert the chr:pos column into rsIDs.
1 1:10177:A:AC 0 10177 AC A
1 1:10352:T:TA 0 10352 TA T
1 1:11008:C:G 0 11008 G C
1 1:11012:C:G 0 11012…
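
A minimal sketch of the renaming step, assuming a lookup table from the existing chr:pos:ref:alt IDs to rsIDs is already available (for example, built from a dbSNP export). The names `rename_variant` and `rsid_lookup`, and the placeholder ID `rsEXAMPLE1`, are hypothetical; variants missing from the lookup keep their original ID.

```python
def rename_variant(bim_line, rsid_lookup):
    """Rewrite the variant ID (column 2) of one .bim line using a lookup dict."""
    # .bim columns: chromosome, variant ID, genetic distance, bp position,
    # allele 1, allele 2.
    chrom, var_id, cm, pos, a1, a2 = bim_line.split()
    new_id = rsid_lookup.get(var_id, var_id)  # fall back to the old ID
    return "\t".join([chrom, new_id, cm, pos, a1, a2])


# Hypothetical lookup; real rsIDs would come from an external annotation.
example_lookup = {"1:10177:A:AC": "rsEXAMPLE1"}
print(rename_variant("1 1:10177:A:AC 0 10177 AC A", example_lookup))
```

Keying the lookup on the full chr:pos:ref:alt string rather than on position alone avoids collisions at multi-allelic sites.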

PLINK and population stratification with known subpopulations

I want to perform a genome-wide association study (GWAS) with PLINK 1.9. I have whole genome sequencing SNP calls for ~100 patients where I know in advance that there is a skew towards subpopulations of African and South American descent, with…

Plink v2.0 does not produce a Z-compressed file (.zst)

Good morning, I would like to convert a merged VCF into a Plink compressed format (.pgen, .psam and .pvar files), so I run
plink2 --vcf MyMerged.vcf.gz --make-pgen --zst-level 3 --out MySamples
It basically works, as it produces such files: ls…

PLINK map format splitting by chromosome

Hi everyone, I have a PLINK map file with four columns: chromosome name, SNP ID, genetic distance in morgans (all values are zero), and base-pair position.
NC1.1 rs1 0 145
NC1.1 rs2 0 201
NC2.1 rs3 0 208
. .
NCn.1 rsn 0 509
I…

Recreating QC of 1000 Genomes project – removing non overlapping SNPs

Hi everyone, I am attempting to recreate the quality control analysis performed in the 1000 Genomes project (tcag.ca/documents/tools/omni25_qcReport.pdf). I am fairly new to performing QC on a dataset, and am currently stuck on section 5.1 of the…

PLINK dosage data – convert and/or read in R

I have PLINK dosage files in the form pgen/psam/pvar. I would like to know how to do either/both of the following: 1) convert the file set to bed/bim/fam files (hard calls); 2) read the allele dosage from pgen into R as a continuous…

Can someone explain PLINK allele REF/ALT management strategy?

Sometimes when merging two plink files, the Reference (REF) and Alternative (ALT) alleles may be reversed, e.g. REF G ALT A versus REF A ALT G. The main reason for that is the default action of PLINK. You see, when using…

Checking chromosome builds for genotyping data

Hi, I have several studies’ worth of data (in both PLINK and VCF format), and I was wondering if anyone knew of an online tool which I could use to check my chromosome build, i.e. GRCh37 vs GRCh38. (I thought I used one…

merging multiple .bed /.bgen files uk biobank using plink

Hi All, I am having a problem merging all chromosomal UK Biobank files. I ran the following command.
plink2 --bfile /path/to/file/ukb_imp_chr1 --pmerge-list /path/to/file/merge.list --maf 0.01 --hwe 1e-6 --make-pgen --out /path/to/file/ukb_imp_allchr
I also tried
plink2 --bfile /path/to/file/ukb_imp_chr1 --pmerge-list /path/to/file/merge.list --maf 0.01 --hwe 1e-6 --make-bed --out /path/to/file/ukb_imp_allchr
The merge.list has the following…

UK Biobank Imputed Genotypes

I am using the UK Biobank Imputed Genetic data set and I was wondering if it is possible to use PLINK (or any other software) to view the actual calls for each SNP ranging in value from 0 to 2 (including fractions)? I have been…

The genomic origins of the Bronze Age Tarim Basin mummies

1. Peyrot, M. in Aspects of Globalisation: Mobility, Exchange and the Development of Multi-Cultural States 12–17 (2017). 2. Damgaard, P. et al. 137 ancient human genomes from across the Eurasian steppes. Nature 557, 369–374 (2018). 3. Hemphill, B. E. & Mallory, J….

GWAS Studies

How to create a .ped file for use in PLINK? I have the following csv file: Chromosome Position Sample1 Sample 2 ……….. Sample n Chr Pos Sam1 Sam2 Sam3, Sam 4, ……Sample_n 1 11 A T T, A, A, T, …… 2 141 G G G, T,…

GWAS data from PGC

This is my first time using GWAS data. I downloaded some data from PGS and I have some questions. 1) There are a lot of SNPs for the same gene with different p-values; why does this occur? And if I want to use one…

Distance matrix PCA

Hi all, I generated PCA values for the 1000 Genomes dataset using PLINK. I know how to plot the values for PC1 and PC2, but my question is how can I generate a distance matrix to select nearby samples based on populations? Like for example if I…

Phasing using SHAPEIT

Hello, I need to use SHAPEIT for phasing only, since I will conduct CH (compound heterozygous) analysis for recessive rare variants. I will not perform imputation. I am running SHAPEIT, and I see in the log file it says: Parameters: * Seed: 1442251531 * Parallelisation: 12 threads *…

New datasets for ancestry estimation and imputation?

What datasets are people using nowadays for genotype imputation and ancestry estimation? HapMap and 1000 Genomes are good, but it has been some years since their release and both have some limitations on the number of populations included and resolution (especially HapMap which…

Beagle 5 error: No VCF records found in the specified interval

Hi, I am running into an issue while doing imputation with Beagle 5 and am not sure what is causing the error. I have vcf files converted from PLINK by the following command ./plink --bfile qcd_in --chr 20 --recode vcf-iid…

Math behind association with PLINK

Hi, what is the mathematical formula behind the --linear association test used by plink? The most basic association test is just a chi-squared test on a 2 x 2 contingency table of the minor allele tallies, as to…
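
The basic allelic test mentioned in the answer can be sketched as follows. This is a plain Pearson chi-squared test on a 2 x 2 table of allele counts (cases vs controls, minor vs major allele) with no continuity correction; it illustrates the idea only and is not the regression model behind --linear. The function name and argument names are illustrative.

```python
import math


def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-squared test (1 df) on a 2 x 2 table of allele counts."""
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    if total == 0 or 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs a non-zero total")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a 1-df chi-square: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Each individual contributes two alleles to the table, so the counts are allele tallies rather than numbers of people.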

How long does it take to carry out the GWAS workflow?

Including these steps: 1) raw data format transformation for five companies 2) update positions for all SNPs to hg37 version 3) Quality control within companies 4) Pre-phasing (SHAPEIT2) and imputation (IMPUTE2) for all SNPs of each company 5)…

Runs of homozygosity in Plink

❯ plink1.9 --homozyg --help
PLINK v1.90b6.22 64-bit (3 Nov 2020) www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
--help present, ignoring other flags.
--homozyg [{group | group-verbose}] ['consensus-match'] ['extend'] ['subtract-1-from-lengths']
--homozyg-snp <min var count>
--homozyg-kb <min length>
--homozyg-density <max inverse density (kb/var)>
--homozyg-gap <max internal gap…

How do I merge imputed GWAS data

I have a cohort in which some participants were genotyped using the Illumina 2.5M DNA microarray chip and others using the H3Africa chip. I have imputed them using the HRC server. To increase the power of my association analyses, I…

Sample ID ERROR in converting dosage file (.txt) to VCF in PLINK 2.0

Hi, I have a dosage file (.txt format; alternative allele count 0-2) that I would like to convert to a VCF through PLINK 2.0 but am running into sample ID issues. I am attempting the following…

Error: Could not open temporary file.

Hello, there is an error when I try to filter a VCF file for MAF. Here is a part of the VCF file: 2: chr1H 523 chr1H:523 C A . PASS . GT 0|0 1: chr1H 445 chr1H:445 C T . PASS . GT 0| 2: chr1H 523 chr1H:523 C…

Can we merge two VCF files – a RNAseq VCF and a Whole genome Sequencing (WGS) VCF to do PCA?

I’m quite new to this variant calling and analysis area. We have around 30 samples of RNAseq data of tumor and normal samples. I performed variant calling and obtained…

Wavelet Screening: a novel approach to analyzing GWAS data | BMC Bioinformatics

Haar wavelet transform. Our method transforms the raw genotype data similarly to the widely used ‘Gene- or Region-Based Aggregation Tests of Multiple Variants’ method [15] (Fig. 1). Like the Burden test, the effects of the genetic variants in a given region are summed up to construct a genetic score for…

How to convert multiple .vcf files into single .ped (PLINK compatible files)?

Hi everyone, I am a newbie to the whole bioinformatics world and I need to analyse WGS data from several case samples. I have now several individual .vcf files and would like to use PLINK for Quality…

The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests

Sequencing of Shorea leprosula genome. Sample collection: Leaf samples of S. leprosula were obtained from a reproductively mature (diameter at breast height, 50 cm) diploid tree B1_19 (DNA ID 214) grown in the Dipterocarp Arboretum, Forest Research Institute Malaysia (FRIM). DNA extraction: Genomic DNA was extracted from leaf samples using the…

Bioinformatics Biomedical Scientist – Bilsborough Lab

Bioinformatics Biomedical Scientist – Bilsborough Lab – Inflammatory Bowel Diseases Drug Discovery and Development. Requisition # HRC0697538. Join us in accelerating the pace of research and discovery within our unique IBD3 lab! Cedars-Sinai provides virtually every known gastroenterologic analytical procedure and treatment…

PLINK basic command line usage

Hey, I am new to PLINK. I am following a tutorial on how to calculate a polygenic risk score: choishingwan.github.io/PRS-Tutorial/target/#standard-gwas-qc I ran the “Standard GWAS QC” part and the code is as follows: plink --bfile EUR --maf 0.01 --hwe 1e-6 --geno…

Plink genome flag error

I’m using plink 1.9 and running the command: plink --bfile data --genome --extract data.prune.in --remove data.fail_imiss where data.fail_imiss is a file listing individuals whose proportion of missing SNPs is greater than the threshold .05. I get the warnings: Warning: 10745 het. haploid genotypes…

Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old)

NB – Update July 29, 2020 – this thread will no longer be watched and, for all intents and purposes, will now be archived NB – Version 2 of tutorial can be found here and should be used going forward –> Produce PCA bi-plot for 1000 Genomes Phase III –…

VCF file

Removal Sample
###junk removal
vcftools --gzvcf cohorts_combined_filtered_calls_annotated.vcf --maf 0.05 --max-missing 0.8 --min-meanDP 10 --recode --out SNP-co.vcf
####Sample removal
bcftools view --samples-file ^ /media/bioinformatician/My Passport/sample_id_no-RNASEq.txt SNP-co.vcf.
###chromosome extraction
vcftools --gzvcf SNP_co.vcf.gz --chr 2 --from-bp 47843908 --to-bp 47877312 --recode --out snps_filt_chr2
##conversion vcf to plink
./plink --vcf /media/bioinformatician/My Passport/annotated/tabix/snps_filt_chr2.vcf.gz.recode.vcf…

Where to find 1000 Genome phase 3 whole genome data and select only European population

Hello: I was trying to download whole genome data from 1000Genome phase 3 data and extract only the EUR population (GBR, TSI, FIN, IBS, CEU). I used the ftp site: ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz, but apparently it…

Gene-edited animals and crops to be approved in productivity boost | News

Gene editing of animals and crops will be approved under plans to use post-Brexit freedoms to improve productivity, make food more nutritious and reduce reliance on pesticides and antibiotics. Ministers will focus initially on relaxing the rules on gene-editing plants but they also plan to allow the technology to be…

genomic data scientist jobs

Provide strategic planning and perform analysis or simulations independently or in a . 401(k) savings plan match.…, Requires a Ph.D. in Biochemistry, Biotechnology, Molecular/Cell Biology, Plant Biology, or a related field and 0-3 years of relevant postdoctoral or industrial……, In addition, the analyst will help advance the groups collective expertise…

Scientists Reconstructed the Faces of 3 Egyptian Mummies Using Ancient DNA Extracted From Their Bodies

In 2017, Live Science reported the first-ever successful DNA sequencing from Egyptian mummies that lived more than 2,000 years ago in Abusir el-Meleq, a city in ancient Egypt located south of Cairo. In the study, titled “Ancient Egyptian Mummy Genomes Suggest an Increase of Sub-Saharan African Ancestry…

extract IDs from each population in PLINK .fam file and export to .txt separately

Hi, I’d like to write a loop to extract individuals from my PLINK .fam file based on the fam ID / population code into different .txt files just using bash. I’m pretty stumped so would appreciate…
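
The question asks for bash, but the grouping logic is easy to see in a short Python sketch. It assumes the population code is stored in the first .fam column (FID) and the individual ID in the second (IID); the function names and the one-file-per-FID output layout are illustrative choices, not anything prescribed by PLINK.

```python
from collections import defaultdict
from pathlib import Path


def ids_by_family(fam_lines):
    """Group individual IDs (column 2) by family/population ID (column 1)."""
    groups = defaultdict(list)
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        groups[fields[0]].append(fields[1])
    return dict(groups)


def write_keep_files(groups, out_dir):
    """Write one '<FID>.txt' file per group, each line holding 'FID IID'."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for fid, iids in groups.items():
        path = out / f"{fid}.txt"  # assumes FIDs are safe as file names
        path.write_text("".join(f"{fid} {iid}\n" for iid in iids))
        paths.append(path)
    return paths
```

Writing both FID and IID on each line keeps the output in the two-column form that PLINK accepts for sample lists such as --keep.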

Loading PLINK files to Haploview

I can’t seem to get Haploview to accept PLINK files. I downloaded the ‘sample.ped’ and ‘sample.info’ trial files from this haploview website. I am loading the .ped file for ‘Results File:’ and the .info file for ‘Map File:’. All other options are left at…

Haploview and plink

When I ran plink to get the .ped and .info files, it generated output for only 8 chromosomes out of 22, X, Y and M. Kindly help me get all chromosomes. When I upload the .ped and .info files in Haploview.jar, it shows loading 0%…

Please help me understand LD pruning algorithm

I’m trying to do LD pruning with PLINK but I can’t find any proper documents about the algorithm used. There are options as indep, indep pairwise and indep pairphase and suboptions as windowsize, stepsize and threshold. I’m not sure what each means…

use of 2 covariate files for association analysis in plink

Dear all, I want to use 2 covariate files in my association analysis in plink. Is this possible (without combining all the covariates in 1 file)? Is it valid to use: --bfile file --logistic --covar /covarfile1.txt --covar-number…

Isolate a Region in a Vcf File to make a Smaller Vcf File

I have been working on a project that has caused a bit of a headache for me. While I have made some progress, now the process is simply too slow to be reasonable. I have a…

Paths and timings of the peopling of Polynesia inferred from genomic networks

1. Low, S. Hawaiki Rising: Hōkūle‘a, Nainoa Thompson, and the Hawaiian Renaissance (Univ. of Hawaii Press, 2019). 2. Kirch, P. V. On the Road of the Winds (Univ. of California Press, 2017). 3. Mulrooney, M. A., Bickler, S. H., Allen, M. S. & Ladefoged, T. N. High-precision dating of colonization…

Plink --merge-list only outputting fam

I am attempting to merge Plink files towards an algorithm I am using (CookHLA). I have already made the bed/bim/fam files for my vcf files. Now, I want to merge them into a single file so I can progress with the algorithm. To do…

P-values far too high for quantitative regenie phenotype

Hi all, I’m having some trouble running regenie (v2.2.4) on a quantitative phenotype for a large cohort. I’m testing a standard height GWAS with heights rounded to the nearest integer. I’ve tried a few different tests to see where the issue…

Changing the sample IDs of a bed/bim/fam PLINK fileset

Hi everyone, I am working with a genotype set that is not identified with the sample IDs that I want. However, I do have a lookup table which I can use in R to get the right identification when I…

I can’t get a dosage file using PLINK

Hi, I have been trying to get a dosage file from vcf, map and fam files. For that, I have written this bash script: plink --fam plink.fam --map plink.map --dosage one.vcf --write-dosage However, I got this error: --dosage: Reading from one.vcf. Error: Line 1 of one.vcf has fewer tokens…

Aro Biotherapeutics hiring Investigator, Genetics & Bioinformatics in Philadelphia, Pennsylvania, United States

About Aro BioTx Join the team at Aro Biotherapeutics creating breakthrough biotherapeutics based on Centyrin oligonucleotide conjugates. Centyrins are small protein domains based on the fibronectin domains of human Tenascin C that combine the affinity and specificity properties of antibodies with the stability and tissue penetration properties of small molecules….

merge bfiles with different allele format by plink

Hello, I want to merge two bfiles using plink. However, one bfile is written as below.
1 rs123 0 123 A G
1 rs234 0 234 G T
The other file is as below.
1 rs123 0 123 1 2
1…

Principal/senior Bioinformatics Scientist (Hereditary Disease) – Portola Valley

We are looking for a highly motivated, senior level bioinformatics scientist with extensive experience and interest in translational genomic research, genetic analysis of complex traits, quantitative genetics, and/or algorithm/pipeline development. This position requires experience with scientific programming, relational data systems, algorithms development, and statistical modeling. Top candidates will also have…

Will be able to inoculate entire population by the end of the year: NTAGI Chief NK Arora

On Friday, India established a new milestone by administering more than 2 crore Covid-19 vaccinations in a single day. The country is the first in the world to reach such a large-scale vaccination target in one day. Dr NK Arora, Chief of National Immunization Technical Advisory Group (NTAGI), praised the…

Haplotype divergence supports long-term asexuality in the oribatid mite Oppiella nova

Significance Putatively ancient asexual species pose a challenge to theory because they appear to escape the predicted negative long-term consequences of asexuality. Although long-term asexuality is difficult to demonstrate, specific signatures of haplotype divergence, called the “Meselson effect,” are regarded as strong support for long-term asexuality. Here, we provide evidence…

Is a variant worse than Delta on the way? Viral evolution offers clues.

Somewhere in India last October, a person—likely immunocompromised, perhaps taking drugs for rheumatoid arthritis or with an advanced case of HIV/AIDS—developed COVID-19. Their case might have been mild, but because of their body’s inability to clear the coronavirus it lingered and multiplied. As the virus replicated and moved from one…

Question about ROH analysis by Plink 1.9

Hi all, I have recently tried to estimate runs of homozygosity (ROH) from my vcf file using plink 1.9. I ran the following code to generate the binary files that plink requires: plink --vcf myfile.vcf --make-bed --out out_name --no-sex --no-parents --no-fid --no-pheno --allow-extra-chr This vcf file only contains one individual and…

Phylogeographic reconstruction of the marbled crayfish origin

Procambarus fallax collections and PCR genotyping. Animals were collected from various wild populations (Table S1) in compliance with state and local regulations (Georgia department of natural resources scientific collection permit 115621108, state of Florida collection permits S-19-10 and S-20-04). DNA was isolated from abdominal muscle tissue using SDS-based extraction and precipitation…

Why may BOLT-LMM and SAIGE (quantitative, linear-mixed model) yield different results when run on absolutely the same dataset?

As a validation experiment, I have run the same GWAS of a quantitative phenotype derived from the UKBiobank, alongside the genomic data from the UKBiobank, once using the program BOLT-LMM and once using SAIGE linear mixed model (with selected quantitative trait tag). I wanted to see if the results would…

How to find Standard Error (SE) values when not provided in GWAS summary stats?

Hi everyone! I’m trying to do a fixed effect meta-analysis on a couple of GWASes based on p-values, Standard error and effect estimates (Beta) using METAL genetics software. For one of my GWAS studies SE…
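
One common way to back out a missing standard error is from the effect estimate and the p-value. The sketch below assumes the p-value is two-sided and came from a Wald-type test with a normal approximation, so that |z| = |beta| / SE; if the original study used a different test, the result is only approximate. The function name is illustrative.

```python
from statistics import NormalDist


def se_from_beta_p(beta, p_value):
    """Approximate SE from an effect estimate and a two-sided p-value."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must be strictly between 0 and 1")
    # |z| for a two-sided p-value under the standard normal.
    z = abs(NormalDist().inv_cdf(p_value / 2.0))
    return abs(beta) / z


print(se_from_beta_p(0.1, 0.05))
```

For odds ratios, beta would be the log odds ratio, since the normal approximation applies on the log scale.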

Phenotype file for eQTL analysis using GEMMA

Hello all, I would appreciate it if someone could direct me in this regard. I am running eQTL analysis using GEMMA software. I have corrected the expression file with all samples (280 samples) and the genotype file is (170). I have a couple…

PRS in UK Biobank – no covariate file and no phenotype file

Hi there, I am trying to undertake a PRS using UK Biobank plink data. I am trying to generate a PRS using PRSice-2. However, the issue I am having is that I do not have a covariate…

Remove related samples using plink

Hi, I generated pairwise IBD (PI_HAT) using the plink1.9 --genome option. I have >200,000 samples, so I used --parallel and combined the sub files using cat. Is there a way to remove related samples using the output file .genome.gz? I read about --rel-cutoff but…

British scientist to unveil new tech that may hold key to ageing | News

A British genetics pioneer who has won one of science’s most valuable prizes is poised to unveil a new technology that may transform our understanding of disease and ageing. Professor Sir Shankar Balasubramanian was awarded a $1 million Breakthrough Prize last week for helping to invent next generation sequencing (NGS),…

custom reference panel

Hi all, I am embarking on creating a custom reference panel by selecting specific samples. My question is how do I select the samples? I can see that the plink distance matrix would be a good option, but otherwise, can you advise what other options are…

High frequency of an otherwise rare phenotype in a small and isolated tiger population

Significance Small and isolated populations have low genetic variation due to founding bottlenecks and genetic drift. Few empirical studies demonstrate visible phenotypic change associated with drift using genetic data in endangered species. We used genomic analyses of a captive tiger pedigree to identify the genetic basis for a rare trait,…

Phasing with SHAPEIT

Edit June 7, 2020: The code below is for pre-phasing with SHAPEIT2. For phased imputation using the output of SHAPEIT2 and ultimate production of phased VCFs, see my answer here: A: ERROR: You must specify a valid interval for imputation using the -int argument, So, the steps are usually: pre-phasing…

Produce PCA bi-plot for 1000 Genomes Phase III

Note1 – Previous version: Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old) Note2 – this data is for hg19 / GRCh37 Note3 – GRCh38 data is available HERE The tutorial has been updated based on the 1000 Genomes Phase III imputed genotypes. The original tutorial was…

1001 Arabidopsis SNP

Hi everyone, I am learning to do some GWAS analysis in Arabidopsis. I used some accessions from the 1135 list (1001 Genomes project) for a GWAS experiment. I have some questions about the genotype data. I find there are several different genome data sets, including vcf format and…

GWAS with phenotype adjusted for covariate (age, sex, PC, batch, centre) using PLINK or external linear model and take residual?

Hi everyone, I am conducting GWAS analysis using PLINK, but I wonder whether I should use PLINK to adjust the phenotype for fixed effects (age, sex, PC, batch, centre) or…

Continue Reading GWAS with phenotype adjusted for covariate (age, sex, PC, batch, centre) using PLINK or external linear model and take residual?

The result of plink --freq is filled with NA

The result of plink --freq is filled with NA 0 I downloaded the vcf file. Then I used plink to convert it to a bed file and calculated the allele frequency. However, the result of plink --freq was filled with NA. Can anyone give us an opinion? command ① ./plink --vcf…

Continue Reading The result of plink --freq is filled with NA

Quick Way To Combine Two Datasets Using Only Common Markers

Quick Way To Combine Two Datasets Using Only Common Markers 6 Is there a quick way to combine two datasets so that only the common markers are kept? Currently, if I have two datasets, I have to first get the intersection of the two BIM/MAP files, then extract those markers…

Continue Reading Quick Way To Combine Two Datasets Using Only Common Markers

How does PLINK work?

How does PLINK work? 0 Hi, I would like to know how a logistic regression in a GWAS works in detail. I have an example dataset, with SampleIDs, Genotypes (GT) in dosage format and binary phenotypes. Now, how would the logistic regression be performed if I only want to work…

Continue Reading How does PLINK work?

Cutoffs for Proxy Identification: D’ and R^2

Cutoffs for Proxy Identification: D’ and R^2 0 I’m using a proxy-finding script with plink that reports potential proxies with their D’ and R^2: proxy D’ R^2 9:12345698:rs12345 1.0 0.000758121 9:12345999:rs87654 1.0 0.039958 9:12346999:rs54321 1.0 0.0399958 cf. www.cog-genomics.org/plink/2.0/ld However, what cutoffs should be used? From what I’ve read, D’ is…

Continue Reading Cutoffs for Proxy Identification: D’ and R^2

Bioinformatics Scientist in Frederick, MD

Job Description: Bioinformatics Scientist. Full Time, Direct Hire, Remote position. Are you looking for bioinformatics work? Are you interested in joining a team of talented bioinformaticians dedicated to understanding the genetics of cancer? In this role you will: * Function as a scientific thought leader within for all aspects of GWAS and population genetics….

Continue Reading Bioinformatics Scientist in Frederick, MD

Polygenic Risk Score Plot

Polygenic Risk Score Plot 0 I have a dataset of about 6000 people, Chromosome 21. I calculated PRS using plink and PRSice. The plot is shown above. I have two questions regarding this. R2 on the y-axis explains the phenotype variation, but I want another measure like AUC (area under…

Continue Reading Polygenic Risk Score Plot

Missense Variant on hg19

Missense Variant on hg19 1 Hello everybody, I am using plink to do some statistical analyses on a SNP set. I would like to use only missense variants, and I have the IDs of my SNPs of interest. Can someone suggest how I can download a database of homo…

Continue Reading Missense Variant on hg19

Oxford Nanopore readies itself for float in London | Business

One of Britain’s most promising biotechnology companies is preparing to launch an initial public offering in the coming weeks. Oxford Nanopore, a gene sequencing company, is working with banks on a float on the London Stock Exchange and is targeting a valuation above the £2.5 billion it was valued at…

Continue Reading Oxford Nanopore readies itself for float in London | Business

Update SNP map

Update SNP map 1 I have a 43k UMD map, and I would like to update it to ARS (new reference genome). How can I do it? Is there any software, R package, or PLINK command for this?

Continue Reading Update SNP map

Data Storage in Plink Format

Data Storage in Plink Format 1 Hi all, Our genotype data is currently stored in plink format, however I’ve found the required fam file columns of family ID, parental IDs, etc aren’t really suitable for our datasets; family information isn’t very informative for our analysis or often isn’t available. Can…

Continue Reading Data Storage in Plink Format

PLINK Haplotype blocks estimation not working

Hi, I am using PLINK to estimate haplotype blocks using Gabriel’s method. I am using the following command plink --file Chr$PBS_ARRAY_INDEX --noweb --all --blocks --ld-window-kb 500 And it seemed to be working just fine but when job finished no blocks were called at all. The log file does not mention…

Continue Reading PLINK Haplotype blocks estimation not working

normality assumption for GWAS QT

normality assumption for GWAS QT 0 Hi, I am planning to perform GWAS on a quantitative trait using linear regression in plink. From my basic knowledge of statistics, I know that the assumption of normality (residuals) should be only approximately true. But I have seen so many papers on GWAS transforming the data…

Continue Reading normality assumption for GWAS QT

Association test to get p values and OR in plink2, and file input format

Association test to get p values and OR in plink2, and file input format 0 Are there any commands for association testing in plink2 which will output p-value and OR in the resulting output file? If so, what kind of file input do I need to use for such commands…a…

Continue Reading Association test to get p values and OR in plink2, and file input format

Performing population stratification based on GWA tutorial

Performing population stratification based on GWA tutorial 0 Hi, I’m performing QC steps of Andries T. Marees GWA tutorial, currently I’m stuck at 7th step where you should begin the population stratification downloading a 61GB vcf.gz file of 1000genomes containing genetic data of 629 individuals from different ethnic backgrounds. Successively…

Continue Reading Performing population stratification based on GWA tutorial

GWASpower calculation

GWASpower calculation 0 Dear All, Sorry, this sounds like a stupid question, but I have been struggling with this for a while. I am trying to do a post-hoc power calculation for my GWAS using GWASpower software for a quantitative trait (I know many don’t prefer post-hoc power calculation, but a reviewer asked me despite…

Continue Reading GWASpower calculation

Genome of a middle Holocene hunter-gatherer from Wallacea

1. McColl, H. et al. The prehistoric peopling of Southeast Asia. Science 361, 88–92 (2018). 2. Hasanuddin. Gua Panningnge di Mallawa, Maros: kajian tentang gua hunian berdasarkan artefak batu dan sisa fauna. Naditira Widya 11, 81–96 (2017). 3. Bulbeck, D., Pasqua,…

Continue Reading Genome of a middle Holocene hunter-gatherer from Wallacea

Association test to get p values and OR in plink2, and file input format?

Association test to get p values and OR in plink2, and file input format? 0 Are there any commands for association testing in plink2 which will output p-value and OR in the resulting output file? If so, what kind of file input do I need to use for such commands?…

Continue Reading Association test to get p values and OR in plink2, and file input format?

How I do lift over a Plink bim file from Hg18 to Hg19.

How I do lift over a Plink bim file from Hg18 to Hg19. 2 I’ve got some very old SNP data from Data Dryad. The BIM files use coordinates from Hg18, but my dataset uses coordinates from Hg19. I was wondering if anyone knows how to liftover coordinates in a…

Continue Reading How I do lift over a Plink bim file from Hg18 to Hg19.

Filter duplicate ID from PLINK file

Filter duplicate ID from PLINK file 0 Hi All, I am new to SNP-chip data analysis. I have been exploring some SNP-chip data using plink 1.9, while checking the Relatedness using KING, I am getting an error of duplicate ID. I was wondering if there is any method I can…

Continue Reading Filter duplicate ID from PLINK file

SNP Pruning Through PCA (Edit: Feature Selection Through PCA)

SNP Pruning Through PCA (Edit: Feature Selection Through PCA) 1 Hello, I have roughly 1 million SNPs from 700 individuals and I wanted to prune the SNPs down, potentially through PLINK’s --pca command. However, I’m a little perplexed with how the eigenvals/vectors I receive from the --pca command are to…

Continue Reading SNP Pruning Through PCA (Edit: Feature Selection Through PCA)