Tag: plink

Different relatedness estimates by PLINK and VCFTOOLS despite same method

According to the vcftools manual, specifying the “--relatedness2” flag calculates relatedness statistics using the method of Manichaikul et al., Bioinformatics 2010 (doi:10.1093/bioinformatics/btq559), that is, the KING method. According to the PLINK manual, PLINK uses the same method to calculate relatedness when the “--make-king-table” flag is specified. So, although both PLINK…

calculated an LD matrix for a locus using plink2

Hi Everyone, I have a genotype file in .pgen format that I subset and would like to calculate an LD matrix for. Previously I used the .bed file format with plink 1.9, which worked like a charm. But unfortunately my file is in pgen format, which is not supported in plink…

PCA from plink2 for SGDP using a pangenome and DeepVariant

Hi there, I’m doing my first experiments with PCA and UMAP as dimensionality reductions to visualize a dataset I’ve been working on. Basically, I used the samples from the SGDP which I then mapped on the human pangenome for, finally, calling small variants with DeepVariant. I moved on with some…

Imputing missing genotypes in --score

Does plink 2 impute missing genotypes with this pipe? plink2 --threads 1 \ --read-freq freq.afreq \ --vcf tube.vcf \ --score score_file.anno.plink2.tsv ignore-dup-ids \ …

Genomic insights into Plasmodium vivax population structure and diversity in central Africa | Malaria Journal

Hamblin MT, Di Rienzo A. Detection of the signature of natural selection in humans: evidence from the Duffy blood group locus. Am J Hum Genet. 2000;66:1669–79. Hamblin MT, Thompson EE, Di Rienzo A. Complex signatures of natural selection at the Duffy blood group…

Characterization of runs of Homozygosity revealed genomic inbreeding and patterns of selection in indigenous sahiwal cattle

Almeida OAC, Moreira GCM, Rezende FM et al (2019) Identification of selection signatures involved in performance traits in a paternal broiler line. BMC Genomics 20:1–20. doi.org/10.1186/s12864-019-5811-1 Alshawi A, Essa A, Al-Bayatti S, Hanotte O (2019) Genome Analysis Reveals Genetic Admixture and Signature of Selection for Productivity and…

How to convert and annotate apt-probeset-genotype into PLINK format

Dear all, I called SNP genotypes of 100 Affy6 CEL files using apt-probeset-genotype from APT in order to perform a subsequent CNV analysis with PennCNV. As PennCNV doesn’t integrate SNP quality control procedure (move out SNP with genotype call <…

TWAS revealed significant causal loci for milk production and its composition in Murrah buffaloes

Cao, C. et al. Power analysis of transcriptome-wide association study: Implications for practical protocol choice. PLoS Genet. 17(2), e1009405 (2021). De Camargo, G. M. F. et al. Prospecting major genes in dairy buffaloes. BMC Genomics 16, 1–14 (2015). El-Halawany, N….

Diversity and dissemination of viruses in pathogenic protozoa

Wang, A. L. & Wang, C. C. Viruses of the protozoa. Annu. Rev. Microbiol. 45, 251–263 (1991). Banik, G., Stark, D., Rashid, H. & Ellis, J. Recent advances in molecular biology of parasitic viruses. Infect. Disord. – Drug Targets 14, 155–167 (2015). …

Archaic Introgression Shaped Human Circadian Traits | Genome Biology and Evolution

Abstract When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely, Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultraviolet radiation and increased variation in…

How to compute Hudson’s/Bhatia’s FST in R OR with vcf?

Hi everyone, How can I compute hierarchical Fst with Bhatia’s/Hudson’s estimator using a vcf as input? My data is structured like this: there are individuals within sampling sites, and sampling sites within groups. My vcfs contain SNP data (~1000…

Indigenous Australian genomes show deep structure and rich novel variation

Inclusion and ethics The DNA samples analysed in this project form part of a collection of biospecimens, including historically collected samples, maintained under Indigenous governance by the NCIG11 at the John Curtin School of Medical Research at the Australian National University (ANU). NCIG, a statutory body within ANU, was founded…

genetics – Trouble with Phenotype File in PLINK GWAS – 0 Individuals with Non-Missing Phenotypes

Problem Description: I am facing an issue while running a Genome-Wide Association Study (GWAS) using PLINK. Despite specifying the phenotype file and confirming the presence of the phenotype column (‘ChildPhenotype’), I consistently receive the error message: “0 individuals have non-missing phenotypes.” I have ensured that the values in the specified…

max-maf not filtering properly

Hi Chris, I have a vcf file for which I have left-aligned and split multi-allelic sites. I then ran plink2 --vcf test.vcf --make-bed --out test1, which gives me the binary files. Then I updated FID and sex (all males, all founders). In plink2: plink2 --bfile test1 --max-maf 0.01 --geno 0.05 --make-bed…

PLINK can’t find my files?

Hi, When I run PLINK, I always have this error message: No file (XXX.ped) exists. However, the file exists (of course) and is located at the same place as PLINK on my computer. The software has to be in to work so I…

Uncovering myocardial infarction genetic signatures using GWAS exploration in Saudi and European cohorts

Benjamin, E. J. et al. Heart disease and stroke statistics-2019 update: A report from the American Heart Association. Circulation 139, e56–e528 (2019). Yusuf, S. et al. Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): Case-control study. Lancet 364,…

Infer ancestry for RNA-seq data

I generated VCF files with bcftools for 4 patient RNA-seq samples. I was also able to generate bed, bim, and fam files with PLINK for these files. I want some guidance on how to infer ancestry for these RNA-seq samples: How do I find…

Failed to open /ROH/.log. Try changing the --out parameter.

When I used this code in R: system("plink --vcf Pakistan.total.vcf --homozyg --homozyg-window-snp 50 --homozyg-snp 50 --homozyg-window-missing 3 --homozyg-kb 100 --homozyg-density 1000 --allow-extra-chr --out /ROH/plink/n") I got this error: Error: Failed to open /ROH/plink/n.log. Try changing the --out parameter. How…

Converting txt.gz to PLINK bim

Hello, I’m trying to do a stratified LDSC (or S-LDSC/partitioned LDSC) between a locus of interest and diseases (diabetes, arthritis, etc.). For the locus of interest, I have a bed file from previous research. For diseases, I have downloaded GWAS sumstats from the GWAS atlas. I…

How to perform quality control for sex when there are no variants after thresholding for MAF

How to perform quality control for sex when there are no variants after thresholding for MAF? I am trying with PLINK. Would it be accurate to merge with 1000 genomes European allele frequencies…

Genomics England hiring PhD Bioinformatics Intern in London, England, United Kingdom

Company Description: Genomics England partners with the NHS to provide whole genome sequencing diagnostics. We also equip researchers to find the causes of disease and develop new treatments – with patients and participants at the heart of it all. Our mission is to continue refining, scaling, and evolving our ability to…

Noisy manhattan plot

Hi! I’m running GWAS on plink 2.00a4LM. My case cohort has roughly 600 individuals and my control cohort has ~4000 individuals. After running the GWAS, I plot the results using R. After some data exploration I decided to exclude some samples in order to avoid having samples with close family relationships…

PhD Bioinformatics Intern Job in Greater London, Pharmaceuticals & Life Sciences Career, Intern/Graduate Jobs in Genomics England

Company Description Genomics England partners with the NHS to provide whole genome sequencing diagnostics. We also equip researchers to find the causes of disease and develop new treatments – with patients and participants at the heart of it all. Our mission is to continue refining, scaling, and evolving our…

Issue with genetic QC sex check

Hi, I am doing a sex check on genetic data for a cohort I am working on, consisting of about 830 people. Most people seem to have incorrect sex assignment (around 560 problems). I have used plink QC and there were no people…

threads in plink GWAS

Hello, I am using plink with a dosage file to do GWAS as below. When I set threads=10 and check the actual CPU usage, however, plink only uses 1 CPU instead of 10. Does the threads option make no difference in plink? module load plink/1.90b4.1 plink --dosage ${geno_dir}/${dosage_filename} noheader skip1=2 skip2=2…

VCF conversion into Treemix

I have a multi-sample vcf file with ~7 million SNPs. Now I want to convert it into the format required by Treemix. I ran it using vcf2treemix.sh along with plink2treemix.py, but plink2treemix.py works very, very slowly. So that if I use it, the analysis in…

Global genetic diversity, introgression, and evolutionary adaptation of indicine cattle revealed by whole genome sequencing

Loftus, R. T., MacHugh, D. E., Bradley, D. G., Sharp, P. M. & Cunningham, P. Evidence for two independent domestications of cattle. Proc. Natl Acad. Sci. USA 91, 2757–2761 (1994). Verdugo Marta, P. et al. Ancient cattle genomics, origins, and rapid turnover…

Postdoctoral Researcher in Alzheimer’s disease Genetics, Multi-Omics, and Imaging Biomarkers, St Louis, MO, USA

Location: Department of Neurology, NeuroGenomics and Informatics Center, Washington University in St. Louis. Description: The Washington University School of Medicine, Department of Neurology, has an opening for a postdoctoral research associate to join the Belloy lab in the NeuroGenomics and Informatics Center (NGI). The successful candidate will be involved…

Negative F statistics for sex check in plink

Hi, I have a sample of 800 people. I did the sex check in plink 1.90 and have problems for 580 of these people, and a lot have a negative F statistic, which is not what I expected. The PAR…

--glm no-firth: Segmentation fault

I am running GWAS with binary phenotypes. PLINK v2.00a6LM AVX2 AMD (21 Nov 2023) First option: --glm hide-covar single-prec-cc cc-residualize => Error: Cannot proceed with --glm regression on phenotype ‘pheno1’, since covariate-only Firth regression failed to converge. Second option: --glm hide-covar single-prec-cc cc-residualize no-firth => Segmentation fault Third option: --glm hide-covar…

Pruning with --indep-pairwise with plink 1.9

I’m new to PLINK and I would like to obtain a file with SNPs in approximate linkage equilibrium. Here is my script and the outputs of each step. If someone could tell me if there is an error in the script because at…

normalize not left-normalizing?

I’m running plink2 to convert a vcf to a pgen with pseudobiallelic variants. Calling --normalize does not seem to left-normalize as I would expect, at least when I look at the .pvar. Log: PLINK v2.00a6LM AVX2 Intel (21 Nov 2023) www.cog-genomics.org/plink/2.0/ (C) 2005-2023 Shaun Purcell, Christopher Chang …

Merging several vcf files for GWAS?

Hello! I am a Medical Student without much background in Bioinformatics trying to perform analysis for my first GWAS study, tremendously overwhelmed. It’s a Case Control Association Study with samples from 50 subjects, that we sampled using Novogene NGS platform. The problem is,…

Inconsistent glm output across repeated runs

Hello, I’ve repeatedly run the exact same plink2 command several times and noticed my glm output was not always the same, which seems to be some sort of nondeterministic bug. ~8-9 out of 10 times, the output is the same, but every now and then the p-value is several orders…

Multi-ancestry genome-wide association study of cannabis use disorder yields insight into disease biology and public health implications

Inclusion and ethics statement We included researchers from the iPSYCH biobank and the PGC, who played a role in study design. This research was not restricted or prohibited in the setting of any of the included researchers. All studies were approved by local institutional research boards and ethics review committees….

Mutation of key signaling regulators of cerebrovascular development in vein of Galen malformations

Adams, R. H. & Eichmann, A. Axon guidance molecules in vascular patterning. Cold Spring Harb. Perspect. Biol. 2, a001875 (2010). Fish, J. E. & Wythe, J. D. The molecular regulation of arteriovenous specification and maintenance. Dev. Dyn. 244, 391–409 (2015). …

Clumping with r2=0 and 250kb radius in plink

Hi, I am doing clumping with the following command: plink \ --bfile ${myfilename} \ --keep all_hg38_EUR.ids \ --clump ${trait}_tmp2.txt \ --clump-snp-field SNP \ --clump-field P \ --allow-extra-chr \ --memory 30000 \ --clump-p1 5e-8 \ --clump-r2 0 \ --clump-kb 250 \ --out…

Inquiry Regarding NA P-values in Logistic Regression

Thank you for your assistance. I apologize for the inconvenience, but I still have a question to ask you. In the output file after conducting logistic regression, I do not see an “ERROR” column. Does this indicate that my data has all passed the multicollinearity check? Additionally, regardless of how…

Quality control on imputed genotypes for GWAS / application of PGS

Hi everyone, I want to run a GWAS on imputed genotypes from UKB. Unfortunately, I only found tutorials that describe the quality control of genotypes in preparation for a GWAS. Are there tutorials for imputed datasets? I suppose…

Comparative genomics and genome-wide SNPs of endangered Eld’s deer provide breeder selection for inbreeding avoidance

De novo genome assemblies and genome annotation We assembled a de novo genome of a seven-year-old male SED from Ubon Ratchathani Zoo using a combination of Illumina short-reads (92.94 × coverage) and PacBio long-reads (61.6 × coverage) (GenBank accession number: JACCHN000000000). Additionally, we used MGI short-reads (52.15 × coverage) to assemble a de novo genome of…

Alternatives to snpflip to find ambiguous and flipped SNPs

Hello everyone, I am having an issue with strand flips when trying to perform imputation. In the past, an old HPC I used supported snpflip, a tool which would recognize ambiguous SNPs as well as SNPs that have been…

Plink2 PCA approx memory allocation

This is great, thank you! Will this information be included in the PLINK2 documentation? The successful run we had included the log below. In the “Projecting random vectors” line, 21 steps are described, rather than the number 20 of requested principal components. I assume this is part of how the…

University of Alabama at Birmingham hiring BIOINFORMATICIAN I in Birmingham, Alabama, United States

Position Summary: The primary role is to execute a variety of data management and analysis tasks, ensuring the quality, reproducibility, and efficiency of processes related to high-dimensional data. You will collaborate with study investigators and fellow bioinformatics professionals within the department to contribute to high-quality, reproducible research across various scientific…

BAD_ES in plink1.9’s meta-analysis

Hi Chris, I have a question regarding plink1.9’s meta-analysis. I’m using plink1.9 for meta-analysis with plink2’s glm’s output. I have several problematic lines in the “meta.analysis.prob” output. They are caused by monomorphic sites in some sub-studies but not for all. In the plink2’s glm’s output, those sites are marked with an error of “CONST_OMITTED_ALLELE”….

Genome-wide meta-analysis, functional genomics and integrative analyses implicate new risk genes and therapeutic targets for anxiety disorders

Kessler, R. C. et al. Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the National Comorbidity Survey Replication. Arch. Gen. Psychiatry 62, 593–602 (2005). Kessler, R. C. et al. Prevalence, persistence, and sociodemographic correlates of DSM-IV disorders in the National Comorbidity Survey Replication Adolescent Supplement….

Divergent mechanisms of reduced growth performance in Betula ermanii saplings from high-altitude and low-latitude range edges

Aizawa M, Yoshimaru H, Saito H, Katsuki T, Kawahara T, Kitamura K et al. (2009) Range-wide genetic structure in a north-east Asian spruce (Picea jezoensis) determined using nuclear microsatellite markers. J Biogeogr 36(5):996–1007 Alexander DH, Novembre J, Lange K (2009) Fast model-based estimation of ancestry in unrelated…

PLINK 1.9 meta-analysis

I’m trying to meta-analyze together some PLINK2 (“--glm omit-ref hide-covar cols=+a1freq,+beta”) outputs. I think I’m having trouble understanding the syntax requirement for PLINK 1.9’s --meta-analysis feature. I interpreted the doc as indicating that adding ‘logscale’ after the filenames would cause it to look for ‘BETA’ in the input. PLINK…

Landscape genomics reveals adaptive genetic differentiation driven by multiple environmental variables in naked barley on the Qinghai-Tibetan Plateau

Abebe TD, Naz AA, Léon J (2015) Landscape genomics reveal signatures of local adaptation in barley (Hordeum vulgare L.). Front Plant Sci 6:813 Alexander DH, Novembre J, Lange K (2009) Fast model-based estimation of ancestry in unrelated individuals. Genome Res 19:1655–1664 …

How to create AffymetrixSuite file for using it in apt-format-result tool?

Hi, I am in need of creating all_genotypes_by_snps.CHP.bin and all_genotypes_by_snps.CHP.index.txt files for snpchip samples using related APT tools. I will use those files in creating genotype calling files as below, and eventually will create plink files from this…

A question about the missing or not observed alleles in PLINK datasets

Hello everyone, I would like to ask a question about the SNP array: I have eight PLINK datasets and I noticed I have quite a large number of variants with missing or no observed allele. What could…

Looking to compute R-squared with P-value for LocusZoom plot

Looking to compute R-squared values for a list of SNPs associated with specific phenotypes. Interested in having both p-values and R-squared scores for each SNP. Any advice on how to do this efficiently? After R-squared and the p-value, I want…

plink1.9 chr23 extraction error

Just a quick update – I used PLINK2.0 which allowed me to pass through the public1.bim file but now there is a problem with the public.fam file. (C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to public1_filtered.log. Options in effect: --bfile public1 --extract snp_ids_only.txt --make-bed --out…

‘PC1’ entry on line 8297 of is categorical

> str(pheno_cov$PC1) num [1:10691] -0.0016 -0.001615 -0.001843 -0.001882 -0.000693 … PLINK v2.00a3.6LM AVX2 Intel (14 Aug 2022) www.cog-genomics.org/plink/2.0/ (C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to ./res/test_plink2.log. Options in effect: --bfile updateID_geno0.02_maf0.005 --ci 0.95 --covar ./pheno_cov/pheno_cov.txt --covar-name Sex_Code,PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10 --covar-variance-standardize --glm hide-covar omit-ref --memory 200000 --out…

Locally annotating SNP IDs and Gene names of called variants

I have GWAS results after variant calling. The VCF file only had CHR (1:22) and POS (12345678 etc) information but the ID column has all “.”, namely no rsIDs in it. After GWAS analysis I have a list of…

Plink2 --extract not working

Hi Chris, This problem seems so silly but I just got stuck here for a long time. I tried to extract a set of SNPs (I’m pretty sure they all appear in the .bim file) and rename their IDs with --set-all-var-ids, and…

Plink Error

Hi, I am trying to convert the ped file for hapmap3 that I downloaded here ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2009-01_phaseIII/plink_format/ and unzipped with bunzip2, but I am getting the following error when running this command: PLINK v1.90b5.2 64-bit (9 Jan 2018) www.cog-genomics.org/plink/1.9/ (C) 2005-2018 Shaun Purcell, Christopher Chang GNU General Public License v3…

No valid entries in --score file

Hi, I ran this command on plink1.9 to calculate the polygenic score. plink --vcf sample --score output.txt 1 2 3 --out poligenic_results – output.txt: ID ALT UKB-b-15541 rs10399793 C 0.000345793 rs2462492 T -0.00027716 – sample.vcf: #CHROM POS ID REF ALT QUAL…

Cannot proceed with --glm regression on phenotype ‘platelet_count’, since covariate correlation matrix could not be inverted (VIF_INFINITE)

Hi, I am trying to run GWAS on the Platelet Count phenotype. I am using sex, age, assessment center, genotype measurement, and 40 PCs as a covariance matrix. I removed NA from the phenotype file, and I fitted it to the covariance matrix so it contains the same IDs. I calculated the…

No --bgen REF/ALT mode specified

runPLINK <- function(PLINKoptions = "") system(paste("/opt/apps/plink/2.0/bin/plink2", PLINKoptions)) runPLINK() runPLINK("--bgen /DATA/shared/bcac/genotypes/v10/bgen/icogs_euro/iCOGS_european_chr21.bgen --out /DATA/users/m.shokouhi/plink/plink2") The mentioned code makes PGEN/PVAR/PSAM files but there is a warning in the procedure: No --bgen REF/ALT mode specified. On the plink2 website it is written that it considers the first allele as a reference allele if you do not specify…

A question about genotyping rate

Hello everyone, I have four PLINK samples. I harmonized the samples using Genotype Harmonizer in presence of a reference panel. The genotyping rate, for each PLINK sample is around 0.98-0.99. When I merge the four PLINK sets, the genotyping rate drops to 0.76 in…

Calculating height prediction from PGS

You don’t really get a “predicted” height per se; rather, you get a score that correlates with height. With the provided weights, you can directly use plink --score to generate the polygenic score, but most likely they won’t look like height (e.g. mean…

No samples in .vcf file.

I am trying to convert my vcf file into a BED format file. When I use this command: plink --vcf merge.bacteria.vcf.gz --make-bed --out merge.bacteria.vcf.bed I get the following error: PLINK v1.90b6.21 64-bit (19 Oct 2020) www.cog-genomics.org/plink/1.9/ (C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License…

Plink2 error

I downloaded data from the 1000 genomes website. MacBook-Air-4:plink_mac mac$ ./plink --vcf ALL.chr1.shapeit2_integrated_snvindels_v2a_27022019.GRCh38.phased.vcf --make-bed --out char1 PLINK v1.90p 64-bit (13 Feb 2023) www.cog-genomics.org/plink/1.9/ (C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to char1.log. Options in effect: --make-bed --out char1…

allow-no-covars not recognized when using plink2 glm without covariates

Hi! I would like to run association tests without covariates, to show the effect of population stratification correction. When I first tried doing this, I was instructed to use the ‘allow-no-covars’ modifier. However, when I run my command including this modifier, I get the error Error: Unrecognized flag (‘--allow-no-covars’). The…

Normalisation of PLINK/VCF files?

Variant notations can vary significantly, and although there are numerous tools available to address this issue, such as bcftools +fixref or bcftools norm, there’s still a chance that something might be overlooked. Is there a comprehensive tool or pipeline that automates this process to ensure…

Association Analysis with Plink error

1. this is my phenotype file (called outputfile.txt in command line use): FID IID Cadmium_Chloride Caffeine Calcium_Chloride Cisplatin Cobalt_Chloride Congo_red Copper Cycloheximide Diamide E6_Berbamine Ethanol Formamide Galactose Hydrogen_Peroxide Hydroquinone Hydroxyurea Indoleacetic_Acid Lactate Lactose Lithium_Chloride Magnesium_Chloride Magnesium_Sulfate Maltose Mannose Menadione Neomycin Paraquat Raffinose SDS Sorbitol…

How do I remove duplicate SNPs in PLINK from more than 1 data set?

Hi there, I am trying to remove duplicate SNPs from my data, but I have data from 6 different panels and I am not sure how to do them all in plink at once. SNPs…

Genome-wide association study of traumatic brain injury in U.S. military veterans enrolled in the VA million veteran program

Helmick KM, Spells CA, Malik SZ, Davies CA, Marion DW, Hinds SR. Traumatic brain injury in the US military: Epidemiology and key clinical and research programs. Brain Imaging Behav. 2015;9:358–66. DoD Numbers for Traumatic Brain Injury Worldwide – Totals (Defense Health Agency) (2021). Karr JE, Areshenkoff…

Continue Reading Genome-wide association study of traumatic brain injury in U.S. military veterans enrolled in the VA million veteran program

PLINK | Updating sex information issue

Hello, I am attempting to update the sex information of my cohort vcf data by using PLINK. This is the command I am running: plink --bed input_bed --bim input_bim --fam input_fam --update-sex input_sex.txt --make-bed —out output_name > stdout.out For some reason, I am…

Continue Reading PLINK | Updating sex information issue

Genomic Data Analyst job in Pojoaque, NM at Private Bioscience @ Get.It

Summary Description: We are looking for a talented bioinformatician or computational biologist who specializes in utilizing polygenic scores and machine learning techniques to analyze genomic data, particularly for predicting complex disease and trait phenotypes. The ideal candidate will possess a solid understanding of genetics and genomics, along with expertise in…

Continue Reading Genomic Data Analyst job in Pojoaque, NM at Private Bioscience @ Get.It

What do HAP A1 and HAP A2 mean in plink –freqx output?

Let’s assume: SNP1 = A/C, SNP2 = T/G, SNP3 = A/G, SNP4 = C/T. How do I know what C(HAP A1) and C(HAP A2) mean? Is HAP A1 = ATAC and HAP A2 = CGGT? This wouldn’t make…

Continue Reading What do HAP A1 and HAP A2 mean in plink –freqx output?

Genome-wide association study in 404,302 individuals identifies 7 significant loci for reaction time variability

MacDonald SW, Li SC, Bäckman L. Neural underpinnings of within-person variability in cognitive functioning. Psychol Aging. 2009;24:792–808. Article  PubMed  Google Scholar  Haynes BI, Bunce D, Kochan NA, Wen W, Brodaty H, Sachdev PS. Associations between reaction time measures and white matter hyperintensities in very old age. Neuropsychologia. 2017;96:249–55. Article  PubMed …

Continue Reading Genome-wide association study in 404,302 individuals identifies 7 significant loci for reaction time variability

PLINK .ped file issue

Hi everyone, relative newbie here (non-bioinformatics background; got to know EWAS and TWAS before, but have no experience with plink). I am trying to run a GWAS using a .ped and a .map file (got nothing else apart from the raw .idat files). I am trying to use plink for…

Continue Reading PLINK .ped file issue

About plink2 error

Dear all, sorry for the error message. I put plink and plink2 in the same path (my Desktop), but I can only run plink, not plink2. I get this message: “plink2” can’t be opened because Apple cannot check it for malicious software. Originally I was trying to run plink2 --zst-decompress all_phase3.pvar.zst > all_phase3.pvar So…

Continue Reading About plink2 error

Transcriptional regulation and overexpression of GST cluster enhances pesticide resistance in the cotton bollworm, Helicoverpa armigera (Lepidoptera: Noctuidae)

Bras, A., Roy, A., Heckel, D. G., Anderson, P. & Green, K. K. Pesticide resistance in arthropods: ecology matters too. Ecol. Lett. 25, 1746–1759 (2022). Article  PubMed  PubMed Central  Google Scholar  Chen, Y. H. & Schoville, S. D. Editorial overview: ecology: ecological adaptation in agroecosystems: novel opportunities to integrate evolutionary…

Continue Reading Transcriptional regulation and overexpression of GST cluster enhances pesticide resistance in the cotton bollworm, Helicoverpa armigera (Lepidoptera: Noctuidae)

How to merge my vcf files (n=6) with existing Pf6 vcf file and do pca?

I sampled some Pf strains and had WGS done on them. Now I want to merge them with the existing Pf6 data. For this I downloaded Pf6 data for all 14 chromosomes. I then used bcftools…

Continue Reading How to merge my vcf files (n=6) with existing Pf6 vcf file and do pca?

Troubleshooting multiallelic variant merging issue

Hello, I want to recode the IIDs of imputed data .bgen files into two different filesets, and merge these (working on eye-level analyses with Regenie). As I’m only interested in dosages, I’ve converted these to .pgen using PLINK2 (ref-first as UK Biobank): plink2 --bgen data.bgen ref-first --sample data.sample --update-ids recoded_ids_a.txt --make-pgen…

Continue Reading Troubleshooting multiallelic variant merging issue

Range-wide and temporal genomic analyses reveal the consequences of near-extinction in Swedish moose

Ceballos, G., Ehrlich, P. R. & Raven, P. H. Vertebrates on the brink as indicators of biological annihilation and the sixth mass extinction. Proc. Natl Acad. Sci. USA 117, 13596–13602 (2020). Article  CAS  PubMed  PubMed Central  Google Scholar  Ceballos, G., Ehrlich, P. R. & Dirzo, R. Biological annihilation via the…

Continue Reading Range-wide and temporal genomic analyses reveal the consequences of near-extinction in Swedish moose

Distinct non-synonymous mutations in cytochrome b highly correlate with decoquinate resistance in apicomplexan parasite Eimeria tenella | Parasites & Vectors

Chapman HD, Rathinam T. Focused review: the role of drug combinations for the control of coccidiosis in commercially reared chickens. Int J Parasitol Drugs Drug Resist. 2022;18:32–42. PubMed  PubMed Central  Google Scholar  Peek HW, Landman WJM. Coccidiosis in poultry: anticoccidial products, vaccines and other prevention strategies. Vet Q. 2011;31:143–61. CAS …

Continue Reading Distinct non-synonymous mutations in cytochrome b highly correlate with decoquinate resistance in apicomplexan parasite Eimeria tenella | Parasites & Vectors

Allele frequencies in plink including physical position in the output

Hi, I am trying to compute allele frequencies for a large genotypic data set. The command I am using is as follows: plink2 --vcf my_file.vcf.gz --freq --map my_file.map --out my_outfile The reason I am using a map file is…

Continue Reading Allele frequencies in plink including physical position in the output

No output detected

To clarify, wes_pgen_12a refers to a set of plink files (.pgen, .pvar, .psam). I have removed some of the backslashes in case those were confusing the program, and now have: plink2 --pfile "${pgen_path}wes_pgen_12a" \ --pmerge "${pgen_path}wes_pgen_12b.pgen" \ "${pgen_path}wes_pgen_12b.pvar" \ …

Continue Reading No output detected

public databases – Converting VCF format to text for use with PLINK and understanding column mapping

I successfully completed Nature PRS tutorial, which is based on PLINK. Turning to my real data, I downloaded ukb-d-20544_1.vcf.gz. Now I’m facing the problem that I seem to be unable to use it in PLINK or find the correct data format to download at all, and I am a bit…

Continue Reading public databases – Converting VCF format to text for use with PLINK and understanding column mapping

Mexican Biobank advances population and medical genomics of diverse ancestries

Encuesta Nacional de Salud 2000 Since 1988, Mexico has established periodical National Health Surveys (Encuesta Nacional de Salud (ENSA), originally conceived as National Nutrition Surveys) for surveillance of Mexican population-based nutrition and health metrics. In this study, we use data and samples collected from the survey carried out in 2000,…

Continue Reading Mexican Biobank advances population and medical genomics of diverse ancestries

Line 15522 of data/Pheno_KFs1.txt has fewer tokens than expected in GWAS analysis

I’m performing GWAS using UKB imputed genetic data below. However, I got the error as follows. plink2 --bfile data_TL --glm hide-covar --pheno data/Pheno_KFs1.txt --pheno-name LogBUN_mg_dl --covar data/Covariatesdata.txt --covar-name PC{1..10}, Age, Tuoi, Sex --extract TL_snplist_All.txt --out output/GWAS_BUN.cvrt PLINK v2.00a6LM 64-bit Intel (27 Sep 2023) www.cog-genomics.org/plink/2.0/ (C) 2005-2023 Shaun Purcell, Christopher Chang…

Continue Reading Line 15522 of data/Pheno_KFs1.txt has fewer tokens than expected in GWAS analysis
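A quick way to investigate a "fewer tokens than expected" message like the one in the post above is to count the fields on every row of the phenotype table and compare against the header. The sketch below is only an illustration: it assumes the file is split on runs of whitespace (so an empty cell would make a row look short), and the function name and the three-column example table are hypothetical, not taken from the post.

```python
import io


def find_short_lines(handle):
    """Return the header width and a list of (line_number, token_count)
    for every non-empty row that has fewer tokens than the header."""
    header = handle.readline().split()
    expected = len(header)
    short_rows = []
    # Line 1 is the header, so data rows start at line 2.
    for line_number, line in enumerate(handle, start=2):
        tokens = line.split()  # assumption: whitespace-delimited tokens
        if tokens and len(tokens) < expected:
            short_rows.append((line_number, len(tokens)))
    return expected, short_rows


# Hypothetical phenotype table: the row on line 3 is missing its value.
example = io.StringIO("FID IID LogBUN_mg_dl\n1 1 2.5\n2 2\n3 3 2.9\n")
expected, short_rows = find_short_lines(example)
print(expected, short_rows)
```

To check a real file, the same function can be called on `open("data/Pheno_KFs1.txt")`; rows it reports are candidates for blank cells that need an explicit missing-value code.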

variant calling – INDELS in PLINK files converted to VCF

I want to compare/validate variants called from sequencing data with array (plink format) variant data. I converted the plink files (.bim, .bed, and .fam files) with plink1 to vcf files. plink --bfile prefix_plink --recode vcf-iid --out prefix_out However, the plink vcf files have “I” and “D” values for INDEL variants…

Continue Reading variant calling – INDELS in PLINK files converted to VCF

Update sample information in chunks

plink --bfile {chr1_exomes} --update-ids {new_IIDs_A} --make-bed --out {updated_chr1_exomes_A} plink --bfile {chr1_exomes} --update-ids {new_IIDs_B} --make-bed --out {updated_chr1_exomes_B} plink --bfile {updated_chr1_exomes_A} --bmerge {updated_chr1_exomes_B}.bed {updated_chr1_exomes_B}.bim {updated_chr1_exomes_B}.fam --make-bed --out {merged_chr1_exomes_A_B} Original data: ID1, ID2, … New data: ID1_A, ID1_B, ID2_A, ID2_B, … Would updating the IDs of the .fam file be enough in this…

Continue Reading Update sample information in chunks

Picard Liftover MismatchedRefAllele PsychArray

New to using liftOver and working with vcf files generally: I ran liftOver on data gathered from the PsychChip array to lift over from GRCh37 to GRCh38, and got only about 50% of variants lifted over. Most of the rejected ones had “MismatchedRefAllele” as their…

Continue Reading Picard Liftover MismatchedRefAllele PsychArray

Fast Eqtl Analysis Tool

I’ve got about 2M imputed SNPs and 35K gene expression probesets, and I’d like to identify all eQTL. Running this under PLINK --linear is going to take a very long time; are there any specialized tools out there to handle this sort of data? eqtl…

Continue Reading Fast Eqtl Analysis Tool

Why my GWAS p-value QQ-plot is far above diagonal

Hi. I’m trying to run a GWAS pipeline using plink, but the results I got look really off. The QQ-plot of the p-values is far above the diagonal. The phenotype I used is height. I’m pretty sure I followed the…

Continue Reading Why my GWAS p-value QQ-plot is far above diagonal
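One common way to put a number on a QQ-plot that sits above the diagonal, as described in the post above, is the genomic inflation factor: the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null (about 0.455). The sketch below is a minimal illustration using only the Python standard library; the function name and the toy p-value lists are assumptions made for the example, not part of the original question.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided p-values.

    Each p-value is converted to a 1-df chi-square statistic via the
    normal quantile: chi2 = z**2 with z = inv_cdf(p / 2).
    """
    standard_normal = NormalDist()
    chi2 = [standard_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution, approximately 0.455.
    null_median = standard_normal.inv_cdf(0.75) ** 2
    return median(chi2) / null_median


# Toy inputs: an evenly spaced grid mimics well-behaved null p-values,
# and squaring each p-value mimics a systematically inflated test.
uniform = [(i + 0.5) / 1000 for i in range(1000)]
inflated = [p ** 2 for p in uniform]
print(round(genomic_inflation(uniform), 2))   # close to 1.0
print(genomic_inflation(inflated) > 1.5)      # clearly inflated
```

A value near 1 suggests the bulk of the test statistics follow the null; for a highly polygenic trait such as height, some inflation is expected even in a well-controlled analysis, so lambda alone does not prove a pipeline error.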

Whole-genome sequencing analysis of suicide deaths integrating brain-regulatory eQTLs data to identify risk loci and genes

Li QS, Shabalin AA, DiBlasi E, Gopal S, Canuso CM, FinnGen ISGC, et al. Genome-wide association study meta-analysis of suicide death and suicidal behavior. Mol. Psychiatry 2023;28:891–900. McGuffin P, Marusic A, Farmer A. What can psychiatric genetics offer suicidology? Crisis. 2001;22:61–65. Article  CAS  PubMed  Google Scholar  Pedersen NL, Fiske A….

Continue Reading Whole-genome sequencing analysis of suicide deaths integrating brain-regulatory eQTLs data to identify risk loci and genes

Need help to setup a PGS pipeline

I seek a bioinformatician with experience setting up pipelines in an academic environment. I want to be able to take different file formats from WGS and then do QC, imputation, and then run PGS using different industry-standard libraries and tools. I am…

Continue Reading Need help to setup a PGS pipeline

Pangenome analysis provides insight into the evolution of the orange subfamily and a key gene for citric acid accumulation in citrus fruits

Swingle, W. T. & Reece, P. C. In The Citrus Industry, History, World Distribution, Botany, and Varieties, Vol. 1 (eds Reuther, W. et al.) 190–143 (Univ. of California Press, 1967). Morton, C. M. & Telmer, C. New subfamily classification for the Rutaceae. Ann. Mo. Bot. Gard. 99, 620–641 (2014). Article …

Continue Reading Pangenome analysis provides insight into the evolution of the orange subfamily and a key gene for citric acid accumulation in citrus fruits

Issue with Numerical Covariate * Covariate Interaction in Association Analysis

I tried to perform a covariate-covariate interaction analysis using numerical covariates of interest, namely activity_data and age. However, I encountered an error message stating that the number of samples was equal to or less than the number of predictor columns for the specified phenotype. Just to make sure I run…

Continue Reading Issue with Numerical Covariate * Covariate Interaction in Association Analysis

QC of genetic data

Hi, I have some genetic data in a bim file. The chromosomes range from 0 to 23 and 26, which I have not come across before. Should the SNPs on chromosome 0 and 26 be removed from the genetic file or left in? Then, I…

Continue Reading QC of genetic data

Issue with merging in plink and eigensoft.

I merged two datasets in plink1.9. It worked, but I did get the error “multiple positions seen for variant” and “variants have the same position”. How do I resolve this when trying to merge the datasets again? And, I tried to merge…

Continue Reading Issue with merging in plink and eigensoft.

Allele frequency calculation for genotype dosage value

Hello, I have a data set with dosage data (between 0 and 2) for a couple million SNPs, and I would like to get the MAF for each SNP. I saw somewhere (not a very reliable place) that you can get it just by doing: SNP1…

Continue Reading Allele frequency calculation for genotype dosage value
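For the dosage question above, the usual calculation is that the frequency of the counted allele equals the mean dosage divided by 2 (each diploid sample carries two alleles), and the MAF is the smaller of that frequency and its complement. The sketch below assumes diploid samples, dosages on a 0 to 2 scale, and missing values encoded as `None`; the function name and the example values are hypothetical.

```python
def maf_from_dosages(dosages):
    """Minor allele frequency from per-sample dosages in the range 0 to 2.

    Missing dosages (None) are skipped, so the denominator is twice the
    number of samples that actually have a value.
    """
    observed = [d for d in dosages if d is not None]
    if not observed:
        raise ValueError("no non-missing dosages for this SNP")
    allele_freq = sum(observed) / (2 * len(observed))
    # The minor allele is whichever allele is rarer.
    return min(allele_freq, 1 - allele_freq)


# Hypothetical dosages for one SNP across five samples, one missing.
snp1 = [0.0, 1.0, 2.0, 0.1, None]
print(maf_from_dosages(snp1))
```

Running this once per SNP (for example, per row of a dosage matrix) gives the per-variant MAF; whether the result matches a given tool's output depends on how that tool treats missing values and which allele it counts.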

Highly inflated p-values in GWAS by regenie

I was running a GWAS using REGENIE 3.2.5 on more than 250,000 samples, and the p-values returned are highly inflated, with -log10P up to 5000. As a result there were over 10,000 variants called significant under the threshold of p < 5e-8,…

Continue Reading Highly inflated p-values in GWAS by regenie

Genome-Wide Association Study of Alopecia Areata in Taiwan

Introduction Alopecia areata (AA) is one of Taiwan’s most common autoimmune hair diseases and incidence rate of AA is 0.22%.1–3 The main symptoms of AA are rapid, non-scarring hair loss that affects body hair, facial hair, eyelashes, and brows.1,2 In the United States, the prevalence of AA is estimated to…

Continue Reading Genome-Wide Association Study of Alopecia Areata in Taiwan

Genetic distance in cM from VCF of non-reference species to run Beagle

I’m working with a resequenced genome of a non-reference species. The VCF contains ~7 million SNPs, all with their relative position on their own chromosome. I have 10.01% missing data, so I need to impute these NAs. I eventually settled on Beagle v5 as a tool,…

Continue Reading Genetic distance in cM from VCF of non-reference species to run Beagle

–update-ids with long IID names

Hello! I have a plink2 pgen/pvar/psam file set where the FID corresponds to a unique user ID and the IID is a long name (typically over 50 characters) that starts with the FID and has other information separated by _. An example FID is R123456789_chipname_plate_platenumber_A01. Each R123456789 identifier is unique….

Continue Reading –update-ids with long IID names