Tag: plink
Different relatedness estimates by PLINK and VCFTOOLS despite same method
According to the vcftools manual, specifying the “--relatedness2” flag allows calculating relatedness statistics using the method by Manichaikul et al., Bioinformatics 2010 (doi:10.1093/bioinformatics/btq559), that is, the KING method. According to the PLINK manual, PLINK uses the same method to calculate relatedness when specifying the flag “--make-king-table”. So, although both PLINK…
calculated an LD matrix for a locus using plink2
Hi everyone, I have a genotype file in .pgen format that I subset and would like to calculate an LD matrix for. Previously I used the .bed file format with plink 1.9, which worked like a charm. But unfortunately my file is in pgen format, which is not supported in plink…
PCA from plink2 for SGDP using a pangenome and DeepVariant
Hi there, I’m doing my first experiments with PCA and UMAP as dimensionality reductions to visualize a dataset I’ve been working on. Basically, I used the samples from the SGDP which I then mapped on the human pangenome for, finally, calling small variants with DeepVariant. I moved on with some…
Imputing missing genotypes in --score
Does plink 2 impute missing genotypes with this pipe? plink2 --threads 1 \ --read-freq freq.afreq \ --vcf tube.vcf \ --score score_file.anno.plink2.tsv ignore-dup-ids \ …
Genomic insights into Plasmodium vivax population structure and diversity in central Africa | Malaria Journal
Hamblin MT, Di Rienzo A. Detection of the signature of natural selection in humans: evidence from the Duffy blood group locus. Am J Hum Genet. 2000;66:1669–79. Hamblin MT, Thompson EE, Di Rienzo A. Complex signatures of natural selection at the Duffy blood group…
Characterization of runs of Homozygosity revealed genomic inbreeding and patterns of selection in indigenous sahiwal cattle
Almeida OAC, Moreira GCM, Rezende FM et al (2019) Identification of selection signatures involved in performance traits in a paternal broiler line. BMC Genomics 20:1–20. doi.org/10.1186/s12864-019-5811-1 Alshawi A, Essa A, Al-Bayatti S, Hanotte O (2019) Genome Analysis Reveals Genetic Admixture and Signature of Selection for Productivity and…
How to convert and annotate apt-probeset-genotype into PLINK format
Dear all, I called SNP genotypes of 100 Affy6 CEL files using apt-probeset-genotype from APT in order to perform a subsequent CNV analysis with PennCNV. As PennCNV doesn’t integrate an SNP quality control procedure (move out SNP with genotype call <…
TWAS revealed significant causal loci for milk production and its composition in Murrah buffaloes
Cao, C. et al. Power analysis of transcriptome-wide association study: Implications for practical protocol choice. PLoS Genet. 17(2), e1009405 (2021). De Camargo, G. M. F. et al. Prospecting major genes in dairy buffaloes. BMC Genomics 16, 1–14 (2015). El-Halawany, N….
Diversity and dissemination of viruses in pathogenic protozoa
Wang, A. L. & Wang, C. C. Viruses of the protozoa. Annu. Rev. Microbiol. 45, 251–263 (1991). Banik, G., Stark, D., Rashid, H. & Ellis, J. Recent advances in molecular biology of parasitic viruses. Infect. Disord. – Drug Targets 14, 155–167 (2015). …
Archaic Introgression Shaped Human Circadian Traits | Genome Biology and Evolution
Abstract When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely, Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultraviolet radiation and increased variation in…
How to compute Hudson’s/Bhatia’s FST in R OR with vcf?
Hi everyone, How can I compute hierarchical Fst with Bhatia’s/Hudson’s estimator using a vcf as input? My data is structured like this: there are individuals within sampling sites, and sampling sites within groups. My vcfs contain SNP data (~1000…
Indigenous Australian genomes show deep structure and rich novel variation
Inclusion and ethics The DNA samples analysed in this project form part of a collection of biospecimens, including historically collected samples, maintained under Indigenous governance by the NCIG11 at the John Curtin School of Medical Research at the Australian National University (ANU). NCIG, a statutory body within ANU, was founded…
genetics – Trouble with Phenotype File in PLINK GWAS – 0 Individuals with Non-Missing Phenotypes
Problem Description: I am facing an issue while running a Genome-Wide Association Study (GWAS) using PLINK. Despite specifying the phenotype file and confirming the presence of the phenotype column (‘ChildPhenotype’), I consistently receive the error message: “0 individuals have non-missing phenotypes.” I have ensured that the values in the specified…
max-maf not filtering properly
Hi Chris, I have a vcf file for which I have left-aligned and split multi-allelic sites. Then I used plink2 --vcf test.vcf --make-bed --out test1; this gives me the binary fileset. Then I updated FID and sex (all males, all founders). In plink2: plink2 --bfile test1 --max-maf 0.01 --geno 0.05 --make-bed…
PLINK can’t find my files?
Uncovering myocardial infarction genetic signatures using GWAS exploration in Saudi and European cohorts
Benjamin, E. J. et al. Heart disease and stroke statistics-2019 update: A report from the American Heart Association. Circulation 139, e56–e528 (2019). Yusuf, S. et al. Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): Case-control study. Lancet 364,…
Infer ancestry for RNA-seq data
I generated VCF files with bcftools for 4 patient RNA-seq samples. I was also able to generate bed, bim, and fam files with PLINK for these files. I want some guidance on how to infer ancestry for these RNA-seq samples: How do I find…
Failed to open /ROH/.log. Try changing the --out parameter.
When I used this code in R: system("plink --vcf Pakistan.total.vcf --homozyg --homozyg-window-snp 50 --homozyg-snp 50 --homozyg-window-missing 3 --homozyg-kb 100 --homozyg-density 1000 --allow-extra-chr --out /ROH/plink/n") I got this error: Error: Failed to open /ROH/plink/n.log. Try changing the --out parameter. How…
Converting txt.gz to PLINK bim
Hello, I’m trying to do a stratified LDSC (or S-LDSC/partitioned LDSC) between a locus of interest and diseases (diabetes, arthritis, etc.). For the locus of interest, I have a bed file from previous research. For the diseases, I have downloaded GWAS sumstats from the GWAS atlas. I…
How to perform quality control for sex when there are no variants after thresholding for MAF
I am trying with PLINK. Would it be accurate to merge with 1000 genomes European allele frequencies…
Genomics England hiring PhD Bioinformatics Intern in London, England, United Kingdom
Company Description: Genomics England partners with the NHS to provide whole genome sequencing diagnostics. We also equip researchers to find the causes of disease and develop new treatments – with patients and participants at the heart of it all. Our mission is to continue refining, scaling, and evolving our ability to…
Noisy manhattan plot
Hi! I’m running GWAS on plink 2.00a4LM. My case cohort has roughly 600 individuals and my control cohort has ~4000 individuals. After running the GWAS, I plot the results using R. After some data exploration I decided to exclude some samples in order to avoid having samples with close family relationships…
PhD Bioinformatics Intern Job in Greater London, Pharmaceuticals & Life Sciences Career, Intern/Graduate Jobs in Genomics England
Company Description Genomics England partners with the NHS to provide whole genome sequencing diagnostics. We also equip researchers to find the causes of disease and develop new treatments – with patients and participants at the heart of it all. Our mission is to continue refining, scaling, and evolving our…
Issue with genetic QC sex check
Hi, I am doing a sex check on genetic data for a cohort I am working on, consisting of about 830 people. Most people seem to have incorrect sex assignment (around 560 problems). I have used plink QC and there were no people…
threads in plink GWAS
Hello, I am using plink with a dosage file to do GWAS as below. However, when I set threads=10 and check the actual CPU usage, plink only uses 1 CPU instead of 10. Does the threads option make no difference in plink? module load plink/1.90b4.1 plink --dosage ${geno_dir}/${dosage_filename} noheader skip1=2 skip2=2…
VCF conversion into Treemix
I have a multi-sample vcf file with ~7 million SNPs. Now I want to convert it into the format required by Treemix. I ran it using vcf2treemix.sh along with plink2treemix.py, but plink2treemix.py runs very slowly, so that if I use it, the analysis in…
Global genetic diversity, introgression, and evolutionary adaptation of indicine cattle revealed by whole genome sequencing
Loftus, R. T., MacHugh, D. E., Bradley, D. G., Sharp, P. M. & Cunningham, P. Evidence for two independent domestications of cattle. Proc. Natl Acad. Sci. USA 91, 2757–2761 (1994). Verdugo Marta, P. et al. Ancient cattle genomics, origins, and rapid turnover…
Postdoctoral Researcher in Alzheimer’s disease Genetics, Multi-Omics, and Imaging Biomarkers, St Louis, MO, USA
Location: Department of Neurology, NeuroGenomics and Informatics Center, Washington University in St. Louis. Description: The Washington University School of Medicine, Department of Neurology, has an opening for a postdoctoral research associate to join the Belloy lab in the NeuroGenomics and Informatics Center (NGI). The successful candidate will be involved…
Negative F statistics for sex check in plink
Hi, I have a sample of 800 people. I did the sex check in plink 1.90 and have problems for 580 of these people, and a lot have a negative F statistic, which is not what I expected. The PAR…
--glm no-firth: Segmentation fault
I am running GWAS with binary phenotypes. PLINK v2.00a6LM AVX2 AMD (21 Nov 2023) First option: --glm hide-covar single-prec-cc cc-residualize => Error: Cannot proceed with --glm regression on phenotype ‘pheno1’, since covariate-only Firth regression failed to converge. Second option: --glm hide-covar single-prec-cc cc-residualize no-firth => Segmentation fault. Third option: --glm hide-covar…
Pruning with --indep-pairwise with plink 1.9
I’m new to PLINK and I would like to obtain a file with SNPs in approximate linkage equilibrium. Here is my script and the outputs of each step. If someone could tell me if there is an error in the script because at…
normalize not left-normalizing?
I’m running plink2 to convert a vcf to a pgen with pseudobiallelic variants. Calling --normalize does not seem to left-normalize as I would expect, at least when I look at the .pvar. Log: PLINK v2.00a6LM AVX2 Intel (21 Nov 2023) www.cog-genomics.org/plink/2.0/ (C) 2005-2023 Shaun Purcell, Christopher Chang …
Merging several vcf files for GWAS?
Hello! I am a medical student without much background in bioinformatics, trying to perform the analysis for my first GWAS and tremendously overwhelmed. It’s a case-control association study with samples from 50 subjects that we sampled using the Novogene NGS platform. The problem is,…
Inconsistent glm output across repeated runs
Hello, I’ve repeatedly run the exact same plink2 command several times and noticed my glm output was not always the same, which seems to be some sort of nondeterministic bug. ~8-9 out of 10 times, the output is the same, but every now and then the p-value is several orders…
Multi-ancestry genome-wide association study of cannabis use disorder yields insight into disease biology and public health implications
Inclusion and ethics statement: We included researchers from the iPSYCH biobank and the PGC, who played a role in study design. This research was not restricted or prohibited in the setting of any of the included researchers. All studies were approved by local institutional research boards and ethics review committees….
Mutation of key signaling regulators of cerebrovascular development in vein of Galen malformations
Adams, R. H. & Eichmann, A. Axon guidance molecules in vascular patterning. Cold Spring Harb. Perspect. Biol. 2, a001875 (2010). Fish, J. E. & Wythe, J. D. The molecular regulation of arteriovenous specification and maintenance. Dev. Dyn. 244, 391–409 (2015). …
Clumping with r2=0 and 250kb radius in plink
Hi, I am doing clumping with the following command: plink \ --bfile ${myfilename} \ --keep all_hg38_EUR.ids \ --clump ${trait}_tmp2.txt \ --clump-snp-field SNP \ --clump-field P \ --allow-extra-chr \ --memory 30000 \ --clump-p1 5e-8 \ --clump-r2 0 \ --clump-kb 250 \ --out…
Inquiry Regarding NA P-values in Logistic Regression
Thank you for your assistance. I apologize for the inconvenience, but I still have a question to ask you. In the output file after conducting logistic regression, I do not see an “ERROR” column. Does this indicate that my data has all passed the multicollinearity check? Additionally, regardless of how…
Quality control on imputed genotypes for GWAS / application of PGS
Hi everyone, I want to run a GWAS on imputed genotypes from UKB. Unfortunately, I only found tutorials that describe the quality control of genotypes in preparation for a GWAS. Are there tutorials for imputed datasets? I suppose…
Comparative genomics and genome-wide SNPs of endangered Eld’s deer provide breeder selection for inbreeding avoidance
De novo genome assemblies and genome annotation We assembled a de novo genome of a seven-year-old male SED from Ubon Ratchathani Zoo using a combination of Illumina short-reads (92.94 × coverage) and PacBio long-reads (61.6 × coverage) (GenBank accession number: JACCHN000000000). Additionally, we used MGI short-reads (52.15 × coverage) to assemble a de novo genome of…
Alternatives to snpflip to find ambiguous and flipped snps
Hello everyone, I am having an issue with strand flips when trying to perform imputation. In the past, an old HPC I used supported snpflip, a tool which would recognize ambiguous snps as well as snps that have been…
Plink2 PCA approx memory allocation
This is great, thank you! Will this information be included in the PLINK2 documentation? The successful run we had included the log below. In the “Projecting random vectors” line, 21 steps are described, rather than the number 20 of requested principal components. I assume this is part of how the…
University of Alabama at Birmingham hiring BIOINFORMATICIAN I in Birmingham, Alabama, United States
Position Summary: The primary role is to execute a variety of data management and analysis tasks, ensuring the quality, reproducibility, and efficiency of processes related to high-dimensional data. You will collaborate with study investigators and fellow bioinformatics professionals within the department to contribute to high-quality, reproducible research across various scientific…
BAD_ES in plink1.9’s meta-analysis
Hi Chris, I have a question regarding plink1.9’s meta-analysis. I’m using plink1.9 for meta-analysis with plink2’s glm’s output. I have several problematic lines in the “meta.analysis.prob” output. They are caused by monomorphic sites in some sub-studies but not for all. In the plink2’s glm’s output, those sites are marked with an error of “CONST_OMITTED_ALLELE”….
Genome-wide meta-analysis, functional genomics and integrative analyses implicate new risk genes and therapeutic targets for anxiety disorders
Kessler, R. C. et al. Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the National Comorbidity Survey Replication. Arch. Gen. Psychiatry 62, 593–602 (2005). Kessler, R. C. et al. Prevalence, persistence, and sociodemographic correlates of DSM-IV disorders in the National Comorbidity Survey Replication Adolescent Supplement….
Divergent mechanisms of reduced growth performance in Betula ermanii saplings from high-altitude and low-latitude range edges
Aizawa M, Yoshimaru H, Saito H, Katsuki T, Kawahara T, Kitamura K et al. (2009) Range‐wide genetic structure in a north‐east Asian spruce (Picea jezoensis) determined using nuclear microsatellite markers. J Biogeogr 36(5):996–1007 Alexander DH, Novembre J, Lange K (2009) Fast model-based estimation of ancestry in unrelated…
PLINK 1.9 meta-analysis
I’m trying to meta-analyze together some PLINK2 (“--glm omit-ref hide-covar cols=+a1freq,+beta”) outputs. I think I’m having trouble understanding the syntax requirement for PLINK 1.9’s --meta-analysis feature. I interpreted the doc as indicating that adding ‘logscale’ after the filenames would cause it to look for ‘BETA’ in the input. PLINK…
Landscape genomics reveals adaptive genetic differentiation driven by multiple environmental variables in naked barley on the Qinghai-Tibetan Plateau
Abebe TD, Naz AA, Léon J (2015) Landscape genomics reveal signatures of local adaptation in barley (Hordeum vulgare L.). Front Plant Sci 6:813 Alexander DH, Novembre J, Lange K (2009) Fast model-based estimation of ancestry in unrelated individuals. Genome Res 19:1655–1664 …
How to create AffymetrixSuite file for using it in apt-format-result tool?
Hi, I need to create all_genotypes_by_snps.CHP.bin and all_genotypes_by_snps.CHP.index.txt files for snpchip samples using related APT tools. I will use those files in creating genotype calling files as below, and eventually will create plink files from this…
A question about the missing or not observed alleles in PLINK datasets
Hello everyone, I would like to ask a question about the SNP array: I have eight PLINK datasets and I noticed I have quite a large number of variants with missing or no observed allele. What could…
Looking to compute R-squared with P-value for LocusZoom plot
Looking to compute R-squared values for a list of SNPs associated with specific phenotypes. Interested in having both p-values and R-squared scores for each SNP. Any advice on how to do this efficiently? After obtaining the R-squared and the p-value, I want…
plink1.9 chr23 extraction error
Just a quick update – I used PLINK2.0, which allowed me to pass through the public1.bim file, but now there is a problem with the public.fam file. (C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to public1_filtered.log. Options in effect: --bfile public1 --extract snp_ids_only.txt --make-bed --out…
‘PC1’ entry on line 8297 of is categorical
> str(pheno_cov$PC1) num [1:10691] -0.0016 -0.001615 -0.001843 -0.001882 -0.000693 … PLINK v2.00a3.6LM AVX2 Intel (14 Aug 2022) www.cog-genomics.org/plink/2.0/ (C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to ./res/test_plink2.log. Options in effect: --bfile updateID_geno0.02_maf0.005 --ci 0.95 --covar ./pheno_cov/pheno_cov.txt --covar-name Sex_Code,PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10 --covar-variance-standardize --glm hide-covar omit-ref --memory 200000 --out…
Locally annotating SNP IDs and Gene names of called variants
I have GWAS results after variant calling. The VCF file only had CHR (1:22) and POS (12345678 etc) information but the ID column has all “.”, namely no rsIDs in it. After GWAS analysis I have a list of…
Plink2 --extract not working
Hi Chris, This problem seems so silly but I just got stuck here for a long time. I tried to extract a set of SNPs (I’m pretty sure they all appear in the .bim file) and rename their IDs with --set-all-var-ids, and…
Plink Error
Hi, I am trying to convert the ped file for hapmap3 that I downloaded here ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2009-01_phaseIII/plink_format/ and unzipped with bunzip2, but I am getting the following error when running this command: PLINK v1.90b5.2 64-bit (9 Jan 2018) www.cog-genomics.org/plink/1.9/ (C) 2005-2018 Shaun Purcell, Christopher Chang GNU General Public License v3…
No valid entries in --score file
Hi, I ran this command on plink1.9 to calculate the polygenic score. plink --vcf sample --score output.txt 1 2 3 --out poligenic_results – output.txt: ID ALT UKB-b-15541 rs10399793 C 0.000345793 rs2462492 T -0.00027716 – sample.vcf: #CHROM POS ID REF ALT QUAL…
Cannot proceed with --glm regression on phenotype ‘platelet_count’, since covariate correlation matrix could not be inverted (VIF_INFINITE)
Hi, I am trying to run GWAS on the Platelet Count phenotype. I am using sex, age, assessment center, genotype measurement, and 40 PCs as a covariance matrix. I removed NA from the phenotype file, and I fitted it to the covariance matrix so it contains the same IDs. I calculated the…
No --bgen REF/ALT mode specified
runPLINK <- function(PLINKoptions = "") system(paste("/opt/apps/plink/2.0/bin/plink2", PLINKoptions)) runPLINK() runPLINK("--bgen /DATA/shared/bcac/genotypes/v10/bgen/icogs_euro/iCOGS_european_chr21.bgen --out /DATA/users/m.shokouhi/plink/plink2") The code above makes PGEN/PVAR/PSAM files, but there is a warning in the procedure: No --bgen REF/ALT mode specified. On the plink2 website it is written that it considers the first allele as a reference allele if you do not specify…
A question about genotyping rate
Hello everyone, I have four PLINK samples. I harmonized the samples using Genotype Harmonizer in the presence of a reference panel. The genotyping rate for each PLINK sample is around 0.98-0.99. When I merge the four PLINK sets, the genotyping rate drops to 0.76 in…
Calculating height prediction from PGS
You don’t really get a “predicted” height per se; rather, you get a score that correlates with height. With the provided weights, you can directly use plink --score to generate the polygenic score, but most likely they won’t look like height (e.g. mean…
No samples in .vcf file.
I am trying to convert my vcf file into a BED format file. When I use this command: plink --vcf merge.bacteria.vcf.gz --make-bed --out merge.bacteria.vcf.bed I get the following error stating: PLINK v1.90b6.21 64-bit (19 Oct 2020) www.cog-genomics.org/plink/1.9/ (C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License…
Plink2 error
I downloaded data from the 1000 genomes website. MacBook-Air-4:plink_mac mac$ ./plink --vcf ALL.chr1.shapeit2_integrated_snvindels_v2a_27022019.GRCh38.phased.vcf --make-bed --out char1 PLINK v1.90p 64-bit (13 Feb 2023) www.cog-genomics.org/plink/1.9/ (C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to char1.log. Options in effect: --make-bed --out char1…
allow-no-covars not recognized when using plink2 glm without covariates
Hi! I would like to run association tests without covariates, to show the effect of population stratification correction. When I first tried doing this, I was instructed to use the ‘allow-no-covars’ modifier. However, when I run my command including this modifier, I get the error: Error: Unrecognized flag (‘--allow-no-covars’). The…
Normalisation of PLINK/VCF files?
Variant notations can vary significantly, and although there are numerous tools available to address this issue, such as bcftools +fixref or bcftools norm, there’s still a chance that something might be overlooked. Is there a comprehensive tool or pipeline that automates this process to ensure…
Association Analysis with Plink error
1. This is my phenotype file (called outputfile.txt in command line use): FID IID Cadmium_Chloride Caffeine Calcium_Chloride Cisplatin Cobalt_Chloride Congo_red Copper Cycloheximide Diamide E6_Berbamine Ethanol Formamide Galactose Hydrogen_Peroxide Hydroquinone Hydroxyurea Indoleacetic_Acid Lactate Lactose Lithium_Chloride Magnesium_Chloride Magnesium_Sulfate Maltose Mannose Menadione Neomycin Paraquat Raffinose SDS Sorbitol…
How do I remove duplicate SNPs in PLINK from more than 1 data set?
Hi there, I am trying to remove duplicate SNPs from my data, but I have data from 6 different panels and I am not sure how to do them all in plink at once. SNPs…
Genome-wide association study of traumatic brain injury in U.S. military veterans enrolled in the VA million veteran program
Helmick KM, Spells CA, Malik SZ, Davies CA, Marion DW, Hinds SR. Traumatic brain injury in the US military: Epidemiology and key clinical and research programs. Brain Imaging Behav. 2015;9:358–66. DoD Numbers for Traumatic Brain Injury Worldwide – Totals (Defense Health Agency) (2021). Karr JE, Areshenkoff…
PLINK | Updating sex information issue
PLINK | Updating sex information issue 0 Hello, I am attempting to update the sex information of my cohort vcf data by using PLINK. This is the command I am running: plink –bed input_bed –bim input_bim –fam input_fam –update-sex input_sex.txt –make-bed —out output_name > stdout.out For some reason, I am…
Genomic Data Analyst job in Pojoaque, NM at Private Bioscience @ Get.It
Summary Description: We are looking for a talented bioinformatician or computational biologist who specializes in utilizing polygenic scores and machine learning techniques to analyze genomic data, particularly for predicting complex disease and trait phenotypes. The ideal candidate will possess a solid understanding of genetics and genomics, along with expertise in…
What do HAP A1 and HAP A2 mean in plink --freqx output?
Let’s assume: SNP1 = A/C, SNP2 = T/G, SNP3 = A/G, SNP4 = C/T. How do I know what C(HAP A1) and C(HAP A2) mean? Is HAP A1 = ATAC and HAP A2 = CGGT? This wouldn’t make…
Genome-wide association study in 404,302 individuals identifies 7 significant loci for reaction time variability
MacDonald SW, Li SC, Bäckman L. Neural underpinnings of within-person variability in cognitive functioning. Psychol Aging. 2009;24:792–808. Haynes BI, Bunce D, Kochan NA, Wen W, Brodaty H, Sachdev PS. Associations between reaction time measures and white matter hyperintensities in very old age. Neuropsychologia. 2017;96:249–55. …
PLINK .ped file issue
Hi everyone, relative newbie here (non-bioinformatics background; got to know EWAS and TWAS before, but have no experience with plink). I am trying to run a GWAS using a .ped and a .map file (got nothing else apart from the raw .idat files). I am trying to use plink for…
About plink2 error
Dear, sorry for the error message. I put plink and plink2 at the same path (my desktop), but I can only access plink, not plink2, like this: “plink2” can’t be opened because Apple cannot check it for malicious software. Originally I was trying to run plink2 --zst-decompress all_phase3.pvar.zst > all_phase3.pvar So…
Transcriptional regulation and overexpression of GST cluster enhances pesticide resistance in the cotton bollworm, Helicoverpa armigera (Lepidoptera: Noctuidae)
Bras, A., Roy, A., Heckel, D. G., Anderson, P. & Green, K. K. Pesticide resistance in arthropods: ecology matters too. Ecol. Lett. 25, 1746–1759 (2022). Chen, Y. H. & Schoville, S. D. Editorial overview: ecology: ecological adaptation in agroecosystems: novel opportunities to integrate evolutionary…
How to merge my vcf files (n=6) with existing Pf6 vcf file and do pca?
I sampled some Pf strains and had WGS done on them. Now I want to merge them with existing Pf6 data. For this I downloaded Pf6 data for all 14 chromosomes. I then used bcftools…
Troubleshooting multiallelic variant merging issue
Hello, I want to recode the IIDs of imputed data .bgen files into two different filesets, and merge these (working on eye-level analyses with Regenie). As I’m only interested in dosages, I’ve converted these to .pgen using PLINK2 (ref-first as UK Biobank): plink2 --bgen data.bgen ref-first --sample data.sample --update-ids recoded_ids_a.txt --make-pgen…
Range-wide and temporal genomic analyses reveal the consequences of near-extinction in Swedish moose
Ceballos, G., Ehrlich, P. R. & Raven, P. H. Vertebrates on the brink as indicators of biological annihilation and the sixth mass extinction. Proc. Natl Acad. Sci. USA 117, 13596–13602 (2020). Ceballos, G., Ehrlich, P. R. & Dirzo, R. Biological annihilation via the…
Distinct non-synonymous mutations in cytochrome b highly correlate with decoquinate resistance in apicomplexan parasite Eimeria tenella | Parasites & Vectors
Chapman HD, Rathinam T. Focused review: the role of drug combinations for the control of coccidiosis in commercially reared chickens. Int J Parasitol Drugs Drug Resist. 2022;18:32–42. Peek HW, Landman WJM. Coccidiosis in poultry: anticoccidial products, vaccines and other prevention strategies. Vet Q. 2011;31:143–61. …
Allele frequencies in plink including physical position in the output
Hi, I am trying to compute allele frequencies for a large genotypic data set. The command I am using is as follows: plink2 --vcf my_file.vcf.gz --freq --map my_file.map --out my_outfile The reason I am using a map file is…
No output detected
To clarify, wes_pgen_12a refers to a set of plink files (.pgen, .pvar, .psam). I have removed some of the backslashes in case those were confusing the program, and now have: plink2 --pfile "${pgen_path}wes_pgen_12a" \ --pmerge "${pgen_path}wes_pgen_12b.pgen" \ "${pgen_path}wes_pgen_12b.pvar" \ …
public databases – Converting VCF format to text for use with PLINK and understanding column mapping
I successfully completed Nature PRS tutorial, which is based on PLINK. Turning to my real data, I downloaded ukb-d-20544_1.vcf.gz. Now I’m facing the problem that I seem to be unable to use it in PLINK or find the correct data format to download at all, and I am a bit…
Mexican Biobank advances population and medical genomics of diverse ancestries
Encuesta Nacional de Salud 2000 Since 1988, Mexico has established periodical National Health Surveys (Encuesta Nacional de Salud (ENSA), originally conceived as National Nutrition Surveys) for surveillance of Mexican population-based nutrition and health metrics. In this study, we use data and samples collected from the survey carried out in 2000,…
Line 15522 of data/Pheno_KFs1.txt has fewer tokens than expected in GWAS analysis
I’m performing GWAS using UKB imputed genetic data as below. However, I got the following error. plink2 --bfile data_TL --glm hide-covar --pheno data/Pheno_KFs1.txt --pheno-name LogBUN_mg_dl --covar data/Covariatesdata.txt --covar-name PC{1..10}, Age, Tuoi, Sex --extract TL_snplist_All.txt --out output/GWAS_BUN.cvrt PLINK v2.00a6LM 64-bit Intel (27 Sep 2023) www.cog-genomics.org/plink/2.0/ (C) 2005-2023 Shaun Purcell, Christopher Chang…
variant calling – INDELS in PLINK files converted to VCF
I want to compare/validate variants called from sequencing data with array (plink format) variant data. I converted the plink files (.bim, .bed, and .fam files) with plink1 to vcf files. plink --bfile prefix_plink --recode vcf-iid --out prefix_out However, the plink vcf files have “I” and “D” values for INDEL variants…
Update sample information in chunks
plink --bfile {chr1_exomes} --update-ids {new_IIDs_A} --make-bed --out {updated_chr1_exomes_A} plink --bfile {chr1_exomes} --update-ids {new_IIDs_B} --make-bed --out {updated_chr1_exomes_B} plink --bfile {updated_chr1_exomes_A} --bmerge {updated_chr1_exomes_B}.bed {updated_chr1_exomes_B}.bim {updated_chr1_exomes_B}.fam --make-bed --out {merged_chr1_exomes_A_B} Original data: ID1, ID2, … New data: ID1_A, ID1_B, ID2_A, ID2_B, … Would updating the IDs of the .fam file be enough in this…
Picard Liftover MismatchedRefAllele PsychArray
New to using liftOver and working with vcf files generally: I ran liftOver on data gathered from the PsychChip array to lift over from GRCh37 to GRCh38, and got only about 50% of variants lifted over. Most of the rejected ones had “MismatchedRefAllele” as their…
Fast Eqtl Analysis Tool
I’ve got about 2M imputed SNPs and 35K gene expression probesets, and I’d like to identify all eQTL. Running this under PLINK --linear is going to take a very long time. Are there any specialized tools out there to handle this sort of data? eqtl…
Why my GWAS p-value QQ-plot is far above diagonal
Hi. I’m trying to run a GWAS pipeline using plink, but the results I got look really off. The QQ-plot of the p-values is far above the diagonal. The phenotype I used is height. I’m pretty sure I followed the…
Whole-genome sequencing analysis of suicide deaths integrating brain-regulatory eQTLs data to identify risk loci and genes
Li QS, Shabalin AA, DiBlasi E, Gopal S, Canuso CM, FinnGen ISGC, et al. Genome-wide association study meta-analysis of suicide death and suicidal behavior. Mol. Psychiatry 2023;28:891–900. McGuffin P, Marusic A, Farmer A. What can psychiatric genetics offer suicidology? Crisis. 2001;22:61–65. Pedersen NL, Fiske A….
Need help to setup a PGS pipeline
I seek a bioinformatician with experience setting up pipelines in an academic environment. I want to be able to take different file formats from WGS and then do QC, imputation, and then run PGS using different industry-standard libraries and tools. I am…
Pangenome analysis provides insight into the evolution of the orange subfamily and a key gene for citric acid accumulation in citrus fruits
Swingle, W. T. & Reece, P. C. In The Citrus Industry, History, World Distribution, Botany, and Varieties, Vol. 1 (eds Reuther, W. et al.) 190–143 (Univ. of California Press, 1967). Morton, C. M. & Telmer, C. New subfamily classification for the Rutaceae. Ann. Mo. Bot. Gard. 99, 620–641 (2014). Article …
Issue with Numerical Covariate * Covariate Interaction in Association Analysis
I tried to perform a covariate-covariate interaction analysis using numerical covariates of interest, namely activity_data and age. However, I encountered an error message stating that the number of samples was equal to or less than the number of predictor columns for the specified phenotype. Just to make sure I run…
QC of genetic data
Hi, I have some genetic data in a bim file. The chromosomes range from 0 to 23 and 26, which I have not come across before. Should the SNPs on chromosomes 0 and 26 be removed from the genetic file or left in? Then, I…
Issue with merging in plink and eigensoft.
I merged two datasets in plink1.9. It worked, but I did get the error “multiple positions seen for variant” and “variants have the same position”. How do I resolve this when trying to merge the datasets again? And, I tried to merge…
Allele frequency calculation for genotype dosage value
Hello, I have a data set with dosage data (between 0 and 2) for a couple million SNPs, and I would like to get the MAF for each SNP. I saw somewhere (not a very reliable place) that you can get it just by doing: SNP1…
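For readers with the same question, here is a minimal sketch of the usual dosage-based estimate. It is not taken from the thread. It assumes diploid samples, dosages that count copies of one allele on a 0 to 2 scale, and missing values encoded as `None`. The function name `maf_from_dosages` and the example values are made up for illustration.

```python
def maf_from_dosages(dosages):
    """Estimate the minor allele frequency of one SNP from per-sample dosages.

    Assumes each dosage is the expected count (0 to 2) of the counted allele
    in a diploid sample, and that missing values are given as None.
    """
    observed = [d for d in dosages if d is not None]
    if not observed:
        raise ValueError("no non-missing dosages for this SNP")
    # Each diploid sample carries two alleles, so the frequency of the
    # counted allele is the mean dosage divided by 2.
    freq = sum(observed) / (2 * len(observed))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(freq, 1 - freq)


# Hypothetical dosages for a single SNP across six samples (one missing).
snp1 = [0.0, 1.0, 2.0, 0.1, None, 1.9]
print(maf_from_dosages(snp1))
```

Missing samples are dropped from both the numerator and the denominator, so they do not pull the estimate toward zero.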
Highly inflated p-values in GWAS by regenie
I was running a GWAS using REGENIE 3.2.5 on more than 250,000 samples, and the p-values returned are highly inflated with -log10P up to 5000. As a result there were over 10,000 variants called significant under the threshold of p < 5e-8,…
Genome-Wide Association Study of Alopecia Areata in Taiwan
Introduction Alopecia areata (AA) is one of Taiwan’s most common autoimmune hair diseases, with an incidence rate of 0.22% [1–3]. The main symptom of AA is rapid, non-scarring hair loss that affects body hair, facial hair, eyelashes, and brows [1,2]. In the United States, the prevalence of AA is estimated to…
Genetic distance in cM from VCF of non-reference species to run Beagle
I’m working with a resequenced genome of a non-reference species. The VCF contains ~7 million SNPs, all with their relative position on their own chromosome. I have 10.01% missing data, so I need to impute these missing genotypes. I eventually settled for Beagle v5 as a tool,…
--update-ids with long IID names
Hello! I have a plink2 pgen/pvar/psam file set where the FID corresponds to a unique user ID and the IID is a long name (typically over 50 characters) that starts with the FID and has other information separated by _. An example FID is R123456789_chipname_plate_platenumber_A01. Each R123456789 identifier is unique….