Tag: plink

Plink2.0 unrecognized flag: --het

Hi, I’m trying to run plink2.0 to calculate the heterozygosity ratio with the command: plink2.0 --pfile myFile --het but I receive this error: Error: Unrecognized flag (‘--het’) I honestly don’t have any clue why it is not recognized by plink2.0 as a flag. Has this…

Plink Alternative Phenotype File Columns not being Read

Hi, I have a plink alternative phenotype file with the following format: FID IID Phenotype 1 2 1 1 3 0 etc., as outlined in the plink documentation: zzz.bwh.harvard.edu/plink/data.shtml#pheno However, when I run the following command: plink --bfile ../Plink_Files/plink --logistic…

The Genetic Architecture of Sleep Health Scores in the UK

Introduction Sleep is a complex neurological and physiological state. It is defined as a natural and reversible state of reduced responsiveness to external stimuli and relative inactivity, accompanied by a loss of consciousness.1 Sleep disorders can be classified as seven major categories: insomnia disorders, sleep-related breathing disorders, central disorders of…

Mendelian randomization analyses support causal relationships between blood metabolites and the gut microbiome

1. Wang, J. & Jia, H. Metagenome-wide association studies: fine-mining the microbiome. Nat. Rev. Microbiol. 14, 508–522 (2016). 2. Moschen, A. R. et al. Lipocalin 2 protects from inflammation and tumorigenesis associated with gut microbiota alterations. Cell Host Microbe 19, 455–469 (2016). …

What file type does “PLINK --block” accept as input?

Hi, I have a set of SNPs (distributed over all the chromosomes) and I am trying to do some haplotype block estimation to identify whether some of them are part of the same haplotype block, etc. It seems like “PLINK --blocks”…

PLINK sanity check – Bioinformatics Stack Exchange

I am a new user of PLINK and am analysing some SNP data for the first time. After creating a .bim file with $ plink --file my_data --make-bed I notice that for several SNPs my data is different from dbSNP, e.g. rs145496306: BIM file: A G dbSNP: G>A,T rs3813199: BIM…

SNP extraction

I want to extract specific SNPs of interest, which I have in a text file, into an additive genetic model so that each SNP is in 0/1/2 format for each subject, using genetic information from PLINK (.bed, .bim, and .fam files). How can I do…

for loop – Is there a way to permute inside using two variables in bash?

I’m using the software plink2 (www.cog-genomics.org/plink/2.0/) and I’m trying to iterate over 3 variables. This software accepts an input file with a .ped extension and an exclude file with a .txt extension, which contains a list of names to be excluded from the input file. The idea is to iterate over…

Merging genotyping array VCFs and then running kinship analysis

Hello, I have about 4200 array genotyping VCFs (from the Illumina Infinium CoreExome-24 Kit) and I have merged them using bcftools merge. The chip has 500K exonic SNPs. These are trio data, which means 1700 of them are probands,…

One-hot encoding for PLINK or VCF

I want to write an autoencoder for SNP data. Is there an established way to one-hot-encode binary PLINK or VCF input? I believe that can be done by manipulating PLINK’s bed file but am afraid to do something wrong. By one-hot encoding I…

Error: PLINK does not support more than 2^31 - 3 variants

Hi there, I was converting my vcf file into bfiles in plink, and I got this error: ‘Error: PLINK does not support more than 2^31 - 3 variants. We recommend other software, such as PLINK/SEQ, for very deep…

Population-wise decay of linkage disequilibrium (LD)

I have three different populations and would like to plot LD with R using plink-generated ld files. There is no information on multi-population LD here, and so far everything I have found plots a single LD plot. I’ll appreciate any help…

What is the best way to load bgen files into python?

I am new to the BGEN format (I mainly work with PLINK-formatted files or VCFs). I have been trying to load a series of BGEN files into Python using bgen-reader; however, there have been some issues that…

Snakemake restricted wildcard combinations

Hi, I’m new to snakemake and haven’t seen something like this in the tutorials I’ve tried. I have a data frame with unique keys, chromosomes, start positions, and end positions. I essentially want to loop over every group to do an operation. How would I assign…

principal component analysis on pool-seq SNP data

I would like to perform principal component analysis on a pool-seq SNP dataset. I’ve been looking into methods for doing this, but have had trouble finding examples that may apply to pooled data as opposed to individual genotypes. For example, I’m not…

Plink to bimbam format for gemma-wrapper

Hi, I want to use the gemma-wrapper tool, which needs files in BIMBAM format. Plink can convert from plink to BIMBAM format, but the output is very different from the real BIMBAM format. Alleles come in two columns while BIMBAM needs them in mean genotype…

Duplicate ID in bed file

I am using PLINK v1.90b3s 64-bit (17 Jun 2015) to generate an LD matrix from a 1000G VCF file for a long list of SNPs. I use this command to convert the VCF to a bed file: plink --vcf ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --make-bed --out binary_fileset I then use this…

Bug#1000782: plink2: autopkgtest regression: Segmentation fault

Source: plink2 Version: 2.00~a3-211011+dfsg-1 X-Debbugs-CC: debian…@lists.debian.org Severity: serious User: debian…@lists.debian.org Usertags: regression Dear maintainer(s), With a recent upload of plink2 the autopkgtest of plink2 fails in testing when that autopkgtest is run with the binary packages of plink2 from unstable. It passes when run with only packages from testing. In…

michigan imputation server

Hi, I performed imputation on my GWAS data using the Michigan Imputation Server. Now I have two output files: 1) .dose.vcf.gz and 2) .info.gz. The Michigan Imputation Server uses minimac3 (--format GT,DS,GP), and all three formats are present in the output file “.dose.vcf.gz”. I’m new to this kind of…

Calculating polygenic score using plink

Hi, I am trying to calculate PRS in a test sample after getting the best SNPs from PRSice 2 with the highest R2. My target data is in the form of bed, bim, fam for 22 chromosomes. I want to use plink for a specific p-value threshold in this…

BAM dataset to Genotype data conversion using PLINK

You must use a genotype caller in order to obtain genotypes from a .bam file. It’s not possible to ‘convert’ .bam to genotypes. There are a lot of options, but maybe using bcftools is the simplest. Take a read of this…

No block substitutions in plink BIM file?

The BIM file from PLINK contains the variant information that accompanies a BED file. The variants called are based on exome sequencing. However, I encounter a strange issue. The BIM file consists of only single nucleotide substitutions, insertions, and deletions of base…

SHAPEIT phased haps file does not contain my SNPs of interest

I produced a haps file using SHAPEIT, as below. However, when I query it to look for my 12 SNPs of interest, only 3 are in there. Does anyone know why? (PS they are not in the snp.strand.exclude…

Genotyping-in-Thousands by sequencing of archival fish scales reveals maintenance of genetic variation following a severe demographic contraction in kokanee salmon

1. Wandeler, P., Hoeck, P. E. & Keller, L. F. Back to the future: Museum specimens in population genetics. Trends Ecol. Evol. 22, 634–642 (2007). 2. Bi, K. et al. Unlocking the vault: Next-generation museum population genomics. Mol. Ecol. 22, 6018–6032 (2013). …

SHAPEIT using VCF unphased genotype input

I can get SHAPEIT to work with the default Plink PED/MAP format input files, but not with a VCF as input. As an example, here I use the demo data that comes with SHAPEIT, which runs well. DEMO=/Users/michaelflower/bin/shapeit.v2.904.3.10.0-693.11.6.el7.x86_64/example shapeit -B $DEMO/gwas.bed $DEMO/gwas.bim $DEMO/gwas.fam -M $DEMO/genetic_map.txt -O "$DIR"/shapeit/gwas.phased However, when I…

Phasing using Beagle with a map file

I’d like to phase the SNPs in a vcf file and output consensus files for each haplotype, as suggested in this post: www.biostars.org/p/298635/ I’ve managed to install beagle in a conda environment: conda create -n beagle -c conda-forge -c bioconda beagle conda activate beagle When I run beagle using this…

UK Biobank Ref/Alt Allele Count PLINK2

Hi all, I have UK Biobank .BGEN and .Sample files, and I am trying to output sample major additive allele counts using PLINK 2 software. The PLINK 2 software manual states that REF alleles are now counted. Am I correct in assuming that…

plink --geno filters variants with good genotype rate

Hi there, I’m using plink v1.9. I have a list of variants that I’m interested in checking out. After filtering with --geno 0.2, some of the variants that have a missing genotype rate below 20% are eliminated, when they should…

identification of ROH using plink

Hello all, I generated a vcf file using GATK (first HaplotypeCaller -> CombineGVCFs -> GenotypeGVCFs and then hard filtering). After this, I converted the filtered vcf file into plink binary PED files (.bed, .fam, .bim, plink v1.9) using the --make-bed command. However, when I used these…

Plink2 --keep Removing All Samples

I am trying to include the --keep and --remove options in my plink2 command. I am finding that despite my files for these options having identical text to the main file’s IDs, all the samples are removed. Command: plink2 --bfile pca_final --keep EUR.sample --remove…

How to make .ped and .map files

Hello, I have a dataset, but I need to create .ped and .map files (as I understand it) in order to use plink to run a GWAS. However, I do not know what files I need to use in order to create them,…

Multiple stages of evolutionary change in anthrax toxin receptor expression in humans

Human research participants We have complied with all relevant ethical regulations and informed consent was obtained from all participants. This work was approved by the Cornell University IRB under protocol 1506005662. Animal research This work was approved by the Cornell University IACUC under protocol 2009-0044. Welfare and handling of all…

How to run a GWAS in command line with Plink2?

I am trying to run a GWAS on the command line with plink version 2; however, when I run the following command: plink2 --file hapmap1 I get the following error: Error: Unrecognized flag (‘--file’) I am trying to follow…

Linked supergenes underlie split sex ratio and social organization in an ant

Significance Some social insects exhibit split sex ratios, wherein a subset of colonies produce future queens and others produce males. This phenomenon spawned many influential theoretical studies and empirical tests, both of which have advanced our understanding of parent–offspring conflicts and the maintenance of cooperative breeding. However, previous studies assumed…

PLINK Dosage file without family ID error

Hi, I am simply trying to extract a small set of SNPs from a .dos dosage file using plink1.9. However, I get the error: Line 1 of yourfile.dos has fewer tokens than expected I have checked that the .fam file and the…

GWAS – Relationship between Standard Error and P-value

Is there a relationship between the p-values obtained in a GWAS and the standard error of the effect size of a SNP that can be explained either explicitly or intuitively? Methods for prediction based on effect sizes, such as…
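
One common way to make the link explicit, assuming the p-value comes from a two-sided Wald test with a normal approximation (an assumption; the truncated post does not say which test produced the p-values): the test statistic is z = beta / SE, and p = 2 * (1 - Phi(|z|)). The function name below is made up for illustration.

```python
import math


def wald_p_value(beta, se):
    """Two-sided p-value for an effect size and its standard error.

    Assumes a Wald test with a standard normal reference distribution.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    # 2 * (1 - Phi(|z|)) is identical to erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Same effect size, smaller standard error, smaller p-value.
print(wald_p_value(0.1, 0.05))
print(wald_p_value(0.1, 0.02))
```

Under this assumption, for a fixed effect size the p-value shrinks as the standard error shrinks, because only the ratio beta / SE enters the test.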

Genome of the estuarine oyster provides insights into climate impact and adaptive plasticity

1. Hoegh-Guldberg, O. & Bruno, J. F. The impact of climate change on the world’s marine ecosystems. Science 328, 1523–1528 (2010). 2. Chou, C. et al. Increase in the range between wet and dry season precipitation. Nat. Geosci. 6, 263–267 (2013). 3. Li,…

VCF file generation from multiple samples for PCA

I am trying to generate a vcf file for 80 samples (human) and use it for PCA. But when trying to get eigenvectors using plink, it says the genotyping rate is 0.12, and when I remove SNPs with a missing data threshold all data…

Best tool for genotype – phenotype correlation

Hello, I need to perform genotype–phenotype correlation analysis. I know PLINK could be used for this purpose, but with PLINK many file preparation steps need to be done before running the actual step. I have VEP-annotated VCFs. Maybe other…

Why does write.ped remove the first locus?

In order to get a VCF file from genind, I am going through the hierfstat function write.ped() and then converting the result to vcf with plink. This is my code (apologies, but I cannot provide reproducible data for this particular scenario):…

deflated QQ plot but lambda >1

Dear all, what might be the reason for a deflated QQ plot when lambda is > 1? The GWAS (case-control, using glm logistic regression adjusting for PC1-PC3 and three covariates) was done in plink2.0, and the QQ plot was made using the qqman package. TIA…

Population stratification with PCA

Hi all! I have a genotype dataset in plink format. Now I want to correct for population structure with PCA in association analysis. I split my dataset into training and testing datasets. I want to do the PCA only in the training dataset and use…

How to convert GEN or .gen format from impute.me to vcf on windows 10?

I tried for days to convert a gen file to vcf but it did not work. I am a beginner, so I don’t know what is in vcf files and gen files or how they…

How to import dosage information to plink binary files?

Hi all, I recently converted a very large TOPMed-imputed VCF file into plink format. The command I used to convert this VCF was plink1.9 --vcf ${VCF} --make-bed --out ${VCF}_binary. Additionally, I also spent a significant amount of time…

Problems Imputing X Chromosome with TOPMed

I have a large dataset whose autosomes I was able to successfully phase and impute using TOPMed. I have tried doing the same with the X chromosome but keep running into issues. Before trying to impute with TOPMed, I did per-individual QC and per-marker QC, then ran checkVCF, and corrected…

Plink1.9 error -chr not recognised

Hello, I am trying to run PLINK on my tped file (SNP data set) to test for linkage disequilibrium, but it appears that I need to set some chromosome options because I get an error at some line. My data is not mapped.…

Convert plink files from

I have plink files where the .bim file is in the following format, and I want to convert the chr:pos column into rsIDs. 1 1:10177:A:AC 0 10177 AC A 1 1:10352:T:TA 0 10352 TA T 1 1:11008:C:G 0 11008 G C 1 1:11012:C:G 0 11012…

PLINK and population stratification with known subpopulations

I want to perform a genome-wide association study (GWAS) with PLINK 1.9. I have whole genome sequencing SNP calls for ~100 patients where I know in advance that there is a skew towards subpopulations of African and South American descent, with…

Plink v2.0 does not produce a Z-compressed file (.zst)

Good morning, I would like to convert a merged VCF into Plink compressed format (.pgen, .psam and .pvar files), so I run plink2 --vcf MyMerged.vcf.gz --make-pgen --zst-level 3 --out MySamples It basically works, as it produces these files: ls…

PLINK map format splitting by chromosome

Hi everyone, I have a PLINK map file with four columns: chromosome name, SNP ID, genetic distance in morgans (all zero), and base-pair position. NC1.1 rs1 0 145 NC1.1 rs2 0 201 NC2.1 rs3 0 208 . . NCn.1 rsn 0 509 I…

Recreating QC of 1000 Genomes project – removing non overlapping SNPs

Hi everyone, I am attempting to recreate the quality control analysis performed in the 1000 Genomes project (tcag.ca/documents/tools/omni25_qcReport.pdf). I am fairly new to performing QC on a dataset, and am currently stuck on section 5.1 of the…

PLINK dosage data – convert and/or read in R

I have PLINK dosage files in the form pgen/psam/pvar. I would like to know how to do either or both of the following: convert the file set to bed/bim/fam files (hard calls); read the allele dosage from pgen into R as a continuous…

Can someone explain PLINK allele REF/ALT management strategy?

Sometimes when merging two plink files, the Reference (REF) and Alternative (ALT) alleles may be reversed, e.g. REF G ALT A versus REF A ALT G. The main reason for that is the default action of PLINK. You see, when using…

Checking chromosome builds for genotyping data

Hi, I have several studies’ worth of data (in both PLINK and vcf format), and I was wondering if anyone knew of an online tool which I could use to check my chromosome build, i.e. GRCh37 vs GRCh38. (I thought I used one…

merging multiple .bed /.bgen files uk biobank using plink

Hi all, I am having a problem merging all chromosomal UK Biobank files. I ran the following command: plink2 --bfile /path/to/file/ukb_imp_chr1 --pmerge-list /path/to/file/merge.list --maf 0.01 --hwe 1e-6 --make-pgen --out /path/to/file/ukb_imp_allchr I also tried: plink2 --bfile /path/to/file/ukb_imp_chr1 --pmerge-list /path/to/file/merge.list --maf 0.01 --hwe 1e-6 --make-bed --out /path/to/file/ukb_imp_allchr The merge.list has the following…

UK Biobank Imputed Genotypes

I am using the UK Biobank imputed genetic data set and I was wondering if it is possible to use PLINK (or any other software) to view the actual calls for each SNP, ranging in value from 0 to 2 (including fractions)? I have been…

The genomic origins of the Bronze Age Tarim Basin mummies

1. Peyrot, M. in Aspects of Globalisation: Mobility, Exchange and the Development of Multi-Cultural States 12–17 (2017). 2. Damgaard, P. et al. 137 ancient human genomes from across the Eurasian steppes. Nature 557, 369–374 (2018). 3. Hemphill, B. E. & Mallory, J.…

GWAS Studies

How do I create a .ped file for use in PLINK? I have the following csv file: Chromosome Position Sample1 Sample 2 ……….. Sample n Chr Pos Sam1 Sam2 Sam3, Sam 4, ……Sample_n 1 11 A T T, A, A, T, …… 2 141 G G G, T,…

GWAS data from PGC

This is my first time using GWAS data. I downloaded some data from PGS and I have some questions. 1) There are a lot of SNPs for the same gene with different p-values; why does this occur? And if I want to use one…

Distance matrix PCA

Hi all, I generated PCA values for the 1000 Genomes dataset using PLINK. I know how to plot the values for PC1 and PC2, but my question is: how can I generate a distance matrix to select nearby samples based on populations? For example, if I…

Phasing using SHAPEIT

Hello, I need to use SHAPEIT for phasing only, since I will conduct CH (compound heterozygous) analysis for recessive rare variants. I will not perform imputation. I am running SHAPEIT, and I see that the log file says: Parameters: * Seed: 1442251531 * Parallelisation: 12 threads *…

New datasets for ancestry estimation and imputation?

What datasets are people using nowadays for genotype imputation and ancestry estimation? HapMap and 1000 Genomes are good, but it has been some years since their release and both have some limitations on the number of populations included and resolution (especially HapMap which…

Beagle 5 error: No VCF records found in the specified interval

Hi, I am running into an issue while doing imputation with Beagle 5 and am not sure what is causing the error. I have vcf files converted from PLINK with the following command: ./plink --bfile qcd_in --chr 20 --recode vcf-iid…

Math behind association with PLINK

Hi, what is the mathematical formula behind the --linear association test used by plink? The most basic association test is just a Chi-squared test on a 2 x 2 contingency table of the minor allele tallies, as to…
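
The 2 x 2 test mentioned in the answer can be sketched in a few lines of Python. This is only an illustration of a Pearson chi-squared test (1 degree of freedom, no continuity correction) on allele counts, not PLINK's own code; the function name and the example counts are made up.

```python
import math


def allelic_chi2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-squared test on a 2 x 2 table of allele counts.

    Rows are cases/controls, columns are minor/major allele tallies.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30/70 minor/major alleles in cases, 20/80 in controls.
chi2, p = allelic_chi2(30, 70, 20, 80)
print(chi2, p)
```

Note that this is the basic allelic test the answer describes; --linear itself fits a regression model, which this sketch does not attempt to reproduce.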

How long does it take to carry out the GWAS workflow?

Including these steps: 1) raw data format transformation for five companies 2) update positions for all SNPs to hg37 version 3) Quality control within companies 4) Pre-phasing (SHAPEIT2) and imputation (IMPUTE2) for all SNPs of each company 5)…

Runs of homozygosity in Plink

❯ plink1.9 --homozyg --help PLINK v1.90b6.22 64-bit (3 Nov 2020) www.cog-genomics.org/plink/1.9/ (C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3 --help present, ignoring other flags. --homozyg [{group | group-verbose}] ['consensus-match'] ['extend'] ['subtract-1-from-lengths'] --homozyg-snp <min var count> --homozyg-kb <min length> --homozyg-density <max inverse density (kb/var)> --homozyg-gap <max internal gap…

Sample ID ERROR in converting dosage file (.txt) to VCF in PLINK 2.0

Hi, I have a dosage file (.txt format; alternative allele count 0-2) that I would like to convert to a VCF through PLINK 2.0, but I am running into sample ID issues. I am attempting the following…

Error: Could not open temporary file.

Hello, there is an error when I try to filter a VCF document for MAF. Here is a part of the vcf file: 2: chr1H 523 chr1H:523 C A . PASS . GT 0|0 1: chr1H 445 chr1H:445 C T . PASS . GT 0| 2: chr1H 523 chr1H:523 C…

Can we merge two VCF files – an RNAseq VCF and a Whole Genome Sequencing (WGS) VCF – to do PCA?

I’m quite new to this variant calling and analysis area. We have around 30 samples of RNAseq data of tumor and normal samples. I performed variant calling and obtained…

How to convert multiple .vcf files into single .ped (PLINK compatible files)?

Hi everyone, I am a newbie to the whole bioinformatics world and I need to analyse WGS data from several case samples. I now have several individual .vcf files and would like to use PLINK for Quality…

The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests

Sequencing of Shorea leprosula genome Sample collection Leaf samples of S. leprosula were obtained from a reproductively mature (diameter at breast height, 50 cm) diploid tree B1_19 (DNA ID 214) grown in the Dipterocarp Arboretum, Forest Research Institute Malaysia (FRIM). DNA extraction Genomic DNA was extracted from leaf samples using the…

Bioinformatics Biomedical Scientist – Bilsborough Lab

Inflammatory Bowel Diseases Drug Discovery and Development. Requisition # HRC0697538. Join us in accelerating the pace of research and discovery within our unique IBD3 lab! Cedars-Sinai provides virtually every known gastroenterologic analytical procedure and treatment…

PLINK basic command line usage

Hey, I am new to PLINK. I am following a tutorial on how to calculate a polygenic risk score: choishingwan.github.io/PRS-Tutorial/target/#standard-gwas-qc I ran the “Standard GWAS QC” part, and the code is as follows: plink --bfile EUR --maf 0.01 --hwe 1e-6 --geno…

Plink genome flag error

I’m using plink 1.9 and running the command: plink --bfile data --genome --extract data.prune.in --remove data.fail_imiss where data.fail_imiss is a file listing individuals whose proportion of missing SNPs is greater than the threshold of 0.05. I get the warnings: Warning: 10745 het. haploid genotypes…

Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old)

NB – Update July 29, 2020 – this thread will no longer be watched and, for all intents and purposes, will now be archived NB – Version 2 of tutorial can be found here and should be used going forward –> Produce PCA bi-plot for 1000 Genomes Phase III –…

VCF file

Removal Sample ###junk removal vcftools --gzvcf cohorts_combined_filtered_calls_annotated.vcf --maf 0.05 --max-missing 0.8 --min-meanDP 10 --recode --out SNP-co.vcf ####Sample removal bcftools view --samples-file ^ /media/bioinformatician/My Passport/sample_id_no-RNASEq.txt SNP-co.vcf. ###chromosome extraction vcftools --gzvcf SNP_co.vcf.gz --chr 2 --from-bp 47843908 --to-bp 47877312 --recode --out snps_filt_chr2 ##conversion vcf to plink ./plink --vcf /media/bioinformatician/My Passport/annotated/tabix/snps_filt_chr2.vcf.gz.recode.vcf…


Where to find 1000 Genome phase 3 whole genome data and select only European population

Hello: I was trying to download whole-genome data from the 1000 Genomes Phase 3 release and extract only the EUR populations (GBR, TSI, FIN, IBS, CEU). I used the FTP site: ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz, but apparently it…


genomic data scientist jobs

Provide strategic planning and perform analysis or simulations independently or in a… 401(k) savings plan match… Requires a Ph.D. in Biochemistry, Biotechnology, Molecular/Cell Biology, Plant Biology, or a related field and 0-3 years of relevant postdoctoral or industrial… In addition, the analyst will help advance the group’s collective expertise…


extract IDs from each population in PLINK .fam file and export to .txt separately

Hi, I’d like to write a loop to extract individuals from my PLINK .fam file, based on the family ID / population code, into separate .txt files using only bash. I’m pretty stumped so would appreciate…


Loading PLINK files to Haploview

I can’t seem to get Haploview to accept PLINK files. I downloaded the ‘sample.ped’ and ‘sample.info’ trial files from the Haploview website. I am loading the .ped file for ‘Results File:’ and the .info file for ‘Map File:’. All other options are left at…


Haploview and plink

When I ran plink to get the .ped and .info files, it generated them for only 8 chromosomes out of 22, X, Y and M. Kindly help me get all chromosomes. When I upload the .ped and .info files in Haploview.jar, it shows loading 0%…


Please help me understand LD pruning algorithm

I’m trying to do LD pruning with PLINK but I can’t find any proper documentation about the algorithm used. There are options such as --indep, --indep-pairwise and --indep-pairphase, and suboptions such as window size, step size and threshold. I’m not sure what each means…


use of 2 covariate files for association analysis in plink

Dear all, I want to use 2 covariate files in my association analysis in plink, without combining all the covariates in 1 file. Is this possible? Is it valid to use: --bfile file --logistic --covar /covarfile1.txt --covar-number…


Isolate a Region in a Vcf File to make a Smaller Vcf File

I have been working on a project that has caused a bit of a headache for me. While I have made some progress, the process is now simply too slow to be reasonable. I have a…


Paths and timings of the peopling of Polynesia inferred from genomic networks

1. Low, S. Hawaiki Rising: Hōkūle‘a, Nainoa Thompson, and the Hawaiian Renaissance (Univ. of Hawaii Press, 2019). 2. Kirch, P. V. On the Road of the Winds (Univ. of California Press, 2017). 3. Mulrooney, M. A., Bickler, S. H., Allen, M. S. & Ladefoged, T. N. High-precision dating of colonization…


Plink --merge-list only outputting fam

I am attempting to merge Plink files for an algorithm I am using (CookHLA). I have already made the bed/bim/fam files from my vcf files. Now, I want to merge them into a single fileset so I can progress with the algorithm. To do…


P-values far too high for quantitative regenie phenotype

Hi all, I’m having some trouble running regenie (v2.2.4) on a quantitative phenotype for a large cohort. I’m testing a standard height GWAS with heights rounded to the nearest integer. I’ve tried a few different tests to see where the issue…


Changing the sample IDs of a bed/bim/fam PLINK fileset

Hi everyone, I am working with a genotype set that is not labelled with the sample IDs that I want. However, I do have a lookup table which I can use in R to get the right identification when I…


I can’t get a dosage file using PLINK

Hi, I have been trying to get a dosage file from vcf, map and fam files. For that, I have written this bash script: plink --fam plink.fam --map plink.map --dosage one.vcf --write-dosage. However, I got this error: --dosage: Reading from one.vcf. Error: Line 1 of one.vcf has fewer tokens…


Aro Biotherapeutics hiring Investigator, Genetics & Bioinformatics in Philadelphia, Pennsylvania, United States

About Aro BioTx Join the team at Aro Biotherapeutics creating breakthrough biotherapeutics based on Centyrin oligonucleotide conjugates. Centyrins are small protein domains based on the fibronectin domains of human Tenascin C that combine the affinity and specificity properties of antibodies with the stability and tissue penetration properties of small molecules….


merge bfiles with different allele format by plink

Hello, I want to merge two bfiles using plink. However, one bfile is written as below: 1 rs123 0 123 A G 1 rs234 0 234 G T. The other file is as below: 1 rs123 0 123 1 2 1…


Principal/senior Bioinformatics Scientist (Hereditary Disease) – Portola Valley

We are looking for a highly motivated, senior level bioinformatics scientist with extensive experience and interest in translational genomic research, genetic analysis of complex traits, quantitative genetics, and/or algorithm/pipeline development. This position requires experience with scientific programming, relational data systems, algorithms development, and statistical modeling. Top candidates will also have…


Haplotype divergence supports long-term asexuality in the oribatid mite Oppiella nova

Significance: Putatively ancient asexual species pose a challenge to theory because they appear to escape the predicted negative long-term consequences of asexuality. Although long-term asexuality is difficult to demonstrate, specific signatures of haplotype divergence, called the “Meselson effect,” are regarded as strong support for long-term asexuality. Here, we provide evidence…


Question about ROH analysis by Plink 1.9

Hi all, I have recently tried to estimate runs of homozygosity (ROH) from my vcf file using plink 1.9. I ran the following code to generate the binary files that plink requires: plink --vcf myfile.vcf --make-bed --out out_name --no-sex --no-parents --no-fid --no-pheno --allow-extra-chr. This vcf file only contains one individual and…


Phylogeographic reconstruction of the marbled crayfish origin

Procambarus fallax collections and PCR genotyping: Animals were collected from various wild populations (Table S1) in compliance with state and local regulations (Georgia Department of Natural Resources scientific collection permit 115621108, State of Florida collection permits S-19-10 and S-20-04). DNA was isolated from abdominal muscle tissue using SDS-based extraction and precipitation…


How to find Standard Error (SE) values when not provided in GWAS summary stats?

Hi everyone! I’m trying to do a fixed-effect meta-analysis of a couple of GWASes, based on p-values, standard errors and effect estimates (beta), using the METAL software. For one of my GWAS studies SE…


Phenotype file for eQTL analysis using GEMMA

Hello All, I would appreciate it if someone could direct me in this regard. I am running an eQTL analysis using the GEMMA software. I have corrected the expression file with all samples (280 samples), and the genotype file has 170 samples. I have a couple…


PRS in UK Biobank – no covariate file and no phenotype file

Hi there, I am trying to generate a PRS from UK Biobank plink data using PRSice-2. However, the issue I am having is that I do not have a covariate…


Remove related samples using plink

Hi, I generated pairwise IBD estimates (PI_HAT) using the plink1.9 --genome option. I have >200,000 samples, so I used --parallel and combined the sub-files using cat. Is there a way to remove related samples using the output file .genome.gz? I read about --rel-cutoff but…
