Tag: plink

Bcftools equivalent of vcftools conversion to ped & map

I am converting a VCF to ped & map in vcftools thus: vcftools --gzvcf ZZZZZTYT.vcf.gz --plink --out ZZZZZTYT which works fine. However, I have been searching and searching: can bcftools do the same with a BCF?…


Integrative analysis of eQTL and GWAS summary statistics reveals transcriptomic alteration in Alzheimer brains | BMC Medical Genomics

Significant gene-AD associations With the GWAS summary data from the IGAP and eQTL summary data from BRAINEAC, we performed both SMR and HEIDI tests to estimate the gene-AD associations in three human brain regions: frontal cortex, temporal cortex, and hippocampal regions. For the frontal cortex and hippocampal regions, we obtained…


Bioinformatics Scientist for Whole Genome and Whole Exome Sequencing

The NeuroGenomics and Informatics (NGI) Center, led by Dr. Carlos Cruchaga at Washington University School of Medicine, is recruiting a Bioinformatics Scientist to work on Whole Genome and Whole Exome Sequencing. We are seeking an experienced, self-motivated, self-driven scientist…


Extract SNPs ID’s and their Values with IID, Additive and dominance components

I’m dealing with GWAS data and I have 2M records in the .bed file. I’m new to PLINK. Can anyone please help me with the PLINK command that can extract all SNP IDs and their values with…


Using QCTOOL v2 to process UK Biobank .bgen files

I’m currently using QCTOOL v2 to process imputed .bgen files from UK Biobank, but they seem to be processing very slowly. Is this normal? My command is pretty basic; I’m filtering out a list of SNPs…


dx: error: unrecognized arguments: running swiss army knife plink2

Hi, I was trying to run the regenie workflow in UKB RAP (part E): github.com/dnanexus/UKB_RAP/blob/main/GWAS/regenie_workflow/partE-step2-qc-filter.sh All the previous steps ran perfectly without any error. But in this step, partE-step2-qc-filter.sh, I am getting dx: error: unrecognized arguments: --bfile ukb23155_c22_b0_v1 --no-pheno --keep natd_wes_200k.phe --autosome --maf 0.01 --mac 20 --geno 0.1 --hwe 1e-15 --mind…


How to Select data for a SNP for all samples using PLINK

Hello folks, I’m very new to PLINK. The task below has been assigned to me; any references, blogs, or articles are much appreciated. Q: Be able to select data for an SNP for all samples using PLINK. Please help…


Linkage mapping, comparative genome analysis, and QTL detection for growth in a non-model teleost, the meagre Argyrosomus regius, using ddRAD sequencing

Fricke, R., Eschmeyer, W. N. & van der Laan, R. (eds). Eschmeyer’s Catalog of Fishes: Genera, Species, References. researcharchive.calacademy.org/research/ichthyology/catalog/fishcatmain.asp. Electronic version, accessed 15 October 2021. Nelson, J. S. Fishes of the World 4th edn, 372 (Wiley, 2006). Chen, X. H., Lin, K. B. & Wang, X. W. Outbreaks…


apt-format-result, where is the --annotation-file?

Hi everyone! I want to create a PLINK file from the Axiom analysed batches using this script: apt-format-result --calls-file <file> --annotation-file <file> --export-plink-file My calls-file was generated in step 7 of the APT software. However, I can’t find any annotation-file (or any *.db file)…


Merge only bim files with plink

Hello. For the same dataset they provide single BED and FAM files for all the chromosomes. However, the BIM files are split by chromosome. I would like to generate the VCF file with the genotyping calls of all chromosomes but I need…


Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages

Díaz, S. et al. Summary for Policymakers of the Global Assessment Report on Biodiversity and Ecosystem Services of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES, 2019). Fisher, R. A. The correlation between relatives on the supposition of Mendelian inheritance. Earth Environ. Sci. Trans. R. Soc. Edinb. 52,…


Genetic associations at regulatory phenotypes improve fine-mapping of causal variants for 12 immune-mediated diseases

Cooper, G. S., Bynum, M. L. K. & Somers, E. C. Recent insights in the epidemiology of autoimmune diseases: improved prevalence estimates and understanding of clustering of diseases. J. Autoimmun. 33, 197–207 (2009). El-Gabalawy, H., Guenther, L. C. & Bernstein, C. N. Epidemiology of immune-mediated…


split gtex genotype data by chromosomes.

Hello, I edited the command line to use --vcf to import the VCF file. I used these commands: for chr in $(seq 1 22); do plink --vcf /dbGAP/GTEx_Analysis_2017-06-05_v8_WholeExomeSeq_979Indiv_VEP_annot.vcf.gz --chr $chr --recode --out…


What software to use for GWAS with repeated longitudinal quantitative phenotype data?

I have quantitative data that has been collected at 4 different time points: before treatment, 1 month after, 1 year after and 2 years after. I have generated linear mixed models in R to obtain the residuals…


No valid entries in --score file.

Hello, Thank you for developing the incredible tool that plink2 is. I have received the error “Error: No valid entries in --score file.” when using PLINK v2.00a2.3LM AVX2 Intel (24 Jan 2020). However, I don’t receive the error using the same data and arguments with a PLINK v2.00a2LM 64-bit Intel…


Bug#1004037: Segmentation fault in plink2 (Was: src:plink2: fails to migrate to testing for too long: autopkgtest regression)

Control: reopen -1 Control: tags -1 confirmed Control: tags -1 upstream Control: forwarded -1 Christopher Chang <chrch…@alumni.caltech.edu> Hi Christopher (and Dylan), I verified the latest version (29 Jan 2022) of plink2 with the same result for the CI test we are doing in Debian (which was written by Dylan): $…


How to find out why PRSice-2 excludes ambiguous SNPs

When I use PRSice to calculate PRS, it reports that it excludes 25 SNPs from the base data. But how can I find out why they are ambiguous? Question 2: If I use PRSice…


Plink2.0 unrecognized flag: --het

Hi, I’m trying to run plink2.0 to calculate the heterozygosity ratio, through the command: plink2.0 --pfile myFile --het but I receive this error: Error: Unrecognized flag (‘--het’) I honestly don’t have any clue why it’s not recognized by plink2.0 as a flag. Has this…


Plink Alternative Phenotype File Columns not being Read

Hi, I have a PLINK alternative phenotype file with the following format: FID IID Phenotype 1 2 1 1 3 0 etc., as outlined in the PLINK documentation: zzz.bwh.harvard.edu/plink/data.shtml#pheno However, when I run the following command: plink --bfile ../Plink_Files/plink --logistic…


The Genetic Architecture of Sleep Health Scores in the UK

Introduction Sleep is a complex neurological and physiological state. It is defined as a natural and reversible state of reduced responsiveness to external stimuli and relative inactivity, accompanied by a loss of consciousness.1 Sleep disorders can be classified as seven major categories: insomnia disorders, sleep-related breathing disorders, central disorders of…


Mendelian randomization analyses support causal relationships between blood metabolites and the gut microbiome

1. Wang, J. & Jia, H. Metagenome-wide association studies: fine-mining the microbiome. Nat. Rev. Microbiol. 14, 508–522 (2016). 2. Moschen, A. R. et al. Lipocalin 2 protects from inflammation and tumorigenesis associated with gut microbiota alterations. Cell Host Microbe 19, 455–469 (2016).…


What file type does “PLINK --block” accept as input?

Hi, I have a set of SNPs (distributed over all the chromosomes) and I am trying to do some haplotype block estimation to identify whether some of them are part of the same haplotype block, etc. It seems like “PLINK --blocks”…


PLINK sanity check – Bioinformatics Stack Exchange

I am a new user of PLINK and am analysing some SNP data for the first time. After creating a .bim file with $ plink --file my_data --make-bed I notice that for several SNPs my data is different from dbSNP, e.g. rs145496306: BIM file: A G dbSNP: G>A,T rs3813199: BIM…


SNP extraction

I want to extract specific SNPs of interest that I have in a text file into an additive genetic model, so that each SNP is in 0/1/2 format for each subject, using genetic information from PLINK (.bed, .bim, and .fam files). How can I do…


for loop – Is there a way to permute inside using to variables in bash?

I’m using the software plink2 (www.cog-genomics.org/plink/2.0/) and I’m trying to iterate over 3 variables. This software accepts an input file with a .ped extension and an exclude file with a .txt extension, which contains a list of names to be excluded from the input file. The idea is to iterate over…


Merging genotyping array VCFs and then running kinship analysis

Hello, I have about 4200 array genotyping VCFs (from the Illumina Infinium CoreExome-24 Kit) and I have merged them using bcftools merge. The chip has 500K exonic SNPs. These are trio data, which means 1700 of them are probands,…


One-hot encoding for PLINK or VCF

I want to write an autoencoder for SNP data. Is there an established way to one-hot encode binary PLINK or VCF input? I believe that can be done by manipulating PLINK’s bed file, but I am afraid of doing something wrong. By one-hot encoding I…
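One possible one-hot scheme can be sketched as follows, assuming the genotypes have already been decoded to additive codes (0, 1 or 2 copies of an allele, with None for a missing call). The function names are hypothetical, and the sketch does not read PLINK’s .bed file directly; it only illustrates the encoding step, with a missing call mapped to an all-zero vector as one of several reasonable choices.

```python
# Hypothetical helpers: one-hot encode additive genotype codes.
# Assumption: 0/1/2 are allele counts and None marks a missing genotype.

def one_hot_genotype(code):
    """Return a 3-element one-hot list for 0/1/2, or all zeros if missing."""
    vector = [0, 0, 0]
    if code is None:
        return vector  # missing call: no class is switched on
    if code not in (0, 1, 2):
        raise ValueError(f"unexpected genotype code: {code!r}")
    vector[code] = 1
    return vector


def one_hot_sample(genotypes):
    """Encode one sample's genotypes as a list of one-hot vectors."""
    return [one_hot_genotype(code) for code in genotypes]


encoded = one_hot_sample([0, 2, None, 1])
print(encoded)
```

Each SNP becomes three input features per sample, so an autoencoder would see a samples x (3 * SNPs) matrix after flattening.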


Error: PLINK does not support more than 2^31

Hi there, I was converting my VCF file into bfiles in PLINK, and I got an error: Error: PLINK does not support more than 2^31 - 3 variants. We recommend other software, such as PLINK/SEQ, for very deep…


Population-wise decay of linkage disequilibrium (LD)

I have three different populations and would like to plot LD in R using PLINK-generated LD files. There is no information on multi-population LD here, and so far everything reported plots a single LD plot. I’ll appreciate any help…


What is the best way to load bgen files into python?

I am new to the BGEN format (I mainly work with PLINK-formatted files or VCFs). I have been trying to load a series of BGEN files into Python using bgen-reader; however, there have been some issues that…


Snakemake restricted wildcard combinations

Hi, I’m new to Snakemake and haven’t seen something like this in the tutorials I’ve tried. I have a data frame with unique keys, chromosomes, start positions, and end positions. I essentially want to loop over every group to do an operation. How would I assign…


principal component analysis on pool-seq SNP data

I would like to perform principal component analysis on a pool-seq SNP dataset. I’ve been looking into methods for doing this, but have had trouble finding examples that may apply to pooled data as opposed to individual genotypes. For example, I’m not…


Plink to bimbam format for gemma-wrapper

Hi, I want to use the gemma-wrapper tool, which needs files in BIMBAM format. PLINK can convert from PLINK to BIMBAM format, but its output is very different from the real BIMBAM format. Alleles come in two columns while BIMBAM needs them in mean genotype…


Duplicate ID in bed file

I am using PLINK v1.90b3s 64-bit (17 Jun 2015) to generate an LD matrix from a 1000G VCF file for a long list of SNPs. I use this command to convert the VCF to a bed file: plink --vcf ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --make-bed --out binary_fileset I then use this…



Bug#1000782: plink2: autopkgtest regression: Segmentation fault

Source: plink2 Version: 2.00~a3-211011+dfsg-1 X-Debbugs-CC: debian…@lists.debian.org Severity: serious User: debian…@lists.debian.org Usertags: regression Dear maintainer(s), With a recent upload of plink2 the autopkgtest of plink2 fails in testing when that autopkgtest is run with the binary packages of plink2 from unstable. It passes when run with only packages from testing. In…


michigan imputation server

Hi, I performed imputation on my GWAS data using the Michigan Imputation Server. Now I have two output files: 1) .dose.vcf.gz and 2) .info.gz. The Michigan Imputation Server uses minimac3 (--format GT,DS,GP), and all three formats are present in the output file “.dose.vcf.gz”. I’m new to this kind of…


Calculating polygenic score using plink

Hi, I am trying to calculate PRS in a test sample after getting the best SNPs (highest R2) from PRSice-2. My target data is in the form of bed, bim, and fam files for 22 chromosomes. I want to use PLINK for a specific p-value threshold in this…


BAM dataset to Genotype data conversion using PLINK

You must use a genotype caller in order to obtain genotypes from a .bam file. It’s not possible to ‘convert’ .bam to genotypes. There are a lot of options, but using bcftools is maybe the simplest. Have a read of this…


No block substitutions in plink BIM file?

The BIM file from PLINK contains the variant information that accompanies a BED file. The variants called are based on exome sequencing. However, I have encountered a strange issue. The BIM file consists of only single nucleotide substitutions, insertions, and deletions of base…


SHAPEIT phased haps file does not contain my SNPs of interest

I produced a haps file using SHAPEIT, as below. However, when I query it to look for my 12 SNPs of interest, only 3 are in there. Does anyone know why? (PS they are not in the snp.strand.exclude…


Genotyping-in-Thousands by sequencing of archival fish scales reveals maintenance of genetic variation following a severe demographic contraction in kokanee salmon

1. Wandeler, P., Hoeck, P. E. & Keller, L. F. Back to the future: Museum specimens in population genetics. Trends Ecol. Evol. 22, 634–642 (2007). 2. Bi, K. et al. Unlocking the vault: Next-generation museum population genomics. Mol. Ecol. 22, 6018–6032 (2013).…


SHAPEIT using VCF unphased genotype input

I can get SHAPEIT to work with the default PLINK PED/MAP format input files, but not with a VCF as input. As an example, here I use the demo data that comes with SHAPEIT, which runs well. DEMO=/Users/michaelflower/bin/shapeit.v2.904.3.10.0-693.11.6.el7.x86_64/example shapeit -B $DEMO/gwas.bed $DEMO/gwas.bim $DEMO/gwas.fam -M $DEMO/genetic_map.txt -O "$DIR"/shapeit/gwas.phased However, when I…


Phasing using Beagle with a map file

I’d like to phase the SNPs in a vcf file and output consensus files for each haplotype, as suggested in this post: www.biostars.org/p/298635/ I’ve managed to install beagle in a conda environment: conda create -n beagle -c conda-forge -c bioconda beagle conda activate beagle When I run beagle using this…


UK Biobank Ref/Alt Allele Count PLINK2

Hi all, I have UK Biobank .BGEN and .Sample files, and I am trying to output sample major additive allele counts using PLINK 2. The PLINK 2 manual states that REF alleles are now counted. Am I correct in assuming that…


plink --geno filters variants with good genotype rate

Hi there, I’m using PLINK v1.9. I have a list of variants that I’m interested in checking out. After filtering with --geno 0.2, some of the variants that have a missing genotype rate below 20% are eliminated, when they should…


identification of ROH using plink

Hello all, I generated a VCF file using GATK (first HaplotypeCaller --> CombineGVCFs --> GenotypeGVCFs, and then hard filtering). After this, I converted the filtered VCF file into PLINK binary files (.bed, .fam, .bim; PLINK v1.9) using the --make-bed command. However, when I used these…


Plink2 --keep Removing All Samples

I am trying to include the --keep and --remove options in my plink2 command. I am finding that, despite the files for these options having text identical to the main file’s IDs, all the samples are removed. Command: plink2 --bfile pca_final --keep EUR.sample --remove…


How to make .ped and .map files

Hello, I have a dataset but I need to create .ped and .map files (as I understand it) in order to use PLINK to run a GWAS. However, I do not know what files I need to use in order to create them,…


Multiple stages of evolutionary change in anthrax toxin receptor expression in humans

Human research participants We have complied with all relevant ethical regulations and informed consent was obtained from all participants. This work was approved by the Cornell University IRB under protocol 1506005662. Animal research This work was approved by the Cornell University IACUC under protocol 2009-0044. Welfare and handling of all…


How to run a GWAS in command line with Plink2?

I am trying to run a GWAS on the command line with PLINK version 2; however, when I run the following command: plink2 --file hapmap1 I get the following error: Error: Unrecognized flag (‘--file’) I am trying to follow…


Linked supergenes underlie split sex ratio and social organization in an ant

Significance Some social insects exhibit split sex ratios, wherein a subset of colonies produce future queens and others produce males. This phenomenon spawned many influential theoretical studies and empirical tests, both of which have advanced our understanding of parent–offspring conflicts and the maintenance of cooperative breeding. However, previous studies assumed…


PLINK Dosage file without family ID error

Hi, I am simply trying to extract a small set of SNPs from a .dos dosage file using plink1.9. However, I get the error: Line 1 of yourfile.dos has fewer tokens than expected. I have checked that the .fam file and the…


Relationship between Standard Error and P-value

Is there a relationship between the p-values obtained in a GWAS and the standard error of the effect size of a SNP that can be explained either explicitly or intuitively? Methods for prediction based on effect sizes, such as…
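One common way to see the link, sketched here under the assumption that the p-value comes from a two-sided Wald test (z = beta / SE compared against a standard normal): for a fixed effect size, a smaller standard error gives a larger |z| and therefore a smaller p-value. The function name is hypothetical, and tools that use other tests (for example a t-distribution or Firth regression) will not match this exactly.

```python
import math


def wald_p_value(beta, se):
    """Two-sided p-value for z = beta / se under a standard normal.

    Assumption: the test statistic is a Wald z-score. Uses the identity
    P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Same effect size, two different standard errors.
for se in (0.05, 0.02):
    print(f"beta=0.1 se={se} p={wald_p_value(0.1, se):.3g}")
```

Given any two of the effect size, its standard error and the p-value, the third can be recovered approximately under this assumption, which is why summary statistics often omit one of them.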


Genome of the estuarine oyster provides insights into climate impact and adaptive plasticity

1. Hoegh-Guldberg, O. & Bruno, J. F. The impact of climate change on the world’s marine ecosystems. Science 328, 1523–1528 (2010). 2. Chou, C. et al. Increase in the range between wet and dry season precipitation. Nat. Geosci. 6, 263–267 (2013). 3. Li,…


VCF file generation from multiple samples for PCA

I am trying to generate a VCF file for 80 (human) samples and use it for PCA. But when I try to get eigenvectors using PLINK, it says the genotyping rate is 0.12, and when I remove SNPs using a missing-data threshold all data…


Best tool for genotype – phenotype correlation

Hello, I need to perform genotype-phenotype correlation analysis. I know PLINK could be used for this purpose, but with PLINK many file preparation steps need to be done before running the actual step. I have VEP-annotated VCFs. Maybe other…


Why does write.ped remove the first locus?

In order to get a VCF file from a genind object, I am going through the hierfstat function write.ped() and then converting the result to VCF with PLINK. This is my code (apologies, but I cannot provide reproducible data for this particular scenario):…


deflated QQ plot but lambda >1

Dear all, what might be the reason for a deflated QQ plot when lambda shows a value > 1? The GWAS (case-control, using glm-logistic regression adjusting for PC1-PC3 and three covariates) was done in plink2.0, and the QQ plot was made using the qqman package. TIA…


Population stratification with PCA

Hi all! I have a genotype dataset in PLINK format. Now I want to correct for population structure with PCA in an association analysis. I split my dataset into training and testing datasets. I want to do the PCA only in the training dataset and use…


How to convert GEN or .gen format from impute.me to vcf on windows 10?

I tried for days to convert a gen file to VCF but it did not work. I am a beginner, so I don’t know what is in VCF files and gen files or how they…


How to import dosage information to plink binary files?

Hi all, I recently converted some very large TOPMed-imputed VCF files into PLINK format. The command I used to convert this VCF was plink1.9 --vcf ${VCF} --make-bed --out ${VCF}_binary. Additionally, I also spent a significant amount of time…


Problems Imputing X Chromosome with TOPMed

I have a large dataset whose autosomes I was able to successfully phase and impute using TOPMed. I have tried doing the same with the X chromosome but keep running into issues. Before trying to impute with TOPMed, I did per-individual QC and per-marker QC, then ran checkVCF, and corrected…


Plink1.9 error -chr not recognised

Hello, I am trying to run PLINK on my tped file (SNP data set) to test for linkage disequilibrium, but it appears that I need to set some chromosome options, because I get an error at some line. My data is not mapped…


Convert plink files from

I have PLINK files where the .bim file is in the following format, and I want to convert the chr:pos column into rsIDs. 1 1:10177:A:AC 0 10177 AC A 1 1:10352:T:TA 0 10352 TA T 1 1:11008:C:G 0 11008 G C 1 1:11012:C:G 0 11012…
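The renaming step can be sketched roughly as below, assuming a lookup table from (chromosome, position) to rsID is already available from some annotation source. The function name, the lookup structure and the placeholder ID are all hypothetical; keying on position alone is also a simplification, since multi-allelic sites would need the alleles in the key as well.

```python
# Hypothetical sketch: swap chr:pos:ref:alt IDs in .bim rows for rsIDs.
# Assumption: rsid_lookup maps (chromosome, position) -> rsID.

def rename_bim_ids(bim_lines, rsid_lookup):
    """Return .bim rows with column 2 replaced where a lookup entry exists."""
    renamed = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns in .bim row: {line!r}")
        chrom, variant_id, _cm, pos, _a1, _a2 = fields
        # Keep the original ID when the position is not in the lookup.
        fields[1] = rsid_lookup.get((chrom, int(pos)), variant_id)
        renamed.append("\t".join(fields))
    return renamed


rows = ["1 1:10177:A:AC 0 10177 AC A", "1 1:10352:T:TA 0 10352 TA T"]
lookup = {("1", 10177): "rs_placeholder_1"}  # placeholder, not a real rsID
for row in rename_bim_ids(rows, lookup):
    print(row)
```

Variants without a match keep their chr:pos:ref:alt name, so the output still has one unique ID per row as long as the input did.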


PLINK and population stratification with known subpopulations

I want to perform a genome-wide association study (GWAS) with PLINK 1.9. I have whole genome sequencing SNP calls for ~100 patients where I know in advance that there is a skew towards subpopulations of African and South American descent, with…


Plink v2.0 does not produce a Z-compressed file (.zst)

Good morning, I would like to convert a merged VCF into PLINK compressed format (.pgen, .psam and .pvar files), so I run plink2 --vcf MyMerged.vcf.gz --make-pgen --zst-level 3 --out MySamples It basically works, as it produces these files: ls…


PLINK map format splitting by chromosome

Hi everyone, I have a PLINK map file with four columns: chromosome name, SNP ID, genetic distance in morgans (all zero in this column), and base-pair position. NC1.1 rs1 0 145 NC1.1 rs2 0 201 NC2.1 rs3 0 208 . . NCn.1 rsn 0 509 I…
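Grouping the rows of such a map file by their first column can be sketched as below. The function name is hypothetical and the sketch only splits the .map rows; the matching genotype columns in the .ped file would need the same subsetting, which is normally left to PLINK itself.

```python
from collections import defaultdict


def split_map_by_chromosome(map_lines):
    """Group .map rows by the chromosome name in column 1.

    Returns a dict of chromosome -> list of rows, in first-seen order.
    """
    groups = defaultdict(list)
    for line in map_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chrom = line.split()[0]
        groups[chrom].append(line)
    return dict(groups)


example = ["NC1.1 rs1 0 145", "NC1.1 rs2 0 201", "NC2.1 rs3 0 208"]
by_chrom = split_map_by_chromosome(example)
for chrom, rows in by_chrom.items():
    print(chrom, len(rows))
```

Each group could then be written to its own file, for example one named after the chromosome.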


Recreating QC of 1000 Genomes project

Hi everyone, I am attempting to recreate the quality control analysis performed in the 1000 Genomes Project (tcag.ca/documents/tools/omni25_qcReport.pdf), in particular the removal of non-overlapping SNPs. I am fairly new to performing QC on a dataset, and am currently stuck on section 5.1 of the…


PLINK dosage data – convert and/or read in R

I have PLINK dosage files in pgen/psam/pvar form. I would like to know how to do either or both of the following: convert the file set to bed/bim/fam files (hard calls); read the allele dosage from the pgen into R as a continuous…


Can someone explain PLINK allele REF/ALT management strategy?

Sometimes when merging two PLINK files, the reference (REF) and alternative (ALT) alleles may be reversed, e.g. REF G ALT A versus REF A ALT G. The main reason for that is PLINK’s default behavior. You see, when using…


Checking chromosome builds for genotyping data

Hi, I have several studies’ worth of data (in both PLINK and VCF format), and I was wondering if anyone knew of an online tool that I could use to check my chromosome build, i.e. GRCh37 vs GRCh38. (I thought I used one…


merging multiple .bed /.bgen files uk biobank using plink

Hi all, I am having a problem merging all the chromosomal UK Biobank files. I ran the following command: plink2 --bfile /path/to/file/ukb_imp_chr1 --pmerge-list /path/to/file/merge.list --maf 0.01 --hwe 1e-6 --make-pgen --out /path/to/file/ukb_imp_allchr I also tried: plink2 --bfile /path/to/file/ukb_imp_chr1 --pmerge-list /path/to/file/merge.list --maf 0.01 --hwe 1e-6 --make-bed --out /path/to/file/ukb_imp_allchr The merge.list has the following…


UK Biobank Imputed Genotypes

I am using the UK Biobank imputed genetic data set and I was wondering if it is possible to use PLINK (or any other software) to view the actual calls for each SNP, ranging in value from 0 to 2 (including fractions)? I have been…


The genomic origins of the Bronze Age Tarim Basin mummies

1. Peyrot, M. in Aspects of Globalisation: Mobility, Exchange and the Development of Multi-Cultural States 12–17 (2017).
2. Damgaard, P. et al. 137 ancient human genomes from across the Eurasian steppes. Nature 557, 369–374 (2018).
3. Hemphill, B. E. & Mallory, J….


GWAS Studies

How do I create a .ped file for use in PLINK? I have the following CSV file: Chromosome Position Sample1 Sample 2 ……….. Sample n Chr Pos Sam1 Sam2 Sam3, Sam 4, ……Sample_n 1 11 A T T, A, A, T, …… 2 141 G G G, T,…


GWAS data from PGC

This is my first time using GWAS data. I downloaded some data from PGC and I have some questions. 1. There are a lot of SNPs for the same gene with different p-values; why does this occur? And if I want to use one…


Distance matrix PCA

Hi all, I generated PCA values for the 1000 Genomes dataset using PLINK. I know how to plot the values for PC1 and PC2, but my question is: how can I generate a distance matrix to select nearby samples based on populations? For example, if I…
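
One way to approach the distance-matrix question is to treat each sample's principal components as coordinates and compute pairwise Euclidean distances. The sketch below is only an illustration: the function names are hypothetical, and it assumes a PLINK 1.9 style .eigenvec layout (FID, IID, then PC values, whitespace separated, no header).

```python
import math

def read_eigenvec(lines, n_pcs=2):
    """Parse eigenvec-style lines into {(FID, IID): (PC1, ..., PCn)}.

    Assumed layout: FID IID PC1 PC2 ... (lines starting with '#' are skipped).
    """
    coords = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        sample = (fields[0], fields[1])
        coords[sample] = tuple(float(v) for v in fields[2:2 + n_pcs])
    return coords

def distance_matrix(coords):
    """Pairwise Euclidean distances between samples in PC space."""
    samples = list(coords)
    return {a: {b: math.dist(coords[a], coords[b]) for b in samples}
            for a in samples}

def nearest(coords, query, k=5):
    """Return the k samples closest to `query` in PC space."""
    ranked = sorted((math.dist(coords[query], xy), sample)
                    for sample, xy in coords.items() if sample != query)
    return [sample for _, sample in ranked[:k]]
```

With a real file, `read_eigenvec(open("my_pca.eigenvec"))` would supply the coordinates (the file name is a placeholder). Using only the first two PCs mirrors the PC1/PC2 plot; raising `n_pcs` uses more components.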


Phasing using SHAPEIT

Hello, I need to use SHAPEIT for phasing only, since I will conduct CH (compound heterozygous) analysis for recessive rare variants. I will not perform imputation. I am running SHAPEIT, and I see in the log file it says: Parameters : * Seed : 1442251531 * Parallelisation: 12 threads *…


New datasets for ancestry estimation and imputation?

What datasets are people using nowadays for genotype imputation and ancestry estimation? HapMap and 1000 Genomes are good, but it has been some years since their release, and both have limitations in the number of populations included and in resolution (especially HapMap, which…


No VCF records found in the specified interval

Hi, I am running into an issue while doing imputation with Beagle 5 and am not sure what is causing the error. I have VCF files converted from PLINK by the following command: ./plink --bfile qcd_in --chr 20 --recode vcf-iid…


Math behind association with PLINK

Hi, what is the mathematical formula behind the --linear association test used by PLINK? The most basic association test is just a chi-squared test on a 2 x 2 contingency table of the minor allele tallies, as to…
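
The 2 x 2 allelic test mentioned in the answer can be written out in a few lines. This is a minimal sketch of a textbook Pearson chi-squared statistic on minor/major allele counts in cases and controls, not PLINK's own code. The function names are illustrative, and it does not cover the regression model behind --linear.

```python
import math

def allelic_chi2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-squared statistic (1 df) for a 2 x 2 table of allele counts."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def chi2_p_value(chi2):
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

Each diploid individual contributes two alleles to the table, so the four counts sum to twice the number of genotyped samples.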


How long does it take to carry out the GWAS workflow?

Including these steps:
1) Raw data format transformation for five companies
2) Update positions for all SNPs to hg37 version
3) Quality control within companies
4) Pre-phasing (SHAPEIT2) and imputation (IMPUTE2) for all SNPs of each company
5)…


Runs of homozygosity in Plink

❯ plink1.9 --homozyg --help
PLINK v1.90b6.22 64-bit (3 Nov 2020)  www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang  GNU General Public License v3
--help present, ignoring other flags.
--homozyg [{group | group-verbose}] ['consensus-match'] ['extend'] ['subtract-1-from-lengths']
--homozyg-snp <min var count>
--homozyg-kb <min length>
--homozyg-density <max inverse density (kb/var)>
--homozyg-gap <max internal gap…


Sample ID ERROR in converting dosage file (.txt) to VCF in PLINK 2.0

Hi, I have a dosage file (.txt format; alternative allele count 0-2) that I would like to convert to a VCF through PLINK 2.0, but I am running into sample ID issues. I am attempting the following…


Error: Could not open temporary file.

Hello, there is an error when I try to filter a VCF file for MAF. Here is part of the VCF file: 2: chr1H 523 chr1H:523 C A . PASS . GT 0|0 1: chr1H 445 chr1H:445 C T . PASS . GT 0| 2: chr1H 523 chr1H:523 C…


Can we merge two VCF files

Can we merge two VCF files, an RNA-seq VCF and a whole genome sequencing (WGS) VCF, to do PCA? I’m quite new to variant calling and analysis. We have around 30 samples of RNA-seq data from tumor and normal samples. I performed variant calling and obtained…


How to convert multiple .vcf files into single .ped (PLINK compatible files)?

Hi everyone, I am a newbie to the whole bioinformatics world and I need to analyse WGS data from several case samples. I now have several individual .vcf files and would like to use PLINK for Quality…


The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests

Sequencing of Shorea leprosula genome
Sample collection
Leaf samples of S. leprosula were obtained from a reproductively mature (diameter at breast height, 50 cm) diploid tree B1_19 (DNA ID 214) grown in the Dipterocarp Arboretum, Forest Research Institute Malaysia (FRIM).
DNA extraction
Genomic DNA was extracted from leaf samples using the…


Bioinformatics Biomedical Scientist – Bilsborough Lab

Bioinformatics Biomedical Scientist – Bilsborough Lab – Inflammatory Bowel Diseases Drug Discovery and Development. Requisition # HRC0697538. Join us in accelerating the pace of research and discovery within our unique IBD3 lab! Cedars-Sinai provides virtually every known gastroenterologic analytical procedure and treatment…


PLINK basic command line usage

Hey, I am new to PLINK. I am following a tutorial on how to calculate a polygenic risk score (choishingwan.github.io/PRS-Tutorial/target/#standard-gwas-qc). I ran the "Standard GWAS QC" part, and the code is as follows: plink --bfile EUR --maf 0.01 --hwe 1e-6 --geno…


Plink genome flag error

I’m using PLINK 1.9 and running the command: plink --bfile data --genome --extract data.prune.in --remove data.fail_imiss where data.fail_imiss is a file listing individuals whose proportion of missing SNPs exceeds the threshold of 0.05. I get the warnings: Warning: 10745 het. haploid genotypes…


Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old)

NB (update July 29, 2020): this thread will no longer be watched and, for all intents and purposes, will now be archived.
NB: Version 2 of the tutorial can be found here and should be used going forward: Produce PCA bi-plot for 1000 Genomes Phase III –…


VCF file

Removal Sample
###junk removal
vcftools --gzvcf cohorts_combined_filtered_calls_annotated.vcf --maf 0.05 --max-missing 0.8 --min-meanDP 10 --recode --out SNP-co.vcf
####Sample removal
bcftools view --samples-file ^ /media/bioinformatician/My Passport/sample_id_no-RNASEq.txt SNP-co.vcf.
###chromosome extraction
vcftools --gzvcf SNP_co.vcf.gz --chr 2 --from-bp 47843908 --to-bp 47877312 --recode --out snps_filt_chr2
##conversion vcf to plink
./plink --vcf /media/bioinformatician/My Passport/annotated/tabix/snps_filt_chr2.vcf.gz.recode.vcf…


Where to find 1000 Genome phase 3 whole genome data and select only European population

Hello, I was trying to download whole-genome data from 1000 Genomes phase 3 and extract only the EUR populations (GBR, TSI, FIN, IBS, CEU). I used the FTP site: ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz, but apparently it…


genomic data scientist jobs

Provide strategic planning and perform analysis or simulations independently or in a … 401(k) savings plan match. … Requires a Ph.D. in Biochemistry, Biotechnology, Molecular/Cell Biology, Plant Biology, or a related field and 0-3 years of relevant postdoctoral or industrial … In addition, the analyst will help advance the group’s collective expertise…


extract IDs from each population in PLINK .fam file and export to .txt separately

Hi, I’d like to write a loop to extract individuals from my PLINK .fam file, based on the family ID / population code, into separate .txt files, using just bash. I’m pretty stumped so would appreciate…
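
The question asks for bash. For illustration only, the same grouping is sketched here in Python. It assumes the population code is stored in the first column (FID) of the .fam file and that FIDs are safe to use as file names. The function name and output layout are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

def split_fam_by_fid(fam_path, out_dir):
    """Write one <FID>.txt file per family/population ID found in a .fam file.

    Each output line holds "FID<TAB>IID", the two-column sample list layout
    that PLINK's --keep accepts.
    """
    groups = defaultdict(list)
    with open(fam_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            groups[fields[0]].append(fields[1])

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for fid, iids in groups.items():
        path = out_dir / f"{fid}.txt"
        path.write_text("".join(f"{fid}\t{iid}\n" for iid in iids))
        written[fid] = path
    return written

# Example call (paths are placeholders):
# split_fam_by_fid("mydata.fam", "per_population")
```

Each resulting file can then be passed to PLINK's --keep to subset the data by population.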


Loading PLINK files to Haploview

I can’t seem to get Haploview to accept PLINK files. I downloaded the ‘sample.ped’ and ‘sample.info’ trial files from the Haploview website. I am loading the .ped file for ‘Results File:’ and the .info file for ‘Map File:’. All other options are left at…


Haploview and plink

When I ran PLINK to get .ped and .info files, it generated them for only 8 chromosomes out of 22, X, Y and M. Kindly help me get all chromosomes. When I upload the .ped and .info files to Haploview.jar, it shows loading 0%…


Please help me understand LD pruning algorithm

I’m trying to do LD pruning with PLINK but I can’t find any proper documentation about the algorithm used. There are options such as --indep, --indep-pairwise and --indep-pairphase, and sub-options such as window size, step size and threshold. I’m not sure what each means…
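
To make the three sub-options concrete, here is a deliberately simplified sketch of window-based pairwise pruning: take a window of consecutive variants, drop one variant from every pair whose squared genotype correlation exceeds the threshold, then slide the window forward by the step size. It is an illustration under stated assumptions, not PLINK's implementation. Genotypes are assumed to be 0/1/2 allele counts with no missing values, the window is counted in variants, and the later variant of a correlated pair is always the one dropped, which need not match PLINK's choice.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic variant has no defined correlation
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def prune(genotypes, window=50, step=5, r2_threshold=0.5):
    """Return indices of variants kept after simplified pairwise LD pruning.

    genotypes: list of per-variant lists of allele counts, in genomic order.
    """
    keep = [True] * len(genotypes)
    start = 0
    while start < len(genotypes):
        end = min(start + window, len(genotypes))
        for i in range(start, end):
            if not keep[i]:
                continue
            for j in range(i + 1, end):
                if keep[j] and r_squared(genotypes[i], genotypes[j]) > r2_threshold:
                    keep[j] = False  # simplification: always drop the later variant
        start += step
    return [i for i, kept in enumerate(keep) if kept]
```

In this picture, a larger window compares variants that are further apart, a smaller step rechecks overlapping windows more often, and a lower threshold removes more variants.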
