Tag: bgen

The BGEN format

Software support. For command-line users: BGEN support has been implemented in several software packages; click on the links below for more information. For R users: the rbgen package can be used to load data directly into R. For Python users: a number of Python solutions have been developed, see e.g.…

Differentiate sequenced and imputed variants in BGEN files

Hi there! I have recently started working with UK Biobank data, particularly with the imputed BGEN files (for anyone who has access to UK Biobank, the files are located at Bulk/Imputation/UKB imputation from genotype). I am also very new to the field…

Bgen file not being opened by PRSice

I used the following command to calculate the PRS of a sequenced file coming from a collaborator. I imputed the VCF file, which gave me separate VCF files for each chromosome. I then converted them to bgen and generated bgi and sample files…

No --bgen REF/ALT mode specified

runPLINK <- function(PLINKoptions = "") system(paste("/opt/apps/plink/2.0/bin/plink2", PLINKoptions)); runPLINK(); runPLINK("--bgen /DATA/shared/bcac/genotypes/v10/bgen/icogs_euro/iCOGS_european_chr21.bgen --out /DATA/users/m.shokouhi/plink/plink2") This code produces the PGEN/PVAR/PSAM files, but the run prints a warning: No --bgen REF/ALT mode specified. The plink2 website says that it considers the first allele as a reference allele if you do not specify…
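
The warning quoted above goes away once a REF/ALT mode is passed right after the BGEN file name. The following is a minimal sketch, not the poster's code: it only builds the plink2 argument list (it does not run plink2), the helper name `build_bgen_import` and the file names are hypothetical, and the choice of `ref-first` is an assumption that has to be checked against how the BGEN file was actually written.

```python
import shlex

# REF/ALT modes that plink2 accepts after the --bgen file name.
BGEN_REF_MODES = ("ref-first", "ref-last", "ref-unknown")


def build_bgen_import(bgen, out_prefix, ref_mode, sample=None, plink2="plink2"):
    """Return the argv list for a BGEN import with an explicit REF/ALT mode.

    Hypothetical helper for illustration; it builds the command but does not run it.
    """
    if ref_mode not in BGEN_REF_MODES:
        raise ValueError(f"ref_mode must be one of {BGEN_REF_MODES}, got {ref_mode!r}")
    argv = [plink2, "--bgen", bgen, ref_mode]
    if sample is not None:
        # A separate .sample file is only needed when one accompanies the BGEN.
        argv += ["--sample", sample]
    argv += ["--make-pgen", "--out", out_prefix]
    return argv


# Hypothetical file names; ref-first is an assumption about this particular file.
argv = build_bgen_import("chr21.bgen", "chr21_import", "ref-first")
print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run(argv, check=True)` once the mode has been confirmed for the dataset at hand.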

Troubleshooting multiallelic variant merging issue

Hello, I want to recode the IIDs of imputed data .bgen files into two different filesets, and merge these (working on eye-level analyses with Regenie). As I’m only interested in dosages, I’ve converted these to .pgen using PLINK2 (ref-first, as for UK Biobank): plink2 --bgen data.bgen ref-first --sample data.sample --update-ids recoded_ids_a.txt --make-pgen…

Highly inflated p-values in GWAS by regenie

I was running a GWAS using REGENIE 3.2.5 on more than 250,000 samples, and the p-values returned are highly inflated with -log10P up to 5000. As a result there were over 10,000 variants called significant under the threshold of p < 5e-8,…

Why and how to address this?

Dear All, I’m preparing data for a Mendelian randomization (MR) analysis to assess the causal effect of telomere length on kidney phenotype in UK Biobank (UKB) data. These are the steps I have taken so far: 1. I started by searching for summary data from prior research and found close to 800 SNPs for…

Plink codes showed GWAS results with effect size (beta) and SE as NA: Why and How?

Dear All, I’m preparing data for a Mendelian randomization (MR) analysis to assess the causal effect of telomere length on kidney phenotype using UK Biobank (UKB) data. These are the steps I have taken so far:…

Invalid .bed file size (expected 9996779 bytes)

Hello Christopher, We have data in PGEN format, and as part of our workflow, we initially created temporary PSAM, PVAR, and PGEN files using the following command: ./plink2 --bgen data.bgen ref-first --sample data.sample --set-missing-var-ids @:#:'\$r':'\$a' --new-id-max-allele-len 99 truncate --make-pgen --out data.intermediate 2. Following this step, we convert these temporary files to…

PLINK format file for BOLT-LMM

I would like to run BOLT-LMM to assess the association of SNPs with a quantitative trait. BOLT-LMM requires PLINK format files (bed, bim, fam) and imputed files (bgen, sample). Should I use direct genotype data typed from SNP arrays for the PLINK format files?…

How to read PAR variants in chr23 from .bgen file?

Hi, I want to view how my .bgen is formatted (dosage and other stuff) for PAR and non-PAR variants in chromosome 23. I have the index file and the sample file as well. Would you please help me? PAR…

filenameXYZ.bim could not be scanned twice, plink2

Hi Chris, I am using PLINK2 to merge large files (binary filesets) and then exporting them into bgen format. One of the files caused a problem; if I remove that file, merging works. I tried to investigate whether the problem lies with that specific input fileset, but if I only attempt…

How do you see the genotypes for one variant in bgen files in Python?

I have some bgen files and corresponding sample files and would like to end up with a pandas dataframe with one column of sample IDs and another column with the genotypes for one variant for each sample. Does…

bgen, missing-code, and error in temporary.psam

Hello! I am trying to rerun (update) my analyses after reviewer responses. It has been a few years since I had to touch these files/code and I am running into problems that I did not have back in 2018 with the bgen files from ALSPAC. Specifically, the temporary.psam file…

PLINK, Unphased heterozygous hardcalls in partially-phased variants are poorly represented with bits=8

Hi, I’m currently trying to convert dosage data from the vcf.gz format to bgen 1.2 format (8 bits), using plink2, in order to use it later with LDpred-2. However, during the conversion process, I encountered a warning…

How do I restrict to only hm3 SNPs for plink analyses?

Hello! I am trying to use plink to select only hm3 SNPs from the UKB bgen files that I have, but I am at a total loss as to how to proceed. I am really new to all…

--update-name not working – possibly because of comma delim?

Hi there, I have two separate sets of GWAS data (‘A’ and ‘B’) with one file per chromosome. I am running a targeted gene/pathway analysis, so I have filtered out SNPs for each gene location (209 snps = 209 filtered gene-specific files). For dataset A I ran the following commands:…

how to convert bcf files into bgen

Hi guys, I downloaded vcf.gz files (22, one per chromosome) from TopMed after imputation. Then I filtered my vcf.gz files by MAF with bcftools, which gave me output files in bcf format. I am wondering: is there a tool able…

PLINK2: covar-variance-standardize option

Hi, I have been using the newest version of PLINK2 (www.cog-genomics.org/plink/2.0/) to perform a sex-SNP interaction analysis with a binary outcome. I ran the following command: plink2 --bgen [FILE].bgen --sample [FILE].sample --pheno-name [pheno] --covar [FILE].phen --covar-name sex PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9…

plink2 bgen to vcf ukbiobank

Hello, I have imputed data from UK Biobank in bgen format. I would like to convert it to a VCF file. I can use plink2 to make pgen files and then use plink2 again to create a VCF: plink2 --bgen ukb_imp_chr17_v3.bgen --sample ukimp_chr17_v3_s.sample --make-pgen plink2…
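
The two-step route described in this excerpt (BGEN to PGEN, then PGEN to VCF) can be sketched as follows. This is an illustration under stated assumptions rather than the poster's full command, which is cut off above: the helper `bgen_to_vcf_commands` and the output prefixes are hypothetical, `ref-first` is assumed because another post on this page describes UK Biobank BGEN files that way, and the commands are only built, not executed.

```python
import shlex


def bgen_to_vcf_commands(bgen, sample, tmp_prefix, vcf_prefix,
                         ref_mode="ref-first", dosage_field=None):
    """Build the two plink2 invocations for a BGEN -> PGEN -> VCF conversion.

    Hypothetical helper: returns argv lists and does not run plink2.
    """
    # Step 1: import the BGEN file into a temporary PGEN/PVAR/PSAM fileset.
    to_pgen = ["plink2", "--bgen", bgen, ref_mode, "--sample", sample,
               "--make-pgen", "--out", tmp_prefix]
    # Step 2: export that fileset as VCF.
    to_vcf = ["plink2", "--pfile", tmp_prefix, "--export", "vcf"]
    if dosage_field is not None:
        # To my understanding, plain "--export vcf" writes hard calls only;
        # the vcf-dosage modifier (e.g. DS) asks plink2 to keep dosages too.
        to_vcf.append(f"vcf-dosage={dosage_field}")
    to_vcf += ["--out", vcf_prefix]
    return [to_pgen, to_vcf]


# Input names are taken from the excerpt; the two output prefixes are made up.
for argv in bgen_to_vcf_commands("ukb_imp_chr17_v3.bgen", "ukimp_chr17_v3_s.sample",
                                 "chr17_tmp", "chr17_vcf", dosage_field="DS"):
    print(shlex.join(argv))
```

Keeping the intermediate fileset on disk costs space, but it lets the export step be rerun with different modifiers without re-reading the BGEN file.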

Reading BGEN file using Hail on a Spark cluster results in corrupted matrix table – Hail Query & hailctl

I have a BGEN file that I obtained from a PGEN file via PLINK2. The command I used is as follows: plink2 --pfile my_file --export bgen-1.2 bits=8 --out /some/path --output-chr chrM and I used hl.index_bgen() to create an idx2 index. Now when I have the BGEN file and the idx2…

PLINK2 selecting variants based on INFO score

Hi everyone, I am currently preprocessing UK Biobank genetic data in order to run a GWAS. I am working with the imputed data, which I transformed to .pgen files and then merged into a single file containing all chromosomes. I have run into a problem with excluding variants with poor…

How can I keep INFO value when convert bgen to VCF by using plink2?

I am working on file handling for GWAS. When I converted bgen to VCF using plink2 with the commands below, all INFO (and also FILTER) columns became “.” in the output VCF files. A…

Using QCTOOL v2 to process UK Biobank .bgen files

I’m currently using QCTOOL v2 to process imputed .bgen files from UK Biobank, but they seem to be processing very slowly. Is this normal? My command is pretty basic; I’m filtering out a list of SNPs…

qctool to merge two bgen file fails with no clear reason to

Hi, I am trying to merge two bgen files using qctool as explained here. I am using qctool_v2.2.0. The command works but ends with an error: ❱ qctool -g bug/in2.bgen -s bug/in2.sample -merge-in bug/in1.bgen bug/in1.sample -og bla.bgen -os bla.sample Welcome to qctool (version: 2.2.0, revision: unknown) (C) 2009-2020 University of…

P-values far too high for quantitative regenie phenotype

Hi all, I’m having some trouble running regenie (v2.2.4) on a quantitative phenotype for a large cohort. I’m testing a standard height GWAS with heights rounded to the nearest integer. I’ve tried a few different tests to see where the issue…

vcf to bgen conversion using qctool v2 yields 0 snps

Hi all, I have a vcf file that was extracted from UKB data using qctool (v2.0.6-Ubuntu16.04-x86_64) and contains data in the GP format. This contains a bunch of SNPs from a single chromosome. ❱ wc -l chromosome1.vcf 260 chromosome1.vcf Then I try to convert this file to .bgen again using…

Why may BOLT-LMM and SAIGE (quantitative, linear-mixed model) yield different results when run on exactly the same dataset?

As a validation experiment, I have run the same GWAS of a quantitative phenotype derived from the UK Biobank, alongside the genomic data from the UK Biobank, once using BOLT-LMM and once using the SAIGE linear mixed model (with the quantitative trait option selected). I wanted to see if the results would…

Calculate LD matrix from bgen file

Hello, I am new to plink and am learning as I go. I am trying to calculate an LD matrix for a list of variants while using a bgen file as my reference population. See the command below: ./plink2/plink2 --r2 bin…

extract list of SNPs from multiple chr{1:22}.bgen files using plink2

Hello, I have extracted a list of SNPs based on the MAF cutoffs 0,,0.0001, 0.001,0.01,0.1,.55,1.0. I am running plink2 to extract this list from the .bgen files for individual chromosomes using the following code: plink2 --chr{1:22}.bgen --extract maf1_snps for imputed…
