Tag: bgen
The BGEN format
Software support For command-line users: BGEN support has been implemented in several software packages; click on the links below for more information. For R users: The rbgen package can be used to load data directly into R. For Python users: A number of Python solutions have been developed, see e.g.…
Differentiate sequenced and imputed variants in BGEN files
Hi there! I have recently started working with UK Biobank data, particularly with the imputed BGEN files (for whoever has access to UK Biobank, the files are located at Bulk/Imputation/UKB imputation from genotype). I am also very new to the field…
Bgen file not being opened by PRSice
I used the following command to calculate PRS of a sequenced file coming from a collaborator. I imputed the vcf file which gave me separate vcf files for each chromosome. I then converted them to bgen and generated bgi and sample files…
No --bgen REF/ALT mode specified
runPLINK <- function(PLINKoptions = "") system(paste("/opt/apps/plink/2.0/bin/plink2", PLINKoptions)); runPLINK(); runPLINK("--bgen /DATA/shared/bcac/genotypes/v10/bgen/icogs_euro/iCOGS_european_chr21.bgen --out /DATA/users/m.shokouhi/plink/plink2") This code creates the PGEN/PVAR/PSAM files, but there is a warning during the run: No --bgen REF/ALT mode specified. The plink2 website says that it considers the first allele as the reference allele if you do not specify…
Troubleshooting multiallelic variant merging issue
Hello, I want to recode the IIDs of imputed data .bgen files into two different filesets, and merge these (working on eye-level analyses with Regenie). As I'm only interested in dosages, I've converted these to .pgen using PLINK2 (ref-first as UK Biobank): plink2 --bgen data.bgen ref-first --sample data.sample --update-ids recoded_ids_a.txt --make-pgen…
Highly inflated p-values in GWAS by regenie
I was running a GWAS using REGENIE 3.2.5 on more than 250,000 samples, and the p-values returned are highly inflated with -log10P up to 5000. As a result there were over 10,000 variants called significant under the threshold of p < 5e-8,…
Why and how to address this?
Dear All, I'm preparing data for a Mendelian randomization (MR) analysis to assess the causal effect of telomere length on a kidney phenotype in UK Biobank (UKB) data. These are the steps I have taken so far: 1. I started by searching for summary data from prior research and found close to 800 SNPs for…
Plink codes showed GWAS results with effect size (beta) and SE as NA: Why and How?
Dear All, I'm preparing data for a Mendelian randomization (MR) analysis to assess the causal effect of telomere length on a kidney phenotype using UK Biobank (UKB) data. These are the steps I have taken so far:…
Invalid .bed file size (expected 9996779 bytes)
Hello Christopher, We have data in PGEN format, and as part of our workflow, we initially created temporary PSAM, PVAR, and PGEN files using the following command: ./plink2 --bgen data.bgen ref-first --sample data.sample --set-missing-var-ids @:#:'\$r':'\$a' --new-id-max-allele-len 99 truncate --make-pgen --out data.intermediate 2. Following this step, we convert these temporary files to…
PLINK format file for BOLT-LMM
I would like to run BOLT-LMM to assess the association of SNPs with a quantitative trait. BOLT-LMM requires PLINK format files (bed, bim, fam) and imputed files (bgen, sample). Should I use direct genotype data typed from SNP arrays for the plink format files?…
How to read PAR variants in chr23 from .bgen file?
Hi, I want to view how my .bgen is formatted (dosage and other stuff) for PAR and non-PAR variants in chromosome 23. I have the index file and the sample file as well. Would you please help me? PAR…
filenameXYZ.bim could not be scanned twice, plink2
Hi Chris, I am using PLINK2 to merge large files (binary filesets) and then exporting them into bgen format. One of the files caused a problem; if I remove that file, merging works. I tried to investigate whether the problem is with that specific input fileset, but if I only attempt…
How do you see the genotypes for one variant in bgen files in Python?
I have some bgen and corresponding sample files and would like to end up with a pandas dataframe with one column of sample IDs and another column with the genotypes for one variant for each sample. Does…
bgen, missing-code, and error in temporary.psam
Hello! I am trying to rerun (update) my analyses after reviewer responses. It has been a few years since I had to touch these files/code and I am running into problems that I did not have back in 2018 with the bgen files from ALSPAC. Specifically, that the temporary.psam file…
PLINK, Unphased heterozygous hardcalls in partially-phased variants are poorly represented with bits=8
Hi, I'm currently trying to convert dosage data from the vcf.gz format to bgen 1.2 format (8 bits), using plink2, in order to use it later with LDpred-2. However, during the conversion process, I encountered a warning…
How do I restrict to only hm3 SNPs for plink analyses?
Hello! I am trying to use plink to select only hm3 snps from the ukb bgen files that I have, but I am at a total loss as to how to proceed. I am really new to all…
--update-name not working – possibly because of comma delim?
Hi there, I have two separate sets of gwas data (‘A’ and ‘B’) with a file per chr. I am running a targeted gene/pathway analysis so I have filtered out snps per each gene location (209 snps = 209 filtered gene-specific files). For dataset A I ran the following commands:…
how to convert bcf files into bgen
Hi guys, I downloaded vcf.gz files (22, one per chromosome) from TOPMed after imputation. Then, I filtered my vcf.gz files by MAF with bcftools, which gave me output files in bcf format. I am wondering: is there a tool able…
PLINK2: covar-variance-standardize option
Hi, I have been using the newest version of PLINK2 (www.cog-genomics.org/plink/2.0/) to perform a sex-SNP interaction analysis with a binary outcome. I ran the following command: plink2 --bgen [FILE].bgen --sample [FILE].sample --pheno-name [pheno] --covar [FILE].phen --covar-name sex PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9…
plink2 bgen to vcf ukbiobank
Hello, I have imputed data from UK Biobank in bgen format. I would like to convert it to a vcf file. I can use plink2 to make pgen files and then use plink2 again to create a vcf: plink2 --bgen ukb_imp_chr17_v3.bgen --sample ukimp_chr17_v3_s.sample --make-pgen plink2…
Reading BGEN file using Hail on a Spark cluster results in corrupted matrix table
I have a BGEN file that I obtained from a PGEN file via PLINK2. The command I used is as follows: plink2 --pfile my_file --export bgen-1.2 bits=8 --out /some/path --output-chr chrM and I used hl.index_bgen() to create an idx2 index. Now when I have the BGEN file and the idx2…
PLINK2 selecting variants based on INFO score
Hi everyone, I am currently preprocessing UK Biobank genetic data in order to run a GWAS. I am working with the imputed data, which I transformed to .pgen files and then merged into a single file containing all chromosomes. I have run into a problem with excluding variants with poor…
How can I keep INFO values when converting bgen to VCF using plink2?
I am working on file handling for GWAS. When I converted bgen to VCF using plink2 with the commands below, all INFO (and also FILTER) columns became “.” in the output VCF files. A…
Using QCTOOL v2 to process UK Biobank .bgen files – why so slow?
I'm currently using QCTOOL v2 to process imputed .bgen files from UK Biobank, however they seem to be processing very slowly. Is this normal? My command is pretty basic; I'm filtering out a list of SNPs…
qctool to merge two bgen file fails with no clear reason to
Hi, I am trying to merge two bgen files using qctool as explained here. I am using qctool_v2.2.0. The command works but ends with an error: ❱ qctool -g bug/in2.bgen -s bug/in2.sample -merge-in bug/in1.bgen bug/in1.sample -og bla.bgen -os bla.sample Welcome to qctool (version: 2.2.0, revision: unknown) (C) 2009-2020 University of…
P-values far too high for quantitative regenie phenotype
Hi all, I'm having some trouble running regenie (v2.2.4) on a quantitative phenotype for a large cohort. I'm testing a standard height GWAS with heights rounded to the nearest integer. I've tried a few different tests to see where the issue…
vcf to bgen conversion using qctool v2 yields 0 snps
Hi all, I have a vcf file that was extracted from UKB data using qctool (v2.0.6-Ubuntu16.04-x86_64) and contains data in the GP format. This contains a bunch of SNPs from a single chromosome. ❱ wc -l chromosome1.vcf 260 chromosome1.vcf Then I try to convert this file to .bgen again using…
Why may BOLT-LMM and SAIGE (quantitative, linear mixed model) yield different results when run on exactly the same dataset?
As a validation experiment, I have run the same GWAS of a quantitative phenotype derived from the UK Biobank, alongside the genomic data from the UK Biobank, once using the program BOLT-LMM and once using the SAIGE linear mixed model (with the quantitative trait tag selected). I wanted to see if the results would…
formatting error: Calculate LD matrix from bgen file
Hello, I am new to plink and am learning as I go. I am trying to calculate an LD matrix for a list of variants while using a bgen file as my reference population. See the command below: ./plink2/plink2 --r2 bin…
extract list of SNPs from multiple chr{1:22}.bgen files using plink2
Hello, I have extracted a list of SNPs based on the MAF cutoffs 0, 0.0001, 0.001, 0.01, 0.1, .55, 1.0. I am running plink2 to extract this list from .bgen files for individual chromosomes using the following code: plink2 --chr{1:22}.bgen --extract maf1_snps for imputed…