You can use make a QQ plot in the following ways.
- one-liner for reading tons of millions of P values from the pipe
# python
zcat pval.txt.gz | qqplot.py -out test -title "QQ plot on the fly"
# julia (recommand to run it in the REPL)
zcat pval.txt.gz | qqplot.jl --out test --title "QQ plot on the fly"
warning : If you have 100 billion P values to process you should definitely use qqplot.jl instead of qqplot.py. The hourly processed number of lines of julia version is 5 billion while python is only 700 million on my server.
- running in a julia REPL (recommanded)
include("qqplot.jl")
cmd = pipeline(`zcat pval.gz`, `awk 'NR>1{print $10}'`)
qqfly("test", cmd=cmd)
- use qqplot.py in your script
import numpy as np
from qqplot import qqplot
p = np.random.random(1000000)
qqplot(x=p, figname="test.png")
Before running bcftools merge
, you maybe need to fix the ref and alt and corresponding genotypes, otherwise bcftools
will surprise you.
usage: fixref.py [-h] REF_VCF IN_VCF OUT_VCF
When you run imputation analysis with BEAGLE
(or other imputation tools), you may want to know the distribution of genotype discordance between the original vcf and imputed vcf.
usage: calc_imputed_gt_discord.py [-h] [-chr STRING] VCF1 VCF2 OUT
warning : Before running the script, you must be sure the two vcfs have the exact same sites and samples for each chromosome.
Read more here: Source link