You can use this script in two ways:
- Read many millions of P values from stdin:
# python
zcat pval.txt.gz | qqplot.py -out test -title "QQ plot on the fly"
# julia
zcat pval.txt.gz | qqplot.jl --out test --title "QQ plot on the fly"
Warning: if you have 100 billion P values to process, you should definitely use qqplot.jl instead of qqplot.py. On my server, the Julia version processes 3 billion lines per hour, while the Python version processes only 700 million.
- Use qqplot.py in your own script:
import numpy as np
from qqplot import qqplot
p = np.random.random(1000000)
qqplot(x=p, figname="test.png")
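For readers who want to see what a QQ plot of P values actually compares, here is a minimal sketch using only the standard library. It is not the code inside qqplot.py or qqplot.jl. The helper names `read_pvalues` and `qq_points` are hypothetical, the input is assumed to hold one P value per line, and the expected quantile `rank / (n + 1)` is one common convention among several.

```python
import math


def read_pvalues(lines):
    """Parse one P value per line (assumed format); skip blanks, headers, and values outside (0, 1]."""
    pvals = []
    for line in lines:
        field = line.strip()
        if not field:
            continue
        try:
            p = float(field)
        except ValueError:
            continue  # non-numeric line, e.g. a header
        if 0.0 < p <= 1.0:
            pvals.append(p)
    return pvals


def qq_points(pvals):
    """Return (expected, observed) pairs of -log10(P), smallest P first."""
    n = len(pvals)
    points = []
    for rank, p in enumerate(sorted(pvals), start=1):
        expected = rank / (n + 1)  # assumed convention for the uniform quantile
        points.append((-math.log10(expected), -math.log10(p)))
    return points
```

Passing `sys.stdin` as `lines` would read the piped values. Note that this sketch sorts everything in memory, so it is only an illustration and would not scale to billions of values.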
Before running bcftools merge, you may need to fix the REF and ALT alleles and the corresponding genotypes; otherwise bcftools will surprise you.
usage: fixref.py [-h] REF_VCF IN_VCF OUT_VCF
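To illustrate what "fixing the ref and alt and corresponding genotypes" can mean, here is a rough sketch for a single biallelic record. It is not taken from fixref.py. The function names `flip_genotype` and `fix_record` are hypothetical, and the sketch assumes the only problem is that REF and ALT are swapped relative to the reference VCF. It relies on GT being the first sample subfield, as the VCF specification requires when GT is present.

```python
# Swap allele indices 0 and 1; "." (missing) and the separators are left alone.
_FLIP = str.maketrans("01", "10")


def flip_genotype(sample_field):
    """Flip allele indices in the GT part of a sample field such as '0/1:35'."""
    gt, sep, rest = sample_field.partition(":")
    return gt.translate(_FLIP) + sep + rest


def fix_record(ref_alleles, ref, alt, samples):
    """Align one biallelic record to the reference (REF, ALT) pair.

    Returns (ref, alt, samples), or None when a simple swap cannot
    reconcile the alleles (e.g. a different ALT or a strand issue).
    """
    if (ref, alt) == ref_alleles:
        return ref, alt, list(samples)
    if (alt, ref) == ref_alleles:
        return alt, ref, [flip_genotype(s) for s in samples]
    return None
```

For example, `fix_record(("A", "G"), "G", "A", ["0|0", "0|1"])` gives `("A", "G", ["1|1", "1|0"])`. Allele-ordered fields other than GT (such as AD or PL) would also need reordering, which this sketch does not attempt.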
When you run an imputation analysis with BEAGLE (or other imputation tools), you may want to know the distribution of genotype discordance between the original VCF and the imputed VCF.
Warning: before running the script, make sure the two VCFs have exactly the same sites and samples for each chromosome.
usage: calc_imputed_gt_discord.py [-h] [-chr STRING] VCF1 VCF2 OUT
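As a rough illustration of the idea, the sketch below computes a per-sample discordance rate from two genotype matrices. The definition used here is an assumption, not necessarily what calc_imputed_gt_discord.py computes: the fraction of sites where the unphased genotypes differ, among sites where both calls are non-missing. The names `alleles` and `discordance_per_sample` are hypothetical, and the inputs are assumed to be already parsed into lists with identical site and sample order.

```python
def alleles(gt_field):
    """Return sorted allele indices from a GT field, or None if any allele is missing."""
    gt = gt_field.split(":", 1)[0].replace("|", "/")  # ignore phasing
    parts = gt.split("/")
    if "." in parts:
        return None
    return sorted(parts)


def discordance_per_sample(rows1, rows2):
    """rows1, rows2: per-site lists of sample genotypes, same sites and sample order."""
    if not rows1:
        return []
    n_samples = len(rows1[0])
    compared = [0] * n_samples
    mismatched = [0] * n_samples
    for site1, site2 in zip(rows1, rows2):
        for i, (g1, g2) in enumerate(zip(site1, site2)):
            a1, a2 = alleles(g1), alleles(g2)
            if a1 is None or a2 is None:
                continue  # skip sites where either call is missing
            compared[i] += 1
            if a1 != a2:
                mismatched[i] += 1
    # NaN marks samples with no comparable sites.
    return [m / c if c else float("nan") for m, c in zip(mismatched, compared)]
```

Comparing sorted alleles means `0|1` and `1/0` count as concordant, which seems reasonable when the original VCF is unphased and the imputed one is phased.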