You can use this script in two ways:
- read many millions of P values from stdin
    # python
    zcat pval.txt.gz | qqplot.py -out test -title "QQ plot on the fly"
    # julia
    zcat pval.txt.gz | qqplot.jl --out test --title "QQ plot on the fly"
Warning: if you have 100 billion P values to process, you should definitely use qqplot.jl instead of qqplot.py. On my server the Julia version processes about 3 billion lines per hour, while the Python version manages only 700 million.
- use qqplot.py in your script
    import numpy as np
    from qqplot import qqplot
    p = np.random.random(1000000)
    qqplot(x=p, figname="test.png")
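For readers who want to see what a QQ plot of P values actually compares, here is a minimal sketch using only the standard library. It is not taken from qqplot.py: the function name `qq_points` is hypothetical, and the `(i - 0.5) / n` plotting position is an assumption (other conventions exist).

```python
import math

def qq_points(pvalues):
    """Return (expected, observed) lists of -log10(P) for a QQ plot."""
    # Keep only values in (0, 1]; log10(0) is undefined.
    clean = sorted(p for p in pvalues if 0.0 < p <= 1.0)
    n = len(clean)
    # Expected quantiles under the uniform null hypothesis, using the
    # (i - 0.5) / n plotting position (an assumption of this sketch).
    expected = [-math.log10((i - 0.5) / n) for i in range(1, n + 1)]
    # Observed values, smallest P first, so both lists are descending.
    observed = [-math.log10(p) for p in clean]
    return expected, observed

expected, observed = qq_points([0.5, 0.01, 0.2, 0.9])
```

Plotting `observed` against `expected` gives the QQ plot; points above the diagonal indicate P values smaller than expected under the null.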
Before running bcftools merge, you may need to fix the REF and ALT alleles and the corresponding genotypes; otherwise
bcftools will surprise you.
usage: fixref.py [-h] REF_VCF IN_VCF OUT_VCF
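To illustrate what fixing REF, ALT, and the genotypes can mean for a single biallelic site, here is a small sketch. It is not the logic of fixref.py: the function `fix_genotype` and its arguments are hypothetical, and it only handles the case where the two alleles are swapped relative to the reference VCF (no strand flips, no multiallelic sites).

```python
def fix_genotype(ref, alt, gt, target_ref, target_alt):
    """Re-express a biallelic genotype against the reference VCF's alleles."""
    if (ref, alt) == (target_ref, target_alt):
        return ref, alt, gt  # already consistent, nothing to do
    if (ref, alt) == (target_alt, target_ref):
        # REF and ALT are swapped, so allele indices 0 and 1 trade places.
        # Separators ("/" or "|") and missing calls (".") pass through.
        flip = {"0": "1", "1": "0"}
        flipped = "".join(flip.get(ch, ch) for ch in gt)
        return target_ref, target_alt, flipped
    raise ValueError("alleles do not match the reference site")

fixed = fix_genotype("A", "G", "0/1", "G", "A")
```

A homozygous reference call becomes homozygous alternate after the swap, which is why leaving the genotypes untouched while changing REF and ALT would silently corrupt the data.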
When you run imputation analysis with BEAGLE (or other imputation tools), you may want to know the distribution of genotype discordance between the original VCF and the imputed VCF.
Warning: before running the script, make sure the two VCFs have exactly the same sites and samples for each chromosome.
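One way to check this precondition is to compare the sample names and site keys of both files. The sketch below is an assumption about how such a check could look, not part of the script: `vcf_layout` and `same_layout` are hypothetical helpers that take any iterable of uncompressed VCF text lines and compare CHROM, POS, REF, and ALT in file order.

```python
def vcf_layout(lines):
    """Collect the sample names and (CHROM, POS, REF, ALT) keys of a VCF."""
    samples, sites = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow FORMAT
        else:
            sites.append((fields[0], fields[1], fields[3], fields[4]))
    return samples, sites

def same_layout(lines1, lines2):
    """True if both VCFs list the same samples and sites in the same order."""
    return vcf_layout(lines1) == vcf_layout(lines2)
```

For gzipped files, something like `gzip.open(path, "rt")` can supply the lines.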
usage: calc_imputed_gt_discord.py [-h] [-chr STRING] VCF1 VCF2 OUT
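As a rough illustration of the quantity being measured, the sketch below computes the genotype discordance at one site as the fraction of samples whose calls differ. This is an assumed definition, not necessarily the one calc_imputed_gt_discord.py uses: the function name `gt_discordance` is hypothetical, phasing and allele order are ignored, and samples with a missing call in either file are skipped.

```python
def gt_discordance(gts1, gts2):
    """Fraction of samples whose genotypes differ between two call sets."""
    if len(gts1) != len(gts2):
        raise ValueError("genotype lists must cover the same samples")

    def alleles(gt):
        # Treat phased and unphased calls alike and ignore allele order.
        return tuple(sorted(gt.replace("|", "/").split("/")))

    compared = mismatched = 0
    for a, b in zip(gts1, gts2):
        ga, gb = alleles(a), alleles(b)
        if "." in ga or "." in gb:
            continue  # skip samples with a missing call in either file
        compared += 1
        mismatched += ga != gb
    return mismatched / compared if compared else float("nan")

rate = gt_discordance(["0/0", "0/1", "1/1", "./."],
                      ["0|0", "1|0", "0|1", "0/0"])
```

Collecting this rate over all sites gives the distribution of discordance mentioned above.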