Calculate allele frequency from many VCF files in specific locus

Dear all,

I have 100 VCF files (100 different samples). I would like to calculate allele frequencies at specific sites.

At one specific locus I have three genotypes (called with the GATK Best Practices workflow):

rs-xxxxx:
A/A occurring in 30 samples (ref hom)
A/G occurring in 21 samples (het)
G/G occurring in 49 samples (alt hom)

The genotype frequencies would be:

A/A = 0.3
A/G = 0.21
G/G = 0.49

But how do I calculate the allele frequencies of A and G?

dbSNP defines this as: (sum of chromosome counts over all members) / (total chromosome counts over all members)
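
Applied to the counts at this locus, that definition can be sketched in Python as below. This is only a minimal illustration and makes several assumptions: the site is biallelic, all samples are diploid, and every one of the 100 samples has a called genotype (no missing calls). The function name `allele_frequencies` is made up for this example and does not come from GATK or any other tool.

```python
def allele_frequencies(n_ref_hom, n_het, n_alt_hom):
    """Return (ref_freq, alt_freq) for a biallelic site in diploid samples.

    Each sample contributes two chromosomes: a homozygote carries two
    copies of one allele, a heterozygote carries one copy of each.
    """
    total_chromosomes = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_chromosomes == 0:
        raise ValueError("no genotyped samples at this site")

    ref_count = 2 * n_ref_hom + n_het  # copies of the reference allele (A)
    alt_count = 2 * n_alt_hom + n_het  # copies of the alternate allele (G)
    return ref_count / total_chromosomes, alt_count / total_chromosomes


# Counts from the question: 30 A/A, 21 A/G, 49 G/G
freq_a, freq_g = allele_frequencies(30, 21, 49)
print(f"A = {freq_a:.3f}, G = {freq_g:.3f}")
```

With these counts, A is carried on 2*30 + 21 = 81 of 200 chromosomes and G on 2*49 + 21 = 119 of 200. Samples with a missing genotype would have to be left out of both the numerator and the denominator.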

Thank you for any instructive example.

Paul.