I have three VCFs, a child (male), a father and a mother, and I would like to extract de novo variants in the child.
All three samples were called separately, but with the same GATK pipeline. I ran RTG Tools to try to extract the de novo variants, using this command:
rtg mendelian -t /path/to/referencegenome.sdf --input trio.vcf.gz --lenient --output-inconsistent trio-non-mendelian.vcf.gz
According to the rtg manual:
Records where the presence of missing values makes the Mendelian consistency undecidable contain MCU INFO annotations in the annotated output VCF.
The following examples illustrate some consistent, undecidable, and inconsistent calls in the presence of missing
Below are the headers related to rtg in my output vcf:
##INFO=<ID=MCV,Number=.,Type=String,Description="Variant violates mendelian inheritance constraints">
##INFO=<ID=MCU,Number=.,Type=String,Description="Mendelian consistency status can not be determined">
Below are some annotated records from my VCF. I’m having difficulty understanding the output and would highly appreciate it if someone could explain how to interpret the resulting annotations.
INFO                           FORMAT                 Child                   Father               Mother
;MCV=Child:0/0+1/1->0/0        GT:AD:DP:GQ:PGT:PID:PL .                       .                    1/1:0,2:2:6:1|1:10433_A_ACCCTTAACCCCTAAC:90,6,0
;MCV=Child:1/1+0/0->1/1        GT:AD:DP:GQ:PL         1/1:0,5:5:15:157,15,0   1/1:0,3:3:9:93,9,0   .
;MCV=GGSpi005:0/0+0/0->0/1     GT:AD:DP:GQ:PL         0/1:4,5:9:72:99,0,72    .                    .
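To make the question concrete, here is how I am currently reading these values. This is only a minimal sketch of my own interpretation, not something taken from the RTG manual. Judging from the rows above, the MCV value seems to be laid out as SAMPLE:GT+GT->GT, where the first genotype lines up with the Father column, the second with the Mother column, and the last with the Child column, and a missing call (`.`) appears in the annotation as 0/0. The function names `parse_mcv` and `sample_gt` and the dictionary keys are my own, chosen for illustration.

```python
import re

# Assumed layout (inferred from the example rows, not from the manual):
#   SAMPLE:FATHER_GT+MOTHER_GT->CHILD_GT
MCV_PATTERN = re.compile(r"^(.+?):(.+?)\+(.+?)->(.+)$")


def parse_mcv(value):
    """Split one MCV annotation value into its four parts."""
    match = MCV_PATTERN.match(value)
    if match is None:
        raise ValueError("unexpected MCV value: " + value)
    sample, father, mother, child = match.groups()
    return {"sample": sample, "father": father, "mother": mother, "child": child}


def sample_gt(field):
    """Return the GT part of a sample column, or None for a missing call ('.')."""
    if field == ".":
        return None
    return field.split(":")[0]


# First example row: the child and father columns are '.', the mother is 1/1.
annotation = parse_mcv("Child:0/0+1/1->0/0")
columns = {
    "child": sample_gt("."),
    "father": sample_gt("."),
    "mother": sample_gt("1/1:0,2:2:6:1|1:10433_A_ACCCTTAACCCCTAAC:90,6,0"),
}

# List the family members whose genotype in the annotation does not come
# from an actual call in the sample columns.
uncalled = [member for member, gt in columns.items() if gt is None]
print(annotation, uncalled)
```

If this reading is right, the first row is flagged only because the uncalled child and father were treated as 0/0, so it would not be a real de novo candidate. Only rows where all three sample columns carry an actual call would be worth following up.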