Tool that can merge 2 VCF files while taking “representational ambiguity” of (multi-allelic) variants into account
Is there a tool that can merge 2 VCF files while taking “representational ambiguity” of multi-allelic variants into account?
Ideally, it would do so by:
- replaying all variant alleles from the 2 VCF files into the reference genome
- identifying which alleles are actually the same but just written down in a different way
- calculating what the best way is to represent the merged variants/alleles in a new (multi-allelic) variant
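To make the first two steps concrete, here is a minimal sketch of what I mean, not an existing tool. It assumes a toy reference window and made-up alleles; the helper names `replay` and `group_equivalent` are hypothetical. Each allele is applied to the reference on its own, and alleles that produce the same resulting sequence are treated as the same allele written in different ways.

```python
from collections import defaultdict

def replay(ref_seq, window_start, pos, ref, alt):
    """Apply one allele to a reference window and return the resulting sequence.

    ref_seq      -- reference bases of the window (assumed to cover the allele
                    plus enough flanking sequence, e.g. the whole repeat)
    window_start -- 1-based coordinate of ref_seq[0]
    pos, ref, alt -- VCF-style POS (1-based), REF and ALT
    """
    offset = pos - window_start
    if ref_seq[offset:offset + len(ref)] != ref:
        raise ValueError(f"REF {ref!r} does not match the reference at {pos}")
    return ref_seq[:offset] + alt + ref_seq[offset + len(ref):]

def group_equivalent(ref_seq, window_start, alleles):
    """Group alleles that yield the same sequence after replay."""
    groups = defaultdict(list)
    for allele in alleles:
        _source, pos, ref, alt = allele
        groups[replay(ref_seq, window_start, pos, ref, alt)].append(allele)
    return list(groups.values())

# Toy data (hypothetical): window starts at position 100.
ref_seq = "GCACACAT"
alleles = [
    ("a.vcf", 100, "GCA", "G"),  # CA deletion, written at the left of the repeat
    ("b.vcf", 104, "ACA", "A"),  # the same CA deletion, written further right
    ("b.vcf", 103, "C", "T"),    # an unrelated SNV
]

groups = group_equivalent(ref_seq, 100, alleles)
for group in groups:
    print(group)
```

The two deletions end up in one group because both replay to the same sequence, while the SNV stays on its own. The third step, choosing the best merged representation for each group, is not covered by this sketch.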
See also this related question and answer: "Should you decompose and normalize multi-allelic variants for comparison / ID assignment?"
The (multi-allelic) variants and their alleles differ between the two VCF files because:
- different technologies were used to generate the VCF files
- different alternative alleles are present in the samples
As far as I know, BCFtools merge does not take “representational ambiguity” of variants into account.
Would first decomposing and normalizing all variants to bi-allelic in both input VCF files, then merging and collapsing overlapping variants back to multi-allelic, destroy some information?
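To illustrate the kind of information I am worried about, here is a rough sketch of the decompose step on a single made-up record. The functions `trim` and `decompose` are hypothetical and simplified: `trim` only removes shared leading and trailing bases, whereas real normalization (for example with `bcftools norm` and a reference FASTA) also left-aligns indels. The `other` placeholder for alleles that belong to a different split record is an assumption; tools may handle this differently.

```python
def trim(pos, ref, alt):
    """Trim shared trailing, then leading, bases, keeping at least one base each."""
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    return pos, ref, alt

def decompose(pos, ref, alts, genotype, other="."):
    """Split one multi-allelic record into bi-allelic records.

    genotype -- tuple of allele indices (0 = REF, 1.. = ALT) for one sample
    other    -- placeholder for ALT alleles that end up in another split record
                (an assumption of this sketch)
    """
    records = []
    for i, alt in enumerate(alts, start=1):
        # Keep REF calls as 0, mark the allele of this record as 1,
        # and replace any other ALT allele with the placeholder.
        gt = tuple(1 if a == i else 0 if a == 0 else other for a in genotype)
        records.append((*trim(pos, ref, alt), gt))
    return records

# Hypothetical record: POS=100, REF=GCA, ALT=G,GCG, sample genotype 1/2
records = decompose(100, "GCA", ["G", "GCG"], (1, 2))
for record in records:
    print(record)
```

In this toy case the deletion stays at position 100 while the SNV moves to position 102, and each record only knows that the other haplotype carried "something else". Whether collapsing such records back into one multi-allelic variant restores the original 1/2 genotype is exactly what I am unsure about.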