One-hot encoding for PLINK or VCF
I want to write an autoencoder for SNP data. Is there an established way to one-hot encode binary PLINK or VCF input? I believe it can be done by manipulating PLINK’s .bed file directly, but I am afraid of getting it wrong.
By one-hot encoding I mean:
MISSING = [1000]
HOM_REF = [0100]
HET = [0010]
HOM_ALT = [0001]
Thanks!
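One possible approach, as a minimal sketch rather than an established tool: a variant-major PLINK 1 .bed file starts with three magic bytes and then stores each variant as a run of bytes holding four 2-bit genotypes each, so the codes can be unpacked with NumPy and mapped through a lookup table. The function name `one_hot_from_bed_bytes` and the table `ONE_HOT_BY_BED_CODE` are hypothetical, and the sketch assumes that A1 in the .bim file is ALT and A2 is REF, which depends on how the fileset was created and should be checked. `n_samples` and `n_variants` would come from the line counts of the .fam and .bim files.

```python
import numpy as np

# Column order follows the question: MISSING, HOM_REF, HET, HOM_ALT.
# Assumption: A1 in the .bim file is ALT and A2 is REF; swap rows 0 and 3 if not.
ONE_HOT_BY_BED_CODE = np.array(
    [
        [0, 0, 0, 1],  # 0b00: homozygous A1 -> HOM_ALT (assumed)
        [1, 0, 0, 0],  # 0b01: missing genotype
        [0, 0, 1, 0],  # 0b10: heterozygous
        [0, 1, 0, 0],  # 0b11: homozygous A2 -> HOM_REF (assumed)
    ],
    dtype=np.uint8,
)

BED_MAGIC = bytes([0x6C, 0x1B, 0x01])  # header of a variant-major .bed file


def one_hot_from_bed_bytes(raw, n_samples, n_variants):
    """Return a uint8 array of shape (n_variants, n_samples, 4)."""
    if raw[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed file")
    bytes_per_variant = (n_samples + 3) // 4  # each variant is padded to whole bytes
    body = np.frombuffer(raw, dtype=np.uint8, offset=3)
    if body.size != n_variants * bytes_per_variant:
        raise ValueError("size does not match n_samples and n_variants")
    body = body.reshape(n_variants, bytes_per_variant)
    # Each byte holds four genotypes; the lowest two bits belong to the first sample.
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    codes = (body[:, :, None] >> shifts) & 3
    codes = codes.reshape(n_variants, -1)[:, :n_samples]  # drop padding slots
    return ONE_HOT_BY_BED_CODE[codes]


# Toy input: one variant, five samples with codes 00, 01, 10, 11, 10.
toy_raw = BED_MAGIC + bytes([0b11100100, 0b00000010])
toy_one_hot = one_hot_from_bed_bytes(toy_raw, n_samples=5, n_variants=1)
print(toy_one_hot[0])
```

For a real fileset the bytes would be read with `open(path, "rb").read()`. For large data it may be preferable to process the file in chunks of variants, since the one-hot array is sixteen times larger than the packed .bed body when stored as uint8.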