Michigan imputation server input preparation

I’m running imputation on some WGS VCF files, and I’ve never done it before, so I started with chr22 (GRCh37).
I’m worried because I started with 594949 SNPs in the file, and after the prep (using the Will Rayner Perl script recommended here: imputationserver.readthedocs.io/en/latest/prepare-your-data/ ) only 243108 remain for input into the imputation server, which is a high attrition rate (about 59% removed)! I tried to do some poking around, though I don’t know Perl very well… best I can figure is that most of the ones removed do not have an rsID.
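
To check that guess rather than reading through the Perl, something like the sketch below could count how many records in the original VCF have a missing ID (a "." in the third column). This is only a rough check under my own assumptions: the function names and the command-line usage are made up for illustration, it only looks at the ID column, and it says nothing about what the prep script actually filters on.

```python
import gzip
import sys


def count_ids(lines):
    """Count VCF data records and how many of them have a missing ID ('.')."""
    counts = {"total": 0, "missing_id": 0}
    for line in lines:
        # Skip meta-information and header lines, plus blank lines.
        if line.startswith("#") or not line.strip():
            continue
        # VCF columns are tab-separated: CHROM, POS, ID, REF, ALT, ...
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # malformed record, ignore it
        counts["total"] += 1
        if fields[2] == ".":
            counts["missing_id"] += 1
    return counts


def count_ids_in_file(path):
    """Open a plain or gzip/bgzip-compressed VCF and count its IDs."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_ids(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_ids.py my_chr22.vcf.gz
    result = count_ids_in_file(sys.argv[1])
    print(f"records: {result['total']}, without ID: {result['missing_id']}")
```

If the "without ID" count is close to the number of SNPs that were dropped, that would support the missing-rsID explanation; if it is much smaller, something else is removing them.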

My question: Is it OK to have such a reduced set of SNPs before imputation? Should I try to add rsIDs to the variants before running this?


