Explained variance per PC is too low when running PCA on SNP data

I ran PCA on 91 samples (23 breeds plus one outgroup from a different subspecies).

About 18,000,000 SNPs were used in the PCA, but the variance explained was very low: about 5% for PC1 and 4% for PC2.
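For reference, here is a minimal sketch of how the per-PC share of variance is usually computed. It assumes a small in-memory genotype matrix coded as 0/1/2 allele counts with no missing calls; the function name `explained_variance_ratio` and the random toy data are made up for illustration and are not taken from the tool I actually ran.

```python
import numpy as np

def explained_variance_ratio(genotypes):
    """Share of total variance carried by each PC.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts,
    assumed to contain no missing values.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each SNP column
    # Squared singular values of the centered matrix are proportional
    # to the variance along each principal component.
    singular_values = np.linalg.svd(X, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Toy data only: 10 samples x 200 SNPs of random genotypes.
rng = np.random.default_rng(0)
toy = rng.integers(0, 3, size=(10, 200))
ratios = explained_variance_ratio(toy)
print(ratios[:2])  # shares for PC1 and PC2
```

With n samples, a centered matrix has at most n - 1 components with non-zero variance, so for 91 samples an even split would be roughly 1.1% per PC. The 5% and 4% values can be read against that baseline.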

I made a few attempts to increase the variance explained:

  1. Since the number of samples per breed differed, I removed breeds with too few or too many samples in order to equalize the sample sizes.

  2. I removed the outgroup to see what would happen, because it might affect the PCs too much. (I'm not sure this is OK, since I'm going to use the outgroup for drawing evolutionary trees.)

  3. I increased the minor allele frequency threshold from 0.01 to 0.05 to keep only variants of more moderate frequency.

  4. I increased the --max-nocall-fraction option of SelectVariants from 0.05 to 0.09 to make the imputation process more linked. (I'm not sure this is right; maybe I should have decreased the value.)
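To make the direction of the two thresholds in steps 3 and 4 concrete, here is a rough sketch of the same kind of filtering on a toy matrix. It is not what SelectVariants does internally; `filter_snps` and its parameter names are hypothetical, and missing calls are assumed to be coded as NaN.

```python
import numpy as np

def filter_snps(genotypes, min_maf=0.05, max_nocall_fraction=0.05):
    """Keep SNPs that pass a MAF floor and a missing-call ceiling.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts,
    with NaN marking a no-call (diploid samples assumed).
    """
    G = np.asarray(genotypes, dtype=float)
    missing = np.isnan(G)
    nocall_fraction = missing.mean(axis=0)
    called = (~missing).sum(axis=0)
    # Alternate allele frequency among called genotypes only.
    with np.errstate(invalid="ignore", divide="ignore"):
        alt_freq = np.nansum(G, axis=0) / (2 * called)
    maf = np.minimum(alt_freq, 1 - alt_freq)
    keep = (nocall_fraction <= max_nocall_fraction) & (maf >= min_maf)
    return G[:, keep], keep

nan = np.nan
toy = [
    [0, 1, 2, nan],
    [0, 1, 0, nan],
    [0, 0, 2, 1],
    [0, 1, 0, 1],
]
filtered, keep = filter_snps(toy)
print(keep)            # which SNP columns survive
print(filtered.shape)  # (samples, kept SNPs)
```

In this sketch, raising `min_maf` removes more SNPs, while raising `max_nocall_fraction` keeps more SNPs (those with more missing genotypes). So going from 0.05 to 0.09 loosens the missingness filter rather than tightening it.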

These changes made little difference; none of them clearly increased the variance explained. Is there something else I could try, or should I just show a few more PCs together?
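If showing more PCs is the way to go, one thing I could report is the cumulative share they cover. A small sketch, where only the first two values (5% and 4%) come from my run and the remaining ones are hypothetical placeholders; `pcs_needed` is a made-up helper name:

```python
from itertools import accumulate

def pcs_needed(ratios, target):
    """Smallest number of leading PCs whose cumulative share reaches
    target, or None if the listed PCs never reach it."""
    for k, total in enumerate(accumulate(ratios), start=1):
        if total >= target - 1e-12:  # small tolerance for float rounding
            return k
    return None

# PC1 and PC2 are from the post; PC3 to PC5 are hypothetical values.
ratios = [0.05, 0.04, 0.03, 0.03, 0.02]
cumulative = list(accumulate(ratios))

for pc, (share, total) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{pc}: {share:.1%} (cumulative {total:.1%})")
```

Here `pcs_needed(ratios, 0.10)` reports how many leading PCs are needed to cover 10% of the variance, which may be a clearer way to choose how many PC pairs to plot.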


variation


PCA


SNP
