SNP Pruning Through PCA (Edit: Feature Selection Through PCA)


Hello,

I have roughly 1 million SNPs from 700 individuals and I want to prune the SNPs down, potentially through PLINK's --pca command. However, I'm a little perplexed about how the eigenvalues/eigenvectors I get from --pca are supposed to be used to prune my SNPs. Or am I completely misunderstanding? Could anyone clarify?

Below is a sample of the vectors:
[image not preserved]

Values:
[image not preserved]

Edit: I want to leave the original post up, but to clarify further: from my ML experience, PCA can be used for feature selection, and I wish to do the same with the SNPs (apologies if 'pruning' means something different in bioinformatics).
Below is a sample of my variant weights:
[image not preserved]

In Python, PCA does the feature selection automatically once you've fitted and transformed the data. So is there a way of performing feature selection on the SNPs? For example, looking at each variant's first 3 weights and keeping only SNPs that have a minimum weight of 'X'?
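To make the thresholding idea concrete, here is a minimal sketch of what I have in mind. It assumes a whitespace-delimited variant-weights table where each row holds a chromosome, a variant ID, two alleles, and then one weight per principal component; that column layout, the function names, the cutoff of 1.0, and the example rows are all made up for illustration and would need adjusting to the real file.

```python
def parse_variant_weights(lines, id_col=1, first_weight_col=4):
    """Parse whitespace-delimited rows into (snp_id, [weights]) pairs.

    The column positions are assumptions; adjust them to the actual file.
    Blank lines and lines starting with '#' (headers) are skipped.
    """
    parsed = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        weights = [float(x) for x in fields[first_weight_col:]]
        parsed.append((fields[id_col], weights))
    return parsed


def select_snps(parsed, n_pcs=3, min_abs_weight=1.0):
    """Keep SNP IDs whose largest |weight| over the first n_pcs PCs meets the cutoff."""
    kept = []
    for snp_id, weights in parsed:
        top = max((abs(w) for w in weights[:n_pcs]), default=0.0)
        if top >= min_abs_weight:
            kept.append(snp_id)
    return kept


# Made-up rows for illustration only: chromosome, ID, two alleles, PC weights.
example_lines = [
    "1 rs0001 A G 0.10 -2.50 0.30 0.00",
    "1 rs0002 C T 0.20 0.10 -0.40 3.00",
    "2 rs0003 G A -1.20 0.05 0.00 0.10",
]
selected = select_snps(parse_variant_weights(example_lines), n_pcs=3, min_abs_weight=1.0)
print(selected)
```

Here rs0002 is dropped because its only large weight is on the fourth component, which falls outside the first three. If something like this is reasonable, I assume the kept IDs could be written one per line to a file and passed to PLINK's --extract to subset the data.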


Tags: Pruning, PCA, PLINK, SNPs, bioinformatics
