Data Imputation for performing UMAP

Hi guys!

I am currently working on a dataset of gene IDs, their expression values, and patient IDs. I want to use UMAP to process the data and compare the results with a previous study, which used K-means clustering.

At the moment my data frame contains NA values, and UMAP cannot process those because it expects every value to be numeric. I thought about replacing them with zero, but an NA is not a zero. An NA means the gene was not detected for some reason, which does not mean it had no expression. I also cannot simply remove those gene IDs, because a gene may have expression values in some patients and not in others (colnames = patient ID; rownames = gene ID).
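To make the layout concrete, here is a minimal base R sketch with made-up values (the gene and patient names and the numbers are placeholders, not my real data). It only shows how the NAs are distributed per gene and assumes that patients are the points to embed, so the matrix is transposed before it would go to UMAP. It does not impute anything.

```r
# Toy expression matrix (made-up values): rows = gene IDs, columns = patient IDs
expr <- matrix(
  c(5.1,  NA, 4.8, 5.0,
    2.3, 2.9,  NA, 2.7,
     NA,  NA,  NA, 1.2,
    7.4, 7.1, 7.8, 7.6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("patient", 1:4))
)

# Fraction of patients with a missing value, per gene
na_frac <- rowMeans(is.na(expr))

# Genes with at least one NA; these are the rows that block UMAP
genes_with_na <- names(na_frac)[na_frac > 0]

# Assumption: patients are the observations, so they become rows
umap_input <- t(expr)

# Still contains NA, so it cannot be passed to UMAP as-is
anyNA(umap_input)
```

In my real data most of the affected genes look like `gene1` and `gene2` here, missing in only some patients, which is why dropping whole rows feels wasteful.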

Information on Google is very limited. I have stumbled across a relatively new imputation method called ALRA (www.nature.com/articles/s41467-021-27729-z), but I am still reading about it and am not sure whether it is appropriate for my type of data.

Do you guys have any suggestions?


R


Imputation


UMAP
