I’m finding that a gene symbol maps to different Ensembl IDs depending on whether it appears as a ‘gene name’ or as a ‘gene synonym’ in Ensembl. My understanding was that Ensembl IDs are unique: there is one location on the genome for each gene, so I’m unclear how a symbol can have more than one Ensembl ID depending on whether it is a ‘gene name’ or a ‘gene synonym’.
Context for the problem:
I got a list of genes from a collaborator and I wanted to look at their expression in a new dataset.
Gene names can differ between datasets (e.g. one dataset may use a gene synonym), and information is lost if those differences are not accounted for. To make sure I’m not missing anything from my dataset, I collected all the synonyms for each gene from the ‘Gene Synonyms’ field in Ensembl.
For example, one gene on the list was ‘SLC6A2’. Its Ensembl entry lists the synonyms ‘NAT1’, ‘SLC6A5’, and ‘NET1’, all under the same Ensembl ID, ‘ENSG00000103546’.
But if I look up NAT1, NET1, and SLC6A5 on Ensembl separately, I also see a separate Gene stable ID for each of them (‘ENSG00000171428’, ‘ENSG00000173848’, ‘ENSG00000165970’, respectively).
So I’m wondering how NAT1/SLC6A5/NET1 can be synonyms of SLC6A2 if they each have their own Gene stable ID in addition to sharing one with SLC6A2.
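To make the conflict concrete, here is a minimal Python sketch that uses only the IDs quoted above. The two dictionaries and the `ids_for_symbol` helper are names I made up for illustration; everything is hard-coded, and none of it is an Ensembl API.

```python
# Gene name -> Ensembl gene ID, using only the IDs quoted in this post.
BY_NAME = {
    "SLC6A2": "ENSG00000103546",
    "NAT1": "ENSG00000171428",
    "NET1": "ENSG00000173848",
    "SLC6A5": "ENSG00000165970",
}

# Gene synonym -> Ensembl gene ID of the gene that lists it as a synonym.
BY_SYNONYM = {
    "NAT1": "ENSG00000103546",
    "SLC6A5": "ENSG00000103546",
    "NET1": "ENSG00000103546",
}


def ids_for_symbol(symbol):
    """Return every Ensembl ID a symbol points to, keyed by how it matched."""
    hits = {}
    if symbol in BY_NAME:
        hits["gene name"] = BY_NAME[symbol]
    if symbol in BY_SYNONYM:
        hits["gene synonym"] = BY_SYNONYM[symbol]
    return hits


# Show, for each symbol, which IDs it resolves to and via which field.
for symbol in ["SLC6A2", "NAT1", "NET1", "SLC6A5"]:
    print(symbol, ids_for_symbol(symbol))
```

SLC6A2 matches only as a gene name, while NAT1, NET1, and SLC6A5 each match twice: once as a gene name with their own ID, and once as a synonym pointing at the SLC6A2 ID.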
Do I just use SLC6A2 in my analysis or also account for all the other gene synonyms?
I know there’s been discussion on this platform, but is ‘gene name’ synonymous with ‘HUGO’ nomenclature? Why am I getting different Ensembl gene IDs for a given gene symbol?
I think my issue is different because I see only one ENSEMBL ID per ‘gene name’, but if the same name is in ‘gene synonym’ I get a different ENSEMBL ID.