Looking up Gene IDs in R
Hello,
Given a list of gene names, I need to create a table containing the Ensembl ID, chromosome, start, and end of each gene.
Example:
##             ens_id        gene            view  chr start   end
## 1: ENSG00000243485 MIR1302-2HG Gene Expression chr1 29553 30267
## 2: ENSG00000237613     FAM138A Gene Expression chr1 36080 36081
## 3: ENSG00000186092       OR4F5 Gene Expression chr1 65418 69055
What command can I use to look up the Ensembl IDs and start/end positions of genes?
The biomaRt library is great for this.
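library("biomaRt")

# Gene symbols to look up
genes <- c("MIR1302-2HG", "FAM138A", "OR4F5")

# Connect to the Ensembl "genes" mart, human gene dataset
ensembl <- useEnsembl("genes", "hsapiens_gene_ensembl")

# One row per matching gene; a named list for `filters` supplies the values too
gene_info <- getBM(
    mart       = ensembl,
    attributes = c("ensembl_gene_id", "external_gene_name", "gene_biotype",
                   "chromosome_name", "start_position", "end_position", "strand"),
    filters    = list(external_gene_name = genes))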
> gene_info
  ensembl_gene_id external_gene_name   gene_biotype chromosome_name
1 ENSG00000243485        MIR1302-2HG         lncRNA               1
2 ENSG00000237613            FAM138A         lncRNA               1
3 ENSG00000186092              OR4F5 protein_coding               1
  start_position end_position strand
1          29554        31109      1
2          34554        36081     -1
3          65419        71585      1
See the documentation for more information.
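
The getBM() result uses Ensembl's column names and bare chromosome names ("1"), while the table in the question uses ens_id/gene/chr/start/end and "chr1". Here is a minimal base R sketch of one way to reshape it. The gene_info data frame is rebuilt by hand from the output shown above so the snippet runs without a network call. The name gene_table and the "chr" prefixing are my own assumptions about the layout you want, not something biomaRt provides.

```r
# Values copied from the getBM() output above (no BioMart query needed here).
gene_info <- data.frame(
  ensembl_gene_id    = c("ENSG00000243485", "ENSG00000237613", "ENSG00000186092"),
  external_gene_name = c("MIR1302-2HG", "FAM138A", "OR4F5"),
  gene_biotype       = c("lncRNA", "lncRNA", "protein_coding"),
  chromosome_name    = c("1", "1", "1"),
  start_position     = c(29554L, 34554L, 65419L),
  end_position       = c(31109L, 36081L, 71585L),
  strand             = c(1L, -1L, 1L),
  stringsAsFactors   = FALSE
)

# Keep only the columns the question asks for, under the question's names.
# gene_table is a hypothetical name chosen for this example.
gene_table <- data.frame(
  ens_id = gene_info$ensembl_gene_id,
  gene   = gene_info$external_gene_name,
  # Assumption: UCSC-style names are wanted, so "1" becomes "chr1".
  # Ensembl's mitochondrial "MT" would need a separate rename to "chrM".
  chr    = paste0("chr", gene_info$chromosome_name),
  start  = gene_info$start_position,
  end    = gene_info$end_position,
  stringsAsFactors = FALSE
)

print(gene_table)
```

The coordinates will not necessarily match the example table in the question, since they depend on the Ensembl release the mart points at and on whether the other source reports 0-based starts.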