I am trying to use the R package groHMM to analyse some data. The workflow asks me to collapse overlapping annotations, but I am working with Arabidopsis rather than human data. groHMM uses the R package GenomicFeatures for annotations, and I am using TxDb.Athaliana.BioMart.plantsmart28. The overlapping transcripts need to be merged into “a single set, in which each annotation represents the 5′ and 3′ most boundaries of genes”. I do not know how to do this in R, and the example code only works for human data (hg19). If anyone knows how to get it to work for the Arabidopsis data, that would be really helpful.
The example code in the package works like this:
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
kgdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
kgChr7 <- transcripts(kgdb, filter=list(tx_chrom = "chr7"), columns=c("gene_id", "tx_id", "tx_name"))
library(org.Hs.eg.db)
kgConsensus <- makeConsensusAnnotations(kgChr7, keytype="gene_id", mc.cores=getOption("mc.cores"))
map <- select(org.Hs.eg.db, keys=unlist(mcols(kgConsensus)$gene_id), columns=c("SYMBOL"), keytype=c("ENTREZID"))
My attempt at using a similar Arabidopsis database fails:
map <- select(org.At.tair.db, keys=unlist(mcols(atConsensus)$gene_id), columns=c("SYMBOL"), keytype=c("ENTREZID"))
Error in .testForValidKeys(x, keys, keytype, fks) :
None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.
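A possible cause of this error, offered as an untested sketch: the gene_id values in TxDb.Athaliana.BioMart.plantsmart28 appear to be TAIR locus identifiers (for example AT1G01010) rather than Entrez IDs, so keytype="TAIR" may be what select() needs here. The sketch assumes that, and also assumes that the chromosome is named "1" rather than "chr1" in this TxDb and that atConsensus is built the same way as kgConsensus in the human example. The object name atChr1 is only illustrative.

```r
library(groHMM)
library(TxDb.Athaliana.BioMart.plantsmart28)
library(org.At.tair.db)

atdb <- TxDb.Athaliana.BioMart.plantsmart28

# Assumption: chromosomes are named "1".."5", "Mt", "Pt" in this TxDb;
# check with seqlevels(atdb) before filtering.
atChr1 <- transcripts(atdb, filter=list(tx_chrom = "1"),
                      columns=c("gene_id", "tx_id", "tx_name"))

# Same call as in the human example, applied to the Arabidopsis transcripts.
atConsensus <- makeConsensusAnnotations(atChr1, keytype="gene_id",
                                        mc.cores=getOption("mc.cores"))

# Inspect the identifiers and the key types the annotation package accepts.
head(unlist(mcols(atConsensus)$gene_id))
keytypes(org.At.tair.db)

# Assumption: the gene_id values are TAIR locus IDs, so use keytype "TAIR".
map <- select(org.At.tair.db, keys=unlist(mcols(atConsensus)$gene_id),
              columns=c("SYMBOL"), keytype=c("TAIR"))
```

If head() shows identifiers in a different format, the keytype would need to match whichever entry of keytypes(org.At.tair.db) corresponds to them.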
Even if I cannot get this exact code to work, anything that gives me a correct set of collapsed transcripts that groHMM can accept would be great.