I have counts data (processed already) and I want to get the lengths of the genes from Biomart, in order to normalize the data to TPM.
I’ve done this already many times in the past, and now I have new data, with 50K genes.
This is the code, and it worked fine before:
library(biomaRt)
# get_gene_lengths is a placeholder name; the original wrapper function is not shown
get_gene_lengths = function(counts) {
  ensembl = useEnsembl(biomart="ensembl", dataset="hsapiens_gene_ensembl")
  genelength = getBM(attributes=c('ensembl_gene_id','ensembl_transcript_id', 'transcript_length','cds_length'), filters="ensembl_gene_id", values = rownames(counts), mart = ensembl, useCache = FALSE)
  # second query: flag which transcript of each gene is the canonical one
  gene_canonical_transcript = getBM(attributes=c('ensembl_gene_id','ensembl_transcript_id','transcript_is_canonical'), filters="ensembl_gene_id", values = rownames(counts), mart = ensembl, useCache = FALSE)
  gene_canonical_transcript_subset = gene_canonical_transcript[!is.na(gene_canonical_transcript$transcript_is_canonical),]
  # keep only the lengths of the canonical transcripts
  genelength = merge(gene_canonical_transcript_subset, genelength, by = c("ensembl_gene_id", "ensembl_transcript_id"))
  return(genelength)
}
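For context, the TPM step that these lengths feed into can be sketched roughly as follows. This is only an illustration: `counts_to_tpm`, `lengths_bp` and the toy matrix are hypothetical names and data, and the sketch assumes the length vector is already in the same order as the rows of the counts matrix.

```r
# Hypothetical helper: convert a genes x samples count matrix to TPM.
# lengths_bp must hold one length (in bp) per row of counts, in row order.
counts_to_tpm <- function(counts, lengths_bp) {
  stopifnot(nrow(counts) == length(lengths_bp))
  rate <- counts / (lengths_bp / 1000)   # reads per kilobase, gene by gene
  t(t(rate) / colSums(rate)) * 1e6       # rescale each sample to sum to one million
}

# Toy data, made up for illustration only
toy_counts <- matrix(c(10, 20, 30, 0, 5, 15), nrow = 3,
                     dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
toy_lengths <- c(1000, 2000, 3000)
tpm <- counts_to_tpm(toy_counts, toy_lengths)
```

With real data, the `transcript_length` column of the merged table would first have to be matched to `rownames(counts)` before being passed in.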
Now I’m getting this error:
Error in curl::curl_fetch_memory(url, handle = handle) :
timeout was reached: [uswest.ensembl.org:443] connection timed out after 10000 milliseconds
What is the problem? I know the server doesn’t always respond because it’s under heavy use, but this is a new kind of error that I’ve never seen before.
Are there any alternatives, such as another package?
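One thing that might be worth trying, since the error names the uswest host, is connecting through a different mirror. The sketch below is an assumption-laden illustration rather than a confirmed fix: `try_mirrors` is a hypothetical helper, the mirror names are those accepted by the `mirror` argument of `useEnsembl()`, and which mirrors are actually online changes over time.

```r
# Hypothetical helper: call connect(mirror) for each mirror in turn and
# return the first result that does not raise an error.
try_mirrors <- function(connect, mirrors = c("www", "useast", "asia", "uswest")) {
  for (m in mirrors) {
    result <- tryCatch(connect(m), error = function(e) {
      message("mirror '", m, "' failed: ", conditionMessage(e))
      NULL
    })
    if (!is.null(result)) return(result)
  }
  stop("all mirrors failed")
}

# Intended use with biomaRt (needs network access, so left commented out):
# library(biomaRt)
# ensembl <- try_mirrors(function(m)
#   useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl", mirror = m))
```

The connection step is passed in as a function so the retry logic can be checked without touching the network. Whether this helps depends on the timeout really being specific to one mirror.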
Thank you.