Timeout error using Biomart to get gene lengths

I have already-processed count data and want to get the gene lengths from BioMart (via the biomaRt R package) in order to normalize the counts to TPM.

I've done this many times before, and now I have a new dataset with 50K genes.

This is the code, and it worked fine before:

  library(biomaRt)
  # Body of my helper function; counts has Ensembl gene IDs as row names.
  ensembl = useEnsembl(biomart="ensembl", dataset="hsapiens_gene_ensembl")
  # Transcript and CDS lengths for every transcript of each gene
  genelength = getBM(attributes=c('ensembl_gene_id','ensembl_transcript_id', 'transcript_length','cds_length'), filters="ensembl_gene_id", values = rownames(counts), mart = ensembl, useCache = FALSE)
  # Canonical-transcript flag for the same genes; rows where the flag is NA are dropped
  gene_canonical_transcript = getBM(attributes=c('ensembl_gene_id','ensembl_transcript_id','transcript_is_canonical'), filters="ensembl_gene_id", values = rownames(counts), mart = ensembl, useCache = FALSE)
  gene_canonical_transcript_subset = gene_canonical_transcript[!is.na(gene_canonical_transcript$transcript_is_canonical),]
  # Keep only the lengths of each gene's canonical transcript
  genelength = merge(gene_canonical_transcript_subset, genelength, by = c("ensembl_gene_id", "ensembl_transcript_id"))
  return(genelength)
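
For context, the TPM normalization that these lengths feed into can be sketched roughly as follows. This is a minimal illustration on made-up toy data, not the actual pipeline: the function name counts_to_tpm, the toy matrix, and the named vector lengths_bp are all hypothetical, and it assumes one length (in base pairs) per gene, named by the same IDs as the rows of the count matrix.

```r
# Hypothetical toy inputs: 3 genes x 2 samples, gene lengths in base pairs.
counts <- matrix(c(100, 200, 300, 50, 50, 100), nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
lengths_bp <- c(geneA = 1000, geneB = 2000, geneC = 3000)

# Hypothetical helper: convert a gene x sample count matrix to TPM.
counts_to_tpm <- function(counts, lengths_bp) {
  # Every gene in the count matrix needs a length.
  stopifnot(all(rownames(counts) %in% names(lengths_bp)))
  # Put the lengths in the same order as the rows of the count matrix.
  lengths_bp <- lengths_bp[rownames(counts)]
  # Reads per kilobase: divide each gene's counts by its length in kb.
  rpk <- counts / (lengths_bp / 1000)
  # Scale each sample (column) so that its values sum to one million.
  sweep(rpk, 2, colSums(rpk), "/") * 1e6
}

tpm <- counts_to_tpm(counts, lengths_bp)
```

Each column of tpm sums to one million, so values are comparable across samples. In the real data, lengths_bp would be built from the transcript_length column of the merged table above.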

Now I’m getting this error:

Error in curl::curl_fetch_memory(url, handle = handle) : 
timeout was reached: [uswest.ensembl.org:443] connection timed out after 10000 milliseconds

What is the problem? I know the server doesn't always respond because it is under constant load, but this is a new kind of error that I have never seen before.

Are there any alternatives, such as another package?

Thank you.
