Retrieving Phytozome data using the R/Bioconductor package biomaRt

The short answer is that, for now, I think you have to bypass some of the biomaRt functions and create a Mart object yourself. Give this a try:

library(biomaRt)

# Build the Mart object directly, pointing at the HTTPS endpoint on port 443
phytozomeMart <- new("Mart",
                     biomart = "phytozome_mart",
                     vschema = "zome_mart",
                     host = "https://phytozome.jgi.doe.gov:443/biomart/martservice")

The rest of your code should work using this object now, e.g.:

mysets <- listDatasets(phytozomeMart)
mydataset <- mysets$dataset[mysets$dataset == "phytozome"]
myusemart <- useDataset(as.character(mydataset), mart = phytozomeMart)

allattributes <- listAttributes(mart = myusemart)
resultTable <- getBM(attributes = "organism_name", mart = myusemart)

Checking the content:

> resultTable[1:5, ,drop = FALSE]
    organism_name
1 Smoellendorffii
2         Cpapaya
3       Rcommunis
4        Csativus
5       Vvinifera

As for why this is necessary, it looks like this instance of BioMart automatically redirects you to an HTTPS server when you run the query, so you need to access port 443. Although you can provide a port argument to useMart, it is currently overridden if there’s a port specified in the registry on the server. For this mart, the registry is located at phytozome.jgi.doe.gov/biomart/martservice?type=registry, and it references port 80 throughout. Since biomaRt expects that document to be up to date, it uses the values there and the query fails. This is obviously annoying, so I’ll have a think about how best to deal with this situation.
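
To make that precedence concrete, here is a minimal sketch in base R. It is not biomaRt's actual code: the registry entry is a made-up example that only imitates the general shape of a BioMart registry line, and get_attr and resolve_port are hypothetical helpers written for illustration.

```r
# Hypothetical registry entry; the values are illustrative and not copied
# from the Phytozome server.
registry_entry <- '<MartURLLocation host="phytozome.jgi.doe.gov" name="phytozome_mart" path="/biomart/martservice" port="80" serverVirtualSchema="zome_mart" visible="1" />'

# Hypothetical helper: pull one attribute value out of the entry with a
# regular expression. A real XML parser would be more robust.
get_attr <- function(entry, attr) {
  pattern <- paste0(" ", attr, "=\"([^\"]*)\"")
  found <- regmatches(entry, regexec(pattern, entry))[[1]]
  if (length(found) < 2) NA_character_ else found[2]
}

# Hypothetical helper mirroring the behaviour described above: a port found
# in the registry replaces whatever the user asked for.
resolve_port <- function(user_port, registry_port) {
  if (!is.na(registry_port)) as.integer(registry_port) else as.integer(user_port)
}

registry_port <- get_attr(registry_entry, "port")
effective_port <- resolve_port(user_port = 443, registry_port = registry_port)
print(effective_port)  # the registry's 80 wins over the requested 443
```

Under these assumptions the requested port is only honoured when the registry says nothing about one, which is why constructing the Mart object by hand with the port in the host URL sidesteps the problem.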
