Proper way(s) to perform enrichment analysis in R

I am not sure what is the proper way to carry out over-representation analysis (and also gene set enrichment analysis) for RNAseq data. Ideally, the analysis can be performed in R, otherwise, if the software/ platform can export the output file (also include all the non-statistical-significant term) will also be sufficient.

I come up with this question is mainly because I recently discovered a website:

I found that this website host a list of pathway/ ontology databases (quite comprehensively). However, the p-value calculated there cannot be reproduced with two other methods I previously used: using the R package clusterProfiler and the web of g:Profiler. Even more surprising to me, the results from clusterProfiler and g:Profiler are also not the same, i.e. I got three distinct results from three analysis platforms while I am using the same gene list, and same gmt file.

For reproducibility purposes (only the R section), here are the codes I used for clusterProfiler (I just recently upgraded to the latest version (v4.0.5):

interests <- c("TNF", "PTGS2", "ZFP36", "GHRL", "CDH6", "TNFRSF1A", "IL6", "CXCL8", "CD27", "PTHLH", "IL1B", "CCL2", "PPARA", "CCL3", "AGT", "CCL5", "IL10", "ICAM1", "EDN1", "CAT", "CXCL10", "NOS2", "COL3A1",  "IL1RN", "TIMP1", "FABP4", "MMP3", "RELA", "COL1A1", "VCAM1", "CALCA", "CASP3", "TGFB1", "SERPINE1", "IL18", "IGF1", "VEGFA", "SLC22A1", "SOD2", "LPL", "NOS3", "PTGS1", "IFNG", "FN1", "MMP8")
wp <- read.gmt(url(""))
ewp <- enricher(interests, TERM2GENE=wp[,c("term", "gene")], TERM2NAME=wp[,c("term", "term")])

Then, the most significant term is platelet alpha granule lumen (GO:0031093) with pvalue = 1.658067e-8, p.adjust = 8.456139e-7, and qvalue = 6.283200e-7

I then use the same gene list and query against the g:Profiler by manually uploading the same gmt file. The platelet alpha granule lumen (GO:0031093) has p.adjust = 1.642×10-6.

Finally, the original, enrichr returns the pavlue = 9.476e-9 and p.adjust = 5.780e-7.

Does anyone know what is the underlying cause of the differences? I tried to look up the underlying formula for the clusterProfiler, and find that it uses

pvalues <- apply(args.df, 1, function(n) phyper(n[1], n[2], n[3], n[4], lower.tail = FALSE))

More questions are thus raised:

  1. Should we use a two-tail test for the null hypothesis? I read the article Isabelle Rivals, et al mentioned we probably should not use a one-tail test.
  2. Assuming the one-tail test is justified, how can enrichr get a much lower p-value (I assume maybe the g:Profiler used the two-tail test, but not sure where to check their internal algorithm)?
  3. enrichr stated that they use fisher’s exact test, but the results are obviously off. I am not sure why.
  4. Generally, do you have advice on where to perform the GSEA instead of the over-representation analysis I did here.

Read more here: Source link