Well, years after the original question, I think this is worth a revisit. While I agree with Kevin’s answer fully, it’s interesting to see how python has progressed in recent years to catch up with R on the statistical and visualization side of things. As others have mentioned, numpy, pandas, and scipy give python a huge amount of flexibility in terms of data manipulation and statistical analysis. It’s true that it doesn’t have many of the purely ‘omics-based packages that R does, but more and more are being ported to python.
Really, the languages complement each other very nicely, in my opinion. Data munging and handling common file formats are easy in python, particularly with pysam, pybedtools, pyvcf, and pandas, and rpy2 allows you to access those powerful R stats/modeling packages from within python.
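To give a rough idea of what I mean by easy data munging, here is a minimal sketch using pandas on a few BED-style intervals. The interval data and column names are made up for illustration, and the text is read from an in-memory string rather than a real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical BED-like intervals (chrom, start, end, name), invented for this example.
bed_text = "chr1\t100\t200\tpeakA\nchr1\t150\t400\tpeakB\nchr2\t50\t75\tpeakC\n"

# BED files have no header row, so the column names are supplied explicitly.
bed = pd.read_csv(
    io.StringIO(bed_text),
    sep="\t",
    header=None,
    names=["chrom", "start", "end", "name"],
)

# BED coordinates are 0-based and half-open, so the length is simply end - start.
bed["length"] = bed["end"] - bed["start"]

# Per-chromosome summary: interval count and summed length
# (overlapping intervals are counted separately here, not merged).
summary = bed.groupby("chrom")["length"].agg(["count", "sum"])
print(summary)
```

For a real file you would pass the path to `pd.read_csv` instead of the `StringIO` object. Anything needing proper interval arithmetic (merging, intersecting) is better left to pybedtools.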
I feel confident in saying python matches R’s visualization capabilities at this point in time. I’ve never had a moment where I felt I had to go to R to create the figure I wanted. Seaborn and plot.ly make it very easy to create high-quality, interactive figures without having to fiddle with matplotlib parameters much (if at all). Couple these with ipython and you’ve got some really interesting ways to explain and interactively wade through your data.
Python package installation has also come a long way with the advent of pip, anaconda, and bioconda. Similar to R, nearly every python package is a one-line install. This seemed to be a big complaint of many people 4 years ago, but it’s been largely resolved now.
I don’t see R going anywhere due to all of the packages made specifically to handle analysis of sequencing experiments, network interactions, etc. Python can do those things perfectly well, but why reinvent the wheel when you can pull it out and stick it on your own car whenever you need?
Overall, I feel python has established itself as an important player in bioinformatics for years to come. Part of this is due to its incredibly easy-to-pick-up syntax, general flexibility, and extremely active developer base. I personally hate R as a language, but there’s no denying its status as the backbone of statistical analysis in the bioinformatics community. Of course, that doesn’t mean we have to interact with it any more than is necessary or that python won’t continue to make advances to help bridge the gap.