How to compare Rpy2, pyrserve and PypeR?

I would like to access R from a Python program. I am aware of rpy2, pyRserve and PypeR.

What are the advantages or disadvantages of these three options?

+49
python r rpy2 pyrserve pyper
Apr 12
4 answers

I know one of the three better than the others, but in the order given in the question:

rpy2:

  • C-level interface between Python and R (R runs as an embedded process)
  • R objects are exposed to Python without having to copy the data over
  • Conversely, Python numpy arrays can be exposed to R without making a copy
  • Low-level interface (close to the R C-API) and high-level interface (for convenience)
  • In-place modification of vectors and arrays is possible
  • R callback functions can be implemented in Python
  • Anonymous R objects can be given a Python label
  • Python pickling is possible
  • Full customization of R's behavior through its console (so it is possible to implement a full R GUI)
  • Limited support on MS Windows
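
A minimal sketch of the embedded approach, assuming rpy2 and a matching R installation are available. The helper name `sum_in_r` is hypothetical, and it returns None when rpy2 cannot be loaded so that the snippet still runs on a machine without R:

```python
def sum_in_r(values):
    """Sum `values` with R's sum(); return None if rpy2 or R cannot be loaded."""
    try:
        import rpy2.robjects as robjects
    except Exception:  # rpy2 or R itself is missing or misconfigured
        return None
    # The vector lives in the embedded R process; Python only holds a reference.
    r_vector = robjects.FloatVector(values)
    # Look up R's built-in sum() and call it like a Python function.
    r_sum = robjects.r["sum"]
    # R returns a length-1 vector; take its single element.
    return float(r_sum(r_vector)[0])


print(sum_in_r([1.0, 2.0, 3.0]))
```

Because R runs inside the Python process here, no data has to be serialized to call `sum()`, which is the main point of the first three bullets.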

pyRserve:

  • native Python code (will / should / may work with CPython, Jython, IronPython)
  • uses R's Rserve
  • advantages and disadvantages that come with remote computation and with Rserve
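
A rough sketch of the client/server approach, assuming pyRserve is installed and an Rserve daemon is listening on localhost at port 6311 (both the host and the port are assumptions here). The helper name `sum_via_rserve` is hypothetical, and it returns None when no server can be reached:

```python
def sum_via_rserve(values, host="localhost", port=6311):
    """Evaluate sum(c(...)) on an Rserve daemon; return None if unreachable."""
    try:
        import pyRserve
        connection = pyRserve.connect(host=host, port=port)
    except Exception:  # pyRserve not installed, or no Rserve daemon reachable
        return None
    try:
        # The expression is sent as text and evaluated by the remote R process.
        expression = "sum(c(%s))" % ", ".join(repr(float(v)) for v in values)
        return float(connection.eval(expression))
    finally:
        connection.close()


print(sum_via_rserve([1.0, 2.0, 3.0]))
```

The R process is separate from Python in this setup, so the Python side does not have to be built against a particular R version, at the cost of moving data over a socket.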

PypeR:

  • native Python code (will / should / may work with CPython, Jython, IronPython)
  • uses pipes to have Python communicate with R (with the advantages and disadvantages that come with that)
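
The pipe mechanism itself can be sketched with the standard library alone. This is not PypeR's API, only an illustration of the idea: the helper `run_through_pipe` is hypothetical, and the Python interpreter stands in for R as the child process so that the snippet runs without R installed. With R available, one would presumably pass an R command line (something like `R --vanilla --slave`) and an R script instead:

```python
import subprocess
import sys


def run_through_pipe(command, script):
    """Send `script` to the child's stdin and return what it prints to stdout."""
    completed = subprocess.run(
        command,
        input=script,          # written to the child's stdin pipe
        capture_output=True,   # stdout and stderr are read back through pipes
        text=True,
        check=True,            # raise if the child exits with an error
    )
    return completed.stdout.strip()


# Stand-in child process: a Python interpreter reading its program from stdin.
STAND_IN_COMMAND = [sys.executable, "-"]

output = run_through_pipe(STAND_IN_COMMAND, "print(sum([1.0, 2.0, 3.0]))\n")
print(output)
```

Everything crosses the pipe as text, so results have to be parsed on the Python side. On the other hand, the child interpreter is an ordinary process, and the memory it used goes back to the operating system when it exits.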

edit: Windows support for rpy2

+32
Apr 13 '11 at 1:19

From the article on PypeR in the Journal of Statistical Software:

RPy provides a simple and efficient way to access R from Python. It is robust and very convenient for frequent interactions between Python and R. This package allows Python programs to pass Python objects of basic data types to R functions and to get the results back as Python objects. Such features make it an attractive solution for cases where Python and R interact frequently. However, this package still has limitations, as listed below.
Performance:
RPy may not behave very well for large data sets or for computation-intensive tasks. A lot of time and memory is inevitably consumed in producing a Python copy of the R data, because in each round of a conversation RPy converts the returned value of an R expression into a Python object of basic types or a NumPy array. RPy2, a recently developed branch of RPy, uses Python objects to refer to R objects instead of copying them back into Python objects. This strategy avoids frequent data conversions and improves speed. However, memory consumption remains a problem. [...] When we were implementing WebArray (Xia et al., 2005), an online platform for microarray data analysis, a job consumed roughly one quarter more computational time when R was run through RPy rather than through R's command-line user interface. Therefore, we decided to run R from Python through pipes in subsequent developments, for example WebArrayDB (Xia et al., 2009), which retained the same performance as running R independently. We do not know the exact reason for this difference in performance, but we noticed that RPy directly uses the shared library of R to run R scripts. In contrast, running R through pipes means running the R interpreter directly.
Memory:
R has been criticized for its uneconomical use of memory. The memory used by large R objects is rarely released after these objects are deleted. Sometimes the only way to release memory from R is to quit R. The RPy module wraps R in a Python object. However, the R library stays in memory even if the Python object is deleted. In other words, memory used by R cannot be released until the host Python script terminates.
Portability:
As a module with extensions written in C, the RPy source package has to be compiled against a specific version of R on POSIX (Portable Operating System Interface for Unix) systems, and R must be compiled with the shared library enabled. In addition, the binary distributions for Windows are bound to specific combinations of Python and R versions, so users quite often have difficulty finding a distribution that fits their software environment.

+14
Apr 12

With PypeR, I could not pass a large matrix from Python to an R instance using the assign() function. However, I have had no such problem with rpy2. This is just my experience.

+3
Nov 29

From a developer's perspective: we used rpy / rpy2 to provide statistical and plotting functions in our Python application. This caused huge problems when shipping the application, because rpy / rpy2 has to be compiled for specific combinations of Python and R versions, which makes it impractical to provide binary distributions that work out of the box unless we also bundle R. Since rpy / rpy2 is not particularly easy to install either, we replaced the relevant parts with native Python modules such as matplotlib. If we had to use R, we would switch to pyRserve, because we could start an R server locally and connect to it without worrying about the R version.

+3
Mar 11 '15 at 15:12


