R Methodology: Management and I/O between the R Notebook and large computing servers

Question

This is a general methodology question about using R as a means to:

1. set up and run jobs on remote computing platforms for various intensive modeling tasks,
2. retrieve the data from those remote computing servers, and
3. do the analysis.

R is, of course, up to this task, and I believe many others have already considered and implemented something like this, so I hope to learn from previous experience.

I am currently using R's system() command to drive PuTTY's pscp and plink: transfer a batch file, invoke a process, wait for it to complete, and then copy the results back for processing.
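
For concreteness, the workflow described above might look roughly like the sketch below. The host, user, paths and file names are hypothetical placeholders, and the remote side is assumed to be a Unix-like machine with sh available. The helper functions are made up for illustration; only system(), shQuote() and the pscp/plink command lines are real.

```r
# Hypothetical connection details and paths; replace with real values.
remote <- "user@compute.example.org"

# Build one PuTTY command line; shQuote(type = "cmd") wraps each
# argument in double quotes for the Windows command interpreter.
putty_cmd <- function(tool, args) {
  paste(c(tool, shQuote(args, type = "cmd")), collapse = " ")
}

# Return the three steps (upload, run, download) as command strings.
remote_job_cmds <- function(remote, batch_file, remote_dir, result_file) {
  c(
    upload = putty_cmd("pscp",
      c(batch_file, paste0(remote, ":", remote_dir, "/"))),
    run = putty_cmd("plink",
      c("-batch", remote,
        paste0("cd ", remote_dir, " && sh ", basename(batch_file)))),
    download = putty_cmd("pscp",
      c(paste0(remote, ":", remote_dir, "/", result_file), "."))
  )
}

# Run the steps in order. system() blocks until each command exits and
# returns its exit status, so the download only starts after the job ends.
run_steps <- function(cmds, dry_run = TRUE) {
  for (step in names(cmds)) {
    if (dry_run) {
      message(step, ": ", cmds[[step]])   # only print the command
    } else {
      status <- system(cmds[[step]])
      if (status != 0) stop("step '", step, "' failed with status ", status)
    }
  }
  invisible(cmds)
}

cmds <- remote_job_cmds(remote, "job.sh", "/scratch/job1", "results.csv")
run_steps(cmds, dry_run = TRUE)
```

With dry_run = FALSE the same commands are actually executed, after which the downloaded file can be read in the usual way, for example with read.csv("results.csv").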

I admit it's crude, but it works surprisingly well.

Are there any better ways? The returned data files may be large.

I am looking to identify the next step in an incremental progression, nothing too crazy. It should be simple.

+4

r

bob123 Jan 24 '12 at 6:37

1 answer

Matt Dowle · Answer 1 · 2012-01-24T14:21:13+0000

There is an 8-minute video of client-server interaction on the data.table page, which may or may not be of interest. Have you already looked through the High-Performance Computing Task View? It does not seem to mention Rserve, though, so take a look at that as well, as Hansi suggested.
