I have a program that outputs strings of CSV data that I want to load into a data frame. I am currently loading data as follows:
    tmpFilename <- "tmp_file"
    system(paste(procName, ">", tmpFilename), wait=TRUE)
    myData <- read.csv(tmpFilename)
However, redirecting the output to a file just to read it back in seems inefficient (the program emits about 30 MB, so I want to handle it as fast as possible). I thought textConnection would solve this, so I tried:
    con <- textConnection(system(procName, intern=TRUE))
    myData <- read.csv(con)
This runs much slower, and while the running time of the first solution grows linearly with input size, the performance of the textConnection solution deteriorates exponentially. The slowest part is creating the textConnection. The read.csv call itself actually finishes faster here than in the first solution, since it reads from memory.
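The breakdown described above can be measured by timing each step separately. The following is only a minimal sketch: the `procName` value is a hypothetical stand-in (a `printf` command, assuming a Unix-like shell) for the real program, and the timing variable names are illustrative.

```r
# Hypothetical stand-in for the real program: any shell command that prints
# CSV to stdout. Assumes a Unix-like shell where printf is available.
procName <- "printf 'a,b\\n1,2\\n3,4\\n'"

# Step 1: run the program and capture its output as a character vector
# (one element per output line).
t_capture <- system.time(out <- system(procName, intern = TRUE))

# Step 2: wrap the character vector in a text connection.
t_connect <- system.time(con <- textConnection(out))

# Step 3: parse the CSV from the already-open connection, then release it.
t_parse <- system.time(myData <- read.csv(con))
close(con)

# Elapsed seconds per step.
print(c(capture = t_capture[["elapsed"]],
        connect = t_connect[["elapsed"]],
        parse   = t_parse[["elapsed"]]))
```

With a realistic 30 MB output in place of the toy command, the three numbers should show which step dominates.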
My question is: is creating a file only to call read.csv on it my best option regarding speed? Is there a way to speed up the creation of a textConnection? Bonus: why is creating a textConnection so slow?
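One alternative that may be worth measuring is base R's `pipe()`, which gives read.csv a connection to the program's standard output, so neither a temporary file nor an intermediate character vector is needed. This is a sketch under assumptions, not a benchmarked answer: `procName` is again a hypothetical `printf` stand-in on a Unix-like shell, and whether this is faster for the real 30 MB output has not been verified here.

```r
# Hypothetical stand-in for the real program (assumes a Unix-like shell).
procName <- "printf 'a,b\\n1,2\\n3,4\\n'"

# pipe() creates an unopened connection to the command's stdout;
# read.csv opens it, reads the CSV, and closes it when done.
myData <- read.csv(pipe(procName))

print(myData)
```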