Parallel text processing in Julia

Question

I'm trying to write a simple function that reads a series of files, runs a regular expression search on each one (or just counts the words), and returns the number of matches. I want to run this in parallel to speed it up, but so far I haven't been able to.

If I parallelize a simple loop with a math operation, I get a significant performance increase. However, the same approach applied to this grep-like function gives no speedup:

function open_count(file)
    # Read the whole file; open(f, file) closes the handle when f returns
    text = open(readall, file)
    # Count whitespace-separated words
    length(split(text))
end



tic()
total = 0
for name in files
    total += open_count(string(dir,"/",name))
end
toc()

elapsed time: 29.474181026 seconds


tic()
total = @parallel (+) for name in files
    open_count(string(dir,"/",name))
end
toc()

elapsed time: 29.086511895 seconds

I tried several variations but never got a significant speedup. Am I doing something wrong?
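
For readers on Julia 1.x, where `readall`, `tic`/`toc`, and `@parallel` are no longer available, the same comparison might be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `count_serial` and `count_parallel`, the choice of 4 workers, and the temporary test files are all hypothetical and not part of the original question, and the sketch makes no claim about which version will be faster on a given machine.

```julia
using Distributed   # stdlib: provides addprocs, @everywhere, @distributed

# Start worker processes if none exist yet (4 is an arbitrary assumption).
nprocs() == 1 && addprocs(4)

# Define the counting function on every process so workers can call it.
@everywhere function open_count(path)
    text = read(path, String)    # reads the whole file and closes it
    return length(split(text))   # number of whitespace-separated words
end

# Serial sum of word counts over all files in `dir`.
function count_serial(dir, files)
    total = 0
    for name in files
        total += open_count(joinpath(dir, name))
    end
    return total
end

# Parallel sum: iterations are split across workers and combined with (+).
function count_parallel(dir, files)
    return @distributed (+) for name in files
        open_count(joinpath(dir, name))
    end
end

# Hypothetical test data: a few small temporary files, only to make the sketch runnable.
dir = mktempdir()
files = ["doc$i.txt" for i in 1:8]
for name in files
    write(joinpath(dir, name), "the quick brown fox jumps over the lazy dog\n" ^ 1000)
end

# The first calls also trigger compilation, so time a second run.
serial_total = count_serial(dir, files)
parallel_total = count_parallel(dir, files)
t_serial = @elapsed count_serial(dir, files)
t_parallel = @elapsed count_parallel(dir, files)

println("serial:   $serial_total words in $t_serial s")
println("parallel: $parallel_total words in $t_parallel s")
```

Timing the second call rather than the first keeps compilation out of the measurement, so any remaining difference between the two numbers reflects the reading and splitting work plus the cost of distributing it.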

parallel-processing julia-lang

Matías Guzmán Naranjo Jan 23 '14 at 6:46

1 answer

niczky12 · Answer 1 · 2016-01-22T22:39:16+0000

[Answer text lost in extraction. The surviving fragments mention R, Python, and RAMDisk.]
