I'm trying to write a simple function that reads a series of files, runs some kind of regular expression search on each (or just counts the words), and returns the number of matches. I'd like to run this in parallel to speed it up, but so far I haven't been able to achieve that.
If I parallelize a simple loop that does a math operation, I get a significant increase in performance. However, the same approach applied to the grep-like function gives no speedup:
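function open_count(file)
    # Read the whole file and count its whitespace-separated words.
    # The do-block form closes the file handle once the block finishes.
    open(file) do fh
        text = readall(fh)
        length(split(text))
    end
end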
tic()
total = 0
for name in files
    total += open_count(string(dir, "/", name))
end
total
toc()
elapsed time: 29.474181026 seconds
tic()
total = @parallel (+) for name in files
    open_count(string(dir, "/", name))
end
toc()
elapsed time: 29.086511895 seconds
I tried several variations, but none of them gave a significant speedup either. Am I doing something wrong?