How do I download web pages quickly in Ruby? Parallelizing downloads?

I need to scrape (using scrAPI) 400+ web pages in Ruby; my current code is very sequential:

data = urls.map {|url| scraper.scrape url }

Actually, the code is a little different (exception handling, etc.).

How can I do it faster? How can I parallelize downloads?

2 answers
th = []
data = []
dlock = Mutex.new

urls.each do |url|
  # Pass url into Thread.new so each thread captures its own copy
  # (a plain closure over the loop variable would race).
  th << Thread.new(url) do |u|
    d = scraper.scrape u
    # The results array is shared, so guard appends with a mutex.
    dlock.synchronize { data << d }
  end
end

th.each(&:join)

Ta-dah! (Warning: written from memory, not tested, may eat your kitten, etc.)
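One caveat with the sketch above: it spawns one thread per URL, so 400+ URLs means 400+ simultaneous threads and connections. A bounded worker pool keeps that in check. Here is a minimal sketch; `parallel_scrape` is a hypothetical helper name, and the scrape call is passed in as a block (a stand-in for `scraper.scrape` from the question):

```ruby
# Bounded worker pool: `workers` threads pull URLs from a shared queue,
# so at most `workers` downloads run at once.
def parallel_scrape(urls, workers: 8)
  queue = Queue.new
  urls.each { |u| queue << u }

  results = []
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        begin
          url = queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break                 # queue drained, this worker is done
        end
        d = yield url           # the actual scrape call, supplied by the caller
        lock.synchronize { results << d }
      end
    end
  end

  threads.each(&:join)
  results
end
```

Usage would look like `data = parallel_scrape(urls, workers: 8) { |url| scraper.scrape url }`. Note that results arrive in completion order, not input order; if order matters, collect `[url, d]` pairs and sort afterwards.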

Edit: It occurred to me that someone must have written a generalized version of this, and indeed they have: http://peach.rubyforge.org/ - enjoy!


Source: https://habr.com/ru/post/1703397/
