http://www.example.com/books?_pop=mheader
What regular expression will match this URL, and any other URL that contains "books"? The site has a books category with various subcategories. How do I find all the URLs for books?
    require 'anemone'

    Pattern = %r[(\/books)*]

    Anemone.crawl("http://www.example.com/") do |anemone|
      anemone.on_pages_like(Pattern) do |page|
        puts page.url
      end
    end
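One likely reason the pattern above selects every page: the `*` quantifier permits zero repetitions of `(\/books)`, so the regex matches the empty string at the start of any URL. The sketch below checks this in plain Ruby, without Anemone, against a few made-up URLs. `LOOSE_PATTERN`, `BOOKS_PATTERN`, and the sample URLs other than the first are illustrative assumptions, not something taken from the site in question.

```ruby
# Pattern from the question: "*" allows zero occurrences of "/books",
# so it matches the empty string at position 0 of every URL.
LOOSE_PATTERN = %r[(\/books)*]

# Hypothetical stricter pattern: "/books" as a whole path segment,
# followed by "/", "?", "#", or the end of the string.
BOOKS_PATTERN = %r{/books(?:[/?#]|\z)}

# Sample URLs; only the first comes from the question, the rest are invented.
urls = [
  "http://www.example.com/books?_pop=mheader",
  "http://www.example.com/books/fiction",
  "http://www.example.com/music",
  "http://www.example.com/bookshelf"
]

loose_matches = urls.select { |url| url.match?(LOOSE_PATTERN) }
book_matches  = urls.select { |url| url.match?(BOOKS_PATTERN) }

puts "Loose pattern matched #{loose_matches.size} of #{urls.size} URLs"
puts "Stricter pattern matched:"
puts book_matches
```

If the stricter pattern suits the site's URL layout, it could presumably be passed to `on_pages_like` in place of `Pattern`. If any URL merely containing "books" should count (including something like `/bookshelf`), the simpler `%r{books}` would do.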