Ruby Regular Expression

http://www.example.com/books?_pop=mheader

What regular expression will match this URL and any other URL that contains "books"? The site has a books category with various subcategories. How do I find all the URLs for books?

    require 'anemone'

    Pattern = %r[(\/books)*]

    Anemone.crawl("http://www.example.com/") do |anemone|
      anemone.on_pages_like(Pattern) do |page|
        puts page.url
      end
    end
2 answers

http://rubular.com/ is a useful regular expression test tool for Ruby.

A simple regular expression will do: /http:\/\/.+(books)/ . It matches http://, then one or more of any character, then books, so it matches any URL that contains "books" somewhere after the scheme. On Rubular it matches, for example, http://www.example.com/reference-books-2300 .
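To check the pattern outside Rubular, here is a minimal sketch in plain Ruby. The helper name book_urls and the third sample URL (a non-book page) are assumptions added for illustration; they do not come from the original answer.

```ruby
# Returns only the URLs that match the pattern from the answer:
# "http://", then one or more characters, then "books".
# book_urls is a hypothetical helper name used for this sketch.
def book_urls(urls, pattern = /http:\/\/.+(books)/)
  urls.grep(pattern)
end

sample_urls = [
  "http://www.example.com/books?_pop=mheader",
  "http://www.example.com/reference-books-2300",
  "http://www.example.com/music/albums" # assumed non-book page, for contrast
]

puts book_urls(sample_urls)
```

Note that the pattern hard-codes http://, so an https:// URL would not match unless the scheme part is loosened (for example to https?).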


The pattern for matching /books in your URL should simply be /books, without the trailing *.
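The difference from the pattern in the question can be seen in a small sketch. The * in (\/books)* allows zero repetitions, so that pattern matches every string, including URLs with no "books" in them. The helper name urls_matching and the non-book URL are assumptions made for this example.

```ruby
# urls_matching is a hypothetical helper: it keeps the URLs that match a pattern.
def urls_matching(urls, pattern)
  urls.grep(pattern)
end

urls = [
  "http://www.example.com/books?_pop=mheader",
  "http://www.example.com/reference-books-2300",
  "http://www.example.com/music/albums" # assumed non-book page, for contrast
]

loose  = %r[(\/books)*] # from the question: zero or more "/books", matches anything
strict = %r[/books]     # requires a literal "/books"

puts "loose:  #{urls_matching(urls, loose).size} of #{urls.size} URLs match"
puts "strict: #{urls_matching(urls, strict).inspect}"
```

With the strict pattern only the first URL is kept. The reference-books-2300 URL is skipped because it contains "-books" rather than "/books", so drop the slash from the pattern if such pages should count as well. The strict pattern could be passed to on_pages_like in place of Pattern in the question's crawler.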

http://regexpal.com is a good site for testing your regular expressions, so you can be sure that at least that part of your code is right.


Source: https://habr.com/ru/post/1432860/

