Extract a single string from HTML using Ruby / Mechanize (and Nokogiri)

I am retrieving data from a forum, and the basic script works fine. Now I need to extract the date and time (December 21, 2009, 20:39) from a single post, and I cannot get it to work. I used FireXPath to determine the XPath.

Code example:

    require 'rubygems'
    require 'mechanize'

    post_agent = WWW::Mechanize.new
    post_page = post_agent.get('http://www.vbulletin.org/forum/showthread.php?t=230708')
    puts post_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]/text()').to_s.strip
    puts post_page.parser.at_xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]/text()').to_s.strip
    puts post_page.parser.xpath('//[@id="post1960370"]/tbody/tr[1]/td/div[2]/text()')

All my attempts end in a blank line or an error.


I cannot find documentation on using Nokogiri within Mechanize. The Mechanize documentation says, at the bottom of the page:

After you have used Mechanize to navigate to the page that you need to scrape, then scrape it using Nokogiri methods.

Which methods? Where are they documented? I could not find this in the Nokogiri documentation either.


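When you call Mechanize::Page::parser, it gives you the Nokogiri document, so your "xpath" and "at_xpath" calls are Nokogiri's. The problem is in your xpaths. In general, the longer the xpath, the more likely it is to fail, so start small. Here is a trick. Instead of this: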

puts  post_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]/text()').to_s.strip

do this:

puts post_page.parser.xpath('//table').to_html

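This finds any table, anywhere in the page, and prints it as html. Examine the HTML to see which tables came back. It will probably return several when you want only one, so you need to tell it how to zero in on the one you want. If, for example, you notice that the table you want has the CSS class "userdata", try this: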

puts post_page.parser.xpath("//table[@class='userdata']").to_html

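Any time you get nothing back, the xpath is wrong, so fix it before going further. Once you are getting the table you want, try to get its rows: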

puts post_page.parser.xpath("//table[@class='userdata']//tr").to_html

When you are getting the right data, remove the "to_html" and you have a set of Nokogiri nodes, ready to work on.


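I think you copied this xpath from Firebug. Firebug shows an extra tbody, which may not be present in the actual markup, so remove the tbody from the xpath and try again. If it still does not work, follow the process in the other answer, which is much better!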


Source: https://habr.com/ru/post/1729509/

