Extract the content attribute of a meta tag with a given name attribute using Nokogiri in Ruby?

Question

This is my first question here, so it would be great to get an answer. I am new to Nokogiri.

Here is my problem. I have something like this in the HTML head of the target page (a TechCrunch post):

<meta content="During my time at TechCrunch I've seen thousands of startups and written about hundreds of them. I sure as hell don't know all ..." name="description"/>

Now, I would like the script to go through the meta tags, find the one whose name attribute is "description", and get the value of its content attribute.

I tried something like this:

require 'rubygems'
require 'nokogiri'
require 'open-uri'

url = "http://www.techcrunch.com/2009/10/11/the-underutilized-power-of-the-video-demo-to-explain-what-the-hell-you-actually-do/"
doc = Nokogiri::HTML(open(url))
posts = doc.xpath("//meta")
posts.each do |link|
  a = link.attributes['name']
  b = link.attributes['content']
end

After that I could pick the tag whose name attribute equals "description", but this code returns nil for both a and b.

I also tried posts = doc.xpath("//meta"), posts = doc.xpath("//meta/*"), etc., but still got nothing.

ruby xpath nokogiri

Stevensson Jan 4 '10 at 23:28

akuhn · Answer 1 · 2010-01-05T02:00:32+0000

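The problem does not seem to be with your xpath; it looks like the document is not being parsed completely. You can check this with puts doc and compare the output with the page source (this may be down to how libxml2 handles unusual, possibly invalid, HTML).

As a workaround, if you only need the <meta> tags, you could match them with a regular expression instead, for example /<meta name="([^"]*)" content="([^"]*)"/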
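The regular expression above expects name to come before content, while the tag quoted in the question has content first. The following is a minimal sketch of the same regex-based idea that reads the attributes in either order. The helper name meta_content and the shortened sample string are made up for illustration, and it assumes double-quoted attribute values; it is a fallback for when the parser cannot be used, not a general HTML parser.

```ruby
# Hypothetical sample standing in for the fetched page source.
html = '<head><meta content="During my time at TechCrunch ..." name="description"/></head>'

# Illustrative helper (not part of any library): returns the content
# attribute of the first <meta> tag whose name attribute equals `name`,
# or nil when no such tag is found.
def meta_content(html, name)
  html.scan(/<meta\b[^>]*>/i).each do |tag|
    # Collect double-quoted attribute pairs, regardless of their order.
    attrs = Hash[tag.scan(/([\w-]+)\s*=\s*"([^"]*)"/)]
    return attrs['content'] if attrs['name'] == name
  end
  nil
end

description = meta_content(html, 'description')
puts description
```

Scanning tag by tag keeps the match from running past the end of one <meta> element into the next.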

mykhal · Answer 2 · 2010-01-05T16:24:46+0000

Replace

doc = Nokogiri::HTML(open(url))

with

doc = Nokogiri::HTML(open(url).read)

Update: this worked for me :) I am using ruby 1.8.7/nokogiri 1.4.0
