How do I extract the content attribute of a meta tag with a given name attribute using Nokogiri in Ruby?

This is my first question here, so it would be great to find an answer. I am new to Nokogiri.

Here is my problem. The HTML head of the target site (a TechCrunch post) contains something like this:

<meta content="During my time at TechCrunch I've seen thousands of startups and written about hundreds of them. I sure as hell don't know all ..." name="description"/>

Now I would like the script to go through the meta tags, find the one whose name attribute is "description", and get the value of its content attribute.

I tried something like this:

require 'rubygems'
require 'nokogiri'
require 'open-uri'

url = "http://www.techcrunch.com/2009/10/11/the-underutilized-power-of-the-video-demo-to-explain-what-the-hell-you-actually-do/"
doc = Nokogiri::HTML(open(url))
posts = doc.xpath("//meta")
posts.each do |link|
  a = link.attributes['name']
  b = link.attributes['content']
end

After that I could pick the tag whose name attribute equals "description", but this code returns nil for a and b.

I also tried posts = doc.xpath("//meta"), posts = doc.xpath("//meta/*"), etc., but I still get nil.

2 answers

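The problem does not seem to be your xpath; it looks like the document is not being parsed properly. You can check this with puts doc, which does not show the full document (possibly a problem with the HTML parser in libxml2).

If you only need the <meta> tag, you could fall back to a regular expression, for example /<meta name="([^"]*)" content="([^"]*)"/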
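As a rough illustration of the regular-expression fallback: the pattern above expects name before content, while the tag quoted in the question has content first, so the sketch below reads the attributes in any order. The helper name meta_content is hypothetical, and the sample markup is copied from the question. It only handles double-quoted attributes and is not a general HTML parser.

```ruby
# Sample markup copied from the question (content comes before name here).
html = %q{<head><meta content="During my time at TechCrunch I've seen thousands of startups and written about hundreds of them. I sure as hell don't know all ..." name="description"/></head>}

# Hypothetical helper: return the content attribute of the first <meta> tag
# whose name attribute equals `name`, or nil if there is no such tag.
def meta_content(html, name)
  html.scan(/<meta\b[^>]*>/i).each do |tag|
    # Collect double-quoted attribute pairs regardless of their order.
    attrs = tag.scan(/([\w-]+)\s*=\s*"([^"]*)"/).to_h
    return attrs['content'] if attrs['name'] == name
  end
  nil
end

description = meta_content(html, 'description')
puts description
```

Scanning each tag for its attribute pairs avoids depending on the order in which a site happens to write name and content.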


Replace

doc = Nokogiri::HTML(open(url))

with

doc = Nokogiri::HTML(open(url).read)

Update: this was with ruby 1.8.7 / nokogiri 1.4.0.


Source: https://habr.com/ru/post/1727300/