I am trying to extract every href link from an HTML page using Nokogiri and XPath, but I still seem to be pulling out only the text of each link. I'm not interested in the name of the link, but rather the URL it points to.
Here is what I have:
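require 'nokogiri'
require 'open-uri'

# URI.open is the open-uri entry point on current Ruby versions
doc = Nokogiri::HTML(URI.open("http://www.cnn.com"))
doc.xpath('//a').each do |node|
  puts node.text   # prints the link text, not the URL
end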
Can someone guide me on how to fix this so that I get the actual href instead of the text itself?