How do I visit a URL with Ruby over HTTP and read the output?

So far I have managed to piece this together:

 require 'open-uri'

 begin
   open("http://www.somemain.com/" + path + "/" + blah)
 rescue OpenURI::HTTPError
   @failure += painting.permalink
 else
   @success += painting.permalink
 end

But how can I read the output of the service I'm calling?

3 answers

Open-URI extends Kernel's open, so what you get back from the call is an I/O stream:

 open('http://www.example.com') #=> #<StringIO:0x00000100977420> 

You have to read it to get the content:

 open('http://www.example.com').read[0 .. 10] #=> "<!DOCTYPE h" 

Often a method will accept either type as a parameter: it checks what it was given, then either uses the content directly, in the case of a string, or reads the handle if it is a stream.
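The "string or stream" pattern described above can be sketched like this. Note that fetch_title is a hypothetical helper written for illustration, not part of any library:

```ruby
require 'stringio'

# Accepts either an HTML string or an IO-like object; if the argument
# responds to #read we treat it as a stream, otherwise as a string.
def fetch_title(source)
  html = source.respond_to?(:read) ? source.read : source
  html[%r{<title>(.*?)</title>}m, 1]
end

fetch_title('<html><title>Hi</title></html>')               #=> "Hi"
fetch_title(StringIO.new('<html><title>Hi</title></html>')) #=> "Hi"
```

This is the same trick Nokogiri uses: you can pass it a string of markup or an open handle, and it does the right thing with both.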

For HTML and XML, such as RSS feeds, we usually hand the stream to a parser and let it grab the content, parse it, and return an object suitable for further searching:

 require 'nokogiri'

 doc = Nokogiri::HTML(open('http://www.example.com'))
 doc.class            #=> Nokogiri::HTML::Document
 doc.to_html[0 .. 10] #=> "<!DOCTYPE h"
 doc.at('h1').text    #=> "Example Domain"
 doc = open("http://etc..")
 content = doc.read

More often than not, people want to parse the returned document; for that, use something like Hpricot or Nokogiri.
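If you want to avoid a gem dependency while experimenting, Ruby's standard library ships REXML, which handles basic XML like RSS feeds. A minimal sketch with made-up XML (not the libraries mentioned above, just the stdlib alternative):

```ruby
require 'rexml/document'

# A made-up RSS-style document; in practice this string would be
# the result of open(url).read.
xml = '<feed><entry><title>First post</title></entry></feed>'

doc   = REXML::Document.new(xml)
title = REXML::XPath.first(doc, '//entry/title').text
# title is "First post"
```

Nokogiri is much faster and more forgiving of malformed markup, so for real scraping it is still the better choice.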


I'm not sure whether you want to do the scraping yourself or not, but if you don't, Mechanize is a really good gem for this.

It will visit the desired page and automatically parse it with Nokogiri, so you can access elements using CSS selectors such as "div#header h1". Ryan Bates has a video tutorial that teaches you everything you need to know to use it.

Basically you can just do:

 require 'rubygems'
 require 'mechanize'

 agent = Mechanize.new
 agent.get("http://www.google.com")
 agent.page.at("some css selector").text

It's simple.


Source: https://habr.com/ru/post/888680/
