How do I visit a URL with Ruby over HTTP and read the output?

So far I have managed to piece this together:

 require 'open-uri'

 begin
   open("http://www.somemain.com/" + path + "/" + blah)
 rescue OpenURI::HTTPError
   @failure += painting.permalink
 else
   @success += painting.permalink
 end

But how can I read the output of the service I'm calling?

3 answers

Open-URI extends Kernel's open, so what you get back from the call is an I/O stream:

 open('http://www.example.com') #=> #<StringIO:0x00000100977420> 

You have to read it to get the content:

 open('http://www.example.com').read[0 .. 10] #=> "<!DOCTYPE h" 

Often a method will accept either type as a parameter: it checks what it was given, then either uses the content directly, in the case of a string, or reads the handle if it is a stream.
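The "string or stream" pattern described above can be sketched like this. Note that fetch_title is a hypothetical helper written for illustration, not part of any library:

```ruby
require 'stringio'

# Accepts either an HTML string or an IO-like object; if the argument
# responds to #read we treat it as a stream, otherwise as a string.
def fetch_title(source)
  html = source.respond_to?(:read) ? source.read : source
  html[%r{<title>(.*?)</title>}m, 1]
end

fetch_title('<html><title>Hi</title></html>')               #=> "Hi"
fetch_title(StringIO.new('<html><title>Hi</title></html>')) #=> "Hi"
```

This is the same trick Nokogiri uses: you can pass it a string of markup or an open handle, and it does the right thing with both.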

For HTML and XML, such as RSS feeds, we usually hand the stream to a parser and let it grab the content, parse it, and return an object suitable for further searching:

 require 'nokogiri'

 doc = Nokogiri::HTML(open('http://www.example.com'))
 doc.class            #=> Nokogiri::HTML::Document
 doc.to_html[0 .. 10] #=> "<!DOCTYPE h"
 doc.at('h1').text    #=> "Example Domain"
 doc = open("http://etc..")
 content = doc.read

More often than not, people want to parse the returned document; for that, use something like Hpricot or Nokogiri.
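If you want to avoid a gem dependency while experimenting, Ruby's standard library ships REXML, which handles basic XML like RSS feeds. A minimal sketch with made-up XML (not the libraries mentioned above, just the stdlib alternative):

```ruby
require 'rexml/document'

# A made-up RSS-style document; in practice this string would be
# the result of open(url).read.
xml = '<feed><entry><title>First post</title></entry></feed>'

doc   = REXML::Document.new(xml)
title = REXML::XPath.first(doc, '//entry/title').text
# title is "First post"
```

Nokogiri is much faster and more forgiving of malformed markup, so for real scraping it is still the better choice.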


I'm not sure whether you want to do the scraping yourself or not, but if you don't, Mechanize is a really good gem for this.

It will visit the desired page and automatically parse it with Nokogiri, so you can access elements using CSS selectors such as "div#header h1". Ryan Bates has a video tutorial that teaches you everything you need to know to use it.

Basically you can just do:

 require 'rubygems'
 require 'mechanize'

 agent = Mechanize.new
 agent.get("http://www.google.com")
 agent.page.at("some css selector").text

It's simple.


Source: https://habr.com/ru/post/888680/
