Using the Mechanize gem to return a collection of links based on their position in the DOM

I'm struggling with Mechanize. I want to "click" each of a set of links that can only be identified by their position (all links inside div#content) or by their href.

I have tried both of these identification methods without success.

From the documentation, I could not understand how to return a collection of links (for a click) based on their position in the DOM, and not on the attributes directly on the link.

Secondly, the documentation suggests you can use :href to match a partial href:

page = agent.get('http://foo.com/').links_with(:href => "/something") 

but the only way I can get it to return a link is to pass the full URL, for example:

 page = agent.get('http://foo.com/').links_with(:href => "http://foo.com/something/a") 

This is not very useful if I want to return a collection of links with hrefs such as:

 http://foo.com/something/a
 http://foo.com/something/b
 http://foo.com/something/c
 etc.

Am I doing something wrong? Do I have unrealistic expectations?

3 answers

Part II: The value you pass to :href must be an exact match by default. So the href in your example will only match <a href="/something"></a>, not <a href="foo.com/something/a"></a>.

What you want to do is pass in a regular expression so that it matches a substring of the href. For instance:

 page = agent.get('http://foo.com/').links_with(:href => %r{/something/}) 
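The exact-versus-partial behaviour can be reproduced in plain Ruby without Mechanize. This is only a sketch: it assumes the criterion is compared against each href with Ruby's case-equality operator (===), which is consistent with the behaviour described above, and the href list is made up for illustration.

```ruby
# Hypothetical hrefs, as they might appear on a page.
hrefs = [
  "/something",
  "http://foo.com/something/a",
  "http://foo.com/something/b",
  "http://foo.com/other"
]

# String === String is plain equality, so only the identical href matches.
exact = hrefs.select { |href| "/something" === href }

# Regexp === String is true when the pattern matches anywhere in the string.
partial = hrefs.select { |href| %r{/something/} === href }

puts "exact:   #{exact.inspect}"
puts "partial: #{partial.inspect}"
```

Note that the pattern %r{/something/} has a trailing slash, so it skips the bare "/something" href; drop the trailing slash if that one should match too.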

Edit, Part I: To make it select only the links inside a div, add the Nokogiri-style search method to your line. Like this:

 page = agent.get('http://foo.com/').search("div#content").links_with(:href => %r{/something/}) # ** 

Well, that doesn't work, because after doing page = agent.get('http://foo.com/').search("div#content") you get a Nokogiri object back instead of a Mechanize one, so links_with will not work. However, you can extract links from a Nokogiri object using the css method. I would suggest something like:

page = agent.get('http://foo.com/').search("div#content").css("a")

If this does not work, I suggest checking out http://nokogiri.org/tutorials


The nth link:

 page.links[n-1] 

The first 5 links:

 page.links[0..4] 

Links with "something" in the href:

 page.links_with :href => /something/ 

You can build Mechanize links from Nokogiri nodes. See the source code for the links() method.

 # File lib/mechanize/page.rb, line 352
 def links
   @links ||= %w{ a area }.map do |tag|
     search(tag).map do |node|
       Link.new(node, @mech, self)
     end
   end.flatten
 end

So this means:

 the_links = page.search("valid_selector").map do |node|
   Mechanize::Page::Link.new(node, agent, page)
 end

This will give you useful href, text and uri methods.


Source: https://habr.com/ru/post/915197/

