Playing with Scrapi in Rails 3: getting a segmentation fault / Abort trap

What I have done so far:

sudo gem install scrapi

sudo gem install tidy

This did not work because libtidy.dylib was missing.

So, I did this:

sudo port install tidy

sudo cp libtidy.dylib /Library/Ruby/Gems/1.8/gems/scrapi-1.2.0/lib/tidy/libtidy.dylib
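
Before running anything, it can help to confirm that the library file really ended up where it is expected. The following is only a minimal sketch: the helper name find_libtidy is made up for illustration, and the candidate paths are assumptions (the MacPorts default prefix /opt/local and the scrapi gem directory from the copy command above).

```ruby
# Return the first path in +candidates+ that is an existing file, or nil.
# find_libtidy is a hypothetical helper, not part of scrapi or tidy.
def find_libtidy(candidates)
  candidates.find { |path| File.file?(path) }
end

# Assumed locations: the MacPorts default prefix and the scrapi gem
# directory used in the copy command above. Adjust for your machine.
CANDIDATES = [
  "/opt/local/lib/libtidy.dylib",
  "/Library/Ruby/Gems/1.8/gems/scrapi-1.2.0/lib/tidy/libtidy.dylib"
]

found = find_libtidy(CANDIDATES)
puts(found ? "libtidy found at #{found}" : "libtidy not found in any candidate path")
```

Finding the file only rules out the missing-library error; it says nothing about whether the library version matches what the tidy gem expects.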

Then I started following the Railscast at: http://media.railscasts.com/videos/173_screen_scraping_with_scrapi.mov

Right after Mr. Bates saves scrapitest.rb for the first time, I tried to run this code:

require 'rubygems'
require 'scrapi'

scraper = Scraper.define do
  process "title", :page_name => :text
  result :page_name
end

uri = URI.parse("http://www.walmart.com/search/search-ng.do?search_query=lost+season+3&ic=48_0&search_constraint=0")
p scraper.scrape(uri)

I ran it with this command:

ruby scrapitest.rb

And it returned this error:

/Library/Ruby/Gems/1.8/gems/tidy-1.1.2/lib/tidy/tidybuf.rb:39: [BUG] Segmentation fault
ruby 1.8.7 (2009-06-12 patchlevel 174) [universal-darwin10.0]

Abort trap

Completely out of ideas.

2 answers

I had this problem, and then a follow-up problem in which the segfault occurred non-deterministically.

I followed the steps described here: http://rubyforge.org/tracker/index.php?func=detail&aid=10007&group_id=435&atid=1744

In tidy-1.1.2/lib/tidy/tidylib.rb:

1. Add this line to the "load" method in Tidylib:

  extern "void tidyBufInit(void*)"

2. Define a new method, buf_init, in Tidylib:

  # tidyBufInit, using default allocator
  #
  def buf_init(buf)
    tidyBufInit(buf)
  end

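Then, in tidy-1.1.2/lib/tidy/tidybuf.rb:

3. Add this line to the initialize method of Tidybuf, right after the malloc:

   Tidylib.buf_init(@struct)

so that it looks like this:

  # Allocate the buffer struct, then initialize it via tidyBufInit
  #
  def initialize
    @struct = TidyBuffer.malloc
    Tidylib.buf_init(@struct)
  end

4. For completeness, add the allocator field to the TidyBuffer struct, so that it looks like this: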

  TidyBuffer = struct [
    "TidyAllocator* allocator",
    "byte* bp",
    "uint size",
    "uint allocated",
    "uint next"
  ] 
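
To see why the missing allocator field matters, here is a rough sketch that compares the byte offsets of the fields in the two layouts. It is an illustration only, not code from the tidy gem: field_offsets is a hypothetical helper, the field names come from the struct above, and native sizes are taken from Array#pack ("J" for a pointer-sized integer, "I" for an unsigned int). Padding at the end of the struct is ignored.

```ruby
# Native sizes on this machine, measured by packing a single value.
POINTER_SIZE = [0].pack("J").bytesize  # pointer-width integer
UINT_SIZE    = [0].pack("I").bytesize  # native unsigned int

# Each entry is [field_name, size_in_bytes].
OLD_LAYOUT = [
  ["bp", POINTER_SIZE],
  ["size", UINT_SIZE],
  ["allocated", UINT_SIZE],
  ["next", UINT_SIZE]
]
# Layout with the extra allocator pointer in front.
NEW_LAYOUT = [["allocator", POINTER_SIZE]] + OLD_LAYOUT

# Hypothetical helper: map each field name to its byte offset,
# assuming no padding between fields (true here, since the
# pointer-sized fields come first).
def field_offsets(layout)
  offset = 0
  layout.each_with_object({}) do |(name, size), offsets|
    offsets[name] = offset
    offset += size
  end
end

old_offsets = field_offsets(OLD_LAYOUT)
new_offsets = field_offsets(NEW_LAYOUT)

OLD_LAYOUT.each do |name, _size|
  puts "#{name}: old offset #{old_offsets[name]}, new offset #{new_offsets[name]}"
end
```

Every field shifts by one pointer width. If the Ruby side declares the struct without allocator while the C library expects it, each side reads and writes the other's fields at the wrong offsets, which is consistent with the crash in tidybuf.rb.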

Error: /opt/ruby/ruby-1.8.6/lib/ruby/gems/1.8/gems/tidy-1.1.2/lib/tidy/tidybuf.rb:40: [BUG]

In Tidy (0.99), the TidyBuffer struct appears to have changed (see /usr/include/buffio.h - $Date: 2007/01/23 11:17:45 $).

Patch for tidybuf.rb:

--- tidybuf.rb  2007-04-10 09:09:01.000000000 -0500
+++ tidybuf.rb.patched  2007-04-10 09:08:55.000000000 -0500
@@ -11,6 +11,7 @@
   # Mimic TidyBuffer.
   #
   TidyBuffer = struct [
+    "int* allocator",
     "byte* bp",
     "uint size",
     "uint allocated",    

Source: https://habr.com/ru/post/1767268/

