I have a piece of code that processes a 500 MB XML file using the libxml-ruby gem. To my surprise, this code runs slower with the GC disabled, which seems counterintuitive. What could be the reason? I have plenty of free memory, and the system does not swap.
require 'xml'
Here are the results I got:
ruby     gc on     gc off
2.2.0    16.93s    18.81s
2.1.5    16.22s    18.58s
2.0.0    17.63s    17.99s
Why turn off the garbage collector? I read in Ruby Performance Optimization that Ruby is slow because programmers don't think about memory consumption, which makes the garbage collector spend a lot of time running. Thus, turning off the GC should instantly speed things up (at the cost of memory usage, of course) until the system starts swapping.
I wanted to know whether my XML parsing module could be improved, so I started experimenting with it by disabling the GC, which led me to this problem. I expected a significant speedup with the GC disabled, but instead I got the opposite. I know the differences are not huge, but this still seems strange to me.
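For reference, here is a minimal sketch of how such a comparison could be set up. It is not my actual parsing code: `measure` is a hypothetical helper, and the string-allocating block is only a stand-in for the XML processing. It uses `GC.disable`, `GC.enable`, and `GC.count` from core Ruby, plus `Process.clock_gettime` (available from Ruby 2.1).

```ruby
# Hypothetical helper: times a block with the GC enabled or disabled
# and reports how many GC runs happened while the block executed.
def measure(gc_enabled)
  GC.start                               # begin from a freshly collected heap
  gc_enabled ? GC.enable : GC.disable
  runs_before = GC.count
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { seconds: elapsed, gc_runs: GC.count - runs_before }
ensure
  GC.enable                              # never leave the GC switched off
end

# Stand-in workload (an assumption, not the real parser): allocate many strings.
workload = -> { 200_000.times.map { |i| "book #{i}" } }

with_gc    = measure(true)  { workload.call }
without_gc = measure(false) { workload.call }

puts format("gc on:  %.3fs (%d GC runs)", with_gc[:seconds], with_gc[:gc_runs])
puts format("gc off: %.3fs (%d GC runs)", without_gc[:seconds], without_gc[:gc_runs])
```

Running each variant in a separate process would probably give cleaner numbers, since the heap grown by one run can affect the next.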
The libxml-ruby gem uses the native C implementation of LibXML under the hood. Could this be the reason?
The file I used is the books.xml sample downloaded from the Microsoft documentation, manually multiplied to reach the target size:
<catalog>
  <book id="bk101">
    <author>John Doe</author>
    <title>XML for dummies</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>Some description</description>
  </book>
  ....
</catalog>
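A file with this shape can also be generated by a script instead of by hand. The following is only a sketch: `write_catalog`, `BOOK_TEMPLATE`, and the sequential `bk101`, `bk102`, ... ids are hypothetical names and choices, and every generated book repeats the same placeholder fields as the sample above.

```ruby
# Hypothetical template mirroring the <book> element from the sample.
BOOK_TEMPLATE = <<-XML
  <book id="bk%<id>d">
    <author>John Doe</author>
    <title>XML for dummies</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>Some description</description>
  </book>
XML

# Appends a catalog with `copies` book elements to any sink that supports
# `<<` (a File, a String, ...). Ids start at 101 and increase by one.
def write_catalog(io, copies)
  io << "<catalog>\n"
  copies.times { |i| io << format(BOOK_TEMPLATE, id: 101 + i) }
  io << "</catalog>\n"
  io
end

# Example usage (the file name and the number of copies are assumptions):
# File.open("books_big.xml", "w") { |f| write_catalog(f, 1_000_000) }
```

The number of copies needed for a given file size depends on the size of one book element, so it would have to be tuned by checking the resulting file.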
My setup: OS X Yosemite, Intel Core i5 2.6 GHz, 16 GB of RAM.
Thanks for any suggestions.