Is XPath or CSS faster (for Nokogiri on HTML files)?

I would like to know whether querying HTML files with Nokogiri is faster with XPath or with CSS selectors. How much does the speed differ?

1 answer

Nokogiri does not have separate XPath and CSS parsing. It parses XML / HTML into a single DOM, which you can then query with either CSS or XPath syntax.

CSS selectors are internally converted to XPath before libxml2 is asked to run the query. Thus (for the same selectors) the XPath version would be a tiny fraction faster, since it skips the CSS-to-XPath conversion step.

However, your question does not have a general answer; it depends on what you are selecting and what your XPath looks like. Most likely, you would not write the same XPath that Nokogiri generates. For example, see if you can guess the XPath for the following two CSS selectors:

    require 'nokogiri'

    puts Nokogiri::CSS.xpath_for('#foo')
    #=> //*[@id = 'foo']

    puts Nokogiri::CSS.xpath_for('div.article a.external')
    #=> //div[contains(concat(' ', @class, ' '), ' article ')]//a[contains(concat(' ', @class, ' '), ' external ')]

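Unlike in a web browser, the id and class attributes have no dedicated lookup cache, so selecting by them is no faster than selecting by any other attribute. In fact, the general translation of div.article does much more work than something like div[@class='article'], although the latter matches only when the class attribute is exactly "article".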

As @LBg commented, you should benchmark it yourself if absolute speed is critical.

However, I would suggest the following: do not worry about it. Computers are fast. Write whatever is most convenient for you, the programmer. If a CSS selector is easier to work out, faster to type, and easier to understand when you read the code later, use it. Use XPath when you need to do something that CSS selector syntax cannot express.

How long does it take Nokogiri to convert a complex CSS selector to XPath?

    require 'nokogiri'

    t = Time.now
    1000.times do |i|
      # Use a different CSS string each time to avoid built-in caching
      css = "body#foo table#bar#{i} thead th, body#foo table#bar#{i} tbody td"
      Nokogiri::CSS.xpath_for(css)
    end
    puts((Time.now - t) / 1000)
    #=> 0.000405041

Less than half a millisecond.


Source: https://habr.com/ru/post/902015/
