How can I make Hpricot play well with HTML5?

I use Hpricot to parse a theme file. I noticed, however, that if I give Hpricot a valid HTML5 document, it automatically closes the HTML5 tags (e.g. <section>) and gets confused by the DOCTYPE.

Are there any extensions for Hpricot, or perhaps a flag I need to set, that will allow HTML5 documents to be processed correctly?

2 answers

I know this sidesteps the question as asked, but I would suggest trying Nokogiri (http://nokogiri.org/), as mentioned in some of the comments on your question. I have had no problems using it to parse any HTML/XML-like structured text, including HTML5.

I think the Hpricot to_original_html method is exactly what you are looking for.

From the docs for to_original_html:

Attempts to preserve the original HTML of the document, only outputting new tags for elements that have changed.

Source: https://habr.com/ru/post/1745917/