This is a simplified example of how to do this using a parser:
require 'nokogiri'

html = '<p>lorem ipsum blah blah ipsum</p>
REPLACE MULTI-LINE
CONTENT HERE...
<p>other stuff still here...</p>'

doc = Nokogiri.HTML(html)
puts doc.to_html
After parsing we get:
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body>
# >> <p>lorem ipsum blah blah ipsum</p>
# >>
# >>
# >> REPLACE MULTI-LINE
# >> CONTENT HERE...
# >>
# >>
# >> <p>other stuff still here...</p>
# >> </body></html>

doc.at('//comment()/following-sibling::text()').content = "\nhello world!\n"
puts doc.to_html
After finding the comment, stepping to the text() node that follows it, and replacing that node's content:
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body>
# >> <p>lorem ipsum blah blah ipsum</p>
# >>
# >>
# >> hello world!
# >>
# >>
# >> <p>other stuff still here...</p>
# >> </body></html>
If your HTML will always be simple, with no chance of lines that break your search patterns, you can go with search and replace.
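For that simple case, a rough sketch in plain Ruby might look like the following. It assumes the block is wrapped in two marker comments written exactly as shown; `REPLACE-START` and `REPLACE-END` are made-up names for illustration.

```ruby
# Hypothetical marker comments: REPLACE-START / REPLACE-END are assumed names.
html = <<~HTML
  <p>lorem ipsum blah blah ipsum</p>
  <!-- REPLACE-START -->
  REPLACE MULTI-LINE
  CONTENT HERE...
  <!-- REPLACE-END -->
  <p>other stuff still here...</p>
HTML

# Non-greedy match between the two markers; /m lets "." match newlines.
pattern = /(<!-- REPLACE-START -->).*?(<!-- REPLACE-END -->)/m

# Keep both markers and swap only what sits between them.
replaced = html.sub(pattern) { "#{$1}\nhello world!\n#{$2}" }
puts replaced
```

This only holds up while the markers are always written exactly this way and never appear nested or duplicated.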
If you look around, you will see that for any non-trivial HTML manipulation you should go with a parser. That is because a parser deals with the actual structure of the document, so if the document changes, the parser is much less likely to be confused.