Changing text inside HTML nodes - Nokogiri

Let's say I have the following HTML:

<ul><li>Bullet 1.</li> <li>Bullet 2.</li> <li>Bullet 3.</li> <li>Bullet 4.</li> <li>Bullet 5.</li></ul> 

What I want to do is take every period, question mark or exclamation mark that sits inside an HTML node, replace it with itself plus a trailing asterisk, and then convert the result back to HTML. So the result would be:

 <ul><li>Bullet 1.*</li> <li>Bullet 2.*</li> <li>Bullet 3.*</li> <li>Bullet 4.*</li> <li>Bullet 5.*</li></ul> 

I've been experimenting with this in IRB, but I can't quite figure it out. Here's the code I have:

    html = "<ul><li>Bullet 1.</li> <li>Bullet 2.</li> <li>Bullet 3.</li> <li>Bullet 4.</li> <li>Bullet 5.</li></ul>"
    doc = Nokogiri::HTML::DocumentFragment.parse(html)
    doc.search("*").map { |n| n.inner_text.gsub(/(?<=[.!?])(?!\*)/, "#{$1}*") }

The array that comes back contains the correctly modified text, but I'm not sure how to turn it back into HTML. Is there another method I can use to change inner_text in place?

+6
2 answers

How about this code?

    doc.traverse do |x|
      if x.text?
        # The pattern has no capture group, so a plain "*" is all that is needed.
        x.content = x.content.gsub(/(?<=[.!?])(?!\*)/, "*")
      end
    end

    puts doc.to_html

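The traverse method does almost the same thing as search("*").each, except that it also visits text nodes. For each node you check whether it is a Nokogiri::XML::Text (x.text?) and, if so, change its content as you wish. Because the nodes are modified in place, calling doc.to_html afterwards returns the updated HTML.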
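The regex itself can be tried without Nokogiri. The sketch below is only an illustration: `add_stars` and `PUNCTUATION_END` are hypothetical names, not part of the original code. The pattern matches a zero-width position that follows `.`, `!` or `?` and is not already followed by `*`, so `gsub` inserts the asterisk without consuming any text. Since the pattern has no capture group, the `#{$1}` in the original replacement string normally interpolates to nothing.

```ruby
# Zero-width position after ., ! or ? that is not already followed by "*".
PUNCTUATION_END = /(?<=[.!?])(?!\*)/

# Hypothetical helper, used only for this illustration.
def add_stars(text)
  # The match is empty, so gsub inserts "*" and leaves the punctuation intact.
  text.gsub(PUNCTUATION_END, "*")
end

puts add_stars("Bullet 1.")       # Bullet 1.*
puts add_stars("Really? Yes!")    # Really?* Yes!*
puts add_stars("Bullet 1.*")      # Bullet 1.* (already starred, unchanged)
puts add_stars("No punctuation")  # No punctuation
```

Note that the pattern looks at every matching character, not just sentence endings, so a string such as "3.14" would become "3.*14". If that matters for your content, the pattern would need tightening.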

+6

Thanks to the post "Nokogiri replace tag values", I was able to modify that approach a bit and figure it out.

    doc = Nokogiri::HTML::DocumentFragment.parse(html)
    doc.search("*").each do |node|
      dummy = node.add_previous_sibling(Nokogiri::XML::Node.new("dummy", doc))
      dummy.add_previous_sibling(Nokogiri::XML::Text.new(node.to_s.gsub(/(?<=[.!?])(?!\*)/, "#{$1}*"), doc))
      node.remove
      dummy.remove
    end
    puts doc.to_html.gsub("&lt;", "<").gsub("&gt;", ">")

-1

Source: https://habr.com/ru/post/896150/

