Replace consecutive repeated <br> tags with a single <br> tag in Ruby

I am trying to replace multiple consecutive <br> tags with a single <br> tag using Ruby.

For instance:

Hello
<br><br/><br>
World!

will become

Hello
<br>
World!
1 answer

You can do this with a regular expression, for example:

 "Hello\n<br><br/><br>\nworld".gsub(/(?im)(<br\s*\/?>\s*)+/,'<br>')

To explain that: the (?im) section sets options, where i makes the match case-insensitive and m makes . match newlines. The grouped expression (<br\s*\/?>\s*) matches <br> (optionally with whitespace and a trailing /), followed by optional whitespace, and + says to match one or more repetitions of this group.
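As a minimal sketch of how the pattern behaves on a few inputs, the snippet below wraps it in a small helper. The names BR_RUN and collapse_br are hypothetical and not part of the original answer. The sketch also assumes the trailing /i flag can stand in for the inline (?im) options: the pattern contains no ".", so the m option should make no difference here, and \s already matches newlines.

```ruby
# Hypothetical names (BR_RUN, collapse_br) used for illustration only.
# Same pattern as in the answer, with /i in place of the inline (?im) options
# and a non-capturing group, since the captured text is never used.
BR_RUN = /(?:<br\s*\/?>\s*)+/i

def collapse_br(html)
  # Replace each run of <br> variants (and the whitespace after each one)
  # with a single <br>.
  html.gsub(BR_RUN, '<br>')
end

puts collapse_br("Hello\n<br><br/><br>\nWorld!")
puts collapse_br("a<BR />\n<br>b")
```

Note that the trailing \s* also consumes whitespace after the last tag in a run, so the newline between the final <br> and "World!" is removed along with the extra tags. If that whitespace should be kept, it would need to be handled separately.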

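However, regular expressions are not a robust way to process HTML. Alternatively, you can use an HTML parser such as Nokogiri: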

require 'nokogiri'

document = Nokogiri::HTML.parse("Hello
<br><br/><br>
World!")

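# Remove every <br> whose next sibling is also a <br>,
# so only the last tag of each run remains.
document.search('//br').each do |node|
    # node.next is nil for a last child, hence the safe navigation operator
    node.remove if node.next&.name == 'br'
end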

puts document

This outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>Hello
<br>
World!</p></body></html>

(Note that the parser adds the DOCTYPE and the wrapping <html><body><p> elements.)


Source: https://habr.com/ru/post/1792975/

