Remove whitespace from an HTML document with Ruby
So I have a string in Ruby, something like:
str = "<html>\n<head>\n\n <title>My Page</title>\n\n\n</head>\n\n<body>" +
" <h1>My Page</h1>\n\n<div id=\"pageContent\">\n <p>Here is a para" +
"graph. It can contain spaces that should not be removed.\n\nBut\n" +
"line breaks that should be removed.</p></body></html>"
How can I remove all whitespace (spaces, tabs and line breaks) that is outside the tags, i.e. not inside a tag that holds content such as <p>, using only native Ruby?
(I would like to avoid using XSLT or something similar for such a simple task.)
4 answers
You can condense each run of spaces into a single space (i.e. "hello     world" into "hello world") using String#squeeze:
"hello     world".squeeze(" ") # => "hello world"
where the argument to squeeze is the character that should be squeezed.
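To make the behaviour concrete on markup like the question's, here is a small sketch. The sample string is a shortened, made-up variant of the question's input, and it assumes squeeze is given both the space and the newline character:

```ruby
# Shortened, made-up sample in the spirit of the question's string.
sample = "<head>\n\n   <title>My  Page</title>\n\n\n</head>"

# With several characters in the argument, runs of each character
# are squeezed independently; nothing is removed completely.
squeezed = sample.squeeze(" \n")
# => "<head>\n <title>My Page</title>\n</head>"
```

Note that one newline and one space survive between the tags, and the double space inside the title text is collapsed too.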
EDIT: I misunderstood your question, sorry.
Used on your string, squeeze will:
- remove whitespace inside tags
- leave single spaces outside tags
Now I will work on a solution.
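A minimal sketch of one possible approach with plain regular expressions, not necessarily what this answer had in mind. compact_html is a hypothetical helper name, and turning a line break inside text into a single space (rather than deleting it outright) is an assumption about what the question wants:

```ruby
# compact_html is a hypothetical helper, not part of Ruby or of the question.
def compact_html(html)
  html
    .gsub(/(?<=>)\s+(?=<)/, "")  # drop whitespace sitting purely between two tags
    .gsub(/\s*\n\s*/, " ")       # assumption: a line break inside text becomes one space
end

sample = "<body> <h1>My Page</h1>\n\n<p>Keep  these spaces.\n\nBut\nnot the breaks.</p></body>"
compacted = compact_html(sample)
# => "<body><h1>My Page</h1><p>Keep  these spaces. But not the breaks.</p></body>"
```

This is regex-based, so it will probably misbehave on things like <pre> blocks, inline scripts, or a ">" inside an attribute value. For input as simple as the string in the question it may be enough.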