Replacing HTML tag content with sed

I am trying to replace the contents of some HTML tags in an HTML page using sed in a bash script. For some reason it does not replace anything, so I am not getting the right result. It is probably something very simple or stupid on my part; can anyone help me?

HTML to search / replace in:

 Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements for <span id="totalPoints"></span> points.

sed command used:

 cat index.html | sed -i -e "s/\<span id\=\"unlockedCount\"\>([0-9]\{0,\})\<\/span\>/${unlockedCount}/g" index.html 

The purpose of this is to parse the HTML page and update its data from an external source. On the first run the tags will be empty; on later runs they will already contain numbers.


EDIT:

In the end, I used a combination of answers that led to the following code:

 sed -i -e 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html 

Thanks a lot @Sorpigal, @tripleee, @classic for the help!

3 answers

Try the following:

 sed -i -e "s/\(<span id=\"unlockedCount\">\)\(<\/span>\)/\1${unlockedCount}\2/g" index.html 

 sed -i -e 's%<span id="unlockedCount">([0-9]*)</span>%'"${unlockedCount}%g" index.html 

I removed the useless use of cat, pulled out a bunch of unnecessary backslashes, added single quotes around the regex to protect it from shell expansion, and fixed the repetition operator. You may need to backslash the parentheses; my sed at least wants \(...\).

Note the use of single and double quotes next to each other. Single quotes protect against shell expansion, so you cannot use them around "${unlockedCount}", where you want the shell to interpolate the variable.
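To see the quoting at work, here is a small sketch that prints the sed program the shell actually builds. The value 7 and the variable name program are made up for illustration, and the pattern keeps the span tag in the replacement, which is a choice of this example only.

```shell
#!/bin/sh
# Illustrative value; in the real script this would come from external data.
unlockedCount=7

# Three adjacent pieces form one shell word: a single-quoted part (taken
# literally), a double-quoted part (the variable is expanded), and a
# single-quoted tail.
program='s|<span id="unlockedCount">[0-9]*</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g'

# Show the program exactly as sed would receive it.
printf '%s\n' "$program"
```

Only the middle piece is touched by the shell; everything inside the single quotes reaches sed unchanged.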


What you say you want to do is not what you are telling sed to do.

You want to insert the number into the tag, or replace the number that is already there. What you are telling sed to do is replace the whole span tag, together with its contents (empty or a number), with the value of the shell variable.
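The difference is easy to check on a single line read from standard input. This is only a sketch: the sample text, the value 7, and the variable names replaced and filled are invented for the illustration.

```shell
#!/bin/sh
unlockedCount=7   # illustrative value
html='Unlocked <span id="unlockedCount"></span> achievements'

# Replacing the whole match with the value drops the span tag.
replaced=$(printf '%s\n' "$html" |
  sed 's|<span id="unlockedCount">[0-9]*</span>|'"${unlockedCount}"'|g')

# Repeating the tag in the replacement keeps it and fills in the number.
filled=$(printf '%s\n' "$html" |
  sed 's|<span id="unlockedCount">[0-9]*</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g')

printf '%s\n' "$replaced"   # Unlocked 7 achievements
printf '%s\n' "$filled"     # Unlocked <span id="unlockedCount">7</span> achievements
```

Once the tag is gone, a second run has nothing left to match, which is why the first form cannot be used for repeated updates.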

You also use many complex, annoying, and error-prone escape sequences that are simply not needed.

Here is what you want:

 sed -r -i -e 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html 

Please note the differences:

  • Added -r to enable extended regular expressions, without which your capture pattern would not work.
  • Used | instead of / as the delimiter for the substitution, to avoid having to escape / .
  • Single-quoted the sed expression so that the shell does not interpret anything inside it.
  • Included the matched span tag in the replacement so that it is not deleted.
  • To expand the unlockedCount variable, closed the single-quoted string, added the variable in double quotes, and then opened single quotes again.
  • Omitted cat | which is useless here.
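Putting these points together, the following sketch updates all three spans from the question and runs twice, once on empty tags and once on filled ones. The helper name set_span and the numbers are hypothetical. It writes to a temporary file and moves it into place instead of using -i, purely to keep the example independent of the sed variant, and it assumes the ids and values contain no |, &, or backslash.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): set the numeric content of
# <span id="ID"> in FILE to VALUE.
set_span() {
  id=$1 value=$2 file=$3
  tmp=$(mktemp) || return 1
  sed 's|<span id="'"$id"'">[0-9]*</span>|<span id="'"$id"'">'"$value"'</span>|g' "$file" > "$tmp" &&
    mv "$tmp" "$file"
}

# Build a throwaway page containing the line from the question.
page=$(mktemp)
printf '%s\n' 'Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements for <span id="totalPoints"></span> points.' > "$page"

# First run: the spans are still empty.
set_span unlockedCount 3 "$page"
set_span totalCount 10 "$page"
set_span totalPoints 45 "$page"
first=$(cat "$page")

# Second run: the spans already hold numbers, which are replaced.
set_span unlockedCount 4 "$page"
set_span totalPoints 60 "$page"
second=$(cat "$page")

printf '%s\n%s\n' "$first" "$second"
rm -f "$page"
```

Because the tag is kept in the replacement and [0-9]* also matches an empty string, the same command works on every run.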

I also used double quotes around the shell variable expansion because it is good practice, but if the value does not contain spaces, they are not necessary.

Strictly speaking, I did not have to add -r. Plain old sed will work if you write \([0-9]\{0,\}\) , but the idea here was to simplify.
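A quick way to convince yourself that the two spellings are equivalent is to run both on the same input. This sketch uses -E rather than -r, since GNU sed accepts both and -E is also understood by BSD sed; the sample input and the [\1] replacement are there only to make the captured group visible.

```shell
#!/bin/sh
html='<span id="unlockedCount">12</span>'

# Basic regular expressions: the group and the interval need backslashes.
bre=$(printf '%s\n' "$html" |
  sed 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|[\1]|')

# Extended regular expressions: the same pattern without the backslashes.
ere=$(printf '%s\n' "$html" |
  sed -E 's|<span id="unlockedCount">([0-9]{0,})</span>|[\1]|')

# Both lines show the captured digits in brackets.
printf '%s\n%s\n' "$bre" "$ere"
```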


Source: https://habr.com/ru/post/895860/

