Regular expression to remove the parentheses from the filename in an HTML image tag

Say I have HTML with an image tag like this:

<p> (1) some image is below: <img src="/somwhere/filename_(1).jpg"> </p> 

I want a regular expression that removes only the parentheses in the file name, so that my HTML looks like this:

 <p> (1) some image is below: <img src="/somwhere/filename_1.jpg"> </p> 

Does anyone know how to do this? I am using C#, if that matters.

I will be eternally grateful and send you very nice karma. :)

+4
5 answers

I suspect your job would be much easier if you used the HTML Agility Pack instead of a regular expression. Judging by the other answers, it would make parsing the HTML a lot easier for what you are trying to achieve.

Hope this helps. Regards, Tom.
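
For illustration, here is a rough sketch of what the parser-based approach might look like. It assumes the HtmlAgilityPack NuGet package is installed; the class and method names (AgilityPackDemo, StripSrcParens) are made up for this example and are not part of the library.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

public static class AgilityPackDemo
{
    // Hypothetical helper: removes "(" and ")" from the src of every img element.
    public static string StripSrcParens(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty list) when nothing matches.
        var images = doc.DocumentNode.SelectNodes("//img[@src]");
        if (images != null)
        {
            foreach (HtmlNode img in images)
            {
                string src = img.GetAttributeValue("src", "");
                img.SetAttributeValue("src", src.Replace("(", "").Replace(")", ""));
            }
        }

        // Serialize the (possibly modified) document back to a string.
        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        string input = "<p> (1) some image is below: <img src=\"/somwhere/filename_(1).jpg\"> </p>";
        Console.WriteLine(StripSrcParens(input));
    }
}
```

Because only the src attribute value is edited, the "(1)" in the paragraph text is never touched, and the quoting style of the attribute does not matter.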

+1

This (fairly dense) regex should do it:

 string s = Regex.Replace(input, @"(<img\s+[^>]*src=""[^""]*)\((\d+)\)([^""]*""[^>]*>)", "$1$2$3"); 
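
As a small, self-contained sketch of how this pattern behaves on the sample from the question (the class and helper names are invented for the example):

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitParenDemo
{
    // Same pattern as above: group 1 is the tag up to "(", group 2 is the
    // digits, group 3 is the rest of the tag after ")".
    private const string Pattern =
        @"(<img\s+[^>]*src=""[^""]*)\((\d+)\)([^""]*""[^>]*>)";

    // Hypothetical helper name, introduced only for this example.
    public static string StripDigitParens(string html)
    {
        return Regex.Replace(html, Pattern, "$1$2$3");
    }

    public static void Main()
    {
        string input = "<p> (1) some image is below: <img src=\"/somwhere/filename_(1).jpg\"> </p>";
        Console.WriteLine(StripDigitParens(input));

        // \d+ only matches digits, so parentheses around other text are left alone.
        Console.WriteLine(StripDigitParens("<img src=\"/somwhere/filename_(a).jpg\">"));
    }
}
```

Note that the pattern only unwraps parentheses that contain digits, which is exactly the format shown in the question.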
+1

The regex above is fine if the file names always match that format, but this one matches any parenthesis anywhere in the attribute:

 s = Regex.Replace(s, @"(?i)(?<=<img\s+[^>]*\bsrc\s*=\s*""[^""]*)[()]", "");

The lookbehind ensures that the match occurs inside the src attribute of an img tag. It assumes the attribute value is enclosed in double quotes; if you need to allow single quotes (apostrophes) or no quotes at all, the regex becomes much more complicated. I will post it if you need it.
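
A minimal sketch of this pattern applied to the sample HTML, with the input string passed as the first argument to Regex.Replace. It relies on .NET supporting variable-length lookbehind; the class and helper names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SrcParenDemo
{
    // The lookbehind requires "<img ... src=" plus an opening double quote and
    // no closing quote before the current position, so only parentheses
    // inside the src value can match.
    private const string Pattern =
        @"(?i)(?<=<img\s+[^>]*\bsrc\s*=\s*""[^""]*)[()]";

    // Hypothetical helper name, introduced only for this example.
    public static string StripSrcParens(string html)
    {
        return Regex.Replace(html, Pattern, "");
    }

    public static void Main()
    {
        string input = "<p> (1) some image is below: <img src=\"/somwhere/filename_(1).jpg\"> </p>";

        // The "(1)" in the paragraph text stays; only the src value changes.
        Console.WriteLine(StripSrcParens(input));
    }
}
```

Parentheses in other attributes of the same tag, such as alt, should also be left alone, because the lookbehind cannot cross the closing quote of the src value.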

+1

In this simple case, you can just use string.Replace, for example:

 string imgFilename = "/somewhere/image_(1).jpg";
 imgFilename = imgFilename.Replace("(", "").Replace(")", "");

Or do you need a regular expression to replace the full tag inside an HTML string?

0
 string result = Regex.Replace(some_input, @"(?<=<\s*img\s*src\s*=\s*""[^""]*?)(?:\(|\))(?=[^""]*?""\s*\/?\s*?>)", "");

It finds a ( or ) that is preceded by <img src=" and optional text, with any amount of whitespace between the tokens (in .NET, \s also covers line breaks), and followed by optional text and "> or "/>, again with any amount of whitespace, and replaces it with nothing.
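
A quick sketch to show where this pattern applies. Because the lookbehind expects src directly after img and the lookahead expects the tag to close right after the quoted value, it appears to match only when src is the sole attribute. The class and helper names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OnlyAttributeDemo
{
    private const string Pattern =
        @"(?<=<\s*img\s*src\s*=\s*""[^""]*?)(?:\(|\))(?=[^""]*?""\s*\/?\s*?>)";

    // Hypothetical helper name, introduced only for this example.
    public static string StripParens(string html)
    {
        return Regex.Replace(html, Pattern, "");
    }

    public static void Main()
    {
        // src is the only attribute: the parentheses in the file name are removed.
        Console.WriteLine(StripParens(
            "<p> (1) some image is below: <img src=\"/somwhere/filename_(1).jpg\"> </p>"));

        // An extra attribute after src makes the lookahead fail: nothing changes.
        Console.WriteLine(StripParens(
            "<img src=\"/somwhere/filename_(1).jpg\" alt=\"demo\">"));
    }
}
```

If the tags in your HTML can carry other attributes, one of the less restrictive patterns in the other answers is probably a better fit.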

0

Source: https://habr.com/ru/post/1299791/

