RegEx for <li> </li> Tags

I am working on a C# WinForms application. In this application, I have a snippet like this:

    <ul>
    <li>abc
    <li>bbc
    <li>xyz
    <li>pqr </li></li></li></li>
    </ul>

but I want to get output like this:

    <ul>
    <li>abc</li>
    <li>bbc</li>
    <li>xyz</li>
    <li>pqr</li>
    </ul>

Is there any way to accomplish this?

Can anyone suggest a regex for this problem?

Thanks. Best wishes.

+4
6 answers

This is simple to do without any fancy regex.

Try the steps below; you can implement them in your own code:

  1. First, remove every </li> from the snippet: line = line.Replace("</li>", "");
  2. Read each line and check whether it starts with <li>: if (line.StartsWith("<li>"))
  3. If it does, append </li> to the end of the line: line += "</li>";
  4. Combine all the lines: resString += line;
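
The steps above can be sketched as follows. This is only a minimal illustration, assuming the input has one item per line (as in the strOrg string used in a later answer); the class name ListItemFixer and the method CloseListItems are hypothetical, and the Trim() call is an addition that is not part of the original steps.

```csharp
using System;
using System.Text;

public static class ListItemFixer
{
    // Hypothetical helper: applies the four steps line by line.
    public static string CloseListItems(string snippet)
    {
        var result = new StringBuilder();
        foreach (string rawLine in snippet.Split('\n'))
        {
            // Step 1: drop every existing </li>, then trim (Trim is an assumption).
            string line = rawLine.Replace("</li>", "").Trim();
            if (line.Length == 0) continue;

            // Steps 2 and 3: close every line that opens a list item.
            if (line.StartsWith("<li>", StringComparison.OrdinalIgnoreCase))
                line += "</li>";

            // Step 4: combine the lines again.
            result.Append(line).Append('\n');
        }
        return result.ToString().TrimEnd('\n');
    }
}
```

Because the check is line-based, an item whose text wraps onto a second line would not be closed correctly.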
+2

This works on your specific example, but it may break on other input (for example, if a <li> item spans a line break), so if it does not give the desired results, please edit your question to add more detail.

    cleanString = Regex.Replace(subjectString, "(?:</li>)+", "", RegexOptions.IgnoreCase);
    resultString = Regex.Replace(cleanString, "<li>(.*)", "<li>$1</li>", RegexOptions.IgnoreCase);
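
To make the two passes easier to try out, here is a small self-contained wrapper around the same two Regex.Replace calls. The class name RegexListFixer and the method name are hypothetical; the patterns are the ones from the answer, and the input is assumed to have one item per line separated by "\n".

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexListFixer
{
    // Hypothetical wrapper around the two replacements shown in the answer.
    public static string CloseListItems(string subjectString)
    {
        // Pass 1: remove every run of closing </li> tags.
        string cleanString = Regex.Replace(
            subjectString, "(?:</li>)+", "", RegexOptions.IgnoreCase);

        // Pass 2: '.' does not match '\n' by default, so (.*) captures
        // the rest of the current line and </li> is appended to it.
        return Regex.Replace(
            cleanString, "<li>(.*)", "<li>$1</li>", RegexOptions.IgnoreCase);
    }
}
```

Note that the approach depends on the line breaks: if the whole list sits on a single line, the greedy (.*) runs to the end of that line, so only one </li> is added, after </ul>. Trailing whitespace inside an item (as in "pqr ") is kept as it is.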
+2

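    public string AddLiandOl(string xhtml)
    {
        // Drop the existing closers, then close the previous item
        // before every <li> and before the end of each list.
        xhtml = xhtml.Replace("</li>", string.Empty);
        xhtml = xhtml.Replace("<li>", "</li><li>");
        xhtml = xhtml.Replace("</ol>", "</li></ol>");
        xhtml = xhtml.Replace("</ul>", "</li></ul>");

        // The first <li> of each list now has a stray </li> in front of it; remove it.
        // .*? also covers a <ul> that is immediately followed by <li>.
        Regex replaceul = new Regex("<ul>(.*?)</li>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
        xhtml = replaceul.Replace(xhtml, "<ul>");
        Regex replaceol = new Regex("<ol>(.*?)</li>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
        xhtml = replaceol.Replace(xhtml, "<ol>");
        return xhtml;
    }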

Try this; I tested it and it works. It will take almost 30 seconds to replace all tags.

+1
    StringBuilder output = new StringBuilder("<ul>\n");
    // Each match is "<li>" plus the word characters that follow it.
    foreach (Match i in Regex.Matches(snippet, "<li>\\w*"))
    {
        output.Append(i.Value).Append("</li>\n");
    }
    output.Append("</ul>");
0

This is not the prettiest solution to your problem, but it is insanely fast. Regular expressions are slow compared to plain string methods.

I compared my string method against Tim Pietzcker's two Regex.Replace calls. (Sorry Tim, I had to pick someone, and you have the upvotes :))

This is over 10,000 repetitions; the numbers are elapsed ticks:

regex replace: avg: 40.9659, max: 2273

string replace: avg: 18.4566, max: 1478

    string strOrg =
        "<ul>\n" +
        "<li>abc\n" +
        "<li>bbc\n" +
        "<li>xyz\n" +
        "<li>pqr </li></li></li></li>\n" +
        "</ul>";
    string strFinal = FixUnorderedList(strOrg);

    public static string FixUnorderedList(string str)
    {
        // remove what we're going to put back later
        // (these could be placed on the same line, one after the other)
        str = str.Replace("\n", string.Empty);
        str = str.Replace("</li>", string.Empty);
        str = str.Replace("<ul>", string.Empty);
        str = str.Replace("</ul>", string.Empty);

        // get each li element
        string[] astrLIs = str.Split(new string[] { "<li>" }, StringSplitOptions.RemoveEmptyEntries);

        // rebuild the list correctly
        string strFinal = "<ul>";
        foreach (string strLI in astrLIs)
            strFinal += string.Format("\n<li>{0}</li>", strLI.Trim());
        strFinal += "\n</ul>";
        return strFinal;
    }
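
The answer does not show how the tick counts were collected. One possible way to gather an average and a maximum over many repetitions is sketched below with System.Diagnostics.Stopwatch; the TickTimer class and its Measure method are hypothetical and are not the harness the author used, so the numbers it produces will not match the ones quoted above.

```csharp
using System;
using System.Diagnostics;

public static class TickTimer
{
    // Hypothetical harness: calls `fix` on `input` `reps` times and
    // returns the average and maximum Stopwatch ticks per call.
    public static (double Avg, long Max) Measure(Func<string, string> fix, string input, int reps)
    {
        long total = 0;
        long max = 0;
        for (int i = 0; i < reps; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            fix(input);
            sw.Stop();

            total += sw.ElapsedTicks;
            if (sw.ElapsedTicks > max) max = sw.ElapsedTicks;
        }
        return ((double)total / reps, max);
    }
}
```

It could be called as TickTimer.Measure(FixUnorderedList, strOrg, 10000) for the string version, and with a lambda wrapping the two Regex.Replace calls for the regex version. The first few calls usually include JIT compilation time, which may explain a maximum far above the average.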
0
    string unorderlist = "<ul><li>ONE</li><li>TWO</li><li>THREE</li></ul>";
    Regex regexul = new Regex("<ul>");
    Match m = regexul.Match(unorderlist);
    if (m.Success)
    {
        unorderlist = regexul.Replace(unorderlist, string.Empty);
        Regex regex1 = new Regex("<li>");
        unorderlist = regex1.Replace(unorderlist, ":");
        Regex regex2 = new Regex("</li>");
        unorderlist = regex2.Replace(unorderlist, "\n");
        Regex regex3 = new Regex("</ul>");
        unorderlist = regex3.Replace(unorderlist, "\n");
        Console.WriteLine(unorderlist);
    }
0

Source: https://habr.com/ru/post/1332883/

