Remove HTML formatting in Razor MVC 3

Question

I am using MVC 3 with the Razor view engine.

What I am trying to do

I am making a blog using MVC 3, and I want to remove all HTML formatting tags such as <p>, <b>, <i>, etc.

What I am currently doing (it works):

  @{
      post.PostContent = post.PostContent.Replace("<p>", " ");
      post.PostContent = post.PostContent.Replace("</p>", " ");
      post.PostContent = post.PostContent.Replace("<b>", " ");
      post.PostContent = post.PostContent.Replace("</b>", " ");
      post.PostContent = post.PostContent.Replace("<i>", " ");
      post.PostContent = post.PostContent.Replace("</i>", " ");
  }

I feel that there must be a better way to do this. Can anyone point me to it?

+6

asp.net-mvc-3 razor

Yasser Jul 31 '12 at 7:07

4 answers

Regular expressions are slow. Use this instead; it is faster:

 public static string StripHtmlTagByCharArray(string htmlString)
 {
     // The output can never be longer than the input.
     char[] array = new char[htmlString.Length];
     int arrayIndex = 0;
     bool inside = false; // true while scanning between '<' and '>'

     for (int i = 0; i < htmlString.Length; i++)
     {
         char let = htmlString[i];
         if (let == '<') { inside = true; continue; }
         if (let == '>') { inside = false; continue; }
         if (!inside)
         {
             // Copy only characters that are outside of a tag.
             array[arrayIndex] = let;
             arrayIndex++;
         }
     }
     return new string(array, 0, arrayIndex);
 }

You can take a look at http://www.dotnetperls.com/remove-html-tags
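To make the behaviour of this approach concrete, here is a minimal sketch that wraps the same scanning loop in a class and runs it on two inputs. The class names `TagStripper` and `TagStripperDemo` are made up for this illustration. Note that the scanner treats every `<` as the start of a tag, so plain text that happens to contain `<` and `>` is also cut.

```csharp
using System;

public static class TagStripper
{
    // Same idea as the answer above: copy every character that is
    // not located between a '<' and the next '>'.
    public static string StripHtmlTagByCharArray(string htmlString)
    {
        char[] array = new char[htmlString.Length];
        int arrayIndex = 0;
        bool inside = false;

        foreach (char c in htmlString)
        {
            if (c == '<') { inside = true; continue; }
            if (c == '>') { inside = false; continue; }
            if (!inside) { array[arrayIndex++] = c; }
        }
        return new string(array, 0, arrayIndex);
    }
}

public static class TagStripperDemo
{
    public static void Run()
    {
        // Ordinary markup: only the text survives.
        Console.WriteLine(TagStripper.StripHtmlTagByCharArray("<p>Hello, <b>World</b></p>"));
        // prints "Hello, World"

        // Plain text with comparison signs: " 2 and 3 " is treated as a tag.
        Console.WriteLine(TagStripper.StripHtmlTagByCharArray("1 < 2 and 3 > 2"));
        // prints "1  2"
    }
}
```

If post content can contain literal `<` or `>` characters that are not HTML-encoded, this loss of text is worth keeping in mind.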

+2

Edi wang Aug 1 '12 at 5:53

You can use regex.

This article can help you.

0

Jonas t Jul 31 '12 at 7:12

Just in case you want to use a regular expression in .NET to strip HTML tags, the following seems to work very well on the source code of this page. It is better than some of the other answers on this page because it looks for actual HTML tags instead of blindly deleting everything between < and >. Back in the BBS days, we typed <grin> a lot instead of :), so removing <grin> is not an option. :)

This solution only removes the tags. It does not remove the contents of those tags in situations where that may matter, a script tag for example. You will see the script text, but the script will not be executed because the script tag itself is removed. Removing the contents of an HTML tag is very tricky, and practically requires that the HTML fragment be well formed ...

Also note the RegexOptions.Singleline option. It is very important for any block of HTML, since there is nothing wrong with opening an HTML tag on one line and closing it on another.
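As an illustration of why `RegexOptions.Singleline` matters, here is a small sketch that uses a cut-down form of the pattern, restricted to the `p` tag only. The class and method names (`SinglelineDemo`, `StripParagraphTags`) are hypothetical. Without `Singleline`, `.` does not match a newline, so an opening tag whose attributes continue on the next line is left in place.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Cut-down pattern: matches <p>, </p>, <p/> and <p ...attributes...>.
    private const string ParagraphTag = @"</?p(\s*/?>|\s+.*?/?>)";

    public static string StripParagraphTags(string html, RegexOptions options)
    {
        return Regex.Replace(html, ParagraphTag, string.Empty, options);
    }

    public static void Run()
    {
        // The opening tag is split across two lines.
        string html = "<p class=\"a\"\n   id=\"b\">Hi</p>";

        // Without Singleline only the closing </p> is removed,
        // because ".*?" cannot cross the line break inside the opening tag.
        Console.WriteLine(StripParagraphTags(html, RegexOptions.None));

        // With Singleline both tags are removed.
        Console.WriteLine(StripParagraphTags(html, RegexOptions.Singleline));
        // prints "Hi"
    }
}
```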

 string strRegex = @"</{0,1}(!DOCTYPE|a|abbr|acronym|address|applet|area|article|aside|audio|b|base|basefont|bdi|bdo|big|blockquote|body|br|button|canvas|caption|center|cite|code|col|colgroup|datalist|dd|del|details|dfn|dialog|dir|div|dl|dt|em|embed|fieldset|figcaption|figure|font|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hr|html|i|iframe|img|input|ins|kbd|keygen|label|legend|li|link|main|map|mark|menu|menuitem|meta|meter|nav|noframes|noscript|object|ol|optgroup|option|output|p|param|pre|progress|q|rp|rt|ruby|s|samp|script|section|select|small|source|span|strike|strong|style|sub|summary|sup|table|tbody|td|textarea|tfoot|th|thead|time|title|tr|track|tt|u|ul|var|video|wbr){1}(\s*/{0,1}>|\s+.*?/{0,1}>)";
 Regex myRegex = new Regex(strRegex, RegexOptions.Singleline);
 string strTargetString = @"<p>Hello, World</p>";
 string strReplace = @"";
 return myRegex.Replace(strTargetString, strReplace);

I am not saying that this is the best answer. It is just one option, and it works great for me.

0

Alex Dresko Jun 10 '14 at 13:52

Yasser · Accepted Answer · 2012-07-31T07:15:20+0000

Thanks Alex Yaroshevich,

Here is what I use now.

 post.PostContent = Regex.Replace(post.PostContent, @"<[^>]*>", String.Empty);
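One possible way to keep this out of the view is to move the same `Regex.Replace` call into a small static helper. This is only a sketch: the class name `HtmlText` and the method name `StripTags` are invented for illustration, and returning an empty string for null or empty input is an assumption about what a blog view would want.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Same pattern as above: '<', then anything that is not '>', then '>'.
    private static readonly Regex AnyTag =
        new Regex(@"<[^>]*>", RegexOptions.Compiled);

    // Returns the input with everything between '<' and '>' removed.
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty; // assumption: treat missing content as empty
        }
        return AnyTag.Replace(html, string.Empty);
    }

    public static void Run()
    {
        Console.WriteLine(StripTags("<p>Hello, <b>World</b></p>"));
        // prints "Hello, World"

        // Only the tags go away; the text inside a script element stays.
        Console.WriteLine(StripTags("<script>alert(1)</script>Hi"));
        // prints "alert(1)Hi"
    }
}
```

In a Razor view this could then be written as `@HtmlText.StripTags(post.PostContent)`. Like the one-liner, it removes anything that looks like a tag, including text such as `<grin>`.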
