Regex to parse HTML from CDATA with C#

Question

I would like to parse any HTML data that is returned in CDATA.

As an example <![CDATA[<table><tr><td>Approved</td></tr></table>]]>

Thanks!

+3

c# regex cdata

Little Larry Sellers May 01, '09 at 17:14

6 answers

I know this may seem incredibly simple, but have you tried string.Replace()?

string x = "<![CDATA[<table><tr><td>Approved</td></tr></table>]]>";
string y = x.Replace("<![CDATA[", string.Empty).Replace("]]>", string.Empty);

There are probably more efficient ways to handle this, but maybe you need something easy ...

+4

Scott anderson May 01, '09 at 17:21

A simple pattern for this:

/<!\[CDATA\[(.*?)\]\]>/

+2

Chad Birch May 01, '09 at 17:22

The regular expression for finding CDATA sections would be:

(?:<!\[CDATA\[)(.*?)(?:\]\]>)
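A minimal sketch of how a pattern like this could be used from C#, assuming the input may hold several CDATA sections and that their content may span multiple lines. The class name CDataExtractor and the method ExtractAll are hypothetical, and the non-capturing groups are dropped because they do not change what is matched:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CDataExtractor
{
    // Lazy match between the CDATA delimiters.
    // RegexOptions.Singleline lets '.' match newlines as well.
    private static readonly Regex CDataPattern =
        new Regex(@"<!\[CDATA\[(.*?)\]\]>", RegexOptions.Singleline);

    // Returns the content of every CDATA section found in the input, in order.
    public static List<string> ExtractAll(string input)
    {
        var results = new List<string>();
        foreach (Match m in CDataPattern.Matches(input))
        {
            // Group 1 is the text between "<![CDATA[" and "]]>".
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
```

Calling CDataExtractor.ExtractAll on the sample from the question should give a list with a single entry holding the table markup.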

+1

Tomalak May 01, '09 at 17:23

// Verbatim string (@) so the backslashes reach the regex engine unchanged
Regex r = new Regex(@"(?<=<!\[CDATA\[).*?(?=\]\])");

0

patjbs May 01, '09 at 17:25

Why do you want to use Regex for such a simple task? Try the following:

str = str.Trim().Substring(9);
str = str.Substring(0, str.Length-3);

0

Adren Sep 09 '11 at 15:47

Ron harlev · Accepted Answer · 2009-05-01T17:24:38+0000

The expression to process your example will be

\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>

The group "text" will contain your HTML.

The C# code you need:

using System.Text.RegularExpressions;

RegexOptions options = RegexOptions.None;
Regex regex = new Regex(@"\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>", options);
string input = @"<![CDATA[<table><tr><td>Approved</td></tr></table>]]>";

// Check for a match
Match match = regex.Match(input);
if (match.Success)
{
    // The named group "text" holds the HTML inside the CDATA section
    string HTMLtext = match.Groups["text"].Value;
}

The variable "input" is only there to hold the sample input from your question.
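One thing to watch for: the character class [^\]]* in the pattern above stops at the first "]", so it will not match if the HTML itself contains a closing bracket. Here is a rough sketch comparing it with a lazy variant. The class CDataBracketDemo, its fields and the Extract helper are made-up names for illustration, and the lazy pattern is an alternative I am assuming fits your data, not part of the pattern given above:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CDataBracketDemo
{
    // Pattern from the answer above: the content may not contain ']'.
    public static readonly Regex CharClass =
        new Regex(@"\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>");

    // Assumed alternative: lazy match up to the first "]]>".
    public static readonly Regex Lazy =
        new Regex(@"<!\[CDATA\[(?<text>.*?)\]\]>", RegexOptions.Singleline);

    // Returns the captured HTML, or null when the pattern does not match.
    public static string Extract(Regex regex, string input)
    {
        Match match = regex.Match(input);
        return match.Success ? match.Groups["text"].Value : null;
    }
}
```

For an input such as <![CDATA[<td>a[1]</td>]]>, Extract with CharClass should return null, while Extract with Lazy should return <td>a[1]</td>. Both should behave the same on the sample from the question.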
