How can I match and remove the backslash "\" and "\n" characters using the .NET Regex library?

I receive XML from a web service in the following format, and I want to clean it up (remove the extra "\" and "\n" characters) before working with it. I am currently using a regex for this. However, only the "\n" characters are removed; the "\" characters between the equals sign and the double quotes remain.

What would you advise?

private string ValidateXml(string dirtyXml) {
    Regex regex = new Regex(@"[\\\][\n]");
    var cleanXml = regex.Replace(dirtyXml, "");
    return cleanXml;
}

"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\n<ISBNdb server_time=\"2010-01-28T11:31:08Z\">\n<BookList total_results=\"1\" page_size=\"10\" page_number=\"1\" shown_results=\"1\">\n<BookData book_id=\"quantitative_techniques\" isbn=\"0826458548\" isbn13=\"9780826458544\">\n<Title>Quantitative techniques</Title>\n<TitleLong></TitleLong>\n<AuthorsText>Terry Lucey</AuthorsText>\n<PublisherText publisher_id=\"continuum\">London : Continuum, 2002.</PublisherText>\n</BookData>\n</BookList>\n</ISBNdb>\n"
+3
4 answers

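First question: does the XML really contain these characters (literal backslashes and the surrounding quotes), or are you looking at it in the debugger, which displays quotes as \" and newlines as \n? If the characters really are in the string, you need to remove every backslash and, where a backslash is followed by "n", that "n" as well. For example: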

static void Main(string[] args)
{
  string dirtyXml = @"""<?xml version=\""1.0\"" encoding=\""UTF-8\""?>\n\n<ISBNdb server_time=\""2010-01-28T11:31:08Z\"">\n<BookList total_results=\""1\"" page_size=\""10\"" page_number=\""1\"" shown_results=\""1\"">\n<BookData book_id=\""quantitative_techniques\"" isbn=\""0826458548\"" isbn13=\""9780826458544\"">\n<Title>Quantitative techniques</Title>\n<TitleLong></TitleLong>\n<AuthorsText>Terry Lucey</AuthorsText>\n<PublisherText publisher_id=\""continuum\"">London : Continuum, 2002.</PublisherText>\n</BookData>\n</BookList>\n</ISBNdb>\n""";
  Console.WriteLine(dirtyXml);
  Console.WriteLine();
  Console.WriteLine(Regex.Replace(dirtyXml, @"^""|""$|\\n?", ""));
}

Output:

"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\n<ISBNdb server_time=\"2010-01-28T11:31:08Z\">\n<BookList total_results=\"1\" page_size=\"10\" page_number=\"1\" shown_results=\"1\">\n<BookData book_id=\"quantitative_techniques\" isbn=\"0826458548\" isbn13=\"9780826458544\">\n<Title>Quantitative techniques</Title>\n<TitleLong></TitleLong>\n<AuthorsText>Terry Lucey</AuthorsText>\n<PublisherText publisher_id=\"continuum\">London : Continuum, 2002.</PublisherText>\n</BookData>\n</BookList>\n</ISBNdb>\n"

<?xml version="1.0" encoding="UTF-8"?><ISBNdb server_time="2010-01-28T11:31:08Z"><BookList total_results="1" page_size="10" page_number="1" shown_results="1"><BookData book_id="quantitative_techniques" isbn="0826458548" isbn13="9780826458544"><Title>Quantitative techniques</Title><TitleLong></TitleLong><AuthorsText>Terry Lucey</AuthorsText><PublisherText publisher_id="continuum">London : Continuum, 2002.</PublisherText></BookData></BookList></ISBNdb>

Is this the result you are after?

+3

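Your pattern is a single character class, so it matches one character, which can be any of:

  • \\ a backslash
  • \] a single ] character
  • [ a single [ character
  • \n a newline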
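
To make this concrete, here is a small sketch that runs the pattern from the question against a short test string. The class name, method name, and sample strings are made up for illustration; only the pattern itself comes from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class CharacterClassDemo
{
    // The pattern from the question. It is ONE character class containing
    // a backslash, ']', '[' and a newline character.
    public const string Pattern = @"[\\\][\n]";

    public static string Strip(string input) => Regex.Replace(input, Pattern, "");

    static void Main()
    {
        // Verbatim string: a literal backslash followed by 'n', plus two brackets.
        string literal = @"a\nb[c]d";
        Console.WriteLine(Strip(literal)); // anbcd (backslash and brackets removed, 'n' kept)

        // Regular string: a real newline character between x and y.
        string realNewline = "x\ny";
        Console.WriteLine(Strip(realNewline)); // xy
    }
}
```

Note that the two-character sequence backslash + n is never matched as a unit: the backslash is removed on its own and the "n" is left behind.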

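What you want instead is:

@"\\n?"

This matches \n or, failing that, a lone \. However, it removes every backslash, including any that should stay in the data. If you only want to remove the backslashes that come before a double quote, use: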

@"(\\n)|(\\(?=""))"
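
The difference between the two patterns shows up on input that contains other backslashes. The following is a minimal sketch; the class name, method names, and the sample string (including the Windows-style path) are hypothetical and chosen only to show the contrast.

```csharp
using System;
using System.Text.RegularExpressions;

static class BackslashPatterns
{
    // Removes every backslash, plus an 'n' that directly follows one.
    public static string RemoveAll(string s) =>
        Regex.Replace(s, @"\\n?", "");

    // Removes only the sequence backslash + n, and backslashes
    // that are immediately followed by a double quote.
    public static string RemoveTargeted(string s) =>
        Regex.Replace(s, @"(\\n)|(\\(?=""))", "");

    static void Main()
    {
        // Hypothetical input: escaped quotes, a path, and an escaped newline.
        string sample = @"<a b=\""1\"">C:\temp</a>\n";

        Console.WriteLine(RemoveAll(sample));      // <a b="1">C:temp</a>
        Console.WriteLine(RemoveTargeted(sample)); // <a b="1">C:\temp</a>
    }
}
```

Neither pattern can tell an escaped newline from a real backslash followed by "n", so a value such as C:\new would still lose its \n with both.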
+1

You do not need a regex for this; String.Replace is enough.

For example:

var cleanXml = dirtyXml.Replace("\\n", "").Replace("\\\"", "\"");
0

Use alternation with | so that the pattern first tries to match \n and, if that fails, a lone \:

[\\][n]|[\\]
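
As a quick check, this sketch applies that pattern to one line of the XML from the question. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationDemo
{
    // First alternative: a backslash followed by 'n'. Second: a lone backslash.
    public static string Clean(string s) =>
        Regex.Replace(s, @"[\\][n]|[\\]", "");

    static void Main()
    {
        // One element from the question's XML, with literal backslashes.
        string dirty = @"<PublisherText publisher_id=\""continuum\"">London : Continuum, 2002.</PublisherText>\n";

        Console.WriteLine(Clean(dirty));
        // <PublisherText publisher_id="continuum">London : Continuum, 2002.</PublisherText>
    }
}
```

This pattern matches exactly the same text as the shorter @"\\n?", since both remove a backslash together with an optional "n" after it.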
0

Source: https://habr.com/ru/post/1730378/

