How to stop .NET XML serialization from inserting illegal characters

Anything below 0x20 (except 0x09, 0x0a, and 0x0d, i.e., tab, line feed, and carriage return) cannot be included in an XML document.
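As a rough illustration of that rule, the sketch below checks a string with `XmlConvert.IsXmlChar`, assuming .NET Framework 4.0 or later. The `XmlCharCheck` class and the `IsXmlSafe` method are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class XmlCharCheck
{
    // Hypothetical helper: true when every UTF-16 code unit is a legal XML 1.0 character.
    // Simplification: surrogate pairs are not handled, so text containing them is reported as unsafe.
    public static bool IsXmlSafe(string text)
    {
        return text.All(c => XmlConvert.IsXmlChar(c));
    }

    public static void Main()
    {
        Console.WriteLine(IsXmlSafe("tab\there"));          // True: 0x09 is allowed
        Console.WriteLine(IsXmlSafe("hello \u0012 world")); // False: 0x12 is not
    }
}
```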

I have some data that comes out of a database and is transmitted in response to a web service request.

The SOAP formatter happily encodes the 0x12 character (ASCII 18, Device Control 2) as &#x12;, but the response fails on the client with "hexadecimal value 0x12, is an invalid character".

<rant> What I find pretty disappointing is that these are two sides of the same coin: both the client and the service are .NET applications. Why does the SOAP formatter write bad XML if nothing can read it? </rant>

I would like to either

  • Get the XML serializer to handle these odd characters correctly, or
  • Have the request fail in the web service

I have googled this and could not find much, apart from: a) "sanitize your inputs" or b) "change your document structure."

a) is not a runner, as some of this data is 20+ years old.

b) is not a great option, since in addition to our own front end, we have clients that call the web service directly.

Is there something obvious that I'm missing? Or is it just a case of coding around these ASCII control codes?

thanks

Update
This is actually a problem with XmlSerializer: the following code serializes the invalid character to the stream, but cannot deserialize it.

    using System;
    using System.IO;
    using System.Text;
    using System.Xml.Serialization;

    [Serializable]
    public class MyData
    {
        public string Text { get; set; }
    }

    class Program
    {
        public static void Main(string[] args)
        {
            var myData = new MyData { Text = "hello " + ASCIIEncoding.ASCII.GetString(new byte[] { 0x12 }) + " world" };
            var serializer = new XmlSerializer(typeof(MyData));

            var xmlWriter = new StringWriter();
            serializer.Serialize(xmlWriter, myData);

            var xmlReader = new StringReader(xmlWriter.ToString());
            var newData = (MyData)serializer.Deserialize(xmlReader);
            // Exception: hexadecimal value 0x12, is an invalid character.
        }
    }

I can make it choke while writing the XML by explicitly creating an XmlWriter and passing it to Serialize (I will post this shortly as my own answer), but that still means I have to sanitize my data before sending it.
Since these characters are significant, I can't just strip them out. I need to encode them before transmitting and decode them when reading, and I am really very surprised that there is no existing framework method for this.
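To make the "encode before transmitting, decode when reading" idea concrete, here is a minimal sketch. The `ControlCharCodec` class and its backslash escape scheme are hypothetical, not a framework feature, and both the service and every client would have to agree on the scheme. Lone surrogates are passed through unchanged, which is a simplification.

```csharp
using System;
using System.Text;

public static class ControlCharCodec
{
    // Hypothetical scheme: '\' becomes "\\", and any character that is not legal
    // in XML 1.0 text becomes "\uXXXX" (four hex digits).
    public static string Encode(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            bool legal = c == '\t' || c == '\n' || c == '\r' || (c >= 0x20 && c <= 0xFFFD);
            if (c == '\\')
                sb.Append(@"\\");
            else if (!legal)
                sb.Append(@"\u").Append(((int)c).ToString("X4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Reverses Encode; throws FormatException on a malformed escape sequence.
    public static string Decode(string input)
    {
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c != '\\')
            {
                sb.Append(c);
            }
            else if (i + 1 < input.Length && input[i + 1] == '\\')
            {
                sb.Append('\\');
                i += 1; // skip the second backslash
            }
            else if (i + 5 < input.Length && input[i + 1] == 'u')
            {
                sb.Append((char)Convert.ToInt32(input.Substring(i + 2, 4), 16));
                i += 5; // skip 'u' and the four hex digits
            }
            else
            {
                throw new FormatException("Malformed escape at index " + i);
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string original = "hello \u0012 world";
        string encoded = Encode(original);
        Console.WriteLine(encoded);                     // hello \u0012 world (as literal text)
        Console.WriteLine(Decode(encoded) == original); // True
    }
}
```

The encoded string contains only characters that are legal in XML, so it should survive XmlSerializer in both directions. The cost is that any client unaware of the scheme sees the escaped text rather than the original.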

2 answers

Second: solution

Using DataContractSerializer (which is the default for WCF services) instead of XmlSerializer works a treat:

    using System;
    using System.IO;
    using System.Runtime.Serialization;
    using System.Text;

    [Serializable]
    public class MyData
    {
        public string Text { get; set; }
    }

    class Program
    {
        public static void Main(string[] args)
        {
            var myData = new MyData { Text = "hello " + ASCIIEncoding.ASCII.GetString(new byte[] { 0x12 }) + " world" };
            var serializer = new DataContractSerializer(typeof(MyData));

            var mem = new MemoryStream();
            serializer.WriteObject(mem, myData);

            // Rewind the stream and read the object back.
            mem.Seek(0, SeekOrigin.Begin);
            MyData myData2 = (MyData)serializer.ReadObject(mem);
            Console.WriteLine("myData2 {0}", myData2.Text);
        }
    }

First: workaround

I can make it choke while writing the XML by using an XmlWriter, which is arguably better than the client choking on it. However, it does not fix the underlying problem of sending invalid characters. For example:

    using System;
    using System.IO;
    using System.Text;
    using System.Xml;
    using System.Xml.Serialization;

    [Serializable]
    public class MyData
    {
        public string Text { get; set; }
    }

    class Program
    {
        public static void Main(string[] args)
        {
            var myData = new MyData { Text = "hello " + ASCIIEncoding.ASCII.GetString(new byte[] { 0x12 }) + " world" };
            var serializer = new XmlSerializer(typeof(MyData));

            var sw = new StringWriter();
            using (var writer = XmlWriter.Create(sw))
            {
                serializer.Serialize(writer, myData);
                // Exception: hexadecimal value 0x12, is an invalid character
            }

            var xmlReader = new StringReader(sw.ToString());
            var newUser = (MyData)serializer.Deserialize(xmlReader);
            Console.WriteLine("User Name = {0}", newUser.Text);
        }
    }

Combining Binary Worrier's post with a special-character filter scrubs the object nicely before it is returned:

    public List<MyData> MyWebServiceMethod()
    {
        var mydata = GetMyData();
        return Helper.ScrubObjectOfSpecialCharacters<List<MyData>>(mydata);
    }

Helper class:

    // Requires: using System.IO; using System.Text; using System.Xml.Serialization;

    // Serializes the object, filters the XML text, then deserializes the result.
    public static T ScrubObjectOfSpecialCharacters<T>(T obj)
    {
        var serializer = new XmlSerializer(obj.GetType());
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, obj);
            string content = FixSpecialCharacters(writer.ToString());
            using (StringReader reader = new StringReader(content))
            {
                obj = (T)serializer.Deserialize(reader);
            }
        }
        return obj;
    }

    // Keeps printable ASCII and tab, maps en and em dashes to '-', drops the rest.
    // Note: a character already written as a numeric entity (e.g. &#x12;) is
    // printable ASCII and passes through this filter unchanged.
    public static string FixSpecialCharacters(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        StringBuilder output = new StringBuilder();
        for (int i = 0; i < input.Length; i++)
        {
            int charCode = (int)input[i];
            switch (charCode)
            {
                case 8211: // en dash
                case 8212: // em dash
                    output.Append('-');
                    break;
                default:
                    if ((31 < charCode && charCode < 127) || charCode == 9)
                    {
                        output.Append(input[i]);
                    }
                    break;
            }
        }
        return output.ToString();
    }

Source: https://habr.com/ru/post/1382325/

