The most efficient way to fix invalid JSON

I am stuck in an impossible situation. I receive JSON "from outer space" (there is no way to change how it is produced). Here is the JSON:

{ user:'180111', title:'I\'m sure "E pluribus unum" means \'Out of Many, One.\' \n\nhttp://en.wikipedia.org/wiki/E_pluribus_unum.\n\n\'', date:'2007/01/10 19:48:38', "id":"3322121", "previd":112211, "body":"\'You\' can \"read\" more here [url=http:\/\/en.wikipedia.org\/?search=E_pluribus_unum]E pluribus unum[\/url]'s. Cheers \\*/ :\/", "from":"112221", "username":"mikethunder", "creationdate":"2007\/01\/10 14:04:49" } 

"This is nowhere near valid JSON," I said. Their answer was: "Hmm, but JavaScript can read it without complaint":

    <html>
    <script type="text/javascript">
        var obj = {"PUT JSON FROM UP THERE HERE"};
        document.write(obj.title);
        document.write("<br />");
        document.write(obj.creationdate + " " + obj.date);
        document.write("<br />");
        document.write(obj.body);
        document.write("<br />");
    </script>
    <body>
    </body>
    </html>

Problem

I have to read and parse this string in .NET (4), and it broke 3 of the 14 libraries listed in the C# section of json.org (I have not tried the rest). As a workaround, I wrote the following function to fix the single and double quotes.

    public static string JSONBeautify(string InStr)
    {
        bool inSingleQuote = false;
        bool inDoubleQuote = false;
        bool escaped = false;

        StringBuilder sb = new StringBuilder(InStr);
        // Replace all instances of "grave accent" with "fish" so we can use that mark later.
        // Hopefully there is no "fish" in our JSON.
        sb = sb.Replace("`", "<°)))><");

        for (int i = 0; i < sb.Length; i++)
        {
            switch (sb[i])
            {
                case '\\':
                    if (!escaped)
                        escaped = true;
                    else
                        escaped = false;
                    break;
                case '\'':
                    if (!inSingleQuote && !inDoubleQuote)
                    {
                        sb[i] = '"'; // Change opening single quote string markers to double quote
                        inSingleQuote = true;
                    }
                    else if (inSingleQuote && !escaped)
                    {
                        sb[i] = '"'; // Change closing single quote string markers to double quote
                        inSingleQuote = false;
                    }
                    else if (escaped)
                    {
                        escaped = false;
                    }
                    break;
                case '"':
                    if (!inSingleQuote && !inDoubleQuote)
                    {
                        inDoubleQuote = true; // This is an opening double quote string marker
                    }
                    else if (inSingleQuote && !escaped)
                    {
                        sb[i] = '`'; // Change unescaped double quote to grave accent
                    }
                    else if (inDoubleQuote && !escaped)
                    {
                        inDoubleQuote = false; // This is a closing double quote string marker
                    }
                    else if (escaped)
                    {
                        escaped = false;
                    }
                    break;
                default:
                    escaped = false;
                    break;
            }
        }

        return sb.ToString()
            .Replace("\\/", "/")      // Remove all instances of escaped / (\/); hopefully no smileys in string
            .Replace("`", "\\\"")     // Change all "grave accent"s to escaped double quote \"
            .Replace("<°)))><", "`")  // Change all fishes back to "grave accent"
            .Replace("\\'", "'");     // Change all escaped single quotes to just single quote
    }

Now JSONLint only complains about the unquoted attribute names, and I can use the JSON.NET and SimpleJSON libraries to parse the above JSON.

Question

I am sure my code is not the best way to fix the JSON above. Is there any scenario that could break my code? Is there a better way to do this?

2 answers

You need to run this through JavaScript. Launch a JavaScript parser from .NET, give it the string as input, and use JavaScript's native JSON.stringify to convert it:

    obj = {
        "user":'180111',
        "title":'I\'m sure "E pluribus unum" means \'Out of Many, One.\' \n\nhttp://en.wikipedia.org/wiki/E_pluribus_unum.\n\n',
        "date":'2007/01/10 19:48:38',
        "id":"3322121",
        "previd":"112211",
        "body":"\'You\' can \"read\" more here [url=http:\/\/en.wikipedia.org\/?search=E_pluribus_unum]E pluribus unum[\/url]'s. Cheers \\*/ :\/",
        "from":"112221",
        "username":"mikethunder",
        "creationdate":"2007\/01\/10 14:04:49"
    };
    console.log(JSON.stringify(obj));
    document.write(JSON.stringify(obj));

Remember that the string (or rather the object) that you received is invalid JSON and cannot be parsed using the JSON library. It must first be converted to valid JSON. However, it is valid JavaScript.
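The difference between "valid JavaScript" and "valid JSON" can be sketched as follows. This is only an illustration under stated assumptions: the input is a shortened sample in the style of the question's data, and `isValidJson` and `alienToJson` are hypothetical helper names, not part of any library. Evaluating text with `new Function` executes it as code, so this is only acceptable for input you trust.

```javascript
// Shortened sample in the style of the question's input (an assumption, not the full data):
// unquoted keys, single-quoted strings, escaped single quotes and escaped slashes.
const alienText = String.raw`{ user:'180111', title:'I\'m sure "E pluribus unum" means \'Out of Many, One.\'', "creationdate":"2007\/01\/10 14:04:49" }`;

// Hypothetical helper: reports whether a strict JSON parser accepts the text.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: evaluates the text as a JavaScript object literal,
// then serializes the resulting object as strict JSON.
// WARNING: this runs the text as code; use it only on trusted input.
function alienToJson(text) {
  const value = new Function("return (" + text + ");")();
  return JSON.stringify(value);
}

const properJson = alienToJson(alienText);

console.log(isValidJson(alienText));  // the raw text is rejected by JSON.parse
console.log(isValidJson(properJson)); // the re-serialized text is accepted
console.log(properJson);
```

The re-serialized string has double-quoted keys and values, so any strict JSON library should accept it.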

To complete this answer: you can use JavaScriptSerializer in .NET. For this solution you will need the following namespaces (JavaScriptSerializer itself ships in the System.Web.Extensions assembly):

  • System.Net
  • System.Web.Script.Serialization

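      // Requires these using directives at the top of the file:
      // using System.Collections.Generic;
      // using System.Net;
      // using System.Web.Script.Serialization;
      var webClient = new WebClient();
      string readHtml = webClient.DownloadString("uri to your source (extraterrestrial)");
      var a = new JavaScriptSerializer();
      Dictionary<string, object> results = a.Deserialize<Dictionary<string, object>>(readHtml);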

How about this:

    string AlienJSON = "your alien JSON";
    JavaScriptSerializer js = new JavaScriptSerializer();
    string ProperJSON = js.Serialize(js.DeserializeObject(AlienJSON));

Or just use the object directly after deserializing it, instead of converting it back to a string and passing it to a JSON parser, which only adds an extra headache.

As Mouser mentioned, you need to use System.Web.Script.Serialization, which becomes available once you reference System.Web.Extensions.dll in your project. To do that, you need to change the Target framework in the project properties to .NET Framework 4.

EDIT

The trick to consuming the deserialized object is to use dynamic:

    JavaScriptSerializer js = new JavaScriptSerializer();
    dynamic obj = js.DeserializeObject(AlienJSON);

For the JSON in your question, just use:

 string body = obj["body"]; 

Or, if your JSON is an array:

    if (obj is Array)
    {
        foreach (dynamic o in obj)
        {
            string body = o["body"]; // read from the current element, not obj[0]
            // ... do something with it
        }
    }

Source: https://habr.com/ru/post/982132/

