What is the format of this data? Is this a custom format?

I get this data as an AJAX response:

{ "idArray" = ( "99516", "99518", "97344", "97345", "98425" ); "frame" = { "size" = { "width" = "8"; "height" = "8"; }; "origin" = { "x" = "244"; "y" = "345"; }; }; }, 

This is just part of the data, but it continues in the same format. I do not have access to the source of the files that generate this data.

Is this a known format or something common?

+6
3 answers

Since people tend to throw regular expressions at everything, even at things that cannot be parsed with regular expressions (i.e., non-regular languages), I wrote a proof-of-concept parser for this data format:

    $input = '{ "idArray" = ( "99516", "99518", "97344", "97345", "98425" ); "frame" = { "size" = { "width" = "8"; "height" = "8"; }; "origin" = { "x" = "244"; "y" = "345"; }; }; }';

    echo json_encode(parse($input));

    function parse($input) {
        $tokens = tokenize($input);
        $index = 0;
        $result = parse_value($tokens, $index);
        if ($result[1] !== count($tokens)) {
            throw new Exception("parsing stopped at token " . $result[1] . " but there is more input");
        }
        return $result[0][1];
    }

    function tokenize($input) {
        $tokens = array();
        $length = strlen($input);
        $pos = 0;
        while ($pos < $length) {
            list($token, $pos) = find_token($input, $pos);
            $tokens[] = $token;
        }
        return $tokens;
    }

    function find_token($input, $pos) {
        $static_tokens = array("=", "{", "}", "(", ")", ";", ",");
        while (preg_match("/\s/mis", substr($input, $pos, 1))) { // eat whitespace
            $pos += 1;
        }
        foreach ($static_tokens as $static_token) {
            if (substr($input, $pos, strlen($static_token)) === $static_token) {
                return array($static_token, $pos + strlen($static_token));
            }
        }
        if (substr($input, $pos, 1) === '"') {
            $length = strlen($input);
            $token_length = 1;
            while ($pos + $token_length < $length) {
                if (substr($input, $pos + $token_length, 1) === '"') {
                    return array(array("value", substr($input, $pos + 1, $token_length - 1)), $pos + $token_length + 1);
                }
                $token_length += 1;
            }
        }
        throw new Exception("invalid input at " . $pos . ": `" . substr($input, $pos - 10, 20) . "`");
    }

    // value is either an object {}, an array (), or a literal ""
    function parse_value($tokens, $index) {
        if ($tokens[$index] === "{") {
            // object: a list of key-value pairs, glued together by ";"
            $return_value = array();
            $index += 1;
            while ($tokens[$index] !== "}") {
                list($key, $value, $index) = parse_key_value($tokens, $index);
                $return_value[$key] = $value[1];
                if ($tokens[$index] !== ";") {
                    throw new Exception("Unexpected: " . print_r($tokens[$index], true));
                }
                $index += 1;
            }
            return array(array("object", $return_value), $index + 1);
        }
        if ($tokens[$index] === "(") {
            // array: a list of values, glued together by ",", the last "," is optional
            $return_value = array();
            $index += 1;
            while ($tokens[$index] !== ")") {
                list($value, $index) = parse_value($tokens, $index);
                $return_value[] = $value[1];
                if ($tokens[$index] === ",") { // last , is optional
                    $index += 1;
                } else {
                    if ($tokens[$index] !== ")") {
                        throw new Exception("Unexpected: " . print_r($tokens[$index], true));
                    }
                    return array(array("array", $return_value), $index + 1);
                }
            }
            return array(array("array", $return_value), $index + 1);
        }
        if ($tokens[$index][0] === "value") {
            return array(array("string", $tokens[$index][1]), $index + 1);
        }
        throw new Exception("Unexpected: " . print_r($tokens[$index], true));
    }

    // find a key (string) followed by '=' followed by a value (any value)
    function parse_key_value($tokens, $index) {
        list($key, $index) = parse_value($tokens, $index);
        if ($key[0] !== "string") { // key must be a string
            throw new Exception("Unexpected: " . print_r($key, true));
        }
        if ($tokens[$index] !== "=") {
            throw new Exception("'=' expected");
        }
        $index += 1;
        list($value, $index) = parse_value($tokens, $index);
        return array($key[1], $value, $index);
    }

Output:

 {"idArray":["99516","99518","97344","97345","98425"],"frame":{"size":{"width":"8","height":"8"},"origin":{"x":"244","y":"345"}}} 

Notes

  • The source input ends with a trailing ",". I removed that character; the parser throws an error ("there is more input") if you put it back.

  • This parser is naive in the sense that it tokenizes all of the input before parsing begins. That is not good for large inputs.

  • I did not add escape detection for strings in the tokenizer, so a string such as "foo\"bar" is not handled.
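The missing escape handling from the last note could be sketched roughly as follows, in JavaScript since that is what the question is about. The helper name readQuotedString is hypothetical, and the escape rule is an assumption: the format's real escaping rules are not known from the sample, so a backslash is simply taken to mean "keep the next character literally".

```javascript
// Hypothetical helper: read a double-quoted string that starts at input[pos],
// honoring backslash escapes. Returns [decodedString, positionAfterClosingQuote].
function readQuotedString(input, pos) {
  if (input[pos] !== '"') throw new Error("'\"' expected at " + pos);
  let result = "";
  let i = pos + 1;
  while (i < input.length) {
    const ch = input[i];
    if (ch === "\\") {
      if (i + 1 >= input.length) break; // dangling backslash: report as unterminated
      result += input[i + 1]; // assumption: \x simply stands for the character x
      i += 2;
    } else if (ch === '"') {
      return [result, i + 1];
    } else {
      result += ch;
      i += 1;
    }
  }
  throw new Error("unterminated string starting at " + pos);
}

// The JavaScript literal below is the text "foo\"bar" = "baz"
const [text, next] = readQuotedString('"foo\\"bar" = "baz"', 0);
console.log(text); // foo"bar
console.log(next); // 10
```

A tokenizer could call a helper like this wherever it currently scans for the next double quote.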

It was fun. If you have any questions, let me know.

Edit: I see this is a JavaScript question. Porting the PHP to JavaScript should not be too complicated. The statement list($foo, $bar) = func() is equivalent to: var res = func(); var foo = res[0]; var bar = res[1];
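A rough sketch of what such a port might look like, assuming a JavaScript engine with ES2015 destructuring (which plays the role of list()). The names tokenize, parseValue and parse are chosen to mirror the PHP functions and are otherwise arbitrary; string tokens are represented as { value: ... } objects, which is a simplification of the PHP token arrays. Like the PHP version, it does not handle escaped quotes.

```javascript
const PUNCTUATION = new Set(["=", "{", "}", "(", ")", ";", ","]);

// Split the input into punctuation tokens and { value: "..." } string tokens.
function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    const ch = input[pos];
    if (/\s/.test(ch)) {
      pos += 1; // skip whitespace
    } else if (PUNCTUATION.has(ch)) {
      tokens.push(ch);
      pos += 1;
    } else if (ch === '"') {
      const end = input.indexOf('"', pos + 1); // no escape handling
      if (end === -1) throw new Error("unterminated string at " + pos);
      tokens.push({ value: input.slice(pos + 1, end) });
      pos = end + 1;
    } else {
      throw new Error("invalid input at " + pos);
    }
  }
  return tokens;
}

// Parse one value starting at tokens[index]; returns [value, nextIndex].
function parseValue(tokens, index) {
  const token = tokens[index];
  if (token === "{") {
    // object: key = value pairs, each terminated by ";"
    const object = {};
    index += 1;
    while (tokens[index] !== "}") {
      const key = tokens[index];
      if (typeof key !== "object") throw new Error("string key expected at token " + index);
      if (tokens[index + 1] !== "=") throw new Error("'=' expected at token " + (index + 1));
      let value;
      [value, index] = parseValue(tokens, index + 2);
      object[key.value] = value;
      if (tokens[index] !== ";") throw new Error("';' expected at token " + index);
      index += 1;
    }
    return [object, index + 1];
  }
  if (token === "(") {
    // array: values separated by ",", the last "," is optional
    const array = [];
    index += 1;
    while (tokens[index] !== ")") {
      let value;
      [value, index] = parseValue(tokens, index);
      array.push(value);
      if (tokens[index] === ",") index += 1;
      else if (tokens[index] !== ")") throw new Error("',' or ')' expected at token " + index);
    }
    return [array, index + 1];
  }
  if (typeof token === "object") return [token.value, index + 1];
  throw new Error("unexpected token at " + index);
}

function parse(input) {
  const tokens = tokenize(input);
  const [value, index] = parseValue(tokens, 0);
  if (index !== tokens.length) {
    throw new Error("parsing stopped at token " + index + " but there is more input");
  }
  return value;
}

const input = '{ "idArray" = ( "99516", "99518", "97344", "97345", "98425" ); "frame" = { "size" = { "width" = "8"; "height" = "8"; }; "origin" = { "x" = "244"; "y" = "345"; }; }; }';
console.log(JSON.stringify(parse(input)));
// {"idArray":["99516","99518","97344","97345","98425"],"frame":{"size":{"width":"8","height":"8"},"origin":{"x":"244","y":"345"}}}
```

As in the PHP version, the trailing "," from the original response has to be removed first, otherwise parse reports that there is more input.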

+3

Try using this function with the response text as a parameter:

    function getJsonData(str) {
        str = str.replace(/,\s*$/, '')   // remove the trailing ,
                 .replace(/\(/g, '[')    // replace ( with [
                 .replace(/\)/g, ']')    // replace ) with ]
                 .replace(/;/g, ',')     // replace ; with , (leaves a , before each }, which JSON.parse rejects)
                 .replace(/=/g, ':');    // replace = with :
        return JSON.parse(str);
    }

This is an edit made by @SamSal, which also removes the ";" before each closing "}":

    function getJsonData(str) {
        str = str.replace(/\(/g, '[')       // replace ( with [
                 .replace(/\)/g, ']')       // replace ) with ]
                 .replace(/;\s*}/g, '}')    // replace ; before } with } (with or without a line break)
                 .replace(/;/g, ',')        // replace remaining ; with ,
                 .replace(/=/g, ':');       // replace = with :
        return JSON.parse(str);
    }
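
One caveat with this kind of character replacement is that it also rewrites characters inside string values. The sketch below illustrates that with a replacement chain in the spirit of the function above; the name naiveConvert and the sample input are made up for the demonstration, and /;\s*}/ is used so that single-line input works too.

```javascript
// Replacement chain written out again so the example is self-contained.
function naiveConvert(str) {
  return str
    .replace(/\(/g, "[")
    .replace(/\)/g, "]")
    .replace(/;\s*}/g, "}")
    .replace(/;/g, ",")
    .replace(/=/g, ":");
}

// Hypothetical input whose value contains "=", ";", "(" and ")".
const converted = naiveConvert('{ "note" = "a=b; (c)"; }');
console.log(converted);                  // { "note" : "a:b, [c]"}
console.log(JSON.parse(converted).note); // a:b, [c]   (the original value was a=b; (c))
```

If the values can contain such characters, a tokenizing parser is the safer route.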
+1

Is this a known format or something common?

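This is a non-standard format that looks a bit like JSON without actually being JSON: it uses = instead of : between keys and values, ; instead of , after each pair, and ( ) instead of [ ] for arrays.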
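
A quick way to confirm this is to hand the text to JSON.parse directly. The snippet below uses a shortened excerpt of the response from the question (the variable names are illustrative):

```javascript
// Shortened excerpt of the response; not valid JSON because of "=", ";" and "( )".
const response = '{ "idArray" = ( "99516", "99518" ); "frame" = { "size" = { "width" = "8"; }; }; }';

let parseError = null;
try {
  JSON.parse(response);
} catch (e) {
  parseError = e;
}
console.log(parseError instanceof SyntaxError); // true
```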

0

Source: https://habr.com/ru/post/979735/

