PHP: String to multidimensional array

(Sorry for my bad English)

I have a string that I want to split into an array. Square brackets denote arrays, which can be nested several levels deep. Escaped characters must be preserved.

This is an example line:

    $string = '[[["Hello, \"how\" are you?","Good!",,,123]],,"ok"]';

The structure of the result should look like this:

    array (
      0 => array (
        0 => array (
          0 => 'Hello, \"how\" are you?',
          1 => 'Good!',
          2 => '',
          3 => '',
          4 => '123',
        ),
      ),
      1 => '',
      2 => 'ok',
    )

I tested it with

    $pattern = '/[^"\\]*(?:\\.[^"\\]*)*/s';
    $return = preg_match_all($pattern, $string, null);

But that did not work. I do not understand these regex patterns (I found this one in another example on this site), and I do not know whether preg_match_all is the right function to use.

Hope someone can help me.

Thank you very much!!!

3 answers

You might want to use a lexer in conjunction with a recursive function that actually creates the structure.

The following tokens were used for your purpose:

    \[              # opening bracket
    \]              # closing bracket
    ".+?(?<!\\)"    # " to ", making sure the closing quote is not escaped
    ,(?!,)          # a comma not followed by a comma
    \d+             # at least one digit
    ,(?=,)          # a comma followed by a comma

The rest is programming logic; see the demo at ideone.com. Inspired by this post.


    <?php
    class Lexer {
        // Token patterns, tried in order against the start of the remaining input.
        protected static $_terminals = array(
            '~^(\[)~'             => "T_OPEN",
            '~^(\])~'             => "T_CLOSE",
            '~^(".+?(?<!\\\\)")~' => "T_ITEM",
            '~^(,)(?!,)~'         => "T_SEPARATOR",
            '~^(\d+)~'            => "T_NUMBER",
            '~^(,)(?=,)~'         => "T_EMPTY"
        );

        public static function run($line) {
            $tokens = array();
            $offset = 0;
            while ($offset < strlen($line)) {
                $result = static::_match($line, $offset);
                if ($result === false) {
                    throw new Exception("Unable to parse input at offset " . $offset . ".");
                }
                $tokens[] = $result;
                $offset += strlen($result['match']);
            }
            // The top-level call wraps the outermost array in one extra level; unwrap it.
            $structure = static::_generate($tokens);
            return isset($structure[0]) ? $structure[0] : array();
        }

        // Return the first token that matches at the given offset, or false.
        protected static function _match($line, $offset) {
            $string = substr($line, $offset);
            foreach (static::$_terminals as $pattern => $name) {
                if (preg_match($pattern, $string, $matches)) {
                    return array(
                        'match' => $matches[1],
                        'token' => $name
                    );
                }
            }
            return false;
        }

        // A recursive function to actually build the structure.
        protected static function _generate($arr = array(), $idx = 0) {
            $output = array();
            for ($i = $idx; $i < count($arr); $i++) {
                $type = $arr[$i]["token"];
                $element = $arr[$i]["match"];
                switch ($type) {
                    case 'T_OPEN':
                        // Recurse, then continue after the matching T_CLOSE.
                        list($out, $index) = static::_generate($arr, $i + 1);
                        $output[] = $out;
                        $i = $index;
                        break;
                    case 'T_CLOSE':
                        return array($output, $i);
                    case 'T_ITEM':
                        // Strip the surrounding quotes; escapes inside are kept as-is.
                        $output[] = substr($element, 1, -1);
                        break;
                    case 'T_NUMBER':
                        $output[] = $element;
                        break;
                    case 'T_EMPTY':
                        $output[] = "";
                        break;
                }
            }
            return $output;
        }
    }

    $input = '[[["Hello, \"how\" are you?","Good!",,,123]],,"ok"]';
    $items = Lexer::run($input);
    print_r($items);
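Since the question asks whether preg_match_all is the right function: it can do the tokenizing step in one call. The following is only a rough sketch of that idea, not the code from the demo. The function name parseBracketString and the $expect state variable are made up for illustration, and it assumes the input contains nothing but brackets, double-quoted strings, digits and commas (anything else is silently skipped).

```php
<?php
// Hypothetical helper, not part of the Lexer class above.
function parseBracketString($input)
{
    // One alternation per token: brackets, quoted strings (with backslash
    // escapes), digit runs, and commas. Anything else is silently skipped.
    $pattern = '~\[|\]|"(?:[^"\\\\]|\\\\.)*"|\d+|,~s';
    preg_match_all($pattern, $input, $matches);

    $stack   = array();  // outer arrays that are still open
    $current = null;     // array currently being filled
    $result  = null;     // set when the outermost bracket closes
    $expect  = 'first';  // 'first' after "[", 'value' after ",", 'sep' after a value

    foreach ($matches[0] as $token) {
        if ($token === '[') {
            if ($current !== null) {
                $stack[] = $current;
            }
            $current = array();
            $expect  = 'first';
            continue;
        }
        if ($current === null) {
            throw new InvalidArgumentException('Token outside of brackets: ' . $token);
        }
        if ($token === ']') {
            if ($expect === 'value') {
                $current[] = '';                 // trailing comma, e.g. [1,]
            }
            if (count($stack) === 0) {
                $result  = $current;             // outermost array is complete
                $current = null;
            } else {
                $parent   = array_pop($stack);   // attach to the enclosing array
                $parent[] = $current;
                $current  = $parent;
            }
            $expect = 'sep';
        } elseif ($token === ',') {
            if ($expect !== 'sep') {
                $current[] = '';                 // nothing between two separators
            }
            $expect = 'value';
        } elseif ($token[0] === '"') {
            $current[] = substr($token, 1, -1);  // drop outer quotes, keep escapes
            $expect = 'sep';
        } else {
            $current[] = $token;                 // digits stay a string, e.g. '123'
            $expect = 'sep';
        }
    }
    if ($result === null) {
        throw new InvalidArgumentException('Unbalanced brackets.');
    }
    return $result;
}

$input = '[[["Hello, \"how\" are you?","Good!",,,123]],,"ok"]';
var_export(parseBracketString($input));
```

The explicit stack replaces the recursion, and the escaped quotes stay in the strings exactly as they appear in the input, which is what the question asks for.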

This is hard to do with a regular expression alone, but there is a way to get what you want (apologies in advance).

Your string is almost a valid PHP array literal, except for the ,, pairs. You can match each comma that is followed by another comma and replace it with ,'' using

/,(?=,)/

You can then eval the resulting string to get the array you are looking for.

For instance:

    // input
    $str1 = '[[["Hello, \\"how\\" are you?","Good!",,,123]],,"ok"]';

    // replace each comma that is followed by a comma with ,''
    $pattern = '/,(?=,)/';
    $replace = ",''";
    $str2 = preg_replace($pattern, $replace, $str1);

    // eval updated string
    $arr = eval("return $str2;");
    var_dump($arr);

I get this:

    array(3) {
      [0]=>
      array(1) {
        [0]=>
        array(5) {
          [0]=>
          string(21) "Hello, "how" are you?"
          [1]=>
          string(5) "Good!"
          [2]=>
          string(0) ""
          [3]=>
          string(0) ""
          [4]=>
          int(123)
        }
      }
      [1]=>
      string(0) ""
      [2]=>
      string(2) "ok"
    }

Edit

Given the inherent danger of eval, a better option is to use json_decode with the same replacement (JSON requires double quotes, so the replacement becomes ,""), for example:

    // input
    $str1 = '[[["Hello, \\"how\\" are you?","Good!",,,123]],,"ok"]';

    // replace each comma that is followed by a comma with ,""
    $pattern = '/,(?=,)/';
    $replace = ',""';
    $str2 = preg_replace($pattern, $replace, $str1);

    // decode the updated string as JSON
    $arr = json_decode($str2);
    var_dump($arr);
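One caveat with the plain /,(?=,)/ replacement is that it also rewrites a ,, that happens to sit inside a quoted string. A possible way around that, sketched below, is to match whole quoted strings as well and hand them back unchanged. The function name decodeSparseList is made up for this example, and it assumes the rest of the input is already valid JSON. Note that json_decode unescapes \" and turns 123 into an integer.

```php
<?php
// Hypothetical helper: fill empty slots outside quoted strings, then decode.
function decodeSparseList($input)
{
    // Match either a whole double-quoted string or a comma followed by a comma.
    $pattern = '~"(?:[^"\\\\]|\\\\.)*"|,(?=,)~s';
    $json = preg_replace_callback($pattern, function ($m) {
        // Quoted strings are returned untouched; bare commas get an empty string.
        return $m[0][0] === '"' ? $m[0] : ',""';
    }, $input);

    // true = decode JSON objects as associative arrays (none are expected here).
    $data = json_decode($json, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new InvalidArgumentException(
            'Not valid JSON after replacement: ' . json_last_error_msg()
        );
    }
    return $data;
}

var_dump(decodeSparseList('[["a,,b",,1],,"ok"]'));
```

Here the string "a,,b" comes back intact, while the two empty slots outside it become empty strings.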

If you can edit the code that serializes the data, it is best to serialize with json_encode and read it back with json_decode. There is no need to reinvent the wheel here.
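As a small illustration of that round trip (the sample data is just the structure from the question, with empty strings written out explicitly):

```php
<?php
// The nested structure from the question, built as a normal PHP array.
$data = array(array(array('Hello, "how" are you?', 'Good!', '', '', 123)), '', 'ok');

// Serialize: json_encode escapes the inner double quotes for us.
$serialized = json_encode($data);
echo $serialized, "\n";

// Deserialize: passing true returns arrays instead of stdClass objects.
$restored = json_decode($serialized, true);
var_dump($restored === $data);
```

With this format the empty slots are written as "" rather than left out, so nothing has to be patched before decoding.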

Good cat by the way.


Source: https://habr.com/ru/post/1264385/

