Parse a URL string (path and parameters) into an array

I just wrote a small function and need help optimizing it.

All requests are redirected to the index page.

I have this function that parses the URL into an array.

The URL format is:

http://localhost/{user}/{page}/?sub_page={sub_page}&action={action} 

So an example would be:

 http://localhost/admin/stock/?sub_page=products&action=add 

When the URI is requested, the domain is excluded, so my function accepts strings like the following:

 /admin/stock/?sub_page=products&action=add 

My function is as follows. WARNING: it is very procedural.

For those of you who can't be bothered with reading and understanding it, I've added an explanation below ;)

    function uri_to_array($uri){
        // uri will be in format: /{user}/{page}/?sub_page={subpage}&action={action}
        // ... && plus additional parameters

        // define array that will be returned
        $return_uri_array = array();

        // separate path from querystring;
        $array_tmp_uri = explode("?", $uri);

        // if explode returns the same as input $string, no delimiter was found
        if ($uri == $array_tmp_uri[0]){
            // no question mark found.
            // format either '/{user}/{page}/' or '/{user}/'
            $uri = trim($array_tmp_uri[0], "/");

            // remove excess baggage
            unset ($array_tmp_uri);

            // format either '{user}/{page}' or '{user}'
            $array_uri = explode("/", $uri);

            // if explode returns the same as input $string, no delimiter was found
            if ($uri == $array_uri[0]){
                // no {page} defined, just user.
                $return_uri_array["user"] = $array_uri[0];
            }
            else{
                // {user} and {page} defined.
                $return_uri_array["user"] = $array_uri[0];
                $return_uri_array["page"] = $array_uri[1];
            }
        }
        else{
            // query string is defined
            // format either '/{user}/{page}/' or '/{user}/'
            $uri = trim($array_tmp_uri[0], "/");
            $parameters = trim($array_tmp_uri[1]);

            // PARSE PATH
            // remove excess baggage
            unset ($array_tmp_uri);

            // format either '{user}/{page}' or '{user}'
            $array_uri = explode("/", $uri);

            // if explode returns the same as input $string, no delimiter was found
            if ($uri == $array_uri[0]){
                // no {page} defined, just user.
                $return_uri_array["user"] = $array_uri[0];
            }
            else{
                // {user} and {page} defined.
                $return_uri_array["user"] = $array_uri[0];
                $return_uri_array["page"] = $array_uri[1];
            }

            // parse parameter string
            $parameter_array = array();
            parse_str($parameters, $parameter_array);

            // copy parameter array into return array
            foreach ($parameter_array as $key => $value){
                $return_uri_array[$key] = $value;
            }
        }
        return $return_uri_array;
    }

In essence, there is one main if statement: one branch for when no query string is defined (no '?'), and another for when a '?' exists.

I just want to make this function better.

Would it make sense to make it a class?

Essentially I need a function that takes /{user}/{page}/?sub_page={sub_page}&action={action} as an argument and returns

 array( "user" => {user}, "page" => {page}, "sub_page" => {sub_page}, "action" => {action} ) 

Greetings, Alex

3 answers

If you want to

  • do it right,
  • use regex, and
  • use the same method to parse all URLs (parse_url() is documented as unreliable for relative URLs; a relative path is captured as only_path below),

this might suit your taste:

    $url = 'http://localhost/admin/stock/?sub_page=products&action=add';

    preg_match (
        "!^((?P<scheme>[a-zA-Z][a-zA-Z\d+\-.]*):)?(((//(((?P<credentials>([a-zA-Z\d\-._~\!$&'()*+,;=%]*)(:([a-zA-Z\d\-._~\!$&'()*+,;=:%]*))?)@)?(?P<host>([\w\d\-.%]+)|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(\[([a-fA-F\d.:]+)\]))?(:(?P<port>\d*))?))(?<path>(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))|(?P<only_path>(/(([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))?)|([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*)))?(?P<query>\?([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*))?(?P<fragment>#([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*))?$!u",
        $url,
        $matches
    );

    // keep only the named groups
    $parts = array_intersect_key ($matches, array (
        'scheme' => '',
        'credentials' => '',
        'host' => '',
        'port' => '',
        'path' => '',
        'query' => '',
        'fragment' => '',
        'only_path' => '',
    ));

    var_dump ($parts);

It should cover almost all possible valid URLs.

If host is empty, only_path should contain the path, i.e. the URL was given without a scheme and host.

UPDATE:

Maybe I should have read the question a little more carefully. The regex splits the URL into components, which makes it easier to get at the parts you are interested in. Use something like:

    // split the URL (double-quoted, because the pattern itself contains single quotes)
    preg_match (
        "!^((?P<scheme>[a-zA-Z][a-zA-Z\d+\-.]*):)?(((//(((?P<credentials>([a-zA-Z\d\-._~\!$&'()*+,;=%]*)(:([a-zA-Z\d\-._~\!$&'()*+,;=:%]*))?)@)?(?P<host>([\w\d\-.%]+)|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(\[([a-fA-F\d.:]+)\]))?(:(?P<port>\d*))?))(?<path>(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))|(?P<only_path>(/(([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))?)|([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*)))?(\?(?P<query>([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*)))?(#(?P<fragment>([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*)))?$!u",
        $url,
        $matches
    );
    $parts = array_intersect_key ($matches, array (
        'scheme' => '',
        'credentials' => '',
        'host' => '',
        'port' => '',
        'path' => '',
        'query' => '',
        'fragment' => '',
        'only_path' => '',
    ));

    // extract the user and page; fall back to only_path for URLs without a host
    $path = !empty ($parts['path'])
        ? $parts['path']
        : (isset ($parts['only_path']) ? $parts['only_path'] : '');
    preg_match ('!/*(?P<user>.*)/(?P<page>.*)/!u', $path, $matches);
    $user_and_page = array_intersect_key ($matches, array (
        'user' => '',
        'page' => '',
    ));

    // the query string stuff
    $query = array ();
    if (isset ($parts['query'])) {
        parse_str ($parts['query'], $query);
    }

References

To clarify, here are the relevant documents used to formulate the regular expression:

  • RFC 3986: scheme / protocol
  • RFC 3986: user and password
  • RFC 1035: host name
    • or RFC 3986: IPv4
    • or RFC 2732: IPv6
  • RFC 3986: query
  • RFC 3986: fragment

Some suggestions for improving this function.

First, use parse_url instead of explode to separate the host name, path, and query string.

Second, put the code that parses the path before the check for a query string, since you parse the path either way.

Third, instead of a foreach, use array_merge to copy the parameters, as follows:

    // put $return_uri_array last so $parameter_array can't override values
    $return_uri_array = array_merge($parameter_array, $return_uri_array);
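Taken together, the three suggestions above might look like the following sketch. It assumes the input is a path-plus-query string like the one in the question, and it leaves out the "user" or "page" key when that segment is missing, as the original function does. Returning an empty array for input that parse_url rejects is a choice made for this sketch only.

```php
<?php
// Sketch only: parse_url(), one path-parsing step, and array_merge() combined.
function uri_to_array($uri)
{
    // parse_url() separates the path from the query string in one call.
    $parts = parse_url($uri);
    if ($parts === false) {
        return array(); // seriously malformed input (assumed behaviour)
    }

    // Parse the path once, whether or not a query string exists.
    $path = isset($parts['path']) ? trim($parts['path'], '/') : '';
    $segments = ($path === '') ? array() : explode('/', $path);

    $return_uri_array = array();
    if (isset($segments[0])) {
        $return_uri_array['user'] = $segments[0];
    }
    if (isset($segments[1])) {
        $return_uri_array['page'] = $segments[1];
    }

    // Parse the query string, if there is one.
    $parameter_array = array();
    if (isset($parts['query'])) {
        parse_str($parts['query'], $parameter_array);
    }

    // $return_uri_array goes last so query parameters cannot override user/page.
    return array_merge($parameter_array, $return_uri_array);
}

print_r(uri_to_array('/admin/stock/?sub_page=products&action=add'));
```

Because the path values are merged last, a request such as /admin/stock/?user=someone_else still reports admin as the user.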

Whether it should be a class or not depends on your programming style. As a rule, I always use classes because they are easier to mock in unit tests.

The most compact way is a regular expression like this (not fully tested, just to show the principle)

    if (preg_match('!http://localhost/(?P<user>\w+)(?:/(?P<page>\w+))/(?:\?sub_page=(?P<sub_page>\w+)&action=(?P<action>\w+))!', $uri, $matches)) {
        return $matches;
    }

The resulting array will also contain the numeric match indices, but you can simply ignore them or keep only the desired keys with array_intersect_key. The \w+ pattern matches one or more word characters (letters, digits, and underscore); you can replace it with a character class such as [-a-zA-Z0-9_] or something similar.
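Dropping the numeric indices with array_intersect_key could be sketched as follows. The pattern is the one from this answer (still not fully tested against other inputs), the sample URL is the one from the question, and the variable names $wanted and $named are made up for the example.

```php
<?php
$uri = 'http://localhost/admin/stock/?sub_page=products&action=add';
$pattern = '!http://localhost/(?P<user>\w+)(?:/(?P<page>\w+))/(?:\?sub_page=(?P<sub_page>\w+)&action=(?P<action>\w+))!';

$named = array();
if (preg_match($pattern, $uri, $matches)) {
    // array_flip() turns the list of wanted names into array keys,
    // which is what array_intersect_key() compares against.
    $wanted = array_flip(array('user', 'page', 'sub_page', 'action'));
    $named = array_intersect_key($matches, $wanted);
}

print_r($named);
```

Only the four named groups remain in $named; the full match and the numbered groups are discarded.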


Maybe something like this?

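    function uri_to_array($uri){
        $result = array();
        $path = $uri;

        // only parse a query string when a '?' is actually present
        $pos = strpos($uri, '?');
        if ($pos !== false) {
            parse_str(substr($uri, $pos + 1), $result);
            $path = substr($uri, 0, $pos);
        }

        // pad to two elements so a missing {page} becomes null
        list($result['user'], $result['page']) = array_pad(explode('/', trim($path, '/')), 2, null);
        return $result;
    }

    print_r( uri_to_array('/admin/stock/?sub_page=products&action=add') );

    /* Output from the demo (on PHP 7 and later, [user] is listed before [page]):
    Array
    (
        [sub_page] => products
        [action] => add
        [page] => stock
        [user] => admin
    )
    */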

demo: http://codepad.org/nBCj38zT


Source: https://habr.com/ru/post/1392673/

