Is a PHP query string containing two question-mark delimiters problematic? (file.php?parm1=val1&parm2=val2?parm3=val3&...)

Here are some more details on this question.

I have two proprietary systems from two different vendors. We will call them System A and System B. Both systems function as standalone products, but System B provides some additional, very specific functions that are not available in System A. The vendor of System B integrates with System A at a fairly basic level (System B also integrates with other vendors' systems similar to System A in the same market). System B is a PHP-based application; System A is not. The integration method looks something like this:

System B exports a structured control file (i.e. a text file) containing various parameter/value pairs. System A is designed to import this control file. System A takes the data in the control file, combines it with some of its own data, and creates a URI. This URI is presented to the user as a link on the corresponding page in System A. This is the URI that contains a double '?' in the query string. The same URI contains what I can only describe as a callback URI (a specific URI in System A for receiving data from System B, forming two-way communication).

Thus, when the user clicks the link, the following happens:

Step 1: System A passes a series of parameters and values to System B in the query string of file.php.

Step 2: System B receives the data from System A, performs some validation and whatnot, writes some information to the database, and in the process launches a new child browser window, separate from System A. The user then switches to their work in System B.

Step 3: When the user completes the task in System B and submits their work, System B passes a series of parameters and values back to System A through the callback URI. System A does some checking, writes the results to the database, then issues a command to System B to end the session (although I say "session", it is not a true session as you might think of one, since there is no session information), and System B closes the child window.

In most cases, this process works. Both vendors support this process. However, sometimes it fails at either step 1 or step 3. The failure appears either as a hard error in System B (at step 1), because System B does not receive the information it requires and expects in the query string, or as a silent failure (at step 3), where System B sends its data back to System A but System A cannot retrieve it. The latter error is not displayed to the user and does not generate any error log data. The data simply isn't there. We only find out when the user looks in System A for their work and there is nothing there.

And since it has already been asked: System A does URL-encode/decode the data in the query strings.

Since the error is intermittent and I cannot reproduce it in my QA environment, I only have speculation to go on.

+6
5 answers

Yes, that will not work. What you need to do is use the PHP urlencode function to encode all the values when generating the URL. This will allow PHP to decode the original values correctly.

http://www.php.net/urlencode 

If you intend to put the URL on an HTML page, for example inside HTML tags, you also need to apply htmlspecialchars after URL-encoding.

 http://www.php.net/htmlspecialchars 
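
A minimal sketch of this advice, assuming the link carries a callback URL as one of its values. The host names, script names and parameter names below (system-a.example, system-b.example, callback, parm1, parm2) are hypothetical placeholders, not the vendors' real ones. http_build_query applies the same encoding as urlencode to each key and value, so the "?" and "&" inside the callback URL no longer look like delimiters.

```php
<?php
// Hypothetical values; the real systems' parameter names are not known.
$callback = 'https://system-a.example/receive?session=42&user=7';

$params = [
    'parm1'    => 'val1',
    'parm2'    => 'val 2',    // contains a space
    'callback' => $callback,  // contains its own "?" and "&"
];

// http_build_query() URL-encodes every key and value (as urlencode() does)
// and joins the pairs with the separator given as the third argument.
$url = 'https://system-b.example/file.php?' . http_build_query($params, '', '&');

// Escape the finished URL before placing it in HTML, so "&" becomes "&amp;".
$link = '<a href="' . htmlspecialchars($url) . '">Open in System B</a>';

echo $url, "\n", $link, "\n";

// Round trip: parsing the query string recovers the original values.
parse_str(parse_url($url, PHP_URL_QUERY), $decoded);
var_dump($decoded['callback'] === $callback);
```

With every value encoded this way, the generated URL contains exactly one "?", the one that starts the query string.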
+1

I ran some quick tests in PHP because I was curious. Here is a list of URLs and what shows up in $_GET for each of them:

http://192.168.1.200/test.php?var1=test&var2=anotherTest
array(2) { ["var1"]=> string(4) "test" ["var2"]=> string(11) "anotherTest" }

http://192.168.1.200/test.php?var1=test?var2=anotherTest
array(1) { ["var1"]=> string(21) "test?var2=anotherTest" }

http://192.168.1.200/test.php?var1=test&?var2=anotherTest
array(2) { ["var1"]=> string(4) "test" ["?var2"]=> string(11) "anotherTest" }

http://192.168.1.200/test.php?var1=test?&var2=anotherTest
array(2) { ["var1"]=> string(5) "test?" ["var2"]=> string(11) "anotherTest" }

http://192.168.1.200/test.php??var1=test&var2=anotherTest
array(2) { ["?var1"]=> string(4) "test" ["var2"]=> string(11) "anotherTest" }

So basically the second "?" is always treated as part of the data. It may mess up variable names or cause other problems, I don't know.
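
The same tests can be repeated without a web server. This is a sketch that feeds the query strings from the URLs above to parse_str, which parses a string as if it were the query string of a request; the $queries and $results variable names are only for illustration.

```php
<?php
// Query strings from the tests above (everything after the first "?").
$queries = [
    'var1=test&var2=anotherTest',
    'var1=test?var2=anotherTest',
    'var1=test&?var2=anotherTest',
    'var1=test?&var2=anotherTest',
    '?var1=test&var2=anotherTest',
];

$results = [];
foreach ($queries as $query) {
    // parse_str() splits on "&" and "=" only; "?" is kept as ordinary data.
    parse_str($query, $vars);
    $results[$query] = $vars;
    echo $query, ' => ', json_encode($vars), "\n";
}
```

In each case the extra "?" ends up inside a key or a value instead of separating two parameters.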

+2

According to section 3.4 of the URI RFC (RFC 3986):

The query component is indicated by the first question mark ("?") character and terminated by a number sign ("#") character or by the end of the URI.

query = *( pchar / "/" / "?" )

The characters slash ("/") and question mark ("?") may represent data within the query component. Beware that some older, erroneous implementations may not handle such data correctly when it is used as the base URI for relative references (Section 5.1), apparently because they fail to distinguish query data from path data when looking for hierarchical separators. However, as query components are often used to carry identifying information in the form of "key=value" pairs and one frequently used value is a reference to another URI, it is sometimes better for usability to avoid percent-encoding those characters.

So as I read this, an unencoded "?" is allowed, but it represents data, not necessarily a transition between data segments. In my limited experience, I have never seen this in practice. I could imagine that some URL-parsing packages (especially homebrewed ones) skip this part and may choke when they run into the second "?".
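
To illustrate that reading, here is a small sketch using PHP's parse_url and parse_str on a URL shaped like the one in the question. The host name and the "#frag" fragment are made up for the example.

```php
<?php
// URL shaped like the one in the question; the host is a placeholder.
$url = 'http://system-b.example/file.php?parm1=val1&parm2=val2?parm3=val3#frag';

// The query runs from the first "?" up to "#" (or the end of the URI).
$query = parse_url($url, PHP_URL_QUERY);
echo $query, "\n";   // parm1=val1&parm2=val2?parm3=val3

parse_str($query, $vars);

// The second "?" is data, so there is no "parm3" key at all.
var_dump(array_key_exists('parm3', $vars));   // bool(false)
echo $vars['parm2'], "\n";                    // val2?parm3=val3
```

A receiver that requires parm3 would therefore see it as missing, even though the URL itself is syntactically valid.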

+1

file.php?parm1=val1&parm2=val2?parm3=val3&…

No, you should not (and cannot) do this; it will lead to unexpected consequences, which sounds like what you are starting to witness. You will need to find a solution that does not create this scenario. If you are able to generate this URL yourself and control how it is constructed, you can certainly do that instead.

0

Yes, they are problematic.

But since your system works some of the time, maybe you can intercept these URLs on your web server. For example, Apache has mod_rewrite, which lets you redirect these URLs to another script. You could then create a clean_url script that fixes the URL and calls (redirects to) your System B script, replacing the stray question mark with an ampersand character.

If you can do this in one direction, you can also do it in the other.
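
A rough sketch of what such a clean-up step might look like. The function name clean_url is only illustrative, and it rests on a strong assumption: that no parameter value legitimately contains a literal "?". If a value such as an embedded callback URL does, this rewrite would corrupt it.

```php
<?php
// Hypothetical helper: turn every "?" after the first one into "&".
// Assumption: no parameter value legitimately contains a literal "?".
function clean_url(string $url): string
{
    $pos = strpos($url, '?');
    if ($pos === false) {
        return $url; // no query string at all
    }
    $head = substr($url, 0, $pos + 1); // path plus the first "?"
    $tail = substr($url, $pos + 1);    // the raw query string
    return $head . str_replace('?', '&', $tail);
}

echo clean_url('file.php?parm1=val1&parm2=val2?parm3=val3'), "\n";
```

The intercepting script would then redirect to the cleaned URL, so the receiving system sees parm3 as a separate parameter.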

Hope it solves your problem.

0

Source: https://habr.com/ru/post/901230/