(the parameter "srh_txt =% u05E0% u05D9% u05D1" encodes srh_txt = ניב in UNICODE)
This is actually not the case. This is not a URL encoding and the sequence %u not valid in the URL.
%u05E0%u05D9%u05D1" encodes ניב only in JavaScript oddball escape syntax. escape matches the URL encoding for all ASCII characters except + , but the %u#### escape codes that it produces for Unicode characters are completely owned his own invention.
(In general, you should never use escape . Using encodeURIComponent instead produces the correct URL-encoded UTF-8, ניב = %D7%A0%D7%99%D7%91 )
If the site requires the string %u#### in the query string, it is very broken.
Is there a way to create non-UTF-8 encoded URIs?
Yes, URIs can use any character encoding you like. This is usually UTF-8; what IRI is required and which browsers usually send if the user enters non-ASCII characters in the address bar, but the URI itself refers only to bytes.
So, you can convert ניב to %F0%E9%E1 . The web application cannot say that these bytes represent characters encoded on code page 1255 (Hebrew, similar to ISO-8859-8). But it seems that it works at the link above, which is not in the UTF-8 version. Oh dear!
bobince Feb 17 2018-10-17 13:32
source share