Invalid URI with Chinese characters (Java)

I cannot open a URL connection when the URL contains Chinese characters. It works with Latin characters:

String xstr = "维也纳恩斯特哈佩尔球场";
URI uri = new URI("http", "ajax.googleapis.com", "/ajax/services/language/detect", "v=1.0&q=" + xstr, null);
URL url = uri.toURL();
URLConnection connection = url.openConnection();
InputStream is = connection.getInputStream();

A call to getInputStream() results in:

java.lang.IllegalArgumentException: Invalid uri 'http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=???????????': Invalid query
4 answers

The problem arises because URI.toURL() does not percent-encode non-ASCII characters. Use the following instead:

URL url = new URL(uri.toASCIIString());
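
To see the difference between the two string forms, here is a minimal sketch. It reuses the URI from the question with a shortened text (the first three characters, written as Unicode escapes so the source file encoding does not matter). The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class AsciiUrlDemo {
    // Builds the URI from the question and returns its ASCII-only form.
    // toASCIIString() writes non-ASCII characters as UTF-8 percent-escapes.
    static String toAsciiUrl(String text) throws URISyntaxException {
        URI uri = new URI("http", "ajax.googleapis.com",
                "/ajax/services/language/detect", "v=1.0&q=" + text, null);
        return uri.toASCIIString();
    }

    public static void main(String[] args) throws Exception {
        String text = "\u7EF4\u4E5F\u7EB3"; // first three characters of the question's string
        URI uri = new URI("http", "ajax.googleapis.com",
                "/ajax/services/language/detect", "v=1.0&q=" + text, null);
        // toString() keeps the Chinese characters as they are
        System.out.println(uri.toString());
        // toASCIIString() prints ...q=%E7%BB%B4%E4%B9%9F%E7%BA%B3
        System.out.println(toAsciiUrl(text));
    }
}
```

The second printed string contains only ASCII, so it should be safe to pass to new URL(...).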

axtavt's answer above saved me from madness, thanks! Just one remark (I could not figure out how to comment below the answer :)

If you start with a URL string, you need to encode the quotes before creating the URI:

String s = "your_url?with=\"quotes\"";
// Double quotes are illegal in a URI, so escape them as %22 before parsing
URI su = new URI(s.replace("\"", "%22"));
URL ur = new URL(su.toASCIIString());
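
A small sketch of why the replacement is needed, and of one possible alternative: the single-argument URI constructor rejects a raw double quote, while the multi-argument constructors quote illegal characters themselves. The host example.com, the path /search and the class name are placeholders, not taken from the answer.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class QuoteEscapeDemo {
    // Returns true if new URI(String) refuses to parse the given string.
    static boolean rejectedByParser(String raw) {
        try {
            new URI(raw);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    // Builds the URI from components; illegal characters in the query
    // (such as double quotes) are percent-encoded by the constructor.
    static String buildFromComponents(String query) throws URISyntaxException {
        return new URI("http", "example.com", "/search", query, null).toASCIIString();
    }

    public static void main(String[] args) throws Exception {
        String raw = "http://example.com/search?with=\"quotes\"";
        System.out.println(rejectedByParser(raw));                      // true
        System.out.println(rejectedByParser(raw.replace("\"", "%22"))); // false
        System.out.println(buildFromComponents("with=\"quotes\""));
        // http://example.com/search?with=%22quotes%22
    }
}
```

If the URL already arrives as one string, the replace approach from the answer is the simpler fix; building from components only helps when the parts are available separately.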

I think this is caused by the UTF-8 encoding. Take a look at this topic to find out more, as well as this one: Chinese in Java.


According to the URI RFC (see section 2.4), characters outside US-ASCII are not allowed in a URI. You must percent-encode them.
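
One way to do the encoding by hand is java.net.URLEncoder, sketched below. Note that this is a different technique from the toASCIIString() approach: URLEncoder produces application/x-www-form-urlencoded output, so a space becomes "+" rather than "%20". The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryEncodeDemo {
    // Encodes a single query parameter value as UTF-8 percent-escapes.
    static String encodeValue(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String text = "\u7EF4\u4E5F\u7EB3"; // first three characters of the question's string
        // Encode only the parameter value, never the whole URL,
        // otherwise "?", "&" and "=" would be escaped as well.
        String url = "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q="
                + encodeValue(text);
        System.out.println(url); // ...q=%E7%BB%B4%E4%B9%9F%E7%BA%B3
    }
}
```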


Source: https://habr.com/ru/post/1788555/