Decoding international characters in AppEngine

I am working on a small project on Google App Engine, but I have problems with international characters. My program takes data from the user through the URL "page.html?data1&data2..." and saves it for display later.

But when the user enters international characters such as åäö, they arrive encoded as %F4, %F5 and %F6. I assume this is because only the first 128 (?) characters of the ASCII table are allowed in HTTP requests.

Does anyone have a good solution for this? Is there an easy way to decode the text? And is it better to decode it before I save the data, or should I decode it when it is displayed to the user?

2 answers

URLs can contain anything, but it must be encoded. In Java, you can use URLEncoder and URLDecoder to encode and decode URL parameters with the desired character encoding.

Keep in mind that these classes are actually designed for encoding HTML form data, but they can be applied to the query string (the parameters) of a URL. So do not use them on whole URLs, only on the parameters.
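As a rough illustration of the point above, here is a minimal sketch that encodes and decodes a single parameter value with an explicit charset. The class name QueryParamCodec and its helper methods are hypothetical, made up for this example; only URLEncoder and URLDecoder come from the JDK. It also shows that the same text produces different percent-escapes depending on the charset, so both sides have to agree on one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper for this example; not part of App Engine or the JDK.
final class QueryParamCodec {

    // Encodes one parameter value (never a whole URL) as form data.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Decodes one parameter value using the charset the sender used.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The characters a-ring, a-umlaut, o-umlaut, written as escapes
        // so the source file encoding does not matter.
        String original = "\u00e5\u00e4\u00f6";

        System.out.println(encode(original, "UTF-8"));      // %C3%A5%C3%A4%C3%B6
        System.out.println(encode(original, "ISO-8859-1")); // %E5%E4%F6

        // Round trip works when the same charset is used on both sides.
        String roundTrip = decode(encode(original, "UTF-8"), "UTF-8");
        System.out.println(roundTrip.equals(original));     // true
    }
}
```

If the escapes you receive are single bytes per character (like %F6), the sender probably used a Latin-1 style charset rather than UTF-8, and decoding with the wrong one will garble the text.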


The URI specification (RFC 3986) limits the characters that can be used in a URI (see its ABNF) and defines a percent-encoding scheme for transmitting "unsafe" characters. As Bozho says, the query part of a URL is usually encoded according to the HTML specification (application/x-www-form-urlencoded).

The App Engine documentation says:

App Engine uses the Java Servlet standard for web applications.

So you should probably let the servlet API decode the parameters for you. See the parameter methods on HttpServletRequest. This kind of encoding usually belongs in the presentation layer, so the data should be stored in decoded form.

If you do this manually, check out this blog post on character handling in URIs.
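For the manual route, a minimal sketch of parsing a raw query string might look like the following. The class QueryStringParser and its parse method are hypothetical names for this example, and it assumes you already know which charset the client used. The important detail is to split on "&" and "=" before decoding, so that escaped separators inside a value (%26, %3D) are not mistaken for real ones.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for this example; the servlet API normally does this for you.
final class QueryStringParser {

    // Parses "a=1&b=2" into a map, decoding names and values with the given charset.
    // If a name is repeated, the last value wins in this simplified sketch.
    static Map<String, String> parse(String rawQuery, String charset) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        // Split first, decode second: escaped '&' and '=' stay inside the values.
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawName = eq >= 0 ? pair.substring(0, eq) : pair;
            String rawValue = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(rawName, charset), decode(rawValue, charset));
        }
        return params;
    }

    private static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = parse("city=Malm%C3%B6&q=a%26b%3Dc+d", "UTF-8");
        System.out.println(params.get("q")); // a&b=c d
        System.out.println(params.get("city").equals("Malm\u00f6")); // true
    }
}
```

The decoded strings are what you would store; any escaping needed for output is then applied when the value is displayed.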


Source: https://habr.com/ru/post/888846/

