URL decoding Japanese characters, etc. in Java

I have a servlet that receives some POST data. Because this data is x-www-form-urlencoded, a string such as サボテン arrives encoded as &#12469;&#12508;&#12486;&#12531;.

How would I decode this string back to the correct characters? I tried using URLDecoder.decode("encoded string", "UTF-8");, but it makes no difference.

The reason I would like to decode them is that before I display this data on a web page, I escape & to &amp;, and at the moment that also escapes the &s in the encoded string, so the characters are not displayed correctly.

+3
4 answers

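Those are not URL encodings. The URL-encoded form of that string would look like %E3%82%B5%E3%83%9C%E3%83%86%E3%83%B3. What you have are decimal HTML/XML entities. To unescape HTML/XML entities, you can use Apache Commons Lang StringEscapeUtils.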
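To make the difference concrete, here is a minimal sketch using only the JDK. The class and method names (UrlEncodingDemo, urlEncode, urlDecode) are illustrative and not from the question. It URL-encodes the four katakana characters (written as \u escapes so the source stays ASCII) and shows that URLDecoder leaves the &#...; string unchanged, because that string contains no % escapes and no + signs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlEncodingDemo {
    // Percent-encodes a string as application/x-www-form-urlencoded, using UTF-8 bytes.
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Reverses percent-encoding; anything that is not %XX or '+' passes through as-is.
    static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String text = "\u30B5\u30DC\u30C6\u30F3"; // the four katakana characters from the question
        System.out.println(urlEncode(text)); // %E3%82%B5%E3%83%9C%E3%83%86%E3%83%B3

        String entities = "&#12469;&#12508;&#12486;&#12531;";
        System.out.println(urlDecode(entities).equals(entities)); // true: nothing to decode
    }
}
```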


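Update: the client falls back to these entities when it does not know that the server expects UTF-8. In JSP, you can declare the encoding by adding the following line to the top of the page: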

<%@ page pageEncoding="UTF-8" %>

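Unescaping the entities afterwards only solves the problem halfway. If you go UTF-8 all the way, for the page, the request, and the response, the entities should not appear in the first place.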

+5

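This comes down to the browser and the page encoding. If the page is served in an encoding that cannot represent the characters, such as ASCII, the browser sends them as &#xxxx;.

You could decode the &#xxxx; sequences yourself, but that only treats the symptom.

It is better to make sure the page, and therefore the form submission, uses UTF-8, so the characters arrive as they are.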
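If the &#xxxx; sequences do have to be decoded on the server, one option is a small helper like the sketch below. NumericEntityDecoder and decode are hypothetical names, not part of the JDK or of any answer here. The sketch handles only numeric references (decimal and hexadecimal) and leaves named entities such as &amp; alone. A library such as Apache Commons Lang's StringEscapeUtils is likely the better choice when named entities matter too.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericEntityDecoder {
    // Matches hexadecimal (&#x30B5;) and decimal (&#12469;) character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Replaces each numeric character reference with the character it denotes.
    static String decode(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group(); // out-of-range references are left untouched
            // quoteReplacement guards against '$' and '\' in the decoded text.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the four katakana characters, provided the console can render them.
        System.out.println(decode("&#12469;&#12508;&#12486;&#12531;"));
    }
}
```

Decode before escaping & to &amp; for display. Otherwise the references are already mangled into &amp;#12469; by the time the helper sees them.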

+1

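Are you using Tomcat?

If so, set the URIEncoding attribute of the Tomcat Connector to UTF-8. Searching Google turns up plenty of material on this, for example:

How to get UTF-8 working in Java webapps?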

0

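How about something like this?

import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Escape only ampersands that do not already start "&amp;" or a decimal reference such as "&#12469;".
Pattern pattern = Pattern.compile("&(?!amp;|#[0-9]+;)");
Matcher matcher = pattern.matcher(inputStr);
String output = matcher.replaceAll("&amp;");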
0

Source: https://habr.com/ru/post/1784616/

