Does the servlet know the encoding of the submitted form, which is specified using http-equiv?

When I specify the encoding of the POSTed form using http-equiv, like this:

    <HTML>
    <head>
      <meta http-equiv='Content-Type' content='text/html; charset=gb2312'/>
    </head>
    <BODY>
      <form name="form" method="post">
        <input type="text" name="v_rcvname" value="相宜本草">
      </form>
    </BODY>
    </HTML>

Then, in the servlet, I call request.getCharacterEncoding() and get null. So, is there a way to tell the server which character encoding the submitted data is in?

2 answers

This will indeed return null for most web browsers. But you can usually safely assume that the web browser actually used the encoding specified in the original response header, which in this case is gb2312. A common approach is to create a Filter that checks the request encoding and then uses ServletRequest#setCharacterEncoding() to force the desired value (which you should of course use consistently throughout the web application).

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws ServletException, IOException {
        // Only force the encoding when the client did not declare one.
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding("gb2312");
        }
        chain.doFilter(request, response);
    }

Map this Filter on a url-pattern that covers all servlet requests, for example /*.

If you don't do this and leave it as is, the servlet container will use its default encoding to parse the parameters, usually ISO-8859-1, which is incorrect here. Your input 相宜本草 would then end up as ÏàÒ˱¾²Ý.
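The garbling described above can be reproduced outside a servlet container. The following is a minimal sketch, assuming the JRE ships the GB2312 charset (it is part of the extended charsets in standard JDK builds); the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Assumption: GB2312 is available in this JRE.
    static final Charset GB2312 = Charset.forName("GB2312");

    /** Simulates a container decoding GB2312-encoded form bytes as ISO-8859-1. */
    static String misdecode(String original) {
        byte[] wire = original.getBytes(GB2312);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    /** Undoes the damage; possible because ISO-8859-1 maps every byte to exactly one char. */
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, GB2312);
    }

    public static void main(String[] args) {
        // \u76f8\u5b9c\u672c\u8349 is the input from the question (相宜本草).
        String garbled = misdecode("\u76f8\u5b9c\u672c\u8349");
        System.out.println(garbled);
        System.out.println(repair(garbled));
    }
}
```

Because the wrong decoding is lossless, the repair step also works as a last resort on parameters that were already parsed with the wrong encoding, although setting the encoding before the parameters are read is the cleaner fix.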


You can't get the browser to send POST data back in GB2312. I think UTF-8 is the W3C recommendation, and all new browsers only send data back in Latin-1 or UTF-8.

We were able to get data back encoded in GB2312 from old IE on Win 95, but this is generally not possible with new Unicode-based browsers.

See this test in Firefox:

    POST / HTTP/1.1
    Host: localhost:1234
    User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 115
    Connection: keep-alive
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 46

My page is in GB2312, and I specified GB2312 everywhere, but Firefox just ignores it.

Some broken browsers even encode Chinese in Latin-1. We recently added a hidden field with a known value. By checking the value, we can determine the encoding.
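The hidden-field check could look roughly like the sketch below. It is not the code from our application: the probe value, the class and method names, and the assumption that the container parsed the parameters as ISO-8859-1 (so the raw bytes can be recovered without loss) are all illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class EncodingProbe {
    // Hypothetical known value of the hidden field (the character 本).
    static final String PROBE = "\u672c";

    /**
     * Recovers the raw bytes of the received probe (assumes ISO-8859-1 parsing)
     * and returns the first candidate charset that decodes them back to PROBE,
     * or null if none of the candidates matches.
     */
    static Charset detect(String receivedProbe, List<Charset> candidates) {
        byte[] raw = receivedProbe.getBytes(StandardCharsets.ISO_8859_1);
        for (Charset candidate : candidates) {
            if (PROBE.equals(new String(raw, candidate))) {
                return candidate;
            }
        }
        return null;
    }

    /** Re-decodes another parameter of the same request with the detected charset. */
    static String redecode(String received, Charset detected) {
        return new String(received.getBytes(StandardCharsets.ISO_8859_1), detected);
    }
}
```

In a servlet, the received probe would come from request.getParameter() on the hidden field, and redecode() would then be applied to the remaining parameters of that request.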

request.getCharacterEncoding() returns the encoding from the Content-Type request header. As you can see from my trace, the header carries no charset, so the result is always null.
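To make the null result concrete, here is a simplified sketch of extracting the charset parameter from a Content-Type value. It only illustrates the idea and is not the actual implementation of any servlet container; the class and method names are hypothetical.

```java
public class ContentTypeCharset {
    /** Returns the charset parameter of a Content-Type value, or null if there is none. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = param.substring(8).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```

For the header in the trace, charsetOf("application/x-www-form-urlencoded") yields null, whereas a value such as "application/x-www-form-urlencoded; charset=GB2312" would yield "GB2312".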


Source: https://habr.com/ru/post/1310183/

