Why does CGI.pm not handle UTF-8 properly when working with application/x-www-form-urlencoded forms?

I updated Debian stable yesterday, and along with the new "Perl 5.14" I also received the "new" CGI module (v3.52). The previous version was 3.43, I think. The update broke my old web forms, and I realized that UTF-8 characters from form fields submitted with enctype "application/x-www-form-urlencoded" are double decoded. With enctype "multipart/form-data", however, everything works fine.

Question:

  • Why is UTF-8 in forms with "application/x-www-form-urlencoded" not handled correctly? Such forms should still be decoded correctly, even if "multipart/form-data" might be better suited to binary data.
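For reference, here is a minimal sketch of the decoding one would expect for a urlencoded field: the percent escapes are turned back into bytes, and those bytes are then decoded as UTF-8 exactly once. The helper name `decode_form_value` and the sample input are hypothetical and only illustrate the idea; this is not how CGI.pm implements it internally.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode one urlencoded form value into Perl characters.
sub decode_form_value {
    my ($raw) = @_;
    (my $bytes = $raw) =~ tr/+/ /;                  # '+' stands for a space
    $bytes =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX escapes become raw bytes
    return decode('UTF-8', $bytes);                 # decode the bytes exactly once
}

# Sample input (an assumption): "été 2012" as a browser would urlencode it.
my $value = decode_form_value('%C3%A9t%C3%A9+2012');
```

If the value were decoded a second time, or the bytes were reinterpreted somewhere along the way, the accented characters would no longer round-trip, which is the kind of symptom described above.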

Here is a small test file that demonstrates the decoding problem:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8::all;
    use CGI qw(:all -utf8);

    my $q = new CGI;

    sub build_form {
        return q|
    <form method="post" enctype="application/x-www-form-urlencoded">
    <br />
    Y: <input type="text" name="y" />
    </form>|;
    }

    print
        $q->header( -type=>"text/html; charset=utf-8", ),
        $q->start_html( -title=>"test", -encoding=>"utf-8" ),
        $q->h1( $q->param( 'x' ) . " " ),
        $q->start_form(),
        "X: ",
        $q->textfield( -name=>'x' ),
        $q->end_form(),
        "\n\n",
        $q->br(),
        $q->h1( $q->param( 'y' ) . " " ),
        build_form(),
        $q->end_html;

PS. I do not think the update itself broke UTF-8 decoding. It seems that after the update the automatically generated forms ended up with the wrong enctype ("application/x-www-form-urlencoded"), because they use deprecated helper methods (for example, startform instead of start_form).

1 answer
 use utf8::all; 

does

 binmode(STDIN, ':encoding(UTF-8)'); 

which corrupts the data sent by the browser. Follow it with

 binmode(STDIN); 

to undo the change and prevent the corruption.
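The effect of the two binmode calls can be checked by inspecting the PerlIO layers on a handle. The sketch below uses an in-memory filehandle as a stand-in for STDIN (an assumption made so the example runs anywhere), and the helper `layers_of` is a hypothetical name; it only shows that a plain binmode removes the encoding layer again.

```perl
use strict;
use warnings;

# Hypothetical helper: list the PerlIO layers currently stacked on a filehandle.
sub layers_of {
    my ($fh) = @_;
    return PerlIO::get_layers($fh);
}

# In-memory handle standing in for STDIN (an assumption for this demo);
# the content mimics a urlencoded POST body.
my $body = 'y=%C3%A9';
open my $fh, '<', \$body or die "cannot open in-memory handle: $!";

binmode($fh, ':encoding(UTF-8)');   # what the answer says utf8::all does to STDIN
my @with_layer = layers_of($fh);

binmode($fh);                       # the suggested fix: back to raw bytes
my @after_reset = layers_of($fh);

print "with layer:  @with_layer\n";
print "after reset: @after_reset\n";
```

In the test script from the question, the binmode(STDIN) call would presumably go after `use utf8::all;` and before the CGI object is created, since CGI.pm reads the POST body when the object is constructed.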


Source: https://habr.com/ru/post/1501269/

