Which line should be used to specify the encoding in Perl POD, "utf8", "UTF-8" or "utf-8"?

Perl documentation can be written in UTF-8. To do so, you must declare the encoding in your POD:

=encoding NNN 

But what should be written in place of NNN? Different sources give different answers.

Which one is correct? What exact line should go into the POD?

+6
2 answers
 =encoding UTF-8 

According to IANA, charset names are case-insensitive, so utf-8 means the same thing.

utf8 is Perl's lax variant of UTF-8. For security, however, you want your POD processors to be strict.
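The strict/lax distinction can be checked directly against the Encode module. The following is a minimal sketch, assuming your POD processor resolves the =encoding argument through Encode (true for many tools, but not guaranteed for all); canonical_name is a helper name made up for this example.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: returns the canonical Encode name for a label,
# or undef if Encode does not recognize the label.
sub canonical_name {
    my ($label) = @_;
    my $enc = find_encoding($label);
    return $enc ? $enc->name : undef;
}

# Show what each spelling from the question resolves to.
for my $label ('UTF-8', 'utf-8', 'utf8') {
    printf "%-6s => %s\n", $label, canonical_name($label) // '(unknown)';
}
```

Per the Encode documentation, UTF-8 and utf-8 both resolve to utf-8-strict, while utf8 resolves to the lax utf8 encoding.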

+13

As daxim points out, I was misled. =encoding UTF-8 and =encoding utf-8 apply the strict encoding, while =encoding utf8 applies the lenient one:

    $ cat enc-test.pod
    =encoding ENCNAME

    =head1 TEST '\344\273\245\376\202\200\200\200\200\200'

    =cut

(Here \xxx stands for a literal byte with octal value xxx. \344\273\245 is a valid UTF-8 sequence; \376\202\200\200\200\200\200 is not.)

With =encoding utf-8:

    $ perl -pe 's/ENCNAME/utf-8/' enc-test.pod | pod2cpanhtml | grep /h1
    >TEST &#39;&#20197;&#27492;&#65533;&#39;</a></h1>

With =encoding utf8:

    $ perl -pe 's/ENCNAME/utf8/' enc-test.pod | pod2cpanhtml | grep /h1
    Code point 0x80000000 is not Unicode, no properties match it; ...
    Code point 0x80000000 is not Unicode, no properties match it; ...
    Code point 0x80000000 is not Unicode, no properties match it; ...
    >TEST &#39;&#20197;&#2147483648;&#39;</a></h1>
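The same comparison can be approximated without a POD processor by feeding the test bytes straight to Encode. This is only a sketch: decodes_cleanly is a hypothetical helper, and it assumes the POD tool hands the =encoding name to Encode unchanged. How the lax utf8 encoding treats the out-of-range sequence may vary between Encode versions, so the script reports the result rather than assuming it.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Bytes from the test file: a valid sequence for U+4EE5 followed by a
# 7-byte Perl-extended sequence that is not valid UTF-8.
my $bytes = "\344\273\245\376\202\200\200\200\200\200";

# Hypothetical helper: returns 1 if the octets decode without error under
# the named encoding, 0 if Encode rejects them.
sub decodes_cleanly {
    my ($enc_name, $octets) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # the input buffer unmodified.
    my $ok = eval { decode($enc_name, $octets, FB_CROAK | LEAVE_SRC); 1 };
    return $ok ? 1 : 0;
}

for my $name ('UTF-8', 'utf-8', 'utf8') {
    printf "%-6s %s\n", $name,
        decodes_cleanly($name, $bytes) ? 'accepted' : 'rejected';
}
```

Both strict spellings should reject the input, which is the behavior you want from a POD processor handling untrusted documentation.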

My original answer, which the test above contradicts: all of them are equivalent. The argument to =encoding is expected to be a name recognized by the Encode::Supported module. Drilling down into that document, you see that:

  • the canonical encoding name is utf8,
  • the name UTF-8 is an alias for utf8, and
  • names are case-insensitive, so utf-8 is equivalent to UTF-8.

What is the best practice? I'm not sure. I don't think you can go wrong using the official IANA name (as in daxim's answer), but you also can't go wrong following the official Perl documentation.

+3

Source: https://habr.com/ru/post/951240/

