Can Encode::Guess tell utf-8 from iso-8859-1?

I have a string $data encoded in UTF-8. Assume that I do not know whether this string is UTF-8 or ISO-8859-1. I want to use the Perl Encode::Guess module to determine which of the two it is, but I am having trouble understanding how this module works.

I tried the following four methods (from http://perldoc.perl.org/Encode/Guess.html):

use Encode::Guess qw/utf8 latin1/;

my $decoder = guess_encoding($data);

print "$decoder\n";

Result: iso-8859-1 or utf8

use Encode::Guess qw/utf8 latin1/;

my $enc = guess_encoding($data, qw/utf8 latin1/);
ref($enc) or die "Can't guess: $enc";
my $utf8 = $enc->decode($data); 

print "$utf8\n";

Result: Can't guess: iso-8859-1 or utf8 at encodage-windows.pl line 25, line 18110.

use Encode::Guess qw/utf8 latin1/;

my $decoder = Encode::Guess->guess($data);
die $decoder unless ref($decoder);
my $utf8 = $decoder->decode($data);

print "$utf8\n";

Result: iso-8859-1 or utf8 at encodage-windows.pl line 30, line 18110.

use Encode::Guess qw/utf8 latin1/;

my $utf8 = Encode::decode("Guess", $data);

print "$utf8\n";

Result: iso-8859-1 or utf8 at /usr/local/lib/perl5/Encode.pm line 175.

What am I doing wrong? Can Encode::Guess tell these two encodings apart at all?

Try this instead:

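use Encode::Guess;

# Try UTF-8 first; on failure guess_encoding returns an error string, not an object.
my $decoder = guess_encoding($data, 'utf8');
# Fall back to ISO-8859-1 only if the UTF-8 guess failed.
$decoder = guess_encoding($data, 'iso-8859-1') unless ref $decoder;
die $decoder unless ref $decoder;

printf "Decoding as %s\n\n", $decoder->name;
$data = $decoder->decode($data);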

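Remember that any byte sequence that is valid UTF-8 is also valid ISO-8859-1, so Encode::Guess cannot choose between the two when both are listed as suspects. You have to check for UTF-8 first and fall back to ISO-8859-1 (which accepts any byte sequence, whether or not the data was really written in that encoding).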
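
The same "UTF-8 first, ISO-8859-1 as fallback" idea can also be sketched without Encode::Guess, using a strict decode from the core Encode module. This is only a minimal sketch, not the approach from the answer above: the helper name decode_utf8_or_latin1 is hypothetical, and it assumes the input really is one of these two encodings.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not part of Encode): decode octets as strict UTF-8,
# falling back to ISO-8859-1 when they are not well-formed UTF-8.
# Returns the decoded text and the name of the encoding that was used.
sub decode_utf8_or_latin1 {
    my ($octets) = @_;

    # FB_CROAK makes decode() die on malformed input instead of substituting
    # replacement characters; LEAVE_SRC keeps $octets intact for the fallback.
    my $text = eval { decode('UTF-8', $octets, FB_CROAK | LEAVE_SRC) };
    return ($text, 'UTF-8') if defined $text;

    # ISO-8859-1 maps every byte to a character, so this decode cannot fail.
    return (decode('ISO-8859-1', $octets), 'ISO-8859-1');
}

# "caf" followed by the two-byte UTF-8 sequence for e-acute
my ($text, $name) = decode_utf8_or_latin1("caf\xC3\xA9");
printf "%s (%d characters)\n", $name, length $text;

# "caf" followed by a lone 0xE9 byte, which is not well-formed UTF-8
($text, $name) = decode_utf8_or_latin1("caf\xE9");
printf "%s (%d characters)\n", $name, length $text;
```

One caveat of this heuristic: ISO-8859-1 text whose high bytes happen to form well-formed UTF-8 sequences will be reported as UTF-8. For ordinary text this is rare, but it cannot be ruled out from the bytes alone.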

Source: https://habr.com/ru/post/1536122/

