How to convert a Unicode file to an ASCII file with a Perl script on a Windows machine

I have a Unicode file on a Windows machine. Is there a way to convert it to ASCII on that machine using a Perl script?

The file is encoded as UTF-16.

2 answers

If you want to convert Unicode to ASCII, be aware that some characters cannot be converted, because they simply do not exist in ASCII. If you can live with that, you can try the following:

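    #!/usr/bin/env perl
    use strict;
    use warnings;
    use autodie;

    # Default layers for every open() below: decode input as UTF-16
    # (byte order taken from the BOM), encode output as ASCII.
    use open IN  => ':encoding(UTF-16)';
    use open OUT => ':encoding(ascii)';

    my $buffer;

    # Read the whole input file; -s is the size in bytes, which is
    # at least as large as the number of characters.
    open(my $ifh, '<', 'utf16bom.txt');
    read($ifh, $buffer, -s $ifh);
    close($ifh);

    open(my $ofh, '>', 'ascii.txt');
    print($ofh $buffer);
    close($ofh);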

If you don't have autodie, just delete that line. You should then add explicit error handling to your open / close statements:

 open(...) or die "error: $!\n"; 

If the file contains characters that cannot be converted, you will receive warnings on the console, and your output file will contain text such as

 \x{00e4}\x{00f6}\x{00fc}\x{00df} 
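The escaping shown above can be reproduced in isolation with the core Encode module. The following is a minimal sketch, not part of the original answer: the helper names `to_ascii_escaped` and `to_ascii_replaced` are hypothetical, and the second helper only illustrates an alternative in which unmappable characters become `?` instead of `\x{HHHH}` escapes.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: encode to ASCII and write every character that
# has no ASCII equivalent as a \x{HHHH} escape (Encode's PERLQQ fallback).
sub to_ascii_escaped {
    my ($text) = @_;
    return Encode::encode('ascii', $text,
        Encode::FB_PERLQQ | Encode::LEAVE_SRC);
}

# Hypothetical helper: encode to ASCII with Encode's default behaviour,
# which substitutes '?' for every character it cannot map.
sub to_ascii_replaced {
    my ($text) = @_;
    return Encode::encode('ascii', $text);
}

my $sample = "Gr\x{fc}\x{df}e";    # "Gruesse" written with u-umlaut and sharp s

print to_ascii_escaped($sample),  "\n";
print to_ascii_replaced($sample), "\n";
```

Which variant is preferable depends on whether the escaped code points need to be recoverable later; the `?` form is shorter but loses that information.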

By the way: if the file has no BOM, but you know that it is big endian (or little endian), you can change the encoding line to

 use open IN => ':encoding(UTF-16BE)'; 

or

 use open IN => ':encoding(UTF-16LE)'; 
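If it is not known in advance whether a file starts with a BOM, the first two bytes can be inspected before choosing the layer. This is only a sketch under assumptions: the helper names `bom_byte_order`, `input_layer` and `peek_bytes` are hypothetical, and the fallback byte order for BOM-less files is something the caller has to know or guess.

```perl
use strict;
use warnings;

# Hypothetical helper: return 'LE' or 'BE' if the raw bytes start with a
# UTF-16 byte order mark, or undef if they do not.
sub bom_byte_order {
    my ($head) = @_;
    my $first_two = substr($head, 0, 2);
    return 'LE' if $first_two eq "\xFF\xFE";
    return 'BE' if $first_two eq "\xFE\xFF";
    return undef;
}

# Hypothetical helper: pick the input layer. With a BOM, plain UTF-16
# lets Perl work out the byte order itself; without one, fall back to
# the byte order supplied by the caller ('LE' or 'BE').
sub input_layer {
    my ($head, $fallback_order) = @_;
    return ':encoding(UTF-16)' if defined bom_byte_order($head);
    return ":encoding(UTF-16$fallback_order)";
}

# Hypothetical helper: read the first two raw bytes of a file.
sub peek_bytes {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "cannot open $path: $!\n";
    read($fh, my $head, 2);
    close($fh);
    return defined $head ? $head : '';
}

# Usage: perl script.pl some-file.txt
if (@ARGV) {
    my $layer = input_layer(peek_bytes($ARGV[0]), 'LE');
    print "suggested layer: $layer\n";
}
```

The returned string can be used as the mode suffix of a three-argument open, for example `open(my $fh, "<$layer", $path)`.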

I hope this works on Windows as well; I can't test it right now.


Take a look at the encoding layer of Perl's open function. You can specify the encoding when opening a file for reading or writing.

It would look something like this:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use autodie;

    # Decode the input as big-endian UTF-16 and encode the output as ASCII.
    # Replace the file names with your own input and output files.
    open(my $utf16_fh, "<:encoding(UTF-16BE)", "test.utf16.txt");
    open(my $ascii_fh, ">:encoding(ASCII)", ".gvimrc");

    # Copy the file line by line.
    while (my $line = <$utf16_fh>) {
        print $ascii_fh $line;
    }

    close $utf16_fh;
    close $ascii_fh;

Source: https://habr.com/ru/post/1381376/