Bug with Python UTF-16 output and Windows line endings?

With this code:

test.py

import sys
import codecs

sys.stdout = codecs.getwriter('utf-16')(sys.stdout)

print "test1"
print "test2"

Then I run it as:

test.py > test.txt

In Python 2.6 on Windows 2000, I found that newlines are output as the byte sequence \x0D\x0A\x00, which of course is wrong for UTF-16.

Am I missing something, or is this a bug?

+2
3 answers

Try the following:

import sys
import codecs

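if sys.platform == "win32":
    # Put stdout in binary mode so the C runtime stops turning "\n" into "\r\n"
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

class CRLFWrapper(object):
    """File-like wrapper that writes Windows line endings explicitly."""
    def __init__(self, output):
        self.output = output

    def write(self, s):
        # Expand newlines before the text reaches the UTF-16 encoder
        self.output.write(s.replace("\n", "\r\n"))

    def __getattr__(self, key):
        # Delegate everything else (flush, close, ...) to the wrapped stream
        return getattr(self.output, key)

# Order matters: CRLF expansion first, then UTF-16 encoding, then binary stdout
sys.stdout = CRLFWrapper(codecs.getwriter('utf-16')(sys.stdout))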
print "test1"
print "test2"
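
To check the wrapper without redirecting a real console, the same idea can be run against an in-memory buffer. This is only a sketch: io.BytesIO stands in for a binary-mode stdout, and 'utf-16-le' is used instead of 'utf-16' (an assumption made here so that no BOM is written and the bytes are the same on every platform). It is written so that it also runs on Python 3.

```python
import codecs
import io

class CRLFWrapper(object):
    # Same idea as the wrapper above: expand LF to CR LF before encoding.
    def __init__(self, output):
        self.output = output

    def write(self, s):
        self.output.write(s.replace(u"\n", u"\r\n"))

    def __getattr__(self, key):
        return getattr(self.output, key)

raw = io.BytesIO()  # stands in for stdout in binary mode
out = CRLFWrapper(codecs.getwriter("utf-16-le")(raw))
out.write(u"test1\n")
out.write(u"test2\n")

data = raw.getvalue()
print(repr(data))
```

Each line ending in data is the four bytes \r\x00\n\x00, which is what a UTF-16 reader on Windows expects.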
+3

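Newline translation happens inside the underlying stdout file object. You write "test1\n" to sys.stdout (now a StreamWriter). The StreamWriter encodes this as "t\x00e\x00s\x00t\x001\x00\n\x00" and passes it to the real file, the original sys.stdout. That file does not know the data has been encoded as UTF-16; all it knows is that every \n byte in the output stream must be converted to \x0D\x0A, which produces the output you are seeing.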
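
The corruption can be reproduced on any platform by imitating the text-mode translation by hand. In this sketch, text_mode_translate is a hypothetical helper that mimics what a Windows text-mode stream does to bytes; it is not a real library function. 'utf-16-le' is used only to keep the BOM out of the dump.

```python
def text_mode_translate(data):
    # Hypothetical stand-in for Windows text mode: every 0x0A byte
    # becomes 0x0D 0x0A, with no knowledge of the encoding.
    return data.replace(b"\n", b"\r\n")

encoded = u"test1\n".encode("utf-16-le")      # what the StreamWriter emits
corrupted = text_mode_translate(encoded)      # what ends up in the file
correct = u"test1\r\n".encode("utf-16-le")    # what valid UTF-16 CRLF looks like

print(repr(encoded))
print(repr(corrupted))
print(repr(correct))
```

The corrupted stream ends in \r\n\x00 and has an odd number of bytes, so every code unit after the first newline is misaligned. The correct stream ends in \r\x00\n\x00.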

+3

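I have found two approaches so far, but neither produces UTF-16 output with Windows-style line endings.

First, to redirect Python print statements to a file encoded as UTF-16 (Unix-style line endings):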

import sys
import codecs

sys.stdout = codecs.open("outputfile.txt", "w", encoding="utf16")

print "test1"
print "test2"

Second, to write UTF-16 to stdout without the newline translation corrupting it (Unix-style line endings; based on an ActiveState recipe):

import sys
import codecs

sys.stdout = codecs.getwriter('utf-16')(sys.stdout)

if sys.platform == "win32":
    # Binary mode disables the "\n" -> "\r\n" translation on Windows
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

print "test1"
print "test2"
0

Source: https://habr.com/ru/post/1713809/

