Line endings in python

Possible duplicate:
Handling \r\n vs \n newlines in Python on Mac and Windows

I'm a bit confused about something, and I wonder if this is a Python thing. I have a text file that uses Windows line endings ("\r\n"), but if I iterate over some of the lines in the file, store them in a list, and print the string representation of the list to the console, it shows "\n" line endings. Am I missing something?

+6
3 answers

Yes, it's a Python thing; by default, open() opens files in text mode, where line endings are translated depending on which platform your code is running on. You need to open the file in binary mode ('b') to prevent this.

From the documentation for open():

The most commonly used values of mode are 'r' for reading, 'w' for writing (truncating the file if it already exists), and 'a' for appending (which on some Unix systems means that all writes append to the end of the file regardless of the current seek position). If mode is omitted, it defaults to 'r'. The default is to use text mode, which may convert '\n' characters to a platform-specific representation on writing and back on reading. Thus, when opening a binary file, you should append 'b' to the mode value to open the file in binary mode, which will improve portability. (Appending 'b' is useful even on systems that don't treat binary and text files differently, where it serves as documentation.)
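To make the difference concrete, here is a minimal sketch, assuming Python 3 (where text mode translates "\r\n" to "\n" on every platform, not only on Windows). The temporary file and the variable names are made up for illustration.

```python
import os
import tempfile

# Write a small file with Windows line endings as raw bytes,
# so nothing is translated on the way out.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

# Text mode (the default): "\r\n" comes back as "\n".
with open(path, "r") as f:
    text_lines = f.readlines()

# Binary mode: the bytes come back exactly as stored on disk.
with open(path, "rb") as f:
    binary_lines = f.readlines()

os.remove(path)

print(text_lines)    # ['first\n', 'second\n']
print(binary_lines)  # [b'first\r\n', b'second\r\n']
```

Note that in Python 3 binary mode gives you bytes objects rather than strings, so you would still have to decode them yourself.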

+6

Opening the file in binary mode will avoid this in Py2 on Windows. However, in Py3 (and in Py2.6+, if you use io.open instead of the built-in open), binary mode vs. text mode means something well-defined and platform-independent, and it does not affect universal newlines. Instead, you can do:

 file = open(filename, 'r', newline='') 

And newlines will not be normalized.
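A short sketch of what that looks like in practice, assuming Python 3 (or io.open on Python 2.6+); the temporary file is only there to make the example self-contained:

```python
import io
import os
import tempfile

# Create a file with Windows line endings, written as raw bytes.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

# newline='' still splits lines on "\n", "\r" and "\r\n",
# but hands the terminators back untranslated.
with io.open(path, "r", newline="") as f:
    raw_lines = f.readlines()

os.remove(path)

print(raw_lines)  # ['first\r\n', 'second\r\n']
```

Unlike binary mode, this still gives you decoded text, so the rest of your string handling does not need to change.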

+7

What you have to do is open the file with universal newline support (for Python 2.x). This is done with the 'U' or 'rU' mode. Then any type of newline is supported. The following passage is from the Python manual http://docs.python.org/library/functions.html#open :

In addition to the standard fopen() values, mode may be 'U' or 'rU'. Python is usually built with universal newlines support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newlines support, a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines, which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.

For Python 3, open() has a newline parameter that controls how newlines are handled. Looking at the documentation, it seems that universal newlines mode is enabled by default.
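As a rough illustration of that default, assuming Python 3, the sketch below reads a file that mixes all three conventions and then inspects the newlines attribute of the file object. The file contents and variable names are invented for the example.

```python
import os
import tempfile

# A file that mixes Unix, old Macintosh and Windows line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"unix\nold mac\rwindows\r\n")

# Default text mode (newline=None): every convention is read as "\n".
with open(path, "r") as f:
    lines = f.readlines()
    seen = f.newlines  # the newline types encountered while reading

os.remove(path)

print(lines)         # ['unix\n', 'old mac\n', 'windows\n']
print(sorted(seen))  # ['\n', '\r', '\r\n']
```

The order of the entries in the newlines tuple is not something to rely on, which is why the example sorts it before printing.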

+5

Source: https://habr.com/ru/post/916845/

