How to change the default newline when reading lines from a file in Python 3?

A recent question about splitting a binary file on null characters made me think of a similar question for text files.

Given the following file:

Parse me using spaces, please. 

Using Perl 6, I can parse this file using a space (or any selected character) as a newline, this way:

    my $fh = open('spaced.txt', nl-in => ' ');
    while $fh.get -> $line {
        put $line;
    }

Or more briefly:

    .put for 'spaced.txt'.IO.lines(nl-in => ' ');

Either of them gives the following result:

    Parse
    me
    using
    spaces,
    please.

Is there something equivalent in Python 3?

The closest I could find requires reading the whole file into memory:

    for line in f.read().split('\0'):
        print(line)

Update: I found several other older questions and answers that seemed to indicate that this was not available, but I thought that over the past few years there may have been new developments in this area:
Python restricts newline for readlines()
Change newline character .readline() is looking for

+5
3 answers

There is no built-in support for reading a file separated by a custom character.
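As a quick illustration of that limitation (a sketch added for clarity, not part of the original answer): the `newline` argument of `open()` accepts only `None`, `''`, `'\n'`, `'\r'` and `'\r\n'`, so passing a space is rejected. The temporary file and the variable names are just for the demonstration.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Parse me using spaces, please.")
    path = tmp.name

try:
    # Only None, '', '\n', '\r' and '\r\n' are legal values for `newline`,
    # so a custom separator such as a space raises ValueError.
    open(path, newline=" ")
except ValueError as exc:
    error = str(exc)
else:
    error = None
finally:
    os.remove(path)

print(error)
```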

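However, opening a file with the "U" flag enables universal newlines, and the line ending actually found in the file can then be read from file.newlines. The same newline mode applies to the whole file. (In Python 3, universal newlines are already the default in text mode, and the "U" flag was removed in Python 3.11.)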
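A minimal sketch of reading the detected line ending in Python 3, assuming a throwaway file with Windows-style line endings (the file contents and variable names are made up for the example):

```python
import os
import tempfile

# Write the file in binary mode so the "\r\n" endings are stored untouched.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"first\r\nsecond\r\n")
    path = tmp.name

try:
    # Text mode with the default newline=None means universal newlines:
    # every line ending is translated to "\n" while reading.
    with open(path, "r") as f:
        lines = f.readlines()
        detected = f.newlines  # the line ending(s) seen so far
finally:
    os.remove(path)

print(lines)           # ['first\n', 'second\n']
print(repr(detected))  # '\r\n'
```

Note that this only reports which standard line ending was found; it does not let you choose a custom one.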

Here is my generator for reading a file without loading all of it into memory:

    def customReadlines(fileNextBuff, char):
        """
        fileNextBuff: a function returning the next buffer, or "" on EOF
        char: the string at which the lines are split; it is not
              included in the yielded elements
        """
        # Note: a multi-character delimiter that straddles two buffers
        # is not detected by this simple approach.
        lastLine = ""
        lenChar = len(char)
        while True:
            thisLine = fileNextBuff()  # call the function to get the next buffer
            if not thisLine:
                break  # EOF
            fnd = thisLine.find(char)
            while fnd != -1:
                yield lastLine + thisLine[:fnd]
                lastLine = ""
                thisLine = thisLine[fnd + lenChar:]
                fnd = thisLine.find(char)
            lastLine += thisLine
        yield lastLine

    ### EXAMPLES ###

    # open file.bin and print each null-terminated part of the file,
    # loading a buffer of 256 characters at a time
    with open("file.bin", "r") as f:
        for l in customReadlines((lambda: f.read(0x100)), "\0"):
            print(l)

    # open errors.log and split it at a special string,
    # loading one whole line at a time
    with open("errors.log", "r") as f:
        for l in customReadlines(f.readline, "ERROR:"):
            print(l)
            print(" " + '-' * 78)  # some separator
+3

Will this do what you need?

    def newreadline(f, newlinechar='\0'):
        c = f.read(1)
        b = [c]
        while c != newlinechar and c != '':
            c = f.read(1)
            b.append(c)
        return ''.join(b)

EDIT: Added a replacement for readlines():

    def newreadlines(f, newlinechar='\0'):
        line = newreadline(f, newlinechar)
        while line:
            yield line
            line = newreadline(f, newlinechar)

so the OP can do the following:

    for line in newreadlines(f, newlinechar='\0'):
        print(line)
+1
    def parse(fp, split_char, read_size=16):
        def give_chunks():
            while True:
                stuff = fp.read(read_size)
                if not stuff:
                    break
                yield stuff

        leftover = ''
        for chunk in give_chunks():
            *stuff, leftover = (leftover + chunk).split(split_char)
            yield from stuff
        if leftover:
            yield leftover
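The same chunked approach can be sketched a little more compactly with the two-argument form of `iter()`, which keeps calling `fp.read` until it returns the empty string. This is only an illustrative variant: the name `split_stream`, the default `read_size`, and the `io.StringIO` stand-in for a real file are assumptions made for the example.

```python
import io
from functools import partial


def split_stream(fp, sep, read_size=4096):
    """Yield pieces of a text stream separated by `sep`, reading in chunks."""
    leftover = ''
    # iter(callable, sentinel) calls fp.read(read_size) until it returns ''.
    for chunk in iter(partial(fp.read, read_size), ''):
        # Everything before the last separator is complete; the tail may be
        # continued by the next chunk, so carry it over.
        *parts, leftover = (leftover + chunk).split(sep)
        yield from parts
    if leftover:
        yield leftover


# A tiny read_size forces words (and separators) to straddle chunk boundaries.
words = list(split_stream(io.StringIO("Parse me using spaces, please."), ' ', read_size=4))
print(words)  # ['Parse', 'me', 'using', 'spaces,', 'please.']

pieces = list(split_stream(io.StringIO("aXYb"), 'XY', read_size=2))
print(pieces)  # ['a', 'b']
```

Because the unfinished tail is re-joined with the next chunk before splitting, a multi-character separator that falls across a chunk boundary is still found.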

If you are OK with splitting at newlines as well as at split_char, the version below also works (for example, reading a text file word by word):

    def parse(fobj, split_char):
        for line in fobj:
            yield from line.split(split_char)

    In [5]: for word in parse(open('stuff.txt'), ' '):
       ...:     print(word)
0

Source: https://habr.com/ru/post/1270543/

