How can I remove a carriage return from a text file using Python?

What I found by searching did not work, so I am turning to the experts!

I have text in a tab-delimited text file that has some kind of carriage return (when I open it in Notepad++ and use "show all characters", I see [CR] [LF] at the end of the line). I need to remove this carriage return (or whatever it is), but I can't figure it out. Here is a fragment of the text file showing a line with a carriage return:

firstcolumn secondcolumn third fourth fifth sixth seventh moreoftheseventh 8th 9th 10th 11th 12th 13th 

Here is the code I'm trying to use to replace it, but it does not find a return:

    with open(infile, "r") as f:
        for line in f:
            if "\n" in line:
                line = line.replace("\n", " ")

My script just does not find a carriage return. Am I doing something wrong, or am I wrong to believe that this is a carriage return? I could just delete it manually in a text editor, but the text file contains about 5,000 entries, any of which may also have this problem.

Additional information: The goal here is to select two columns from the text file, so I split each line on the \t characters and refer to the values as elements of the resulting array. It works for any line without returns, but fails on the lines with returns, because, for example, there is no element 9 in these lines.

    vals = line.split("\t")
    print(vals[0] + " " + vals[9])

So, for the line of text above, this code fails because there is no index 9 in this particular array. For lines of text that do not have [CR] [LF], it works as expected.
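One possible way to cope with rows that are broken across several physical lines, sketched under assumptions that are not stated in the question: every complete record has a known number of tab-separated fields, and a stray line break falls inside a field rather than after the last one. The function name `merge_broken_records` and the `expected_fields` parameter are made up for illustration. A break inside the final column would not be detected by this approach.

```python
def merge_broken_records(lines, expected_fields):
    """Join physical lines until each record has expected_fields fields."""
    records = []
    pending = ""
    for line in lines:
        piece = line.rstrip("\r\n")  # drop only the line-ending characters
        if not piece:
            continue  # ignore empty physical lines
        # Rejoin a continuation with a space, as the question's replace() intended
        pending = piece if not pending else pending + " " + piece
        # A complete record with N fields contains N - 1 tab characters
        if pending.count("\t") >= expected_fields - 1:
            records.append(pending.split("\t"))
            pending = ""
    if pending:
        # Keep an incomplete trailing record rather than silently dropping it
        records.append(pending.split("\t"))
    return records


# Hypothetical sample: the second record is broken inside its second field
sample = ["a\tb\tc\r\n", "d\te\r\n", "more\tf\r\n"]
for fields in merge_broken_records(sample, expected_fields=3):
    print(fields[0] + " " + fields[2])
```

With the records merged first, an index lookup such as `vals[9]` is only attempted on rows that have all of their columns.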

6 answers

Technically, there is an answer!

    with open(filetoread, "rb") as inf:
        with open(filetowrite, "w") as fixed:
            for line in inf:
                fixed.write(line)

The b in open(filetoread, "rb") apparently opens the file in a way that lets me access these line breaks and delete them. This answer actually came from Stack Overflow user Kenneth Reitz, off the site.
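For anyone on Python 3, here is a hedged sketch of the same binary-mode idea. There, a file opened with "rb" yields bytes with the original \r\n intact, so the \r bytes can be dropped explicitly, and the output has to be opened in binary mode as well. The function name `strip_carriage_returns`, the `demo` helper and the sample bytes are invented for illustration.

```python
import os
import tempfile


def strip_carriage_returns(src_path, dst_path):
    """Copy src_path to dst_path, dropping every carriage-return byte."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            dst.write(line.replace(b"\r", b""))


def demo():
    """Round-trip a small CRLF sample through a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.txt")
        dst = os.path.join(tmp, "out.txt")
        with open(src, "wb") as f:
            f.write(b"a\tb\r\nc\td\r\n")
        strip_carriage_returns(src, dst)
        with open(dst, "rb") as f:
            return f.read()


print(demo())
```

Note that this keeps the \n at the end of each row, so the file still has one record per line.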

Thanks everyone!


Depending on the type of file (and the OS it comes from, etc.), the line ending may be '\r', '\n' or '\r\n'. The best way to get rid of it, whichever one it is, is to use line.rstrip().

    with open(infile, "r") as f:
        for line in f:
            line = line.rstrip()  # strip out all trailing whitespace

If you want to get rid of ONLY the line-ending characters, and not any other whitespace that may be at the end, you can pass the optional argument to rstrip:

    with open(infile, "r") as f:
        for line in f:
            line = line.rstrip('\r\n')  # strip only trailing \r and \n characters
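The difference between the two calls matters for tab-delimited data, because a bare rstrip() also removes trailing tabs and so drops empty columns at the end of a row. A small illustration (the helper name `fields_after_strip` and the sample row are made up for this example):

```python
def fields_after_strip(raw_line, chars=None):
    """Strip the end of raw_line with rstrip(chars), then split on tabs."""
    return raw_line.rstrip(chars).split("\t")


row = "a\tb\t\r\n"  # hypothetical row whose third column is empty

print(fields_after_strip(row))          # bare rstrip() also eats the trailing tab
print(fields_after_strip(row, "\r\n"))  # only the line ending is removed
```

The first call yields two fields and the second yields three, so the explicit '\r\n' argument is probably the safer choice when columns are looked up by index.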

Hope this helps


Python opens files in so-called universal newline mode, so newlines are always \n.

Python is usually built with universal newlines support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program.

You iterate over the file line by line and replace \n in each line. But the iterator has already split the text at each \n, so the only \n a line can contain is the one at its very end, and in universal newline mode it never contains \r.

You can instead read the whole file with f.read() and then replace \n in it.

    with open(infile, "r") as f:
        content = f.read()
    content = content.replace('\n', ' ')
    # do something with content
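A small check of what Python 3 actually hands back can help here. This sketch (the helper names `read_lines` and `demo` and the sample bytes are invented for illustration) writes two CRLF-terminated rows and reads them back in the default text mode and with newline="". In Python 3, the default mode translates each line ending to "\n" and the iterator keeps that "\n" on the line, while newline="" leaves the original "\r\n" in place.

```python
import os
import tempfile


def read_lines(path, newline):
    """Return the file's lines as a list, using the given newline mode."""
    with open(path, "r", newline=newline) as f:
        return list(f)


def demo():
    """Write a CRLF sample as raw bytes, then read it back both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "wb") as f:
            f.write(b"a\tb\r\nc\td\r\n")
        translated = read_lines(path, None)  # default: endings become "\n"
        untouched = read_lines(path, "")     # newline="": endings kept as written
        return translated, untouched


translated, untouched = demo()
print(translated)
print(untouched)
```

So a search for "\r" only has a chance of matching when the file is opened with newline="" or in binary mode.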

I'm going to close this. Someone let me know if this is not the right way to close a question. I realize I was coming at this from completely the wrong angle. Even if I could remove the carriage return, I would end up with one long line instead of 5,000 lines.

Thanks for all the answers. Anyway, I learned something. Sorry to have wasted anyone's time!


I have created the code for this and it works:

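    end1 = r'C:\...\file1.txt'  # raw strings keep the backslashes literal
    end2 = r'C:\...\file2.txt'
    with open(end1, "rb") as inf:
        with open(end2, "wb") as fixed:
            for line in inf:
                # binary mode yields bytes, so replace bytes and write bytes
                line = line.replace(b"\n", b"")
                line = line.replace(b"\r", b"")
                fixed.write(line)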

Here's how to remove a carriage return without using a temporary file:

    with open(file_name, 'r') as file:
        content = file.read()

    with open(file_name, 'w', newline='\n') as file:
        file.write(content)

Source: https://habr.com/ru/post/1491470/

