Why does my tool's output overwrite itself, and how do I fix it?

The purpose of this question is to provide an answer to the daily questions whose answer is “you have DOS line endings”, so we can simply close them as duplicates of this one without repeating the same answers ad nauseam.

NOTE: This is NOT a duplicate of any existing question. The purpose of this Q&A is not just to provide a “run this tool” answer but also to explain the issue, so that we can point anyone with a related question here and they will find a clear explanation of why they were referred here as well as a tool to run to solve their problem. I spent hours reading all of the existing Q&As, and they are all lacking in explanation of the problem, alternative tools that can be used to solve it, and/or the pros, cons, and caveats of the possible solutions. Also, some of them have accepted answers that are simply dangerous and should never be used.

Now on to the typical question that would result in a referral here:

I have a file containing 1 line:

what isgoingon

and when I process it with this awk script to reverse the order of the fields:

awk '{print $2, $1}' file

instead of the expected result:

isgoingon what

I get the field that should be at the end of the line appearing at the start of the line, overwriting the text that was there:

 whatngon

or I get output on two lines:

isgoingon
 what

What is the problem, and how do I fix it?

+7
3 answers

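The problem is that your input file uses DOS line endings of CRLF instead of UNIX line endings of just LF, and you are running a UNIX tool on it, so the CR remains part of the data the UNIX tool operates on. CR is commonly written as \r and shows up as a control-M (^M) when you run cat -vE on the file, while LF is \n and appears as $ with cat -vE.

So your input file was not really just: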

what isgoingon

it was actually:

what isgoingon\r\n

as you can see with cat -v:

$ cat -vE file
what isgoingon^M$

and od -c:

$ od -c file
0000000   w   h   a   t       i   s   g   o   i   n   g   o   n  \r  \n
0000020

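So when you run a UNIX tool like awk (which treats \n as the line ending) on the file, the \n is consumed by the act of reading the line, but that leaves the 2 fields on the line as: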

<what> <isgoingon\r>

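Note the \r at the end of the second field. \r means Carriage Return, which is literally an instruction to return the cursor to the start of the line, so when you do: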

print $2, $1

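awk prints isgoingon, then returns the cursor to the start of the line, and then prints the separating space and what, which is why what appears to overwrite the start of isgoingon.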
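
To see this without depending on how your terminal renders it, here is a minimal sketch that recreates the file and makes the carriage return visible. The <CR> marker and the swapped variable are illustrative choices only, not part of any tool:

```shell
# Recreate the one-line input with a DOS (CRLF) line ending.
tmp=$(mktemp)
printf 'what isgoingon\r\n' > "$tmp"

# Swap the fields, then render the otherwise invisible CR as the text "<CR>".
swapped=$(awk '{print $2, $1}' "$tmp" | awk '{gsub(/\r/, "<CR>")} 1')
printf '%s\n' "$swapped"   # prints: isgoingon<CR> what

rm -f "$tmp"
```

The CR sits between the two output fields, which is exactly where the terminal jumps back to column 1.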

To fix the problem, do any one of these:

dos2unix file
sed 's/\r$//' file
awk '{sub(/\r$/,"")}1' file
perl -pe 's/\r$//' file
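
Before converting anything, it can be worth checking whether a file has CRLF line endings at all. The following is a small sketch using only POSIX awk; has_crlf is a hypothetical helper name, not a standard command:

```shell
# has_crlf FILE: exit status 0 if at least one line of FILE ends in CR.
has_crlf() {
    awk '/\r$/ { found = 1; exit } END { exit !found }' "$1"
}

# Example: build a DOS-style file and test it.
sample=$(mktemp)
printf 'what isgoingon\r\n' > "$sample"
if has_crlf "$sample"; then
    crlf_status="CRLF line endings found"
else
    crlf_status="no CRLF line endings"
fi
printf '%s\n' "$crlf_status"
rm -f "$sample"
```

Because the check stops at the first matching line, it stays cheap even on large files.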

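Apparently dos2unix is also known as fromdos on some UNIX variants (e.g. Ubuntu).

Be careful if you decide to use tr -d '\r', as is often suggested, because that deletes all \rs in your file, not just those at the end of each line.

Note that GNU awk lets you parse files that have DOS line endings by simply setting RS appropriately: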

gawk -v RS='\r\n' '...' file

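but other awks will not allow that, because POSIX only requires awks to support a single-character RS, and most other awks will quietly truncate RS='\r\n' to RS='\r'. You may need to add -v BINMODE=3 for gawk to even see the \rs, though, as the underlying C primitives strip them on some platforms, e.g. Cygwin.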
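
If you cannot rely on GNU awk, one portable option is to leave RS alone and strip the trailing CR as the first action of the script. Here is a sketch applying that to the script from the question (the fixed variable exists only for illustration):

```shell
# Strip a trailing CR from each record first; modifying $0 with sub()
# makes awk re-split the fields, so $2 no longer carries the CR.
fixed=$(printf 'what isgoingon\r\n' |
    awk '{ sub(/\r$/, "") } { print $2, $1 }')
printf '%s\n' "$fixed"   # prints: isgoingon what
```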

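One thing to watch out for is that CSVs created by Windows tools like Excel use CRLF as the line ending but can have LFs embedded inside a specific field of the CSV, e.g.: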

"field1","field2.1
field2.2","field3"

is really:

"field1","field2.1\nfield2.2","field3"\r\n

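so if you just convert the \r\ns to \ns, you can no longer tell linefeeds within fields from linefeeds as line endings. If you want to do that, I recommend converting all of the intra-field linefeeds to something else first, e.g. this converts all intra-field LFs to tabs and all line-ending CRLFs to LFs: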

gawk -v RS='\r\n' '{gsub(/\n/,"\t")}1' file

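Doing the same without GNU awk is left as an exercise, but with other awks it involves combining lines that do not end in CR as they are read.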
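
As a rough sketch of that exercise, assuming every real record ends in CRLF and every bare LF sits inside a quoted field, you can accumulate physical lines until one ends in CR. join_crlf_records is a hypothetical helper name:

```shell
# join_crlf_records: read CRLF-terminated records on stdin, turn each
# embedded LF into a tab and each terminating CRLF into a plain LF.
join_crlf_records() {
    awk '
        { rec = (n++ ? rec "\t" : "") $0 }   # append this physical line
        /\r$/ {                              # CR at the end: record complete
            sub(/\r$/, "", rec)
            print rec
            rec = ""; n = 0
        }
        END { if (n) print rec }             # unterminated final record
    '
}

csv_out=$(printf '"field1","field2.1\nfield2.2","field3"\r\n"a","b"\r\n' |
    join_crlf_records)
printf '%s\n' "$csv_out"
```

This does not parse the CSV quoting itself, so it is only as reliable as the assumption that CRLF never appears inside a field.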

+11

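Run dos2unix on the file. It converts the line endings so that the file can be used on Linux/Unix.

On Fedora, dnf install dos2unix installs the dos2unix tool (if it is not already installed).

A similar dos2unix deb package is available for Debian.

From a programming point of view, the conversion is simple: replace each \r\n in the file with \n.

This means that if you only need to convert from DOS to Unix, you do not need dos2unix or any other dedicated tool. A simple solution is tr, which deletes every \r!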

tr -d '\r' < infile > outfile
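
Note that with shell redirection the input and output must be different files; redirecting back onto infile would truncate it before it is read. Below is a minimal sketch of an in-place conversion through a temporary file. It uses an end-of-line-only awk substitution rather than tr, so a CR in the middle of a line survives. crlf_to_lf_inplace is a hypothetical helper name:

```shell
# crlf_to_lf_inplace FILE: strip the CR from CRLF line endings in FILE.
crlf_to_lf_inplace() {
    tmp=$(mktemp) || return 1
    if awk '{ sub(/\r$/, "") } 1' "$1" > "$tmp"; then
        cat "$tmp" > "$1"   # rewrite the contents, keeping the file's permissions
        status=0
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example on a throwaway file.
demo=$(mktemp)
printf 'what isgoingon\r\nsecond line\r\n' > "$demo"
crlf_to_lf_inplace "$demo"
converted=$(cat "$demo")
rm -f "$demo"
```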

+2

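You can use the \R shorthand character class in PCRE for files with unexpected line endings. There are even more line endings to consider with Unicode or other platforms. The \R form is the character class recommended by the Unicode consortium to represent all forms of a generic newline.

So if you have an “extra” character, you can find and remove it with s/\R$/\n/, which normalizes any combination of line endings into \n. Alternatively, you can use s/\R/\n/g to capture any notion of “line ending” and standardize it into a \n character.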

Given:

$ printf "what\risgoingon\r\n" > file
$ od -c file
0000000    w   h   a   t  \r   i   s   g   o   i   n   g   o   n  \r  \n
0000020

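Perl, Ruby, and most flavors of PCRE implement \R, here combined with the end-of-string assertion $ (end of line in multi-line mode):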

$ perl -pe 's/\R$/\n/' file | od -c
0000000    w   h   a   t  \r   i   s   g   o   i   n   g   o   n  \n    
0000017
$ ruby -pe '$_.sub!(/\R$/,"\n")' file | od -c
0000000    w   h   a   t  \r   i   s   g   o   i   n   g   o   n  \n    
0000017

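(Note that the \r between the two words is correctly left alone.)

If you do not have \R, you can use the equivalent (?>\r\n|\v) in PCRE.

With straight POSIX tools, your best bet is likely awk, like so: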

$ awk '{sub(/\r$/,"")} 1' file | od -c
0000000    w   h   a   t  \r   i   s   g   o   i   n   g   o   n  \n    
0000017

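Things that kind of work (but know your limitations):

tr deletes every \r, even one used in another context (granted, such use of \r is rare, and XML processing requires that \r be deleted, so tr is often a fine solution):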

$ tr -d "\r" < file | od -c
0000000    w   h   a   t   i   s   g   o   i   n   g   o   n  \n        
0000016

GNU sed works, but POSIX sed does not, since \r and \x0D are not supported by POSIX.

GNU sed only:

$ sed 's/\x0D$//' file | od -c   # also sed 's/\r$//'
0000000    w   h   a   t  \r   i   s   g   o   i   n   g   o   n  \n    
0000017

The Unicode Regular Expression Guide is probably the best reference for the definitive treatment of what a “newline” is.

+2

Source: https://habr.com/ru/post/1684027/

