How to check the end of line of a text file to find out if it is unix or dos format?

I need to convert a text file to DOS format (each line ending with 0x0d 0x0a, not just 0x0a) if the file is in Unix format (only 0x0a at the end of each line).

I know how to convert it (sed 's/$/^M/'), but not how to determine which line ending a file uses.

I am using ksh.

Any help would be appreciated.

[Update]: I figured it out, and here is my ksh script to do the check.

 [qiangxu@host:/my/folder]# cat eol_check.ksh
 #!/usr/bin/ksh
 if ! head -1 $1 | grep ^M$ >/dev/null 2>&1; then
     echo UNIX
 else
     echo DOS
 fi

In the above script, ^M is a literal carriage return; in vi, enter it by typing Ctrl-V followed by Ctrl-M.

I would like to find out if there is a better way.

6 answers
 if awk '/\r$/{exit 0;} 1{exit 1;}' myFile
 then
     echo "is DOS"
 fi
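Note that the awk one-liner above only looks at the first line. If the whole file should be checked (for example to spot mixed line endings), something along these lines might work. This is only a sketch: the function name `classify_eol` and the labels it prints are my own choices, not anything standard.

```shell
# classify_eol FILE (hypothetical helper): prints DOS, UNIX, MIXED or EMPTY
classify_eol() {
    awk '
        /\r$/ { crlf++ }    # count lines that end in a carriage return
        END {
            if (NR == 0)             print "EMPTY"
            else if (crlf + 0 == NR) print "DOS"
            else if (crlf + 0 == 0)  print "UNIX"
            else                     print "MIXED"
        }
    ' "$1"
}
```

Usage would be something like `[ "$(classify_eol myFile)" = "UNIX" ] && echo "needs conversion"`.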

Just use the file command. If the file contains lines ending in CR LF, the output will include a note such as "ASCII text, with CRLF line terminators".

For example:

 if file myFile | grep "CRLF" > /dev/null 2>&1; then
     ....
 fi

The latest (7.1) version of the dos2unix (and unix2dos) command, which is installed with Cygwin and some recent Linux distributions, has a handy --info option that prints a count of the different types of newline in each file. This is dos2unix 7.1 (2014-10-06): http://waterlan.home.xs4all.nl/dos2unix.html

From the man page:

 --info[=FLAGS] FILE ...
     Display file information. No conversion is done.

     The following information is printed, in this order: number of DOS line breaks, number of Unix line breaks, number of Mac line breaks, byte order mark, text or binary, file name.

     Example output:

          6       0       0  no_bom    text    dos.txt
          0       6       0  no_bom    text    unix.txt
          0       0       6  no_bom    text    mac.txt
          6       6       6  no_bom    text    mixed.txt
         50       0       0  UTF-16LE  text    utf16le.txt
          0      50       0  no_bom    text    utf8unix.txt
         50       0       0  UTF-8     text    utf8dos.txt
          2     418     219  no_bom    binary  dos2unix.exe

     Optionally extra flags can be set to change the output. One or more flags can be added.

     d   Print number of DOS line breaks.
     u   Print number of Unix line breaks.
     m   Print number of Mac line breaks.
     b   Print the byte order mark.
     t   Print if file is text or binary.
     c   Print only the files that would be converted.

     With the "c" flag dos2unix will print only the files that contain DOS line breaks, unix2dos will print only file names that have Unix line breaks.

Thus:

 if [[ -n $(dos2unix --info=c "${filename}") ]] ; then echo DOS; fi 

And vice versa:

 if [[ -n $(unix2dos --info=c "${filename}") ]] ; then echo UNIX; fi 

I cannot test this on AIX, but try:

 if [[ "$(head -1 filename)" == *$'\r' ]]; then echo DOS; fi 
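In case the shell at hand does not understand the `$'\r'` quoting, a variant that builds the carriage return with `printf` might look like the sketch below. The function name `first_line_is_dos` is made up for illustration; it relies only on command substitution stripping trailing newlines but not a trailing CR.

```shell
# first_line_is_dos FILE (hypothetical helper): succeeds if the first line ends in CR
first_line_is_dos() {
    cr=$(printf '\r')               # build the CR byte at run time, no literal ^M needed
    case $(head -n 1 "$1") in
        *"$cr") return 0 ;;         # last character of the first line is CR
        *)      return 1 ;;
    esac
}
```

It could then be used as `if first_line_is_dos filename; then echo DOS; else echo UNIX; fi`.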

You can simply remove any existing carriage returns from the end of every line, and then add a carriage return to the end of every line. Then it does not matter what format the incoming file is in. The output will always be in DOS format.

 sed 's/\r$//;s/$/\r/' 
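As far as I know, `\r` inside a sed expression is a GNU extension, so the command above may not behave the same with every sed. A rough equivalent in awk, which follows the same strip-then-append idea, could be sketched like this (the name `to_dos` is just a placeholder):

```shell
# to_dos FILE (hypothetical helper): writes FILE to stdout with every line ending in CR LF
to_dos() {
    # strip a trailing CR if there is one, then always terminate the line with CR LF
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$1"
}
```

For example `to_dos in.txt > out.txt`. One side effect to be aware of: a last line without a terminating newline also gets a CR LF appended.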

I'm probably late to this one, but I had the same problem and didn't want to put the special ^M character in my script (I worry that some editors might not display special characters properly, or that a later programmer might replace it with two ordinary characters: ^ and M...).

The solution I found passes a special character to grep, allowing the shell to convert its hex value:

 if head -1 ${filename} | grep $'[\x0D]' >/dev/null
 then
     echo "Win"
 else
     echo "Unix"
 fi

Unfortunately, I could not get the $'[\x0D]' construct to work in ksh. In ksh, I found the following instead:

 if head -1 ${filename} | od -x | grep '0d0a$' >/dev/null
 then
     echo "win"
 else
     echo "unix"
 fi

od -x displays the text as hexadecimal codes. '0d0a$' matches the hexadecimal code for CR-LF (the DOS/Windows line terminator). A Unix line terminator shows up as '0a00$'.
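One caveat with `od -x`: it prints two-byte words in the machine's byte order, so on a little-endian box the CR-LF pair can come out as `0a0d`, and the pairing also shifts with the line length. A byte-by-byte dump avoids both issues. Here is a sketch along those lines; the function name `first_line_hex_is_dos` is mine, and it assumes the POSIX `-An` and `-tx1` options are available in your od.

```shell
# first_line_hex_is_dos FILE (hypothetical helper): succeeds if the first line ends with bytes 0d 0a
first_line_hex_is_dos() {
    # dump the first line one byte at a time, without offsets, and join it into one hex string
    hex=$(head -n 1 "$1" | od -An -tx1 | tr -d ' \t\n')
    case $hex in
        *0d0a) return 0 ;;          # CR immediately followed by LF at the end of the line
        *)     return 1 ;;
    esac
}
```

Usage: `if first_line_hex_is_dos "${filename}"; then echo "win"; else echo "unix"; fi`.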


Source: https://habr.com/ru/post/1495564/

