"\r\n" is translated to "\r\r\n" in Haskell

I am on a 64-bit version of Windows 7.

My program should get some text (UTF-8 encoded) from an external source, do some processing on it, and then save it to disk. The source uses the sequence "\r\n" to represent line breaks (which I'm fine with).

Problem: When using Data.Text.IO.writeFile, each "\r\n" sequence seems to be translated to "\r\r\n". That is, each "\n" is translated to "\r\n", even if it is already preceded by "\r" in the source text. I understand that when writing to a file on Windows, "\n" should be translated to "\r\n" when it is not preceded by "\r", but translating "\r\n" to "\r\r\n" doesn't seem right.

Using ByteString.writeFile, applied to the encoded version of the text, works fine (no extra "\r" is added to the "\r\n" sequence).

A simple example:

    {-# LANGUAGE OverloadedStrings #-}
    import qualified Data.ByteString as B
    import qualified Data.Text as T
    import qualified Data.Text.IO as T (writeFile)
    import qualified Data.Text.Encoding as T (encodeUtf8)

    str = "Line 1 is here\r\nLine 2 is here\r\nLine 3 is here" :: T.Text

    main = do
      B.writeFile "byt.bin" $ T.encodeUtf8 str
      T.writeFile "txt.bin" str

Looking at each file created by this code in a hex editor, you can see an extra 0x0D added before each 0x0A in the file created by the T.writeFile line.

B.writeFile: [image: hex dump of byt.bin]

T.writeFile: [image: hex dump of txt.bin]

My question: What have I done wrong? Is there a way to use T.writeFile on Windows without having "\r\n" translated to "\r\r\n"?

1 answer

Your answer is in the docs:

Beginning with GHC 6.12, text I/O is performed using the system or handle's current locale and line ending conventions.

Since you do not open the handle yourself, it seems very likely that the library opens the file in text mode, which causes each "\n" to be translated to "\r\n" on output regardless of what precedes it. To prevent this, you can instead open the file in binary mode using openBinaryFile and then use Data.Text.IO.hPutStr.
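A minimal sketch of the binary-mode approach, under a few assumptions: the helper name writeTextVerbatim and the file name are made up for illustration, and the hSetEncoding call is an addition not mentioned above. It is there because a binary-mode handle has no text encoding by default, so non-ASCII characters would otherwise not be written as UTF-8.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.IO as TIO
import System.IO (IOMode (WriteMode), hSetEncoding, utf8, withBinaryFile)

-- Hypothetical helper: write Text through a binary-mode handle so that
-- "\n" is not expanded to "\r\n" on Windows.
writeTextVerbatim :: FilePath -> T.Text -> IO ()
writeTextVerbatim path txt =
  withBinaryFile path WriteMode $ \h -> do
    -- Assumption: the output should be UTF-8. Binary handles start
    -- without an encoding, so set one explicitly.
    hSetEncoding h utf8
    TIO.hPutStr h txt

main :: IO ()
main = do
  let str = "Line 1 is here\r\nLine 2 is here\r\nLine 3 is here" :: T.Text
  writeTextVerbatim "txt.bin" str
  -- Read the raw bytes back and compare them with the explicit encoding.
  bytes <- B.readFile "txt.bin"
  print (bytes == TE.encodeUtf8 str)
```

An alternative that keeps a text-mode handle is to call hSetNewlineMode with noNewlineTranslation (both from System.IO) before writing.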

However, letting the system locale decide your encoding may also not be what you want. Depending on your scenario, encoding and decoding the text explicitly and doing the I/O with ByteString might be a better idea.
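The explicit route might look roughly like the following sketch. The helper names writeUtf8 and readUtf8 are hypothetical, and UTF-8 is assumed as the on-disk encoding. Since ByteString I/O is binary, neither the locale nor newline translation is involved.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: encode to UTF-8 ourselves and write raw bytes.
writeUtf8 :: FilePath -> T.Text -> IO ()
writeUtf8 path = B.writeFile path . TE.encodeUtf8

-- Hypothetical helper: read raw bytes and decode them as UTF-8,
-- reporting invalid input as a Left instead of throwing.
readUtf8 :: FilePath -> IO (Either String T.Text)
readUtf8 path = do
  bytes <- B.readFile path
  pure (either (Left . show) Right (TE.decodeUtf8' bytes))

main :: IO ()
main = do
  let str = "Line 1 is here\r\nLine 2 is here\r\nLine 3 is here" :: T.Text
  writeUtf8 "byt.bin" str
  roundTrip <- readUtf8 "byt.bin"
  -- Check that the text survives the round trip unchanged.
  print (roundTrip == Right str)
```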


Source: https://habr.com/ru/post/989593/

