Escaping non-ASCII characters (or how to remove the BOM?)

I need to create an ANSI text file from an Access recordset, with the output in JSON and YAML. I can write the file, but the output contains the original characters, and I need to escape them. For example, an umlaut O (ö) should become "\u00f6".

I thought encoding the file as UTF-8 would work, but it does not. However, looking at the file encoding again, if the file is written as "UTF-8 without BOM" then everything works.

Does anybody know how to

a) write text as UTF-8 without a BOM, or b) write in ANSI, but escaping the non-ASCII characters?

    Public Sub testoutput()
        Set db = CurrentDb()
        str_filename = "anothertest.json"
        MyFile = CurrentProject.Path & "\" & str_filename
        str_temp = "Hello world here is an ö"
        fnum = FreeFile
        Open MyFile For Output As fnum
        Print #fnum, str_temp
        Close #fnum
    End Sub
2 answers

OK, I found some sample code showing how to remove the BOM. I would have thought one could do this more elegantly when writing the text in the first place. Never mind. The following code removes the BOM.

(This was originally published by Simon Pedersen at http://www.imagemagick.org/discourse-server/viewtopic.php?f=8&t=12705 )

    ' Removes the Byte Order Mark - BOM from a text file with UTF-8 encoding
    ' The BOM defines that the file was stored with an UTF-8 encoding.
    Public Function RemoveBOM(filePath)

        ' Create a reader and a writer
        Dim writer, reader, fileSize
        Set writer = CreateObject("Adodb.Stream")
        Set reader = CreateObject("Adodb.Stream")

        ' Load from the text file we just wrote
        reader.Open
        reader.LoadFromFile filePath

        ' Copy all data from reader to writer, except the BOM
        writer.Mode = 3
        writer.Type = 1
        writer.Open
        reader.Position = 5
        reader.CopyTo writer, -1

        ' Overwrite file
        writer.SaveToFile filePath, 2

        ' Return file name
        RemoveBOM = filePath

        ' Kill objects
        Set writer = Nothing
        Set reader = Nothing
    End Function

It may be useful for someone else.
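
For option (b) from the question, the escaping can also be done on the string before it is written, so that only plain ASCII ever reaches the file. The following is a minimal sketch, not code from the original answer: the function name EscapeNonAscii is hypothetical, and it only handles characters above U+007F (quotes, backslashes and control characters would still need their own JSON escaping).

```vba
' Hypothetical helper (not from the original post): replaces every
' character above U+007F with a \uXXXX escape sequence.
Public Function EscapeNonAscii(ByVal strIn As String) As String
    Dim i As Long
    Dim lngCode As Long
    Dim strChar As String
    Dim strOut As String

    For i = 1 To Len(strIn)
        strChar = Mid$(strIn, i, 1)
        ' AscW returns a signed Integer, so mask it back into 0..65535
        lngCode = AscW(strChar) And &HFFFF&
        If lngCode > 127 Then
            ' Pad the hex code to four digits, lower case as in the question
            strOut = strOut & "\u" & LCase$(Right$("0000" & Hex$(lngCode), 4))
        Else
            strOut = strOut & strChar
        End If
    Next i

    EscapeNonAscii = strOut
End Function
```

In the question's routine this would be used as Print #fnum, EscapeNonAscii(str_temp). Characters outside the Basic Multilingual Plane are stored in a VBA String as two surrogate code units, so the sketch emits them as two consecutive \uXXXX escapes.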


I'm late to the game here, but I can't be the only coder who got fed up with SQL queries being broken by text files that start with a byte order marker. There are very few Stack questions that touch on the problem - this is one of the closest - so I'm posting an overlapping answer here.

I say "overlapping" because the code below solves a slightly different problem from yours - its main purpose is to write a schema file for a folder containing a heterogeneous collection of files - but the BOM-handling segment is clearly marked.

The key functionality is that we iterate over all the ".csv" files in a folder, and we test each file with a quick nibble of its first four bytes: we only strip out the byte order marker if we see one.
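
The "quick nibble" test described above can be sketched roughly as follows. This is an illustration under assumptions, not the answer's own code: the function name DetectBOM and its return values are hypothetical, and it reads only the three bytes needed to recognise the UTF-8 and UTF-16 markers.

```vba
' Hypothetical helper (not from the original post): returns "UTF-8",
' "UTF-16LE" or "UTF-16BE" if the file starts with that byte order
' marker, or an empty string if no marker is seen.
Public Function DetectBOM(ByVal strPath As String) As String
    Dim hndFile As Integer
    Dim lngSize As Long
    Dim arrBytes(0 To 2) As Byte

    hndFile = FreeFile
    Open strPath For Binary Access Read As #hndFile
    lngSize = LOF(hndFile)
    ' Read the leading bytes; the size checks below guard short files
    If lngSize > 0 Then Get #hndFile, , arrBytes
    Close #hndFile

    If lngSize >= 3 And arrBytes(0) = &HEF And arrBytes(1) = &HBB _
                    And arrBytes(2) = &HBF Then
        DetectBOM = "UTF-8"
    ElseIf lngSize >= 2 And arrBytes(0) = &HFF And arrBytes(1) = &HFE Then
        DetectBOM = "UTF-16LE"
    ElseIf lngSize >= 2 And arrBytes(0) = &HFE And arrBytes(1) = &HFF Then
        DetectBOM = "UTF-16BE"
    End If
End Function
```

Testing before rewriting means that files without a marker are never touched, which matters when the folder holds a mix of ANSI and Unicode files.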

After that, we are working with low-level file-handling code from the primordial C. We have to, all the way down to using byte arrays, because everything else you do in VBA will leave the byte order markers embedded in the structure of a string variable.

So, without further adodb, here is the code:

BOM-disposing code for text files listed in a schema.ini file:

    Public Sub SetSchema(strFolder As String)
    On Error Resume Next

    ' This is necessary if we do not have the registry privileges to set the
    ' correct 'ImportMixedTypes=Text' registry value, which overrides IMEX=1

    ' OEM codepage-defined text is not supported: further coding is required

    Dim strSchema As String
    Dim strFile As String
    Dim hndFile As Long
    Dim arrFile() As Byte
    Dim arrBytes(0 To 4) As Byte

    If Right(strFolder, 1) <> "\" Then strFolder = strFolder & "\"

    ' [...]

        ' BOM disposal - Byte order marks confuse OLEDB text drivers:

        ' [...]

            hndFile = FreeFile
            Open strFolder & strFile For Binary As #hndFile
            ReDim arrFile(0 To LOF(hndFile) - 1)
            Get #hndFile, , arrFile
            Close #hndFile

        ' [...]


The code is easier to understand if you know that a byte array can be assigned to a VBA String, and vice versa. The BigReplace() function is a hack that sidesteps some of VBA's inefficient string handling, especially allocation: you will find that large files cause serious memory and performance problems if you do this any other way.
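
The byte array and String assignment mentioned above can be seen in a few lines. This is only a small illustration (the procedure name ByteArrayStringDemo is made up for the example); it does not show how BigReplace() itself works.

```vba
' Illustration only: a dynamic Byte array and a String can be assigned
' to each other directly, with no conversion function.
Public Sub ByteArrayStringDemo()
    Dim arrBytes() As Byte
    Dim strText As String

    ' VBA strings are UTF-16, so each character becomes two bytes
    arrBytes = "AB"
    Debug.Print UBound(arrBytes) + 1            ' 4 (bytes 65, 0, 66, 0)

    ' Assigning the array back reinterprets the bytes as UTF-16 text
    strText = arrBytes
    Debug.Print strText                         ' AB
End Sub
```

Because the assignment copies raw bytes in both directions, a marker at the start of a file's byte array ends up inside the string as well, which is why the answer stays at the byte level.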


Source: https://habr.com/ru/post/896506/

