Why does PowerShell concatenation convert UTF-8 to UTF-16?

I am running the following PowerShell script to combine a number of output files into a single CSV file. The input files are named whidataXX.htm (where XX is a two-digit serial number), and the number of files created varies from run to run.

    $metadataPath = "\\ServerPath\foo"

    function concatenateMetadata {
        $cFile = $metadataPath + "whiconcat.csv"
        Clear-Content $cFile
        $metadataFiles = gci $metadataPath
        $iterations = $metadataFiles.Count
        for ($i=0;$i -le $iterations-1;$i++) {
            $iFile = "whidata"+$i+".htm"
            $FileExists = (Test-Path $metadataPath$iFile -PathType Leaf)
            if (!($FileExists)) {
                break
            }
            elseif ($FileExists) {
                Write-Host "Adding " $metadataPath$iFile
                Get-Content $metadataPath$iFile | Out-File $cFile -append
                Write-Host "to" $cfile
            }
        }
    }

The whidataXX.htm files are encoded in UTF-8, but my output file is encoded in UTF-16. When I view the file in Notepad it looks correct, but in a hex editor the byte 00 appears between each character, and when I pull the file into a Java program for processing, the text prints to the console with extra spaces between characters.

Firstly, is this normal for PowerShell, or is there something in the source files that could cause this?

Secondly, how can I fix this encoding problem in the code above?

2 answers

The Out-* cmdlets (for example, Out-File) format their input, and in Windows PowerShell their default output encoding is Unicode, which means UTF-16 LE.
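To see what Out-File writes by default on a particular PowerShell version, the first bytes of a file can be inspected. The following is a minimal sketch, not part of the original script; the scratch file name is hypothetical and exists only for the demonstration.

```powershell
# Hypothetical scratch file, used only for this demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) ('outfile-demo-' + [guid]::NewGuid() + '.txt')

# Write a short line with Out-File and no -Encoding, so the default applies.
'a,b' | Out-File -FilePath $path

# Read the raw bytes and show the first few as hex.
$bytes = [System.IO.File]::ReadAllBytes($path)
$hex = ($bytes | Select-Object -First 10 | ForEach-Object { $_.ToString('X2') }) -join ' '
Write-Host $hex

# Reading the dump:
#   starts with FF FE, then 00 after each ASCII character -> UTF-16 LE
#   starts with EF BB BF                                  -> UTF-8 with a byte order mark
#   starts with 61 2C 62                                  -> UTF-8 (or ASCII) without a byte order mark

# Clean up the scratch file.
Remove-Item -LiteralPath $path
```

The result depends on the PowerShell version, so the sketch prints the bytes rather than assuming a particular outcome.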

You can add the -Encoding parameter to Out-File:

    Get-Content $metadataPath$iFile | Out-File $cFile -Encoding UTF8 -append

Or switch to Add-Content, which does not run its input through the formatter:

    Get-Content $metadataPath$iFile | Add-Content $cFile
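Putting the two suggestions together, the concatenation could look roughly like the sketch below. It is not the asker's function: the name Join-MetadataFile, its parameters, and the temporary demo files are all hypothetical, and it enumerates whidata*.htm by name instead of counting up from zero. Both the read and the write state UTF8 explicitly so that neither side falls back to a version-specific default.

```powershell
# Hypothetical helper; the name and parameters are illustrative only.
function Join-MetadataFile {
    param(
        [Parameter(Mandatory = $true)][string]$SourceDir,
        [Parameter(Mandatory = $true)][string]$OutFile
    )

    # Create the output file, or truncate it if it already exists.
    New-Item -Path $OutFile -ItemType File -Force | Out-Null

    # Two-digit serials sort correctly by name (whidata00, whidata01, ...).
    Get-ChildItem -Path $SourceDir -Filter 'whidata*.htm' |
        Sort-Object Name |
        ForEach-Object {
            Write-Host "Adding $($_.FullName) to $OutFile"
            # Read as UTF-8 and append as UTF-8 instead of relying on defaults.
            Get-Content -LiteralPath $_.FullName -Encoding UTF8 |
                Add-Content -LiteralPath $OutFile -Encoding UTF8
        }
}

# Self-contained demo with two small UTF-8 input files in a temp directory.
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) ('whidemo-' + [guid]::NewGuid())
New-Item -ItemType Directory -Path $demoDir | Out-Null
$utf8NoBom = New-Object System.Text.UTF8Encoding $false
[System.IO.File]::WriteAllText((Join-Path $demoDir 'whidata00.htm'), "id,name`r`n", $utf8NoBom)
[System.IO.File]::WriteAllText((Join-Path $demoDir 'whidata01.htm'), "1,caf$([char]0xE9)`r`n", $utf8NoBom)

$outFile = Join-Path $demoDir 'whiconcat.csv'
Join-MetadataFile -SourceDir $demoDir -OutFile $outFile
```

Whether the output starts with a byte order mark depends on the PowerShell version: in PowerShell 6 and later, UTF8 means no byte order mark, while Windows PowerShell generally writes one. A consumer that is strict about this, such as a Java reader, may need to skip it.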

Firstly, getting 2 bytes per character indicates that a fixed-width 16-bit encoding is in use: UTF-16, or more precisely its fixed-width form, UCS-2. The following article explains that file redirection in PowerShell produces UCS-2 output: http://www.kongsli.net/nblog/2012/04/20/powershell-gotchas-redirect-to-file-encodes-in-unicode/ . The same article also contains a fix.
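As a rough illustration of the redirection behaviour, the sketch below writes the same string once with the > operator and once through Out-File with an explicit encoding, then compares file sizes. The file names are hypothetical, and the sizes are printed instead of stated because they differ between Windows PowerShell and PowerShell 6 and later.

```powershell
# Hypothetical scratch directory and file names, for demonstration only.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ('redirect-demo-' + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
$viaRedirect = Join-Path $dir 'redirect.txt'
$viaOutFile  = Join-Path $dir 'outfile.txt'

# Redirection uses the default file encoding of the running PowerShell version.
'abc' > $viaRedirect

# Piping to Out-File lets the encoding be stated explicitly.
'abc' | Out-File -FilePath $viaOutFile -Encoding UTF8

# A UTF-16 file is roughly twice the size of the UTF-8 one.
Write-Host "redirect: $((Get-Item -LiteralPath $viaRedirect).Length) bytes"
Write-Host "out-file: $((Get-Item -LiteralPath $viaOutFile).Length) bytes"

# Optional, session-wide: make Out-File default to UTF-8.
# From Windows PowerShell 5.1 onward, > and >> go through Out-File and pick this up as well.
$PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'
```

Setting $PSDefaultParameterValues affects the whole session, so it is best kept to scripts whose output is always meant to be UTF-8.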


Source: https://habr.com/ru/post/956004/
