EBCDIC Decoding

Question

I am receiving data that is encoded in EBCDIC. Sort of:

s = u'@@@@@@@@@@@@@@@@@@@ÂÖÉâÅ@ÉÄ'

Calling .decode('cp500') on it gives the wrong result, but what is the right approach? If I paste the string into something like Notepad++, I can convert it from EBCDIC to ASCII, but I cannot find a viable way to do the same in Python. For what it's worth, the correct result is: BOISE ID (give or take a space).

The data is read from a file containing one JSON object per line. The file looks like this:

{ "command": "flush-text", "text": "@@@@@O@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@O" }
{ "command": "flush-text", "text": "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\u00C9\u00C4@\u00D5\u00A4\u0094\u0082\u0085\u0099z@@@@@@@@@@\u00D9\u00F5\u00F9\u00F7\u00F6\u00F8\u00F7\u00F2\u00F4" }
{ "command": "flush-text", "text": "@@@@@OmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmO" }
{ "command": "flush-text", "text": "@@@@@O@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@O" }

And the processing loop looks something like this:

import json

with open('myfile.txt', 'rb') as fh:
  for line in fh:
    data = json.loads(line)

python character-encoding ebcdic

gddc Jan 31 '16 at 8:40

2 answers

Convert the string to bytes first, then decode those bytes as cp500.

>>> s = u'@@@@@@@@@@@@@@@@@@@ÂÖÉâÅ@ÉÄ'
>>> bytearray(ord(c) for c in s).decode('cp500')
u'                   BOISE ID'

Alternatively:

>>> s.encode('latin-1').decode('cp500')
u'                   BOISE ID'
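
To tie this back to the JSON file in the question, here is a minimal Python 3 sketch of the same round trip applied to one parsed record. The helper name ebcdic_field_to_text and the shortened sample line are my own illustration, not part of the original code. It assumes every character in the field is in the range U+0000 to U+00FF, otherwise the latin-1 step raises UnicodeEncodeError.

```python
import json


def ebcdic_field_to_text(value, codec="cp500"):
    """Reinterpret a str whose code points are really EBCDIC byte values."""
    # latin-1 maps code points 0-255 one-to-one onto bytes 0-255, so this
    # step recovers the original bytes before they are decoded as EBCDIC.
    return value.encode("latin-1").decode(codec)


# One record in the shape shown in the question (shortened for illustration).
line = ('{ "command": "flush-text", '
        '"text": "@@@\\u00C2\\u00D6\\u00C9\\u00E2\\u00C5@\\u00C9\\u00C4" }')

record = json.loads(line)
decoded = ebcdic_field_to_text(record["text"])
print(repr(decoded))  # '   BOISE ID'
```

The JSON parser turns the \u00XX escapes into characters with the same numeric value as the original EBCDIC bytes, which is why the latin-1 encode step gets the bytes back unchanged.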

timgeb Jan 31 '16 at 9:14

Alastair McCormack · Accepted Answer · 2016-01-31T09:16:34+0000

If Notepad++ converts the file correctly, then you just need to open it with the right encoding:

Python 2.7:

import io
import json

with io.open('myfile.txt', 'r', encoding="cp500") as fh:
  for line in fh:
    data = json.loads(line)

Python 3.x:

import json

with open('myfile.txt', 'r', encoding="cp500") as fh:
  for line in fh:
    data = json.loads(line)

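This wraps the file in a TextIOWrapper, which decodes it with the given encoding as it is read. The io module provides Python 3's open in Python 2.x, including the TextIOWrapper.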
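
As a rough check of how this behaves, the Python 3 sketch below writes a small file whose bytes are entirely EBCDIC and reads it back through open with encoding="cp500". The sample records and the temporary file are made up for the demonstration. Note that this approach only applies when the whole file, JSON punctuation included, is stored as EBCDIC bytes.

```python
import json
import os
import tempfile

# Hypothetical sample data: two JSON lines stored entirely as EBCDIC bytes.
lines = ['{"command": "flush-text", "text": "BOISE ID"}',
         '{"command": "flush-text", "text": "ID Number:"}']
payload = "\n".join(lines).encode("cp500")

# Write the raw bytes to a temporary file, as an EBCDIC system might produce.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # The text layer decodes cp500 on the fly, so json.loads sees normal str.
    with open(path, "r", encoding="cp500") as fh:
        records = [json.loads(line) for line in fh]
finally:
    os.remove(path)

texts = [r["text"] for r in records]
print(texts)
```

If the file is instead ordinary ASCII JSON whose text fields carry EBCDIC byte values as \u00XX escapes, decoding the whole file as cp500 would garble the JSON itself, and converting each field after parsing is the safer route.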
