Writing pandas DataFrame to JSON in unicode

I am trying to write a pandas DataFrame containing unicode to JSON, but the built-in method .to_json escapes the non-ASCII characters. How can I fix this?

Example:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json')

This gives:

{"0":{"0":"\u03c4","1":"\u03c0"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

Which differs from the desired result:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}


I tried adding the argument force_ascii=False:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json', force_ascii=False)

But this gives the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u03c4' in position 11: character maps to <undefined>


I am using WinPython 3.4.4.2 (64-bit) with pandas 0.18.0.
1 answer

Opening the file with the encoding set to utf-8 and then passing that file object to .to_json fixes the problem:

with open('df.json', 'w', encoding='utf-8') as file:
    df.to_json(file, force_ascii=False)

gives the correct value:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

Note: the force_ascii=False argument is still required; without it the characters are escaped again.
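
To make the reasoning concrete, here is a minimal sketch that reproduces both halves of the problem. It assumes the failing run used a legacy Windows code page such as cp1252 as the default file encoding (the 'charmap' error message suggests this, but the question does not confirm it). The temporary directory and the variable names are only for illustration.

```python
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])

# Assumption: the default file encoding on the asker's machine was a
# legacy code page like cp1252, which has no mapping for Greek letters.
try:
    'τ'.encode('cp1252')
    cp1252_can_encode = True
except UnicodeEncodeError:
    cp1252_can_encode = False

# With no path argument, to_json returns a string; by default it is
# ASCII-only, so non-ASCII characters appear as \uXXXX escapes.
escaped = df.to_json()

# Open the file with an explicit UTF-8 encoding and let pandas write
# the unescaped characters into it.
path = os.path.join(tempfile.mkdtemp(), 'df.json')
with open(path, 'w', encoding='utf-8') as file:
    df.to_json(file, force_ascii=False)

# Read the raw text back with the same encoding and parse it.
with open(path, encoding='utf-8') as file:
    text = file.read()
data = json.loads(text)
```

Reading the file back with encoding='utf-8' matters as much as writing it that way; otherwise the reader may hit the same platform default that caused the original error.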


Source: https://habr.com/ru/post/1655301/