I work with YAML files that should be human-readable and editable by hand, but that will also be edited from Python code. I am using Python 2.7.3.
The file should handle accents (mainly for processing French text).
Here is an example of my problem:
    import codecs
    import yaml

    file = r'toto.txt'
    f = codecs.open(file,"w",encoding="utf-8")

    text = u'héhéhé, hûhûhû'
    textDict = {"data": text}

    f.write( 'write unicode : ' + text + '\n' )
    f.write( 'write dict : ' + unicode(textDict) + '\n' )
    f.write( 'yaml dump unicode : ' + yaml.dump(text))
    f.write( 'yaml dump dict : ' + yaml.dump(textDict))
    f.write( 'yaml safe unicode : ' + yaml.safe_dump(text))
    f.write( 'yaml safe dict : ' + yaml.safe_dump(textDict))

    f.close()
The written file contains:
    write unicode : héhéhé, hûhûhû
    write dict : {'data': u'h\xe9h\xe9h\xe9, h\xfbh\xfbh\xfb\n'}
    yaml dump unicode : "h\xE9h\xE9h\xE9, h\xFBh\xFBh\xFB"
    yaml dump dict : {data: "h\xE9h\xE9h\xE9, h\xFBh\xFBh\xFB"}
    yaml safe unicode : "h\xE9h\xE9h\xE9, h\xFBh\xFBh\xFB"
    yaml safe dict : {data: "h\xE9h\xE9h\xE9, h\xFBh\xFBh\xFB"}
The YAML dump loads back fine with yaml, but it is not human-readable.
As the example shows, I get the same kind of escaping when I write the unicode representation of the dict (I don't know whether this is related).
I would like the dump to contain the accented characters themselves, not Unicode escape sequences. Is this possible?
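One option that appears to address this is the `allow_unicode` keyword accepted by PyYAML's dump functions. The following is a minimal sketch rather than a definitive answer: it assumes PyYAML is installed, and `dump_readable` is a hypothetical helper name introduced only for illustration.

```python
# -*- coding: utf-8 -*-
import yaml


def dump_readable(data):
    """Hypothetical helper: dump data to YAML text without escaping accents."""
    return yaml.safe_dump(
        data,
        allow_unicode=True,        # keep non-ASCII characters instead of \xE9-style escapes
        default_flow_style=False,  # block style: one "key: value" pair per line
        encoding=None,             # return text (unicode) rather than UTF-8 encoded bytes
    )


text_dict = {u"data": u"héhéhé, hûhûhû"}
dumped = dump_readable(text_dict)

# Round trip: the readable dump should load back to the same data.
reloaded = yaml.safe_load(dumped)
```

Writing `dumped` through a file opened with `codecs.open(file, "w", encoding="utf-8")`, as in the example above, should then leave the accented characters readable in the file.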