Backslash-escaping ASCII control characters in the middle of Unicode data is definitely a useful thing to be able to do. But it is not just a matter of escaping them; you also need to unescape them properly when you want the actual character data back.
There should be a way to do this in the Python stdlib, but there is not. I filed a bug report: http://bugs.python.org/issue18679
In the meantime, here is a workaround using translate and some hackery:
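    # Map every ASCII control character (0-31) to its repr() escape,
    # e.g. \t, \n, \r, or \x01 for those without a short form.
    tm = dict((k, repr(chr(k))[1:-1]) for k in range(32))
    # Use the single-letter escapes that repr() does not produce.
    tm[0] = r'\0'
    tm[7] = r'\a'
    tm[8] = r'\b'
    tm[11] = r'\v'
    tm[12] = r'\f'
    # Escape the backslash itself so the result can be unescaped unambiguously.
    tm[ord('\\')] = '\\\\'

    b = u"\n"
    c = b.translate(tm)
    print(c)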
All control characters that have no single-letter backslash escape will be escaped with the \x## sequence, but if you need something different for those, your translation table can do that. This approach is not lossy, though, so it works for me.
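To make the effect of the table concrete, here is a small sketch for Python 3 that rebuilds the same table and applies it to a made-up sample string. The name `sample` and its contents are only illustrative assumptions.

```python
# Rebuild the escape table described above (Python 3 syntax).
tm = {k: repr(chr(k))[1:-1] for k in range(32)}
tm.update({0: r'\0', 7: r'\a', 8: r'\b', 11: r'\v', 12: r'\f'})
tm[ord('\\')] = '\\\\'

# Hypothetical sample: a tab, a \x01 control character, a bell,
# a literal backslash, and a non-ASCII letter.
sample = 'a\tb\x01c\x07d\\e\u00e9'

escaped = sample.translate(tm)
print(escaped)  # a\tb\x01c\ad\\eé
```

Control characters and the backslash are rewritten as visible escapes, while the non-ASCII letter passes through untouched.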
But getting it back out is hacky too, because you cannot translate character sequences back into single characters using translate.
    d = c.encode('latin1', 'backslashreplace').decode('unicode_escape')
    print(d)
You actually have to encode the characters that map to bytes individually using latin1, while backslash-escaping the Unicode characters that latin1 does not know about, so that the unicode_escape codec can reassemble everything the right way.
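The two steps of that one-liner can be looked at separately. The following is a sketch for Python 3 with a made-up escaped string; it also shows why going through utf-8 instead of latin1 would not work, and one edge case of the `\0` escape that may be worth checking against your own data.

```python
# Escaped text as produced by the translate step; the backslashes are literal.
# Hypothetical sample, equivalent to the characters: a\tb\x01c €
c = 'a\\tb\\x01c \u20ac'

# Step 1: latin1 turns code points 0-255 into single bytes; anything else
# (here the euro sign) becomes a \uXXXX backslash escape.
as_bytes = c.encode('latin1', 'backslashreplace')
print(as_bytes)  # b'a\\tb\\x01c \\u20ac'

# Step 2: unicode_escape interprets all backslash escapes in one pass
# and reads every other byte as latin1.
d = as_bytes.decode('unicode_escape')
print(repr(d))  # 'a\tb\x01c €'

# Going through utf-8 instead mangles non-ASCII text, because
# unicode_escape treats each utf-8 byte as a separate latin1 character.
wrong = '\u00e9'.encode('utf-8').decode('unicode_escape')
print(wrong)  # Ã©

# Edge case: \0 directly followed by an octal digit (0-7) is read back as a
# longer octal escape, so NUL followed by the digit 1 does not survive.
print(repr(b'\\01'.decode('unicode_escape')))  # '\x01'
```

If NUL followed by a digit can occur in your data, leaving NUL mapped to the `\x00` form that repr() produces (that is, not overriding entry 0 in the table) avoids the ambiguity.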
UPDATE
So, I had a case where I needed this to work in both Python 2.7 and Python 3.3. Here is what I did (buried in a _compat.py module):
    # True on Python 2, where bytes is just another name for str.
    if isinstance(b"", str):
        byte_types = (str, bytes, bytearray)
        text_types = (unicode, )
        # u = unicode text, b = bytes, n = the native str type.
        def uton(x): return x.encode('utf-8', 'surrogateescape')
        def ntob(x): return x
        def ntou(x): return x.decode('utf-8', 'surrogateescape')
        def bton(x): return x
    else:
        byte_types = (bytes, bytearray)
        text_types = (str, )
        def uton(x): return x
        def ntob(x): return x.encode('utf-8', 'surrogateescape')
        def ntou(x): return x
        def bton(x): return x.decode('utf-8', 'surrogateescape')

    # Same table as before, with text (unicode) values on both versions.
    escape_tm = dict((k, ntou(repr(chr(k))[1:-1])) for k in range(32))
    escape_tm[0] = u'\\0'
    escape_tm[7] = u'\\a'
    escape_tm[8] = u'\\b'
    escape_tm[11] = u'\\v'
    escape_tm[12] = u'\\f'
    escape_tm[ord('\\')] = u'\\\\'

    def escape_control(s):
        if isinstance(s, text_types):
            return s.translate(escape_tm)
        else:
            # Bytes: decode, escape as text, then encode back to bytes.
            return s.decode('utf-8', 'surrogateescape').translate(escape_tm).encode('utf-8', 'surrogateescape')

    def unescape_control(s):
        if isinstance(s, text_types):
            return s.encode('latin1', 'backslashreplace').decode('unicode_escape')
        else:
            # Bytes: decode, unescape as text, then encode back to bytes.
            return s.decode('utf-8', 'surrogateescape').encode('latin1', 'backslashreplace').decode('unicode_escape').encode('utf-8', 'surrogateescape')
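For a quick sanity check, here is a condensed, Python 3 only sketch of the two helpers together with a round trip on text and on bytes. It is not the compatibility module itself: the Python 2 branch is left out, and the sample values `text` and `raw` are made up for illustration.

```python
# Python 3 only restatement of the helpers above, for demonstration.
escape_tm = {k: repr(chr(k))[1:-1] for k in range(32)}
escape_tm.update({0: '\\0', 7: '\\a', 8: '\\b', 11: '\\v', 12: '\\f'})
escape_tm[ord('\\')] = '\\\\'

def escape_control(s):
    if isinstance(s, str):
        return s.translate(escape_tm)
    text = s.decode('utf-8', 'surrogateescape')
    return text.translate(escape_tm).encode('utf-8', 'surrogateescape')

def unescape_control(s):
    if isinstance(s, str):
        return s.encode('latin1', 'backslashreplace').decode('unicode_escape')
    text = s.decode('utf-8', 'surrogateescape')
    text = text.encode('latin1', 'backslashreplace').decode('unicode_escape')
    return text.encode('utf-8', 'surrogateescape')

# Hypothetical samples: text with a literal backslash, a tab and a euro sign,
# and bytes containing NUL and another control character.
text = 'C:\\new\ttab \u20ac'
raw = b'key\x00\x1fvalue\n'

print(escape_control(text))                           # C:\\new\ttab €
print(unescape_control(escape_control(text)) == text)  # True
print(unescape_control(escape_control(raw)) == raw)    # True
```

Because the backslash itself is escaped, the literal `\n` inside `C:\new` is not mistaken for a newline when the string is unescaped.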