How to encode / decode escape characters in python

how to encode / decode escape character '\ x13' in python into a character that is valid in RSS or XML.

use case is, I get data from arbitrary sources and create an RSS feed for this data. The data source sometimes has an escape character that interrupts my RSS feed.

So how can I sanitize input with escape character.

+3
source share
1 answer

\x13(ASCII 19, 'DC3) cannot be escaped; it is not valid in XML 1.0, period. You can include one encoded as &#19;or &#x13;in XML 1.1, but then you must include the declaration <?xml version="1.1"?>, and many tools will not like it.

I don’t know why this character will be included in your data, but the way forward probably removes the control codes completely. For instance:

re.sub('[\x00-\x08\x0B-\x1F]', '', s)

For some types of escape sequences (such as ANSI color codes), you can leave the homeless (uncontrolled) characters still there, in which case you will probably need a custom parser for this particular format.

+2
source

Source: https://habr.com/ru/post/1733524/


All Articles