Efficient Python string obfuscation

Question

I need to obfuscate lines of Unicode text to slow down anyone who might extract them. Ideally this would use a Python built-in module or a small add-on library; the obfuscated string would be the same length as the original or shorter; and "unobfuscation" would be as fast as possible.

I have tried various character substitutions and XOR routines, but they are slow. Base64 and hexadecimal encoding increase the size considerably. So far, the most efficient method I have found is compressing with zlib at the lowest setting (1). Is there a better way?
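For reference, the zlib baseline described above might look like the following minimal sketch. The function names `zlib_obfuscate` and `zlib_unobfuscate` are hypothetical, and UTF-8 is assumed as the intermediate encoding; neither is stated in the question.

```python
import zlib


def zlib_obfuscate(text: str) -> bytes:
    # Assumption: UTF-8 as the byte encoding. Level 1 is zlib's fastest
    # (and weakest) compression setting.
    return zlib.compress(text.encode("utf-8"), 1)


def zlib_unobfuscate(blob: bytes) -> str:
    # Decompression does not take a level; it is the same for all settings.
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    line = "sensei = \N{HIRAGANA LETTER SE}\N{HIRAGANA LETTER N}" * 20
    blob = zlib_obfuscate(line)
    print(len(line.encode("utf-8")), len(blob))
    print(zlib_unobfuscate(blob) == line)
```

Note that zlib adds a small header and checksum, so very short lines can come out longer than the input; the size benefit only shows up on longer or repetitive text.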

+6

python string unicode

Tim 20 sept '11 at 17:11


2 answers

Tom zych · Answer 1 · 2011-09-20T18:09:22+0000

This uses a simple, fast XOR masking scheme on bytes objects.

    # For Python 3 - strings are Unicode, print is a function

    def obfuscate(byt):
        # Use same function in both directions. Input and output are bytes
        # objects.
        mask = b'keyword'
        lmask = len(mask)
        return bytes(c ^ mask[i % lmask] for i, c in enumerate(byt))

    def test(s):
        data = obfuscate(s.encode())
        print(len(s), len(data), data)
        newdata = obfuscate(data).decode()
        print(newdata == s)

    simple_string = 'Just plain ASCII'
    unicode_string = ('sensei = \N{HIRAGANA LETTER SE}\N{HIRAGANA LETTER N}'
                      '\N{HIRAGANA LETTER SE}\N{HIRAGANA LETTER I}')

    test(simple_string)
    test(unicode_string)

Python 2 version:

    # For Python 2

    mask = 'keyword'
    nmask = [ord(c) for c in mask]
    lmask = len(mask)

    def obfuscate(s):
        # Use same function in both directions. Input and output are
        # Python 2 byte strings (str), not unicode objects.
        return ''.join([chr(ord(c) ^ nmask[i % lmask])
                        for i, c in enumerate(s)])

    def test(s):
        data = obfuscate(s.encode('utf-8'))
        print len(s), len(data), repr(data)
        newdata = obfuscate(data).decode('utf-8')
        print newdata == s

    simple_string = u'Just plain ASCII'
    # Both literals need the u prefix, otherwise \N{...} is not expanded
    # in the second one.
    unicode_string = (u'sensei = \N{HIRAGANA LETTER SE}\N{HIRAGANA LETTER N}'
                      u'\N{HIRAGANA LETTER SE}\N{HIRAGANA LETTER I}')

    test(simple_string)
    test(unicode_string)
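Since the question mentions that per-character XOR loops were slow, one possible variation on the Python 3 version is to XOR the whole buffer at once as a big integer, which moves the loop out of Python bytecode. This is only a sketch and not part of the original answer; `obfuscate_fast` is a hypothetical name, and whether it is actually faster for a given line length should be measured (for example with `timeit`).

```python
def obfuscate_fast(data: bytes, mask: bytes = b"keyword") -> bytes:
    # Same function in both directions, like the per-byte version.
    n = len(data)
    if n == 0:
        return b""
    # Repeat the mask until it covers the data, then trim to length.
    key = (mask * (n // len(mask) + 1))[:n]
    # XOR the two buffers as arbitrary-precision integers.
    x = int.from_bytes(data, "big") ^ int.from_bytes(key, "big")
    # Convert back, keeping leading zero bytes so the length is preserved.
    return x.to_bytes(n, "big")


if __name__ == "__main__":
    s = "sensei = \N{HIRAGANA LETTER SE}\N{HIRAGANA LETTER N}"
    data = obfuscate_fast(s.encode())
    print(len(s.encode()), len(data))
    print(obfuscate_fast(data).decode() == s)
```

The output has the same number of bytes as the UTF-8 input, so the length requirement holds at the byte level.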

jterrace · Answer 2 · 2011-09-20T18:26:23+0000

How about the old ROT13 stunt?

    >>> x = 'some string'
    >>> y = x.encode('rot13')
    >>> y
    'fbzr fgevat'
    >>> y.decode('rot13')
    u'some string'

For a Unicode string:

    >>> x = u'國碼'
    >>> print x
    國碼
    >>> y = x.encode('unicode-escape').encode('rot13')
    >>> print y
    \h570o\h78op
    >>> print y.decode('rot13').decode('unicode-escape')
    國碼
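The sessions above are Python 2; in Python 3, `str.encode('rot13')` is no longer available, and the codec is reached through the `codecs` module instead. A rough Python 3 equivalent might look like this, where the helper names `rot13_obfuscate` and `rot13_unobfuscate` are hypothetical:

```python
import codecs


def rot13_obfuscate(text: str) -> str:
    # Escape non-ASCII characters to \uXXXX form first, so that ROT13
    # (which only rotates ASCII letters) also scrambles them.
    escaped = text.encode("unicode_escape").decode("ascii")
    return codecs.encode(escaped, "rot_13")


def rot13_unobfuscate(obfuscated: str) -> str:
    # ROT13 is its own inverse, so encoding again undoes it.
    escaped = codecs.encode(obfuscated, "rot_13")
    return escaped.encode("ascii").decode("unicode_escape")


if __name__ == "__main__":
    y = rot13_obfuscate("國碼")
    print(y)
    print(rot13_unobfuscate(y))
```

As in the Python 2 session, the escaping step turns each non-ASCII character into a six-character sequence, so the result is longer than the original for non-ASCII text.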
