A raw Unicode literal that is valid in both Python 2 and Python 3?

The ur"" syntax is no longer accepted in Python 3. However, I need it! "Why?" you ask. Well, I need the u prefix because it is a Unicode string and my code should also work in Python 2. As for the r prefix, maybe it is not essential, but the markup format I use requires a lot of backslashes, and raw strings help avoid mistakes.

Here is an example that does what I want in Python 2, but is illegal in Python 3:

 tamil_letter_ma = u"\u0bae"
 marked_text = ur"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma

After running into this problem, I found http://bugs.python.org/issue15096 and noticed this quote:

It is easy to overcome the limitation.

Would anyone care to suggest how?

Related: What exactly do the "u" and "r" string flags do in Python, and what are raw string literals?

+5
4 answers

Why don't you just use a raw string literal ( r'....' )? You do not need to specify u, because in Python 3 all strings are Unicode strings.

 >>> tamil_letter_ma = "\u0bae"
 >>> marked_text = r"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma
 >>> marked_text
 '\\aம\\bthe Tamil\\cletter\\dMa\\e'

To make it work in Python 2.x, add the following __future__ import at the very beginning of your source file so that all string literals in that file become Unicode.

 from __future__ import unicode_literals 
+10

The preferred way is to drop the u'' prefix and use from __future__ import unicode_literals, as @falsetru suggested. But in your particular case, you can exploit the fact that in Python 2 "ascii-only bytestring" % unicode returns Unicode:

 >>> tamil_letter_ma = u"\u0bae"
 >>> marked_text = r"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma
 >>> marked_text
 u'\\a\u0bae\\bthe Tamil\\cletter\\dMa\\e'
+1

Regarding Python 3: all strings are Unicode by default. The u prefix lost its function and was removed from the language.

However, starting with version 3.3, u is accepted again (and ignored) to simplify porting from Python 2.

So one way to make your u"" strings compatible with both Python 3 and Python 2 is to require Python 3.3+, which was released in 2012.
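Note that this covers the plain u prefix only; the combined ur prefix from the question is still rejected by Python 3. A minimal sketch, assuming doubled backslashes are acceptable in your markup strings (the helper name accepts_ur_prefix is made up for illustration):

```python
# Runs on Python 2.7 and on Python 3.3+.
tamil_letter_ma = u"\u0bae"

# Plain u"" literal with every backslash doubled by hand.
escaped = u"\\a%s\\bthe Tamil\\cletter\\dMa\\e" % tamil_letter_ma

# The same text written as a raw literal without the u prefix.
raw = r"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma

print(escaped == raw)  # True


def accepts_ur_prefix():
    """Return True if this interpreter can compile a ur"" literal."""
    try:
        compile('ur"\\a"', "<snippet>", "eval")
    except SyntaxError:
        return False
    return True


print(accepts_ur_prefix())  # False on Python 3, True on Python 2
```

Doubling the backslashes is error-prone, which is exactly what the question wants to avoid, so this is more of a fallback than a recommendation.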

0

Unicode strings are the default in Python 3.x, so using r alone gives essentially the same result as ur in Python 2.
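One caveat before relying on that equivalence: according to the Python 2 language reference, ur"" literals still process \uXXXX and \UXXXXXXXX escapes, whereas Python 3 raw strings leave them alone. A small sketch, meant to be run on Python 3 only (the variable names are just for illustration):

```python
# Python 3: a raw literal keeps \u0bae as six separate characters.
raw_py3 = r"\u0bae"
print(len(raw_py3))  # 6

# A non-raw literal interprets the escape as one code point.
cooked = "\u0bae"
print(len(cooked))  # 1

print(raw_py3 == cooked)  # False
```

So if the markup itself ever contains a literal backslash followed by u, the two versions may disagree; the example in the question does not, so it is unaffected.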

-1

Source: https://habr.com/ru/post/1233293/

