Is there a good reason not to use unicode instead of str?

Many of the problems I have run into in Python come down to something not being in Unicode. Is there a good reason not to use Unicode by default? I understand that you sometimes need to convert something to ASCII, but that seems to be the exception, not the rule.

I know Python 3 uses Unicode for all strings. Should this encourage me as a developer to call unicode() on all my strings?

+6
2 answers

In general, no: there is no good reason to use str over unicode. Also remember that you do not need to call unicode() to create a Unicode string; you can prefix the literal with a lowercase u, as in u"this is a unicode string".
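A minimal sketch of the u prefix, for illustration only (the variable name text is made up). On Python 2 the literal is a unicode object; Python 3.3 and later still accept the prefix, but there it simply yields an ordinary str.

```python
# The \u00e9 escape keeps the source file pure ASCII.
text = u"caf\u00e9"

# Prints "unicode" on Python 2 and "str" on Python 3.3+.
print(type(text).__name__)

# Either way the value is counted in characters, not bytes.
print(len(text))
```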

+6

In Python 2.x:

  • The str object is just a sequence of bytes.
  • A unicode object is a sequence of characters.

Knowing this, it should be easy to choose the right type:

  • If you want a string of characters, use unicode.
  • If you want a string encoded as bytes, use str (in many other languages you would use byte[]).
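The difference between the two types shows up as soon as you encode. The sketch below is only an illustration; the names text, data and round_trip are made up, and UTF-8 is an assumed encoding, not something the question specifies.

```python
# A character string: unicode on Python 2, str on Python 3.
text = u"na\u00efve"

# Encoding turns characters into bytes: str on Python 2, bytes on Python 3.
data = text.encode("utf-8")

print(len(text))  # 5 characters
print(len(data))  # 6 bytes, because the accented letter takes two bytes in UTF-8

# Decoding with the same encoding recovers the original characters.
round_trip = data.decode("utf-8")
print(round_trip == text)
```

Keeping text as characters inside the program and converting to bytes only at the boundaries (files, sockets) tends to avoid most encoding surprises.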

In Python 3.x, the str type is a string of characters, as you would expect. You can use bytes if you need a sequence of bytes.
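As a rough illustration of the Python 3 behaviour (the names here are made up): str and bytes are separate types, and Python 3 refuses to mix them implicitly, whereas Python 2 would attempt an implicit ASCII conversion.

```python
# Python 3 only.
greeting = "h\u00e9llo"          # str: a sequence of characters
raw = greeting.encode("utf-8")   # bytes: a sequence of byte values

print(type(greeting).__name__, type(raw).__name__)

# Concatenating the two raises TypeError instead of guessing an encoding.
try:
    greeting + raw
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(mixing_allowed)
```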

+4

Source: https://habr.com/ru/post/902597/

