Inline::Python: stringifying an object that holds a Unicode string

Inline::Python::Object overloads '""' (stringification) with this:

sub __inline_str__ {
    my ($self) = @_;
    return Inline::Python::py_has_attr($self, '__str__') ? $self->__str__() : $self;
}

The __str__() method tries to convert to ASCII, which means that if the Inline::Python::Object represents a Python Unicode string, the likely result is:

exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 6: ordinal not in range(128) on line 1252

One workaround that seems to work is replacing $self->__str__() with $self->encode('utf8'). I don't really like modifying a module like this, and subclassing seems like a significant pain. Moreover, I am not 100% sure why my fix works, which is a little worrying.

I am sure I am not the first person to use Python Unicode strings from Perl. How should this be done?

1 answer

One workaround that seems to work is replacing $self->__str__() with $self->encode('utf8').

That is the right way to handle it. On a Python 2 Unicode string, __str__() encodes with the ASCII codec and fails on any non-ASCII character, whereas encode('utf-8') returns a UTF-8 byte string:

>>> u'\ufdef'.__str__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufdef' in position 0: ordinal not in range(128)
>>> u'\ufdef'.encode('utf-8')
'\xef\xb7\xaf'

After that, you probably want to decode the UTF-8 bytes on the Perl side (for example with the Encode module) so that the value is displayed correctly.
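To see why the fix works, here is a minimal sketch of the same mechanism. It is written for Python 3, so the implicit ASCII conversion that Python 2's __str__() performs is spelled out as an explicit encode("ascii"). The sample string and the helper name try_ascii are made up for illustration and are not part of Inline::Python.

```python
# Python 3 sketch. In Python 2, unicode.__str__() did the ASCII encode
# implicitly; here it is written out explicitly.
text = "fran\u00e7ais"  # hypothetical sample value containing U+00E7


def try_ascii(s):
    """Return the ASCII bytes, or None if s has non-ASCII characters."""
    try:
        return s.encode("ascii")
    except UnicodeEncodeError:
        # Same failure as in the question: ordinal not in range(128)
        return None


ascii_bytes = try_ascii(text)            # fails, so this is None
utf8_bytes = text.encode("utf-8")        # UTF-8 can represent every code point
round_trip = utf8_bytes.decode("utf-8")  # the decode step the receiver must do

print(ascii_bytes, utf8_bytes, round_trip == text)
```

The encode step only yields bytes. Whoever receives them (Perl, in this case) still has to decode them as UTF-8 to get characters back, which is what the final decode line stands in for.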


Source: https://habr.com/ru/post/1531331/

