How to save unicode using SQLAlchemy?

I encountered this error:

    File "/vagrant/env/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 435, in do_execute
      cursor.execute(statement, parameters)
    exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 8410: ordinal not in range(128)

This happens when I'm trying to save an ORM object that has a Python unicode string assigned to it. As a result, the parameters dict contains a unicode string as one of its values, and the error is raised when that value is coerced to str.

I tried setting convert_unicode=True on both the engine and the columns, but to no avail.

So what is a good way to handle unicode in SQLAlchemy?

UPDATE

Here are some details about my setup:

Table:

                                    Table "public.documents"
       Column   |           Type           |                       Modifiers
    ------------+--------------------------+--------------------------------------------------------
     id         | integer                  | not null default nextval('documents_id_seq'::regclass)
     sha256     | text                     | not null
     url        | text                     |
     source     | text                     | not null
     downloaded | timestamp with time zone | not null
     tags       | json                     | not null
    Indexes:
        "documents_pkey" PRIMARY KEY, btree (id)
        "documents_sha256_key" UNIQUE CONSTRAINT, btree (sha256)

ORM Model:

    class Document(Base):
        __tablename__ = 'documents'

        id = Column(INTEGER, primary_key=True)
        sha256 = Column(TEXT(convert_unicode=True), nullable=False, unique=True)
        url = Column(TEXT(convert_unicode=True))
        source = Column(TEXT(convert_unicode=True), nullable=False)
        downloaded = Column(DateTime(timezone=True), nullable=False)
        tags = Column(JSON, nullable=False)

SQLAlchemy setup:

    ENGINE = create_engine('postgresql://me:secret@localhost/my_db',
                           encoding='utf8', convert_unicode=True)
    Session = sessionmaker(bind=ENGINE)

The code that triggers the error simply creates a session, creates a Document object, and saves it with a unicode string assigned to its source field.

UPDATE #2

Check out this repo: it automates the Vagrant/Ansible setup and reproduces the error.

+6
3 answers

Your problem is here:

    $ sudo grep client_encoding /etc/postgresql/9.3/main/postgresql.conf
    client_encoding = sql_ascii

This causes psycopg2 to default to ASCII:

    >>> import psycopg2
    >>> psycopg2.connect('dbname=dev_db user=dev').encoding
    'SQLASCII'

... which effectively disables psycopg2's ability to handle Unicode.
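
To see why an ASCII client encoding is fatal here, the failure can be reproduced with no database at all. This is only a minimal sketch: it assumes that unicode parameters end up being encoded with the Python codec that matches the connection encoding (`ascii` for SQL_ASCII), and the helper name `encode_for_client` is hypothetical, not part of psycopg2 or SQLAlchemy.

```python
def encode_for_client(text, codec):
    """Hypothetical helper: encode a text parameter with the codec
    implied by the client encoding (assumed 'ascii' or 'utf-8')."""
    return text.encode(codec)


value = u"\u2013test\u2013"  # EN DASH, the character named in the traceback

try:
    encode_for_client(value, "ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # Same exception class as in the question's traceback.
    ascii_ok = False

# With a UTF-8 codec the same value encodes without error.
utf8_bytes = encode_for_client(value, "utf-8")
```

Under that assumption, switching the client encoding to UTF-8 is what lets the same value through.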

You can fix this in postgresql.conf:

 client_encoding = utf8 

(and then sudo invoke-rc.d postgresql reload), or you can explicitly specify the encoding when creating the engine:

 self._conn = create_engine(src, client_encoding='utf8') 

I recommend the former, because the early nineties are long gone. :)

+9

I cannot reproduce your problem (you also did not show how you actually add your objects to the database, so the mistake may be there). However, I recommend that you test your code in complete isolation from the rest of your system, to make sure it works as intended without interference from your other code. I created this file solely to test what you want to do, and the main method inserted the corresponding object into the database as a string.

    # encoding: utf-8
    from sqlalchemy import Column, INTEGER, TEXT
    from sqlalchemy import create_engine, MetaData
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()


    class Demo(Base):
        __tablename__ = 'demo'

        id = Column(INTEGER, primary_key=True)
        key = Column(TEXT(convert_unicode=True))
        value = Column(TEXT(convert_unicode=True))


    class Backend(object):
        def __init__(self, src=None):
            # Fall back to an in-memory SQLite database when no URL is given.
            if not src:
                src = 'sqlite://'
            self._conn = create_engine(src)
            self._metadata = MetaData()
            self._metadata.reflect(bind=self._conn)
            # Create the demo table if it does not exist yet.
            Base.metadata.create_all(self._conn)
            self._sessions = sessionmaker(bind=self._conn)

        def session(self):
            return self._sessions()


    def main():
        backend = Backend('postgresql://postgres@localhost/test')
        s = backend.session()
        obj = Demo()
        obj.key = 'test'
        obj.value = u'–test–'
        s.add(obj)
        s.commit()
        return backend

Running this inside the interpreter:

    >>> b = main()
    >>> s = b.session()
    >>> s.query(Demo).get(1).value
    u'\u2013test\u2013'

And inside psql:

    postgres=# \c test
    You are now connected to database "test" as user "postgres".
    test=# select * from demo;
     id | key  | value
    ----+------+--------
      1 | test | –test–
    (1 row)

Sorry I couldn't help you directly, but I hope this helps you (or someone else) find out why your code raises a Unicode encoding error. The software versions I used are python-2.7.7, sqlalchemy-0.9.6, psycopg2-2.5.3, postgresql-9.3.4.
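
In the same spirit of testing in isolation, a unicode round trip can be checked with nothing but the standard library. This is only a sketch under stated assumptions: it uses the `sqlite3` module directly instead of SQLAlchemy and PostgreSQL, so it says nothing about psycopg2's client encoding. The table layout mirrors the `demo` table above, and `roundtrip` is a hypothetical helper name.

```python
import sqlite3


def roundtrip(value):
    """Insert a text value into an in-memory SQLite table and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE demo (id INTEGER PRIMARY KEY, "key" TEXT, "value" TEXT)'
        )
        conn.execute(
            'INSERT INTO demo ("key", "value") VALUES (?, ?)', (u"test", value)
        )
        conn.commit()
        row = conn.execute('SELECT "value" FROM demo WHERE id = 1').fetchone()
        return row[0]
    finally:
        conn.close()


stored = roundtrip(u"\u2013test\u2013")
```

If the value comes back unchanged here but fails against PostgreSQL, that points at the driver or server configuration rather than at the application code.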

+3

I cannot reproduce your error. I can offer some tips on handling unicode with SQLAlchemy, which may or may not help:

  • Instead of using convert_unicode, just use the sqlalchemy.types.Unicode column type. It will always do the right thing.
  • You assign a str instance ('key') to the key column, even though you used convert_unicode=True. Either assign a unicode value, or use a non-Unicode column type.
  • Always check that the encoding of your PostgreSQL database is set to UTF-8.
  • Usually you do not need the encoding and convert_unicode arguments to create_engine.
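
One way to follow the second tip is to normalise values to text before assigning them to a column. The sketch below is an illustration, not part of SQLAlchemy: `ensure_text` is a hypothetical helper, and it assumes that any byte strings it receives are UTF-8 encoded.

```python
def ensure_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings with the given encoding.

    Assumption: byte strings passed in are encoded as `encoding`.
    """
    if isinstance(value, bytes):  # on Python 2, bytes is an alias of str
        return value.decode(encoding)
    return value


key = ensure_text(b"test")                # byte string is decoded to text
value = ensure_text(u"\u2013test\u2013")  # text passes through unchanged
```

Calling such a helper at the point where data enters the application keeps plain byte strings from reaching the ORM at all.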

0

Source: https://habr.com/ru/post/972362/