Why does SQLAlchemy create_engine with charset=utf8 return the Python str type and not the unicode type?
Using Python 2.7 and SQLAlchemy 0.7, I connect to the MySQL database using the command:
    engine = create_engine('mysql://username:password@host/dbname?charset=utf8', echo=False)

According to the SQLAlchemy docs, setting charset=utf8 automatically implies use_unicode=1, so all strings should be returned as unicode. http://docs.sqlalchemy.org/en/rel_0_7/dialects/mysql.html gives a concrete example:

    # set client encoding to utf8; all strings come back as unicode
    create_engine('mysql+mysqldb:///mydb?charset=utf8')
So why, then, when I read a text field from a mapped class, does the field come back with type str?
    Base = declarative_base(engine)

    class RegionTranslation(Base):
        ''''''
        __tablename__ = 'RegionTranslation'
        __table_args__ = {'autoload': True}

        def __init__(self, region_id, lang_id, name):
            self.region_id = region_id
            self.lang_id = lang_id
            self.name = name

    rtrans = session.query(RegionTranslation).filter_by(region_id=1, lang_id=6).one()
    print(type(rtrans.name))

Output:

    <type 'str'>

If I just accept this and decode the string before using it, everything works. But I don't understand why the code above does not return the unicode type. Can someone please explain this?
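For reference, the decode workaround mentioned above could look roughly like the sketch below. The helper name to_text and the sample byte string are made up for illustration, and it assumes the column data really is UTF-8 encoded, as the charset=utf8 connection setting suggests.

```python
# Minimal sketch of the manual decode workaround.
# `to_text` is a hypothetical helper, not part of SQLAlchemy or MySQLdb.

def to_text(value, encoding='utf-8'):
    """Return `value` as text, decoding it only if it is a byte string."""
    # On Python 2.7, `bytes` is an alias for `str`, so this check matches
    # the <type 'str'> values coming back from the driver.
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Already text (or None, or a number): leave it untouched.
    return value


# Sample UTF-8 bytes standing in for a value such as rtrans.name.
raw_name = b'\xc3\x85land'
name = to_text(raw_name)
```

Values that are already unicode pass through unchanged, so the helper can stay in place after the underlying cause is fixed.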
I finally found the answer after noticing that another script, which I had run successfully many times, no longer worked.
I had changed the collation in my database from utf8_general_ci to utf8_bin. There is a bug in MySQLdb 1.2.3 that causes utf8_bin strings not to be recognized as text, so the conversion to unicode does not happen. This has been fixed in MySQLdb 1.2.4.
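To check whether an installation is affected, one option is to compare the driver version against 1.2.4. The sketch below is only an illustration: the function name has_utf8_bin_fix is made up, and it assumes the version is available as a tuple such as MySQLdb.version_info, with the first three items being the numeric version.

```python
# Hypothetical check for the MySQLdb release that fixed utf8_bin handling.

FIXED_IN = (1, 2, 4)  # release named in the answer above


def has_utf8_bin_fix(version_info):
    """Return True if the numeric part of `version_info` is at least 1.2.4."""
    # Compare only the first three items; the tuple may carry extra
    # fields such as a release level and serial.
    return tuple(version_info[:3]) >= FIXED_IN


# Illustrative tuples rather than values read from a real installation.
# With MySQLdb installed, one would pass MySQLdb.version_info instead.
affected = not has_utf8_bin_fix((1, 2, 3, 'final', 0))
fixed = has_utf8_bin_fix((1, 2, 4, 'final', 0))
```

If upgrading is not possible, decoding the returned byte strings by hand, as described in the question, remains a workable fallback.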