MySQL VARBINARY vs VARCHAR

We use VARCHAR(255) to store "keywords" in MySQL. Our problem is that MySQL ignores all trailing spaces when comparing with "=". It does respect trailing spaces in a "LIKE" comparison, but it does not let us store the same word both with and without trailing spaces in a VARCHAR column that has a "UNIQUE" index on it.

So we are considering switching to VARBINARY. Can anyone suggest what the consequences might be when the column contains multibyte characters?

3 answers

Andomar,

We are using version 5.0.5. All versions of MySQL ignore trailing spaces in comparisons. From the manual:

All MySQL collations are of type PADSPACE. This means that all CHAR and VARCHAR values in MySQL are compared without regard to any trailing spaces. This is true for all MySQL versions, and it makes no difference whether your version trims trailing spaces from VARCHAR values before storing them.

Moreover, MySQL treats values that differ only in trailing spaces as duplicates in unique indexes:

For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.
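The duplicate-key behaviour quoted above can be modelled outside the database. This is a minimal Python sketch, not MySQL's actual implementation: `pad_space_key`, `UniqueIndex` and `DuplicateKeyError` are hypothetical names, and the model assumes only that the unique index compares values with trailing spaces ignored.

```python
class DuplicateKeyError(Exception):
    """Raised when a value collides with an existing key in this toy model."""


def pad_space_key(value: str) -> str:
    # Model of a PADSPACE comparison: trailing spaces do not count.
    # Leading and inner spaces are still significant.
    return value.rstrip(" ")


class UniqueIndex:
    """Toy model of a UNIQUE index on a VARCHAR column (hypothetical)."""

    def __init__(self) -> None:
        self._keys = set()

    def insert(self, value: str) -> None:
        key = pad_space_key(value)
        if key in self._keys:
            raise DuplicateKeyError(repr(value))
        self._keys.add(key)


index = UniqueIndex()
index.insert("a")
try:
    index.insert("a ")  # differs from "a" only by a trailing space
    collided = False
except DuplicateKeyError:
    collided = True
```

In this model `collided` ends up true, which mirrors the manual's 'a' versus 'a ' example.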

And we absolutely need an index on the keyword column. So I have two options: VARBINARY or TEXT. We will evaluate the performance of TEXT and the behaviour of multibyte functions with VARBINARY.


This is what the MySQL manual says about trailing spaces:

Handling of trailing spaces is version-dependent. As of MySQL 5.0.3, trailing spaces are retained when values are stored and retrieved, in conformance with standard SQL. Before MySQL 5.0.3, trailing spaces are removed from values when they are stored into a VARCHAR column; this means that the spaces also are absent from retrieved values.

Since your question says that MySQL doesn't respect trailing spaces, I assume your version is lower than 5.0.3. Consider using a TEXT type for your column; TEXT columns preserve trailing spaces. TEXT will handle string encoding and decoding for you, so you don't need to worry about multibyte characters.

TEXT is slower than VARBINARY. If testing with real data shows that the performance is unacceptable, you may need VARBINARY (or BLOB). In that case, you have to store the string in a specific encoding, for example UTF-8. As long as all your clients use the same encoding, this will work fine for multibyte characters. Test your clients with different locale settings :)
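To illustrate why all clients have to agree on one encoding, here is a small Python sketch of what a binary column sees. It assumes only that binary values are stored and compared byte by byte; the keyword "café" is an arbitrary example.

```python
keyword = "caf\u00e9"  # "café", written with an escape to avoid ambiguity

utf8_bytes = keyword.encode("utf-8")
latin1_bytes = keyword.encode("latin-1")

# Character count versus byte count: "é" takes two bytes in UTF-8,
# so a length limit expressed in bytes is reached sooner than one in characters.
char_count = len(keyword)
utf8_length = len(utf8_bytes)
latin1_length = len(latin1_bytes)

# The same text encoded by two differently configured clients yields
# different byte strings, so a byte-wise comparison treats them as unequal.
same_bytes = utf8_bytes == latin1_bytes

# A byte-wise comparison is also case-sensitive.
case_match = "Caf\u00e9".encode("utf-8") == utf8_bytes
```

Here the keyword has 4 characters but 5 bytes in UTF-8, and both `same_bytes` and `case_match` are false. With a binary column, length limits therefore count bytes rather than characters, and case-insensitive or collation-aware matching no longer comes for free.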


In addition to the trailing-space problem, your UNIQUE index in MySQL will be limited to 767 bytes (which makes 767 / 3 ≈ 255 characters for 3-byte UTF-8).
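The arithmetic behind that figure, as a short Python sketch. The 767-byte limit is taken from the answer as stated; `max_indexable_chars` is a hypothetical helper, and the 4-byte and 1-byte cases are added only for comparison.

```python
INDEX_KEY_LIMIT_BYTES = 767  # limit quoted in the answer above


def max_indexable_chars(max_bytes_per_char: int,
                        limit: int = INDEX_KEY_LIMIT_BYTES) -> int:
    # The index has to reserve the worst-case byte width for every character,
    # so the number of characters that fit is the floor of the division.
    return limit // max_bytes_per_char


utf8_3byte = max_indexable_chars(3)   # 3-byte UTF-8, as in the answer
utf8_4byte = max_indexable_chars(4)   # a 4-byte encoding, for comparison
single_byte = max_indexable_chars(1)  # one byte per character
```

With these inputs the results are 255, 191 and 767 characters respectively.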


Source: https://habr.com/ru/post/1285992/

