Maximum index length with InnoDB and UTF-8

I have read that MySQL 5.6 can only index the first 767 bytes of a varchar (or other text type) column. My character set is utf-8, so each character can take up to 3 bytes. Since 767/3 ≈ 255.67, this suggests that the longest text column that can be fully indexed is 255 characters. Experiment seems to confirm this, since the following works:

 create table gaga ( val varchar(255), index(val) ) engine = InnoDB; 

But changing the definition of val to varchar(256) gives "Error Code: 1071. Specified key was too long; max key length is 767 bytes."

In this day and age, a limit of 255 characters seems awfully low, so: is this correct? If so, what is the best way to index larger pieces of text with MySQL? (Should I avoid it? Store a SHA hash? Use a different index type? Use a different character encoding for the database?)

1 answer

Although the restriction may seem ridiculous, it makes you think about whether you really need an index on such a long varchar field. Even at 767 bytes the index grows very quickly, and for a large table (where an index is most useful) it most likely will not fit in memory.

On the other hand, the only frequent case, at least in my experience, where I needed to index a long varchar field was a unique constraint. In all of those cases, a composite index on some group identifier and the MD5 of the varchar field was sufficient. The only problem is mimicking a case-insensitive collation (one that treats accented and unaccented characters as equal), although in all my cases I used a binary collation anyway, so it was not a problem.
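A minimal sketch of the composite unique index described above. The table and column names (documents, group_id, val, val_md5) are hypothetical and chosen only for illustration. MySQL 5.6 has no generated columns, so this assumes the application (or a trigger) fills in the hash column on every write.

```sql
-- Hypothetical schema: names are illustrative, not from the original post.
CREATE TABLE documents (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
    group_id INT UNSIGNED NOT NULL,
    val      VARCHAR(2000) NOT NULL,
    -- MD5() returns 32 hex digits; UNHEX() packs them into 16 raw bytes.
    val_md5  BINARY(16) NOT NULL,
    PRIMARY KEY (id),
    -- 4 + 16 bytes per entry, far below the 767-byte limit.
    UNIQUE KEY uq_group_val (group_id, val_md5)
) ENGINE = InnoDB DEFAULT CHARSET = utf8;

-- The writer must compute the hash itself on insert and update.
INSERT INTO documents (group_id, val, val_md5)
VALUES (1, 'some long text', UNHEX(MD5('some long text')));

-- Lookup goes through the hash; comparing val as well guards against
-- the (unlikely) case of an MD5 collision.
SELECT id
FROM documents
WHERE group_id = 1
  AND val_md5 = UNHEX(MD5('some long text'))
  AND val = 'some long text';
```

Because the hash is computed over the raw bytes of the string, 'Text' and 'text' produce different hashes. That is the collation caveat mentioned above: the constraint behaves as if the column had a binary collation.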

UPD: Another common case for indexing a long varchar is ordering. In this case, I usually define a separate indexed sorter field, which holds a prefix of 5-15 characters, depending on the distribution of the data. For me, a compact index is worth the occasional inaccurate ordering.
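The sorter field could be sketched roughly as follows. The names (articles, title, title_sort) and the prefix length of 10 are assumptions made for the example. As with the hash column, the application is assumed to keep the prefix in sync with the full value.

```sql
-- Hypothetical schema: names and the prefix length are illustrative.
CREATE TABLE articles (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title      VARCHAR(1000) NOT NULL,
    -- Short prefix of title, used only for ordering.
    title_sort VARCHAR(10) NOT NULL,
    PRIMARY KEY (id),
    KEY idx_title_sort (title_sort)
) ENGINE = InnoDB DEFAULT CHARSET = utf8;

-- LEFT() counts characters, not bytes, so multi-byte characters are not split.
INSERT INTO articles (title, title_sort)
VALUES ('A fairly long article title', LEFT('A fairly long article title', 10));

-- Ordering uses the compact index; rows whose titles share the same
-- first 10 characters may come back in arbitrary relative order.
SELECT id, title
FROM articles
ORDER BY title_sort
LIMIT 20;
```

The trade-off is the one stated above: the index stays small, at the cost of an imprecise order among rows that differ only after the prefix.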


Source: https://habr.com/ru/post/943991/

