Collation and character set are two different things.
A character set is simply an "unordered" list of characters and their representations. utf8mb4 is a character set, and it covers a very large number of characters.
A collation defines the order of the characters (it decides the final result of an ORDER BY, for example) and defines other rules (for example, which characters or character combinations should be treated as equal). Collations are derived from character sets, and there can be more than one collation for the same character set. (It is an extension to the character set, sort of.)
In utf8mb4_unicode_ci all (most?) accented characters are treated as the same character as their unaccented form, which is why u and ü match. In short, this collation is an accent-insensitive collation.
This is similar to the fact that German collations treat ss and ß as the same.
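To make "accent insensitive" a bit more concrete, here is a rough Python sketch. It is not MySQL's actual algorithm (that is based on Unicode collation weights); fold_key and same_ai_ci are hypothetical helpers that only approximate the idea by folding case and stripping combining accents.

```python
import unicodedata


def fold_key(text: str) -> str:
    """Hypothetical helper: reduce text to a case- and accent-insensitive key."""
    # casefold() lowercases the text and expands "ß" to "ss"
    folded = text.casefold()
    # NFD splits "ü" into "u" plus a combining diaeresis, which is dropped below
    decomposed = unicodedata.normalize("NFD", folded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_ai_ci(a: str, b: str) -> bool:
    """Compare two strings the way an accent- and case-insensitive collation roughly would."""
    return fold_key(a) == fold_key(b)


print(same_ai_ci("u", "ü"))            # compared through the folded key
print(same_ai_ci("Straße", "STRASSE"))  # ß is expanded to ss before comparing
print("u" == "ü")                       # plain comparison, no folding
```

The point is only that the comparison happens on a simplified key, not on the characters as stored.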
utf8mb4_bin is another collation, and it treats all characters as different. You may or may not want to use it as the default; that is up to you and your business rules.
You can also convert the collation in queries, but be aware that applying a different collation to an indexed column generally prevents MySQL from using that index.
Here's an example using a similar, but perhaps a bit more familiar, aspect of collations:
The ci at the end of a collation name means Case Insensitive, and many ci collations have a counterpart ending in cs, which means Case Sensitive.
When your column is case insensitive, the condition WHERE column = 'foo' will match all of these: foo, Foo, fOo, foO, FOo, FoO, fOO, FOO.
Now, if you switch to a case-sensitive collation (for example utf8mb4_0900_as_cs in MySQL 8.0; MySQL itself does not ship a utf8mb4_unicode_cs), all of the above values are treated as different values.
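If you want to play with this without a MySQL server, the following sketch uses Python's built-in sqlite3 module as a stand-in. This is an assumption on my part for illustration: SQLite's NOCASE and BINARY collations are not the MySQL ones (NOCASE only folds ASCII letters), and the table t is hypothetical, but the behaviour of the WHERE condition is analogous. It also shows overriding the collation inside a single query with COLLATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOCASE is SQLite's built-in case-insensitive collation (ASCII letters only)
conn.execute("CREATE TABLE t (name TEXT COLLATE NOCASE)")
values = ["foo", "Foo", "fOo", "foO", "FOo", "FoO", "fOO", "FOO"]
conn.executemany("INSERT INTO t (name) VALUES (?)", [(v,) for v in values])

# The column's case-insensitive collation is used for the comparison
ci_matches = [row[0] for row in conn.execute(
    "SELECT name FROM t WHERE name = 'foo'")]

# An explicit COLLATE in the query overrides the column's collation
cs_matches = [row[0] for row in conn.execute(
    "SELECT name FROM t WHERE name = 'foo' COLLATE BINARY")]

print(len(ci_matches), "rows with the case-insensitive column collation")
print(cs_matches, "with the binary collation forced in the query")
```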
Language-specific collations (German, British, American, Hungarian, and so on) follow the rules of the named language. In German, ss and ß are considered the same, and this is stated in the rules of the German language. When a German user searches for Straße, they expect software (that supports German or was written in Germany) to return both Straße and Strasse.
To go further: when it comes to ordering, the two words are the SAME, they are equal, so there is no defined order between them.
Remember that a UNIQUE constraint is just another way of ordering / filtering values. So if a unique key is defined on a column with a German collation, it will not allow inserting both Straße and Strasse, since by the rules of the language they are considered equal.
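The unique-key effect can be reproduced in miniature. The sketch below again uses Python's sqlite3 as a stand-in for MySQL and registers a custom collation; german_like and the streets table are hypothetical names, and the comparison rule (Python's casefold, which expands ß to ss) is only an approximation of a real German collation.

```python
import sqlite3


def german_like(a: str, b: str) -> int:
    """Hypothetical collation: compare case-folded strings, so ß equals ss."""
    ka, kb = a.casefold(), b.casefold()
    return (ka > kb) - (ka < kb)  # -1, 0 or 1, as SQLite expects


conn = sqlite3.connect(":memory:")
conn.create_collation("german_like", german_like)
conn.execute("CREATE TABLE streets (name TEXT COLLATE german_like UNIQUE)")
conn.execute("INSERT INTO streets (name) VALUES ('Straße')")

try:
    # Equal to 'Straße' under the collation, so the unique key refuses it
    conn.execute("INSERT INTO streets (name) VALUES ('Strasse')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Searching with the other spelling still finds the stored row
found = [row[0] for row in conn.execute(
    "SELECT name FROM streets WHERE name = 'STRASSE'")]

print(duplicate_rejected, found)
```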
Now let's take a look at our original collation: utf8mb4_unicode_ci. This is a "universal" collation, which means that it tries to simplify everything: since ü is not a very common character and most users do not know how to type it, this collation makes it equal to u. This is a simplification to support most languages, but as you already know, such simplifications have side effects (for ordering, filtering, unique constraints, and so on).
utf8mb4_bin is the other end of the spectrum. This collation is designed to be as strict as possible. To achieve this, it literally uses the character codes to distinguish characters. This means that every form of a character is different; this collation is implicitly case sensitive and accent sensitive.
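As a rough illustration of "uses the character codes", the Python sketch below compares and sorts strings by their UTF-8 bytes. This only mimics the idea behind a binary collation; bin_key is a hypothetical helper, not anything MySQL exposes.

```python
def bin_key(text: str) -> bytes:
    """Hypothetical helper: the raw UTF-8 bytes act as the comparison key."""
    return text.encode("utf-8")


words = ["foo", "Foo", "für", "fur"]

# No folding at all: different case or accent means different bytes
print(bin_key("foo") == bin_key("Foo"))
print(bin_key("fur") == bin_key("für"))

# Ordering follows the byte values: uppercase ASCII sorts before lowercase,
# and the multi-byte "ü" sorts after every ASCII letter
bin_sorted = sorted(words, key=bin_key)
print(bin_sorted)
```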
Both approaches have drawbacks: the localized and general collations are designed either for one specific language or to provide a common solution. (utf8mb4_unicode_ci is the "extension" of the old utf8_general_ci collation.)
The binary collation requires special care when it comes to user interaction. Since it is CS and AS, it can confuse users who are used to getting "Foo" when they search for "foo". As a developer, you also have to be careful with joins and other features: an INNER JOIN on 'foo' = 'Foo' will return nothing, since 'foo' is not equal to 'Foo'.
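The join pitfall looks like this in a small sketch. As before, Python's sqlite3 stands in for MySQL (SQLite's default collation is BINARY, which behaves like a binary collation for this purpose), and the tables a and b are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No COLLATE clause, so both columns use SQLite's default BINARY collation
conn.execute("CREATE TABLE a (name TEXT)")
conn.execute("CREATE TABLE b (name TEXT)")
conn.execute("INSERT INTO a (name) VALUES ('foo')")
conn.execute("INSERT INTO b (name) VALUES ('Foo')")

# Binary comparison: 'foo' and 'Foo' differ, so the join is empty
strict_rows = conn.execute(
    "SELECT a.name, b.name FROM a INNER JOIN b ON a.name = b.name"
).fetchall()

# Forcing a case-insensitive collation in the join condition pairs them up
relaxed_rows = conn.execute(
    "SELECT a.name, b.name FROM a INNER JOIN b "
    "ON a.name = b.name COLLATE NOCASE"
).fetchall()

print(strict_rows)
print(relaxed_rows)
```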
I hope these examples and explanations help a little.