How to compare Unicode strings in JavaScript?
When I evaluate "Ł" > "Z" in JavaScript, it returns true. In Unicode collation order, this should of course be false. How can I fix it? My site uses UTF-8.
You can use Intl.Collator, or String.prototype.localeCompare with a locale argument, both introduced by the ECMAScript Internationalization API:

    "Ł".localeCompare("Z", "pl");              // negative (typically -1)
    new Intl.Collator("pl").compare("Ł", "Z"); // negative (typically -1)

A negative result means Ł precedes Z, as you want.
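Note that the locale argument only works in browsers that implement the ECMAScript Internationalization API; at the time of writing, that meant only the latest ones.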
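To show how the comparator plugs into Array.prototype.sort, here is a minimal sketch. The word list and the fallback comparator are assumptions made for illustration; the fallback simply compares UTF-16 code units, so it does not give Polish order.

```javascript
// Sketch: sort Polish words with a locale-aware comparator.
// The sample words are made up for this example.
var words = ["Zebra", "Łódź", "Lato"];

// Use Intl.Collator when the runtime provides it; otherwise fall back to
// plain code-unit comparison (which would put "Łódź" after "Zebra").
var compare = (typeof Intl === "object" && typeof Intl.Collator === "function")
  ? new Intl.Collator("pl").compare
  : function (a, b) { return a < b ? -1 : a > b ? 1 : 0; };

// slice() keeps the original array untouched.
var sorted = words.slice().sort(compare);
console.log(sorted); // with Intl support: [ 'Lato', 'Łódź', 'Zebra' ]
```

Creating the collator once and reusing its compare function is usually preferable to calling localeCompare with a locale for every pair when sorting large arrays.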
Here is an example for the French alphabet that could help you build a custom sort:
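    var alpha = function (alphabet, dir, caseSensitive) {
      dir = dir || 1;                          // 1 = ascending, -1 = descending
      caseSensitive = caseSensitive || false;
      return function (a, b) {
        if (!caseSensitive) {
          a = a.toLowerCase();
          b = b.toLowerCase();
        }
        if (a === b) { return 0; }             // equal strings must compare as 0
        var pos = 0, min = Math.min(a.length, b.length);
        // Skip the common prefix.
        while (pos < min && a.charAt(pos) === b.charAt(pos)) { pos++; }
        // One string is a prefix of the other: the shorter one sorts first.
        if (pos === min) { return a.length > b.length ? dir : -dir; }
        // Otherwise the first differing letter decides, by alphabet position.
        return alphabet.indexOf(a.charAt(pos)) > alphabet.indexOf(b.charAt(pos)) ? dir : -dir;
      };
    };

To use it on an array of strings a:

    a.sort(alpha('ABCDEFGHIJKLMNOPQRSTUVWXYZaàâäbcçdeéèêëfghiïîjklmnñoôöpqrstuûüvwxyÿz'));

Pass 1 or -1 as the second parameter of alpha() to sort in ascending or descending order.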
Pass true as the third parameter to sort case-sensitively.
You may need to add numbers and special characters to the alphabet list.
You might be able to build your own sort function with localeCompare(), which, at least according to the MDC localeCompare() article, should sort things correctly.
If that doesn't work, here's an interesting SO question, where the OP uses string replacement to build a brute-force sorting mechanism.
Also in that question, the OP shows how to build a custom textExtract function for the jQuery tablesorter plugin that sorts by locale; maybe that is worth a look as well.
Edit: As a rather far-fetched idea (I have no idea whether it is feasible at all, especially performance-wise): if you are working with PHP/MySQL on the back end anyway, I would like to mention the possibility of sending an Ajax request to a MySQL instance and having the data sorted there. MySQL does a great job of sorting locale-specific data, because you can force a sort operation into a specific collation using, for example, ORDER BY xyz COLLATE utf8_polish_ci, COLLATE utf8_german2_ci, and so on. These collations take care of all the sorting issues at once.
An improved version of Mic's code above that also handles characters not mentioned in the alphabet:

    var alpha = function (alphabet, dir, caseSensitive) {
      dir = dir || 1;
      caseSensitive = caseSensitive || false;

      // Returns true when letter a should sort after letter b.
      function compareLetters(a, b) {
        var ia = alphabet.indexOf(a);
        var ib = alphabet.indexOf(b);
        if (ia === -1 || ib === -1) {
          // An unlisted character is placed relative to 'a' when the other
          // character is listed; two unlisted characters compare by code unit.
          if (ib !== -1) return a > 'a';
          if (ia !== -1) return 'a' > b;
          return a > b;
        }
        return ia > ib;
      }

      return function (a, b) {
        if (!caseSensitive) {
          a = a.toLowerCase();
          b = b.toLowerCase();
        }
        if (a === b) return 0;                 // equal strings must compare as 0
        var pos = 0;
        var min = Math.min(a.length, b.length);
        while (pos < min && a.charAt(pos) === b.charAt(pos)) pos++;
        // One string is a prefix of the other: the shorter one sorts first.
        if (pos === min) return a.length > b.length ? dir : -dir;
        return compareLetters(a.charAt(pos), b.charAt(pos)) ? dir : -dir;
      };
    };

    function assert(bCondition, sErrorMessage) {
      if (!bCondition) {
        throw new Error(sErrorMessage);
      }
    }

    assert(alpha("bac")("a", "b") === 1, "b sorts before a");
    assert(alpha("abc")("ac", "a") === 1, "shorter string sorts before longer string");
    assert(alpha("abc")("1abc", "0abc") === 1, "non-mentioned chars are compared as normal");
    assert(alpha("abc")("0abc", "1abc") === -1, "non-mentioned chars are compared as normal [2]");
    assert(alpha("abc")("0abc", "bbc") === -1, "non-mentioned chars are compared with mentioned chars in a special way");
    assert(alpha("abc")("zabc", "abc") === 1, "non-mentioned chars are compared with mentioned chars in a special way [2]");

You need to store two sort key strings. One is for the primary order, where German ä = a (primary key ä -> a) and French é = e (primary key é -> e). The other is for the secondary order, where ä comes after a (translating ä -> azzzz in the secondary key) and é comes after e (secondary key é -> ezzzz). In Czech especially, some letters are variations of a base letter (áéí ...), while others are letters in their own right in the alphabet (ABCČD ... GHChI ... RŘSŠT ...). On top of that, digraphs have to be treated as single letters (primary key ch -> hzzzz). This is not a trivial problem, and there ought to be a solution within JS.
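The two-key idea can be sketched roughly as follows. The mapping tables PRIMARY and SECONDARY, the helper names translate and sortByKeys, and the sample words are all hypothetical and cover only the letters mentioned above; a real table would need every letter of the target language (for example Č as a letter in its own right).

```javascript
// Hypothetical, deliberately tiny mapping tables (assumptions for illustration).
// PRIMARY folds variants onto their base letter; SECONDARY pushes them behind it.
// The digraph "ch" becomes "hzzzz" so that it sorts after every plain "h...".
var PRIMARY = { "ä": "a", "é": "e", "ch": "hzzzz" };
var SECONDARY = { "ä": "azzzz", "é": "ezzzz", "ch": "hzzzz" };

// Build a sort key by replacing digraphs first, then single letters.
function translate(text, table) {
  var lower = text.toLowerCase();
  var out = "";
  for (var i = 0; i < lower.length; i++) {
    var pair = lower.slice(i, i + 2);
    var single = lower.charAt(i);
    if (pair.length === 2 && Object.prototype.hasOwnProperty.call(table, pair)) {
      out += table[pair];
      i++; // skip the second letter of the digraph
    } else if (Object.prototype.hasOwnProperty.call(table, single)) {
      out += table[single];
    } else {
      out += single;
    }
  }
  return out;
}

// Compute both keys once per word, compare primary keys first and fall back
// to secondary keys only when the primary keys are equal.
function sortByKeys(words) {
  var decorated = words.map(function (w) {
    return { word: w, primary: translate(w, PRIMARY), secondary: translate(w, SECONDARY) };
  });
  decorated.sort(function (x, y) {
    if (x.primary !== y.primary) return x.primary < y.primary ? -1 : 1;
    if (x.secondary !== y.secondary) return x.secondary < y.secondary ? -1 : 1;
    return 0;
  });
  return decorated.map(function (d) { return d.word; });
}

var sortedWords = sortByKeys(["ida", "chata", "été", "hrad", "ete", "cena"]);
console.log(sortedWords); // [ 'cena', 'ete', 'été', 'hrad', 'chata', 'ida' ]
```

Here "chata" lands between "hrad" and "ida" because its primary key is "hzzzzata", and "été" follows "ete" only through the secondary key, since their primary keys are identical.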
Funny, I had to think about this problem too, and I stopped searching here because it occurred to me that I can use my own JavaScript module. I wrote a module that generates clean URLs, and to do that it has to transliterate the input string ... ( http://pid.imtqy.com/speakingurl/ )

    var mySlug = require('speakingurl').createSlug({
      maintainCase: true,
      separator: " "
    });

    var input = "Schöner Titel läßt grüßen!? Bel été !";
    var result = mySlug(input);

    console.log(result); // Output: "Schoener Titel laesst gruessen bel ete"

Now you can sort by the results. You can, for example, store the original string in a "title" field and the sort key, that is, the result of mySlug, in a "title_sort" field.
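A minimal sketch of the "title" / "title_sort" idea follows. To keep it self-contained it does not call speakingurl; the helper toSortKey is a hypothetical stand-in that only strips combining accents via Unicode normalization, which is a simpler technique than the module's transliteration (it would not turn ß into ss, for instance). The sample titles are made up.

```javascript
// Stand-in for the slug step (an assumption, not speakingurl's algorithm):
// decompose accented letters (NFD), drop the combining marks, lowercase.
function toSortKey(title) {
  return title.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

// Keep the display value and the precomputed sort key side by side.
var records = ["Zebra", "Über", "Apfel"].map(function (title) {
  return { title: title, title_sort: toSortKey(title) };
});

// Sort on the precomputed key only; the original title is left untouched.
records.sort(function (a, b) {
  if (a.title_sort < b.title_sort) return -1;
  if (a.title_sort > b.title_sort) return 1;
  return 0;
});

var titles = records.map(function (r) { return r.title; });
console.log(titles); // [ 'Apfel', 'Über', 'Zebra' ]
```

Storing the key next to the title means the transliteration runs once per record rather than once per comparison.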