The class wraps the ICU library, so you can find more details in its documentation, which says:
"Because Unicode contains so many characters and includes a wide variety of the world's writing systems, misuse can expose programs or systems to possible security attacks. This document identifies mechanisms that can be used to detect potential security problems."
Here is a short code example using the Spoofchecker class. I have not used it on a real site yet, but I assume it can be useful when you need to build and display an external link from a URL that a user submitted. If an attacker tries to lure other visitors to a fake site imitating Google, PayPal, a URL shortener or the like, you can unlink or reject the submission.
if (!extension_loaded('intl') || !class_exists('Spoofchecker')) {
    exit('turn on php_intl extension first');
}

$checker = new Spoofchecker();

// The "\u{...}" escapes below need PHP 7.0+; they make the look-alike letters visible.

// false: all letters are in ASCII
var_dump($checker->isSuspicious("goog1e.com"));

// true: the first letter is Cyrillic (U+0420), a different script from the rest
var_dump($checker->isSuspicious("\u{0420}aypal.com"));

// true: digit one instead of small L
var_dump($checker->areConfusable('google.com', 'goog1e.com'));

// false: digit zero and small O are not confusable
var_dump($checker->areConfusable('google.com', 'g00g1e.com'));

// true: Cyrillic letter (U+0420) instead of P
var_dump($checker->areConfusable("\u{0420}aypal.com", 'Paypal.com'));

// true: Japanese Katakana 'he' (U+30D8) and Hiragana 'he' (U+3078)
var_dump($checker->areConfusable("\u{30D8}いせい.com", "\u{3078}いせい.com"));

// true: identical strings are detected as confusable, so you might check === first
var_dump($checker->areConfusable('google.com', 'google.com'));
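Building on the last case above, here is a minimal sketch of how the check might be applied to a user-submitted link. The function name findImitatedDomain and the list of trusted domains are my own assumptions for illustration and are not part of the intl extension; only Spoofchecker::areConfusable() and parse_url() are real PHP calls. Hosts are assumed to be lowercase already.

```php
<?php
// Hypothetical helper (not part of intl): returns the trusted domain that
// $host appears to imitate, or null when it is an exact match or unrelated.
function findImitatedDomain(string $host, array $trustedDomains, Spoofchecker $checker): ?string
{
    // Identical strings are reported as confusable, so rule out exact matches first.
    if (in_array($host, $trustedDomains, true)) {
        return null;
    }

    foreach ($trustedDomains as $trusted) {
        if ($checker->areConfusable($host, $trusted)) {
            return $trusted; // looks like $trusted but is a different string
        }
    }

    return null;
}

$checker = new Spoofchecker();
$trusted = ['google.com', 'paypal.com']; // assumed list, adjust to your needs

// Extract the host from the submitted URL, then test it.
$host = parse_url('http://goog1e.com/login', PHP_URL_HOST);
var_dump(findImitatedDomain($host, $trusted, $checker));
```

If the function returns a domain name, you would unlink or reject the submission; if it returns null, the host is either one of the trusted domains itself or not a look-alike of any of them.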
You can check which characters are considered confusable in this table.