PHP Spoofchecker Class

I am reading the PHP manual and came across the Spoofchecker class on the intl extension's documentation page. The class, its methods, and their parameters are largely undocumented, so I wonder what its purpose is.

2 answers

The class calls into the ICU library, so you can find more details in its documentation. It says:

Because Unicode contains so many characters and includes a wide variety of the world's writing systems, misuse can expose programs or systems to possible security attacks. This document identifies mechanisms that can be used to detect potential security problems.

Here is a short code example using the Spoofchecker class. I have not used it on a real site yet, but I assume it is useful when you build and display an external link from a URL supplied by a user. If an attacker tries to lure other visitors to a fake site imitating Google, PayPal, a URL shortener, or the like, you can unlink or reject the submission.

    if (!extension_loaded('intl') || !class_exists('Spoofchecker')) {
        exit('turn on php_intl extension first');
    }

    $checker = new Spoofchecker();

    // false: all letters are in ASCII
    var_dump($checker->isSuspicious("goog1e.com"));

    // true: the first letter is Cyrillic (U+0420), so it is from a different set
    var_dump($checker->isSuspicious("\u{0420}aypal.com"));

    // true: digit one instead of small L
    var_dump($checker->areConfusable('google.com', 'goog1e.com'));

    // false: digit zero and small O are not confusable
    var_dump($checker->areConfusable('google.com', 'g00g1e.com'));

    // true: Cyrillic letter (U+0420) instead of P
    var_dump($checker->areConfusable("\u{0420}aypal.com", 'Paypal.com'));

    // true: Japanese Katakana (U+30D8) and Hiragana (U+3078) 'he'
    var_dump($checker->areConfusable("\u{30D8}いせい.com", "\u{3078}いせい.com"));

    // true: identical strings are detected as confusable, so you might check === first
    var_dump($checker->areConfusable('google.com', 'google.com'));

You can check which characters are considered confusable in this table.
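To make the link-checking idea above a bit more concrete, here is a minimal sketch of how it could be wrapped in a helper. The function name findImitatedDomain and the list of trusted domains are hypothetical and only serve as an illustration; the only Spoofchecker call used is areConfusable(). The strict comparison comes first because, as noted above, identical strings are also reported as confusable.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of intl): returns the trusted domain that
 * $host appears to imitate, or null if $host is trusted itself or does not
 * look like any entry in $trustedDomains.
 */
function findImitatedDomain(string $host, array $trustedDomains): ?string
{
    // Identical strings count as confusable, so rule out genuine hosts
    // with a strict comparison before asking Spoofchecker.
    if (in_array($host, $trustedDomains, true)) {
        return null;
    }

    if (!class_exists('Spoofchecker')) {
        throw new RuntimeException('The intl extension is not loaded.');
    }

    $checker = new Spoofchecker();
    foreach ($trustedDomains as $trusted) {
        if ($checker->areConfusable($host, $trusted)) {
            return $trusted; // visually similar to a trusted domain
        }
    }

    return null;
}

// Usage: the trusted list below is an assumption for this example only.
if (class_exists('Spoofchecker')) {
    $trusted = ['google.com', 'paypal.com'];
    foreach (['google.com', 'goog1e.com', 'example.org'] as $host) {
        $hit = findImitatedDomain($host, $trusted);
        echo $host, ': ', $hit === null ? 'ok' : "looks like $hit", PHP_EOL;
    }
}
```

This only compares the host name you pass in, so you would still have to extract the host from the submitted URL yourself (for example with parse_url()). The exact results can also depend on the ICU version your PHP build is linked against.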


As Mark Baker pointed out in his comment, Spoofchecker is a PHP class for detecting possible security problems when working with Unicode strings.

Because Unicode contains such a large number of characters and includes a variety of world writing systems, misuse can expose programs or systems to possible security attacks.

Source: http://www.unicode.org/reports/tr39/


Source: https://habr.com/ru/post/1489590/

