Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into the other.
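As a quick illustration of that definition, here is a minimal sketch using PHP's built-in `levenshtein()`. The `kitten`/`sitting` pair is the usual textbook example and is not taken from the question; the second call uses the same input as the analysis in this answer.

```php
<?php
// kitten -> sitten -> sittin -> sitting: two substitutions plus one insertion.
$textbook = levenshtein('kitten', 'sitting');

// 'htc' is a prefix of 'htc corporation', so the distance is simply
// the 12 characters (" corporation") that have to be inserted.
$prefix = levenshtein('htc', 'htc corporation');

echo $textbook, "\n"; // 3
echo $prefix, "\n";   // 12
```

Note that a short string that matches perfectly at the start still gets a large distance, because every missing character counts as one edit.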
Here is a simple analysis:
$input = 'htc corporation';

// Array of words to check against
$words = array(
    'htc',
    'Sprint Nextel',
    'Sprint',
    'banana',
    'orange',
    'radish',
    'carrot',
    'pea',
    'bean'
);

foreach ( $words as $word ) {
    // Characters of $input that also occur in $word
    $ic = array_intersect(str_split($input), str_split($word));
    printf("%s \tl= %s , s = %s , c = %d \n", $word,
        levenshtein($input, $word), similar_text($input, $word), count($ic));
}
Output
htc             l= 12 , s = 3 , c = 5
Sprint Nextel   l= 14 , s = 3 , c = 8
Sprint          l= 12 , s = 1 , c = 7
banana          l= 14 , s = 2 , c = 2
orange          l= 12 , s = 4 , c = 7
radish          l= 12 , s = 3 , c = 5
carrot          l= 11 , s = 1 , c = 10
pea             l= 13 , s = 2 , c = 2
bean            l= 13 , s = 2 , c = 2
Clearly, htc has a distance of 12 while carrot has 11, so carrot ranks as the closer match. If you want htc, Levenshtein alone is not enough: you need to compare exact words first and then set priorities.
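One way to set those priorities could look like the sketch below. It is only an illustration under a few assumptions: `rankCandidates()` is a hypothetical helper name, matching is case-insensitive, and "exact word" means a whole whitespace-separated word shared between the input and the candidate. Candidates with more shared words come first, and Levenshtein distance is only used to break ties.

```php
<?php
// Hypothetical helper: rank candidates so exact word matches beat raw edit distance.
function rankCandidates(string $input, array $words): array
{
    // Split the input into lowercase words once.
    $inputWords = preg_split('/\s+/', strtolower(trim($input)), -1, PREG_SPLIT_NO_EMPTY);

    $scored = [];
    foreach ($words as $word) {
        $candidateWords = preg_split('/\s+/', strtolower(trim($word)), -1, PREG_SPLIT_NO_EMPTY);
        $scored[] = [
            'word'     => $word,
            // Priority 1: how many whole words of the candidate appear in the input.
            'shared'   => count(array_intersect($candidateWords, $inputWords)),
            // Priority 2: plain edit distance, used as a tie-breaker.
            'distance' => levenshtein(strtolower($input), strtolower($word)),
        ];
    }

    usort($scored, function (array $a, array $b): int {
        // More shared words first; on a tie, the smaller distance first.
        return [$b['shared'], $a['distance']] <=> [$a['shared'], $b['distance']];
    });

    return array_column($scored, 'word');
}

$words = array('htc', 'Sprint Nextel', 'Sprint', 'banana', 'orange',
               'radish', 'carrot', 'pea', 'bean');
$ranked = rankCandidates('htc corporation', $words);

echo $ranked[0], "\n"; // htc
```

With this ordering, htc comes out on top because it is the only candidate sharing a whole word with the input, even though carrot has the smaller Levenshtein distance.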