- Sort words in an array
- When a word goes into => binary search (log (n)) (we do this because if you use a hash table this will be useful for a direct match, but bad for adjacent ones)
- If a perfect match returns it
- Once again compute the levensthein distance of the requested word with adjacent words and their neighbors (to determine them) and add them to the return list (if they satisfy)
- Returns a list of selected adjacent words
Quick and dirty implementation with /usr/share/dict/words
(you still have to do the left and right parts and choices)
DISCLAIMER: Binary search code borrowed from http://www.perlmonks.org/?node_id=503154
open(FILE, "<", "/usr/share/dict/words"); my @lines = <FILE>; my $word = $ARGV[0]; sub BinSearch { my ($target, $cmp) = @_; my @array = @{$_[2]}; my $posmin = 0; my $posmax = $#array; return -0.5 if &$cmp (0, \@array, $target) > 0; return $#array + 0.5 if &$cmp ($#array, \@array, $target) < 0; while (1) { my $mid = int (($posmin + $posmax) / 2); my $result = &$cmp ($mid, \@array, $target); if ($result < 0) { $posmin = $posmax, next if $mid == $posmin && $posmax != $posmin; if ($mid == $posmin){ return "Not found, TODO find close match\n"; } $posmin = $mid; } elsif ($result > 0) { $posmax = $posmin, next if $mid == $posmax && $posmax != $posmin; if ($mid == $posmax){ return "Not found, TODO find close match\n"; } $posmax = $mid; } else { return "Found: " .@array [$mid]; } } } sub cmpFunc { my ($index, $arrayRef, $target) = @_; my $item = $$arrayRef[$index]; $item =lc($item); $target =lc($target); $a = $item cmp $target; return $a; } print BinSearch($word."\n", \&cmpFunc, \@lines)."\n";
Usage (if the script is called find_words.pl
):
perl find_words.pl word
Where the word is the word you want to find.
source share