    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Text::SpellChecker;

    my $text    = "coördinator";
    my $checker = Text::SpellChecker->new( text => $text );

    while ( my $word = $checker->next_word ) {
        print "Bad word is $word\n";
    }
Output: Bad word is rdinator
Desired: Bad word is coördinator
The module breaks if I have Unicode in $text. Any idea how this can be resolved?
I have Aspell 0.50.5 installed, which this module uses. I think it may be the culprit.
Edit: Text::SpellChecker requires either Text::Aspell or Text::Hunspell. I uninstalled Text::Aspell, installed Hunspell and Text::Hunspell, and then ran:
    $ hunspell -d en_US -l < badword.txt
    coördinator
This shows the correct result, which means that something is wrong with either my code or Text::SpellChecker.
Following Miller's suggestion, I tried the following:
    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Text::SpellChecker;
    use utf8;

    binmode STDOUT, ":encoding(utf8)";

    my $text = "coördinator";
    my $flag = utf8::is_utf8($text);
    print "Flag is $flag\n";
    print "Text is $text\n";

    my $checker = Text::SpellChecker->new(text => $text);
    while (my $word = $checker->next_word) {
        print "Bad word is $word\n";
    }
OUTPUT:
    Flag is 1
    Text is coördinator
    Bad word is rdinator
Does this mean that the module cannot correctly handle UTF-8 characters?
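To narrow this down, here is a minimal sketch that reproduces the same symptom without the module. It assumes, without having confirmed it in the Text::SpellChecker source, that the word splitting uses an ASCII-only character class. The helpers `ascii_words` and `unicode_words` are hypothetical names made up for this test, not part of any module, and the file is assumed to be saved as UTF-8.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use utf8;    # string literals in this file are decoded as characters

binmode STDOUT, ":encoding(UTF-8)";

# Hypothetical helper: ASCII-only tokenizer.
# Any letter outside A-Z/a-z (such as "ö") acts as a word separator.
sub ascii_words {
    my ($text) = @_;
    return $text =~ /([a-zA-Z']+)/g;
}

# Hypothetical helper: Unicode-aware tokenizer.
# \p{L} matches any Unicode letter, so "ö" stays inside the word.
sub unicode_words {
    my ($text) = @_;
    return $text =~ /([\p{L}']+)/g;
}

my $text = "coördinator";
print "ASCII-only:    ", join( ", ", ascii_words($text) ),   "\n";
print "Unicode-aware: ", join( ", ", unicode_words($text) ), "\n";
```

The ASCII-only version splits the text into "co" and "rdinator", while the Unicode-aware version keeps "coördinator" whole. If the module tokenizes in a similar ASCII-only way, that would explain why only "rdinator" is reported even though the UTF-8 flag is set, but this is a guess until checked against the module's code.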