Java Stanford NLP: Spell Checking

Question

I am trying to measure the spelling accuracy of text samples using Stanford NLP. This is only a text metric, not a filter, so it is fine if the result is a little off, as long as the error is uniform across samples.

My first idea was to check whether each word is known to the parser's vocabulary:

private static LexicalizedParser lp = new LexicalizedParser("englishPCFG.ser.gz");

// Average number of words per sentence that the parser's lexicon does not know.
@Analyze(weight=25, name="Spelling")
public double spelling() {
    int result = 0;

    for (List<? extends HasWord> list : sentences) {
        for (HasWord w : list) {
            if (!lp.getLexicon().isKnown(w.word())) {
                System.out.format("misspelled: %s\n", w.word());
                result++;
            }
        }
    }

    // Cast to double so the average is not truncated by integer division.
    return (double) result / sentences.size();
}

However, this creates quite a few false positives:

misspelled: Sincerity
misspelled: Sisyphus
misspelled: Sisyphus
misspelled: fidelity
misspelled: negates
misspelled: gods
misspelled: henceforth
misspelled: atom
misspelled: flake
misspelled: Sisyphus
misspelled: Camus
misspelled: foandf
misspelled: foandf
misspelled: babby
misspelled: formd
misspelled: gurl
misspelled: pregnent
misspelled: babby
misspelled: formd
misspelled: gurl
misspelled: pregnent
misspelled: Camus
misspelled: Sincerity
misspelled: Sisyphus
misspelled: Sisyphus
misspelled: fidelity
misspelled: negates
misspelled: gods
misspelled: henceforth
misspelled: atom
misspelled: flake
misspelled: Sisyphus

Any ideas on how to make this better?

java nlp stanford-nlp spell-checking

Nick heiner Dec 05 '09 at 20:36

2 answers

, / , (, , ) . "" , , , , . , ""? ""?

, ? lp.getLexicon(). isKnown (w.word()) ? ? , ""? NLP, , , 100% - .

Steve B. Dec 06 '09 at 19:05

Christopher Manning · Accepted Answer · 2009-12-22T00:33:48+0000

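Using the parser's isKnown(String) method as a spelling checker isn't a viable use case. The method is correct: "false" means that this word was never seen in the (approximately) 1 million words of text the parser is trained from. But 1 million words just isn't enough text to train a comprehensive spell checker from in a data-driven manner. People would typically use at least two orders of magnitude more text, and might well add some cleverness to handle capitalization. The parser includes some of this cleverness to handle words that were unseen in the training data, but this isn't reflected in what isKnown(String) returns.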
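
Because the parser's lexicon only contains words that occurred in its training data, a rough spelling metric is probably better served by a lookup against a much larger word list. The following is a minimal sketch of that idea, not part of Stanford NLP: the class name DictionarySpellingMetric and its methods are hypothetical, and the tiny in-line word list stands in for a real dictionary file that you would have to supply. It lowercases tokens before the lookup so that capitalized words such as "Sincerity" are not counted as errors, and it skips tokens that are not purely alphabetic.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: scores text by the share of tokens missing from a word list.
public class DictionarySpellingMetric {

    private final Set<String> dictionary = new HashSet<>();

    // Stores every dictionary word in lowercase so lookups ignore case.
    public DictionarySpellingMetric(Iterable<String> words) {
        for (String w : words) {
            dictionary.add(w.toLowerCase(Locale.ROOT));
        }
    }

    // True if the lowercased token appears in the word list.
    public boolean isKnown(String token) {
        return dictionary.contains(token.toLowerCase(Locale.ROOT));
    }

    // Fraction of alphabetic tokens not found in the word list.
    // Returns 0.0 when there is nothing to check, avoiding division by zero.
    public double misspellingRate(List<String> tokens) {
        int checked = 0;
        int unknown = 0;
        for (String t : tokens) {
            // Skip punctuation, numbers and empty tokens.
            if (t.isEmpty() || !t.chars().allMatch(Character::isLetter)) {
                continue;
            }
            checked++;
            if (!isKnown(t)) {
                unknown++;
            }
        }
        return checked == 0 ? 0.0 : (double) unknown / checked;
    }

    public static void main(String[] args) {
        // Assumption: in practice this list would be loaded from a large dictionary file.
        DictionarySpellingMetric metric = new DictionarySpellingMetric(
                Arrays.asList("sincerity", "gods", "atom", "flake"));

        List<String> tokens = Arrays.asList("Sincerity", "gods", "babby", "formd", "atom", ".");
        System.out.println(metric.misspellingRate(tokens));
    }
}
```

With this toy word list, "babby" and "formd" are the only unknown words among the five alphabetic tokens. Proper nouns such as "Sisyphus" or "Camus" would still be flagged unless the word list includes them, so the metric is only as good as the dictionary behind it.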
