Word / Stem Dictionary

It seems my google-fu is failing.

Does anyone know of a freely available dictionary that contains just the base forms of words? So, for something like strawberries, it would have strawberry. But it should NOT contain abbreviations, misspellings, or alternative spellings (for example, UK vs. US). Something that could be used quickly from Java would be nice, but even just a text file of mappings, or something else that could be read in, would be helpful.

+3
3 answers

This is called lemmatization, and what you call the “base form of a word” is called a lemma. morpha and its reimplementation in a POS tagger do this. However, both require POS-tagged input to resolve the ambiguity inherent in natural language.

(POS stands for part of speech, i.e. the category of a word, for example noun or verb. I assume you want a tool that processes English.)
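To illustrate why the POS tag matters, here is a minimal Java sketch of a lemma lookup keyed by word and tag. It is not how morpha or any real tagger works internally; the class name `LemmaLookup`, the tag names, and the three table entries are hypothetical sample data chosen only to show the ambiguity.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy lemma table keyed by (word, POS). All entries are hypothetical sample data. */
final class LemmaLookup {
    private static final Map<String, String> LEMMAS = new HashMap<>();

    static {
        // "saw" is ambiguous: past tense of "see" as a verb, or the tool as a noun.
        LEMMAS.put(key("saw", "VERB"), "see");
        LEMMAS.put(key("saw", "NOUN"), "saw");
        LEMMAS.put(key("strawberries", "NOUN"), "strawberry");
    }

    /** Builds the map key from a case-normalized word and its POS tag. */
    private static String key(String word, String pos) {
        return word.toLowerCase(Locale.ROOT) + "/" + pos;
    }

    /** Returns the lemma for the tagged word, or the lowercased word itself if unknown. */
    static String lemma(String word, String pos) {
        return LEMMAS.getOrDefault(key(word, pos), word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(lemma("saw", "VERB"));          // see
        System.out.println(lemma("saw", "NOUN"));          // saw
        System.out.println(lemma("strawberries", "NOUN")); // strawberry
    }
}
```

Without the tag, the lookup for "saw" could not choose between the two entries, which is the ambiguity the POS-tagged input resolves.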

Edit: since you are going to use this for search, here are some tips:

  • Simple stemming for English has a mixed reputation in the search engine world. Sometimes it works, often it doesn't.
  • . , Google. , .
  • Lucene, .

+5

http://www.puzzlers.org/dokuwiki/doku.php?id=solving:wordlists:about:start

The Merriam-Webster's Collegiate 9th Edition link on this page contains a word file of root forms only. "Strawberry" is there; "strawberries" is not. In the same way, "add" is there; "added" is not. Not sure if this is what you need, but it was useful to me.
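A word file like that could be used from Java roughly as follows. This is only a sketch: it assumes the file is plain text with one root form per line, which I have not verified for the linked file, and the class name `RootWordList` and the file name `words.txt` are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Loads a one-word-per-line list (assumed format) and answers membership queries. */
final class RootWordList {
    private final Set<String> words = new HashSet<>();

    RootWordList(Reader source) {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Normalize case and skip blank lines.
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the word appears in the list, i.e. it is treated as a root form. */
    boolean isRootForm(String word) {
        return words.contains(word.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Stand-in data; for a real file, pass
        // java.nio.file.Files.newBufferedReader(java.nio.file.Path.of("words.txt")) instead.
        RootWordList list = new RootWordList(new StringReader("add\nstrawberry\n"));
        System.out.println(list.isRootForm("strawberry"));   // true
        System.out.println(list.isRootForm("strawberries")); // false
    }
}
```

Note that this only tells you whether a word is a root form. It does not map "strawberries" back to "strawberry"; for that you still need a lemmatizer or a mapping file.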

+1

Source: https://habr.com/ru/post/1771527/

