Is there a dictionary that I can download for java?

Is there any dictionary I can download for Java? I want to write a program that takes a few random letters and checks whether they can be arranged into a real word by looking them up in a dictionary.

+4
6 answers

Is there a dictionary that can be downloaded for java?

Others have already answered this... Maybe you weren't asking about a dictionary file, but about spell checking?

I want to have a program that takes a few random letters and sees if they can be converted into a real word by checking them in a dictionary

This is different. How fast do you need it to be? How many words are in the dictionary, and how many words, of what length, do you want to check?

If you need a spell checker (which is not entirely clear from your question), Jazzy is a spell checker for Java that has links to many dictionaries. It is not bad, but various parts of its implementation are terribly inefficient (which is fine for small dictionaries, but a huge waste when you have several hundred thousand words).

Now, if you just want to solve the specific problem that you are describing, you can:

  • parse the dictionary file and build a map: (letters in sorted order, set of matching words)

  • then for any set of random letters: sort them and see whether the map has an entry for that key (if it does, the entry's value is the set of all words you can make with these letters).

    abracadabra: (aaaaabbcdrr, (abracadabra))

    carthorse: (acehorrst, (carthorse))

    orchestra: (acehorrst, (carthorse, orchestra))

etc...

Now you take, say, nine random letters and get "hsotrerca". You sort them to get "acehorrst", and using this as the key, you get all the (valid) anagrams...

This works because what you described is a special (simple) case: all you need to do is sort your letters and then do an O(1) map lookup.
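The sorted-letters map described above can be sketched as follows. This is a minimal illustration, not a tuned implementation: the class name `AnagramIndex` and its methods are hypothetical, the word list is hard-coded instead of being read from a dictionary file, and words are assumed to be compared case-insensitively.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; the name and API are assumptions.
class AnagramIndex {
    // Key: a word's letters in sorted order. Value: all words made of exactly those letters.
    private final Map<String, Set<String>> index = new HashMap<>();

    AnagramIndex(Iterable<String> words) {
        for (String word : words) {
            String w = word.trim().toLowerCase(Locale.ROOT);
            if (!w.isEmpty()) {
                index.computeIfAbsent(sortLetters(w), k -> new TreeSet<>()).add(w);
            }
        }
    }

    // "carthorse" -> "acehorrst"
    static String sortLetters(String letters) {
        char[] chars = letters.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Returns every indexed word that uses all of the given letters, or an empty set.
    Set<String> anagramsOf(String letters) {
        String key = sortLetters(letters.toLowerCase(Locale.ROOT));
        return index.getOrDefault(key, Collections.emptySet());
    }

    public static void main(String[] args) {
        AnagramIndex idx = new AnagramIndex(
                Arrays.asList("abracadabra", "carthorse", "orchestra"));
        System.out.println(idx.anagramsOf("hsotrerca")); // prints [carthorse, orchestra]
    }
}
```

Note that this only finds words that use all of the given letters. Finding shorter words hidden in the letters would mean looking up each subset of the letters as well.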

For more complex spell checks, where the input may contain errors, you need something that comes up with "candidates" (words that may be what was meant, even though the input is misspelled), for example using the Soundex, Metaphone or Double Metaphone algorithms, and then something like the Levenshtein edit-distance algorithm to check the candidates against known good words (or the much more complex tree built on Levenshtein edit distance that Google uses for its "find as you type"):

http://en.wikipedia.org/wiki/Levenshtein_distance
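For reference, here is a minimal sketch of the Levenshtein edit distance mentioned above, using the standard two-row dynamic-programming formulation. The class name `EditDistance` is hypothetical, and this covers only the distance computation, not candidate generation with Soundex or Metaphone.

```java
// Hypothetical helper for illustration; the name is an assumption.
class EditDistance {
    // Minimum number of single-character insertions, deletions and
    // substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1]; // distances for the previous prefix of a
        int[] curr = new int[b.length() + 1]; // distances for the current prefix of a
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // i deletions to reach the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1,  // insertion
                                 prev[j] + 1),     // deletion
                        prev[j - 1] + cost);       // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
    }
}
```

A spell checker would typically keep only the candidates whose distance to the input falls below a small threshold.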

As a fun side note, an optimized dictionary representation can store hundreds of thousands or even millions of words in less than 10 bits per word (yup, you read that correctly: less than 10 bits per word) and still allow very fast lookups.

+8

Dictionaries are usually programming-language agnostic. If you search Google without the keyword "java", you may get better results. For instance, "download free dictionary" gives, among others, dicts.info.

+2

OpenOffice dictionaries are easy to parse line by line.

You can read one into memory (keep in mind that this takes a lot of memory):

List<String> words = IOUtils.readLines(new FileInputStream("dicfile.txt")); // IOUtils is from commons-io

This gives you a List of all the words. Alternatively, you can use a LineIterator if you run into memory problems.
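The same line-at-a-time idea can also be sketched with the JDK alone, using a `BufferedReader` in place of the commons-io LineIterator. This is only an illustration under assumptions: the class name `DictionaryScan` is hypothetical, and the file is assumed to be UTF-8 with one plain word per line (real OpenOffice dictionaries may use another encoding and carry extra markup on each line).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; the name is an assumption.
class DictionaryScan {
    // Streams the file one line at a time, so the whole word list
    // is never held in memory at once.
    static boolean containsWord(Path dictionary, String word) throws IOException {
        try (BufferedReader reader =
                     Files.newBufferedReader(dictionary, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().equalsIgnoreCase(word)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

This trades speed for memory: every lookup rereads the file, so it only makes sense for occasional checks.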

+2

Check out http://sourceforge.net/projects/test-dictionary/ ; it may give you some clues.

I'm not sure whether there are any such libraries to download! But you can definitely search sourceforge.net to see whether any exist and how people have used dictionaries: http://sourceforge.net/search/?type_of_search=soft&words=java+dictionary

0

Source: https://habr.com/ru/post/1304458/

