Iterate over only part of a large list in java

Question

Iterate over only part of a large list in java

I am trying to make a Boggle game in Java, and for my program, when I randomize the board, I have a method that iterates through possible combinations and compares each of them with a list of dictionaries to check if this is a real word, and if Yes, I put it in the key. It works fine, but the program takes three or four minutes to generate the key, which is mainly related to the size of the dictionary. The one I use has about 19 thousand words, and comparing each combination takes a lot of time. Here is the part of the code I'm trying to do faster:

if (str.length()>3&&!key.contains(str)&&prefixes.contains(str.substring(0,3))&&dictionary.contains(str)){ key.add(str); }

where str is the combination created. prefixes is a list that I generated based on dictionary , which looks like this:

 public void buildPrefixes(){ for (String word:dictionary){ if(!prefixes.contains(word.substring(0,3))){ prefixes.add(word.substring(0,3)); } } }

which simply adds all three letter prefixes in the dictionary, such as "abb" and "mar", so when str is jibberish like "xskfjh" it will not be checked for the entire dictionary, just prefixes something like 1k words.

What I'm trying to do is shrink in time by repeating only words in the dictionary that have the same first letter as str , so if str is an abbreviation, then only str will be checked against words starting with "a", not the whole list, which would significantly reduce the time. Or even better, it only checks str for words that have the same prefix. I am new to Java, so I would really appreciate it if you are very descriptive in your answers, thanks!

+5

java performance arraylist

Ham lingo Apr 24 '16 at 17:45

source share

1 answer

Alex · Answer 1 · 2016-04-25T14:32:13+0000

What comments are trying to say - do not reinvent the wheel. Java is not assembler or C, and it is efficient enough to handle such trivial cases. Here is a simple code that shows that a simple set can easily process your vocabulary:

 import java.util.Set; import java.util.TreeSet; public class Work { public static void main(String[] args) { long startTime=System.currentTimeMillis(); Set<String> allWords=new TreeSet<String>(); for (int i=0; i<20000;i++){ allWords.add(getRandomWord()); } System.out.println("Total words "+allWords.size()+" in "+(System.currentTimeMillis()-startTime)+" milliseconds"); } static String getRandomWord() { int length=3+(int)(Math.random()*10); String r = ""; for(int i = 0; i < length; i++) { r += (char)(Math.random() * 26 + 97); } return r; } }

On my computer it shows

 Total words 19875 in 47 milliseconds

As you can see, 125 words out of 20,000 were duplicated. And it took not only time to generate 20,000 words very inefficiently, but also to store them, as well as check for duplicates.

Iterate over only part of a large list in java

More articles: