Tile check

To check tiles in scrabble, you make four grids of 5x5 letters totaling 100 tiles. I would like to create one where all 40 horizontal and vertical words are valid. The set of available plates contains:

  • 12 x E
  • 9 x A, I
  • 8 x O
  • 6 x N, R, T
  • 4 x D, L, S, U
  • 3 x G
  • 2 x B, C, F, H, M, P, V, W, Y, empty tile (wildcard)
  • 1 x K, J, Q, X, Z

A dictionary of valid words is available here (700 KB). There are about 12,000 valid 5 letter words.

Here's an example where all 20 horizontal words are valid:

ZOWIE|PINOT YOGIN|OC t AD <= blank being used as 't' XEBEC|NALED WAITE|MERLE VINER|LUTEA ---------+--------- USNEA|KNOSP TAVER|JOLED SOFTA|IAMBI RIDGY|HAIT h <= blank being used as 'h' QURSH|GROUF 

I would like to create one where all verticals are also valid. Can you help me solve this problem? This is not homework. This is a question that a friend asked me for help.

+44
string algorithm data-structures
Jan 31 '11 at 18:17
source share
5 answers

Final Edit: Solved! Here is the solution.

 GNAWN|jOULE RACHE|EUROS IDIOT|STEAN PINOT|TRAvE TRIPY|SOLES -----+----- HOWFF|ZEBRA AGILE|EQUID CIVIL|BUXOM EVENT|RIOJA KEDGY|ADMAN 

Here is a photograph of him built with my set of scraper. http://twitpic.com/3wn7iu

It was easy to find when I had the right approach, so I'm sure you could find a lot more. See the methodology below.




Build a tree of prefixes from a dictionary of 5 alphabetic words for each row and column. Recursively, a given tile location is valid if it forms valid prefixes for its column and row, and if the tile is available, and if the next tile location is valid. The main case is that it is valid if there is no tile that can be placed.

It probably makes sense to just find all the acceptable 5x5 boards, as Glenn said, and see if you can combine the four. Repeating to a depth of 100 doesn't seem like fun.

Edit: here is the code 2 of my code for me.

 #include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdbool.h> typedef union node node; union node { node* child[26]; char string[6]; }; typedef struct snap snap; struct snap { node* rows[5]; node* cols[5]; char tiles[27]; snap* next; }; node* root; node* vtrie[5]; node* htrie[5]; snap* head; char bag[27] = {9,2,2,4,12,2,3,2,9,1,1,4,2,6,8,2,1,6,4,6,4,2,2,1,2,1,2}; const char full_bag[27] = {9,2,2,4,12,2,3,2,9,1,1,4,2,6,8,2,1,6,4,6,4,2,2,1,2,1,2}; const char order[26] = {16,23,9,25,21,22,5,10,1,6,7,12,15,2,24,3,20,13,19,11,8,17,14,0,18,4}; void insert(char* string){ node* place = root; int i; for(i=0;i<5;i++){ if(place->child[string[i] - 'A'] == NULL){ int j; place->child[string[i] - 'A'] = malloc(sizeof(node)); for(j=0;j<26;j++){ place->child[string[i] - 'A']->child[j] = NULL; } } place = place->child[string[i] - 'A']; } memcpy(place->string, string, 6); } void check_four(){ snap *a, *b, *c, *d; char two_total[27]; char three_total[27]; int i; bool match; a = head; for(b = a->next; b != NULL; b = b->next){ for(i=0;i<27; i++) two_total[i] = a->tiles[i] + b->tiles[i]; for(c = b->next; c != NULL; c = c->next){ for(i=0;i<27; i++) three_total[i] = two_total[i] + c->tiles[i]; for(d = c->next; d != NULL; d = d->next){ match = true; for(i=0; i<27; i++){ if(three_total[i] + d->tiles[i] != full_bag[i]){ match = false; break; } } if(match){ printf("\nBoard Found!\n\n"); for(i=0;i<5;i++){ printf("%s\n", a->rows[i]->string); } printf("\n"); for(i=0;i<5;i++){ printf("%s\n", b->rows[i]->string); } printf("\n"); for(i=0;i<5;i++){ printf("%s\n", c->rows[i]->string); } printf("\n"); for(i=0;i<5;i++){ printf("%s\n", d->rows[i]->string); } exit(0); } } } } } void snapshot(){ snap* shot = malloc(sizeof(snap)); int i; for(i=0;i<5;i++){ printf("%s\n", htrie[i]->string); shot->rows[i] = htrie[i]; shot->cols[i] = vtrie[i]; } printf("\n"); for(i=0;i<27;i++){ shot->tiles[i] = full_bag[i] - bag[i]; } bool transpose = false; snap* place = head; while(place != NULL && !transpose){ transpose = true; for(i=0;i<5;i++){ if(shot->rows[i] != place->cols[i]){ transpose = false; break; } } place = place->next; } if(transpose){ free(shot); } else { shot->next = head; head = shot; check_four(); } } void pick(x, y){ if(y==5){ snapshot(); return; } int i, tile,nextx, nexty, nextz; node* oldv = vtrie[x]; node* oldh = htrie[y]; if(x+1==5){ nexty = y+1; nextx = 0; } else { nextx = x+1; nexty = y; } for(i=0;i<26;i++){ if(vtrie[x]->child[order[i]]!=NULL && htrie[y]->child[order[i]]!=NULL && (tile = bag[i] ? i : bag[26] ? 26 : -1) + 1) { vtrie[x] = vtrie[x]->child[order[i]]; htrie[y] = htrie[y]->child[order[i]]; bag[tile]--; pick(nextx, nexty); vtrie[x] = oldv; htrie[y] = oldh; bag[tile]++; } } } int main(int argc, char** argv){ root = malloc(sizeof(node)); FILE* wordlist = fopen("sowpods5letters.txt", "r"); head = NULL; int i; for(i=0;i<26;i++){ root->child[i] = NULL; } for(i=0;i<5;i++){ vtrie[i] = root; htrie[i] = root; } char* string = malloc(sizeof(char)*6); while(fscanf(wordlist, "%s", string) != EOF){ insert(string); } free(string); fclose(wordlist); pick(0,0); return 0; } 

At first it tries infrequent letters, and I'm not sure if this is a good idea. He begins to get bogged down before he leaves the boards, starting with x. After seeing how many 5x5 blocks I changed the code, I simply listed all the valid 5x5 blocks. Now I have a 150 MB text file with all solutions 4,430,974 5x5.

I also tried it with just a repetition through 100 plates, and it still works.

Edit 2: Here is a list of all valid 5x5 blocks that I generated. http://web.cs.sunyit.edu/~levyt/solutions.rar

Edit 3: Hmm, it seems like an error occurred in my tracking of fragment usage because I just found a block in my output file that uses 5 Zs.

 COSTE ORCIN SCUZZ TIZZY ENZYM 

Edit 4: Here is the final product.

 #include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdbool.h> typedef union node node; union node { node* child[26]; char string[6]; }; node* root; node* vtrie[5]; node* htrie[5]; int score; int max_score; char block_1[27] = {4,2,0,2, 2,0,0,0,2,1,0,0,2,1,2,0,1,2,0,0,2,0,0,1,0,1,0};//ZEBRA EQUID BUXOM RIOJA ADMAN char block_2[27] = {1,0,1,1, 4,2,2,1,3,0,1,2,0,1,1,0,0,0,0,1,0,2,1,0,1,0,0};//HOWFF AGILE CIVIL EVENT KEDGY char block_3[27] = {2,0,1,1, 1,0,1,1,4,0,0,0,0,3,2,2,0,2,0,3,0,0,1,0,1,0,0};//GNAWN RACHE IDIOT PINOT TRIPY //JOULE EUROS STEAN TRAVE SOLES char bag[27] = {9,2,2,4,12,2,3,2,9,1,1,4,2,6,8,2,1,6,4,6,4,2,2,1,2,1,2}; const char full_bag[27] = {9,2,2,4,12,2,3,2,9,1,1,4,2,6,8,2,1,6,4,6,4,2,2,1,2,1,2}; const char order[26] = {16,23,9,25,21,22,5,10,1,6,7,12,15,2,24,3,20,13,19,11,8,17,14,0,18,4}; const int value[27] = {244,862,678,564,226,1309,844,765,363,4656,909,414,691,463,333,687,11998,329,218,423,536,1944,1244,4673,639,3363,0}; void insert(char* string){ node* place = root; int i; for(i=0;i<5;i++){ if(place->child[string[i] - 'A'] == NULL){ int j; place->child[string[i] - 'A'] = malloc(sizeof(node)); for(j=0;j<26;j++){ place->child[string[i] - 'A']->child[j] = NULL; } } place = place->child[string[i] - 'A']; } memcpy(place->string, string, 6); } void snapshot(){ static int count = 0; int i; for(i=0;i<5;i++){ printf("%s\n", htrie[i]->string); } for(i=0;i<27;i++){ printf("%c%d ", 'A'+i, bag[i]); } printf("\n"); if(++count>=1000){ exit(0); } } void pick(x, y){ if(y==5){ if(score>max_score){ snapshot(); max_score = score; } return; } int i, tile,nextx, nexty; node* oldv = vtrie[x]; node* oldh = htrie[y]; if(x+1==5){ nextx = 0; nexty = y+1; } else { nextx = x+1; nexty = y; } for(i=0;i<26;i++){ if(vtrie[x]->child[order[i]]!=NULL && htrie[y]->child[order[i]]!=NULL && (tile = bag[order[i]] ? order[i] : bag[26] ? 26 : -1) + 1) { vtrie[x] = vtrie[x]->child[order[i]]; htrie[y] = htrie[y]->child[order[i]]; bag[tile]--; score+=value[tile]; pick(nextx, nexty); vtrie[x] = oldv; htrie[y] = oldh; bag[tile]++; score-=value[tile]; } } } int main(int argc, char** argv){ root = malloc(sizeof(node)); FILE* wordlist = fopen("sowpods5letters.txt", "r"); score = 0; max_score = 0; int i; for(i=0;i<26;i++){ root->child[i] = NULL; } for(i=0;i<5;i++){ vtrie[i] = root; htrie[i] = root; } for(i=0;i<27;i++){ bag[i] = bag[i] - block_1[i]; bag[i] = bag[i] - block_2[i]; bag[i] = bag[i] - block_3[i]; printf("%c%d ", 'A'+i, bag[i]); } char* string = malloc(sizeof(char)*6); while(fscanf(wordlist, "%s", string) != EOF){ insert(string); } free(string); fclose(wordlist); pick(0,0); return 0; } 

After it turned out how many blocks there were (almost 2 billion and still counted), I switched to searching for certain types of blocks, particularly difficult to build, using unusual letters. I was hoping that if I could get a fairly soft set of letters going to the last block, the vast space of valid blocks would probably have one for this set of letters.

I assigned each slab a value inversely proportional to the number of 5 alphabetic words that it appears. Then, when I found a valid block, I would summarize the tiles, and if the rating was the best that I still saw, I would print the block.

For the first block, I deleted the empty fragments, believing that the last block will be needed for more flexibility. After I skipped it until I saw that for some time the best block appeared, I selected the best block and removed the tile from it in the bag and ran the program again, getting the second block. I repeated this for the third block. Then for the last block, I added spaces and used the first valid block that he found.

+34
Jan 31 '11 at 19:30
source share

I would approach the problem (naive, of course), pessimistic. I would try to prove that there was no 5x5 solution, and therefore, of course, not four 5x5 solutions. To prove the lack of a 5x5 solution, I would try to build one of all the possibilities. If my hypothesis failed and I managed to build a 5x5 solution, then I would have a way to build 5x5 solutions and I would try to build all (independent) 5x5 solutions. If there were at least 4, I would determine if any combination meets the letter limit.

[Edit] Null Set determined that there are "4,430,974 5x5 solutions." Is it really? I mean, we have a limit on the number of letters that we can use. This restriction can be expressed as the boundary vector BV = [9, 2, 2, 4, ...], corresponding to the limits on A, B, C, etc. (You see this vector in the Null Set code). The 5x5 decision is valid if each member of its letter count vector is less than the corresponding member in BV. It would be easy to verify that the 5x5 solution is valid as it is created. Perhaps the number 4 430 974 can be reduced, say, N.

Regardless, we can formulate the problem as: find four vector counting numbers among N, the sum of which is equal to BV. Possible (N, 4) possible amounts ("N choose 4"). With N equal to 4 million, this is still of the order of 10 ^ 25 --- not an encouraging number. Perhaps you could look for four whose first terms add up to 9, and if they check that their second term adds up to 2, etc.

I would notice that after choosing 4 from N, the calculations are independent, so if you have a multi-core computer, you can speed it up with a parallel solution.

[Edit2] Concurrency probably would not matter much. At the moment, I could look optimistically: there are, of course, more than 5x5 solutions than I expected, so there may be more final solutions than expected. You may not have to go far at 10 ^ 25 to hit him.

+2
Jan 31 '11 at 18:40
source share

Here is how I would try this. First, create a prefix tree.

Select a word and place it horizontally on top. Select a word and place it vertically. Change them to exhausted options. When alternating, you begin to fix the first letters and eliminate a lot of inappropriate words. If you really find such a square, then check to see if they can be made with these parts.

For 5x5 squares: after some reflection, it cannot be worse than O (12000! / 11990!) For random text words. But think about it a little more. Each time you correct a letter (in plain text), you eliminate about 90% (optimistic assumption) of your words. This means that after three iterations you have 12 words. So the actual speed will be

 O(n * n/10 * n/10 * n/100 * n/100 * n/1000 * n/1000 ... which for 12000 elements acts something like n^4 algorithm 

which is not so bad.

Perhaps someone can better analyze the problem. But the search for words should nevertheless converge rather quickly.

More can be eliminated by abusing infrequent letters. Essentially find all words that have infrequent letters. Try to make appropriate positions for each letter. Build a set of valid letters for each position.

For example, let's say we have four words with the letter Q in it.

  AQFED, ZQABE, EDQDE, ELQUO this means there are two valid positionings of those: xZxxx AQFED xAxxx ---> this limits our search for words that contain [ABDEFZ] as the second letter xBxxx xExxx same for the other EDQDE ---> this limits our search for words that contain [EDLU] as the third letter ELQUO all appropriate words are in union of those two conditions 

Basically, if we have several words that contain the infrequent letter X in the word S at position N, it means that other words that are in this matrix should have a letter that is also located in S at position n.

Formula

  • Find all words that contain the infrequent letter X at position 1 (next iteration 2, 3 ...)
  • Make the letter A from the letters in these words
  • Keep only those words from the dictionary that has a letter from set A at position 1
  • Try to put them in a matrix (using the first method)
  • Repeat at 2
+2
Jan 31 '11 at 10:36
source share

I start with something simpler.

Below are some results:

  3736 2x2 solutions 8812672 3x3 solutions The 1000th 4x4 solution is AAHS ACAI LAIR SIRE The 1000th 5x5 solution is AAHED ABUNA HURST ENSUE DATED The 1000th 2x4x4 solution is AAHS | AAHS ABAC | ABAC HAIR | LEKU SCRY | STED --------+-------- DEED | DEEM EINE | INTI ENOL | OVER TELT | LYNE 

Please note that wrapping "A" and the space that is used as "A" should be considered as the same solution. But wrapping rows with columns should be considered as another solution. Hope this makes sense.

0
04 Feb 2018-11-21T00:
source share

Here are a lot of pre-calculated 5x5. Left as a reading exercise to find 4 compatible :-)

http://www.gtoal.com/wordgames/wordsquare/all5

-one
Feb 01 '11 at 7:30
source share



All Articles