Designing an Oxford English Dictionary

I was asked in an interview how I would design the Oxford English Dictionary.

I told the interviewer that I would use a trie, but he replied that it would take a lot of memory. So what other data structure should be used?

+6
3 answers

One data structure that I have heard was used in the past on mobile phones to store T9 dictionaries is the following (this only addresses storing the keys, not storing the definitions):

The entries are sorted, and every entry after the first starts with an offset into the previous entry that says where it continues from, followed by the continuation. For instance:

apple 4icable 7tion 

will be decoded into apple, applicable, application. However, this may not be much different from a trie whose single-child runs are chained together, see:

 appl -> e
      -> ica -> ble
             -> tion
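
To make the offset-plus-continuation idea concrete, here is a minimal sketch of an encoder and decoder for the `apple 4icable 7tion` format. The function names `encode` and `decode` are my own, and the sketch assumes the words themselves contain no digits, since the offset is stored as a plain decimal prefix. I do not know how the actual T9 dictionaries laid this out in memory.

```python
def encode(sorted_words):
    """Front-code a sorted word list.

    The first entry is stored whole; every later entry is
    '<length of prefix shared with the previous word><rest of the word>'.
    Assumes words contain no digits (hypothetical text format).
    """
    entries = []
    prev = ""
    for index, word in enumerate(sorted_words):
        if index == 0:
            entries.append(word)
        else:
            shared = 0
            limit = min(len(prev), len(word))
            while shared < limit and prev[shared] == word[shared]:
                shared += 1
            entries.append(f"{shared}{word[shared:]}")
        prev = word
    return entries


def decode(entries):
    """Rebuild the full words from a front-coded list."""
    words = []
    prev = ""
    for index, entry in enumerate(entries):
        if index == 0:
            word = entry
        else:
            # Split the leading decimal offset from the continuation.
            digits = 0
            while digits < len(entry) and entry[digits].isdigit():
                digits += 1
            offset = int(entry[:digits])
            word = prev[:offset] + entry[digits:]
        words.append(word)
        prev = word
    return words


print(" ".join(encode(["apple", "applicable", "application"])))
print(decode(["apple", "4icable", "7tion"]))
```

Note that decoding is sequential: to read one word you have to walk forward from the last entry that was stored whole, so a real implementation would probably restart the chain every so many entries.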

Wikipedia turned up the directed acyclic word graph (DAWG), which differs from a trie in that its branches not only split but can also merge again where words share a suffix. It could indeed be a good way to store the words.

      a
     / \
 pplic   utom
     \ /
    ation
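
As a rough illustration of the suffix merging, the sketch below builds an ordinary trie and then merges nodes that accept exactly the same set of suffixes. The names `Node`, `build_trie`, `minimize`, `count_nodes` and `contains` are made up for this example. This is the simplest approach I can think of, not necessarily how a production DAWG is built; it holds the whole trie in memory first, which real implementations would likely avoid.

```python
class Node:
    def __init__(self):
        self.children = {}    # letter -> Node
        self.is_word = False  # True if a word ends at this node


def build_trie(words):
    root = Node()
    for word in words:
        node = root
        for letter in word:
            node = node.children.setdefault(letter, Node())
        node.is_word = True
    return root


def minimize(node, registry):
    """Merge equivalent subtrees bottom-up and return the canonical node.

    Two nodes are equivalent when they agree on is_word and have the
    same letters leading to the same (already canonical) children.
    """
    for letter, child in list(node.children.items()):
        node.children[letter] = minimize(child, registry)
    key = (
        node.is_word,
        tuple(sorted((letter, id(child)) for letter, child in node.children.items())),
    )
    return registry.setdefault(key, node)


def count_nodes(root):
    """Count distinct nodes reachable from root."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.children.values())
    return len(seen)


def contains(root, word):
    node = root
    for letter in word:
        node = node.children.get(letter)
        if node is None:
            return False
    return node.is_word


words = ["application", "automation"]
trie = build_trie(words)
nodes_before = count_nodes(trie)
dawg = minimize(trie, {})  # note: rewires the trie in place
nodes_after = count_nodes(dawg)
print(nodes_before, nodes_after)
```

For the two words in the diagram, the shared "ation" tail is stored once after minimizing, so the node count drops while lookups with `contains` behave the same as on the trie.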
+8

It will not use much memory. Your answer was fine. Maybe in 1995 it would have been a problem. Consider yourself lucky.

0

As already mentioned, if there is not enough room for a well-designed trie, then there is probably no room for any other index either. Since this is an interview question, it sounds like he was trying to steer you toward classic disk-based data structures such as B-trees.

Alternatively, a good answer might have been to ask for more information, for example: "What operations do you want to perform on this data structure, and what performance do you need?" If you only need to check spelling, then a Bloom filter may be the most efficient "data structure" ...
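
For the spell-checking case, a minimal Bloom filter sketch might look like the following. The class name, the default sizes (`num_bits`, `num_hashes`) and the use of salted SHA-256 as the hash family are all assumptions chosen for readability, not tuned values. Keep in mind that a Bloom filter only answers "definitely not present" or "probably present", and it cannot store the definitions.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word):
        # False means the word was never added; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(word)
        )


spelling = BloomFilter()
for known in ["apple", "applicable", "application"]:
    spelling.add(known)

print("apple" in spelling)
```

The memory cost is fixed by `num_bits` regardless of word length, which is why it can be much smaller than a trie, at the price of occasionally accepting a misspelled word.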

0

Source: https://habr.com/ru/post/903495/

