The data structure for the phone book so that it can search for a number by name, as well as search for a name by number

Do you know the solution to the next interview question?

Create a data structure for the phone book that can safely and efficiently search for a number by name, as well as search for a name by number.


More details:

The solutions found in stackoverflow are related to hashtables, but I would need to build 2 hash tables for this, which requires twice as much space.

How to do this with only one data structure in time and space mode, safe for types?

+6
source share
5 answers

Such data structures are known as multi-index containers . They are not very common in most programming languages, because the interface can become quite complex. There are versions in Java , C # and - most notably - C ++ with the Boost library, see the Boost.MultiIndex documentation , in particular, example 4 on bidirectional maps :

A bidirectional map is a container (const FromType, const ToType), such as no two elements exist with the same first or second component (std :: map only guarantees the uniqueness of the first component). Quick search for both keys. The program has a tiny Spanish-English dictionary with an online query of words in both languages.

The basic idea of ​​containers with multiple indexes is that many containers store their items in nodes containing pointers / links to other nodes (e.g. double linked lists). Instead of storing pointers / links for a single container, node contains links for multiple index structures. This works with at least linked lists, sorted trees, and unique hash indices. And the implementation is very efficient, because for each element only one memory allocation is required.

+15
source

Okay .. I agree with multi-index, and this is the right way to do it. However, for an interview, they probably want you to think about it and explain something, and not just say that you need to use boost. If they ask about internal incentives, maybe it is embarrassing if you cannot explain it properly.

So here is a possible solution.

First of all, do not use a hash table for this. Phones and names can be easily sorted, and you should probably use some sort of balanced search tree or trie if you want an interactive search (http://en.wikipedia.org/wiki/Trie). IMHO, the hash table in this case is a big waste of space.

Suppose the names are unique and the numbers are unique. Then you can:

1 Data structure for storing your data

struct Phone { // implement the phone here whatever they need // assume that whatever representation used can be converted into a unique id (number) }; // struct PhoneBookEntry { std::string name; Phone number; }; 

2- Create two trees for the name and one for the phone

 BalancedSearchTree<PhoneBookEntry> tree_by_name; BalancedSearchTree<PhoneBookEntry> tree_by_number; 

3- This is it. Look at each tree for everything you need.

 bool PhoneBook::getByName(const std::string &name, PhoneBookEntry &o) { tree_by_name.get(name, o); return !o.empty(); } bool PhoneBook::getByNumber(const Phone &p, PhoneBookEntry &o) { tree_by_number.get(p, o); return !o.empty(); } 

Good luck.

+4
source

Reset all (i.e. both directions) to the same table.

+3
source

Isn't it that simple?

Using the record / struct / tuple array of a pair of values ​​(phone number, name).

  • Do a linear search in search of a search key; O (n / 2) for match, O (n) for miss,

  • return record / struct / tuple and do whatever needs to be done.

Edit:
This algorithm can be improved in many ways.

I think that this interview question can be deliberately indicated in order to find out how the interlocutor reacts. (This is what I do when I interview). Therefore, it may be more important to consider this as such, rather than suggesting that it is simply a matter of computer science.

I think it's worth talking to the interviewer. For instance:

  • Is it important that it be fast enough for a personal telephone directory, for example. on a mobile phone (it is unlikely that any reasonable algorithm will be too slow) or an application for the country's telecommunications infrastructure, where there may be availability, parallel updating and additional requirements, for example, access control to a certain subset of names and numbers?
  • What is the target technology? You can choose different approaches, for example. for Python, C ++, assembler, ...
  • Do they have more qualities that need to be demonstrated than the effectiveness of time and space? For example, should it be easy to prove correctness or verify? Is processor saving a quick approach costing human testing time?
  • How skilled or knowledgeable are people who support and can reuse code? Simple approaches will be approved due to maintenance costs.

etc..

I think it’s more important to interact with the interviewer, and not just focus on the technical solution. When I interview, I’m looking for someone who is trying to understand the whole problem, and not the (usually small) part that is easy to identify.

+1
source

You can use a 36-ary tree (26 for alphabets and 10 for numbers) and have links pointing both from the alphabet and from the number to the same node. (It will not be a "tree", strictly speaking, but you will get this idea). This way you won’t get a constant time search, but you don’t have to repeat the data and the search will be pretty fast.

Of course, you will have to handle conflicts, but you will have to handle them when using hashtables. Perhaps you can use two pointers to indicate the next node collision by name and number.

0
source

Source: https://habr.com/ru/post/912177/


All Articles