Confusion over data structure ordering

Question

Confusion over data structure ordering

Today I attended a written test conducted by the company. The general test was focused on data structures. I have a problem that I decided to solve. But it's hard for me to calculate the Big O function for the data structure. I will provide the question and answer that I came up with.

Given the document that needs to be saved, and the words in the document, and it should be able to return a counter when a word is entered. You are provided with char* GetNextWord() .
What data structure do you choose
Give an algorithm
What will be the order of your algorithm

In question 1, I wrote that I would go to the TRIE data structure. To question 2, I gave a short algorithm. I wrote that I will build the TRIE data structure as follows.

 struct TRIE{ boolean isWord; int count; Node* myList; } struct Node{ char* character; Node *next; TRIE *child; }

I have constructTrie() methods that will do addToTrie() for each word.

I wrote that the order of addToTrie() will be O (k), where k is the length. And the order of constructTrie() will be N * O (k), where N will be the number of words.

Now my question is: Are the correct orders or not? If not, how to attack such problems in the future (given that ds find order). I was really confused after using O (k). This makes me guess O (1).

Tips / Tips / Counseling is wide open !!

Change Fixed a question clearly indicating that the word count should be kept for all unique words.

+4

algorithm data-structures trie

bragboy Feb 27 '10 at 6:48

source share

2 answers

Comparing the two generic lines, take & Theta; (k) (k = min strlen), and the number of words is N, which you need to look at, therefore & Omega; (Nk) should be the most effective difficulty you can get.

+2

kennytm Feb 27 '10 at 7:47

source share

Ivlad · Accepted Answer · 2010-02-27T08:51:20+0000

If you really want to use trie, then addToTrie() will really be O (k) , where k is the length of the word to be added. constructTrie() will take O (Nk) , where N is the number of words if you just call addToTrie() for each word. However, you do not need to call the addToTrie() function for each word. After you finish adding the word, simply reset the pointer to trie root, then move the pointer when you move your current word, adding characters as you go forward. Pseudocode:

 trieNode *curr = trieRoot; for each character c in document if it a word terminator (space etc) add a character at curr signaling the end of the current word ('\0' maybe); curr = trieRoot; else if character is not a separator add character c at curr->next->character[c]; curr = curr->next;

This will give you O (C) time to build a trie, where C is the number of characters in your document.

Now this asks the question: why do you even need to do it? Obviously you figured out a way to detect when a word ended, so why should you add your words in trie? This is unnecessary. The only data structure required is several variables: one to track the current character, one to track the previous character and word count. This is easy to do in O (C) as follows:

 char prev = '\0'; char curr; int count = 0; for each character curr if curr is a word separator and prev isn't ++count; prev = curr;

I think it makes no sense to use trie for this problem, it only complicates the situation. I think that if they wanted to test your knowledge of attempts, they would give you a problem in which the try made more sense.

Even if they provided you with the getNextWord() function (you had to use it because you could do it without it), I assume that it returns "\ 0" or something when there are no more words? So why can't you just call him until he returns "\ 0" and counts such words? In any case, the true really doesn't make sense here.

Confusion over data structure ordering

More articles: