Inverted Index Search Algorithm

Question

Suppose there are 10 million words that people have searched for on Google. For each word you have a sorted list of the identifiers of all documents that contain it. The lists look like this:

[Word 1]->[doc_i1,doc_j1,.....]
[Word 2]->[doc_i2,doc_j2,.....]
...
...
...
[Word N]->[doc_in,doc_jn,.....]

I am looking for an algorithm to find 100 rare word pairs. A rare word pair is a pair of words that occur together (not necessarily adjacent) in exactly 1 document.
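
To make the definition concrete, here is a minimal sketch of how one candidate pair could be checked against the sorted lists. It only tests a single pair and is not the search algorithm being asked for. The names `count_common`, `is_rare_pair`, and the toy `index` are hypothetical, and it assumes each list holds unique, ascending integer document ids.

```python
from typing import Dict, List, Sequence


def count_common(a: Sequence[int], b: Sequence[int], limit: int = 2) -> int:
    """Count ids present in both sorted lists, stopping once `limit` is reached."""
    i = j = count = 0
    while i < len(a) and j < len(b) and count < limit:
        if a[i] == b[j]:
            count += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return count


def is_rare_pair(index: Dict[str, List[int]], w1: str, w2: str) -> bool:
    """A pair is rare when the two words share exactly one document."""
    return count_common(index[w1], index[w2]) == 1


# Toy inverted index (made-up data, for illustration only).
index = {
    "apple": [1, 4, 7],
    "banana": [2, 4, 9],
    "cherry": [1, 4, 8],
}

print(is_rare_pair(index, "apple", "banana"))  # share only doc 4
print(is_rare_pair(index, "apple", "cherry"))  # share docs 1 and 4
```

The early exit at `limit = 2` means a pair is rejected as soon as a second shared document is seen, so the check never walks further than it has to.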

I am looking for something better than O(n^2), if possible.

sorting set algorithm information-retrieval inverted-index

funkyme Feb 05 '14 at 16:43

1 answer

pentadecagon · Answer 1 · 2014-02-05T17:54:25+0000

, . , , , . , , , .
, . , , . . , , , , , .
, , (1.). - , , , . , , .

, , 100 , , . , , (1.), , , , . O (N * log (N1)), N1 - , , 100 . , , , .
