Combinations (n choose k): parallelization and efficiency

I recently worked with combinations of words to make “phrases” in different languages, and I noticed a few areas where some expert knowledge would help.

Defining some constants for this:

The depth (the k in "n choose k") is 6-7 on average.

The length of the input set is ~160 unique words, so n ≈ 160.

  • Memory. Generating all the combinations of 160 words takes a lot of space. I can cheat by writing them out to a database on disk, but then I take a performance hit from constantly waiting on I/O. The other trick is to generate the combinations on the fly with a generator object (see the sketch after this list).
  • Time. If I'm not mistaken, n choose k is given by factorial(n) / (factorial(k) * factorial(n - k)), which means the number of combinations blows up very quickly as the input set grows.
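
For a sense of scale, here is a minimal Python sketch of both points, assuming nothing beyond the standard library: math.comb evaluates the formula above, and itertools.combinations is a lazy generator, so no combination is materialized until it is consumed (the word list is a stand-in):

import math
from itertools import combinations, islice

words = [f"word{i}" for i in range(160)]   # stand-in for the real 160 words

# The formula above, evaluated: the counts explode with depth.
print(math.comb(160, 6))   # 21,193,254,160 combinations at depth 6
print(math.comb(160, 7))   # ~4.7e11 at depth 7

# Lazy generation: one tuple at a time, so memory stays O(k)
# no matter how many combinations exist in total.
for combo in islice(combinations(words, 6), 3):   # peek at the first three
    print(combo)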

So my question is:

Given that I have a function f(x) that takes a combination and applies a calculation with some cost, for example:

func f(x) {
    if query_mysql("text search query").value > 15 {
        return true
    }
    return false 
}

How can I efficiently handle and execute this function over a huge set of combinations?

Bonus question: can the combinations be generated concurrently?

Update: I already know how to generate the combinations themselves, and that part is efficient enough.

+4
2 answers

Yes, the generation itself can be parallelized; one way is to split it into independent tasks. Say you want T tasks:

  • Sort the words by some total ordering function.
  • Find a depth d such that Choose(n, d) >= T.
  • Generate all the "prefixes" (combinations) of depth d (by the choice of d, there are at least T of them).
  • Distribute them among the T tasks: each task takes a "prefix" (a combination c of depth d) and produces all the full combinations that extend it, using only the words that come after max(c) in the ordering, so no combination is generated twice.

In map-reduce terms it looks like this:

map(words): //one mapper
   sort(words) //by some total ordering function
   generate all combinations of depth `d` exactly // NOT k!
   for each combination c produced:
       idx <- index in words of max(c) 
       emit(c,words[idx+1:end])
reduce(c1, words): //T reducers
   combinations <- generate all combinations of size k-d from words
   for each c2 in combinations:
      c <- concat(c1,c2)
      emit(c,f(c))
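
Below is a minimal Python sketch of this prefix-partitioning scheme, with multiprocessing.Pool standing in for the T reducers; the toy word list, the choice of D, and the stubbed-out f are all assumptions for illustration:

from itertools import combinations
from multiprocessing import Pool

WORDS = sorted(f"word{i:02d}" for i in range(20))  # toy input; sorting gives the total ordering
K = 6   # full combination size
D = 2   # prefix depth, chosen so that C(len(WORDS), D) >= number of workers

def f(combo):
    # Stub for the real, expensive check (e.g. the MySQL query).
    return combo[0].endswith("0")

def expand(prefix):
    """One 'reducer': extend a depth-D prefix to all size-K combinations.
    Only words after max(prefix) are used, so no combination is repeated."""
    start = WORDS.index(prefix[-1]) + 1   # prefix[-1] == max(prefix): tuples come out sorted
    return [prefix + tail
            for tail in combinations(WORDS[start:], K - D)
            if f(prefix + tail)]

if __name__ == "__main__":
    prefixes = combinations(WORDS, D)   # the 'map' side: C(20, 2) = 190 prefixes
    with Pool() as pool:
        for hits in pool.imap_unordered(expand, prefixes, chunksize=16):
            for combo in hits:
                print(combo)

Each prefix job is independent, so the Pool could equally be a task queue or an actual map-reduce cluster.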
+1

Have a look at Chase's Twiddle algorithm. It computes each new combination from the previous one, so it is fast and needs very little memory.

It enumerates the combinations of k items taken from n.

As for the bonus question: yes, the combinations can be generated concurrently, e.g. by partitioning the work on the 1st element so that each worker produces only the combinations that start with its assigned word.
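
A sketch of that idea in Python (again with toy sizes; the real f would replace the counter): worker i owns exactly the combinations whose first, i.e. smallest, word is WORDS[i], so the workers' outputs are disjoint and can be produced concurrently:

import math
from itertools import combinations
from concurrent.futures import ProcessPoolExecutor

WORDS = sorted(f"word{i:02d}" for i in range(20))  # toy input
K = 4

def handle_first(i):
    """Worker i: every K-combination whose 1st (smallest) word is WORDS[i]."""
    n = 0
    for tail in combinations(WORDS[i + 1:], K - 1):
        combo = (WORDS[i],) + tail   # apply the expensive f(combo) here
        n += 1
    return n

if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:
        total = sum(ex.map(handle_first, range(len(WORDS))))
    print(total == math.comb(len(WORDS), K))   # True: each combination seen exactly once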

0

Source: https://habr.com/ru/post/1526126/

