The optimal solution for non-overlapping maximum scoring sequences

Question

While developing part of a simulator, I ran into the following problem. Consider a string of length N and M substrings of this string, each with a non-negative score assigned to it. Of particular interest are the sets of substrings that meet the following requirements:

They do not overlap.
Their total score (the sum of the individual scores, for simplicity) is maximum.
They span the entire string.

I understand that a naive brute-force solution has O(M * N^2) complexity. Although implementing that algorithm would probably not cost the project much in performance (it is nowhere near the critical path, it can be precomputed, etc.), it really does not sit well with me. I would like to know whether there are more efficient solutions to this problem, and if so, what they are. Pointers to relevant code are always appreciated, but a description of the algorithm alone will do too.

+3

set algorithm

Michael Foukarakis Sep 11 '09 at 6:52


4 answers

An O(N + M) solution:

Set f[1..N]=-1
Set f[0]=0
for a = 0 to N-1
    if f[a] >= 0
        For each substring beginning at a
            Let b be the last index of the substring, and c its score
            If f[a]+c > f[b+1]
                Set f[b+1] = f[a]+c
                Set g[b+1] = [substring number]
Now f[N] contains the answer, or -1 if no set of substrings spans the string.
To get the substrings (only if f[N] >= 0):
b = N
while b > 0
    Get substring number from g[b]
    Output substring number
    b = b - (length of substring)
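
For readers who want something runnable, here is a minimal Python sketch of the pseudocode above. It assumes each substring is given as a (start, last, score) tuple with 0-based inclusive indices, and that substrings are grouped by start index up front so the main loop stays linear. The function name best_cover and the tuple layout are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict


def best_cover(n, substrings):
    """Return (best_total, chosen_indices) or (None, []) if no cover exists.

    substrings: list of (start, last, score), 0-based inclusive indices
    (an assumed input layout for this sketch).
    """
    # Group substrings by start index so each one is visited exactly once.
    by_start = defaultdict(list)
    for k, (start, last, score) in enumerate(substrings):
        by_start[start].append((last, score, k))

    f = [-1] * (n + 1)    # f[i]: best score for covering positions 0..i-1, -1 if impossible
    g = [None] * (n + 1)  # g[i]: number of the last substring used to reach i
    f[0] = 0
    for a in range(n):
        if f[a] < 0:
            continue  # position a is not reachable by any partial cover
        for last, score, k in by_start.get(a, ()):
            if f[a] + score > f[last + 1]:
                f[last + 1] = f[a] + score
                g[last + 1] = k

    if f[n] < 0:
        return None, []

    # Walk the back-pointers from the end of the string to the start.
    chosen = []
    b = n
    while b > 0:
        k = g[b]
        chosen.append(k)
        start, last, _ = substrings[k]
        b -= last - start + 1
    chosen.reverse()
    return f[n], chosen
```

For example, with n = 4 and the substrings (0,1,3), (2,3,4), (0,3,5), (0,0,1), (1,3,10), the best cover uses substrings number 3 and 4 for a total of 11.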

0

jcd Sep 11 '09 at 7:06

, M , - .

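Let S be the input string of length N, and let the M substrings be Tj. Let Lj be the length of Tj and Pj the score assigned to it.

This can be solved with dynamic programming (DP). Keep an array res of ints indexed by position, where the i-th element is the best score obtainable for the part of the string starting at the i-th character (for example, for "abcd", res[2] is the best score for "cd"). Treat res[N], the empty suffix, as 0.

Iterate over the array from the end to the beginning and check whether Tj can start at the i-th character. If it can, then the score (res[i + Lj] + Pj) is achievable. Taking the best over all Tj gives res[i] = max(res[i + Lj] + Pj) over all Tj that match at the i-th character.

res[0] is the final answer.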
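
The recurrence res[i] = max(res[i + Lj] + Pj) can be sketched in Python as follows. This assumes the scored substrings are supplied as a dict from pattern string to score, which is only one possible reading of the question; the name best_score and the use of None for "this suffix cannot be covered" are choices made for this sketch.

```python
def best_score(s, scored):
    """Best total score for tiling s with the patterns in scored, or None.

    scored: dict mapping pattern string -> non-negative score
    (an assumed input format for this sketch).
    """
    n = len(s)
    res = [None] * (n + 1)  # res[i]: best score for the suffix s[i:], None if it cannot be tiled
    res[n] = 0              # the empty suffix costs nothing
    for i in range(n - 1, -1, -1):
        for pattern, score in scored.items():
            if not pattern:
                continue  # empty patterns would not advance the position
            rest = i + len(pattern)
            # startswith(pattern, i) is False when the pattern runs past the end of s.
            if s.startswith(pattern, i) and res[rest] is not None:
                candidate = res[rest] + score
                if res[i] is None or candidate > res[i]:
                    res[i] = candidate
    return res[0]
```

Checking every pattern at every position costs roughly O(N * M * L) character comparisons for patterns of length up to L, so this form is mainly useful when the match positions are not known in advance.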

0

Olexiy Sep 11 '09 at 7:41

Input:

N, the number of chars in a string
e[0..N-1]: (b,c) an element of set e[a] means [a,b) is a substring with score c.

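(If every substring is allowed, a score function c(a, b) is enough.)

For example, [1,2) is the substring consisting of only the second character of the string (a half-open interval).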

( , , , "" k )

Output:

s[i] is the score of the best substring covering of [0,i)
a[i]: [a[i],i) is the last substring used to cover [0,i); else NULL

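The solution is O(N^2) if the intervals in e are dense, and O(N + E) in general, where E is the total number of intervals in e. The algorithm: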

for i = 0 to N:
    a[i] <- NULL
    s[i] <- 0
a[0] <- 0
for i = 0 to N-1:
    if a[i] != NULL:
        for (b,c) in e[i]:
            sib <- s[i]+c
            if a[b]==NULL or sib>s[b]:
                a[b] <- i
                s[b] <- sib

To yield the triples (a, b, c), where the substring [a, b) has score c:

i <- N
if (a[i]==NULL):
    error "no covering"
while (i!=0):
    from <- a[i]
    yield (from,i,s[i]-s[from])
    i <- from

Alternatively, store the pair (sib, c) in s[b] to avoid the subtraction.
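
A rough Python rendering of the two loops above, for anyone who wants to run it. It keeps the score of the chosen interval next to each back-pointer, so the traceback needs no subtraction. The function name cover_triples, the separate reachable flags, and raising ValueError for "no covering" are assumptions of this sketch; it also assumes every interval satisfies b > a (no empty substrings).

```python
def cover_triples(n, e):
    """Return the triples (a, b, c) of a best cover of [0, n), left to right.

    e[a] is an iterable of (b, c): the half-open substring [a, b) has score c.
    Raises ValueError if no set of substrings covers the whole string.
    """
    s = [0] * (n + 1)             # s[i]: best score of a cover of [0, i)
    back = [None] * (n + 1)       # back[b] = (a, c): last interval [a, b) used, with its score
    reachable = [False] * (n + 1)
    reachable[0] = True
    for i in range(n):
        if not reachable[i]:
            continue
        for b, c in e[i]:
            sib = s[i] + c
            # The reachable check lets zero-score intervals count as a valid cover.
            if not reachable[b] or sib > s[b]:
                reachable[b] = True
                s[b] = sib
                back[b] = (i, c)
    if not reachable[n]:
        raise ValueError("no covering")

    triples = []
    i = n
    while i != 0:
        a, c = back[i]
        triples.append((a, i, c))
        i = a
    triples.reverse()
    return triples
```

With n = 4 and e = [[(2, 3), (4, 5), (1, 1)], [(4, 10)], [(4, 4)], []], the result is [(0, 1, 1), (1, 4, 10)].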

0

Jonathan Graehl Sep 11 '09 at 8:09

Ants Aasma · Accepted Answer · 2009-09-11T20:22:32+0000

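This can be thought of as finding the longest path through a DAG. Each position in the string is a node, and each substring match is an edge. For any node on the optimal path, the optimal path from the start to that node joined with the optimal path from that node to the end is itself the optimal path. So it is enough to keep track of the best path to each node, making sure that all edges ending in a node have been visited before any path through that node is considered.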

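What remains is to find all edges that start at a given node, that is, all substrings that match at a given position. If you already know where the substrings match, this is as simple as building a hash table. If you do not, you can still build a hash table by using Rabin-Karp.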
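
As a small illustration of the hash-table idea, the sketch below indexes every match by its start position, so the edges leaving a node can be looked up directly. It assumes the scored substrings are given as pattern strings and finds their occurrences with plain str.find rather than Rabin-Karp, purely to keep the example short; the name matches_by_start is hypothetical.

```python
from collections import defaultdict


def matches_by_start(s, scored):
    """Map each start index to a list of (end, score) for the patterns matching there.

    scored: dict mapping pattern string -> score (assumed input format).
    Ends are half-open, so a match at start covers s[start:end].
    """
    table = defaultdict(list)
    for pattern, score in scored.items():
        if not pattern:
            continue  # an empty pattern would match everywhere and never advance
        pos = s.find(pattern)
        while pos != -1:
            table[pos].append((pos + len(pattern), score))
            pos = s.find(pattern, pos + 1)  # step by one to keep overlapping occurrences
    return table
```

The resulting table has the same shape as the per-position edge lists used by the dynamic-programming answers in this thread, so it can feed any of them.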

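Note that this still visits every edge in the DAG, for O(e) complexity. In other words, every substring match that could be part of a chain of connected substrings from start to end has to be considered once. You might do better by preprocessing the substrings to rule out some matches, but I doubt that any general-case improvement in complexity is possible, and any practical improvement depends heavily on how your data is distributed.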
