The fastest way to sort a row to match the second row - only adjacent swaps are allowed

Question

The fastest way to sort a row to match the second row - only adjacent swaps are allowed

I want to get the minimum number of swap letters needed to convert one line to the second line. Only adjacent swaps are allowed.

Inputs: line length, line_1, line_2

Some examples:

Length | String 1 | String 2 | Output
-------+----------+----------+-------
   3   | ABC      | BCA      |   2 
   7   | AABCDDD  | DDDBCAA  |  16
   7   | ZZZAAAA  | ZAAZAAZ  |   6

Here is my code:

def letters(number, word_1, word_2):

    result = 0

    while word_1 != word_2:
        index_of_letter = word_1.find(word_2[0])
        result += index_of_letter
        word_1 = word_1.replace(word_2[0], '', 1)
        word_2 = word_2[1:]

    return result

It gives the correct results, but the calculation should remain less than 20 seconds.

Here are two sets of input (1,000,000 characters of long string): https://ufile.io/8hp46 and https://ufile.io/athxu .

In my setup, the first is done in 40 seconds, and the second in 4 minutes.

How to calculate the result in less than 20 seconds?

+4

python sorting algorithm

Stoiczkow Dec 01 '17 at 19:33

source share

3 answers

@KennyOstrom 90%. .

, , - , "" , , , . , 1 word2 ( ), . , , , .

, , . , @trincot . 1819136406 480769230766.

import numpy as np

_, word1, word2 = open("lit10b.in").read().split()
word1 = np.frombuffer(word1.encode('utf8')
                      + (((1<<len(word1).bit_length()) - len(word1))*b'Z'),
                      dtype=np.uint8)
word2 = np.frombuffer(word2.encode('utf8')
                      + (((1<<len(word2).bit_length()) - len(word2))*b'Z'),
                      dtype=np.uint8)
n = len(word1)

o1 = np.argsort(word1, kind='mergesort')
o2 = np.argsort(word2, kind='mergesort')
o1inv = np.empty_like(o1)
o1inv[o1] = np.arange(n)

order = o2[o1inv]

sum_ = 0
for i in range(1, len(word1).bit_length()):
    order = np.reshape(order, (-1, 1<<i))
    oo = np.argsort(order, axis = -1, kind='mergesort')
    ioo = np.empty_like(oo)
    ioo[np.arange(order.shape[0])[:, None], oo] = np.arange(1<<i)
    order[...] = order[np.arange(order.shape[0])[:, None], oo]
    hw = 1<<(i-1)
    sum_ += ioo[:, :hw].sum() - order.shape[0] * (hw-1)*hw // 2

print(sum_)

+5

Paul Panzer 02 . '17 0:00

, , , , .

Google, . . , Python

- , . 1. 2.

, , . . , .

, , , .

Using some code with which I happen to get away from repeating some classes of algorithms in free online classes (for fun):

print (week1.count_inversions('ABC'), week1.count_inversions('BCA'))
print (week1.count_inversions('AABCDDD'), week1.count_inversions('DDDBCAA'))
print (week1.count_inversions('ZZZAAAA'), week1.count_inversions('ZAAZAAZ'))

0 2
4 20
21 15

This corresponds to the values indicated above: 2, 16 and 6.

+1

Kenny ostrom Dec 01 '17 at 20:54

source share

trincot · Accepted Answer · 2017-12-02T11:33:14+0000

O (n ²) :

find() O (n)
replace() , O (n)
O (n)

, , , result += index_of_letter, index_of_letter.

:

word_1 word_1 dict, . . , , word_1, . , . O (n) , find O (1). , , , .. dict .
, , , . , ( ), node word_1 , ( , ). ( ), , . O (logn) . , replace. , . , node ( ).

:

, . numLeft. parent , .

:

def letters(word_1, word_2):
    size = len(word_1) # No need to pass size as argument
    # Create a binary tree for word_1, organised as a list
    #   in in-order sequence, and with the values equal to the number of
    #   non-matched letters in the range up to and including the current index:
    treesize = (1<<size.bit_length()) - 1
    numLeft = [(i >> 1 ^ ((i + 1) >> 1)) + 1 for i in range(0, treesize)]
    # Keep track of parents in this tree (could probably be simpler, I welcome comments).
    parent = [(i & ~((i^(i+1)) + 1)) | (((i ^ (i+1))+1) >> 1) for i in range(0, treesize)]
    # Create a linked list for each distinct character
    next = [-1] * size
    head = {}
    for i in range(len(word_1)-1, -1, -1): # go backwards
        c = word_1[i]
        # Add index at front of the linked list for this character
        if c in head:
            next[i] = head[c]
        head[c] = i
    # Main loop counting number of swaps needed for each letter
    result = 0
    for i, c in enumerate(word_2):
        # Extract next occurrence of this letter from linked list
        j = head[c]
        head[c] = next[j]
        # Get number of preceding characters with a binary tree lookup
        p = j
        index_of_letter = 0
        while p < treesize:
            if p >= j:  # On or at right?
                numLeft[p] -= 1  # Register that a letter has been removed at left side
            if p <= j:  # On or at left?
                index_of_letter += numLeft[p] # Add the number of left-side letters
            p = parent[p] # Walk up the tree
        result += index_of_letter
    return result

O (nlogn), - .

, , . ... .

The fastest way to sort a row to match the second row - only adjacent swaps are allowed

, , . . , .

More articles: