String distance, transpositions only

Possible duplicate:
Counting the swaps needed to convert one permutation to another

I am looking for an algorithm that computes a string distance in which the only allowed operation is transposing two adjacent characters. For instance:
string1: "mother"
string2: "moterh"
Distance: 2 (first swap “h” with “e” to get “motehr”, then swap “h” with “r”, which gives “moterh”)
I know that the Damerau-Levenshtein distance is quite similar to this problem, but it requires a lot of memory (I would like this to run reasonably fast on words of up to 1,000,000 characters). I have already written this:

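int amo = 0;  // number of adjacent swaps performed so far
for (int i = 0; i < n; i++) {
    if (fromString[i] == toString[i]) continue;
    char toWhat = toString[i];
    int where = -1;
    // find the nearest occurrence of the needed character at or after i
    for (int j = i; j < n; j++) {
        if (fromString[j] == toWhat) { where = j; break; }
    }
    // no such character: the strings are not anagrams, so stop here
    if (where == -1) { amo = -1; break; }
    // move it left to position i, one adjacent swap at a time
    while (where != i) {
        char temp = fromString[where];
        fromString[where] = fromString[where - 1];
        fromString[where - 1] = temp;
        where--;
        amo++;
    }
}
cout << amo << endl;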

Strings are represented as char[n], where n is their length. I am sure there is a faster way to do this, and I would be very grateful if someone could tell me how, or provide some source code (Java / Python / C++ would be best, but anything would be great).

PS: Excuse me for any language errors; I am not a native English speaker and have not yet mastered the language.

1 answer

Basically you are asking for the edit distance algorithm, but allowing only the transposition operation (a.k.a. swapping, twiddling). In the book Introduction to Algorithms you will find hints for implementing the twiddle operation; it is one of the problems at the end of the chapter on dynamic programming. Also, in the book "The Algorithm Design Manual", in the chapter on dynamic programming, there is a complete implementation of the edit distance algorithm in C, without the transposition operation (again, it is one of the proposed exercises at the end of the chapter).

The typical way to implement an edit distance algorithm is dynamic programming, which costs O(mn) time and O(mn) space. As far as I know, there is no way to do this faster (that is, in less than O(mn) time), but you can do it in less space: with some care you can reduce the space to O(m), since computing the cost of the transpose operation needs only the current row and the previous two rows of the table.
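To illustrate the three-row idea, here is a minimal sketch in C++. It assumes the usual insert, delete and substitute operations in addition to the adjacent transposition (the "optimal string alignment" variant of Damerau-Levenshtein), which may be more than the question wants. The function name osaDistance is made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Edit distance with insert, delete, substitute and adjacent transposition
// (optimal string alignment variant), each with cost 1.
// Only three rows of the DP table are kept, so the extra space is O(|b|).
// Assumption: unit costs and this operation set are illustrative choices.
std::size_t osaDistance(const std::string& a, const std::string& b) {
    const std::size_t m = a.size(), n = b.size();
    std::vector<std::size_t> prev2(n + 1), prev(n + 1), cur(n + 1);

    // Row 0: turning the empty prefix of a into b[0..j) takes j insertions.
    for (std::size_t j = 0; j <= n; ++j) prev[j] = j;

    for (std::size_t i = 1; i <= m; ++i) {
        cur[0] = i;  // turning a[0..i) into the empty string takes i deletions
        for (std::size_t j = 1; j <= n; ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            cur[j] = std::min({prev[j] + 1,          // delete a[i-1]
                               cur[j - 1] + 1,       // insert b[j-1]
                               prev[j - 1] + cost}); // match or substitute
            // Adjacent transposition: this is the only place row i-2 is read.
            if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                cur[j] = std::min(cur[j], prev2[j - 2] + 1);
        }
        // Rotate the rows: prev2 <- prev, prev <- cur, cur becomes scratch.
        std::swap(prev2, prev);
        std::swap(prev, cur);
    }
    return prev[n];
}
```

Calling osaDistance(a, b) with the shorter string passed as b keeps the rows at O(min{m, n}) entries. The running time is still O(mn), so this would not be practical for two strings of 1,000,000 characters.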

That is, assuming you only need the edit distance. If you need the actual edit operations, you are stuck using O(mn) space to reconstruct a solution if you use dynamic programming. However, you can reduce the space to O(min{m, n}) and still recover the actual edit operations by using Hirschberg's algorithm.
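As a side note on the restricted case in the question (adjacent swaps only, and the two strings are anagrams of each other): this is a different technique from the dynamic programming table above. The number of adjacent swaps equals the number of inversions in the permutation that sends the k-th occurrence of each character in the first string to its k-th occurrence in the second. The sketch below counts those inversions with a Fenwick tree in O(n log n) time and O(n) space. The function name adjacentSwapDistance and the -1 return value for non-anagrams are assumptions made for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Minimum number of adjacent swaps that turn `from` into `to`,
// or -1 if the strings are not anagrams of each other.
std::int64_t adjacentSwapDistance(const std::string& from, const std::string& to) {
    const std::size_t n = from.size();
    if (to.size() != n) return -1;

    // positions[c] lists the indices of byte c in `to`, in increasing order.
    std::array<std::vector<std::size_t>, 256> positions;
    for (std::size_t i = 0; i < n; ++i)
        positions[static_cast<unsigned char>(to[i])].push_back(i);

    // target[i] = index in `to` where from[i] must end up
    // (k-th occurrence of a character goes to its k-th occurrence).
    std::array<std::size_t, 256> used{};
    std::vector<std::size_t> target(n);
    for (std::size_t i = 0; i < n; ++i) {
        const unsigned char c = static_cast<unsigned char>(from[i]);
        if (used[c] == positions[c].size()) return -1;  // too many of c in `from`
        target[i] = positions[c][used[c]++];
    }

    // Count inversions in `target` with a 1-based Fenwick tree:
    // scan right to left and ask how many smaller targets were already seen.
    std::vector<std::int64_t> tree(n + 1, 0);
    std::int64_t inversions = 0;
    for (std::size_t i = n; i-- > 0;) {
        // Prefix sum over tree indices 1..target[i], i.e. values below target[i].
        for (std::size_t k = target[i]; k > 0; k -= k & (~k + 1))
            inversions += tree[k];
        // Record target[i] at tree index target[i] + 1.
        for (std::size_t k = target[i] + 1; k <= n; k += k & (~k + 1))
            ++tree[k];
    }
    return inversions;
}
```

For the example in the question, "mother" and "moterh" give the target sequence 0, 1, 2, 5, 3, 4, which has two inversions, matching the distance of 2.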


Source: https://habr.com/ru/post/900072/

