The algorithm in the solution is an exhaustive depth-first search with a pair of optimizations. Unfortunately, the solution's description does not quite explain this.
An exhaustive search means that we try to enumerate every possible combination of bits. Depth-first means that we first try setting all the bits to one, then set the last bit to zero, then the second-last, then both the last and the second-last, and so on.
The first optimization is pruning: we backtrack as soon as we find that a parity constraint is already violated. So, for example, when we begin our search and reach the end of the first row, we check whether that row has even (zero) parity. If it does not, we do not continue deeper; we stop, backtrack, and try setting the last bit in the row to zero.
The second optimization is DP-like, because we cache partial results and reuse them. This exploits the fact that, from the problem's point of view, different search paths can converge on the same logical state. What is the logical state of the search? The solution's description begins to explain it ("begins" being the key word). The trick is that, at any given point in the search, the minimum number of additional bit flips does not depend on the exact contents of the entire sudoku board, but only on the state of the various parities we need to track (see "Further explanation" below). There are 27 parities to track (9 columns, 9 rows and 9 3x3 boxes). Moreover, we can optimize some of them away. As the search proceeds, the parity of every row above the current one will always be even, and the parity of every row below, not yet touched by the search, does not change, so we only track the parity of the current row. By the same logic, the boxes above and below the current band are ignored, and we only need to track the 3 "active" boxes.
Therefore, instead of 2^9 * 2^9 * 2^9 = 134,217,728 states, we have only 2^9 * 2^1 * 2^3 = 8,192 states. Unfortunately, we need a separate cache for each depth level in the search. So we multiply by the 81 possible search depths to find that we need an array of size 663,552. Borrowing from templatetypedef:
```
int memo[9][9][1<<9][1<<3][2];
         ^  ^    ^     ^   ^
         |  |    |     |   |
row -----+  |    |     |   |
col --------+    |     |   |
column parity ---+     |   |
box parity ------------+   |
current row parity --------+
```

`1<<9` simply means 2^9, given how integers and bit shifts work.
Further explanation: due to the way parity works, flipping a bit always flips its 3 corresponding parities. Therefore, all permutations of sudoku boards that have the same parities can be solved with the same bit-flip pattern. The `solve` function answers the question: "Assuming you can only flip bits from the cell at (x, y) onward, what is the minimum number of bit flips to make the board legal?" All sudoku boards with the same parities will give the same answer. The search algorithm considers many permutations of the board. It mutates them starting from the top, counts how many bits it has flipped so far, and then asks `solve` to find out how many more are needed. If `solve` has already been called with the same (x, y) and the same parities, we can simply return the cached result.
The confusing part is the code that actually performs the search and update:
```cpp
/* Try setting the cell to 1. */
ref = !A[r][c] + solve(r, c + 1, mc ^ 1 << c, mb ^ 1 << c / 3, !p);

/* Try setting the cell to 0. */
ref = min(ref, A[r][c] + solve(r, c + 1, mc, mb, p));
```
It could be written more clearly as:
```cpp
bool areWeFlipping = A[r][c] == 1;
int nAdditionalFlipsIfCellIs0 = (areWeFlipping ? 1 : 0) + solve(r, c + 1, mc, mb, p); // Continue the search

areWeFlipping = A[r][c] == 0;
// At the start, we assume the sudoku board is all zeroes, and therefore the column parity is all even.
// With each additional cell, we update the column parity with the value of that cell. In this case, we assume it to be 1.
int newMc = mc ^ (1 << c);       // Update the parity of column c. ^ (1 << c) means "flip the bit denoting the parity of column c"
int newMb = mb ^ (1 << (c / 3)); // Update the parity of the 'active' box (c/3) (i.e., if we're in column 5, we're in box 1)
int newP = p ^ 1;                // Update the current row parity
int nAdditionalFlipsIfCellIs1 = (areWeFlipping ? 1 : 0) + solve(r, c + 1, newMc, newMb, newP); // Continue the search

ref = min(nAdditionalFlipsIfCellIs0, nAdditionalFlipsIfCellIs1);
```
Personally, I would implement the two sides of the search as "flip" and "don't flip". This makes the algorithm conceptually easier to understand. The second paragraph above would then read: "Depth-first means that we first try not to flip any bits, then flip the last bit, then the second-last, then both the last and the second-last, and so on." In addition, before starting the search, we would need to pre-calculate the values of "mc", "mb" and "p" for our board, instead of passing 0.
```cpp
int nAdditionalFlipsIfDontFlip = 0 + solve(r, c + 1, mc, mb, p);

int newMc = mc ^ (1 << c);
int newMb = mb ^ (1 << (c / 3));
int newP = p ^ 1;
int nAdditionalFlipsIfFlip = 1 + solve(r, c + 1, newMc, newMb, newP);

ref = min(nAdditionalFlipsIfDontFlip, nAdditionalFlipsIfFlip);
```
However, this change does not affect performance.
UPDATE
The most amazing thing is that the key to the algorithm's speed is that the memoization array ends up quite sparse. At each depth level, there are usually 512 (sometimes 256 or 128) states in use (out of 8,192). Moreover, there is always exactly one state per column parity; the box and row parities don't seem to matter! Omitting them from the memoization array improves performance by another 30x. But can we prove that this is always true?