I was thinking about your problem. I don't have a solution encoded, but here is the approach:
First of all, suppose without loss of generality that you have a set of 2 n bits. (If you have less than 2 n bits, drop the bit array with leading zeros until you do this. Obviously this does not double the size of the array.) In your case, you say you have a million uints, so bit 2 25 .
Suppose also that each set of 2 k + 1 bits can be divided equally into two sets of bits, the left and right collections, each with 2 k bits.
Thus, each bit collection or subsection has a โsizeโ, which is an exact power of two. The smallest possible collection contains one bit and cannot be further divided.
Secondly, suppose you likewise have an unchanged representation of a number in decimal form, and that again, without loss of generality, there are 2 d decimal digits in the string. If there are less than exactly 2 d, again, with leading zeros. Again, each decimal collection larger than one can be divided into two collections in half size.
Now we will draw a recursive algorithm:
DigitCollection Convert(BitCollection bits) { if (bits.Size == 1) return bits[0] ? DigitCollection.SingleOne : DigitCollection.SingleZero; DigitCollection left = Convert(bits.Left); DigitCollection right = Convert(bits.Right); DigitCollection power = MakePowerOfTwo(bits.Right.Size); return Add(Multiply(left, power), right); }
Now we need the Add, Multiply, and MakePowerOfTwo methods. As we will see now, we also need a Subtract and two Shift operators to quickly multiply by powers of ten.
Addition and subtraction are simple. It is clear that if a longer collection contains n digits, then the methods of adding and subtracting can be implemented to take O (n) time.
The FullShift and HalfShift operators create new collections of numbers from the old ones to facilitate quick multiplication by ten. If a set of digits of size 2 d + 1 consists of subcategories (X1, X2), each of which has size 2 d, then the โhalf-shiftedโ collection contains 2 d + 2 and consists of ((2 d leading zeros, X1), (X2 , 2 d finite zeros)). A collection with a complete shift consists of ((X1, X2), (2 d + 1 trailing zeros)).
It is obviously very cheap to build.
Multiplication is where we run into big problems . Assume without loss of generality that we multiply together two DigitCollections, each of which has exactly 2 d digits. We can use the Karatsuba algorithm:
DigitCollection Multiply(DigitCollection x, DigitCollection y) { // Assume x and y are the same digit size. if (x and y are both single digits) return the trivial solution; // Assume that z2, z1 and z0 are all the same digit size as x and y. DigitCollection z2 = Multiply(x.Left, y.Left); DigitCollection z0 = Multiply(x.Right, y.Right); DigitCollection z1 = Subtract( Multiply(Add(x.Left, x.Right), Add(y.Left, y.Right)), Add(z2, z0)); return Add(Add(FullShift(z2), HalfShift(z1)), z0); }
What is the order of this algorithm? Suppose n = 2 d digits. What is O (Multiply (n))? We repeat three times, each of which has a problem with half the number of digits. The remaining operations of addition, subtraction, and shift do not exceed O (n). So we have repeatability:
T(n) = 3 T(n/2) + O(n)
which has an easy solution through the main theorem: this algorithm is O (n 1 / log 3 ), which has a value of O (n 1,58 ).
What about MakePowerOfTwo? This is easy, given what we already have. We use the identity:
2 2n + 1 = 2 n x 2 n + 2 n x 2 n
and write the algorithm:
DigitCollection MakePowerOfTwo(int p) { if (p == 0) return DigitCollection.SingleOne; DigitCollection power = MakePowerOfTwo(p / 2); power = Multiply(power, power); if (p % 2 != 0) power = Add(power, power); return power; }
It is dominated by the calculation of multiplication, as well as O (n 1.58 ).
And now we see that multiplication also prevails in the original Convert call.
So, using this algorithm, if you have 2 d binary digits for conversion, you can expect that it will take about O (2 1,58 d ) steps. In your case, you have 2 25 bits to convert, so it will take about 777 billion calculations.
The key fact here is that the multiplication cost prevails in this algorithm . If you can reduce the cost of multiplication to less than O (n 1.58 ), you will get huge wins. If I were you, I would study improvements to decimal multiplication algorithms over Karatsuba.