A question about bit operations

Is there a way to find the bit that was set the least number of times using only bit operations?

For example, if I have three bitmaps:

11011001 11100000 11101101 

bits in positions 3 and 5 are set to 1 in only one of the three vectors.

I currently have an O(n) solution, where n is the number of bits in the bit array: I look at every bit in the bit array and increment a counter each time it is 1. But for some reason I think there is an O(1) solution that uses just a few bitwise operations. Can anyone advise? Thanks.
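
For reference, the straightforward per-bit loop described above might look roughly like this. It is only a sketch of the baseline, assuming 8-bit bitmaps; the name `count_bits_naive` is made up for illustration, and bit 0 is taken to be the least significant bit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Baseline: visit every bit of every bitmap and bump a counter when it is 1.
// counts[i] refers to bit i, with bit 0 the least significant bit.
std::array<unsigned, 8> count_bits_naive(const std::vector<std::uint8_t>& maps) {
    std::array<unsigned, 8> counts{};
    for (std::uint8_t m : maps)
        for (unsigned bit = 0; bit < 8; ++bit)
            counts[bit] += (m >> bit) & 1u;
    return counts;
}
```

For the three bitmaps in the question (0xD9, 0xE0, 0xED), the bits set exactly once are bits 4 and 2, which are positions 3 and 5 when counting from the left starting at zero.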

+6
5 answers

You can use a duplicate/shift/mask approach to spread the bits apart, which may be a little faster than an iterative bit-shifting scheme if the total number of values is limited.

For example, for each 8-bit "bits" value, assuming no more than 15 values:

    bits1 = (bits >> 3) & 0x11;
    bits2 = (bits >> 2) & 0x11;
    bits3 = (bits >> 1) & 0x11;
    bits4 = bits & 0x11;
    bitsSum1 += bits1;
    bitsSum2 += bits2;
    bitsSum3 += bits3;
    bitsSum4 += bits4;

Then, at the end, break each bitsSumN value into two 4-bit counts.
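
Put together, the idea could be sketched as follows. This is an illustration under the same assumptions (8-bit values, at most 15 of them); `count_bits_masked` and the array `sum` are names invented here, with `sum[k]` playing the role of the `bitsSumN` accumulators.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Accumulate per-bit counts for up to 15 bytes using four masked sums.
// Each sum packs two 4-bit counters: the low nibble counts bit k,
// the high nibble counts bit k + 4.
std::array<unsigned, 8> count_bits_masked(const std::vector<std::uint8_t>& values) {
    assert(values.size() <= 15);  // a 4-bit counter would overflow at 16
    unsigned sum[4] = {0, 0, 0, 0};
    for (std::uint8_t bits : values)
        for (unsigned k = 0; k < 4; ++k)
            sum[k] += (bits >> k) & 0x11u;
    // Break each sum into its two 4-bit counts.
    std::array<unsigned, 8> counts{};
    for (unsigned k = 0; k < 4; ++k) {
        counts[k] = sum[k] & 0x0Fu;
        counts[k + 4] = (sum[k] >> 4) & 0x0Fu;
    }
    return counts;
}
```

The limit of 15 values comes from the 4-bit width of each counter: a 16th set bit would carry into the neighbouring counter.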

+4

Another option is to transpose the bitmap. In your example:

 111 111 011 100 101 001 000 101 

And then use standard bit counting methods to count the number of bits set.

Doing this naively will most likely be slower than the usual approach, but you could try adapting the bit-counting algorithms to pull their bits from different words instead of from a single word. However, the fastest methods look at several bits at a time, so in your case this would be difficult to optimize.
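
A naive version of the transpose-then-count idea might look like the sketch below. It assumes 8-bit bitmaps and at most 32 of them so that each transposed column fits in a `std::uint32_t`; the function name `count_bits_transposed` is hypothetical, and `std::bitset::count` stands in for whichever bit-counting method you prefer.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Transpose up to 32 bytes: column[i] collects bit i of every input,
// one input per bit, so the population count of column[i] is the
// number of inputs that have bit i set.
std::array<unsigned, 8> count_bits_transposed(const std::vector<std::uint8_t>& maps) {
    assert(maps.size() <= 32);  // one column bit per input
    std::array<std::uint32_t, 8> column{};
    for (std::size_t row = 0; row < maps.size(); ++row)
        for (unsigned bit = 0; bit < 8; ++bit)
            column[bit] |= static_cast<std::uint32_t>((maps[row] >> bit) & 1u) << row;
    std::array<unsigned, 8> counts{};
    for (unsigned bit = 0; bit < 8; ++bit)
        counts[bit] = static_cast<unsigned>(std::bitset<32>(column[bit]).count());
    return counts;
}
```

As written, the transpose itself still touches every bit, which is why this is unlikely to beat the simple loop without further tuning.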

+3

If you have 15 or fewer arrays, treat the bit patterns as hexadecimal numbers (instead of binary) and just add them together; each hex digit of the sum is then the count for that position, and a hex digit can hold at most 15. But I'm afraid this will still be less efficient than your O(n) solution. (And yes, I realize that addition is not a bitwise operation.)
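
To make the hexadecimal trick concrete, here is a small sketch that reads the patterns from the question as hex strings and adds them. The function name `sum_as_hex` is made up, and the input is assumed to be 8-character strings of '0' and '1'.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Read each 8-character pattern of '0'/'1' as a hexadecimal number and sum them.
// Every hex digit of the result is then the count for that bit position.
std::uint32_t sum_as_hex(const std::vector<std::string>& patterns) {
    assert(patterns.size() <= 15);  // a hex digit tops out at 15
    std::uint32_t total = 0;
    for (const std::string& p : patterns)
        total += static_cast<std::uint32_t>(std::stoul(p, nullptr, 16));
    return total;
}
```

For the three example patterns the sum is 0x33212102: reading the digits from left to right gives the counts 3, 3, 2, 1, 2, 1, 0, 2 for the eight positions.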

+1

If you have 15 items or fewer, I would suggest first distilling each group of three numbers into two, and then each group of fifteen into four. Something like:

    uint32 x0, y0, z0, x1, y1, z1, ... x4, y4, z4;  // Input values
    uint32 even0, even1, ... even4, odd0 ... odd4;
    uint32 lowereven, lowerodd, uppereven, upperodd;

    // Each 2-bit field of evenN/oddN counts one bit position across three inputs (0..3)
    even0 = (x0 & 0x55555555) + (y0 & 0x55555555) + (z0 & 0x55555555);
    odd0 = ((x0 >> 1) & 0x55555555) + ((y0 >> 1) & 0x55555555) + ((z0 >> 1) & 0x55555555);
    ... then do likewise for even1 ... even4 and odd1 ... odd4

    // Each 4-bit field of the results counts one bit position across all fifteen inputs (0..15)
    lowereven = (even0 & 0x33333333) + (even1 & 0x33333333) + (even2 & 0x33333333) + ...;
    lowerodd = (odd0 & 0x33333333) + (odd1 & 0x33333333) + (odd2 & 0x33333333) + ...;
    uppereven = ((even0 >> 2) & 0x33333333) + ((even1 >> 2) & 0x33333333) + ...;
    upperodd = ((odd0 >> 2) & 0x33333333) + ((odd1 >> 2) & 0x33333333) + ...;

After these operations, the four values hold the counts for all bit positions, one count per 4-bit field. lowereven holds the counts for bits 0, 4, 8, 12, etc.; lowerodd has 1, 5, 9, etc.; uppereven has 2, 6, 10, etc.; upperodd has 3, 7, 11, etc.

If you had more than 15 numbers, you could process up to 255 numbers in groups of 15 by doing the above for each group and then using eight operations to combine all the groups.
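
Written out as a loop rather than as unrolled statements, the two-stage scheme for exactly fifteen 32-bit words might look like this. It is a sketch, not the answer's exact code: `count_bits_15` and the unpacking loop at the end are additions for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Per-bit counts for exactly 15 32-bit words.
// Stage 1 folds each group of three words into 2-bit counters (max 3),
// stage 2 folds the five groups into 4-bit counters (max 15).
std::array<unsigned, 32> count_bits_15(const std::array<std::uint32_t, 15>& w) {
    std::uint32_t lowereven = 0, lowerodd = 0, uppereven = 0, upperodd = 0;
    for (int g = 0; g < 5; ++g) {
        std::uint32_t even = 0, odd = 0;
        for (int i = 0; i < 3; ++i) {
            even += w[3 * g + i] & 0x55555555u;
            odd += (w[3 * g + i] >> 1) & 0x55555555u;
        }
        lowereven += even & 0x33333333u;
        uppereven += (even >> 2) & 0x33333333u;
        lowerodd += odd & 0x33333333u;
        upperodd += (odd >> 2) & 0x33333333u;
    }
    // Unpack: nibble n of each accumulator covers one of bits 4n .. 4n+3.
    std::array<unsigned, 32> counts{};
    for (unsigned n = 0; n < 8; ++n) {
        counts[4 * n + 0] = (lowereven >> (4 * n)) & 0xFu;
        counts[4 * n + 1] = (lowerodd >> (4 * n)) & 0xFu;
        counts[4 * n + 2] = (uppereven >> (4 * n)) & 0xFu;
        counts[4 * n + 3] = (upperodd >> (4 * n)) & 0xFu;
    }
    return counts;
}
```

Fewer than fifteen inputs can be handled by padding the array with zeros, which contribute nothing to any counter.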

0

It is worth noting that if each bit is padded with a certain number of leading zeros, then adding all the input values together yields the count for every bit position; we just need to mask it out or extract it in some similar way. The counting itself then becomes trivial, but it raises other questions, such as:

  • How do I convert my input into the format I want to work with?
  • How do I read the bit counts out after the operation?
  • Should I store the input in the converted format in the first place?
  • What is the maximum bit count?

In the code below, I chose to support a maximum bit count of 15, but it could easily be extended to 255. I chose to handle only well-formed input (no empty or oversized input arrays). And although the assembly generated for the caller's bit-field accesses will probably involve some shifts or masks, that is acceptable.

This implementation uses a lookup table to expand each byte, and although I have not profiled it, I think it should be a little faster than a solution that loops over the bits.

    #include <cassert>

    struct BitCount {
        unsigned char bit0 : 4; unsigned char bit1 : 4; unsigned char bit2 : 4; unsigned char bit3 : 4;
        unsigned char bit4 : 4; unsigned char bit5 : 4; unsigned char bit6 : 4; unsigned char bit7 : 4;
        unsigned char bit8 : 4; unsigned char bit9 : 4; unsigned char bitA : 4; unsigned char bitB : 4;
        unsigned char bitC : 4; unsigned char bitD : 4; unsigned char bitE : 4; unsigned char bitF : 4;
    };

    void CountBits(const short *invals, unsigned incount, BitCount &bitcount)
    {
        assert(incount && incount <= 0xF && sizeof bitcount == 8);
        // expand[b] spreads the 8 bits of b into 8 nibbles, one bit per nibble
        static const unsigned expand[256] = {
        //  _0          _1          _2          _3          _4          _5          _6          _7          _8          _9          _A          _B          _C          _D          _E          _F
            0x00000000, 0x00000001, 0x00000010, 0x00000011, 0x00000100, 0x00000101, 0x00000110, 0x00000111, 0x00001000, 0x00001001, 0x00001010, 0x00001011, 0x00001100, 0x00001101, 0x00001110, 0x00001111, // 0_
            0x00010000, 0x00010001, 0x00010010, 0x00010011, 0x00010100, 0x00010101, 0x00010110, 0x00010111, 0x00011000, 0x00011001, 0x00011010, 0x00011011, 0x00011100, 0x00011101, 0x00011110, 0x00011111, // 1_
            0x00100000, 0x00100001, 0x00100010, 0x00100011, 0x00100100, 0x00100101, 0x00100110, 0x00100111, 0x00101000, 0x00101001, 0x00101010, 0x00101011, 0x00101100, 0x00101101, 0x00101110, 0x00101111, // 2_
            0x00110000, 0x00110001, 0x00110010, 0x00110011, 0x00110100, 0x00110101, 0x00110110, 0x00110111, 0x00111000, 0x00111001, 0x00111010, 0x00111011, 0x00111100, 0x00111101, 0x00111110, 0x00111111, // 3_
            0x01000000, 0x01000001, 0x01000010, 0x01000011, 0x01000100, 0x01000101, 0x01000110, 0x01000111, 0x01001000, 0x01001001, 0x01001010, 0x01001011, 0x01001100, 0x01001101, 0x01001110, 0x01001111, // 4_
            0x01010000, 0x01010001, 0x01010010, 0x01010011, 0x01010100, 0x01010101, 0x01010110, 0x01010111, 0x01011000, 0x01011001, 0x01011010, 0x01011011, 0x01011100, 0x01011101, 0x01011110, 0x01011111, // 5_
            0x01100000, 0x01100001, 0x01100010, 0x01100011, 0x01100100, 0x01100101, 0x01100110, 0x01100111, 0x01101000, 0x01101001, 0x01101010, 0x01101011, 0x01101100, 0x01101101, 0x01101110, 0x01101111, // 6_
            0x01110000, 0x01110001, 0x01110010, 0x01110011, 0x01110100, 0x01110101, 0x01110110, 0x01110111, 0x01111000, 0x01111001, 0x01111010, 0x01111011, 0x01111100, 0x01111101, 0x01111110, 0x01111111, // 7_
            0x10000000, 0x10000001, 0x10000010, 0x10000011, 0x10000100, 0x10000101, 0x10000110, 0x10000111, 0x10001000, 0x10001001, 0x10001010, 0x10001011, 0x10001100, 0x10001101, 0x10001110, 0x10001111, // 8_
            0x10010000, 0x10010001, 0x10010010, 0x10010011, 0x10010100, 0x10010101, 0x10010110, 0x10010111, 0x10011000, 0x10011001, 0x10011010, 0x10011011, 0x10011100, 0x10011101, 0x10011110, 0x10011111, // 9_
            0x10100000, 0x10100001, 0x10100010, 0x10100011, 0x10100100, 0x10100101, 0x10100110, 0x10100111, 0x10101000, 0x10101001, 0x10101010, 0x10101011, 0x10101100, 0x10101101, 0x10101110, 0x10101111, // A_
            0x10110000, 0x10110001, 0x10110010, 0x10110011, 0x10110100, 0x10110101, 0x10110110, 0x10110111, 0x10111000, 0x10111001, 0x10111010, 0x10111011, 0x10111100, 0x10111101, 0x10111110, 0x10111111, // B_
            0x11000000, 0x11000001, 0x11000010, 0x11000011, 0x11000100, 0x11000101, 0x11000110, 0x11000111, 0x11001000, 0x11001001, 0x11001010, 0x11001011, 0x11001100, 0x11001101, 0x11001110, 0x11001111, // C_
            0x11010000, 0x11010001, 0x11010010, 0x11010011, 0x11010100, 0x11010101, 0x11010110, 0x11010111, 0x11011000, 0x11011001, 0x11011010, 0x11011011, 0x11011100, 0x11011101, 0x11011110, 0x11011111, // D_
            0x11100000, 0x11100001, 0x11100010, 0x11100011, 0x11100100, 0x11100101, 0x11100110, 0x11100111, 0x11101000, 0x11101001, 0x11101010, 0x11101011, 0x11101100, 0x11101101, 0x11101110, 0x11101111, // E_
            0x11110000, 0x11110001, 0x11110010, 0x11110011, 0x11110100, 0x11110101, 0x11110110, 0x11110111, 0x11111000, 0x11111001, 0x11111010, 0x11111011, 0x11111100, 0x11111101, 0x11111110, 0x11111111  // F_
        };

        // Note: this relies on the compiler placing bit0 in the lowest nibble
        // and on a little-endian layout; both are implementation-defined.
        unsigned *const countLo = (unsigned*)&bitcount;
        unsigned *const countHi = (unsigned*)&bitcount + 1;

        // The high byte is masked with 0xFF so that a negative short
        // cannot produce a negative table index.
        *countLo = expand[*invals & 0xFF]; *countHi = expand[(*invals++ >> 8) & 0xFF];

        // Deliberate fall-through: case N adds the remaining N - 1 values.
        switch (incount) {
        case 0xF: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0xE: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0xD: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0xC: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0xB: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0xA: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x9: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x8: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x7: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x6: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x5: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x4: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x3: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals++ >> 8) & 0xFF];
        case 0x2: *countLo += expand[*invals & 0xFF]; *countHi += expand[(*invals >> 8) & 0xFF];
        }
    }

EDIT: You can skip the table lookup above on a 64-bit system if you widen each bit counter from 4 to 8 bits, by taking advantage of multiplication.

This is the property that interests us (a, b, c and d are each 0 or 1, and n is a binary number):
n * (a * 2 ^ 3 + b * 2 ^ 2 + c * 2 ^ 1 + d * 2 ^ 0) <=> ((a * n) << 3) + ((b * n) << 2) + ((c * n) << 1) + ((d * n) << 0)

So, if we widen the byte to a 64-bit integer and choose the multiplier carefully, we can get 0 or 1 in the lowest bit of each byte of the product. Here is the multiplier:

00000000 00000010 00000100 00001000 00010000 00100000 01000000 10000001
or 0x0002040810204081

So, we can expand the byte to 64 bits as follows:

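    unsigned char b = ...
    // This operation can be used in place of the look-up table above
    // (if the code is written for 8-bit-wide counts instead of 4-bit-wide counts).
    // Bit 7 is placed separately: multiplying all 8 bits would let bit 7 of one
    // shifted copy collide with bit 0 of the next and carry into its neighbour.
    unsigned __int64 valx =
        (((unsigned __int64)(b & 0x7F) * 0x0002040810204081) & 0x0101010101010101)
        | ((unsigned __int64)(b & 0x80) << 49);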
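
A portable sketch of this expansion, with an exhaustive self-check, might look like the following. The names `expand_byte` and `check_expand_byte` are invented for illustration. One caution worth stating: the copies of the byte that the multiplier creates are 7 bits apart, so with a full 8-bit input, bit 7 of one copy lands on bit 0 of the next and can produce a carry (0x81 is the smallest input that shows it). The sketch therefore sends only bits 0 to 6 through the multiplication and places bit 7 with a separate shift.

```cpp
#include <cassert>
#include <cstdint>

// Spread the 8 bits of b over 8 bytes: byte i of the result is 1 if
// bit i of b is set, else 0 (byte 0 is the least significant byte).
std::uint64_t expand_byte(std::uint8_t b) {
    // Bits 0..6: seven-bit copies spaced 7 bits apart never overlap.
    std::uint64_t low7 =
        (static_cast<std::uint64_t>(b & 0x7Fu) * 0x0002040810204081ULL)
        & 0x0101010101010101ULL;
    // Bit 7 goes straight to bit 56, the low bit of the top byte.
    std::uint64_t top = static_cast<std::uint64_t>(b & 0x80u) << 49;
    return low7 | top;
}

// Compare every byte value against the obvious per-bit definition.
void check_expand_byte() {
    for (unsigned b = 0; b < 256; ++b)
        for (unsigned i = 0; i < 8; ++i)
            assert(((expand_byte(static_cast<std::uint8_t>(b)) >> (8 * i)) & 0xFFu)
                   == ((b >> i) & 1u));
}
```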

Then we can read out the bit counts like this on little-endian systems:

    union ResultType {
        unsigned __int64 result;
        unsigned char bitcount[8];  // bitcount[x] is the number of times bit x (counting from the least significant bit) was set
    };
    ResultType r;
    r.result = val1 + val2 + val3 + ...;  // up to 255 values can be summed before we risk overflow
    r.bitcount[2]  // how many times the 00000100 bit was set
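
For completeness, here is one way the whole pipeline could be tied back to the original question: expand each byte, add the expanded values, and pick the position with the smallest count. This is a sketch under a few assumptions: the helper `spread` and the function `least_set_bit` are hypothetical names, counts are read with shifts instead of a union so that byte order does not matter, bit 7 is placed separately to avoid a carry in the multiplication, and ties go to the lowest bit index.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One 0/1 flag per byte: byte i of the result mirrors bit i of b.
static std::uint64_t spread(std::uint8_t b) {
    return ((static_cast<std::uint64_t>(b & 0x7Fu) * 0x0002040810204081ULL)
            & 0x0101010101010101ULL)
           | (static_cast<std::uint64_t>(b & 0x80u) << 49);
}

// Index (0 = least significant) of the bit set in the fewest inputs.
unsigned least_set_bit(const std::vector<std::uint8_t>& maps) {
    assert(maps.size() <= 255);  // each 8-bit counter holds at most 255
    std::uint64_t sum = 0;
    for (std::uint8_t m : maps) sum += spread(m);
    unsigned best = 0, best_count = 256;
    for (unsigned i = 0; i < 8; ++i) {
        unsigned count = static_cast<unsigned>((sum >> (8 * i)) & 0xFFu);
        if (count < best_count) { best_count = count; best = i; }
    }
    return best;
}
```

The final selection loop still visits all eight counters, so the saving is in the accumulation step (one multiply, mask and add per input byte), not in the search for the minimum.
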
0

Source: https://habr.com/ru/post/889799/

