Efficient bitwise operations for counting bits or finding the rightmost | leftmost set bit

Given an unsigned int, I have to do the following operations:

  • Count the number of bits set to 1
  • Find the index of the leftmost 1 bit
  • Find the index of the rightmost 1 bit

(The operations should not be architecture-dependent.)

I did this with bit shifting, but I have to iterate through almost all the bits (e.g., 32). For example, counting the 1 bits:

    unsigned int number = ...;
    unsigned int count = 0;
    while (number != 0) {
        if ((number & 0x01) != 0)
            ++count;
        number >>= 1;
    }

Other operations are similar.
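
For reference, the shift-based versions of the other two operations might look roughly like this. It is only a sketch: the function names are made up for illustration, and the convention (index 0 is the least significant bit, -1 means no bit is set) is an assumption, since the question does not state one.

```c
/* Hypothetical helper: 0-based index of the rightmost (lowest) 1 bit,
   or -1 if x has no bits set. */
int rightmost_one(unsigned int x)
{
    int index = 0;
    if (x == 0)
        return -1;
    while ((x & 1u) == 0) {   /* shift until the lowest bit is a 1 */
        x >>= 1;
        ++index;
    }
    return index;
}

/* Hypothetical helper: 0-based index of the leftmost (highest) 1 bit,
   or -1 if x has no bits set. */
int leftmost_one(unsigned int x)
{
    int index = -1;
    while (x != 0) {          /* one iteration per bit up to the highest 1 */
        x >>= 1;
        ++index;
    }
    return index;
}
```

Both loops depend only on shifts and masks, so they are portable, but in the worst case they still visit every bit of the word.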

So my question is: is there a faster way to do this?

+3
4 answers

If you need the fastest way, you will need to use non-portable methods.

Windows / MSVC:

GCC:

These usually map directly to native hardware instructions, so it does not get much faster than this.

But since there are no standard C/C++ functions for them, they are only available through compiler intrinsics.
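
The intrinsic names are not listed above, so the following is only a sketch of what such wrappers could look like. It assumes the commonly documented intrinsics: `__popcnt`, `_BitScanForward` and `_BitScanReverse` from `<intrin.h>` on MSVC, and `__builtin_popcount`, `__builtin_ctz` and `__builtin_clz` on GCC (the non-MSVC branch assumes GCC or a compatible compiler such as Clang). The wrapper names and the "-1 for zero" convention are hypothetical.

```c
#include <limits.h>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Number of 1 bits in x. Note: __popcnt needs a CPU with POPCNT support. */
int count_ones(unsigned int x)
{
#if defined(_MSC_VER)
    return (int)__popcnt(x);
#else
    return __builtin_popcount(x);
#endif
}

/* 0-based index of the lowest 1 bit, or -1 if x == 0. */
int lowest_one(unsigned int x)
{
#if defined(_MSC_VER)
    unsigned long index;
    return _BitScanForward(&index, x) ? (int)index : -1;
#else
    /* __builtin_ctz is undefined for 0, so guard it. */
    return x ? __builtin_ctz(x) : -1;
#endif
}

/* 0-based index of the highest 1 bit, or -1 if x == 0. */
int highest_one(unsigned int x)
{
#if defined(_MSC_VER)
    unsigned long index;
    return _BitScanReverse(&index, x) ? (int)index : -1;
#else
    /* __builtin_clz counts leading zeros and is undefined for 0. */
    return x ? (int)(sizeof(unsigned int) * CHAR_BIT) - 1 - __builtin_clz(x)
             : -1;
#endif
}
```

Wrapping the intrinsics this way keeps the non-portable part in one place; the zero checks matter because the GCC builtins leave the result undefined for an input of 0.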

+8

Take a look at ffs(3), ffsl(3), fls(3), flsl(3).

The ffs() and ffsl() functions find the first bit set (starting with the least significant bit) in i and return the index of that bit.

The fls() and flsl() functions find the last bit set (the most significant one) in i and return the index of that bit.

You might also be interested in bitstring(3).
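
A minimal sketch of how this could be used. ffs() is declared in `<strings.h>` on POSIX systems and returns a 1-based index, or 0 when no bit is set. fls() comes from the BSDs and is not provided by every C library (glibc, for example, does not ship it), so the `last_set` function below is a hypothetical shift-based stand-in that follows the same 1-based convention, not the library function itself.

```c
#include <strings.h>   /* POSIX: ffs() */

/* 1-based index of the least significant set bit, 0 if x == 0. */
int first_set(unsigned int x)
{
    return ffs((int)x);
}

/* Hypothetical fallback with fls()-style semantics:
   1-based index of the most significant set bit, 0 if x == 0. */
int last_set(unsigned int x)
{
    int index = 0;
    while (x != 0) {
        x >>= 1;
        ++index;
    }
    return index;
}
```

For example, `first_set(8)` and `last_set(8)` both give 4, because bit indices here start at 1 rather than 0.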

+8

Quote from http://graphics.stanford.edu/~seander/bithacks.html

The best way to count bits in a 32-bit integer v is as follows:

    unsigned int v; // count bits set in this (32-bit value)
    unsigned int c; // store the total here
    v = v - ((v >> 1) & 0x55555555);                    // reuse input as temporary
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333);     // temp
    c = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; // count

The best bit-counting method takes only 12 operations, which is the same as the lookup-table method, but it avoids the memory use and potential cache misses of a table. It is a hybrid between the purely parallel method above and the earlier methods using multiplications (in the section on counting bits with 64-bit instructions), although it does not use 64-bit instructions. The bits set in each byte are counted in parallel, and the total across the bytes is computed by multiplying by 0x1010101 and shifting right by 24 bits.
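
Wrapped into a function, the quoted method might look like the sketch below. The name `popcount32` and the use of `uint32_t` are assumptions made here so that the 32-bit width is explicit; the arithmetic is the same as in the quoted snippet, with the implicit operator precedence written out in parentheses.

```c
#include <stdint.h>

/* Count the 1 bits of a 32-bit value using the parallel (SWAR) method. */
uint32_t popcount32(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 /* 2-bit sums  */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* 4-bit sums  */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums  */
    /* The multiply adds the four byte sums into the top byte;
       the cast keeps the product at 32 bits before shifting. */
    return (uint32_t)(v * 0x01010101u) >> 24;
}
```

For instance, `popcount32(0xFFFFFFFFu)` gives 32: after the third step every byte holds 8, and the multiplication accumulates 8 + 8 + 8 + 8 in the top byte.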

+4

One approach is to use a lookup table.

    uint8_t popcount_table[256] = { ... };

    uint8_t popcount (uint32_t x)
    {
        uint8_t *p = (uint8_t*)&x;
        return popcount_table[p[0]] + popcount_table[p[1]] +
               popcount_table[p[2]] + popcount_table[p[3]];
    }
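
The table contents are elided above. One possible way to get them, shown only as a sketch, is to fill the table once at startup. The names `popcount_table_init` and `popcount_lut` are hypothetical, and this variant extracts the bytes with shifts and masks instead of the pointer cast used above.

```c
#include <stdint.h>

static uint8_t popcount_table[256];

/* Fill the table once: the bit count of i is the bit count of i / 2
   plus the lowest bit of i. */
void popcount_table_init(void)
{
    int i;
    popcount_table[0] = 0;
    for (i = 1; i < 256; ++i)
        popcount_table[i] = (uint8_t)(popcount_table[i >> 1] + (i & 1));
}

/* Sum the table entries for the four bytes of x.
   popcount_table_init() must have been called first. */
uint8_t popcount_lut(uint32_t x)
{
    return (uint8_t)(popcount_table[x & 0xFFu] +
                     popcount_table[(x >> 8) & 0xFFu] +
                     popcount_table[(x >> 16) & 0xFFu] +
                     popcount_table[x >> 24]);
}
```

Call `popcount_table_init()` once before the first lookup; after that each count costs four table reads and three additions.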
+2

Source: https://habr.com/ru/post/1447269/

