How to count leading zeros in a 32-bit unsigned integer

Can someone please tell me what is the best algorithm for counting the number of leading zeros in a 32-bit unsigned integer in C programming?

+6
4 answers

This discussion assumes that your compiler either does not support the operation or does not generate good enough assembly for it. Note that both are unlikely nowadays, so I would recommend simply using __builtin_clz on gcc, or the equivalent for your compiler.
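
For reference, a minimal sketch of wrapping the builtin. The function name clz32_builtin is hypothetical, and the sketch assumes gcc or clang and a 32-bit unsigned int. The explicit zero check is there because the result of __builtin_clz is undefined for an argument of 0.

```c
#include <stdint.h>

/* Hypothetical wrapper: leading-zero count of a 32-bit value, 32 for zero.
   Assumes gcc/clang and that unsigned int is 32 bits wide. */
int clz32_builtin(uint32_t x)
{
    if (x == 0)
        return 32;              /* __builtin_clz(0) is undefined */
    return __builtin_clz(x);
}
```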

Note that determining which clz algorithm is "best" is something only you can do. Modern processors are complex beasts, and the performance of these algorithms will depend heavily on the platform you run them on, the data you throw at them, and the code that uses them. The only way to be sure is to measure, measure, and measure some more. If you can't tell the difference, then you are probably not looking at your bottleneck, and your time would be better spent elsewhere.

Now that the boring disclaimers are out of the way, let's look at what Hacker's Delight has to say about the problem. A quick survey shows that all the algorithms rely on a binary search of some description. Here is a simple example:

    int nlz(unsigned x)
    {
        int n = 32;
        unsigned y;

        /* At each step, keep the upper half if it contains a set bit. */
        y = x >> 16; if (y != 0) { n = n - 16; x = y; }
        y = x >>  8; if (y != 0) { n = n -  8; x = y; }
        y = x >>  4; if (y != 0) { n = n -  4; x = y; }
        y = x >>  2; if (y != 0) { n = n -  2; x = y; }
        y = x >>  1; if (y != 0) return n - 2;
        return n - x;
    }

Note that this works on 32-bit ints and can, if necessary, be converted into an iterative version. Unfortunately, this solution offers little instruction-level parallelism and has quite a few branches, which does not make for a very good bit-twiddling algorithm. A branch-free version of the above exists, but it is much more verbose, so I will not reproduce it here.
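
As an illustration of the iterative form mentioned above, here is one possible sketch. The name clz32_loop is hypothetical and this is not the book's code: it simply folds the five halving steps of the binary search into a loop.

```c
#include <stdint.h>

/* Hypothetical loop form of the binary search; returns 32 for x == 0. */
int clz32_loop(uint32_t x)
{
    int n = 32;
    int shift;

    for (shift = 16; shift > 0; shift >>= 1) {
        uint32_t y = x >> shift;
        if (y != 0) {           /* a set bit lives in the upper part */
            n -= shift;
            x = y;
        }
    }
    /* x is now 0 or 1, matching the final "return n - x" step. */
    return n - (int)x;
}
```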

So, let's improve on that solution using the pop instruction (which counts the number of set bits):

    x = x | (x >>  1);
    x = x | (x >>  2);
    x = x | (x >>  4);
    x = x | (x >>  8);
    x = x | (x >> 16);
    return pop(~x);

So how does it work? The key is the pop(~x) at the end, which counts the number of zeros in x. For that count to be meaningful, we first need to get rid of all the zeros that are not leading zeros. We do this by propagating the 1s to the right using a binary algorithm. While we still lack instruction-level parallelism, we did get rid of all the branches, and it uses fewer cycles than the previous solution. Much better.

So what about this pop instruction, isn't that cheating? Most architectures have a 1-cycle pop instruction, which can be accessed through compiler builtins (for example, gcc's __builtin_popcount). Otherwise, there are table-based solutions, but care should be taken when trading cycles for cache accesses, even if the table fits entirely in the L1 cache.
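
To make the snippet above self-contained, here is a sketch that pairs it with a portable bit-parallel population count. The names pop32 and clz32_pop are hypothetical. On gcc or clang, pop32 could be replaced by __builtin_popcount.

```c
#include <stdint.h>

/* Hypothetical portable pop: counts the set bits of a 32-bit word. */
int pop32(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (int)((uint32_t)(x * 0x01010101u) >> 24);  /* add the four bytes */
}

/* Propagate the highest set bit to the right, then count the remaining zeros. */
int clz32_pop(uint32_t x)
{
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return pop32(~x);
}
```

For example, 0x00010000 becomes 0x0001FFFF after the propagation step, its complement has 15 bits set, and 15 is the number of leading zeros.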

Finally, as is usual for Hacker's Delight, we start wandering into strange territory. Let's count some leading zeros using floating-point numbers:

    union { unsigned asInt[2]; double asDouble; } u;  /* named so that C can reach the members */
    u.asDouble = (double)k + 0.5;
    return 1054 - (u.asInt[LE] >> 20);

First, a small warning: DO NOT USE THIS ALGORITHM. It triggers undefined behavior as far as the standard is concerned. It is reproduced here for the fun factor more than for any practical application. Use at your own risk.

Now that the disclaimer is out of the way, how does it work? First, it converts the int to a double, then it extracts the exponent field of that double. Neat stuff. The LE constant must be 1 on a little-endian machine and 0 on a big-endian machine.
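
The same idea can be sketched without the union or the LE constant by copying the bits out with memcpy. This swaps in a different extraction technique and is not the book's code. The name clz32_double is hypothetical, and the sketch assumes that double is an IEEE 754 binary64 value stored with the same byte order as uint64_t. The constant 1054 is the exponent bias 1023 plus 31, and adding 0.5 makes an input of zero land on exponent 1022, which yields 32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant: read the biased exponent of (double)k + 0.5.
   Assumes IEEE 754 binary64 doubles. */
int clz32_double(uint32_t k)
{
    double d = (double)k + 0.5;   /* exact: fits in the 53-bit significand */
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);
    /* d is positive, so the top 12 bits are just the biased exponent. */
    return 1054 - (int)(bits >> 52);
}
```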

This should give you a brief overview of the various bit-twiddling algorithms for this problem. Note that the book has several variations that make different trade-offs, but I will let you discover them yourself.

+13

This is probably the best way to do this in pure C:

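    #include <stdint.h>

    int clz(uint32_t x)
    {
        static const char debruijn32[32] = {
            0, 31, 9, 30, 3, 8, 13, 29, 2, 5, 7, 21, 12, 24, 28, 19,
            1, 10, 4, 14, 6, 22, 25, 20, 11, 15, 23, 26, 16, 27, 17, 18
        };
        /* Propagate the highest set bit to the right... */
        x |= x >> 1;
        x |= x >> 2;
        x |= x >> 4;
        x |= x >> 8;
        x |= x >> 16;
        /* ...then round up to a power of two and hash it into the table. */
        x++;
        return debruijn32[x * 0x076be629 >> 27];
    }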

One limitation: as written, it does not support an input of zero (where the result should be 32). If all your inputs are less than 0x80000000, you can support zero at no extra cost by changing the first value in the table to 32. Otherwise, just add a line at the beginning:

  if (!x) return 32; 
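
Putting the two pieces together, here is a sketch of the zero-safe variant with a small self-check. The names clz_debruijn and count_mismatches are hypothetical. The check relies on the fact that any value whose highest set bit is bit k has 31 - k leading zeros.

```c
#include <stdint.h>

/* The function above with the zero check added (hypothetical name). */
int clz_debruijn(uint32_t x)
{
    static const char debruijn32[32] = {
        0, 31, 9, 30, 3, 8, 13, 29, 2, 5, 7, 21, 12, 24, 28, 19,
        1, 10, 4, 14, 6, 22, 25, 20, 11, 15, 23, 26, 16, 27, 17, 18
    };
    if (!x) return 32;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    x++;                         /* power of two, or 0 when bit 31 was set */
    return debruijn32[(uint32_t)(x * 0x076be629u) >> 27];
}

/* Counts wrong answers over two inputs per bit position. */
int count_mismatches(void)
{
    int bad = 0;
    int k;

    for (k = 0; k < 32; k++) {
        uint32_t lo = (uint32_t)1 << k;   /* only bit k set */
        uint32_t hi = lo | (lo - 1);      /* bit k and every bit below it */
        bad += clz_debruijn(lo) != 31 - k;
        bad += clz_debruijn(hi) != 31 - k;
    }
    return bad;
}
```
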
+7

Let me instead count the number of digits that are not leading zeros and call it n. After that, the answer is just (32 - n). First, if the number is zero, n is zero. Otherwise:

 n = 1 + floor(log2(x)) 

That is, we use the base-two logarithm to find the position of the most significant non-zero bit. We can do this efficiently on x86 using the FYL2X instruction, which computes log2.
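
As a quick sanity check of the formula, here are two worked values. The inputs are picked arbitrarily for illustration.

```latex
\begin{align*}
x = 2^{16} &: \quad n = 1 + \lfloor \log_2 2^{16} \rfloor = 1 + 16 = 17, & 32 - n &= 15 \\
x = 255    &: \quad n = 1 + \lfloor \log_2 255 \rfloor = 1 + 7 = 8,     & 32 - n &= 24
\end{align*}
```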

But now that we are talking about x86 instructions, we may as well look at what is actually available. See http://en.wikipedia.org/wiki/Find_first_set: there are many instructions that do directly what you want, provided you are willing to write assembly or to trust that your optimizing compiler will generate these instructions for you from carefully written C code.

-1

One solution would be (in Obj-c):

    // Assuming your 32-bit unsigned integer is in i
    NSInteger nrLeadingZeroes = 0;
    while (i >= 0) {
        i = i << 1;
        nrLeadingZeroes++;
    }

EDIT: (see comment below):

    // Assuming your 32-bit unsigned integer is in j
    int i = (int)j;
    int nrLeadingZeroes = 0;
    // The bound of 32 keeps the loop from running forever when j is zero
    while (i >= 0 && nrLeadingZeroes < 32) {
        i = i << 1;
        nrLeadingZeroes++;
    }
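
For comparison, the same shift-and-count idea can be sketched in plain C by testing the top bit of an unsigned value instead of relying on the sign of an int. The name clz32_shift is hypothetical.

```c
#include <stdint.h>

/* Hypothetical C form of the loop: shift left until bit 31 is set. */
int clz32_shift(uint32_t x)
{
    int n = 0;

    if (x == 0)
        return 32;               /* no set bit, so the loop would never stop */
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}
```
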
-1

Source: https://habr.com/ru/post/969833/

