Compute sqrt (SIZE_MAX + 1) using only integer constant expressions serving strange ABIs

The OpenBSD C library has the reallocarray (3) extension, which realloc(array, size*nmemb) will not explode if the overflow overflows. The implementation contains this snippet:

 /* * This is sqrt(SIZE_MAX+1), as s1*s2 <= SIZE_MAX * if both s1 < MUL_NO_OVERFLOW and s2 < MUL_NO_OVERFLOW */ #define MUL_NO_OVERFLOW (1UL << (sizeof(size_t) * 4)) 

Over on Programmers.SE A modified version of this calculation has been tinted for technical inaccuracy. 4 should obviously be CHAR_BIT/2 , but this is not the only problem. Assume an unusual ABI in which size_t has padding bits. (This is not ridiculously implausible: consider a microcontroller with 32-bit registers, but a 24-bit address space.) Then SIZE_MAX less than 1 << (sizeof(size_t)*CHAR_BIT) [in infinite precision arithmetic], and the calculation is incorrect.

So the question is: can you calculate floor(sqrt(SIZE_MAX+1)) only using arithmetic with integer constants C99, without making any assumptions about ABI, other than what C99 requires, plus what you can learn from <limits.h> ? Note that SIZE_MAX may be equal to UINTMAX_MAX , i.e. There can be no type that SIZE_MAX+1 can represent without overflow.

EDIT: I think SIZE_MAX should be 2 n & minus; 1 for some integer n, but it does not have to be 2 2n 1 - consider S / 390, one of which ABI has a 31-bit address space. Therefore: if sqrt(SIZE_MAX+1) not an integer, the desired result (given how this constant is used) is equal to floor() true value.

+6
source share
3 answers

The constant SIZE_MAX non-negative and has type size_t . For brevity, I will define:

  #define S SIZE_MAX 

The mathematical value of S+1 is, or may be, as you indicated, out of range for any integer type.
I will write S1 for the mathematical value of S+1 .
If we consider the logarithm (in base 2, if you want) of S1 , then we have:

  logarithm(sqrt(S1)) == (1.0/2.0) logarithm(S1) 

On the other hand, in almost any real situation, we will have that S is represented as a binary number having only bits 1 . The number b this bit, in the general case, is the number CHAR_BIT times the power of two times CHAR_BIT: 16, 32, 64, 128 ... We denote the index of this power by p . So for CHAR_BIT == 8 we have:

 16 == CHAR_BIT * 2 ----> p == 1 32 == CHAR_BIT * 4 ----> p == 2 64 == CHAR_BIT * 8 ----> p == 3 

Now we have:

 logarithm(S1) == b == CHAR_BIT * (2 ** p) (I am denoting with ** to the "power math. operator"). logarithm(sqrt(S1)) == logaritm(S1) / 2.0 == CHAR_BIT * (2 ** p) / 2.0 == CHAR_BIT * (2 ** (p - 1)) 

Assuming or knowing that each bit in size_t used only to represent bits of an integer, we have this equality with some (unknown) p value:

 sizeof(size_t) == b == CHAR_BIT * (2 ** p) 

We can assume that , for the current state in 2014, that the value p <= 5 , let's say (you can increase this magic number 5 to higher values โ€‹โ€‹in the future),

Now consider the following expression, designed to "search and find" the value of b under the assumption that p <= 5 :

 #define S_1 ((size_t)1ULL) #define b (sizeof(size_t)) #define bitexpr(p) ((size_t)(CHAR_BIT * (S_1 << (p)))) #define expr(p) ((size_t) (S_1 << (p))) #define exp2_expr_1(p) ((size_t)(S_1 << bitexpr(p-1))) // SRSM() stands for: Square Root SizeMax #define SRSM ( \ (expr(1)==b)? exp2_expr_1(1) : \ (expr(2)==b)? exp2_expr_1(2) : \ (expr(3)==b)? exp2_expr_1(3) : \ (expr(4)==b)? exp2_expr_1(4) : \ (expr(5)==b)? exp2_expr_1(5) : \ (size_t)0 /* Error! */ \ ) /* end-of-macro*/ 

The SRSM macro actually brings the square root of S+1 , but I suppose you can figure out what to do with this number.

The important thing is that the square root of SIZE_MAX can be obtained using purely integer constant expressions .

If you want, the "magic" number 5 can be changed by another.

A more general approach, designed to solve an arbitrary situation, on any possible machine that meets the standard, would be more complex. The method used in this message is independent of the value having CHAR_BIT , but it uses the number of bytes to be a value of 2.

EDITED: I changed the โ€œsearchโ€ method a bit, starting from 1, and then growing to avoid possible โ€œfalseโ€ matches with the << operator and large numbers (no one ever knows ...). Now, the first match is probably correct.

+1
source

An alternative to sqrt(SIZE_MAX+1) is to look at a higher level goal shown here and below.

 /* s1*s2 <= SIZE_MAX if s1 < K and s2 < K, where K = sqrt(SIZE_MAX+1) */ const size_t MUL_NO_OVERFLOW = ((size_t)1) << (sizeof(size_t) * 4); if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) && nmemb > 0 && SIZE_MAX / nmemb < size) abort(); 

A simple overflow detection might use the following. This, unfortunately, performs the separation - an expansive test, but otherwise "works." No MUL_NO_OVERFLOW .

 if (nmemb > 0 && (SIZE_MAX / nmemb < size)) abort(); 

Instead of dividing, the OP code makes 2 simple comparisons with MUL_NO_OVERFLOW to eliminate most rivals.

 if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) && ... 

However, precalculating MUL_NO_OVERFLOW is complicated, so this post.

An alternative is to use a simple MUL_NO_OVERFLOW_ALT calculation to eliminate many size, nmemb from the split test.

 // good for up to SIZE_MAX + 1 == 2**128 #define SQN(x) ( !!(x) ) #define SQ0(x) ( (x) >= 0x2u ? 1 + SQN((x)/0x2u) : SQN(x) ) #define SQ1(x) ( (x) >= 0x4u ? 2 + SQ0((x)/0x4u) : SQ0(x) ) #define SQ2(x) ( (x) >= 0x10u ? 4 + SQ1((x)/0x10u) : SQ1(x) ) #define SQ3(x) ( (x) >= 0x100u ? 8 + SQ2((x)/0x100u) : SQ2(x) ) #define SQ4(x) ( (x) >= 0x10000u ? 16 + SQ3((x)/0x10000u) : SQ3(x) ) #define SQ5(x) ( (x) >= 0x100000000u ? 32 + SQ4((x)/0x100000000u) : SQ4(x) ) #define SQ6(x) ( (x)/0x100000000u >= 0x100000000u ? 64 + SQ5((x)/0x1000000000000000u/16) : SQ5(x) ) #define MUL_NO_OVERFLOW_ALT (((size_t)1) << (SQ6(SIZE_MAX)/2)) printf("%zx %zx\n", SIZE_MAX, MUL_NO_OVERFLOW_ALT); // Output // ffffffff 10000 

In the case of OP, even if the best (size_t)sqrt(SIZE_MAX + 1.0) was possible at compile time, a separation test was still needed to pass the legitimate size, nmemb , for example 4, SIZE_MAX/100 . Therefore, for this task, the best ISQRT(SIZE_MAX + 1.0) not needed, just something nearby and no more.

Using this code, the best value is calculated when SIZE_MAX + 1 is an even power of 2 โ€” the usual case.

+1
source

The following works, but need to be cleaned.

Assumes:
1) SIZE_MAX + 1 - some degree 2, odd or even.
2) unsigned long long range of at least size_t .
3) sizeof (SIZE_MAX) <= 128 bits.

Notes:
1) SIZE_MAX must be at least 65535.
2) Way too much ()
3) More testing required.

Method:
Find the power 2 of SIZE_MAX/4 + 1 through the nested definitions. A different level is required for each double bit. At each bit width level of 32, 64, 128, ... check the upper bits of SIZE_MAX . After SIZE_MAX maximum width of 2 bits, restart it to get a bit with SIZE_MAX .

Enter sqrt(SIZE_MAX + 1) from (1 << Bit_Width(SIZE_MAX/4 + 1)/2)*2 . Multiply by sqrt(2) scaling if the bit width was odd.

 #include <stddef.h> #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <limits.h> int main(void) { // Swap 1 for various powers of 2 like 256, 512. etc. to test #define M SIZE_MAX/1 printf("M %zX\n", (size_t) M); #define N (M/4 + 1) printf("N/4 + 1 %zX\n", (size_t) N); // Obviously this line can be simplified. #define M0(x) ((x & ~0x0U) ? 0 : 0) #define M1(x) ((x & ~0x1U) ? 1 + M0(x >> 2) : M0(x)) #define M2(x) ((x & ~0x3U) ? 2 + M1(x >> 2) : M1(x)) #define M3(x) ((x & ~0xFU) ? 4 + M2(x >> 4) : M2(x)) #define M4(x) ((x & ~0xFFU) ? 8 + M3(x >> 8) : M3(x)) #define M5(x) ((x & ~0xFFFFLLU) ? 16 + M4(x >> 16) : M4(x)) #if (M & ~0xFFFFFFFFLLU) #define M6(x) ((x & ~0xFFFFFFFFLLU) ? 32 + M5(x >> 32) : M5(x)) #if (M & ~0xFFFFFFFFFFFFFFFFLLU) #define MN(x) ((x & ~0xFFFFFFFFFFFFFFFFLLU) ? 64 + M6(x >> 64) : M6(x)) #define MSQ2(odd, x) ((odd&1) ? (x*13043817825332782212llu)>>63 : x) #else #define MN(x) M6(x) #define MSQ2(odd, x) ((odd&1) ? (x*3037000499llu)>>31 : x) #endif #else #define MN(x) M5(x) #define MSQ2(odd, x) ((odd&1) ? (x*46340llu)>>15 : x) #endif printf("MN(N) %d\n", MN(N)); printf("MN(M) %d\n", MN(M)); #define MUL_NO_OVERFLOW ((size_t) MSQ2(MN(N), ((((size_t)1) << (MN(N)/2))*2) )) printf("MUL_NO_OVERFLOW %zu\n", MUL_NO_OVERFLOW); return 0; } 

It looks too complicated - itโ€™s about simplification. GTG

0
source

Source: https://habr.com/ru/post/974690/


All Articles