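Instead of a LUT, you can use the BTS instruction (or BTR to reset a bit). As far as I know, it is not available as an intrinsic (at least not in GCC), so inline assembly is required (which ties the code to x86).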
0F AB /r --- BTS r/m32, r32 --- Store selected bit in CF flag and set.
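With a memory operand, the Bit-String instructions can address bits beyond the dword at the given address. The Intel manual describes it as follows:
Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. In this case, the low-order 3 or 5 bits (3 for 16-bit operands, 5 for 32-bit operands) of the immediate bit offset are stored in the immediate bit offset field, and the high-order bits are shifted and combined with the byte displacement in the addressing mode by the assembler. The processor will ignore the high-order bits if they are not zero.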
When accessing a bit in memory, the processor may access 4 bytes starting from the memory address for a 32-bit operand size, using the following relationship:
Effective Address + (4 * (Bit Offset DIV 32))
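As a rough portable model of that relationship, the memory form of BTS with a register offset behaves roughly like the C below. This is only a sketch: the helper name `model_bts` and the `uint32_t` array view are illustrative assumptions, and it ignores the CF result and negative bit offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: models BTS on a memory operand with a
 * non-negative register bit offset. `base` plays the role of the
 * effective address, viewed as an array of dwords. */
static void model_bts(uint32_t *base, size_t dwords, uint32_t bit)
{
    size_t   dword_index = bit / 32;  /* Bit Offset DIV 32: which dword */
    uint32_t shift       = bit % 32;  /* Bit Offset MOD 32: which bit inside it */

    assert(dword_index < dwords);     /* stay inside the caller's buffer */
    base[dword_index] |= (uint32_t)1 << shift;
}
```

For bit 254 in a 32-byte buffer, this touches the dword at byte offset 4 * (254 DIV 32) = 28 and sets bit 254 MOD 32 = 30 there.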
In pure assembly (Intel/MASM syntax), it looks like this:
.data
align 16
save db 32 dup(0) ; 256 bits = 32 bytes of scratch space for a YMM/__m256i value
bitNumber dd 254 ; bit index as a UINT (here the second-to-last bit)
.code
mov eax, bitNumber
...
lea edx, save
vmovdqu ymmword ptr [edx], ymm0 ; save the __m256i to memory (all 256 bits)
bts dword ptr [edx], eax ; set bit 254 (the 255th bit)
vmovdqu ymm0, ymmword ptr [edx] ; read the __m256i back into the register
...
If the variable is already in memory, it is even simpler.
Using inline assembly, this results in the following functions:
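#include <immintrin.h>
#include <stdint.h>

/* Set bit `bit` (0..255) of *value. The offset is kept in a register:
   an immediate offset would be reduced modulo 32 by the processor. */
static inline
void set_m256i_bit(__m256i * value, uint32_t bit)
{
    __asm__ ("btsl %[bit], %[memval]\n\t"
             : [memval] "+m" (*value) : [bit] "r" (bit) : "cc");
}
/* Clear bit `bit` (0..255) of *value. */
static inline
void clear_m256i_bit(__m256i * value, uint32_t bit)
{
    __asm__ ("btrl %[bit], %[memval]\n\t"
             : [memval] "+m" (*value) : [bit] "r" (bit) : "cc");
}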
They compile to what you would expect in the Godbolt compiler explorer.
And some test code, similar to the assembly above:
__m256i value = _mm256_set_epi32(0,0,0,0,0,0,0,0);
set_m256i_bit(&value,254);
clear_m256i_bit(&value,254);
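To check where the bit actually lands without needing AVX, the same instruction can be run on a plain 32-byte buffer. This is a minimal sketch assuming GCC or Clang on x86. The names `bits256`, `set_bit_in_buffer` and `bit_254_lands_in_last_dword` are hypothetical, and the non-x86 branch is a portable stand-in, not the BTS instruction.

```c
#include <stdint.h>
#include <string.h>

/* 32 bytes, the same size as an __m256i (illustrative type). */
typedef struct { uint32_t dword[8]; } bits256;

static void set_bit_in_buffer(bits256 *value, uint32_t bit)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Register offset only: an immediate would be reduced modulo 32. */
    __asm__ ("btsl %[bit], %[memval]"
             : [memval] "+m" (*value)
             : [bit] "r" (bit)
             : "cc");
#else
    /* Portable stand-in for other targets (assumes bit < 256). */
    value->dword[bit / 32] |= (uint32_t)1 << (bit % 32);
#endif
}

/* Returns 1 if bit 254 ends up as bit 30 of the last dword
 * and every other dword is still zero. */
static int bit_254_lands_in_last_dword(void)
{
    bits256 v;
    memset(&v, 0, sizeof v);
    set_bit_in_buffer(&v, 254);
    for (int i = 0; i < 7; i++)
        if (v.dword[i] != 0)
            return 0;
    return v.dword[7] == 0x40000000u;
}
```

On a little-endian x86 target, bit 254 is bit 30 of the eighth dword, which is what the check looks for.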