Why can't I use the higher bytes of a register?
Every variant of an instruction must be encoded in the instruction bytes. The original 8086 processor supports the following options:
    instruction      encoding    remarks
    ---------------------------------------------------------
    mov ax,value     b8 01 00    <-- whole register
    mov al,value     b0 01       <-- lower byte
    mov ah,value     b4 01       <-- upper byte
Since the 8086 is a 16-bit processor, three different versions cover all options.
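To make the opcode pattern concrete, here is a minimal C sketch of how these `mov reg, immediate` forms are put together: opcode `B0+r` for byte registers and `B8+r` for word registers, followed by the immediate in little-endian order. The function and enum names are made up for illustration; they are not part of any assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Register numbers as they appear in the low 3 bits of the opcode byte. */
enum reg16 { AX, CX, DX, BX, SP, BP, SI, DI };   /* used with opcode B8+r */
enum reg8  { AL, CL, DL, BL, AH, CH, DH, BH };   /* used with opcode B0+r */

/* Encode "mov r8, imm8": opcode B0+r, then one immediate byte.
   Returns the number of bytes written (always 2). */
size_t encode_mov_r8_imm8(uint8_t out[2], enum reg8 r, uint8_t imm)
{
    out[0] = (uint8_t)(0xB0 + r);
    out[1] = imm;
    return 2;
}

/* Encode "mov r16, imm16": opcode B8+r, then the immediate, low byte first.
   Returns the number of bytes written (always 3). */
size_t encode_mov_r16_imm16(uint8_t out[3], enum reg16 r, uint16_t imm)
{
    out[0] = (uint8_t)(0xB8 + r);
    out[1] = (uint8_t)(imm & 0xFF);
    out[2] = (uint8_t)(imm >> 8);
    return 3;
}
```

Calling `encode_mov_r16_imm16(buf, AX, 1)` fills `buf` with `b8 01 00`, and `encode_mov_r8_imm8(buf, AH, 1)` gives `b4 01`.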
The 80386 added 32-bit support. The designers had a choice: either add support for 3 additional sets of partial registers (x 8 registers = 24 new registers) and somehow find encodings for them, or leave things mostly the way they were before.
Here is what the designers chose:
    instruction      encoding          remarks
    ---------------------------------------------------------
    mov eax,value    b8 01 00 00 00    (same encoding as mov ax,value!)
    mov ax,value     66 b8 01 00       (prefix 66 + encoding for mov eax,value)
    mov al,value     (same as before)
    mov ah,value     (same as before)
They simply added the 0x66 operand-size prefix to switch the register size from the (default) 32 bits to 16 bits, plus the 0x67 address-size prefix to switch the size used for memory addressing. And they left it at that.
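As a rough illustration of the prefix scheme, the following C sketch encodes `mov reg, immediate` for a 32-bit code segment, where the default operand size is 32 bits. The function name and its parameters are hypothetical, chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode "mov reg, imm" assuming a 32-bit code segment.
   reg  : register number 0..7 (0 = eax/ax)
   bits : operand size, 16 or 32
   Returns the number of bytes written, or 0 if the arguments are invalid. */
size_t encode_mov_imm_386(uint8_t out[6], unsigned reg, unsigned bits,
                          uint32_t imm)
{
    size_t n = 0;

    if (reg > 7 || (bits != 16 && bits != 32))
        return 0;

    if (bits == 16)
        out[n++] = 0x66;                      /* operand-size override prefix */

    out[n++] = (uint8_t)(0xB8 + reg);         /* same opcode for both sizes */

    for (unsigned i = 0; i < bits / 8; i++)   /* little-endian immediate */
        out[n++] = (uint8_t)(imm >> (8 * i));

    return n;
}
```

With `reg = 0` and `imm = 1`, a 32-bit operand yields `b8 01 00 00 00` and a 16-bit operand yields `66 b8 01 00`: the opcode byte is identical and only the prefix and the immediate length differ.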
Doing otherwise would have meant doubling the number of instruction encodings, or adding six new prefixes for your "new" partial registers.
By the time the 80386 came out, all instruction bytes were already taken, so there was no room for new prefixes. This opcode space had been eaten up by useless instructions like AAA, AAD, AAM, AAS, DAA, DAS, SALC. (These were disabled in x64 mode to free up much-needed encoding space.)
If you want to change only the higher bytes of a register, just do:
    movzx eax,cl    //upper 24 bits of eax = 0, al = cl
But why not two (say r8dl and r8dh)?
The original 8086 had 8 byte-sized registers:
al,cl,dl,bl,ah,ch,dh,bh <-- in this order.
The index registers, the base pointer, and the stack pointer did not have byte registers.
In x64, this was changed. If there is a REX prefix (which denotes the x64 registers), then the al..bh encodings (8 regs) encode al..r15l instead: 16 regs, using 1 extra encoding bit from the REX prefix. This adds spl, dil, sil, bpl, but excludes every xh register. (You can still get the four xh registers when you do not use a REX prefix.)
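The decoding rule can be sketched as a small lookup in C. This is only an illustration: `byte_reg_name` and its parameters are names invented here, and the registers written r8l..r15l above are spelled r8b..r15b in the table, which is the other common spelling.

```c
#include <stdbool.h>

/* Name the 8-bit register selected by a 3-bit register field in 64-bit mode.
   field   : the 3-bit register number from the instruction (0..7)
   has_rex : true if the instruction carries a REX prefix
   rex_ext : the REX extension bit that applies to this field (0 or 1);
             ignored when has_rex is false */
const char *byte_reg_name(unsigned field, bool has_rex, unsigned rex_ext)
{
    static const char *const legacy[8] = {
        "al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"
    };
    static const char *const with_rex[16] = {
        "al",  "cl",  "dl",   "bl",   "spl",  "bpl",  "sil",  "dil",
        "r8b", "r9b", "r10b", "r11b", "r12b", "r13b", "r14b", "r15b"
    };

    field &= 7;
    if (!has_rex)
        return legacy[field];                    /* encodings 4..7 are ah..bh */
    return with_rex[((rex_ext & 1) << 3) | field];  /* 4..7 become spl..dil */
}
```

Field value 4 names `ah` without a REX prefix and `spl` with one, which is why an instruction that needs a REX prefix cannot also address `ah`, `ch`, `dh`, or `bh`.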
And using r8b makes the full r8 busy?
Yes, this is called a "partial register write". Because a write to r8b changes part, but not all, of r8, r8 is now split into two parts: one part has changed and the other has not. The CPU needs to join the two parts. It can either do this by spending an extra CPU cycle on the job, or by adding more circuitry so that it can be done in a single cycle.
The latter is expensive in terms of silicon and complex in terms of design; it also adds extra heat because of the extra work (more work per cycle = more heat). See Why GCC Does Not Use Partial Registers? for a rundown of how different x86 CPUs handle partial-register writes (and later reads of the full register).
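The architectural effect of such writes can be modelled in a few lines of C. This is only a sketch of the visible result, not of how any CPU implements it, and the function names are invented for the example. The point about 32-bit writes zeroing the upper half is not stated above; it is standard x86-64 behaviour and is included for contrast.

```c
#include <stdint.h>

/* Write the low 8 bits (e.g. mov r8b, x): the upper 56 bits are kept,
   so the new value depends on the old contents of the register. */
uint64_t write_low8(uint64_t old, uint8_t value)
{
    return (old & ~UINT64_C(0xFF)) | value;
}

/* Write the low 16 bits (e.g. mov r8w, x): the upper 48 bits are kept. */
uint64_t write_low16(uint64_t old, uint16_t value)
{
    return (old & ~UINT64_C(0xFFFF)) | value;
}

/* Write the low 32 bits (e.g. mov r8d, x): in 64-bit mode the upper
   32 bits are zeroed, so no merge with the old value is needed. */
uint64_t write_low32(uint64_t old, uint32_t value)
{
    (void)old;               /* old contents do not affect the result */
    return value;
}
```

The first two functions read `old`, which is the merge the CPU has to perform; the third does not, which is one reason 32-bit operations are usually preferred over 8-bit and 16-bit ones.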
If I use r8b, I cannot access the upper 56 bits at the same time, they exist but are not available
No, they are not inaccessible.
    mov rax,bignumber    //random value in rax
    mov al,0             //clear al
    xor r8d,r8d          //r8=0
    mov r8b,16           //set r8b
    or r8,rax            //change r8 upper without changing r8b
You use masks plus and, or, xor, and not to change parts of a register without affecting the rest.
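The same masking idea, written out in C as a sketch. `set_field` and `combine_like_example` are hypothetical helper names; the second one mirrors the five-instruction sequence step by step, with a plain 64-bit integer standing in for each register.

```c
#include <stdint.h>

/* Replace only the bits selected by mask with the matching bits of value;
   all other bits of reg are left untouched. */
uint64_t set_field(uint64_t reg, uint64_t mask, uint64_t value)
{
    return (reg & ~mask) | (value & mask);
}

/* Mirror of the assembly sequence: the low byte ends up as 16 and the
   upper 56 bits come from bignumber. */
uint64_t combine_like_example(uint64_t bignumber)
{
    uint64_t rax = bignumber;            /* mov rax,bignumber */
    uint64_t r8;

    rax &= ~UINT64_C(0xFF);              /* mov al,0    */
    r8 = 0;                              /* xor r8d,r8d */
    r8 = (r8 & ~UINT64_C(0xFF)) | 16;    /* mov r8b,16  */
    r8 |= rax;                           /* or r8,rax   */
    return r8;
}
```

`set_field(r8, 0xFFFFFFFFFFFFFF00, rax)` expresses the last step in one call: it takes the upper 56 bits from `rax` and keeps the low byte of `r8`.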
Actually, there was never a real need for ah, but it did allow more compact code on the 8086 (and more effective use of the registers). It is still sometimes useful to write EAX or RAX and then read AL and AH separately (e.g. movzx ecx, al / movzx edx, ah) as part of unpacking bytes.
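For comparison, this is what the two movzx reads compute, written as a C sketch with made-up function names. Without ah, the second one costs an extra shift instruction.

```c
#include <stdint.h>

/* Equivalent of "movzx ecx, al": bits 0..7 of eax, zero-extended. */
uint32_t low_byte(uint32_t eax)
{
    return eax & 0xFF;
}

/* Equivalent of "movzx edx, ah": bits 8..15 of eax, zero-extended. */
uint32_t high_byte(uint32_t eax)
{
    return (eax >> 8) & 0xFF;
}
```

For `eax = 0x12345678`, `low_byte` returns `0x78` and `high_byte` returns `0x56`.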