32-bit operation versus 64-bit operation on a 64-bit machine / OS

Which operation will be cheaper on a 64-bit machine: a 32-bit operation or a 64-bit operation (for example, masking a 32-bit flag versus masking a 64-bit flag)?

+4
3 answers

Since you do not specify the architecture, I can only offer a general answer: it depends on the operation and on the processor architecture in question. Once the data is in a CPU register, most operations take the same amount of time regardless of whether the value was originally 32 or 64 bits.

However, on some architectures there may be differences in how the data gets into the register. Here are some situations where a "native size" value may be faster than a smaller value on some hardware:

Data retrieval

  • Fetching a "native size" value can be faster than fetching a smaller one. That is, the processor may need to fetch 64 bits regardless, and then mask / shift out 32 of them to "load" the 32-bit value. This masking / shifting is not required for a 64-bit value, so it can be loaded faster. (This contradicts the intuitive idea that something twice as large should take twice as long to load.)

  • Alternatively, if the bus can handle half-width reads, a 32-bit value may load at the same speed as a 64-bit value.

  • To confuse matters further, CPU caching can also change the results. Usually, when you read a single value from memory, a "line" of several adjacent memory locations is read into the cache, so subsequent reads can be served from the fast cache instead of requiring a full fetch from RAM. In this case, 32-bit values will be faster if you read many values in sequence, since twice as many of them fit in each cache line, which leads to fewer cache misses.
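The cache-line arithmetic in the last point can be sketched in C. This is only an illustration: the 64-byte line size and the helper names `elems_per_line` and `lines_touched` are assumptions made for this example, not something the answer specifies, and real line sizes vary by CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes. 64 is common, but it is an
   assumption here, not a property of every CPU. */
#define ASSUMED_CACHE_LINE_BYTES 64u

/* How many elements of the given size fit in one assumed cache line. */
size_t elems_per_line(size_t elem_size) {
    return ASSUMED_CACHE_LINE_BYTES / elem_size;
}

/* How many assumed cache lines a sequential scan over n elements touches,
   assuming the array starts on a line boundary. */
size_t lines_touched(size_t n, size_t elem_size) {
    size_t bytes = n * elem_size;
    return (bytes + ASSUMED_CACHE_LINE_BYTES - 1) / ASSUMED_CACHE_LINE_BYTES;
}
```

Under these assumptions, a scan over 1000 `uint32_t` values touches 63 lines, while the same scan over 1000 `uint64_t` values touches 125, which is the "fewer cache misses" effect described above.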

Calculation

  • The processor hardware may be optimized for 64-bit values, so calculating with 32-bit values can cost it extra work and therefore be slower. For example, it may be able to process a double (64-bit) value "natively", but need to convert a float (32-bit) value to double before processing it, and then convert the result back to float.

  • Alternatively, there may be separate 32-bit and 64-bit paths through the CPU, or the CPU may perform any required conversions in a way that does not affect the total execution time of the instruction, in which case both are calculated at the same speed.

  • This can affect complex operations (floating point), but is unlikely to be a problem for simple operations (AND, OR, etc.).
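The float-to-double round trip from the first point can be written out explicitly in C. This is a minimal sketch: the function names are hypothetical, and whether a given compiler and CPU actually perform such conversions for `mul_native` depends on the target, so it shows only the extra steps being described, not what any particular machine does.

```c
/* Multiply two floats by explicitly widening to double and narrowing back.
   This spells out the convert / compute / convert-back sequence. */
float mul_via_double(float a, float b) {
    double wide = (double)a * (double)b;  /* widen both operands, multiply */
    return (float)wide;                   /* narrow the result back to float */
}

/* Multiply two floats directly; hardware with a native 32-bit path
   can do this without any widening. */
float mul_native(float a, float b) {
    return a * b;
}
```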

+3

In general, a 64-bit operation and a 32-bit operation will have the same cost. A 32-bit operation may end up taking additional instructions, depending on whether the compiler needs to ensure that the upper 32 bits of the 64-bit register are cleared (or sign-extended), but this usually has a small cost.
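As a rough illustration of the clearing and sign-extension mentioned above, here are the two widening conversions in C. The function names are hypothetical, and whether the compiler emits a separate instruction for either conversion depends on the target and the surrounding code.

```c
#include <stdint.h>

/* Widen an unsigned 32-bit value: the upper 32 bits of the result are zero. */
uint64_t widen_unsigned(uint32_t x) {
    return (uint64_t)x;
}

/* Widen a signed 32-bit value: the upper 32 bits are copies of the sign bit. */
int64_t widen_signed(int32_t x) {
    return (int64_t)x;
}
```

For example, `widen_unsigned(0xFFFFFFFFu)` yields `0x00000000FFFFFFFF`, while `widen_signed(-1)` yields a value with all 64 bits set.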

There may also be a difference in instruction encoding, where one form takes up more space than the other, but this (and which one has the advantage) depends on a number of factors.

+2

It depends. Masking a flag will usually use an AND instruction, which executes quickly (~1 cycle) once the data is in a register. Loading 64 bits of data from memory will usually be slower than loading 32 bits, but if you use more than 32 flags, you will have to load more than 32 bits anyway, and doing the masking in one instruction will be faster than doing it in two or three. Whether this matters for overall speed will usually depend on the surrounding instructions: for example, if the data is already in the cache, you may not need to load it from memory at all.

In other words, it is hard to generalize. You need to look at a specific code sequence (not just one instruction, but the whole sequence) to say anything meaningful, and the result for that sequence may not tell you much about another sequence that at first glance looks almost identical.
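The "more than 32 flags" case from this answer might look like the following in C. This is a sketch with hypothetical function names. It only shows that one 64-bit AND can replace two 32-bit ANDs plus a combine step, and says nothing about how the two versions compare on any specific CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag set held in one 64-bit word: a single AND tests any of the 64 flags. */
bool test_flags64(uint64_t flags, uint64_t mask) {
    return (flags & mask) != 0;
}

/* The same flag set split across two 32-bit words
   (lo holds bits 0..31, hi holds bits 32..63): two ANDs and an OR. */
bool test_flags32x2(uint32_t lo, uint32_t hi, uint64_t mask) {
    uint32_t mask_lo = (uint32_t)mask;          /* low half of the mask */
    uint32_t mask_hi = (uint32_t)(mask >> 32);  /* high half of the mask */
    return ((lo & mask_lo) | (hi & mask_hi)) != 0;
}
```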

+2

Source: https://habr.com/ru/post/1306174/

