Can out-of-order execution lead to speculative memory accesses?

When an out-of-order processor encounters something like

LOAD R1, 0x1337
LOAD R2, $R1
LOAD R3, 0x42

Assuming all accesses result in a cache miss, can the processor ask the memory controller for the contents of 0x42 before it asks for the contents of $R1, or even of 0x1337? If so, and assuming the access to $R1 raises an exception (for example, a segmentation fault), we can consider that 0x42 was loaded speculatively, right?

And by the way, when a load-store unit sends a request to the memory controller, can it send a second request before receiving the response to the previous one?

My question does not target any architecture in particular. Answers related to any mainstream architecture are welcome.

+4
3 answers

The answer to your question depends on the memory ordering model of your CPU, which is not the same thing as whether the CPU allows out-of-order execution. If the CPU implements total store ordering (for example, x86 or SPARC), then the answer to your question is that 0x42 will not be loaded before 0x1337.

If the CPU implements a relaxed memory model (for example, IA-64, PowerPC, Alpha), then in the absence of a memory fence instruction all bets are off as to which address will be accessed first. This should matter little unless you are doing I/O or dealing with multi-threaded code.
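To make the difference between the two kinds of model concrete, here is a minimal Python sketch that enumerates the load orders a toy model would permit for the three instructions in the question. The names `allowed_orders`, `"strong"`, `"relaxed"` and `fence_after` are made up for this illustration, and the model only describes the architecturally visible order, not what a real pipeline does internally.

```python
from itertools import permutations

# Each entry is (text, destination register, address source).
# An address source starting with "$" means "use the value held in that register".
PROGRAM = [
    ("LOAD R1, 0x1337", "R1", "0x1337"),
    ("LOAD R2, $R1", "R2", "$R1"),
    ("LOAD R3, 0x42", "R3", "0x42"),
]


def respects_dependencies(order, program):
    """A load cannot be performed before the load that produces its address."""
    position = {idx: pos for pos, idx in enumerate(order)}
    for consumer, (_, _, src) in enumerate(program):
        if not src.startswith("$"):
            continue
        needed = src[1:]
        for producer, (_, dest, _) in enumerate(program[:consumer]):
            if dest == needed and position[producer] > position[consumer]:
                return False
    return True


def allowed_orders(program, model, fence_after=None):
    """Enumerate the orders a toy memory model permits (an assumption, not a real ISA spec).

    model:       "strong" keeps loads in program order,
                 "relaxed" allows any order that respects address dependencies.
    fence_after: optional instruction index; loads after it may not pass
                 loads at or before it (a stand-in for a fence instruction).
    """
    result = []
    for order in permutations(range(len(program))):
        if not respects_dependencies(order, program):
            continue
        if model == "strong" and list(order) != sorted(order):
            continue
        if fence_after is not None:
            position = {idx: pos for pos, idx in enumerate(order)}
            before = [position[i] for i in range(len(program)) if i <= fence_after]
            after = [position[i] for i in range(len(program)) if i > fence_after]
            if before and after and max(before) > min(after):
                continue
        result.append([program[i][0] for i in order])
    return result


if __name__ == "__main__":
    for order in allowed_orders(PROGRAM, "relaxed"):
        print(" ; ".join(order))
```

In this toy model, `"strong"` permits only program order, `"relaxed"` permits three orders (two of them with the load of 0x42 ahead of the load through $R1), and placing a fence after the second load brings the relaxed model back to a single order.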

You should note that some CPUs (e.g. Itanium) have relaxed memory models (so reads may happen out of order) but do NOT have any out-of-order execution logic, since they expect the compiler to order instructions and speculative instructions in an optimal way rather than spend silicon area on OOE.

+6

It would seem to be a logical conclusion for superscalar processors with several load units. Multi-channel memory controllers are quite common these days.
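As a rough illustration of why several load units and a multi-channel controller matter, the sketch below models how many cycles the three loads from the question take when one or two requests may be in flight at once. The function name `completion_times`, the 100-cycle miss latency and the register-name shorthand for each load are assumptions chosen for this example, not figures for any real CPU.

```python
def completion_times(loads, latency, max_outstanding):
    """Return {load name: cycle at which its data arrives} in a toy model.

    loads:           list of (name, depends_on); depends_on is the name of the
                     load that produces this load's address, or None.
    latency:         cycles a request takes to come back (assumed constant).
    max_outstanding: how many requests may be in flight at the same time.
    """
    done = {}         # load name -> cycle at which its data arrives
    in_flight = []    # arrival cycles of requests still outstanding
    pending = list(loads)
    cycle = 0
    while pending:
        # Drop requests whose data has arrived by this cycle.
        in_flight = [t for t in in_flight if t > cycle]
        for load in list(pending):
            name, depends_on = load
            address_ready = (
                depends_on is None
                or done.get(depends_on, float("inf")) <= cycle
            )
            if address_ready and len(in_flight) < max_outstanding:
                done[name] = cycle + latency
                in_flight.append(done[name])
                pending.remove(load)
        cycle += 1
    return done


# The three loads from the question, named after their destination registers.
LOADS = [("R1", None), ("R2", "R1"), ("R3", None)]

if __name__ == "__main__":
    print(completion_times(LOADS, latency=100, max_outstanding=1))
    print(completion_times(LOADS, latency=100, max_outstanding=2))
```

With one request at a time the load into R3 completes at cycle 300. With two in flight it overlaps with the first load and completes at cycle 100, while the dependent load into R2 still has to wait for R1 and completes at cycle 200.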

In the case of out-of-order execution, a huge amount of logic is expended in determining whether instructions have dependencies on others in the stream, not only register dependencies but also dependencies on memory operations. There is also a huge amount of logic for handling exceptions: the CPU needs to complete all instructions in the stream prior to the fault (or, alternatively, offload some parts of this onto the operating system).
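The two pieces of logic described above, dependency tracking and exception handling, can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `simulate` is a made-up name, every source register is assumed to be written by an earlier instruction, and real hardware uses structures such as a reorder buffer rather than Python lists.

```python
# Each entry is (text, destination register, address register or None).
PROGRAM = [
    ("LOAD R1, 0x1337", "R1", None),
    ("LOAD R2, $R1", "R2", "R1"),
    ("LOAD R3, 0x42", "R3", None),
]


def simulate(program, faulting=()):
    """Issue out of order, retire in order.

    Returns (issue_order, retired, fault, squashed), where `squashed` lists
    instructions that were issued but are discarded because an earlier
    instruction in program order faulted.
    """
    ready = set()                       # registers whose values are available
    waiting = list(range(len(program)))
    issue_order = []
    while waiting:
        # Issue every waiting instruction whose address register is available.
        batch = [
            i for i in waiting
            if program[i][2] is None or program[i][2] in ready
        ]
        if not batch:
            break                       # the rest wait on a value that never arrives
        for i in batch:
            issue_order.append(program[i][0])
            waiting.remove(i)
        # Results become visible only after the whole batch has issued.
        for i in batch:
            if program[i][0] not in faulting:
                ready.add(program[i][1])

    # In-order retirement: stop at the first faulting instruction and
    # discard everything younger than it, even if it already executed.
    retired, squashed, fault = [], [], None
    for name, _, _ in program:
        if fault is not None:
            if name in issue_order:
                squashed.append(name)
        elif name in faulting:
            fault = name
        else:
            retired.append(name)
    return issue_order, retired, fault, squashed


if __name__ == "__main__":
    print(simulate(PROGRAM, faulting={"LOAD R2, $R1"}))
```

When the load through $R1 faults, this model issues the load of 0x42 before it, retires only the first load, and reports the load of 0x42 as squashed: it was performed, but its result is never committed, which is the sense in which it was speculative.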

In terms of the programming model seen by most applications, the effects are never apparent. As seen from memory, loads will not always take place in the expected sequence, but that is already the case whenever caches are in use.

Clearly, in circumstances where the order of loads and stores does matter, for example when accessing device registers, OOE must be disabled. The POWER architecture has the wonderful EIEIO instruction for this purpose.

Some members of the ARM Cortex-A family offer OOE. I suspect that, given the power constraints of these devices and the apparent lack of instructions for forcing ordering, loads and stores always complete in order.

+4

A conforming SPARC processor must implement TSO, but may also implement RMO and PSO. You need to know which mode your OS is running in, unless you know that your particular hardware platform does not implement RMO and PSO.

+1

Source: https://habr.com/ru/post/1435269/

