How are the MTRR registers implemented?

Question

x86 / x86-64 provides MTRRs (memory type range registers), which can be used to designate different parts of the physical address space for different uses (e.g. cacheable, uncacheable, write-combining, etc.).

My question is: does anyone know how these constraints on the physical address space, as defined by the MTRRs, are implemented in hardware? On each memory access, does the hardware check whether the physical address falls within a given range before the processor decides whether to look up the cache, look up the write-combining buffer, or send the request directly to the memory controller?

thanks

memory-management x86 x86-64 mmu

Arka Nov 08 '12 at 20:23

1 answer

osgx · Accepted Answer · 2012-11-08T23:00:37+0000

Wikipedia says in its MTRR article:

Newer (mostly 64-bit) x86 CPUs support a more advanced technique called Page Attribute Tables (PAT), which allows these modes to be set per table instead of relying on a limited number of low-granularity registers.

So, for newer x86 / x86_64 processors, we can guess that the MTRRs may be implemented as an addition to PAT (Page Attribute Tables). In memory, the PAT attributes live in the page table (a few bits of the page table entry, or PTE), and in the CPU they are stored (cached) in the TLB, which is part of the MMU. The TLB (and the MMU) is already consulted on every memory access, so I think it could be a good place to track the memory type, even for the MTRRs (?)

But what if I stop guessing and RTFM? There is one very good book about the x86 world: The Unabridged Pentium 4: IA32 Processor Genealogy (ISBN-13: 978-0321246561). See Part 7, Chapter 24, "Pentium Pro Software Enhancements", section "MTRRs Added".

There are long rules for each MTRR memory type on pages 582-584, but the rules for all 5 types (Uncacheable = UC, Write-Combining = WC, Write-Through = WT, Write-Protect = WP, Write-Back = WB) begin with: "Cache lookup is performed."

And in Part 9 (Pentium III), Chapter 32 (Pentium III Xeon), the book clearly states:

When it has to perform a memory access, the processor consults both the MTRRs and the selected PTE or PDE to determine the memory type (and therefore the rules of conduct it must follow).

But on the other hand... a WRMSR to an MTRR invalidates the TLBs (according to the Intel instruction reference "manual32.chm"):

When the WRMSR instruction is used to write to an MTRR, the TLBs are invalidated, including the global entries (see "Translation Lookaside Buffers (TLBs)" in Chapter 3 of the IA-32 Intel(R) Architecture Software Developer's Manual, Volume 3).

And there is another direct hint in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, section 10.11.9 "Large Page Size Considerations":

The MTRRs provide memory typing for a limited number of regions that have a 4-KByte granularity (the same granularity as 4-KByte pages). The memory type for a given page is cached in the processor's TLBs.
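
To make the "limited number of regions" concrete, here is a rough sketch of how a physical address could be matched against the variable-range MTRRs. It follows my understanding of the base/mask comparison and the overlap rules described in the Intel manual; the class and function names, the 36-bit physical address width, and the UC default type are assumptions made for illustration, and the fixed-range MTRRs are ignored.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # MTRR ranges have 4-KByte granularity


@dataclass(frozen=True)
class VariableMtrr:
    """Hypothetical model of one PHYSBASE/PHYSMASK register pair."""
    phys_base: int   # base address of the range (4-KByte aligned)
    phys_mask: int   # address bits that must equal the base
    mem_type: str    # "UC", "WC", "WT", "WP" or "WB"
    valid: bool = True


def range_mask(size, phys_addr_bits=36):
    """Mask for a naturally aligned, power-of-two sized range.

    phys_addr_bits=36 is an assumption; real CPUs report their width.
    """
    if size < PAGE_SIZE or size & (size - 1):
        raise ValueError("size must be a power of two of at least 4 KBytes")
    return ((1 << phys_addr_bits) - 1) & ~(size - 1)


def matches(mtrr, phys_addr):
    # An address is in the range when it equals the base under the mask.
    masked_base = mtrr.phys_base & mtrr.phys_mask
    return mtrr.valid and (phys_addr & mtrr.phys_mask) == masked_base


def mtrr_type(mtrrs, phys_addr, default_type="UC"):
    """Memory type that the variable-range MTRRs give a physical address."""
    hits = {m.mem_type for m in mtrrs if matches(m, phys_addr)}
    if not hits:
        return default_type          # no range matched: use the default
    if "UC" in hits:
        return "UC"                  # UC wins over any overlapping range
    if hits == {"WT", "WB"}:
        return "WT"                  # WT wins over WB
    if len(hits) == 1:
        return next(iter(hits))
    raise ValueError("overlap with architecturally undefined result")
```

For example, with a WB range covering the low 2 GBytes and a 16-MByte WC range at 0xE0000000, `mtrr_type` returns "WC" for an address inside the second range and falls back to the default type for an address that matches neither.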

You asked:

On each memory access, does the hardware check whether the physical address falls within a given range

No. Not every memory access is compared against all the MTRRs. The MTRR ranges are pre-combined with the memory-type bits of the PTE when the PTE is loaded into the TLB. After that, the only place where the memory type has to be checked is the TLB entry. And the TLB IS checked on every memory access.
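
A toy model of that idea, as I understand it: the MTRRs are consulted only when a TLB entry is filled, and a write to an MTRR throws the cached entries away. Everything here is hypothetical (the `Tlb` class, the `mtrr_lookup` callable, the page-table layout), and `combine` implements only the three PAT cases I am reasonably sure of from the Intel manual's effective-memory-type table; it is not the full table.

```python
PAGE_SIZE = 4096


def combine(mtrr_type, pat_type):
    """Partial MTRR + PAT combining rule (assumption: a subset only)."""
    if pat_type == "UC":
        return "UC"          # strong UC in the PTE overrides the MTRR
    if pat_type == "WC":
        return "WC"          # WC in the PTE overrides the MTRR
    if pat_type == "WB":
        return mtrr_type     # WB in the PTE defers to the MTRR type
    raise NotImplementedError("combination not modeled in this sketch")


class Tlb:
    def __init__(self, page_table, mtrr_lookup):
        self.page_table = page_table    # vpn -> (pfn, PAT memory type)
        self.mtrr_lookup = mtrr_lookup  # physical address -> MTRR type
        self.entries = {}               # vpn -> (pfn, effective type)
        self.mtrr_lookups = 0           # how often the MTRRs were scanned

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.entries:
            # TLB miss: walk the page table and consult the MTRRs once.
            pfn, pat_type = self.page_table[vpn]
            self.mtrr_lookups += 1
            mtrr_type = self.mtrr_lookup(pfn * PAGE_SIZE)
            self.entries[vpn] = (pfn, combine(mtrr_type, pat_type))
        # TLB hit path: the memory type comes straight from the entry.
        pfn, mem_type = self.entries[vpn]
        return pfn * PAGE_SIZE + offset, mem_type

    def write_mtrr(self, new_lookup):
        # Models WRMSR to an MTRR: cached types are stale, so flush.
        self.mtrr_lookup = new_lookup
        self.entries.clear()
```

Two accesses to the same page scan the MTRRs once; after `write_mtrr` the next access to that page scans them again, which is consistent with the WRMSR note quoted above.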

whether to look up the cache, look up the write-combining buffer, or send the request directly to the memory controller?

No, there is something here that is not understood clearly. The cache is looked up on every access, even for UC (for example, if a region has just been switched to UC, there may still be a cached copy of a line that has to be evicted).

From Chapter 24 (this part is about the Pentium 4):

Loads from cacheable memory
The memory types that the processor is permitted to cache from are WP, WT and WB (as defined by the MTRRs and the PTE or PDE).
When the core dispatches a load μop, the μop is placed in the load buffer that was reserved for it in the Allocator stage. A memory read request is then issued to the L1 data cache for fulfillment:

If the cache has a copy of the line that contains the requested read data, the read data is placed in the load buffer.
If the cache lookup results in a miss, the request is forwarded upstream to the L2 cache.
If the L2 cache has a copy of the sector that contains the requested read data, the read data is immediately placed in the load buffer, and the sector is copied into the L1 data cache.
If the L2 cache lookup results in a miss, the request is forwarded upstream either to the L3 cache (if there is one) or to the FSB interface unit.
If the L3 cache has a copy of the sector that contains the requested read data, the read data is immediately placed in the load buffer, and the sector is copied into the L2 cache and the L1 data cache.
If the lookup in the top-level cache results in a miss, the request is forwarded to the FSB interface unit.
When a sector is returned from memory, the read data is immediately placed in the load buffer, and the sector is copied to L3 cache (if any), L2 cache and L1 data cache.

The processor core is permitted to speculatively execute loads that read data from WC, WP, WT or WB memory space.
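
The lookup order described above can be sketched roughly as follows. This is a simplification under stated assumptions: each cache is just a set of line numbers, lines and sectors are treated as the same 64-byte unit, there is no capacity limit or eviction, and `cacheable_load` is a made-up name.

```python
LINE = 64  # assumed unit for every level; the book speaks of L2/L3 sectors


def cacheable_load(addr, l1, l2, l3=None):
    """Return which level supplied the data and fill the lower levels.

    l1, l2, l3 are sets of cached line numbers; l3=None means no L3 cache.
    """
    line = addr // LINE
    if line in l1:
        return "L1"
    if line in l2:
        l1.add(line)              # sector copied into the L1 data cache
        return "L2"
    if l3 is not None and line in l3:
        l2.add(line)              # sector copied into L2 and L1
        l1.add(line)
        return "L3"
    # Miss in the top-level cache: the request goes out on the FSB and
    # the returned sector is copied into every cache level that exists.
    if l3 is not None:
        l3.add(line)
    l2.add(line)
    l1.add(line)
    return "FSB"
```

The first load of an address returns "FSB"; a second load of any byte in the same line returns "L1".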
Loads from uncacheable memory
The uncacheable memory types are UC and WC (as defined by the MTRRs and the PTE or PDE).
When the core dispatches a load μop, the read request is placed in the load buffer that was reserved for it in the Allocator stage. A memory read request is also submitted to the processor's caches. In the event of a cache hit, the cache line is evicted from the cache. The request is then issued to the FSB interface unit. A memory data read transaction is performed on the FSB to fetch just the requested bytes from memory. When the data is returned from memory, the read data is immediately placed in the load buffer.
The processor core is not permitted to speculatively execute loads that read data from UC memory space.

Stores to UC memory
UC is one of the two uncacheable memory types (the other is the WC memory type). When a store to UC memory is executed, it is posted in the store buffer that was reserved for it in the Allocator stage. The store is also submitted to the L1 data cache, the L2 cache and the L3 cache (if there is one). In the event of a cache hit, the line is evicted from the cache.
When the store buffer containing the store to UC memory is forwarded to the FSB interface unit, a memory data write transaction ... is performed on the FSB

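
The common point of the UC load and UC store paths above is that the caches are still looked up and a hit causes an eviction, while the data itself always travels over the bus. A very small sketch of that, with hypothetical names and with caches again modeled as sets of line numbers (write-back of a modified line is not modeled):

```python
LINE = 64  # assumed cache line size


def uc_access(addr, size, caches):
    """Model a UC load or store of `size` bytes at `addr`.

    caches maps a level name (e.g. "L1") to a set of cached line numbers.
    Returns the levels the line was evicted from and the bus transfer size.
    """
    line = addr // LINE
    evicted_from = [name for name, lines in caches.items() if line in lines]
    for name in evicted_from:
        caches[name].discard(line)   # a hit evicts the stale copy
    # Only the requested bytes are transferred on the bus, not a full line.
    return {"evicted_from": evicted_from, "bus_bytes": size}
```

If a region was cacheable a moment ago and has just been switched to UC, the first access evicts the leftover line and later accesses find nothing in the caches.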
Stores to WC memory
The WC memory type is well suited to an area of memory (for example, the video frame buffer) that has the following characteristics:

The processor does not cache WC memory.
Speculative execution of loads from WC memory is allowed.
Stores to WC memory are deposited in the processor's Write Combining Buffers (WCBs).
Each WCB can hold one line (64 bytes of data).
When stores are executed to a line of WC memory space, the bytes are accumulated in the WCB assigned to record the writes to that area of memory.
A subsequent store to a location in a WCB can overwrite a byte that was deposited in that location by an earlier store. In other words, multiple writes to the same location are collapsed, so the location reflects the last data byte written to it.
When the WCBs are eventually dumped to external memory over the FSB, the data is not necessarily written to memory in the same order in which the earlier program stores were executed. The device being written to must tolerate this type of behavior (that is, it must still function correctly). See the topic "WCB FSB Transactions" on page 1080 for more information.
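
The collapsing behaviour of a single WCB can be illustrated with a small model. The class below is hypothetical and covers one 64-byte buffer only; how buffers are allocated and when they are dumped is not modeled. It flushes in address order purely to show that the original store order is lost.

```python
class WriteCombiningBuffer:
    """Toy model of one WCB covering a single 64-byte line."""

    LINE = 64

    def __init__(self, line_addr):
        self.line_addr = line_addr  # address of the first byte of the line
        self.data = {}              # offset within the line -> byte value

    def store(self, addr, payload):
        offset = addr - self.line_addr
        if offset < 0 or offset + len(payload) > self.LINE:
            raise ValueError("store does not fall inside this buffer's line")
        for i, byte in enumerate(payload):
            # A later store simply overwrites an earlier byte: the writes
            # are collapsed and only the last value survives.
            self.data[offset + i] = byte

    def flush(self):
        """Dump the buffer; the order is by address, not program order."""
        written = [(self.line_addr + off, b) for off, b in sorted(self.data.items())]
        self.data = {}
        return written
```

Storing two bytes at 0x1004, one byte at 0x1000 and then one byte at 0x1004 again leaves three bytes in the buffer, and the flush emits them by address with the last value written to 0x1004.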

Stores to WT memory
When a store is performed to cacheable write-through memory, it is posted in the store buffer that was reserved for its use in the Allocator stage. In addition, the store is submitted to the L1 data cache for a lookup. There are several possibilities:
* If the store hits in the data cache, the line in the cache is updated but remains in the S state (which means that the line is valid).
* If the store misses the data cache, it is forwarded to the L2 cache and a lookup is performed:
* - If it hits a line in the L2 cache, the line is updated but remains in the S state (which means that the line is valid).
* - If it misses the L2 cache and there is no L3 cache, no further action is taken.
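
The cases listed above, for a processor without an L3 cache, can be sketched like this. The function name and the data layout are assumptions, and the write to `memory` reflects the general meaning of write-through rather than anything spelled out in the excerpt; cases not listed in the excerpt (such as an L3 lookup) are deliberately left out.

```python
LINE = 64  # assumed cache line size


def wt_store(addr, value, l1, l2, memory):
    """Model a one-byte store to WT memory on a CPU with no L3 cache.

    l1 and l2 map a line number to a dict of {offset: byte};
    memory maps an address to a byte.
    """
    memory[addr] = value             # write-through: memory is always updated
    line, offset = divmod(addr, LINE)
    if line in l1:
        l1[line][offset] = value     # L1 hit: update the line in place
        return "L1 updated"
    if line in l2:
        l2[line][offset] = value     # L1 miss, L2 hit: update the L2 line
        return "L2 updated"
    return "no cache updated"        # miss everywhere: nothing is allocated
```

Note that a store that misses both caches does not bring the line into either of them.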
