On Haswell and Broadwell, the tags for the eDRAM L4 cache reside in the on-chip L3 cache. Although this arrangement simplifies the LLC design and allows early tag checking for fetches from the processor, it makes accesses to the eDRAM LLC from other devices (such as discrete GPUs over PCIe) slower, because those memory requests must go through the on-chip L3 before they can reach the eDRAM. To solve this problem, Skylake moved the eDRAM to a position above the DRAM controllers, where it behaves more like a memory-side buffer than a cache.
Reference: Li, Ang, et al. "Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels." Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 2017.
Below you can see the architecture of Broadwell, Haswell, and Skylake.
Broadwell and Haswell
Skylake
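As a rough illustration of the routing difference described above, here is a toy Python model of which stages a memory request passes through before it reaches the eDRAM. The stage labels, the `REQUEST_PATHS` table, and the helper names are my own simplification for illustration and not anything from Intel documentation or the paper; real request routing involves more components (ring bus, system agent, and so on).

```python
# Toy model only: stage names are illustrative labels, not real hardware interfaces.
# Each entry lists the stages a memory request visits on its way toward DRAM.
REQUEST_PATHS = {
    ("haswell_broadwell", "cpu_core"): ["L3", "eDRAM", "DRAM controller"],
    # Assumption from the text: device requests must pass through the on-chip L3 first.
    ("haswell_broadwell", "pcie_device"): ["L3", "eDRAM", "DRAM controller"],
    ("skylake", "cpu_core"): ["L3", "eDRAM", "DRAM controller"],
    # Assumption from the text: eDRAM sits in front of the DRAM controllers,
    # so a device request can reach it without crossing the L3.
    ("skylake", "pcie_device"): ["eDRAM", "DRAM controller"],
}


def stages_before_edram(generation, requester):
    """Return the stages a request visits before it reaches the eDRAM."""
    path = REQUEST_PATHS[(generation, requester)]
    return path[: path.index("eDRAM")]


def must_cross_l3(generation, requester):
    """True if the request has to pass through the L3 to reach the eDRAM."""
    return "L3" in stages_before_edram(generation, requester)


if __name__ == "__main__":
    for (generation, requester), path in REQUEST_PATHS.items():
        print(f"{generation:18} {requester:12} {' -> '.join(path)}")
```

In this sketch, `must_cross_l3("haswell_broadwell", "pcie_device")` is true while `must_cross_l3("skylake", "pcie_device")` is false, which is the point the quoted passage makes about Skylake's memory-side placement.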