Why is reading from a memory-mapped file so fast?

I do not have much experience working with memory-mapped I/O, but after using it for the first time I am stunned by how fast it is. In my performance tests, I see that reading from memory-mapped files is 30 times faster than reading via regular C++ stdio.

My test data is a 3 GB binary file containing 20 large double-precision floating-point arrays. The way my test program is structured, I call an external module's read method, which uses memory-mapped I/O behind the scenes. Each time I call the read method, this external module returns a pointer and the size of the data that the pointer refers to. On return, I call memcpy to copy the contents of the returned buffer into another array. Since I am memcpy'ing the data out of the memory-mapped file, I expected the memory-mapped reads to be not much faster than regular stdio, but I was surprised to find they are 30 times faster.
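For concreteness, here is a minimal sketch of the access pattern described above. The module interface (ReadResult, read_next_array) is hypothetical, and a small static array stands in for the real mapped view:

```cpp
#include <cstddef>   // std::size_t
#include <cstring>   // std::memcpy
#include <vector>

// Stand-in for the external module from the question: each call returns
// a pointer into the memory-mapped file plus the size of the data.
struct ReadResult {
    const double* data;   // points directly into the mapped view
    std::size_t   count;  // number of doubles available
};

static ReadResult read_next_array() {
    // Placeholder data; in the real module this would be a mapped region.
    static const double mapped_view[4] = {1.0, 2.0, 3.0, 4.0};
    return {mapped_view, 4};
}

int main() {
    // The question's pattern: memcpy each returned buffer out of the
    // mapped region into a separate array.
    ReadResult r = read_next_array();
    std::vector<double> copy(r.count);
    std::memcpy(copy.data(), r.data, r.count * sizeof(double));
    return 0;
}
```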

Why is reading from a memory-mapped file so fast?

PS: I am using a Windows machine. I benchmarked my I/O speed, and the disk transfer rate on my machine is about 90 MB/s.

1 answer

The OS kernel routines for I/O, such as read or write calls, are still just functions. Those functions copy data to/from a user-space buffer into a kernel-space structure, and then to the device. When you consider that there is a user buffer, an I/O library buffer (think stdio's buf), a kernel buffer, and then the file, the data can potentially pass through three copies to get between your program and the disk. The I/O routines also have to be robust, and, finally, the syscalls themselves impose latency (trapping into the kernel, a context switch, then waking the process up again).
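To make that copy chain concrete, here is a minimal sketch of the conventional buffered path (the file name test.bin is a placeholder), with comments marking where the copies and syscalls occur:

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Conventional buffered read path. Data reaching `buffer` may be copied
// several times: disk -> kernel page cache, kernel -> stdio's internal
// buffer, stdio buffer -> user buffer. Every refill of the stdio buffer
// also costs a read() syscall (kernel trap, possible context switch).
int main() {
    std::FILE* f = std::fopen("test.bin", "rb");
    if (!f) return 1;

    std::vector<double> buffer(1 << 16);
    std::size_t n;
    while ((n = std::fread(buffer.data(), sizeof(double),
                           buffer.size(), f)) > 0) {
        // ... process n doubles ...
    }
    std::fclose(f);
    return 0;
}
```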

When you memory-map a file, you skip almost all of that, eliminating the buffer copies. By effectively treating the file as one big virtual array, you get random access without the syscall overhead, so per-I/O latency drops; and if the original code was inefficient (lots of small random I/O calls), the overhead reduction is even more dramatic.
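Since the question mentions Windows, here is a minimal sketch of that path using the Win32 file-mapping APIs. The file name test.bin is a placeholder, and mapping a 3 GB file in a single view assumes a 64-bit process:

```cpp
#include <windows.h>
#include <cstddef>
#include <cstdio>

// Map a file and read it through a pointer: no stdio buffer and no
// per-read syscall. Page faults pull the data from the page cache
// straight into this process's address space.
int main() {
    HANDLE file = CreateFileA("test.bin", GENERIC_READ, FILE_SHARE_READ,
                              nullptr, OPEN_EXISTING,
                              FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    LARGE_INTEGER size{};
    GetFileSizeEx(file, &size);

    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY,
                                        0, 0, nullptr);  // 0,0 = whole file
    if (!mapping) { CloseHandle(file); return 1; }

    const double* view = static_cast<const double*>(
        MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));
    if (view) {
        // The file now behaves like one big array of doubles: plain
        // pointer arithmetic gives random access with no read() calls.
        std::size_t count =
            static_cast<std::size_t>(size.QuadPart) / sizeof(double);
        if (count > 0)
            std::printf("first=%f last=%f\n", view[0], view[count - 1]);
        UnmapViewOfFile(view);
    }
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}
```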

The abstraction of a virtual-memory, multiprocessing OS has a price, and this is it.

However, you can improve I/O in some cases by disabling buffering when you know it will hurt performance, such as for large contiguous writes, but beyond that you really cannot improve on the performance of memory-mapped I/O without eliminating the operating system altogether.
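As one concrete way to act on that with plain stdio, here is a minimal sketch that disables the library buffer with setvbuf before a large contiguous write; out.bin is a placeholder. (On Windows, FILE_FLAG_NO_BUFFERING on CreateFile goes further and bypasses the OS cache too, at the cost of strict alignment requirements.)

```cpp
#include <cstdio>
#include <vector>

int main() {
    std::FILE* f = std::fopen("out.bin", "wb");
    if (!f) return 1;

    // Unbuffered mode: skip the stdio-buffer copy, so large contiguous
    // writes go straight to the underlying write() calls.
    std::setvbuf(f, nullptr, _IONBF, 0);

    std::vector<double> block(1 << 20, 3.14);  // one large contiguous write
    std::fwrite(block.data(), sizeof(double), block.size(), f);
    std::fclose(f);
    return 0;
}
```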


Source: https://habr.com/ru/post/976948/

