I did some research on why this is happening.
Note: I did my testing with Python 3.5. Python 2 has a different I/O system with the same quirk for a similar reason, but this was easier to understand with the new I/O system in Python 3.
As it turns out, this is due to Python's BufferedReader, not the actual system calls.
You can try this code:
```python
fp = open('/dev/urandom', 'rb')
fp = fp.detach()
ans = fp.read(65600)
fp.close()
```
If you strace this code, you will find:
read(3, "]\"\34\277V\21\223$l\361\234\16:\306V\323\266M\215\331\3bdU\265C\213\227\225pWV"..., 65600) = 65600
Our original object was a BufferedReader:
```python
>>> open("/dev/urandom", "rb")
<_io.BufferedReader name='/dev/urandom'>
```
If we call detach() on this, we throw away the BufferedReader portion and just get the FileIO, which is what talks to the kernel. At that layer, it reads everything in one go.
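As a quick check (a minimal sketch; it assumes a system with /dev/urandom), you can see the two layers directly:

```python
import io

fp = open('/dev/urandom', 'rb')
print(type(fp))                       # <class '_io.BufferedReader'>
raw = fp.detach()                     # strip off the buffering layer
print(type(raw))                      # <class '_io.FileIO'>
print(isinstance(raw, io.RawIOBase))  # True: raw, unbuffered I/O
raw.close()
```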
So, the behavior we are looking for lives in BufferedReader. We can look at Modules/_io/bufferedio.c in the Python source, specifically at the _io__Buffered_read_impl function. In our case, where the file has not yet been read from, we dispatch to _bufferedreader_read_generic.
Here is where the quirk comes from:
```c
while (remaining > 0) {
    Py_ssize_t r = MINUS_LAST_BLOCK(self, remaining);
    if (r == 0)
        break;
    r = _bufferedreader_raw_read(self, out + written, r);
```
Essentially, this reads as many complete "blocks" as possible directly into the output buffer. The block size is based on the parameter passed to the BufferedReader constructor, which by default is selected based on a few criteria:
* Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device's "block size" and falling back on `io.DEFAULT_BUFFER_SIZE`. On many systems, the buffer will typically be 4096 or 8192 bytes long.
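As a sketch, here is roughly how that default is chosen (this mirrors the pure-Python reference implementation in CPython's Lib/_pyio.py; treat it as illustrative, not authoritative):

```python
import io
import os

def default_buffer_size(raw):
    """Roughly how open() picks a buffer size for a raw file."""
    buffer_size = io.DEFAULT_BUFFER_SIZE
    try:
        bs = os.fstat(raw.fileno()).st_blksize
    except (OSError, AttributeError):
        pass
    else:
        if bs > 1:
            buffer_size = bs  # use the device's reported block size
    return buffer_size

with open('/dev/urandom', 'rb') as fp:
    print(default_buffer_size(fp))  # typically 4096 on many systems
```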
So, this code will read as many complete blocks as possible without overrunning its buffer size. In this case, that is 65536 bytes, because it is the largest multiple of 4096 less than or equal to 65600. By doing this, it can read the data directly into the output and avoid filling up and draining its own internal buffer, which would be slower.
Once that is done, there might be a bit more still to read. In our case, 65600 - 65536 == 64, so it needs to read at least 64 more bytes. But instead it reads 4096! What gives? Well, the key here is that the point of a BufferedReader is to minimize the number of kernel reads we actually have to perform, since each read has significant overhead in and of itself. So it simply reads another block to fill its buffer (4096 bytes) and hands you the first 64 of them.
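To make the arithmetic concrete, here is an illustrative sketch (MINUS_LAST_BLOCK in the C source is just buffer_size * (size / buffer_size)):

```python
buffer_size = 4096   # the default buffer size in this case
request = 65600      # what we asked .read() for

# Largest multiple of buffer_size not exceeding the request;
# this much is read straight into the caller's buffer:
direct = buffer_size * (request // buffer_size)
leftover = request - direct

print(direct, leftover)  # 65536 64
# BufferedReader issues read(fd, ..., 65536), then one more
# read(fd, ..., 4096) to refill its buffer, and returns the
# first 64 bytes of that refill to the caller.
```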
I hope this makes sense as an explanation of why it happens.
As a demonstration, we could try this program:
```python
import _io
fp = _io.BufferedReader(_io.FileIO("/dev/urandom", "rb"), 30000)
ans = fp.read(65600)
fp.close()
```
With this, strace tells us:
read(3, "\357\202{u'\364\6R\fr\20\f~\254\372\3705\2\332JF\n\210\341\2s\365]\270\r\306B"..., 60000) = 60000 read(3, "\266_ \323\346\302}\32\334Yl\ry\215\326\222\363O\303\367\353\340\303\234\0\370Y_\3232\21\36"..., 30000) = 30000
Sure enough, this follows the same pattern: as many complete blocks as possible, and then one more to fill the buffer.
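You can generalize this into a tiny predictor (a simplified model: it assumes a fresh reader and that every raw read returns the full amount requested, which holds for /dev/urandom):

```python
def predicted_reads(request, buffer_size):
    """Predict the raw read() sizes a fresh BufferedReader issues
    for a single .read(request) call (simplified model)."""
    direct = buffer_size * (request // buffer_size)
    reads = [direct] if direct else []
    if request - direct:
        reads.append(buffer_size)  # one more block to refill the buffer
    return reads

print(predicted_reads(65600, 4096))   # [65536, 4096]
print(predicted_reads(65600, 30000))  # [60000, 30000]
```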
dd, in a quest for high efficiency when copying lots and lots of data, tries to read up to a much larger amount at once, which is why it gets away with a single read. Try it with a larger set of data, and I suspect you will find multiple calls to read.
TL;DR: the BufferedReader reads as many complete blocks as possible (16 * 4096 = 65536 bytes) and then one extra block of 4096 bytes to fill its buffer.
EDIT:
An easy way to adjust the buffer size, as @fcatho pointed out, is to pass the buffering argument to open:
open(name[, mode[, buffering]])
(...)
The optional buffering argument specifies the file's desired buffer size: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size (in bytes). A negative buffering means to use the system default, which is usually line buffered for tty devices and fully buffered for other files. If omitted, the system default is used.
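For instance, the same idea as the _io demonstration above (a sketch, assuming you want an approximately 30000-byte buffer):

```python
# Ask open() for an (approximately) 30000-byte buffer; under strace
# the read(65600) below should again show up as 60000 + 30000.
fp = open('/dev/urandom', 'rb', 30000)
ans = fp.read(65600)
fp.close()
```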
This works on both Python 2 and Python 3.