I am writing Python code that merges large files at various points. I did something similar in C, where I allocated a 1 MB char array and used it as a read/write buffer. It was very simple: read 1 MB into the char array, then write it out.
But with Python, I assume this is different: every time I call read() with size = 1 MB, a new 1 MB character string is allocated. And I assume that when the buffer goes out of scope, it will be freed on the next GC pass.
Will Python handle allocation this way? If so, is the constant allocation/free cycle computationally expensive?
Is it possible to tell Python to reuse the same block of memory, as in C? Or is the Python VM smart enough to do this itself?
I guess what I am mainly aiming for is something like a Python implementation of dd.
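For what it's worth, one way to get C-style buffer reuse in Python is `readinto()` on a binary file object, which fills an existing pre-allocated `bytearray` instead of allocating a new string per read. A minimal sketch (the names `copy_chunks` and `BUF_SIZE` are illustrative, not from any library):

```python
BUF_SIZE = 1024 * 1024  # 1 MB, same size as the C char array described above

def copy_chunks(src_path, dst_path):
    """Copy src to dst dd-style, reusing a single pre-allocated buffer."""
    buf = bytearray(BUF_SIZE)      # allocated once, reused on every iteration
    view = memoryview(buf)         # allows writing a partial read without copying
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            n = src.readinto(buf)  # fills the existing buffer in place
            if not n:              # 0 bytes read means end of file
                break
            dst.write(view[:n])    # write only the bytes actually read
```

This avoids the per-call string allocation that a plain `read(BUF_SIZE)` loop incurs, though in CPython those short-lived buffers are freed by reference counting as soon as they go out of scope, not deferred to a later GC cycle.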