How to achieve maximum write speed using Python?

I am writing a program that will perform high-speed data acquisition. The acquisition card can deliver up to 6.8 GB/s (it sits on PCIe3 x8). Right now I'm writing to a RAM disk to see the maximum write speed I can achieve with Python.

The card is going to give me 5-10 MB blocks, which I can then write somewhere.

I wrote this piece of code that writes a 10 MB block 500 times to a binary file. I am using Anaconda2 on 64-bit Windows 7, and I used Anaconda's profiler.

    import os
    import time
    # `profiler` here is the profiling module that ships with Anaconda

    block = 'A' * 10 * 1024 * 1024
    filename = "R:\\test"
    f = os.open(filename, os.O_CREAT | os.O_BINARY | os.O_TRUNC |
                os.O_WRONLY | os.O_SEQUENTIAL)

    p = profiler.Profile(signatures=False)
    p.enable()
    start = time.clock()
    for x in range(500):
        os.write(f, block)
    transferTime_sec = time.clock() - start
    p.disable()
    p.print_stats()

    print('\nwrote %f MB' % (os.stat(filename).st_size / (1024 * 1024)))

I tested this on a RAM disk (R:\) and got the following output:

[profiler output screenshot]

So I get about 2.5 GB/s to the RAM disk. That is not bad, but it is far from the maximum RAM bandwidth, although the numbers are consistent across runs. So low bandwidth is the first problem.
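For context (this is not part of the original test), a quick sketch of what a pure in-memory copy achieves in Python puts an upper bound on what any RAM-disk write could reach:

```python
import time

block = b'A' * (10 * 1024 * 1024)  # same 10 MB block size as the test
n = 50
buf = bytearray(len(block) * n)

start = time.perf_counter()
for i in range(n):
    # pure memory copy into a preallocated buffer, no I/O involved
    buf[i * len(block):(i + 1) * len(block)] = block
elapsed = time.perf_counter() - start

print('%.0f MB/s' % (n * len(block) / elapsed / 1024 / 1024))
```

If this in-process copy is much faster than the RAM-disk figure, the overhead is in the filesystem and syscall path rather than in Python's loop.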

The second problem is that when I test this code on a PCIe SSD (which other benchmark software puts at 1090 MB/s sequential write), it reports comparable numbers.

[profiler output screenshot]

This makes me think that caching and/or buffering is involved, and that I am therefore not measuring the actual I/O. I am not sure what is really happening, as I am fairly new to Python.
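One way to take the page cache out of the measurement (a sketch, not from the original post; `test.bin` is a placeholder path) is to call os.fsync before stopping the timer, so the clock includes flushing buffered data to the device:

```python
import os
import time

def timed_write(filename, block, count):
    # Open a normally buffered file; writes may land in the OS page cache
    f = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    start = time.perf_counter()
    for _ in range(count):
        os.write(f, block)
    # Force all buffered data to the device before reading the clock;
    # otherwise the loop mostly measures memcpy into the cache
    os.fsync(f)
    elapsed = time.perf_counter() - start
    os.close(f)
    return elapsed

block = b'A' * (10 * 1024 * 1024)
elapsed = timed_write('test.bin', block, 5)
print('%.0f MB/s' % (5 * len(block) / elapsed / 1024 / 1024))
```

On a cached run the per-write cost looks tiny and the fsync absorbs the real transfer time, which is exactly the effect suspected above.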

So my main question is: how do I achieve maximum write speed? And secondarily: why am I getting these numbers?

1 answer

I don't know if you are still following this issue, but I found your question interesting, so I tried it on a Linux laptop.

I ran your code under Python 3.5 and found that you need the os.O_SYNC flag to avoid the buffering problem (basically, os.write then does not return before all the data is written to disk). I also replaced time.clock() with time.time(), which gives more meaningful results.

    import os
    import time
    import cProfile

    def ioTest():
        block = bytes('A' * 10 * 1024 * 1024, 'utf-8')
        filename = 'test.bin'
        # O_SYNC makes each os.write block until the data reaches the disk
        f = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC)
        start = time.time()
        for x in range(500):
            os.write(f, block)
        os.close(f)
        transferTime_sec = time.time() - start
        msg = 'Wrote {:.0f}MB in {:0.03f}s'
        print(msg.format(os.stat(filename).st_size / 1024 / 1024,
                         transferTime_sec))

    cProfile.run('ioTest()')

In addition, this post talks about using the os.O_DIRECT flag, which bypasses the page cache (using DMA) and avoids that bottleneck. I had to use the mmap module to make it work on my machine:

    import os
    import time
    import cProfile
    import mmap

    def ioTest():
        # An anonymous mmap is page-aligned, which O_DIRECT requires
        m = mmap.mmap(-1, 10 * 1024 * 1024)
        m.write(bytes('A' * 10 * 1024 * 1024, 'utf-8'))
        filename = 'test.bin'
        # O_DIRECT must be OR-ed into the flags (the original snippet passed
        # it as the `mode` argument, where it has no effect)
        f = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_TRUNC |
                    os.O_SYNC | os.O_DIRECT)
        start = time.time()
        for x in range(500):
            os.write(f, m)
        os.close(f)
        transferTime_sec = time.time() - start
        msg = 'Wrote {:.0f}MB in {:0.03f}s.'
        print(msg.format(os.stat(filename).st_size / 1024 / 1024,
                         transferTime_sec))

    cProfile.run('ioTest()')

This reduced the write time on my machine by 40%, which is not bad. I did not use os.O_SEQUENTIAL or os.O_BINARY, which are Windows-only flags and not available on my machine.
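The reason mmap helps here is that O_DIRECT requires the user buffer to be aligned (typically to the device's logical block size), and anonymous mmaps start on a page boundary, which satisfies that. A quick check of the alignment (a sketch; addresses and page size vary by system):

```python
import ctypes
import mmap

buf = mmap.mmap(-1, 10 * 1024 * 1024)

# Address of the mapping; anonymous mmaps start on a page boundary
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
print('page size:', mmap.PAGESIZE)
print('page-aligned:', addr % mmap.PAGESIZE == 0)
```

A plain bytes object created in Python carries no such alignment guarantee, which is why writing it through an O_DIRECT descriptor can fail with EINVAL.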

[Edit]: I found out how to use the os.O_DIRECT flag from this site, which explains it very well and in detail. I highly recommend reading it if you're interested in performance and direct I/O in Python.


Source: https://habr.com/ru/post/1259935/

