How to use Python and OpenCV with multiprocessing?

I am using Python 3.4.3 and OpenCV 3.0.0 to process (apply various filters to) a very large image (80,000 x 60,000) in memory, and I would like to use multiple CPU cores to improve performance. After some reading, I arrived at two possible methods: 1) Use Python's multiprocessing module, let each process handle a slice of the large image, and stitch the results together after processing is done (and this should probably be done on a POSIX system?) 2) Since NumPy supports OpenMP, and OpenCV uses NumPy, can I just leave the multiprocessing to NumPy?

So my questions are:

Which one would be the better solution? (If neither seems reasonable, what would be a possible approach?)

If option 2 is good, should I build both NumPy and OpenCV with OpenMP? How would I actually get the multiprocessing to happen? (I could not find useful instructions.)

+7
3 answers

After reading some SO posts, I found a way to use OpenCV in Python 3 with multiprocessing. I recommend doing this on Linux because, according to this post, forked child processes share memory with their parent until the content is changed (copy-on-write). Here is a minimal example:

import cv2
import multiprocessing as mp
import numpy as np
import psutil

img = cv2.imread('test.tiff', cv2.IMREAD_ANYDEPTH)  # here I'm using an indexed 16-bit tiff as an example.
num_processes = 4
kernel_size = 11
tile_size = img.shape[0] // num_processes  # Assuming img.shape[0] is divisible by 4 in this case

output = mp.Queue()

def mp_filter(x, output):
    print(psutil.virtual_memory())  # monitor memory usage
    tile = img[tile_size * x:tile_size * (x + 1), :]
    # Put (index, result) as a single tuple: the second positional argument of put() is `block`.
    output.put((x, cv2.GaussianBlur(tile, (kernel_size, kernel_size), kernel_size / 5)))
    # note that you actually have to process a slightly larger block and leave out the border.

if __name__ == '__main__':
    processes = [mp.Process(target=mp_filter, args=(x, output)) for x in range(num_processes)]

    for p in processes:
        p.start()

    result = []
    for ii in range(num_processes):
        result.append(output.get(True))
    result.sort(key=lambda item: item[0])  # results arrive in completion order, so restore tile order

    for p in processes:
        p.join()
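The comment about processing a slightly larger block can be made concrete. Below is a minimal sketch, not part of the original answer, of how the row ranges might be computed. The helper name `tile_bounds` and the choice of `kernel_size // 2` rows as the overlap are assumptions for illustration: each tile is read with extra rows on both sides, and those rows are cropped away after filtering.

```python
def tile_bounds(height, num_tiles, halo):
    """Return one (read_start, read_stop, keep_start, keep_stop) tuple per tile.

    read_* are row indices into the full image, including the extra halo rows.
    keep_* are row indices into the filtered tile that select the rows
    belonging to the tile itself (the halo is cropped away).
    """
    bounds = []
    for i in range(num_tiles):
        # Integer arithmetic spreads any remainder rows over the tiles,
        # so the height does not have to be divisible by num_tiles.
        start = height * i // num_tiles
        stop = height * (i + 1) // num_tiles
        # Extend the tile by the halo, clamped to the image.
        read_start = max(start - halo, 0)
        read_stop = min(stop + halo, height)
        keep_start = start - read_start
        keep_stop = keep_start + (stop - start)
        bounds.append((read_start, read_stop, keep_start, keep_stop))
    return bounds


if __name__ == "__main__":
    kernel_size = 11
    for bounds in tile_bounds(height=100, num_tiles=4, halo=kernel_size // 2):
        print(bounds)
```

Inside a worker, the tile would then be filtered as `img[read_start:read_stop]` and the result sliced with `[keep_start:keep_stop]` before it is returned, so that the seams between tiles match what a single full-image filter would produce.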

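Instead of using a Queue, another way to collect the results from the processes is to create a shared array through the multiprocessing module (this requires importing ctypes):

import ctypes

result = mp.Array(ctypes.c_uint16, img.shape[0]*img.shape[1], lock = False)

Each process can then write to a different portion of the array, assuming the portions do not overlap. Creating a large mp.Array is surprisingly slow, however, which works against the goal of speeding things up, so use it only when the added time is small compared with the total computation time. The array can be turned into a NumPy array with:

result_np = np.frombuffer(result, dtype=ctypes.c_uint16)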
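As a rough illustration of the shared-array approach, here is a small self-contained sketch using only the standard library. It is not the original author's code: `fill_rows` is a hypothetical stand-in for a real filter, and the sketch assumes the "fork" start method (POSIX only), in line with the Linux recommendation above.

```python
import ctypes
import multiprocessing as mp


def fill_rows(shared, width, row_start, row_stop, value):
    """Stand-in for a filter: write `value` into rows [row_start, row_stop)."""
    count = (row_stop - row_start) * width
    # Each worker touches only its own slice, so no lock is needed.
    shared[row_start * width:row_stop * width] = [value] * count


def run(height=8, width=4, num_processes=4):
    # Assumption: "fork" start method, so children inherit the shared buffer.
    ctx = mp.get_context("fork")
    shared = ctx.Array(ctypes.c_uint16, height * width, lock=False)

    processes = []
    for i in range(num_processes):
        row_start = height * i // num_processes
        row_stop = height * (i + 1) // num_processes
        p = ctx.Process(target=fill_rows,
                        args=(shared, width, row_start, row_stop, i + 1))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

    # The parent sees everything the children wrote.
    return list(shared)


if __name__ == "__main__":
    print(run())
```

With a real image, the flat buffer would be wrapped with `np.frombuffer` and reshaped to the image shape, and each worker would write its filtered tile into its own rows.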
+2

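I don't know what kinds of filters you need, but if they are reasonably simple, you could consider libvips. It is an image processing system for very large images (larger than the amount of memory you have). It focuses on the kinds of operations needed for image capture and comparison: convolution, rank, morphology, arithmetic, colour analysis, resampling, histograms, and so on.

It is fast (faster than OpenCV, on some benchmarks at least), needs little memory, and has a high-level Python binding. It works on Linux, OS X and Windows. It handles all the multiprocessing for you automatically.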

+4

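This can be done cleanly with Ray, a library for parallel and distributed Python. Ray reasons about "tasks" instead of using a fork-join model, which gives some additional flexibility (for example, you can share values even after the worker processes have been started), the same code runs on a single machine or on multiple machines, and tasks can be composed together.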

import cv2
import numpy as np
import ray

num_tasks = 4
kernel_size = 11


@ray.remote
def mp_filter(image, i):
    lower = image.shape[0] // num_tasks * i
    upper = image.shape[0] // num_tasks * (i + 1)
    return cv2.GaussianBlur(image[lower:upper, :],
                            (kernel_size, kernel_size), kernel_size // 5)


if __name__ == '__main__':
    ray.init()

    # Load the image and store it once in shared memory.
    image = np.random.normal(size=(1000, 1000))
    image_id = ray.put(image)

    result_ids = [mp_filter.remote(image_id, i) for i in range(num_tasks)]
    results = ray.get(result_ids)

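Note that you can store more than just NumPy arrays in shared memory; you also benefit if you have Python objects that contain NumPy arrays (such as dictionaries containing NumPy arrays). Under the hood, this uses the Plasma shared-memory object store and the Apache Arrow data layout.

You can read more in the Ray documentation. Note that I am one of the Ray developers.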

0

Source: https://habr.com/ru/post/1608931/

