Is it possible to multiprocess in a loop in Python 3.2?

I am trying to use Python (3.2) multiprocessing (on Ubuntu) to solve a massive search problem. Basically, I want to take a list, pop its first element, find all other elements that have the same properties as that object, join the found elements and the target element into one list, remove them from the original list, and loop to do it all again. The multiprocessing is meant to split the work between processors. The code executes once without any problems, and it essentially does loop, since the exception is ignored and it seems to do its job. But within 30 seconds it uses almost all of my 16 GB of RAM.

My two problems so far: 1) As soon as I loop, I get "AssertionError exception: AssertionError('can only join a child process',) ignored" (and I get a lot of them). Along with this there is the huge RAM usage (which, I think, may be related, but I'm not sure). And 2) it does not even seem to perform the search in parallel when using a larger data set.

My code looks like this:

    class triangleListWorker(multiprocessing.Process):
        def __init__(self, work_queue, target, results, start):
            super().__init__()
            self.work_queue = work_queue
            self.results = results
            self.target = target
            self.startIndex = start

        def run(self):
            while True:
                try:
                    searching = self.work_queue.get()
                    self.do_search(searching)
                finally:
                    self.work_queue.task_done()

        def do_search(self, searching):
            for x in range(len(searching)):
                if self.target.same_plane(searching[x]):
                    self.results.append(self.startIndex + x)

What I'm trying to do here is use a Manager().list() to store the indexes of all objects that lie in the same plane as the target object.

    def do_multi_find_connections(self, target, searchList):
        work_queue = multiprocessing.JoinableQueue()
        #results = multiprocessing.Queue()
        cpu_count = multiprocessing.cpu_count()
        results = multiprocessing.Manager().list()
        range_per_process = len(searchList) // cpu_count
        start, end = 0, range_per_process + (len(searchList) % cpu_count)
        for i in range(cpu_count):
            worker = triangleListWorker(work_queue, target, results, start)
            worker.daemon = True
            worker.start()
        for x in range(cpu_count):
            searchsub = [searchList[x] for x in range(start, end)]
            work_queue.put(searchList[start:end])
            #work_queue.put(searchList[start:end])
            start, end = end, end + range_per_process
            print(start, end)
        work_queue.join()
        print("can continue...")
        return results

    def find_connections(self, triangle_list, doMultiProcessing):
        tlist = [x for x in triangle_list]
        print("len tlist", len(tlist))
        results = []
        self.byPlane = []
        if doMultiProcessing:
            while len(tlist) > 0:
                results = []
                target = tlist[0]
                #print("target", tcopy[0])
                self.do_multi_find_connections(target, tlist)
                results = self.do_multi_find_connections(target, tlist)  # list of indexes
                plane = []
                print(len(results))
                print(results)
                for x in results:
                    plane.append(tlist[x])
                new_tlist = [tlist[x] for x in range(len(tlist)) if not x in results]
                print(len(new_tlist))
                tlist = new_tlist
                self.byPlane.append(plane)
##                self.byPlane.append(plane)
##                tlist = []

This code (perhaps a little ugly) is supposed to loop: on each pass it finds the next plane, pulls out everything else lying in that plane, and calls the function on it (which does the multiprocessing).

Running on Ubuntu 11.04 64-bit, Python 3.2.

2 answers

Instead of using a loop, I think the intended pattern for the multiprocessing module is to create a Pool and use Pool.map_async. In other words, convert your loop into some kind of iterator (possibly a generator), then pass the equivalent of your do_search method as the function, along with that iterator, to map_async.


You can use the Pool class from multiprocessing:

    from multiprocessing import Pool
    pool = Pool(processes=5)
    valuesProcessed = pool.map(someFunction, valuesToProcess)
