Logging when using Parallel Python

I use Parallel Python to execute a large function (executePipeline) several times. This function itself also spawns processes via the multiprocessing module. I am having trouble displaying the correct log messages on my console when using the Parallel Python (pp) module. When I do not use it, the log messages display fine.

Here's how it works. I have a server that calls a worker each time it receives a request from a client using:

    job = self.server.job_server.submit(func=executeWorker, args=(config,))

This function is executed in a new thread every time a new request arrives from a client. The worker then calls the executePipeline function, which runs various processes using multiprocessing.
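
For reference, a minimal sketch of the shape of executePipeline (hypothetical; the original function is not shown), just to illustrate where the child processes come from. The step names are invented, and the return keys match the worker code further below:

    import logging
    import multiprocessing

    def pipelineStep(name):
        # runs in a separate child process; log records emitted here are
        # the ones that fail to reach the console in the setup below
        logging.getLogger().info("step %s running", name)

    def executePipeline(config):
        procs = [multiprocessing.Process(target=pipelineStep, args=(name,))
                 for name in ("extract", "transform", "load")]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return {"marker.score": 0.0, "marker.detail": "..."}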

The server is a SocketServer.TCPServer that uses threads. I set up the logger on my server as the root logger, as shown:

    self.logger = logging.getLogger()
    self.logger.setLevel(logging.INFO)
    self.logger.addHandler(logging.StreamHandler())
    self.job_server = pp.Server(ncpus=8)  # for test
    self.jobs = []

When I start my server, I only get log messages from executePipeline itself, but not from its child processes. Also, I only get the pipeline's log output at the end of the job, rather than while it is running.

Also, here is the worker code. "Executing pipeline with worker ..." displays fine in my terminal:

    ''' Setup logging '''
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    # worker name
    publicIP = socket.gethostbyname(socket.gethostname())
    pid = os.getpid()
    workerID = unicode(str(publicIP) + ":" + str(pid))

    logger.info("Executing pipeline with worker {}".format(workerID))

    res = executePipeline(config)

    markedScore = res["marker.score"]
    markedDetails = res["marker.detail"]

    results = {'marker.detail': markedDetails, 'marker.score': markedScore}
    return results

Is there a good way to get logging to work correctly and see what the child processes of my executePipeline function send back?

Thank you for your help!

Romanzo

1 answer

I had a similar problem when I tried to write parallelized tests that write their results to a shared dictionary. The answer is multiprocessing.Manager:

    # create shared results dictionary
    manager = multiprocessing.Manager()
    result_dict = manager.dict({})

so you can simply push the log messages from the processes into this shared dictionary and process them afterwards.
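
A minimal sketch of that idea (the worker function and message format are made up for illustration): each child process buffers its own log lines and publishes them under its pid in the shared dictionary, and the parent process, where the handlers are configured, emits them afterwards:

    import logging
    import multiprocessing
    import os

    def worker(config, log_store):
        # hypothetical worker: buffer this process's log lines locally,
        # then publish them under the worker's pid in the shared dict
        lines = ["pid {0}: starting".format(os.getpid())]
        # ... the real pipeline work would go here ...
        lines.append("pid {0}: done".format(os.getpid()))
        log_store[os.getpid()] = lines

    if __name__ == "__main__":
        manager = multiprocessing.Manager()
        log_store = manager.dict()  # shared between parent and children
        procs = [multiprocessing.Process(target=worker, args=(None, log_store))
                 for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # the parent, where the StreamHandler lives, emits the collected lines
        logging.basicConfig(level=logging.INFO)
        for pid, lines in log_store.items():
            for line in lines:
                logging.getLogger().info(line)

Note that this only makes the children's messages visible after they finish; it does not stream them while the job is running.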

Or use LOG = multiprocessing.get_logger() as described here: https://docs.python.org/2/library/multiprocessing.html and here: How should I log while using multiprocessing in Python?
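
A minimal sketch of the get_logger approach (assuming the children are started by fork, so they inherit the handler set up in the parent): multiprocessing.log_to_stderr() attaches a stderr handler to multiprocessing's own logger, which child processes can use directly:

    import logging
    import multiprocessing
    import os

    def worker(i):
        # multiprocessing's module-level logger; safe to call from children
        log = multiprocessing.get_logger()
        log.info("worker %d running in pid %d", i, os.getpid())

    if __name__ == "__main__":
        # attach a stderr handler and set the level on multiprocessing's logger
        multiprocessing.log_to_stderr(logging.INFO)
        procs = [multiprocessing.Process(target=worker, args=(i,))
                 for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()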
