Python rpy2 and matplotlib conflict when using multiprocessing

I am trying to compute data and generate plots using multiprocessing. On Linux the code below runs correctly, but on a Mac (OS X Mountain Lion) it fails with the error shown further down:

    import multiprocessing
    import matplotlib.pyplot as plt
    import numpy as np
    import rpy2.robjects as robjects

    def main():
        pool = multiprocessing.Pool()
        num_figs = 2

        # generate some random numbers
        input = zip(np.random.randint(10,1000,num_figs), range(num_figs))
        pool.map(plot, input)

    def plot(args):
        num, i = args
        fig = plt.figure()
        data = np.random.randn(num).cumsum()
        plt.plot(data)

    main()

I am using rpy2 2.3.1 and R 2.13.2 (I could not install the latest R 3.0 together with rpy2 on any Mac without getting a segmentation fault).

Error:

    The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
    Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
    The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().

I have tried everything I could think of to work out what the problem is, with no luck. My configuration:

    Danials-MacBook-Pro:~ danialt$ brew --config
    HOMEBREW_VERSION: 0.9.4
    ORIGIN: https://github.com/mxcl/homebrew
    HEAD: 705b5e133d8334cae66710fac1c14ed8f8713d6b
    HOMEBREW_PREFIX: /usr/local
    HOMEBREW_CELLAR: /usr/local/Cellar
    CPU: dual-core 64-bit penryn
    OS X: 10.8.3-x86_64
    Xcode: 4.6.2
    CLT: 4.6.0.0.1.1365549073
    GCC-4.2: build 5666
    LLVM-GCC: build 2336
    Clang: 4.2 build 425
    X11: 2.7.4 => /opt/X11
    System Ruby: 1.8.7-358
    Perl: /usr/bin/perl
    Python: /usr/local/bin/python => /usr/local/Cellar/python/2.7.4/Frameworks/Python.framework/Versions/2.7/bin/python2.7
    Ruby: /usr/bin/ruby => /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby

Any ideas?

3 answers

This error occurs on Mac OS X when you perform a GUI operation outside of the main thread, which is exactly what you are doing by moving your plot function into a multiprocessing.Pool (I suspect it will not work on Windows either, for the same reason, since Windows has the same requirement). The only way I can see this working is to use the pool to generate the data and have your main thread wait in a loop for the data that comes back (a queue is how I usually handle it...).

Here is an example. I recognize it may not do what you want (plot all the figures "simultaneously"?): plt.show() blocks, so only one figure is drawn at a time. I note that you do not call it in your sample code, but without it I see nothing on my screen. If I take it out, there is no blocking and no error, because all GUI functions happen in the main thread:

    import multiprocessing
    import matplotlib.pyplot as plt
    import numpy as np
    import rpy2.robjects as robjects

    data_queue = multiprocessing.Queue()

    def main():
        pool = multiprocessing.Pool()
        num_figs = 10

        # generate some random numbers
        input = zip(np.random.randint(10,10000,num_figs), range(num_figs))
        pool.map(worker, input)

        figs_complete = 0
        while figs_complete < num_figs:
            data = data_queue.get()
            plt.figure()
            plt.plot(data)
            plt.show()
            figs_complete += 1

    def worker(args):
        num, i = args
        data = np.random.randn(num).cumsum()
        data_queue.put(data)
        print('done ',i)

    main()
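
A variation on the same idea, as a minimal sketch rather than a tested fix for the setup in the question: since pool.map already returns the workers' results in input order, the queue can be dropped and the data handed straight back to the parent for plotting. The function names (random_walk, compute_walks, plot_walks) are made up for illustration. The sketch is written for Python 3 (the question used 2.7) and assumes the workers only need the standard library, so itertools.accumulate stands in for numpy's cumsum, and matplotlib is imported only inside the function that runs in the parent process.

```python
import itertools
import multiprocessing
import random


def random_walk(args):
    """Worker: cumulative sum of `num` Gaussian samples.

    Only standard-library code runs here, so the child process never
    touches a GUI toolkit.
    """
    num, seed = args
    rng = random.Random(seed)
    return list(itertools.accumulate(rng.gauss(0.0, 1.0) for _ in range(num)))


def compute_walks(lengths):
    """Compute one walk per requested length in a process pool."""
    jobs = [(num, seed) for seed, num in enumerate(lengths)]
    with multiprocessing.Pool() as pool:
        # map() blocks until every job is done and keeps the input order.
        return pool.map(random_walk, jobs)


def plot_walks(walks):
    """Draw every walk; meant to be called from the parent process only."""
    import matplotlib.pyplot as plt  # imported lazily, never in a worker

    for walk in walks:
        plt.figure()
        plt.plot(walk)
    plt.show()  # blocks until the figure windows are closed
```

To try it, call plot_walks(compute_walks([100, 500])) from under an `if __name__ == "__main__":` guard, so that worker processes importing the module do not start pools of their own.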

Hope this helps.


I had a similar problem with my worker, which loads some data, generates a plot, and saves it to a file. Note that this is slightly different from the OP's case, which seems to be geared toward interactive plotting. Still, I think it is relevant.

A simplified version of my code:

    import multiprocessing

    def worker(id):
        data = load_data(id)
        plot_data_to_file(data)  # Generates a plot and saves it to a file.

    def plot_something_parallel(ids):
        pool = multiprocessing.Pool()
        pool.map(worker, ids)

    plot_something_parallel(ids=[1,2,3])

This produced the same error that others have described:

    The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
    Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.

Following @bbbruce's line of thinking, I solved my problem by switching the matplotlib backend from TkAgg back to the default. Specifically, I commented out the following line in my matplotlibrc file:

 #backend : TkAgg 
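
For workers that only write files, the backend can also be chosen in code instead of in matplotlibrc. The sketch below selects Agg, a non-interactive backend that renders to files without opening a GUI toolkit, via matplotlib.use before pyplot is imported. Whether this avoids the CoreFoundation error on a given setup is an assumption, not something verified here. The names output_path, plot_to_file and plot_all, and the plot_N.png naming scheme, are hypothetical, and the code targets Python 3.

```python
import multiprocessing
import os


def output_path(plot_id, directory="."):
    """Hypothetical naming scheme for the saved figures."""
    return os.path.join(directory, "plot_{}.png".format(plot_id))


def plot_to_file(args):
    """Worker: render one figure with the non-interactive Agg backend."""
    plot_id, values, directory = args

    import matplotlib
    matplotlib.use("Agg")  # select the backend before pyplot is imported
    import matplotlib.pyplot as plt

    fig = plt.figure()
    plt.plot(values)
    path = output_path(plot_id, directory)
    fig.savefig(path)
    plt.close(fig)  # free the figure so long-running workers do not leak memory
    return path


def plot_all(datasets, directory="."):
    """Save one figure per dataset, in parallel; returns the file paths."""
    jobs = [(i, values, directory) for i, values in enumerate(datasets)]
    with multiprocessing.Pool() as pool:
        return pool.map(plot_to_file, jobs)
```

Calling plot_all([[1, 2, 3], [3, 2, 1]]) from under an `if __name__ == "__main__":` guard would write plot_0.png and plot_1.png to the current directory.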

It could be rpy2-specific. Similar issues with OS X and multiprocessing have been reported elsewhere.

I think that using an initializer that imports the packages needed to run the code in the plot function might solve the problem (see the multiprocessing documentation).


Source: https://habr.com/ru/post/943782/

