Python: using map and multiprocessing

I am trying to write a function that takes two arguments and then run it in parallel with multiprocessing.Pool. I ran into trouble even with this simple function:

    from multiprocessing import Pool
    import pandas as pd

    df = pd.DataFrame()
    df['ind'] = [111, 222, 333, 444, 555, 666, 777, 888]
    df['ind1'] = [111, 444, 222, 555, 777, 333, 666, 777]

    def mult(elem1, elem2):
        return elem1 * elem2

    if __name__ == '__main__':
        pool = Pool(processes=4)
        print(pool.map(mult, df.ind.astype(int).values.tolist(), df.ind1.astype(int).values.tolist()))
        pool.terminate()

It returns an error:

 TypeError: unsupported operand type(s) for //: 'int' and 'list' 

I don’t understand what happened. Can someone explain what this error means and how I can fix it?

1 answer

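Pool.map takes a function and a single iterable, and calls the function with one item from that iterable at a time, so the function can receive only one argument. In your call the second list lands in the third positional parameter of Pool.map, chunksize, which is expected to be an integer; the pool then attempts an integer division by that list, which is what the // in the error message refers to. You can fix this by doing the following: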

    from multiprocessing import Pool
    import pandas as pd

    df = pd.DataFrame()
    df['ind'] = [111, 222, 333, 444, 555, 666, 777, 888]
    df['ind1'] = [111, 444, 222, 555, 777, 333, 666, 777]

    def mult(elements):
        elem1, elem2 = elements
        return elem1 * elem2

    if __name__ == '__main__':
        pool = Pool(processes=4)
        inputs = zip(df.ind.astype(int).values.tolist(), df.ind1.astype(int).values.tolist())
        print(pool.map(mult, inputs))
        pool.terminate()

What I did here is zip your two lists together, so that each element of the result is a tuple holding the two arguments you want to pass. I then changed your function to accept that single tuple and unpack it into elem1 and elem2 before multiplying.
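If you would rather keep the original two-argument signature of mult, one alternative (not part of the answer above, and assuming Python 3.3 or later) is Pool.starmap, which unpacks each tuple into separate arguments for you. A minimal sketch; parallel_mult is a hypothetical helper name used only for illustration, and plain lists stand in for the DataFrame columns:

```python
from multiprocessing import Pool


def mult(elem1, elem2):
    # Defined at module level so worker processes can find it.
    return elem1 * elem2


def parallel_mult(xs, ys, processes=4):
    # Hypothetical helper: pairs the inputs with zip and lets
    # starmap call mult(x, y) for each (x, y) tuple.
    with Pool(processes=processes) as pool:
        return pool.starmap(mult, zip(xs, ys))


if __name__ == '__main__':
    ind = [111, 222, 333, 444, 555, 666, 777, 888]
    ind1 = [111, 444, 222, 555, 777, 333, 666, 777]
    print(parallel_mult(ind, ind1))
```

The with block closes the pool once the results are back, so no explicit pool.terminate() call is needed in this version.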


Source: https://habr.com/ru/post/1263593/

