Python: How to start a nested parallel process in python?

Question

Python: How to start a nested parallel process in python?

I have a dftrader's transaction data set . I have 2 levels for loops:

smartTrader =[]

for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    # I have some more calculations here
    for trader in range(len(df['TraderID'])):
        # I have some calculations here, If trader is successful, I add his ID  
        # to the list as follows
        smartTrader.append(df['TraderID'][trader])

    # some more calculations here which are related to the first for loop.

I would like to parallelize the calculations for each asset in Assets, and I also want to parallelize the calculations for each trader for each asset. After all these calculations are completed, I want to do an additional analysis based on the list smartTrader.

This is my first attempt at parallel processing, so please be patient with me and I appreciate your help.

+1

python parallel-processing python-multiprocessing

roland Jul 22 '15 at 1:58

source share

3 answers

Mike McKerns · Answer 1 · 2015-07-24T18:45:22+0000

pathos, multiprocessing, . pathos , . , , , , , , .

>>> from pathos.pools import ProcessPool, ThreadPool
>>> amap = ProcessPool().amap
>>> tmap = ThreadPool().map
>>> from math import sin, cos
>>> print amap(tmap, [sin,cos], [range(10),range(10)]).get()
[[0.0, 0.8414709848078965, 0.9092974268256817, 0.1411200080598672, -0.7568024953079282, -0.9589242746631385, -0.27941549819892586, 0.6569865987187891, 0.9893582466233818, 0.4121184852417566], [1.0, 0.5403023058681398, -0.4161468365471424, -0.9899924966004454, -0.6536436208636119, 0.2836621854632263, 0.9601702866503661, 0.7539022543433046, -0.14550003380861354, -0.9111302618846769]]

, , ( get ).

pathos : https://github.com/uqfoundation : $ pip install git+https://github.com/uqfoundation/pathos.git@master

Felipe Lema · Answer 2 · 2015-07-23T13:07:33+0000

for map:

import functools
smartTrader =[]

m=map( calculations_as_a_function, 
        [df[df['Assets'] == asset] \
                for asset in range(len(Assets))])
functools.reduce(smartTradder.append, m)

map s.a. multiprocessing 's stackless'

rth · Answer 3 · 2015-07-24T19:02:17+0000

Probably multithreading, from the python standard library, is most convenient:

import threading

def worker(id):
    #Do you calculations here
    return

threads = []
for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    for trader in range(len(df['TraderID'])):
        t = threading.Thread(target=worker, args=(trader,))
        threads.append(t)
        t.start()
    #add semaphore here if you need synchronize results for all traders.

Python: How to start a nested parallel process in python?

More articles: