If you have a multi-core machine and can use Python 3.2 (instead of Python 2), this is a good use case for the new concurrent.futures module in Python 3.2, depending on the processing you need to do with each line. If the processing must be performed in file order, you may have to worry about reassembling the results later.
Otherwise, with concurrent.futures you can schedule each record to be processed in a separate task without much effort. What output do you need to generate from this?
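Here is a minimal sketch of the two scheduling styles and how ordering behaves in each. It uses ThreadPoolExecutor and an invented work function plus a hard-coded lines list so that it is self-contained; the same Executor API applies to ProcessPoolExecutor, which is what the rest of this answer uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def work(line):
    # stand-in for whatever per-line processing you need
    return len(line)

lines = ["first line\n", "second\n", "a third line\n"]

with ThreadPoolExecutor(max_workers=2) as ex:
    # ex.map yields results in input order, so no reassembly is needed
    ordered = list(ex.map(work, lines))

    # ex.submit + as_completed yields results in completion order;
    # keep the original index with each future to restore file order
    futures = {ex.submit(work, ln): i for i, ln in enumerate(lines)}
    by_index = [None] * len(lines)
    for fut in as_completed(futures):
        by_index[futures[fut]] = fut.result()
```

After the pool finishes, both ordered and by_index hold the results in the original file order; the difference is only whether you had to track indices yourself.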
If you think parallelizing the per-line processing would not pay off, then the most obvious approach is the best one: that is, what you just did.
This example farms the processing out to up to 5 subprocesses, each executing Python's built-in len function. Replace len with a function that takes a string as a parameter and performs whatever processing you need on that string:
    from concurrent.futures import ProcessPoolExecutor as Executor

    with Executor(max_workers=5) as ex:
        with open("poeem_5.txt") as fl:
            results = list(ex.map(len, fl))
The list call is needed to force the mapping to be consumed inside the with statement. If you do not need one scalar value per line, but instead want to write the results to a file, you can do that in a for loop:
    for line in fl:
        ex.submit(my_function, line)
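The loop above is a fragment: fl and ex come from the earlier example, and my_function is whatever processing you supply. A self-contained sketch of the same pattern, with an invented my_function and io.StringIO standing in for the real input and output files (ThreadPoolExecutor is used here only so the demo needs no picklable top-level function):

```python
from concurrent.futures import ThreadPoolExecutor
import io

def my_function(line):
    # hypothetical per-line processing: upper-case the stripped line
    return line.strip().upper()

infile = io.StringIO("one\ntwo\nthree\n")   # stands in for the input file
outfile = io.StringIO()                     # stands in for the output file

with ThreadPoolExecutor(max_workers=2) as ex:
    # submit each line as its own task, keeping the futures in file order
    futures = [ex.submit(my_function, line) for line in infile]
    # fut.result() blocks until that line's task has finished
    for fut in futures:
        outfile.write(fut.result() + "\n")

written = outfile.getvalue()
```

Because the futures list preserves submission order, the output file ends up in the same order as the input even though the tasks may complete out of order.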