Beginner python multiprocessing question?

I have several records in a database that I want to process. Basically, I want to run a few regular-expression substitutions over the tokens of each line of text and, at the end, write the results back to the database.
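For illustration, the per-record work is something like this (the patterns here are placeholders; the real ones are a bit more involved):

import re

# placeholder patterns -- the real substitutions are more involved
SUBS = [
    (re.compile(r"\s+"), " "),
    (re.compile(r"<[^>]+>"), ""),
]

def clean_line(line):
    for pattern, replacement in SUBS:
        line = pattern.sub(replacement, line)
    return line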

I want to know whether multiprocessing would reduce the time needed to complete such a task. I ran

multiprocessing.cpu_count()

and it returned 8. I tried something like

from multiprocessing import Process

# resultsSize is the total number of records; split the work into 4 chunks
division = resultsSize // 4
offset = 0
processes = []

for i in range(4):
    if i == 3:
        # the last worker picks up the remainder
        limit = resultsSize - (3 * division)
    else:
        limit = division

    # limit and offset describe the subset of records the worker fetches from the db
    p = Process(target=sub_table.processR, args=(limit, offset, i))
    p.start()
    processes.append(p)
    offset += division + 1

for p in processes:
    p.join()

but, apparently, this takes more time than running a single process. Why is this so? Can anyone tell me whether this is a suitable case for multiprocessing, or am I doing something wrong here?

+3
3 answers

A couple of questions:

  • Does processR fetch the records from the database one at a time? (Each round trip to the database is fairly expensive.)

  • Instead of working "online" against the database, why not fetch all the records first, process them, and write the results back in one batch? Or dump them to a CSV file and work on that? (There is a sketch of this below.)

Measure where the time actually goes before adding processes.
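A minimal sketch of the fetch-everything-first idea, assuming a sqlite3 database with a hypothetical lines(id, text) table:

import sqlite3

def clean_line(text):
    # placeholder for the real regex substitutions (as in the question)
    return " ".join(text.split())

conn = sqlite3.connect("records.db")  # hypothetical database file

# one read for everything instead of a round trip per record
rows = conn.execute("SELECT id, text FROM lines").fetchall()

# the regex work itself is cheap and happens in memory
updates = [(clean_line(text), row_id) for row_id, text in rows]

# one bulk write instead of an UPDATE per record
conn.executemany("UPDATE lines SET text = ? WHERE id = ?", updates)
conn.commit()
conn.close()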

+1

Why multiprocessing in the first place?

Is this problem CPU-bound, or is it I/O-bound?

That is the first thing to find out.

Starting a process is not free, and here every worker talks to the same database.

Your machine has (at most) 8 cores to spread work across, not 8 spare machines.

The operating system is already scheduling its own work on those cores. Four extra busy Python processes do not create extra capacity.

With 8 cores, the best you can hope for is roughly an 8x speedup, and only for pure computation. The moment the workers queue up behind a shared resource, such as one database, most of that gain evaporates.

You do not even need 8 explicit Python processes to use 8 cores. A unix pipeline like query | process1 | process2 | process3 >file already runs its stages concurrently, and the OS spreads them across the cores for you.

That is often the simplest design.

Database operations (connecting, querying, fetching, committing, etc.) are dominated by I/O, not computation. The (idle) CPU spends most of its time waiting for the (slow) disk and network, so extra processes mostly add overhead.

"Can anyone tell me whether this is a suitable case?" Measure it.

Time the one-process version. Time the four-process version. Compare the numbers. Only a measurement will tell you where the time really goes.
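For the "measure it" step, a throwaway benchmark along these lines usually settles the question; the work function below is just a stand-in for the real regex substitutions:

import time
from multiprocessing import Pool

def work(text):
    # stand-in for the real per-record regex work
    return text.replace("a", "b")

if __name__ == "__main__":
    data = ["some line of text to clean up"] * 200000

    start = time.time()
    serial = [work(t) for t in data]
    print("one process: %.3fs" % (time.time() - start))

    start = time.time()
    with Pool(4) as pool:
        parallel = pool.map(work, data, chunksize=5000)
    print("four processes: %.3fs" % (time.time() - start))

On work this light, the pickling and process-startup overhead often makes the pool version slower, which is exactly the effect described in the question.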

+5

In general, multi-CPU or multicore processing helps most when your problem is CPU-bound (i.e., it spends most of its time with the CPU running flat out).

From your description, you have an I/O-bound problem: it takes forever to get the data from the disk to the (idle) CPU, and then the actual CPU operation is very fast (because it is so simple).

Thus, adding more CPUs does not make a big difference overall.
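One way to confirm that on real data is to time the fetch and the processing separately. A minimal sketch, again assuming a sqlite3 database with a hypothetical lines(id, text) table:

import re
import sqlite3
import time

conn = sqlite3.connect("records.db")  # hypothetical database file

start = time.time()
rows = conn.execute("SELECT id, text FROM lines").fetchall()
io_time = time.time() - start

pattern = re.compile(r"\s+")  # stand-in for the real patterns
start = time.time()
cleaned = [pattern.sub(" ", text) for _, text in rows]
cpu_time = time.time() - start

conn.close()
print("fetch: %.3fs  regex: %.3fs" % (io_time, cpu_time))

If the fetch dominates, no number of worker processes will change the total much.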

+1

Source: https://habr.com/ru/post/1776360/

