Threading to improve performance

I have never used threads before - I never thought my code would benefit from them. However, I think threads could improve the performance of the following pseudocode:

 Loop through table of records containing a security symbol field and a quote field
   Load a web page (containing a security quote for the symbol) into a string variable
   Parse the string for the quote
   Save the quote in the table
   Get next record
 end loop

Loading each web page takes the most time; parsing out the quote is quick. I think I could give, say, the first half of the records to one thread and the second half to another.
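The split-in-half idea can be sketched as follows (in Python rather than the asker's environment; `fetch_quote` is a hypothetical stand-in for "load the page and parse the quote"):

```python
import threading

def fetch_quote(symbol):
    # Stand-in for downloading the quote page and parsing it;
    # a real version would issue an HTTP request here.
    return f"quote-for-{symbol}"

def worker(records, results, lo, hi):
    # Each thread processes its own slice of the table, so the
    # slow downloads overlap instead of running one after another.
    for i in range(lo, hi):
        results[i] = fetch_quote(records[i])

records = ["AAPL", "MSFT", "GOOG", "IBM"]
results = [None] * len(records)
mid = len(records) // 2

threads = [
    threading.Thread(target=worker, args=(records, results, 0, mid)),
    threading.Thread(target=worker, args=(records, results, mid, len(records))),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Because each thread writes only to its own indices of `results`, no locking is needed in this particular sketch.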

2 answers

In OmniThreadLibrary it is very simple to solve this problem with a multi-stage pipeline: the first stage runs in multiple tasks and downloads the web pages, and the second stage runs in a single instance and stores the data in the database. I wrote a blog post documenting this solution some time ago.

The solution can be summarized with the following code (you will need to fill in some blanks in the HttpGet and Inserter methods).

 uses
   OtlCommon, OtlCollections, OtlParallel;

 function HttpGet(url: string; var page: string): boolean;
 begin
   // retrieve page contents from the url; return False if page is not accessible
 end;

 procedure Retriever(const input: TOmniValue; var output: TOmniValue);
 var
   pageContents: string;
 begin
   if HttpGet(input.AsString, pageContents) then
     output := TPage.Create(input.AsString, pageContents);
 end;

 procedure Inserter(const input, output: IOmniBlockingCollection);
 var
   page   : TOmniValue;
   pageObj: TPage;
 begin
   // connect to database
   for page in input do begin
     pageObj := TPage(page.AsObject);
     // insert pageObj into database
     FreeAndNil(pageObj);
   end;
   // close database connection
 end;

 procedure ParallelWebRetriever;
 var
   pipeline: IOmniPipeline;
   s       : string;
   urlList : TStringList;
 begin
   // set up pipeline
   pipeline := Parallel.Pipeline
     .Stage(Retriever).NumTasks(Environment.Process.Affinity.Count * 2)
     .Stage(Inserter)
     .Run;
   // insert URLs to be retrieved
   for s in urlList do
     pipeline.Input.Add(s);
   pipeline.Input.CompleteAdding;
   // wait for pipeline to complete
   pipeline.WaitFor(INFINITE);
 end;
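For readers not working in Delphi, the same pipeline shape can be approximated with standard-library queues in Python. This is a rough analogue, not OmniThreadLibrary itself: several retriever threads drain a URL queue in parallel, and a single inserter thread serializes the "database" writes (the download and insert are simulated here):

```python
import queue
import threading

NUM_RETRIEVERS = 4
url_queue = queue.Queue()    # stage 1 input
page_queue = queue.Queue()   # stage 1 output / stage 2 input
stored = []                  # stands in for the database table

def retriever():
    # Stage 1: several of these run in parallel, since the
    # download dominates the total time.
    while True:
        url = url_queue.get()
        if url is None:          # sentinel: no more work
            break
        page_queue.put((url, f"contents-of-{url}"))  # simulated download

def inserter():
    # Stage 2: a single instance, so database writes are serialized.
    while True:
        item = page_queue.get()
        if item is None:
            break
        stored.append(item)

workers = [threading.Thread(target=retriever) for _ in range(NUM_RETRIEVERS)]
sink = threading.Thread(target=inserter)
for w in workers:
    w.start()
sink.start()

for url in ["u1", "u2", "u3", "u4", "u5"]:
    url_queue.put(url)
for _ in workers:                # one sentinel per retriever thread
    url_queue.put(None)
for w in workers:
    w.join()
page_queue.put(None)             # all retrievers done; stop the inserter
sink.join()

print(sorted(p[0] for p in stored))
```

The sentinel values play the role of OmniThreadLibrary's `CompleteAdding`: they tell each stage that no further input will arrive.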

If the number of records is relatively small, say 50 or fewer, you can simply start a separate thread for each record and let them all run in parallel, for example:

 begin thread
   Load a web page for symbol into a string variable
   Parse the string for the quote
   Save the quote in the table
 end thread

The main loop then simply launches those threads:

 Loop through table of records
   Launch a thread for current security symbol
   Get next record
 end loop
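Under the same assumptions as before (a hypothetical `fetch_quote` doing the download and parse), the one-thread-per-record variant looks like this in Python:

```python
import threading

def fetch_quote(symbol):
    # Placeholder for "load page, parse quote"; a real version
    # would perform the HTTP request here.
    return f"quote-for-{symbol}"

quotes = {}
lock = threading.Lock()

def process(symbol):
    q = fetch_quote(symbol)
    with lock:                 # "save the quote in the table"
        quotes[symbol] = q

symbols = ["AAPL", "MSFT", "GOOG"]
threads = [threading.Thread(target=process, args=(s,)) for s in symbols]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(quotes)
```

The lock guards the shared table, since all threads write to the same structure; with one thread per record this approach stops scaling well beyond a few dozen records, which is what motivates the pool below.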

If you have more records to process, consider using a thread pool so you can process records in smaller batches, for example:

 Create X threads
 Put threads in a list
 Loop through table of records
   Wait until a thread in pool is idle
   Get idle thread from pool
   Assign current security symbol to thread
   Signal thread
   Get next record
 end loop
 Wait for all threads to be idle
 Terminate threads

Each pooled thread then runs a loop like this:

 begin thread
   Loop until terminated
     Mark idle
     Wait for signal
     If not Terminated
       Load a web page for current symbol into a string variable
       Parse the string for the quote
       Save the quote in the table
     end if
   end loop
 end thread
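A minimal queue-based version of this pool, again sketched in Python with a stubbed `fetch_quote` (a work queue replaces the explicit idle/signal handshake of the pseudocode, and a sentinel value plays the role of the Terminated flag):

```python
import queue
import threading

NUM_THREADS = 3
work = queue.Queue()
quotes = {}
lock = threading.Lock()

def fetch_quote(symbol):
    return f"quote-for-{symbol}"   # placeholder for download + parse

def worker():
    # "Loop until terminated": a None item is the termination signal.
    while True:
        symbol = work.get()
        if symbol is None:
            break
        q = fetch_quote(symbol)
        with lock:                 # "save the quote in the table"
            quotes[symbol] = q

pool = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in pool:
    t.start()

for symbol in ["AAPL", "MSFT", "GOOG", "IBM", "ORCL"]:
    work.put(symbol)
for _ in pool:
    work.put(None)   # one termination signal per pooled thread
for t in pool:
    t.join()

print(len(quotes))
```

The queue's internal blocking stands in for "Wait until a thread in pool is idle": producers and consumers synchronize on `work` without any manual event signaling.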

There are many different ways to implement the above, which is why I left it in pseudocode. Take a look at the TThread, TList, and TEvent classes in the VCL/RTL, the Win32 API function QueueUserWorkItem(), or any number of third-party threading libraries.


Source: https://habr.com/ru/post/908238/
