Controlling the consumption of threads and memory while working with a blocking process

I have a bunch of files (about 10 per second) coming into the system, which are stored in a database. Each file contains entries for 1 to 500 devices. A given device may appear in several files (but not in every file). This data ultimately has to be stored in another database, organized per device. There are two different file formats.

There is an API that handles the final database. It accepts several records for one device at a time (behind the scenes it also does some lookups to resolve identifiers in that database, so processing several records for one device in one call means performing the lookup once, rather than once per record).
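The amortized-lookup point can be illustrated by grouping parsed records per device before handing them to the API. This is a minimal sketch; the `(device_id, payload)` record shape and `group_by_device` name are assumptions, not part of the real system.

```python
from collections import defaultdict

def group_by_device(records):
    """Group parsed records so each device goes to the API in one call.

    `records` are hypothetical (device_id, payload) pairs; the point is
    one API call (and thus one identifier lookup) per device instead of
    one per record.
    """
    grouped = defaultdict(list)
    for device_id, payload in records:
        grouped[device_id].append(payload)
    return dict(grouped)

records = [("dev-a", 1), ("dev-b", 2), ("dev-a", 3)]
print(group_by_device(records))  # {'dev-a': [1, 3], 'dev-b': [2]}
```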

For this, I have a program with several parts:

  • Parse the files, extracting the data into a common set of data objects.
    • This is a multithreaded stage with one thread per file, adding data to a thread-safe collection.
    • As soon as each file is parsed, its entry in the database is marked as "in progress".
  • Save the objects to the database.
    • Another multithreaded stage that retrieves all the objects for a given device and then calls the data API to save them.
    • As soon as all devices from one file are saved successfully (or a failure occurs), the database record for the source file is marked as success / failure.
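The two stages above can be sketched as a producer/consumer pair around a bounded, thread-safe queue. This is a hypothetical sketch, not the actual program: `parse_file` and `save_to_api` are placeholders for the real parser and the real data API, and the queue bound of 100 is an arbitrary example.

```python
import queue
import threading

# Parser threads feed a bounded, thread-safe queue; one saver thread
# drains it and calls the data API.
work = queue.Queue(maxsize=100)   # the bound caps RAM held in parsed objects
saved = []

def parse_file(path):
    return [("device-1", f"entry-from-{path}")]   # placeholder parser

def save_to_api(device_id, entries):
    saved.append((device_id, entries))            # placeholder for the API call

def parser(path):
    for record in parse_file(path):
        work.put(record)       # blocks when the queue is full -> back-pressure

def saver():
    while True:
        record = work.get()
        if record is None:     # sentinel: all files processed
            return
        device_id, entry = record
        save_to_api(device_id, [entry])

parsers = [threading.Thread(target=parser, args=(f"file{i}",)) for i in range(3)]
saver_thread = threading.Thread(target=saver)
saver_thread.start()
for t in parsers:
    t.start()
for t in parsers:
    t.join()
work.put(None)                 # tell the saver to stop
saver_thread.join()
print(len(saved))              # 3 records saved
```

The `maxsize` bound is what ties thread count to memory: parser threads simply stall on `put` instead of accumulating unsaved objects.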

My question is: what is the best way to control when files get parsed, how many threads are used, how much RAM is consumed, and so on?

  • The data API will take the longest - in most cases, threads wait for the API to return.
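Since threads mostly sit waiting on the data API, one practical control is a cap on in-flight API calls, independent of how many parser threads exist. A minimal sketch using a semaphore (the limit of 4, `fake_api_call`, and the peak-tracking bookkeeping are all illustrative assumptions):

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4                                  # assumed API limit
api_slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)
lock = threading.Lock()
active = 0
peak = 0

def fake_api_call(device_id):
    """Stand-in for the blocking data API; tracks peak concurrency."""
    global active, peak
    with api_slots:            # at most MAX_IN_FLIGHT threads get past here
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)       # stand-in for the slow API round-trip
        with lock:
            active -= 1

with ThreadPoolExecutor(max_workers=16) as pool:
    for i in range(32):
        pool.submit(fake_api_call, f"device-{i}")

print("peak concurrent API calls:", peak)
```

Even with 16 worker threads, the semaphore keeps the observed peak at or below the configured limit, so the API limit can be tuned separately from parser parallelism.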


As usual, it depends - in particular on the relative cost of each stage and on what the DB API can sustain. The general shape I would use is a pipeline: split the work into stages connected by bounded queues feeding the Db, so that each stage only knows about its input queue and its output queue.

With 3 stages you can tune each one independently. When a stage's output queue fills up, the stage "blocks", so a slow stage automatically throttles the ones before it.

Stage 1: 1-2 threads (parsing / I/O) reading the incoming files.

Stage 2: 1 thread grouping records per device and queuing the groups for the Db.

Stage 3: 1+ threads calling the Db API.

Bound every queue to cap memory. Measure and adjust the thread counts per stage; stage 2 rarely needs more than one thread.
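A three-stage pipeline of this shape can be sketched as follows. This is a hedged illustration of the bounded-queue idea, not the answerer's code: stage names, queue sizes, and the sentinel mechanism are assumptions.

```python
import queue
import threading

parsed = queue.Queue(maxsize=50)    # stage 1 -> stage 2
to_save = queue.Queue(maxsize=50)   # stage 2 -> stage 3
DONE = object()                     # end-of-stream sentinel
results = []

def stage1_parse(files):
    for f in files:
        parsed.put((f, f"record-from-{f}"))   # blocks if stage 2 lags behind
    parsed.put(DONE)

def stage2_group():
    while (item := parsed.get()) is not DONE:
        to_save.put(item)           # the real stage would batch per device here
    to_save.put(DONE)

def stage3_save():
    while (item := to_save.get()) is not DONE:
        results.append(item)        # stand-in for the Db API call

stages = [
    threading.Thread(target=stage1_parse, args=(["a", "b", "c"],)),
    threading.Thread(target=stage2_group),
    threading.Thread(target=stage3_save),
]
for t in stages:
    t.start()
for t in stages:
    t.join()
print(len(results))  # 3 records reached the final stage
```

Because both queues are bounded, a slow final stage back-pressures all the way to the parser, which is exactly the throttling behavior described above.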


Keep it simple, and measure before you tune. Limits are much easier to pick once you know the numbers.

Profile the costs separately (parsing, grouping, and the api).

Bound how much unsaved data you hold in memory. Files arrive faster than they can be saved, so the backlog has to be capped somewhere.

Also note that in a 32-bit process you only have roughly ~800 MB of usable memory, so the in-memory backlog must stay well below that.

Since the API is the slow part, a simple produce-and-drain loop is enough:

  • parse a limited batch of files (limited by count / memory) and push the records to the API / DB queue
  • once the DB API has drained the queue, mark those files done and goto 1
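The parse-then-drain loop from this answer can be sketched as below. The batch size of 2, the file names, and the `parse`/`save_batch` helpers are hypothetical; the real limit would be chosen against the memory budget (e.g. the ~800 MB available to a 32-bit process).

```python
BATCH_SIZE = 2   # assumed limit on files parsed per iteration

def parse(f):
    return [f"record-from-{f}"]   # placeholder parser

def save_batch(records):
    saved.extend(records)         # stand-in for pushing to the DB API and waiting

saved = []
pending = ["f1", "f2", "f3", "f4", "f5"]
while pending:
    batch, pending = pending[:BATCH_SIZE], pending[BATCH_SIZE:]
    records = [r for f in batch for r in parse(f)]
    save_batch(records)           # "goto 1" once the DB/API queue is drained

print(len(saved))  # all 5 records saved, never more than one batch in memory
```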

Source: https://habr.com/ru/post/1752124/

