Help me reason about F# threads

Toying around with some F# (via MonoDevelop), I wrote a routine that lists the files in a directory using a single thread:

let rec loop (path:string) =
    Array.append
        (path |> Directory.GetFiles)
        (path |> Directory.GetDirectories |> Array.map loop |> Array.concat)

And then the asynchronous version:

let rec loopPar (path:string) =
    Array.append
        (path |> Directory.GetFiles)
        (let paths = path |> Directory.GetDirectories
         if paths <> [||] then
             [| for p in paths -> async { return (loopPar p) } |]
             |> Async.Parallel
             |> Async.RunSynchronously
             |> Array.concat
         else
             [||])

On small directories, the asynchronous version works fine. On large directories (for example, many thousands of directories and files), the asynchronous version seems to hang. What am I missing?

I know that creating thousands of threads will never be the most efficient solution (I only have 8 processors), but I am baffled that for large directories the asynchronous function simply does not respond, even after half an hour. It does not visibly fail, though, which puzzles me. Is there a thread pool that gets exhausted?

How do these threads actually work?

Edit:

According to this document:

Mono >= 2.8.x has a new threadpool that is much, much harder to deadlock. If you get a threadpool deadlock, chances are that your program is trying to be deadlocked.

:D

2 answers

Yes, most likely you are overwhelming the Mono thread pool, which is grinding your system to a halt.

If you remember one thing from this, it is that threads are expensive. Each thread needs its own stack (megabytes in size) and a slice of CPU time (which requires context switching). Because of this, it is rarely a good idea to spin up your own thread for short-lived tasks. This is why .NET has a ThreadPool.

The ThreadPool is an existing collection of threads for short tasks, and it is what F# uses for its Async workflows. Whenever you run an F# Async operation, it simply delegates the action to the thread pool.
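
A quick way to see this for yourself is to ask each async job which thread it is running on. The sketch below is only illustrative: `describeThread` and `reports` are hypothetical names that do not appear in the question, and the thread ids and pool limits you see depend on the runtime and machine.

```fsharp
open System.Threading

// Hypothetical helper: reports the managed thread id and whether the
// current thread belongs to the runtime's thread pool.
let describeThread (job: int) =
    async {
        let t = Thread.CurrentThread
        return job, t.ManagedThreadId, t.IsThreadPoolThread
    }

// Start 20 small jobs at once and collect what each one observed.
let reports =
    [| for i in 1 .. 20 -> describeThread i |]
    |> Async.Parallel
    |> Async.RunSynchronously

for (job, threadId, onPool) in reports do
    printfn "job %2d ran on thread %d (pool thread: %b)" job threadId onPool

// The pool's configured limits (worker threads, I/O completion threads).
let maxWorkers, maxIo = ThreadPool.GetMaxThreads()
printfn "pool limits: %d worker threads, %d I/O threads" maxWorkers maxIo
```

If the same few thread ids show up across many jobs, that is the pool reusing a small set of threads rather than creating one per job.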

The problem is: what happens when you spawn thousands of asynchronous actions in F# all at once? A naive implementation would simply spawn as many threads as needed. However, if you need 1000 threads, that means you need 1000 x 4 MB of stack space. Even if you have enough memory for all the stacks, your processor will constantly be switching between the different threads (and paging their stacks in and out of memory).

IIRC, the Windows .NET implementation was smart enough not to spawn a ton of threads, and instead queued up the work until some spare threads were available to execute it. In other words, it keeps adding threads until it reaches a fixed number and then just uses those. However, I do not know how Mono's thread pool is implemented.
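
If the stall does come from pool threads being tied up, one thing worth trying is to stop blocking inside the jobs. In `loopPar`, every job calls `Async.RunSynchronously`, so a pool thread sits waiting for its children, which in turn need pool threads of their own. The sketch below rests on that assumption rather than on a confirmed diagnosis of Mono's pool; `loopAsync` and `listAll` are hypothetical names.

```fsharp
open System.IO

// Hypothetical variant of loopPar: it returns an Async instead of
// blocking, so no job waits synchronously for its children.
let rec loopAsync (path: string) : Async<string[]> =
    async {
        let files = Directory.GetFiles path
        // Compose the child computations; let! yields instead of blocking.
        let! nested =
            Directory.GetDirectories path
            |> Array.map loopAsync
            |> Async.Parallel
        return Array.append files (Array.concat nested)
    }

// Block exactly once, at the outermost call.
let listAll (path: string) : string[] =
    loopAsync path |> Async.RunSynchronously
```

With this shape the pool is free to queue the work and drain it with however many threads it has, because a finished job releases its thread instead of holding it while it waits.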

tl;dr: This is working as expected.

Chris is probably right. Another angle to consider is that file systems are not static things: are those directories with thousands of files changing while you are trying to process the listing? If so, that could lead to a race condition somewhere.
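
To rule that out, the traversal can be made tolerant of directories that vanish or become unreadable between being listed and being visited. This is a minimal sketch, assuming it is acceptable to skip such directories silently; `tryList` and `loopSafe` are hypothetical names.

```fsharp
open System
open System.IO

// Hypothetical helper: runs a directory query and returns an empty array
// if the directory disappeared or became unreadable in the meantime.
let tryList (query: string -> string[]) (path: string) : string[] =
    try
        query path
    with
    | :? DirectoryNotFoundException
    | :? UnauthorizedAccessException -> [||]

// Same shape as the single-threaded loop, but every listing is guarded.
let rec loopSafe (path: string) : string[] =
    Array.append
        (tryList (fun p -> Directory.GetFiles p) path)
        (tryList (fun p -> Directory.GetDirectories p) path
         |> Array.map loopSafe
         |> Array.concat)
```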

Source: https://habr.com/ru/post/1333627/

