Typically, you use user-level threads with an event loop, so that other user-level threads can continue to execute while one of them is waiting for data: the thread scheduler checks the registered file descriptors for readiness whenever a thread yields, and it usually gives precedence to the thread(s) whose input is ready. Meanwhile, the lack of preemption has a big advantage: you often don't have to worry about concurrent access to data structures (unless the programmer is foolish enough to yield in the middle of what should be an atomic operation with respect to other threads). This means less need (often none at all) for synchronization and locking, which gives user-level threads another advantage over kernel threads: much less overhead. And when synchronization is required, it is often cheaper than with kernel threads.
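To make this concrete, here is a minimal sketch in Python, using generators as stand-ins for user-level threads. The names `Scheduler`, `spawn`, and `run_demo` are made up for illustration and are not from any particular threading library; a real implementation would also handle write readiness, timeouts, and errors. A task yields `None` to give up the CPU, or yields a socket to park itself until that socket is readable.

```python
import selectors
import socket
from collections import deque


class Scheduler:
    """Toy cooperative scheduler (hypothetical, for illustration only)."""

    def __init__(self):
        self.ready = deque()  # tasks that can run right now
        self.selector = selectors.DefaultSelector()
        self.waiting = 0      # tasks parked on a file descriptor

    def spawn(self, task):
        self.ready.append(task)

    def _poll(self, timeout):
        # Tasks whose input is ready go to the FRONT of the run queue,
        # so they take precedence over tasks that merely yielded.
        for key, _ in self.selector.select(timeout):
            self.selector.unregister(key.fileobj)
            self.waiting -= 1
            self.ready.appendleft(key.data)

    def run(self):
        while self.ready or self.waiting:
            if self.waiting:
                # Block in select() only when nothing else is runnable.
                self._poll(0 if self.ready else None)
            if not self.ready:
                continue
            task = self.ready.popleft()
            try:
                request = next(task)  # run the task until its next yield
            except StopIteration:
                continue              # task finished
            if request is None:
                self.ready.append(task)  # plain yield: back of the queue
            else:
                # The task yielded a socket: park it until readable.
                self.selector.register(request, selectors.EVENT_READ, task)
                self.waiting += 1


def run_demo():
    sched = Scheduler()
    reader_sock, writer_sock = socket.socketpair()
    received = []
    counter = {"value": 0}

    def reader():
        yield reader_sock  # wait for data without blocking other tasks
        received.append(reader_sock.recv(16))
        reader_sock.close()

    def writer():
        yield  # let the other tasks run first
        writer_sock.send(b"ping")
        writer_sock.close()

    def worker(iterations):
        for _ in range(iterations):
            # No yield between the read and the write, so this update is
            # atomic with respect to the other tasks: no lock is needed.
            current = counter["value"]
            counter["value"] = current + 1
            yield

    for task in (reader(), writer(), worker(1000), worker(1000)):
        sched.spawn(task)
    sched.run()
    return counter["value"], received


if __name__ == "__main__":
    print(run_demo())
```

The two workers share `counter` without any lock and no increment is lost, because a task can only be switched out at a `yield`. Moving the `yield` between the read and the write of `counter` would reintroduce exactly the race described above.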