To understand the inner workings of tasks and threads, let's look at this toy kernel code:
struct regs {
    int eax, ebx, ecx, edx, es, ds, gs, fs, cs, ip, flags;
    struct tss *task_sel;
};

struct thread {
    struct regs *regs;          /* saved CPU state for this thread */
    int parent_id;
    struct thread *next;
};

struct task {
    struct regs *regs;          /* saved CPU state for this task */
    int *phys_mem_begin;
    int *phys_mem_end;
    int *filehandles;
    int priority;
    int num_threads;
    int quantum;                /* time slice, derived from priority */
    int duration;
    int start_time, end_time;
    int parent_id;
    struct thread *task_thread; /* head of this task's thread list */
    /* ... */
    struct task *next;
};
Imagine that the kernel allocates memory for each of these task structures, which form a linked list. Pay close attention to the quantum field: it is the slice of processor time granted to a task, based on the priority field. There is always a task with id 0 that never exits; it may just idle, perhaps issuing nops (No OPerationS). The scheduler loops over the task list ad nauseam, indefinitely (that is, until the power is turned off). If the quantum field gives a task 20 ms, the kernel sets start_time and end_time = start_time + 20 ms. When end_time is reached, the kernel saves the processor registers into the structure pointed to by regs, moves to the next task in the chain, loads the processor registers from that task's regs pointer, and jumps to its saved instruction pointer. It resets the quantum, and when a task's duration reaches zero it simply moves on to the next one... Context switching is fast, and this is what creates the illusion of many tasks running simultaneously on a single processor.
Now consider the thread structure, a linked list of threads hanging off the task structure. The kernel allocates threads for a given task, sets up the processor state for each thread, and jumps into it. So now the kernel must schedule the threads as well as the tasks themselves: context switching happens between tasks, and again between the threads within each task...
Let's move on to multi-processor machines. If the kernel is built for scalability, what the scheduler does is load one task onto one processor and another task onto the other (dual core), and both jump to where the saved instruction pointer points... now the kernel really is running both tasks simultaneously, on both processors. Scale up to a 4-way machine and it's the same thing, extra tasks loaded onto each processor; scale again to n-way... you get the drift.
As you can see, threads are not nearly as scalable, because the kernel has a frankly mammoth job tracking which processor is running what: which task, and which of that task's threads. That is fundamentally why I think threads are not exactly scalable... threads consume a lot of resources...
If you really want to see what happens, take a look at the source code for Linux, specifically the scheduler. Don't bother with the 2.6.x kernel versions; look at the prehistoric version 0.99 instead. Its scheduler is smaller and much easier to read and understand. A little dated, yes, but worth a look: it will help you see why, and I hope illustrate my answer as to why, threads are not so scalable, and it shows how a process-based time-sharing scheduler like the toy one above works. I have tried not to delve into the technical aspects of modern processors, which can do more than what I described...
Hope this helps.