The kernel itself does not have a stack at all. The same is true of a process: it has no stack either. Threads are the only citizens of the system that are considered units of execution. Because of this, only threads can be scheduled, and only threads have stacks. But there is one fact that kernel-mode code exploits heavily: at every moment in time the system works in the context of the currently active thread. Thanks to this, the kernel can reuse the stack of the currently active thread. Note that a thread executes either kernel code or user code at any given moment, never both at once. Because of this, when the kernel is invoked, it simply reuses the thread's stack and performs a cleanup before returning control to the interrupted activity in the thread. The same mechanism works for interrupt handlers, and it is also used by signal handlers.
In turn, the thread stack is divided into two isolated parts, one of which is called the user stack (because it is used when the thread runs in user mode), and the other is called the kernel stack (because it is used when the thread runs in kernel mode). When a thread crosses the boundary between user and kernel mode, the CPU automatically switches it from one stack to the other. The two stacks are tracked differently by the kernel and the CPU. For the kernel stack, the CPU permanently keeps a pointer to the top of the thread's kernel stack. This is easy because that address is constant for the thread. Each time a thread enters the kernel it finds an empty kernel stack, and each time it returns to user mode it leaves the kernel stack clean. At the same time, the CPU does not keep a pointer to the top of the user stack while the thread runs in kernel mode. Instead, on entry to the kernel, the CPU creates a special “interrupt” stack frame at the top of the kernel stack and stores the value of the user-mode stack pointer in that frame. When the thread exits the kernel, the CPU restores the value of ESP from the previously created “interrupt” stack frame, just before that frame is cleaned up. (On legacy x86, the int/iret instruction pair handles entering and exiting kernel mode.)
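To make the entry/exit sequence above more concrete, here is a small C simulation. It is only a sketch, not kernel code: the frame layout follows what a 32-bit x86 CPU pushes on a user-to-kernel transition (SS, ESP, EFLAGS, CS, EIP), while the names `sim_cpu_t`, `enter_kernel` and `exit_kernel` and the selector/flag values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KSTACK_SIZE 4096u   /* assumed size of the simulated kernel stack */

/* The "interrupt" frame, listed from lowest to highest address. */
typedef struct {
    uint32_t eip, cs, eflags;
    uint32_t user_esp, user_ss;   /* user-mode stack pointer lives here */
} interrupt_frame_t;

/* Hypothetical model of one thread running on one CPU. */
typedef struct {
    uint8_t  kstack[KSTACK_SIZE];
    uint32_t esp;        /* user-mode value, or offset into kstack in kernel mode */
    uint32_t eip;
    int      in_kernel;
} sim_cpu_t;

/* What the CPU does on kernel entry: switch to the (empty) kernel stack
   and record where user mode was interrupted. */
static void enter_kernel(sim_cpu_t *c, uint32_t handler_eip)
{
    assert(!c->in_kernel);
    /* 0x1B, 0x202 and 0x23 are placeholder CS, EFLAGS and SS values. */
    interrupt_frame_t f = { c->eip, 0x1B, 0x202, c->esp, 0x23 };
    uint32_t ksp = (uint32_t)(KSTACK_SIZE - sizeof f);  /* always the same top */
    memcpy(&c->kstack[ksp], &f, sizeof f);
    c->esp = ksp;
    c->eip = handler_eip;
    c->in_kernel = 1;
}

/* What the CPU does on kernel exit (iret): restore ESP and EIP from the frame. */
static void exit_kernel(sim_cpu_t *c)
{
    assert(c->in_kernel);
    /* The kernel must have popped everything it pushed above the frame. */
    assert(c->esp == KSTACK_SIZE - sizeof(interrupt_frame_t));
    interrupt_frame_t f;
    memcpy(&f, &c->kstack[c->esp], sizeof f);
    c->esp = f.user_esp;
    c->eip = f.eip;
    c->in_kernel = 0;
}
```

Calling `enter_kernel` and then `exit_kernel` leaves the user-mode `esp` and `eip` exactly as they were, and the kernel stack is empty again for the next entry.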
On entering kernel mode, immediately after the CPU creates the “interrupt” stack frame, the kernel pushes the contents of the remaining CPU registers onto the kernel stack. Note that it saves only those registers that kernel code can use. For example, the kernel does not save the contents of the SSE registers, simply because it never touches them. Similarly, just before asking the CPU to return control to user mode, the kernel pops the previously saved contents back into the registers.
Note that in systems such as Windows and Linux there is the notion of a system thread (frequently called a kernel thread; I know it is confusing). System threads are special in that they execute only in kernel mode, and because of this they have no user part of the stack. The kernel employs them for auxiliary housekeeping tasks.
A thread switch is performed only in kernel mode. This means that both threads, outgoing and incoming, run in kernel mode, both use their own kernel stacks, and both kernel stacks have “interrupt” frames with pointers to the top of the user stacks. The key point of a thread switch is the switch between the threads' kernel stacks, which is as simple as:
    pushad                              ; save the context of the outgoing thread on top of its kernel stack
    ; here the kernel uses the kernel stack of the outgoing thread
    mov [TCB_of_outgoing_thread], ESP   ; remember where the outgoing context was saved
    mov ESP, [TCB_of_incoming_thread]   ; switch to the kernel stack of the incoming thread
    ; here the kernel uses the kernel stack of the incoming thread
    popad                               ; restore the context of the incoming thread from the top of its kernel stack
Note that there is only one function in the kernel that performs a thread switch. Because of this, each time the kernel switches stacks, it can find the context of the incoming thread at the top of that thread's stack, simply because every time before switching stacks the kernel pushes the context of the outgoing thread onto its stack.
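The same switch sequence can be played out in plain C to see why the incoming context is always found on top of the incoming stack. This is a simulation under my own assumptions, not real kernel code: `tcb_t`, `cpu_regs_t`, `thread_init` and `switch_to` are hypothetical names, and the "registers" are just an array of eight values standing in for what pushad/popad handle.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS        8     /* pushad/popad handle 8 general-purpose registers */
#define KSTACK_WORDS 256   /* assumed kernel stack size, in 32-bit words */

typedef struct {
    uint32_t  kstack[KSTACK_WORDS];
    uint32_t *saved_sp;    /* kernel stack pointer kept in the TCB while not running */
} tcb_t;

typedef struct {
    uint32_t  regs[NREGS];
    uint32_t *sp;          /* points into the running thread's kernel stack */
} cpu_regs_t;

/* Stand-in for pushad: push all registers, stack grows downwards. */
static void push_all(cpu_regs_t *c)
{
    for (int i = 0; i < NREGS; i++) *--c->sp = c->regs[i];
}

/* Stand-in for popad: pop them back in reverse order. */
static void pop_all(cpu_regs_t *c)
{
    for (int i = NREGS - 1; i >= 0; i--) c->regs[i] = *c->sp++;
}

/* A new thread gets a fabricated context so the very first switch to it works. */
static void thread_init(tcb_t *t, uint32_t seed)
{
    t->saved_sp = t->kstack + KSTACK_WORDS;
    for (int i = 0; i < NREGS; i++) *--t->saved_sp = seed + (uint32_t)i;
}

/* The single switch routine: save on the old stack, swap stacks, restore. */
static void switch_to(cpu_regs_t *c, tcb_t *out, tcb_t *in)
{
    push_all(c);              /* pushad                           */
    out->saved_sp = c->sp;    /* mov [TCB_of_outgoing_thread], ESP */
    c->sp = in->saved_sp;     /* mov ESP, [TCB_of_incoming_thread] */
    pop_all(c);               /* popad                            */
}
```

Because every thread that is not running was parked by this same routine (or by `thread_init`, which imitates it), the layout at `saved_sp` is always the one `pop_all` expects.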
Note also that each time after switching stacks and before returning to user mode, the kernel reloads the CPU's notion of the top of the kernel stack with the new value. This ensures that when the new active thread tries to enter the kernel in the future, the CPU will switch it to its own kernel stack.
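On 32-bit x86 the place where the CPU keeps this value is, as far as I know, the ESP0 field of the task state segment (TSS). A minimal sketch of the reload step, with a simulated TSS and hypothetical names (`sim_tss_t`, `load_kernel_stack`) and an assumed stack size:

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 8192u   /* assumed fixed kernel stack size */

/* Only the TSS fields relevant here; the real structure is larger. */
typedef struct {
    uint32_t esp0;   /* stack pointer the CPU loads on entry to ring 0 */
    uint16_t ss0;    /* stack segment the CPU loads on entry to ring 0 */
} sim_tss_t;

typedef struct {
    uint32_t kstack_base;   /* lowest address of the thread's kernel stack */
} tcb_t;

/* The top of the kernel stack is constant for a given thread. */
static uint32_t kstack_top(const tcb_t *t)
{
    return t->kstack_base + KSTACK_SIZE;
}

/* Called after the stack switch, before returning to user mode. */
static void load_kernel_stack(sim_tss_t *tss, const tcb_t *incoming)
{
    tss->esp0 = kstack_top(incoming);
}

/* What the CPU would use as ESP on the next user-to-kernel transition. */
static uint32_t stack_on_kernel_entry(const sim_tss_t *tss)
{
    return tss->esp0;
}
```

If this step were skipped, the next kernel entry of the new thread would land on the previous thread's kernel stack.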
Note also that not all registers are saved on the stack during a thread switch; some registers, such as the FPU/MMX/SSE ones, are saved in a specially dedicated area in the TCB of the outgoing thread. The kernel uses a different strategy here for two reasons. First, not every thread in the system uses them, so pushing their contents onto the stack and popping them back for every thread is inefficient. Second, there are special instructions for “fast” saving and loading of their contents, and these instructions do not use the stack.
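A rough sketch of that strategy in C, with the hardware state simulated so it runs anywhere. The 512-byte, 16-byte-aligned area matches what the x86 FXSAVE/FXRSTOR instructions work with; everything else (`tcb_t`, `sim_fpu_t`, the `used_fpu` flag, `fpu_switch`) is a hypothetical illustration, and `memcpy` stands in for the real instructions.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

#define FXSAVE_AREA_SIZE 512   /* size of the FXSAVE image on x86 */

typedef struct {
    alignas(16) uint8_t fpu_area[FXSAVE_AREA_SIZE]; /* dedicated area in the TCB */
    int used_fpu;                                   /* has this thread touched FPU/SSE? */
} tcb_t;

/* Stand-in for the hardware FPU/MMX/SSE register file. */
typedef struct {
    alignas(16) uint8_t fpu_state[FXSAVE_AREA_SIZE];
} sim_fpu_t;

/* Save/restore the extended state only for threads that use it.
   Returns the number of 512-byte copies performed. */
static int fpu_switch(sim_fpu_t *hw, tcb_t *out, tcb_t *in)
{
    int copies = 0;
    if (out->used_fpu) {   /* would be: fxsave [out->fpu_area] */
        memcpy(out->fpu_area, hw->fpu_state, FXSAVE_AREA_SIZE);
        copies++;
    }
    if (in->used_fpu) {    /* would be: fxrstor [in->fpu_area] */
        memcpy(hw->fpu_state, in->fpu_area, FXSAVE_AREA_SIZE);
        copies++;
    }
    return copies;
}
```

A switch between two threads that never touched these registers costs nothing here, which is the whole point of keeping this state off the stack.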
Note also that the kernel part of the thread stack in fact has a fixed size and is allocated as part of the TCB. (This is true for Linux, and I believe for Windows too.)
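One way such a layout can look is sketched below, loosely modelled on how older Linux kernels placed a small per-thread structure at the bottom of the kernel stack. The type and function names (`thread_ctl_t`, `thread_block_t`, `ctl_from_sp`) and the 8 KiB size are my assumptions for the example, not taken from any particular kernel version.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u   /* assumed fixed size; must be a power of two */

/* Hypothetical per-thread control data. */
typedef struct {
    int tid;
} thread_ctl_t;

/* Control data and kernel stack share one fixed-size, aligned block. */
typedef union {
    thread_ctl_t  ctl;                  /* sits at the lowest addresses */
    unsigned char stack[THREAD_SIZE];   /* stack grows down from the end */
} thread_block_t;

/* Allocate the block aligned to its own size (C11 aligned_alloc). */
static thread_block_t *thread_block_alloc(int tid)
{
    thread_block_t *b = aligned_alloc(THREAD_SIZE, sizeof *b);
    if (b) b->ctl.tid = tid;
    return b;
}

/* Because the block is aligned to THREAD_SIZE, masking any address inside
   the stack yields the start of the block, i.e. the control data. */
static thread_ctl_t *ctl_from_sp(const void *sp)
{
    return (thread_ctl_t *)((uintptr_t)sp & ~(uintptr_t)(THREAD_SIZE - 1));
}
```

The fixed size and alignment are what make the masking trick work: the kernel can get from the current stack pointer to the thread's control data without any lookup.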