Once the kernel has selected a new process, the technical details associated with multitasking must be dealt with; these details are known collectively as context switching. The auxiliary function context_switch is the dispatcher for the required architecture-specific methods.
kernel/sched.c
static inline void
context_switch(struct rq *rq, struct task_struct *prev,
               struct task_struct *next)
{
        struct mm_struct *mm, *oldmm;

        prepare_task_switch(rq, prev, next);
        oldmm = prev->active_mm;
Immediately before a task switch, the prepare_arch_switch hook that must be defined by every architecture is called from prepare_task_switch. This enables the kernel to execute architecture-specific code to prepare for the switch. Most supported architectures (with the exception of Sparc64 and Sparc) do not use this option because it is not needed.
The context switch proper is performed by invoking two processor-specific functions:
1. switch_mm changes the memory context described in task_struct->mm. Depending on the processor, this is done by loading the page tables, flushing the translation lookaside buffers (partially or fully), and supplying the MMU with new information. Because these actions go deep into CPU details, I do not intend to discuss their implementation here.
2. switch_to switches the processor register contents and the kernel stack (the virtual user address space is changed in the first step, and as it includes the user mode stack, it is not necessary to change the latter explicitly). This task also varies greatly from architecture to architecture and is usually coded entirely in assembly language. Again, I ignore implementation details.
Because the register contents of the userspace process are saved on the kernel stack when kernel mode is entered (see Chapter 14 for details), this need not be done explicitly during the context switch. And because each process first begins to execute in kernel mode (at that point during scheduling at which control is passed to the new process), the register contents are automatically restored using the values on the kernel stack when a return is made to userspace.
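The following toy model illustrates why the userspace registers need no special treatment during the switch. It is only a sketch under simplifying assumptions: the structure and function names (user_regs_model, kstack_model, cpu_regs_model, enter_kernel_model, switch_stack_model, return_to_user_model) are hypothetical, and a single slot stands in for the register frame that the real kernel pushes on the kernel stack.

```c
#include <assert.h>

/* Hypothetical, heavily reduced stand-in for the userspace register set. */
struct user_regs_model {
    long ip;
    long sp;
};

/* One kernel stack per task; a single slot models the saved register frame. */
struct kstack_model {
    struct user_regs_model saved;
    int has_saved;
};

struct cpu_regs_model {
    struct user_regs_model regs;   /* live userspace-visible registers */
    struct kstack_model *kstack;   /* kernel stack currently in use */
};

/* Kernel entry: the userspace registers land on the current kernel stack. */
void enter_kernel_model(struct cpu_regs_model *cpu)
{
    cpu->kstack->saved = cpu->regs;
    cpu->kstack->has_saved = 1;
}

/* The switch itself only exchanges the kernel stack in this model. */
void switch_stack_model(struct cpu_regs_model *cpu, struct kstack_model *next)
{
    cpu->kstack = next;
}

/* Return to userspace: restore from whichever kernel stack is now current. */
void return_to_user_model(struct cpu_regs_model *cpu)
{
    assert(cpu->kstack->has_saved);
    cpu->regs = cpu->kstack->saved;
    cpu->kstack->has_saved = 0;
}
```

Because the saved frame travels with the kernel stack, exchanging the stack is enough for the right userspace registers to reappear when the new task returns to user mode.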
Remember, however, that kernel threads do not have their own userspace memory context and execute on top of the address space of a random task; their task_struct->mm is null. The address space "borrowed" from the current task is noted in active_mm instead:
        next->active_mm = oldmm;
        atomic_inc(&oldmm->mm_count);
        enter_lazy_tlb(oldmm, next);
enter_lazy_tlb notifies the underlying architecture that exchanging the userspace portion of the virtual address space is not required. This speeds up the context switch and is known as the lazy TLB technique.
If the previous task was a kernel thread (i.e., prev->mm is null), its active_mm pointer must be reset to null to disconnect it from the borrowed address space.
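The borrowing and releasing of address spaces can be modeled in a few lines of ordinary C. This is a userspace sketch, not the kernel code: the types mm_model, task_model, and cpu_model and the function switch_memory_context are hypothetical, a plain int replaces the atomic reference counter, and the sketch assumes that the caller later drops the reference on the address space that is handed back.

```c
#include <assert.h>
#include <stddef.h>

struct mm_model {
    int mm_count;                 /* reference count (plain int in this model) */
};

struct task_model {
    struct mm_model *mm;          /* NULL for kernel threads */
    struct mm_model *active_mm;   /* address space the task actually runs on */
};

struct cpu_model {
    struct mm_model *loaded_mm;   /* address space known to the modeled MMU */
    int mm_switches;              /* number of real address space changes */
};

/*
 * Models the memory-context part of a context switch. Returns the address
 * space that prev had borrowed (the caller is assumed to drop that
 * reference later), or NULL if prev was a regular process.
 */
struct mm_model *switch_memory_context(struct cpu_model *cpu,
                                       struct task_model *prev,
                                       struct task_model *next)
{
    struct mm_model *mm = next->mm;
    struct mm_model *oldmm = prev->active_mm;

    if (mm == NULL) {
        /* Kernel thread: borrow the address space and leave the MMU alone. */
        assert(oldmm != NULL);
        next->active_mm = oldmm;
        oldmm->mm_count++;
    } else if (cpu->loaded_mm != mm) {
        /* Regular process with a different address space: switch for real. */
        cpu->loaded_mm = mm;
        cpu->mm_switches++;
    }

    if (prev->mm == NULL) {
        /* prev was a kernel thread: disconnect it from the borrowed space. */
        prev->active_mm = NULL;
        return oldmm;
    }
    return NULL;
}
```

Switching from a process to a kernel thread and back to the same process causes no change of the loaded address space in this model, which is the saving the lazy TLB technique aims for.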
Finally, the task switch is finished with switch_to, which switches the register state and the stack; the new process will be running after the call:
        /* Here we just switch the register state and the stack. */
        switch_to(prev, next, prev);

        barrier();
        /*
         * this_rq must be evaluated again because prev may have moved
         * CPUs since it called schedule(), thus the 'rq' on its stack
         * frame will be invalid.
         */
        finish_task_switch(this_rq(), prev);
}
The code following switch_to is executed only when the current process is next selected to run. finish_task_switch performs some cleanups and allows locks to be released correctly; we will not discuss it in detail. It also gives individual architectures another opportunity to hook into the context-switching process, but this is required on only a few machines. The barrier statement is a compiler directive that ensures that the order in which the switch_to and finish_task_switch statements are executed is not changed by any unfortunate optimizations (see Chapter 5 for more details).
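To give an impression of what such a compiler directive looks like, here is a small userspace sketch. The macro body is the usual GCC-style empty inline assembly statement with a memory clobber, so the example assumes GCC or a compatible compiler; the variables payload and ready and the functions publish and consume are hypothetical. Note that a barrier of this kind restrains only the compiler and not the processor.

```c
#include <assert.h>

/* Compiler-only barrier: forbids moving memory accesses across this point. */
#define barrier() __asm__ __volatile__("" : : : "memory")

int payload;
int ready;

void publish(int value)
{
    payload = value;
    barrier();   /* the store to ready must not be emitted before this line */
    ready = 1;
}

int consume(void)
{
    assert(ready);   /* only valid once publish has completed */
    return payload;
}
```

Without the barrier, an optimizing compiler would be free to emit the two stores in publish in either order, because they are independent from its point of view.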
Intricacies of switch_to
The interesting thing about finish_task_switch is that the cleanups are performed for the task that was active immediately before the now-running task was selected for execution. Notice that this is not the task that initiated the context switch, but some other, arbitrary task in the system! The kernel must find a way to communicate this task to the context_switch routine, and this is achieved with the switch_to macro. It must be implemented by every architecture and has a very unusual calling convention: two variables are handed over, but in three parameters! This is because not just two but three processes are involved in a context switch. The situation is illustrated in Figure 2-16.
Figure 2-16: Behavior of the prev and next variables during context switches (kernel-mode stack: next = B, prev = A before switch_to; prev = C after switch_to returns).
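The role of the third process can be reproduced with a small model that does not switch real stacks. It is merely a sketch: the structure sw_task, the variable running, and the function switch_to_model are hypothetical, and the two frame fields stand in for the local variables prev and next as they are frozen on a task's kernel stack at the moment it is switched out.

```c
#include <assert.h>
#include <stddef.h>

struct sw_task {
    const char *name;
    struct sw_task *frame_prev;   /* prev as frozen on this task's stack */
    struct sw_task *frame_next;   /* next as frozen on this task's stack */
};

struct sw_task *running;          /* task on the single modeled CPU */

/*
 * Models switch_to(prev, next, last). The return value plays the part of
 * the third parameter: the task that ran immediately before next was
 * given the CPU.
 */
struct sw_task *switch_to_model(struct sw_task *prev, struct sw_task *next)
{
    assert(running == prev);
    prev->frame_prev = prev;      /* prev's own view is frozen here */
    prev->frame_next = next;
    running = next;
    return prev;                  /* becomes prev in the resumed task */
}
```

If A switches to B, B to C, and C back to A, then the frame frozen on A's stack still contains prev = A and next = B, whereas the value returned by the last call is C. That returned value is what A needs for the cleanup, and it cannot be found anywhere on A's own stack.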