
Analysis of the Linux Kernel Preemption Mechanism

This series of blog posts introduces several important concepts and technical principles of the Linux kernel. Some of the material is summarized from online sources, and some comes from my own notes on "Linux Kernel Development" and "Understanding the Linux Kernel". The goal is to give readers who are new to the Linux kernel a general understanding of some of its implementation techniques.
1.1 Kernel Preemption

Kernel preemption is the preemption capability introduced with the 2.6 kernel: when a process is running in kernel space and a higher-priority task becomes runnable, the kernel can, if preemption is currently allowed, suspend the current task and run the higher-priority process.

Before version 2.5.4, the Linux kernel was not preemptible: a high-priority process could not suspend a low-priority process running in the kernel and take over the CPU. Once a process was in kernel mode (for example, a user process executing a system call), it kept running until it finished its kernel work and left the kernel, unless it voluntarily relinquished the CPU. In contrast, a preemptible kernel allows kernel code to be preempted much like user-space code. When a high-priority process becomes runnable, a preemptible kernel will schedule it to run regardless of whether the current process is in user mode or kernel mode, provided preemption is currently allowed.
1.2 User Preemption

When the kernel is about to return to user space, it checks the need_resched flag; if the flag is set, schedule() is called and user preemption occurs. The return to user space is a safe point to reschedule: if it is safe to continue running the current task, it is also safe to pick a different one. The kernel therefore checks need_resched both when returning from an interrupt handler and when returning from a system call. If the flag is set, the kernel selects another (more suitable) process to run.

In short, user preemption occurs:
- when returning to user space from a system call.
- when returning to user space from an interrupt handler.
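
The check described above can be sketched as a small user-space model. This is only an illustration: struct task, model_schedule() and return_to_user() are hypothetical names invented for this sketch, not kernel APIs, and the real return path is architecture-specific entry code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct task {
    const char *name;
    bool need_resched;   /* set when a more suitable task is runnable */
};

static int schedule_calls;  /* how many times the model scheduler ran */

/* Stand-in for schedule(): switch to 'next' and clear the flag. */
static struct task *model_schedule(struct task *curr, struct task *next)
{
    schedule_calls++;
    curr->need_resched = false;
    return next;
}

/*
 * Called on the way back to user space, after a system call or an
 * interrupt handler. Returns the task that will run in user space.
 */
static struct task *return_to_user(struct task *curr, struct task *next)
{
    assert(curr != NULL && next != NULL);
    if (curr->need_resched)
        return model_schedule(curr, next);
    return curr;   /* flag clear: resume the same task */
}
```

With the flag clear, return_to_user() hands back the current task and the model scheduler is never invoked; once need_resched is set, the same call returns the other task.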
1.3 Features of Non-Preemptive Kernels

In kernels that do not support kernel preemption, kernel code runs until it completes. That is, the scheduler has no way to reschedule while a kernel-level task is executing; tasks in the kernel are scheduled cooperatively, not preemptively. Of course, a process running in kernel mode can voluntarily give up the CPU. For example, in a system call service routine, the kernel code gives up the CPU because it has to wait for a resource. This situation is called a planned process switch. Otherwise, kernel code runs until it completes (returns to user space) or explicitly blocks.

On a single-CPU system, this design greatly simplifies the kernel's synchronization and protection mechanisms. This can be analyzed in two steps:

First, set aside the case in which a process voluntarily relinquishes the CPU (that is, assume no process switch occurs inside the kernel). Once a process enters the kernel, it keeps running until it completes its work and exits the kernel, and until then no other process can enter the kernel. In other words, processes execute in the kernel serially. Because multiple processes can never run in the kernel at the same time, kernel code does not have to be designed around the concurrency problems caused by simultaneous execution of multiple processes, and kernel developers do not have to worry about mutually exclusive access to critical resources by concurrent processes. When a process accesses and modifies a kernel data structure, it does not need a lock to keep other processes out of the critical section. Only interrupts need to be considered: if an interrupt handler may access the data structure the process is working on, the process simply disables interrupts before entering the critical section and re-enables them when leaving it.

Second, consider the case in which a process voluntarily gives up the CPU. Because giving up the CPU is voluntary and deliberate, every process switch in the kernel is known in advance; no switch happens without the code being aware of it. As a result, concurrency only has to be considered at the points where a process switch can occur, not throughout the entire kernel.
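
The interrupt-disabling pattern from the first step can be illustrated with a toy user-space model. Everything here is an assumption made for illustration: the model_* functions and variables are hypothetical and only imitate the idea of masking interrupts around a critical section; they are not the kernel's interrupt API.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;  /* model of the CPU interrupt-enable flag */
static bool irq_pending;          /* an interrupt arrived while disabled */
static int  shared_counter;       /* data touched by both task and handler */
static int  handler_runs;

/* Model interrupt handler: also modifies the shared data. */
static void model_irq_handler(void)
{
    shared_counter += 10;
    handler_runs++;
}

/* Device raises an interrupt: run now if enabled, else hold it pending. */
static void model_raise_irq(void)
{
    if (irqs_enabled)
        model_irq_handler();
    else
        irq_pending = true;
}

static void model_irq_disable(void)
{
    irqs_enabled = false;
}

/* Re-enable interrupts and deliver anything that arrived meanwhile. */
static void model_irq_enable(void)
{
    irqs_enabled = true;
    if (irq_pending) {
        irq_pending = false;
        model_irq_handler();
    }
}

/* Process-context read-modify-write that the handler must not interleave with. */
static void task_update(void)
{
    int tmp;

    model_irq_disable();          /* enter critical section */
    tmp = shared_counter;         /* read */
    model_raise_irq();            /* interrupt arrives mid-update: deferred */
    shared_counter = tmp + 1;     /* write back */
    model_irq_enable();           /* leave critical section; handler runs now */
    assert(irqs_enabled);
}
```

In this model, the interrupt that arrives in the middle of task_update() is held back until the update is complete, so neither modification is lost. Without the disable/enable pair, the handler's update would be overwritten by the write-back.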
1.4 Why do you need kernel preemption?

Implementing kernel preemption is important for Linux. First, it is required in order to use Linux in real-time systems. Real-time systems place strict limits on response time: when a real-time process is woken by a hardware interrupt from a real-time device, it must be scheduled to run within a bounded time. A non-preemptible Linux kernel cannot meet this requirement, because the time the system spends inside the kernel cannot be bounded. In fact, when the kernel executes a long system call, the real-time process must wait until the process running in the kernel leaves the kernel before it can be scheduled. The resulting response delay can reach 100 ms on today's hardware.

This is unacceptable for systems that require fast real-time response. A preemptible kernel is not only critical for real-time applications of Linux; it also addresses Linux's weak support for low-latency multimedia applications (video, audio).

Because of the importance of a preemptible kernel, kernel preemption was merged into the kernel with the release of Linux 2.5.4 and, like SMP, became a standard optional kernel configuration.
1.5 When is kernel preemption not allowed?

A preemptible Linux kernel can be preempted at any point, except in the following cases:

(1) The kernel is handling an interrupt. In the Linux kernel, a process cannot preempt an interrupt (an interrupt can be suspended or preempted only by another interrupt, never by a process), and process scheduling is not allowed in interrupt routines. The process scheduling function schedule() checks for this: if it is called from an interrupt, it prints an error message.

(2) The kernel is processing a bottom half (the deferred part of interrupt handling) in interrupt context. Softirqs are executed before the hardware interrupt returns, so they still run in interrupt context.

(3) The kernel code is holding a lock such as a spinlock or a read-write lock (read_lock/write_lock) and is inside the region protected by it. These locks exist so that processes running concurrently on different CPUs of an SMP system execute correctly, and they are meant to be held only briefly. The kernel should not be preempted while holding one; otherwise, other CPUs could be unable to acquire the lock for a long time and would appear to hang.

(4) The kernel is executing the scheduler. The purpose of preemption is to trigger a new scheduling decision, so there is no reason to preempt the scheduler only to run the scheduler again.

(5) The kernel is operating on a CPU's "private" data structures (per-CPU data structures). On SMP, per-CPU data structures are not protected by spinlocks, because they are protected implicitly: each CPU has its own copy, and a process running on one CPU does not use another CPU's per-CPU data. However, if preemption were allowed, a process that is preempted in the middle of such an operation could be rescheduled onto a different CPU. The per-CPU variable it was working on would then be the wrong one, so preemption must be disabled during these operations.
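
Case (5) can be illustrated with a small user-space model. The real kernel provides get_cpu() and put_cpu(), which disable and re-enable preemption around per-CPU accesses; the model_* functions, try_migrate(), and the variables below are hypothetical stand-ins written only for this sketch.

```c
#include <assert.h>

#define NR_CPUS_MODEL 2

static int  current_cpu;            /* CPU the model task is running on */
static int  preempt_count;          /* nonzero: task must not be migrated */
static long hits[NR_CPUS_MODEL];    /* one counter per CPU, no lock */

/*
 * Stand-in for the scheduler moving the task to another CPU.
 * Refused while preemption is disabled. Returns 1 on success.
 */
static int try_migrate(int new_cpu)
{
    if (preempt_count > 0)
        return 0;
    current_cpu = new_cpu;
    return 1;
}

/* Model of get_cpu(): disable preemption, then report the current CPU. */
static int model_get_cpu(void)
{
    preempt_count++;
    return current_cpu;
}

/* Model of put_cpu(): re-enable preemption. */
static void model_put_cpu(void)
{
    assert(preempt_count > 0);   /* must pair with model_get_cpu() */
    preempt_count--;
}
```

A typical use is `cpu = model_get_cpu(); hits[cpu]++; model_put_cpu();`. Between the two calls, try_migrate() fails, so the index stored in cpu still names the CPU the task is actually running on when the counter is updated.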

To ensure that the Linux kernel is not preempted in the situations above, the preemptible kernel uses a variable, preempt_count, known as the kernel preemption lock. This variable is kept per process, alongside the process's PCB structure task_struct (in 2.6 kernels it is a field of the associated thread_info structure). Whenever the kernel enters one of the states above, preempt_count is incremented by 1, indicating that preemption is not allowed. Whenever the kernel leaves one of these states, preempt_count is decremented by 1, and at the same time the kernel checks whether preemption is now possible and reschedules if needed.
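
The counting behavior can be sketched in plain C. The function names below echo the kernel's preempt_disable() and preempt_enable(), but this is a simplified, hypothetical user-space model: struct task_model and its fields are invented for the example, and the "scheduler call" is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-task preemption state. */
struct task_model {
    int  preempt_count;    /* 0 means preemption is allowed */
    bool need_resched;     /* a more important task is waiting */
    int  times_preempted;  /* stand-in for calls into the scheduler */
};

/* Entering a non-preemptible state: bump the counter. */
static void model_preempt_disable(struct task_model *t)
{
    t->preempt_count++;
}

/*
 * Leaving a non-preemptible state: drop the counter and, if it
 * reaches zero with a reschedule pending, preempt right away.
 */
static void model_preempt_enable(struct task_model *t)
{
    assert(t->preempt_count > 0);   /* unbalanced enable is a bug */
    t->preempt_count--;
    if (t->preempt_count == 0 && t->need_resched) {
        t->need_resched = false;
        t->times_preempted++;
    }
}
```

Because the variable is a counter rather than a flag, non-preemptible states nest: if two locks are taken, preemption stays disabled after the first one is released and only becomes possible again when the count returns to zero.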

When returning to kernel space from an interrupt, the kernel checks the values of need_resched and preempt_count. If need_resched is set and preempt_count is 0, a more important task may be waiting to run and preemption is safe, so the scheduler is called. If preempt_count is not 0, the kernel is in a non-preemptible state and cannot reschedule; the interrupt simply returns to the currently executing process as usual. Once the current process has released all the locks it holds, preempt_count drops back to 0. At that point, the code that releases the lock checks whether need_resched is set and, if so, calls the scheduler.
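
The decision made on the interrupt-return path amounts to a two-condition test, shown here as a minimal sketch. The enum and the function irq_return_to_kernel() are hypothetical names for illustration; the actual check lives in architecture-specific entry code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the outcome of the interrupt-return check. */
enum irq_return_action {
    RESUME_INTERRUPTED_CODE,  /* return from the interrupt as usual */
    CALL_SCHEDULER            /* preempt the interrupted kernel code */
};

/*
 * Decide what to do when an interrupt returns to kernel space.
 * Preemption requires both a pending reschedule request and a
 * preempt_count of zero.
 */
static enum irq_return_action
irq_return_to_kernel(bool need_resched, int preempt_count)
{
    assert(preempt_count >= 0);
    if (need_resched && preempt_count == 0)
        return CALL_SCHEDULER;
    return RESUME_INTERRUPTED_CODE;
}
```

For example, irq_return_to_kernel(true, 1) yields RESUME_INTERRUPTED_CODE: a reschedule is pending, but the interrupted code holds a lock, so preemption is postponed until the lock is released.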
1.6 When Kernel Preemption Occurs

The 2.6 kernel introduced preemption capability: as long as rescheduling is safe, the kernel can preempt the running task at any time.

So, when is rescheduling safe? As long as preempt_count is 0, the kernel can be preempted. Locks and interrupts usually mark the non-preemptible regions. Because the kernel is SMP-safe, code that holds no lock is reentrant and can therefore be preempted.

Kernel preemption also occurs explicitly, when a process in the kernel blocks or calls schedule() directly. This form of kernel preemption (in effect, voluntarily giving up the CPU) has always been supported, because no additional logic is needed to make sure the kernel can be preempted safely: code that explicitly calls schedule() is expected to know that it is safe to be preempted at that point.

Kernel preemption can occur:
- when an interrupt handler finishes and is about to return to kernel space.
- when kernel code becomes preemptible again, for example on releasing a lock or re-enabling softirqs (local_bh_enable).
- when a task in the kernel explicitly calls schedule().
- when a task in the kernel blocks (which also results in a call to schedule()).