
Linux multi-threaded model

A process is one execution of a program on a computer: when you run a program, you start a process. The program itself is static, while the process is dynamic. Processes can be divided into system processes and user processes. System processes carry out the various functions of the operating system and are, in effect, the operating system itself in its running state; user processes are all the processes that you start. The process is the unit to which the operating system allocates resources. Under Windows, the process is further divided into threads, that is, several smaller units that can run independently within one process.

I have seen many articles about Linux processes and threads, and I think this one can fairly be called the most classic.

I. Basic knowledge: threads and processes

According to the textbook definition, the process is the smallest unit of resource management and the thread is the smallest unit of program execution. In operating system design, threads evolved from processes mainly to support SMP better and to reduce (process/thread) context-switching overhead.

However the two are divided, a process needs at least one thread to execute its instructions. The process manages resources (such as CPU, memory, and files), while its threads are dispatched to a particular CPU for execution. A process can of course have several threads. On an SMP machine, such a process can use several CPUs to run its threads at the same time, achieving maximum parallelism and improving efficiency. Even on a single-CPU machine, designing a program with the multi-threaded model, much like using the multi-process model instead of the single-process model, makes the design cleaner, the functionality more complete, and the program more efficient. For example, several threads can respond to several inputs. The same functionality could be implemented with the multi-process model, but a thread switch costs much less than a process switch, and semantically, responding to several inputs at the same time is a function that shares every resource except the CPU.

These two purposes of the thread model gave rise to two thread models: kernel threads and user threads.
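The difference between sharing resources within one process and copying them between processes can be seen in a small experiment. The following is a minimal sketch, assuming a Linux system with POSIX threads; the names `worker`, `count_with_threads`, and `count_with_fork` are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define INCREMENTS 100000

static int counter;                                    /* lives in the process's data memory */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: add INCREMENTS to the shared counter, one step at a time. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Two threads of one process update the same variable. */
int count_with_threads(void)
{
    pthread_t t[2];

    counter = 0;
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return counter;              /* both threads wrote to the same memory */
}

/* A forked child updates only its own copy of the variable. */
int count_with_fork(void)
{
    counter = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed */
    if (pid == 0) {
        counter += INCREMENTS;   /* changes the child's private copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return counter;              /* the parent's copy is untouched */
}
```

Under these assumptions, `count_with_threads()` returns 200000 because both threads write to the same memory, while `count_with_fork()` returns 0 because the child process only changed its own copy. On older toolchains the program may need to be built with `-pthread`.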
The two models are distinguished mainly by whether the thread scheduler sits inside or outside the kernel. The former makes it easier to use multiprocessor resources concurrently, while the latter is more concerned with context-switching overhead. Current commercial systems usually combine the two: they provide kernel threads to meet the needs of SMP systems and also support a second thread mechanism in user mode through a thread library. In that case, one kernel thread also acts as the scheduler for several user-mode threads. As with many technologies, a "hybrid" design usually brings higher efficiency but is also harder to implement. For the sake of a "simple" design, Linux has never planned to implement the hybrid model, although its implementation uses a different kind of "mixing".

In a concrete implementation, threads can be implemented inside the operating system kernel or outside it. The latter obviously requires at least that processes be implemented in the kernel, while the former generally requires that processes be supported in the kernel as well. The kernel-level thread model obviously needs the former kind of support, whereas the user-level thread model is not necessarily built on the latter. As mentioned above, this difference arises because the two classifications use different criteria.

When the kernel supports both processes and threads, the "many-to-many" thread-process model can be implemented: a thread of a process is scheduled by the kernel, and at the same time it can serve as the scheduler of a pool of user-level threads, choosing a suitable user-level thread to run in its own space. This is the "hybrid" thread model mentioned above, which meets the needs of multiprocessor systems while minimizing scheduling overhead.
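To make the idea of a scheduler running in user mode more concrete, here is a minimal sketch of two user-level threads that take turns inside a single kernel-scheduled thread. It assumes glibc's `ucontext` functions (`getcontext`, `makecontext`, `swapcontext`) on Linux; it is not how any particular thread library is implemented, and the names `task`, `run_round_robin`, and `trace` are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define NTASKS 2
#define STEPS 2
#define STACK_SIZE (64 * 1024)

static ucontext_t sched_ctx;                  /* context of the user-mode scheduler */
static ucontext_t task_ctx[NTASKS];           /* one saved context per user-level thread */
static char task_stack[NTASKS][STACK_SIZE];   /* one private stack per user-level thread */
static char trace[NTASKS * STEPS + 1];        /* records the order in which tasks ran */
static volatile int finished[NTASKS];

/* A user-level thread: record a letter, then yield back to the scheduler. */
static void task(int id)
{
    for (int step = 0; step < STEPS; step++) {
        size_t n = strlen(trace);
        trace[n] = (char)('A' + id);
        trace[n + 1] = '\0';
        swapcontext(&task_ctx[id], &sched_ctx);   /* cooperative yield */
    }
    finished[id] = 1;   /* returning resumes uc_link, i.e. the scheduler */
}

/* Round-robin scheduler that runs entirely in user mode. */
const char *run_round_robin(void)
{
    trace[0] = '\0';
    for (int i = 0; i < NTASKS; i++) {
        int rc = getcontext(&task_ctx[i]);
        assert(rc == 0);
        task_ctx[i].uc_stack.ss_sp = task_stack[i];
        task_ctx[i].uc_stack.ss_size = STACK_SIZE;
        task_ctx[i].uc_link = &sched_ctx;
        finished[i] = 0;
        makecontext(&task_ctx[i], (void (*)(void))task, 1, i);
    }

    int running = NTASKS;
    while (running > 0) {
        running = 0;
        for (int i = 0; i < NTASKS; i++) {
            if (finished[i])
                continue;
            swapcontext(&sched_ctx, &task_ctx[i]);   /* switch stacks, no kernel scheduling */
            if (!finished[i])
                running++;
        }
    }
    return trace;
}
```

With two tasks and two steps each, `run_round_robin()` returns the string "ABAB": the tasks alternate because each one yields after every step. The kernel sees only one thread throughout, which is why such a design is cheap to switch but cannot by itself spread work across several CPUs.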
Most commercial operating systems (such as Digital Unix, Solaris, and Irix) use this hybrid thread model, which fully implements the POSIX 1003.1c standard.

Threads implemented outside the kernel fall into two models: "one-to-one" and "many-to-one". The former maps each thread to one kernel process (perhaps a lightweight process), so that thread scheduling is equivalent to process scheduling and is left to the kernel. The latter implements multithreading entirely outside the kernel and schedules threads in user mode; it is the pure user-level thread model mentioned above. Such a scheduler outside the kernel only has to switch the running stack of the threads, so its scheduling overhead is very small. However, because kernel signals (whether synchronous or asynchronous) are delivered to processes and cannot be directed to individual threads, this implementation cannot be used on multiprocessor systems. Since that need keeps growing, pure user-level thread implementations have almost disappeared in practice, except in algorithm research.

The Linux kernel provides support only for lightweight processes, which limits the implementation of more efficient thread models; however, Linux concentrates on optimizing process scheduling overhead, which compensates for this shortcoming to some extent. LinuxThreads, currently the most popular thread mechanism, uses the "one-to-one" thread-process model: scheduling is left to the kernel, while a thread management mechanism, including signal handling, is implemented at user level. How Linux and LinuxThreads operate is the focus of this article.

II. Lightweight process implementation in the Linux 2.4 kernel

The original definition of a process has three parts: the program, the resources, and the execution. The program usually means the code.
The resources usually include, at the operating system level, memory resources, I/O resources, signal handling, and so on. The execution of the program is usually understood as the execution context, including the use of the CPU, and later developed into the thread. Before the thread concept appeared, operating system designers, seeking to reduce process-switching overhead, gradually revised the concept of the process: they allowed the resources held by a process to be separated from the process itself and let several processes share some resources, such as files, signals, data memory, and even code. This produced the concept of the lightweight process.

The Linux kernel has implemented lightweight processes since version 2.0.x. An application uses the single clone() system call interface and chooses, through its parameters, whether to create a lightweight process or an ordinary process. In the kernel, clone() calls do_fork() after passing and interpreting its parameters. This kernel function is also the final implementation of the fork() and vfork() system calls:

<linux-2.4.20/kernel/fork.c>
int do_fork(unsigned long clone_flags, unsigned long stack_start,
            struct pt_regs *regs, unsigned long stack_size)

Here clone_flags is the bitwise OR of the following macros:

<linux-2.4.20/include/linux/sched.h>
#define CSIGNAL       0x000000ff /* signal mask to be sent at exit */
#define CLONE_VM      0x00000100 /* set if VM shared between processes */
#define CLONE_FS      0x00000200 /* set if fs info shared between processes */
#define CLONE_FILES   0x00000400 /* set if open files shared between processes */
#define CLONE_SIGHAND 0x00000800 /* set if signal handlers and blocked signals shared */
#define CLONE_PID     0x00001000 /* set if pid shared */
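The effect of one of these flags can be observed from user space. The sketch below assumes a Linux system with glibc and uses the glibc `clone()` wrapper function rather than the kernel's do_fork() directly. It creates a child either with or without CLONE_VM and reports what the parent sees afterwards. The names `child_fn`, `value_after_clone`, and `shared_value` are hypothetical.

```c
#define _GNU_SOURCE             /* needed for clone() and the CLONE_* flags in glibc */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (256 * 1024)

/* volatile: may be changed by another task that shares our memory */
static volatile int shared_value;

/* Entry point of the new task: write to a global variable and exit. */
static int child_fn(void *arg)
{
    (void)arg;
    shared_value = 42;
    return 0;
}

/*
 * Create a child with the given clone flags, wait for it, and return the
 * value of shared_value as seen by the parent. Returns -1 on failure.
 */
int value_after_clone(int extra_flags)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_value = 0;

    /* The stack grows downward on common architectures, so pass its top.
     * SIGCHLD occupies the CSIGNAL bits: it is the signal sent to the
     * parent when the child exits, which lets waitpid() work as usual. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }

    int seen = shared_value;
    free(stack);
    return seen;
}
```

Under these assumptions, `value_after_clone(CLONE_VM)` returns 42, because the child is a lightweight process that shares the parent's memory, while `value_after_clone(0)` returns 0, because the child behaves like an ordinary forked process and writes to its own copy.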