Relationship and difference between CPU usage and machine load under Linux

  
 

When we use the top command to view the system's resource usage, we see the load average, as shown in the following figure. It reports the average workload of the system over the last 1, 5, and 15 minutes. So what is load, and how does it relate to CPU utilization?




load average: the system's average load is the load on the CPU. It does not report CPU utilization. Instead, it is a statistic of the number of processes the CPU is running plus the number waiting for the CPU over a period of time, that is, the length of the CPU's run queue. (On Linux, tasks in uninterruptible sleep, typically waiting on disk I/O, are counted as well.) The smaller the number, the better.
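As a rough illustration, the three numbers that top shows can also be read from /proc/loadavg on Linux. The sketch below parses one line in that file's layout. The function name parse_loadavg and the sample values are made up for illustration; Python's os.getloadavg() returns the same three averages directly.

```python
def parse_loadavg(text):
    """Parse one /proc/loadavg line into (load1, load5, load15, running, total)."""
    fields = text.split()
    # The first three fields are the 1-, 5-, and 15-minute load averages.
    load1, load5, load15 = (float(x) for x in fields[:3])
    # The fourth field is "runnable entities / total entities".
    running, total = (int(x) for x in fields[3].split("/"))
    return load1, load5, load15, running, total

# Sample line with made-up values in the /proc/loadavg layout.
sample = "0.52 0.58 0.59 1/389 12345"
print(parse_loadavg(sample))
```

On a real machine the line would come from open("/proc/loadavg").read() rather than from a hard-coded string.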


1. Difference between CPU load and CPU utilization


CPU utilization: shows the percentage of CPU time that programs occupy while running.

CPU load: shows the average number of tasks that are using or waiting to use the CPU over a period of time. High CPU utilization does not necessarily mean a high load. For example, on a single-core machine, if one program needs the CPU all the time, CPU utilization may reach 100%, but the load is close to 1 because the CPU is handling only one job. What if two such programs run at the same time? CPU utilization is still 100%, but the load becomes 2. In other words, the higher the load, the more often the CPU must switch between different jobs.
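The difference between the two numbers can be sketched with a toy model. This is not how the kernel measures anything; the function utilization_and_load and its inputs are hypothetical. Each list entry is the number of tasks that want the CPU during one tick.

```python
def utilization_and_load(runnable_per_tick, cores=1):
    """Return (utilization in percent, average load) for a toy trace.

    runnable_per_tick: number of tasks wanting the CPU at each tick.
    """
    ticks = len(runnable_per_tick)
    # A core is busy on a tick if at least one task wants it.
    busy = sum(min(n, cores) for n in runnable_per_tick)
    utilization = 100.0 * busy / (ticks * cores)
    # Load counts every task, running or waiting.
    load = sum(runnable_per_tick) / ticks
    return utilization, load

print(utilization_and_load([1] * 10))  # one CPU-bound program
print(utilization_and_load([2] * 10))  # two CPU-bound programs
```

In both cases the single core is always busy, so utilization is the same, while the load doubles with the second program.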


Examples


There is an interesting analogy on the Internet that uses a phone call to explain the difference between the two. Here it is, explained according to my own understanding.


Picture a public telephone booth: one person is on the phone and four people are waiting. Each person may use the phone for only one minute. Anyone who has not finished the call within that minute must hang up and go to the back of the queue to wait for the next round. The phone is the CPU here, and the people who are calling or waiting to call are the tasks.
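The booth rule is essentially round-robin scheduling with a one-minute time slice. A minimal sketch, assuming each person is represented only by the number of minutes of talking they still need (the function phone_booth is hypothetical):

```python
from collections import deque

def phone_booth(call_minutes, slice_minutes=1):
    """Round-robin booth: return how many people are present at each turn."""
    queue = deque(call_minutes)
    people_per_turn = []
    while queue:
        # Everyone in the queue, including the caller, counts as a task.
        people_per_turn.append(len(queue))
        remaining = queue.popleft() - slice_minutes
        if remaining > 0:
            # Not finished: hang up and rejoin at the back of the queue.
            queue.append(remaining)
    return people_per_turn

print(phone_booth([1, 1, 1, 1, 1]))  # five people with one-minute calls
print(phone_booth([2, 1]))           # the first caller needs two turns
```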


While the phone booth is in use, some people leave once they have finished their call, some rejoin the queue because they have not finished, and new people arrive to wait in line. These changes correspond to the number of tasks increasing or decreasing. To calculate the average load, we count the number of people at regular intervals (the Linux kernel samples roughly every 5 seconds) and average the counts over the last 1, 5, and 15 minutes, which gives the 1-, 5-, and 15-minute load averages.
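On Linux the averaging is not a plain arithmetic mean: the kernel keeps an exponentially damped moving average that it updates at each sample. The sketch below shows that idea in floating point. The kernel itself uses fixed-point arithmetic, and the names update_load and simulate are hypothetical.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumed, matching Linux's ~5 s)

def update_load(load, n_active, window_seconds, interval=SAMPLE_INTERVAL):
    """Fold one sample of the active-task count into a damped average."""
    decay = math.exp(-interval / window_seconds)
    return load * decay + n_active * (1.0 - decay)

def simulate(samples, windows=(60, 300, 900)):
    """Return the (1 min, 5 min, 15 min) averages after a sequence of samples."""
    loads = [0.0] * len(windows)
    for n_active in samples:
        loads = [update_load(l, n_active, w) for l, w in zip(loads, windows)]
    return tuple(loads)

# One task running constantly for one minute (12 samples, 5 s apart).
print(simulate([1] * 12))
```

Starting from an idle system, the 1-minute figure reacts fastest and the 15-minute figure slowest, which is why comparing the three shows whether load is rising or falling.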


Some people pick up the phone and talk for the full minute, while others spend the first 30 seconds looking for the number or hesitating, and are really on the phone only for the last 30 seconds. If the phone is the CPU and each person is a task, the first person (task) has high CPU utilization and the second person (task) has low CPU utilization.


Of course, a real CPU does not sit idle for the first 30 seconds and then work for the last 30. The point is only that some programs involve a lot of computation, so their CPU utilization is high, while others involve little computation, so their CPU utilization is naturally low. Either way, whether CPU utilization is high or low has no necessary relationship to how many tasks are queued behind.


2. What load is ideal?


This is debated, and everyone has their own view. Personally, I consider a CPU load less than or equal to 0.5 to be an ideal state.


Set aside how powerful a CPU is and how many tasks it can process per second. We can treat that as irrelevant here, although in reality it is not. When evaluating CPU load, we look only at the length of the task queue over a 5-minute window. If the queue length is 1 every time we check during those 5 minutes, the CPU load is 1. On a single-core CPU, a load that stays at 1 means no task is waiting in the queue, which is not bad.


But my server has two dual-core CPUs, which is equivalent to 4 cores. If the load of each core is 1, the total load is 4. This means that a CPU load that stays at around 4 for a long time is acceptable on my server.


But a load of 1 per core is not an ideal state! It means the CPU is always busy and never free. The ideal state suggested on the Internet is a load of about 0.7 per core, and I agree: multiply 0.7 by the number of cores to get the server's ideal CPU load. For my server, 0.7 × 4 = 2.8, so a load below about 3.0 is fine.
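The rule of thumb above is simple arithmetic, sketched here with hypothetical helper names. The 0.7 target is the figure quoted in the text, not a value defined by Linux.

```python
def ideal_load(cores, per_core_target=0.7):
    """Ideal total load: the per-core target multiplied by the core count."""
    return cores * per_core_target

def is_acceptable(load, cores, per_core_target=0.7):
    """True if the observed load is at or below the ideal total load."""
    return load <= ideal_load(cores, per_core_target)

print(ideal_load(4))          # target for a 4-core server
print(is_acceptable(2.5, 4))  # below the target
print(is_acceptable(4.0, 4))  # every core fully busy
```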


3. How to reduce the CPU load of the server?


The simplest way is to move to a better server. Do not think only about improving the performance of the CPU; that alone is of little use, because the CPU needs the cooperation of the other software and hardware to perform at its best.


With the rest of the server's configuration held equal, the number of CPUs and the number of cores per CPU affect the CPU load, because tasks are ultimately assigned to CPU cores for processing. Two CPUs are better than one CPU, and dual cores are better than a single core.


So we need to remember that, CPU performance differences aside, CPU load is judged relative to the number of cores! As the saying goes, "as many cores as there are, that much load can be carried".
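To compare machines with different core counts, the load can be divided by the number of cores. A minimal sketch using Python's os.getloadavg() and os.cpu_count(); the helper names are hypothetical, and os.getloadavg() is available only on Unix-like systems.

```python
import os

def load_per_core(load, cores):
    """Normalize a load figure by the number of cores."""
    return load / max(cores, 1)

def current_load_per_core():
    """Return the 1-, 5-, and 15-minute load averages divided by the core count."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return tuple(load_per_core(load, cores) for load in os.getloadavg())
```

Calling current_load_per_core() on the 4-core server described above with a load of 4 would give 1.0 per core, that is, every core fully occupied.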


4. What CPU utilization is appropriate?


We laymen have often treated CPU utilization as the standard for judging whether a machine has reached full capacity. I regard a sustained CPU utilization of 60-80% as a sign that the machine has a bottleneck.
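For reference, CPU utilization on Linux is usually derived from two readings of the aggregate "cpu" line in /proc/stat: the share of ticks that were not idle between the readings. The sketch below works on two such lines passed in as strings. The function names are hypothetical, the sample numbers are made up, and counting iowait as idle time is one common convention rather than the only one.

```python
def parse_cpu_line(line):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line."""
    # Fields after the label: user nice system idle iowait irq softirq steal
    values = [int(x) for x in line.split()[1:9]]
    # Assumption: treat iowait as idle time.
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return idle, sum(values)

def cpu_utilization(prev_line, curr_line):
    """Percent of non-idle ticks between two readings of the 'cpu' line."""
    prev_idle, prev_total = parse_cpu_line(prev_line)
    curr_idle, curr_total = parse_cpu_line(curr_line)
    delta_total = curr_total - prev_total
    if delta_total <= 0:
        return 0.0  # no time elapsed between the readings
    delta_idle = curr_idle - prev_idle
    return 100.0 * (delta_total - delta_idle) / delta_total

# Two made-up readings taken some time apart.
before = "cpu  100 0 50 800 50 0 0 0 0 0"
after = "cpu  400 0 100 1400 100 0 0 0 0 0"
print(cpu_utilization(before, after))
```

On a live system the two lines would be the first line of /proc/stat, read a second or so apart.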
