Detailed explanation: ten key parameters of server processors

Many people know PC processors well, including their parameters, design, and application performance. Enterprise server CPUs, however, run on different platforms, and their parameters differ markedly from those of consumer CPUs. This article briefly introduces ten server processor parameters to help readers better understand what they mean in server products and to support further study of enterprise hardware.

1. Server processor main frequency

The main frequency of a server processor, also called the clock frequency, is measured in MHz and indicates the CPU's operating speed. Main frequency = external clock (the CPU's reference frequency) × multiplier. Many people think that the main frequency alone determines how fast a CPU is. That view is one-sided, and for servers it is simply off the mark. So far, no definitive formula relates main frequency to actual computing speed, and even the two major processor manufacturers, Intel and AMD, disagree sharply on this point. Intel's product roadmap shows that it puts great emphasis on raising clock frequency. As for other manufacturers, one comparison found that a 1 GHz Transmeta processor ran about as efficiently as a 2 GHz Intel processor.

Therefore, the main frequency is not a direct measure of a CPU's actual computing power. It only indicates how fast the digital pulse signal oscillates inside the CPU. Intel's own products show this: a 1 GHz Itanium chip can perform almost as fast as a 2.66 GHz Xeon/Opteron, and a 1.5 GHz Itanium 2 is about as fast as a 4 GHz Xeon/Opteron. A CPU's operating speed also depends on the performance of every stage of its pipeline.

Of course, the main frequency does affect actual computing speed. It is just one aspect of CPU performance and does not represent the CPU's overall performance.
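
The relationship between main frequency, external clock (the reference frequency that is multiplied), and multiplier can be written as a few lines of Python. This is only a sketch: the function name and the 200 MHz and 14x values are illustrative assumptions, not figures from the article.

```python
def main_frequency_mhz(external_clock_mhz: float, multiplier: float) -> float:
    """Main frequency = external clock x multiplier (inputs are illustrative)."""
    return external_clock_mhz * multiplier

# Hypothetical example: a 200 MHz external clock with a 14x multiplier.
print(main_frequency_mhz(200, 14))  # 2800 MHz, i.e. 2.8 GHz
```

As the section stresses, the resulting number says how fast the clock ticks, not how much work the CPU completes per tick.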

2. Server Front Side Bus (FSB) Frequency

The front side bus (FSB) frequency (i.e. the bus frequency) directly affects how fast the CPU and memory can exchange data. It can be calculated with a formula: data bandwidth = (bus frequency × data bus width) / 8. That is, the maximum bandwidth of data transfer depends on the width of the data transferred simultaneously and on the transfer frequency. For example, the current Xeon Nocona, which supports 64-bit computing, has an 800 MHz front side bus, so by this formula its maximum data transfer bandwidth is 6.4 GB/s.

The difference between the external clock (the CPU's reference frequency) and the front side bus (FSB) frequency: the front side bus speed is the speed of data transfer, while the external clock is the speed at which the CPU and motherboard run in sync. In other words, a 100 MHz external clock means the digital pulse signal oscillates 100 million times per second, while a 100 MHz front side bus means the CPU can accept 100 MHz × 64 bit ÷ 8 bit/byte = 800 MB/s of data.
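
The bandwidth formula above is easy to check with a short Python sketch. The function name is hypothetical, and the 64-bit default bus width is the value used in the article's own examples.

```python
def bus_bandwidth_mb_s(bus_frequency_mhz: float, bus_width_bits: int = 64) -> float:
    """Peak bandwidth in MB/s = (bus frequency in MHz x bus width in bits) / 8."""
    return bus_frequency_mhz * bus_width_bits / 8

print(bus_bandwidth_mb_s(800))  # 6400.0 MB/s, i.e. 6.4 GB/s
print(bus_bandwidth_mb_s(100))  # 800.0 MB/s
```

These are peak figures; the formula says nothing about how much of that bandwidth a real workload actually uses.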

In fact, the arrival of the "HyperTransport" architecture has changed what front side bus (FSB) frequency means in practice. Previously, the IA-32 architecture required three important components: the memory controller hub (MCH), the I/O controller hub, and the PCI hub. Typical examples are Intel's 7501 and 7505 chipsets, which are tailor-made for dual Xeon processors. Their MCH provides the CPU with a 533 MHz front side bus which, paired with DDR memory, gives a front side bus bandwidth of up to 4.3 GB/s. However, as processor performance keeps rising, this design creates many problems for the system architecture.

The "HyperTransport" architecture not only solves these problems but also raises bus bandwidth more effectively. In the AMD Opteron, for example, the flexible HyperTransport I/O bus architecture lets the processor integrate the memory controller, so it exchanges data with memory directly instead of going through the system bus to the chipset. In that case, it is hard to say what the front side bus (FSB) frequency of an AMD Opteron processor even is.

3. Processor external clock

The external clock is the CPU's reference frequency, also measured in MHz. It determines the running speed of the whole motherboard. Put plainly, on the desktop what we call overclocking means raising the CPU's external clock (the CPU multiplier is, as a rule, locked). That much is easy to understand. For server CPUs, however, overclocking is absolutely not allowed. As noted, the CPU's external clock determines the motherboard's running speed, and the two run in sync. If a server CPU is overclocked by changing the external clock, the two run asynchronously (many desktop motherboards support asynchronous operation), and the whole server system becomes unstable.

In most computer systems, the external clock is also the speed at which memory and the motherboard run in sync. In that sense, the CPU's external clock connects directly to memory and keeps the two running synchronously. The external clock and the front side bus (FSB) frequency are easily confused; the front side bus section explains the difference between the two.

4. CPU bits and word length

Bit: digital circuits and computer technology use binary code, whose only symbols are “0” and “1”; each “0” or “1” is one “bit” in the CPU.

Word length: in computer technology, the number of binary bits the CPU can process at one time (in a unit of time) is called the word length. A CPU that can process data with a word length of 8 bits is therefore usually called an 8-bit CPU. Likewise, a 32-bit CPU can process binary data with a word length of 32 bits per unit time. The difference between a byte and the word length: because common English characters can be represented with 8 binary bits, 8 bits are usually called one byte. Word length is not fixed and differs from CPU to CPU. An 8-bit CPU can process only one byte at a time, a 32-bit CPU can process 4 bytes at a time, and by the same reasoning a 64-bit CPU can process 8 bytes at a time.
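
The byte counts quoted above follow directly from dividing the word length by 8 bits per byte. A minimal Python sketch, with a hypothetical function name, makes the arithmetic explicit:

```python
def bytes_per_operation(word_length_bits: int) -> int:
    """Bytes handled at once by a CPU with the given word length (8 bits per byte)."""
    return word_length_bits // 8

for bits in (8, 32, 64):
    print(f"{bits}-bit CPU: {bytes_per_operation(bits)} byte(s) at a time")
```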

5. Multiplier

The multiplier is the ratio of the CPU's main frequency to its external clock. At the same external clock, the higher the multiplier, the higher the CPU frequency. In practice, though, at a fixed external clock a high multiplier by itself means little, because the data transfer speed between the CPU and the system is limited. A CPU that chases a high main frequency through a high multiplier hits an obvious "bottleneck": the peak rate at which the CPU can fetch data from the system cannot keep up with the speed at which it computes. Generally, Intel CPUs other than engineering samples have locked multipliers, whereas AMD previously did not lock them.
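
One rough way to picture the bottleneck is to ask how many bytes the bus can deliver per core clock cycle as the multiplier rises while the bus stays the same. The sketch below is only an illustration: the function name, the 6400 MB/s bus, the 200 MHz external clock, and the multipliers are all assumed values, and real systems are also shaped by caches and memory latency.

```python
def bytes_per_core_cycle(bus_bandwidth_mb_s: float, external_clock_mhz: float,
                         multiplier: float) -> float:
    """Bus bytes available per core clock cycle (MB/s divided by MHz)."""
    core_clock_mhz = external_clock_mhz * multiplier
    return bus_bandwidth_mb_s / core_clock_mhz

# Hypothetical system: 6400 MB/s bus, 200 MHz external clock, three multipliers.
for multiplier in (10, 14, 18):
    print(multiplier, round(bytes_per_core_cycle(6400, 200, multiplier), 2))
```

The printed figure shrinks as the multiplier grows: the core runs faster, but each cycle has less bus data available to it.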

6. CPU cache

Cache size is another important CPU metric, and the cache's structure and size have a large effect on CPU speed. The cache inside the CPU runs at a very high frequency, generally the same frequency as the processor, and is far more efficient than system memory or the hard disk. In real work the CPU often needs to read the same data block repeatedly, and a larger cache greatly raises the hit rate for reads inside the CPU, so it need not go to memory or the hard disk, which improves system performance. However, because of CPU die area and cost, caches are very small.
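
The effect of capacity on hit rate can be seen in a toy simulation. The sketch below assumes a least-recently-used (LRU) replacement policy and a made-up workload that reads the same 8 blocks over and over; real CPU caches are set-associative hardware structures, so this only illustrates the principle.

```python
from collections import OrderedDict

def hit_rate(accesses, capacity):
    """Fraction of accesses served by an LRU cache holding `capacity` blocks."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)

# Hypothetical workload: the same 8 blocks read repeatedly, 100 passes.
accesses = list(range(8)) * 100
print(hit_rate(accesses, 4))  # cache too small for the working set
print(hit_rate(accesses, 8))  # cache large enough to hold every block
```

With 4 blocks of capacity every read misses, because each block is evicted before it is needed again. With 8 blocks only the first pass misses.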

L1 cache is the first level of CPU cache and is split into a data cache and an instruction cache. The capacity and structure of the built-in L1 cache strongly affect CPU performance. However, cache memory is built from static RAM, whose structure is relatively complex, and since the CPU die cannot be too large, the L1 cache cannot be made very large either. The L1 cache of a typical server CPU is usually 32 to 256 KB.

L2 cache is the second level of CPU cache and may be on-chip or off-chip. An on-chip L2 cache runs at the main frequency, while an off-chip L2 cache runs at only half of it. L2 cache capacity also affects CPU performance, and the rule is the bigger the better. The largest L2 cache on home CPUs is 512 KB, while server and workstation CPUs have L2 caches of 256 KB to 1 MB, and some reach 2 MB or 3 MB.

L3 cache comes in two kinds: early designs were external, while current ones are built in. In practice, an L3 cache further reduces memory latency and improves processor performance on large data sets. Lower memory latency and better large-data performance are very helpful for games, and in the server space adding L3 cache still gives a significant performance gain. For example, a configuration with a larger L3 cache uses physical memory more efficiently, so it can handle more data requests than the slower disk I/O subsystem can. Processors with larger L3 caches also provide more efficient file system caching and shorter message and processor queue lengths.

In fact, the earliest L3 cache appeared with AMD's K6-III processor. At that time, manufacturing limits kept the L3 cache off the chip, so it was integrated on the motherboard. An L3 cache that could only run in sync with the system bus frequency was not much different from main memory. L3 cache was later used in Intel's Itanium processor for the server market, and then in the Pentium 4 Extreme Edition (P4EE) and Xeon MP. Intel also plans an Itanium 2 processor with 9 MB of L3 cache and a dual-core Itanium 2 processor with 24 MB of L3 cache.

Still, L3 cache is basically not very important for raising processor performance. For example, a Xeon MP processor with 1 MB of L3 cache is still no match for the Opteron, which shows that raising the front side bus brings a more effective performance gain than enlarging the cache.

7. CPU extended instruction sets

A CPU relies on instructions to compute and to control the system, and every CPU is designed with an instruction system matched to its hardware circuits. The strength of its instructions is another important CPU metric, and the instruction set is one of the most effective tools for improving microprocessor efficiency. In today's mainstream architectures, instruction sets fall into two groups: complex instruction sets (CISC) and reduced instruction sets (RISC). In terms of specific applications, Intel's MMX (Multi Media Extended), SSE (Streaming SIMD Extensions), SSE2 (Streaming SIMD Extensions 2, where SIMD stands for single instruction, multiple data), SSE3 and AMD's 3DNow! are all extended instruction sets that strengthen the CPU's multimedia, graphics, and Internet processing.

We usually call a CPU's extended instruction set simply its "CPU instruction set". SSE3 is currently the smallest of these instruction sets: MMX contained 57 instructions, SSE added 70 (50 of them SIMD floating-point instructions), SSE2 contained 144, and SSE3 contains 13. SSE3 is also currently the most advanced. Intel's Prescott processors already support SSE3, AMD will add SSE3 support in its future dual-core processors, and Transmeta's processors will support it as well.

8. CPU core and I/O operating voltage

Starting with 586-class CPUs, the CPU operating voltage is split into a core voltage and an I/O voltage, and the core voltage is usually less than or equal to the I/O voltage. The core voltage depends on the CPU's manufacturing process: in general, the smaller the process, the lower the core operating voltage. The I/O voltage is generally 1.6 to 5 V. Lower voltage addresses the problems of excessive power consumption and excessive heat.
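
Why lower voltage helps so much can be sketched with the common textbook approximation that dynamic power in CMOS logic scales with capacitance × voltage² × frequency. That model is not taken from this article, and the function name and the 1.5 V and 1.2 V values below are hypothetical.

```python
def relative_dynamic_power(new_voltage: float, old_voltage: float) -> float:
    """Ratio of dynamic power after a voltage change, with C and f held fixed."""
    return (new_voltage / old_voltage) ** 2

# Hypothetical core voltages: dropping from 1.5 V to 1.2 V.
print(round(relative_dynamic_power(1.2, 1.5), 2))  # 0.64
```

Under this assumption, a 20% voltage reduction cuts dynamic power by roughly a third, before counting any change in frequency or leakage.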

9. Manufacturing Process

The manufacturing process, quoted in microns or nanometers, refers to the distance between circuits inside the IC. The trend is toward ever higher density: a denser process means that an IC of the same area can hold a denser, more complex circuit design. The main processes today are 180 nm, 130 nm, and 90 nm, and a 65 nm process has recently been officially announced.

10. Package form

CPU packaging uses specific materials to encase the CPU chip or CPU module and protect it from damage; a CPU generally must be packaged before it can be delivered to users. The package type depends on how the CPU is installed and on the integration design of the device. Broadly, CPUs installed in Socket-type sockets usually use PGA (Pin Grid Array) packaging, while CPUs installed in Slot x slots all use SEC (single edge contact cartridge) packaging. There are also packaging technologies such as PLGA (Plastic Land Grid Array) and OLGA (Organic Land Grid Array). With market competition growing ever fiercer, the current direction of CPU packaging technology is mainly cost saving.
