
Disk arrays and disk mirroring

A disk array combines multiple disks into an array that is used as a single disk. Data is stored across the member disks in a scheme that depends on the type of array, and when data is accessed the relevant disks in the array work together, which greatly reduces access time and gives better space utilization. The different techniques used by disk arrays are called RAID levels; different levels suit different systems and applications and address data security in different ways. The levels currently recognized as standard in the industry are RAID0 through RAID5. The level number does not indicate how advanced the technology is: level 5 is not superior to level 3, and level 1 is not inferior to level 4. Which RAID level to choose depends on the user's operating environment and application, not on how high the level number is.

RAID1

RAID1 is a technology that uses disk mirroring.

Disk mirroring was used in many systems before RAID1 existed. It works by adding a backup disk alongside the working disk; the two disks hold exactly the same data, and every write to the working disk is also written to the backup disk. Disk mirroring is not necessarily RAID1. For example, Novell NetWare also provides disk mirroring, but that does not mean NetWare has RAID1 functionality. In general, disk mirroring and RAID1 differ in two main ways. RAID1 makes no distinction between a working disk and a backup disk: the disks can operate at the same time and reads can overlap, and different mirrored disks can even be written at the same time. This optimization is called load balancing. For example, if several users want to read data at the same time, the system can drive the mirrored disks simultaneously and read from them in parallel, which reduces the load on the system and increases I/O performance.
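
The mirroring and load-balancing behavior described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name MirroredVolume is hypothetical, each disk is an in-memory list, and round-robin is just one possible way to spread reads across the mirrors.

```python
from itertools import cycle


class MirroredVolume:
    """Toy RAID1 volume: every member disk holds a full copy of the data."""

    def __init__(self, copies=2, blocks=8):
        # Each "disk" is modeled as a plain list of blocks (in-memory stand-in).
        self.disks = [[None] * blocks for _ in range(copies)]
        self._next_reader = cycle(range(copies))

    def write(self, block_no, data):
        # A write goes to every mirror, so all copies stay identical.
        for disk in self.disks:
            disk[block_no] = data

    def read(self, block_no):
        # Reads rotate across the mirrors (simple round-robin load balancing).
        disk_no = next(self._next_reader)
        return disk_no, self.disks[disk_no][block_no]


volume = MirroredVolume(copies=2)
volume.write(0, b"ledger")
served_by = [volume.read(0)[0] for _ in range(4)]
print(served_by)  # [0, 1, 0, 1]
```

Because no disk is designated as the backup, any member can serve a read, which is what separates this model from a working disk with a passive copy.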

In addition, the disks in a RAID1 array are organized by disk spanning and data is stored in striped segments, so its read performance is almost the same as that of RAID0. The structure of RAID1 makes it clear that it differs from ordinary disk mirroring.

RAID2

RAID2 splits data into bits or blocks, adds Hamming code, and interleaves the result across the disks of the array at the same address on each disk; that is, on every disk the data sits on the same cylinder, track, and sector. RAID2 is designed around synchronized access: when data is accessed, the whole array operates together and the same position on every disk is accessed in parallel, which gives the best access time. The bus is specially designed to transfer the accessed data in parallel over a wide bandwidth, which gives the best transfer time. RAID2 performs best for large file access, but performance drops when files are too small: disks are accessed in units of sectors, while RAID2 accesses all disks in parallel in units of bits, so a transfer smaller than one sector greatly reduces performance. RAID2 is designed for computers that need continuous access to large amounts of data, such as mainframes and supercomputers, or workstations used for image processing or CAD/CAM. It is not suitable for general multi-user environments, network servers, minicomputers, or PCs.

For data security RAID2 borrows memory array technology, using several additional disks for single-bit correction and double-bit detection. How many extra disks are needed depends on the method and structure used; for example, an array of eight data disks may require three additional disks, and a high-end array with thirty-two data disks may require seven additional disks.
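
To illustrate how a Hamming code corrects a single bad bit, here is a minimal sketch using the textbook Hamming(7,4) code, which can be pictured as four data disks plus three check disks holding one bit each. The function names are hypothetical, the code is not taken from any RAID2 product, and it covers single-bit correction only (double-bit detection needs an extra overall parity bit, which is omitted here).

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based position of the flipped bit, or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the bad position
    if position:
        c[position - 1] ^= 1
    return c, position


codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1  # flip the bit at position 5, as if one disk returned a bad bit
repaired, position = hamming74_correct(damaged)
print(position, repaired == codeword)  # 5 True
```

The syndrome identifies which position disagrees with the check bits, so the array can repair the bit without knowing in advance which disk misbehaved.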
RAID3

RAID3 stores and accesses data the same way as RAID2, but for security it uses parity instead of Hamming code for error correction and detection, so only one additional parity disk is required. The parity value is computed by XORing the corresponding bits of every data disk, and the result is written to the parity disk. Every data modification requires the parity to be recalculated. If a data disk fails and is replaced with a new one, the contents of the whole array, including the parity disk, are used to recompute the failed disk's data and write it to the new disk. If the parity disk fails, the parity values are recalculated from the data disks. This is how the array meets its fault-tolerance requirement.
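
The XOR parity calculation and the rebuild of a failed disk can be sketched as follows. The block contents and the helper name xor_blocks are made up for illustration; a real array works on whole disks rather than three-byte blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding one small block (illustrative values).
data_disks = [b"\x10\x20\x30", b"\x0f\x0e\x0d", b"\xaa\xbb\xcc"]
parity_disk = xor_blocks(data_disks)

# Simulate losing data disk 1 and rebuilding it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity_disk]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_disks[1])  # True
```

The same function serves both purposes because XOR is its own inverse: XORing the parity with the surviving data blocks leaves exactly the missing block.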

Compared with RAID1 and RAID2, RAID3 makes better use of disk space, about 85%, because only one disk is given over to parity. Its performance is slightly below RAID2 because of the parity calculation. Reads of large files perform well thanks to synchronized parallel access, but writes are slower because the contents of the parity disk must be recalculated and updated. RAID3 is used in the same way as RAID2: it suits applications with large files and heavy data input and output, and it is not suitable for PCs or network servers.

RAID4

RAID4 also uses a parity disk, but differently from RAID3. RAID4 divides data into segments by sector. The segments at the same position on each data disk form one parity segment, which is stored on the parity disk. This lets different reads proceed in parallel on different disks and greatly improves the read performance of the array. Writes, however, are limited by the parity disk and can only be done one at a time: all the disks are started, every data segment that belongs to the same parity segment is read, and the parity is computed together with the data to be written. Even so, small-file writes are still faster than on RAID3, because the parity calculation is simpler than the bit-level calculation. The parity disk nevertheless becomes the bottleneck of RAID4 and reduces performance, so RAID4 has been largely displaced by RAID5 and is rarely used.

RAID5

RAID5 avoids the bottleneck of RAID4 by rotating the parity data across all the disks instead of using a dedicated parity disk. In the first segment of the array, the first disk holds the parity value and the disks from the second through the last hold data. In the next segment the second disk holds the parity value and the disks from the third onward, wrapping around to the first, hold data, and so on until the end.
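
The rotating placement just described can be sketched as a small function. It assumes the simple rotation given in the text (parity moves one disk to the right with each segment); real controllers offer several layout variants, and the function name and labels here are hypothetical.

```python
def raid5_stripe_layout(stripe, disks):
    """Return the role of each disk in one stripe: 'P' for parity, 'D<k>' for data."""
    parity_disk = stripe % disks  # parity advances by one disk per stripe
    roles = [None] * disks
    roles[parity_disk] = "P"
    # Data blocks start on the disk after the parity disk and wrap around.
    for k in range(disks - 1):
        roles[(parity_disk + 1 + k) % disks] = f"D{k}"
    return roles


for stripe in range(4):
    print(stripe, raid5_stripe_layout(stripe, 4))
# 0 ['P', 'D0', 'D1', 'D2']
# 1 ['D2', 'P', 'D0', 'D1']
# 2 ['D1', 'D2', 'P', 'D0']
# 3 ['D0', 'D1', 'D2', 'P']
```

Over four consecutive stripes every disk holds parity exactly once, so no single disk receives all the parity writes.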
In the figure, the first parity block is calculated from A0, A1, ..., B1, B2, and the second parity block from B3, B4, ..., C4, D0; that is, each parity value is calculated from the data in the segments at the same position on each disk. This greatly improves access performance for small files: not only can reads proceed at the same time, but several writes can even proceed at the same time, such as writing data to disk 1 with its parity block on disk 2 while writing data to disk 4 with its parity block on disk 1. This makes RAID5 the best fit for online transaction processing (OLTP), such as banking, finance, and stock market systems, and for large databases, because in these applications each transaction involves little data, disk input and output is frequent, and fault tolerance is required.

In practice the performance of RAID5 is not this ideal. Any data modification requires the data covered by the same parity block to be read out and modified, then the parity to be recalculated and written back. This is the RMW cycle (read-modify-write cycle; the cycle itself does not include the parity calculation). Because changing one block affects the rest in this way, the figures for an array of N disks are:

R: N (all disks can be read at the same time)
W: 1 (only one write at a time)
S: N-1 (usable capacity)

RAID5 is more complicated to control, especially when the array is controlled by hardware, because this method has more to manage than the other RAID levels and handles more input and output requests. It must be fast and must also process the data, calculate parity, and perform error correction, so it costs more. It is best suited to OLTP; for image processing and similar work it does not necessarily give the best performance.

RAID0 and RAID1 suit PCs and PC-related systems, such as small network servers and workstations that need large disk capacity and fast disk access, and they are the cheaper options. RAID3 and RAID4 suit large computers and video, CAD/CAM, and similar processing. RAID5 is mostly used for OLTP; driven by the urgent needs of financial institutions and large data processing centers, it is the most widely used and best known. RAID2 is used less. Other levels such as RAID6, RAID7, and even RAID10 are defined by individual manufacturers and have no consistent standard.
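
The cost of a small write can be made concrete with a short sketch of the read-modify-write cycle. It uses a common shortcut that is an assumption here rather than something the article states: the new parity equals the old parity XOR the old data XOR the new data, so only the target block and the parity block need to be read. The helper names and block values are hypothetical.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Parity of a whole stripe, computed from every data block."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_bytes(parity, block)
    return parity


def small_write(stripe, parity, disk_no, new_data):
    """Read-modify-write: read old data and old parity, then write both back."""
    old_data = stripe[disk_no]  # read the block being replaced
    new_parity = xor_bytes(xor_bytes(parity, old_data), new_data)
    stripe[disk_no] = new_data  # write the new data block
    return new_parity  # the caller writes this to the parity block


stripe = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]
parity = full_parity(stripe)
parity = small_write(stripe, parity, 1, b"\xff\x00")
print(parity == full_parity(stripe))  # True
```

Even with the shortcut, one logical write costs two reads and two writes, which is why small writes on RAID5 fall short of the ideal case.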

Webmaster's note (Computer Software and Hardware Application Network): This article was collected from the internet and its author is unknown. If you are the author or know who is, please leave a comment with the name and we will add it. Thank you for your support.