Redundant array of independent disks

From Hill2dot0


Redundant array of independent disks (RAID) is a storage technology that combines two or more disk drives, under hardware or software control, to improve fault tolerance and performance. RAID arrays are used frequently on servers but are generally unnecessary for personal computers.


Introduction

Hard drives provide long-term storage for applications and data alike, and the failure of one can be catastrophic. To reduce both the likelihood of data loss and its impact when a failure does occur, system administrators combine a group of relatively inexpensive disks. The idea is that these less expensive disks can be configured to support one another, so that if one disk fails, the other disks can minimize the impact of that failure. This approach is less expensive than trying to design and build a hard drive that would never fail, assuming that is even possible. RAID assumes that hard drives will fail at some point and simply designs around that risk with multiple, inexpensive hard drives. This explains the other common expansion of the acronym: redundant array of inexpensive disks.

RAID systems are also created to improve system performance. Since each individual disk can transfer only a given amount of data per second, multiple disks can be combined to provide higher overall data throughput than a single disk.

RAID arrangements can provide various levels of performance and redundancy, known as RAID 0, RAID 1, RAID 2, and so on. Although at least eleven RAID levels exist, not all of them are practical. The following discussion highlights only the most common RAID levels in use for storage networks.

RAID Levels

[Figure: RAID 1]

RAID levels include:

  • Level 0 -- Striped Disk Array without Fault Tolerance: Provides data striping (spreading out blocks of each file across multiple disk drives) but no redundancy. This improves performance but does not deliver fault tolerance. If one drive fails then all data in the array is lost.
  • Level 1 -- Mirroring and Duplexing: Provides disk mirroring. RAID 1 is the first true RAID level in that it provides fault tolerance for disk arrays. Level 1 provides up to twice the read transaction rate of a single disk and the same write transaction rate as a single disk.
As information is written to disk 1, it is also written to disk 2. This provides fault tolerance: either disk could fail and the data would still be easily recoverable from the other. RAID 1 does not yield significant improvements for operations that rely upon many disk writes, but it can double performance for read operations since reads can be spread over both disks.
The disadvantage to RAID 1 is that it is the least efficient RAID mechanism regarding disk usage. If each hard drive in our example here were 120 GB, we have paid for 240 GB worth of storage, but we can only use 120 GB. Given the low price of storage, however, many network administrators feel that the investment is worthwhile.
  • Level 2 -- Error-Correcting Coding: Not a typical implementation and rarely used, Level 2 stripes data at the bit level rather than the block level.
  • Level 3 -- Bit-Interleaved Parity: Provides byte-level striping with a dedicated parity disk. Level 3, which cannot service multiple requests simultaneously, is also rarely used.
[Figure: RAID 3]
If very high sequential throughput is required of the disk subsystem, RAID 3 can be a good fit. RAID 3 stripes data across multiple disks as in RAID 0, but it also provides a separate disk for parity. So, if any one of the data disks fails, the parity information can be used to recreate the data on that disk. Properly implemented, this RAID configuration has very high read and write throughput and is best suited for very demanding I/O applications such as video editing or live content streaming.
  • Level 4 -- Dedicated Parity Drive: A commonly used implementation of RAID, Level 4 provides block-level striping (like Level 0) with a parity disk. If a data disk fails, the parity data is used to rebuild its contents on a replacement disk. A disadvantage of Level 4 is that the parity disk can become a write bottleneck, because every write must also update it.
  • Level 5 -- Block Interleaved Distributed Parity: Provides data striping at the block level and distributes the parity (error correction) information across all of the disks. This results in excellent performance and good fault tolerance. RAID 5 is one of the most popular RAID systems for fault tolerance and I/O speed.
In this arrangement, each write is spread among all of the disks in the array minus one. The remaining disk holds parity information that can be used to recreate any one missing block of the write. As shown in the figure at the right, write 2 is broken up into a number of blocks, each block being written to a different disk. In this particular example, disk 1 serves as the parity disk for write 2.
[Figure: RAID 5]
There are several advantages to this arrangement. The first is data redundancy. Distributing parity information ensures that if a single disk fails, the data and parity remaining on the other disks can be used to recreate the lost data. If two disks fail at once, information on all of the disks would be lost, but the odds of simultaneous failure of multiple disks are low.
Since I/O is spread across multiple disks, RAID 5 also has very good data transfer rates compared to non-RAID solutions. Less disk space is “lost” compared to RAID 1 solutions since the amount of parity information always equals the size of one disk. Thus, if five 100 GB disks were used for a RAID 5 implementation, 100 GB, or one-fifth of the total available storage, would be unusable. If eight 100 GB disks were used for a RAID 5 implementation, the total parity would still equal 100 GB spread out among the eight disks; one-eighth of the total disk space would be used for parity information. This economy of scale gives RAID 5 a definite edge over RAID 1 implementations.
Due to the flexibility and advantages of RAID 5 implementations, this is one of the most common RAID arrays in production and is suitable for almost all applications that would benefit from redundancy and speed. It is hard to find an application that would not benefit from these features.
  • Level 6 -- Independent Data Disks with Double Parity: Provides block-level striping with parity data distributed across all disks. Although less common than RAID 5, RAID 6 can survive the failure of up to two drives at once. RAID 6 is an extension of RAID 5 in which two parity blocks, rather than one, are written for every data write. This means more drive space is used for parity information and write performance is poorer, but this RAID level is well suited to mission-critical applications.
  • Level 0+1 – A Mirror of Stripes: Not one of the original RAID levels. Two RAID 0 stripes are created, and a RAID 1 mirror is created over them. Used for both replicating and sharing data among disks.
  • Level 10 – A Stripe of Mirrors: Not one of the original RAID levels. Multiple RAID 1 mirrors are created, and a RAID 0 stripe is created over them.
  • Level 7: A trademark of Storage Computer Corporation that adds caching to Levels 3 or 4.
  • RAID S: EMC Corporation's proprietary striped parity RAID system used in its Symmetrix storage systems.
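
The parity-based levels above (3, 4 and 5) all depend on being able to recreate one missing block from the surviving blocks. Parity of this kind is typically computed as a byte-wise XOR, although the text above does not name the operation. The following is a minimal Python sketch under that assumption. The function names and the rotation rule in `parity_disk` are hypothetical and chosen only for illustration; real controllers use their own layouts.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (assumed parity operation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_disk(stripe, num_disks):
    """Hypothetical rotation rule: the disk index that holds parity for a stripe."""
    return stripe % num_disks

def write_stripe(stripe, data_blocks, num_disks):
    """Lay out num_disks - 1 data blocks plus one parity block, one per disk."""
    assert len(data_blocks) == num_disks - 1
    row = list(data_blocks)
    row.insert(parity_disk(stripe, num_disks), xor_blocks(data_blocks))
    return row

def rebuild(row, failed_disk):
    """Recreate the block on a failed disk by XOR-ing all surviving blocks."""
    survivors = [blk for i, blk in enumerate(row) if i != failed_disk]
    return xor_blocks(survivors)

# Four disks: three data blocks and one parity block per stripe.
row = write_stripe(1, [b"AAAA", b"BBBB", b"CCCC"], num_disks=4)
lost = row[2]                         # pretend disk 2 has failed
assert rebuild(row, failed_disk=2) == lost
```

Because XOR is its own inverse, the same `rebuild` call works whether the failed disk held data or parity. It cannot recover two missing blocks in the same stripe, which matches the single-failure limit described for RAID 5.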

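The capacity arithmetic in the RAID 1 and RAID 5 discussions above can be checked with a short Python sketch. It assumes equal-sized disks, ignores controller and file system overhead, and treats RAID 1 as every disk holding the same copy. The function names are hypothetical.

```python
def usable_capacity_gb(level, num_disks, disk_gb):
    """Usable space for equal-sized disks (simplified; ignores metadata overhead)."""
    if level == 0:
        return num_disks * disk_gb          # striping only, no redundancy
    if level == 1:
        return disk_gb                      # every disk holds the same copy
    if level in (3, 4, 5):
        return (num_disks - 1) * disk_gb    # one disk's worth of parity
    if level == 6:
        return (num_disks - 2) * disk_gb    # two disks' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")

def overhead_fraction(level, num_disks, disk_gb):
    """Fraction of raw capacity spent on redundancy."""
    raw = num_disks * disk_gb
    return (raw - usable_capacity_gb(level, num_disks, disk_gb)) / raw

print(usable_capacity_gb(1, 2, 120))   # 120: two mirrored 120 GB disks
print(usable_capacity_gb(5, 5, 100))   # 400: five 100 GB disks in RAID 5
print(overhead_fraction(5, 8, 100))    # 0.125: one-eighth used for parity
```

The RAID 5 overhead shrinks as disks are added, while a two-disk RAID 1 mirror always spends half of its raw capacity on redundancy.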
