Redundant array of independent disks

The goal of a redundant array of independent disks (originally known as a redundant array of inexpensive disks) -- or RAID -- is to provide large reliable virtual disks that can be much larger than commonly available disk drives.

There are 6 official levels: RAID 0 to RAID 5. Levels can also be combined, as in RAID 10 and RAID 0+1.

RAID arrays are usually implemented with identically-sized disk drives.

Hardware vs. Software

Any of the RAID levels listed below can be implemented in hardware or software.

With a software implementation, the operating system itself manages the disks of the array through the normal drive controller (IDE, SCSI, FC). This option can be slower, because the host CPU does the work of managing the array, but it does not require the purchase of extra hardware.

A hardware implementation of RAID requires (at a minimum) a special-purpose RAID controller card. This controller handles the management of the disks, and performs parity calculations (needed for RAID 4, 5). This option tends to provide better performance, and makes operating system support easier.

Hardware implementations also typically support hot swap, allowing failed drives to be replaced while the system is running.

RAID 0: Striped Disk Array without Fault Tolerance (Nonredundant)

RAID Level 0 requires a minimum of 2 drives to implement.

Characteristics/Advantages
RAID 0 implements a striped disk array: the data is broken down into blocks and each block is written to a separate disk drive. I/O performance is greatly improved by spreading the I/O load across many channels and drives.

Best performance is achieved when data is striped across multiple controllers with only one drive per controller. No parity calculation overhead is involved. Very simple design, easy to implement.
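The striping described above can be sketched in a few lines of Python. This is only an illustration: it assumes a stripe unit of one block and a simple round-robin placement, and the function name locate_block is hypothetical rather than part of any real RAID implementation.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes round-robin striping with a stripe unit of one block.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 requires a minimum of 2 drives")
    if logical_block < 0:
        raise ValueError("logical block numbers start at 0")
    disk = logical_block % num_disks      # consecutive blocks go to consecutive disks
    offset = logical_block // num_disks   # position of the block on that disk
    return disk, offset


# Six consecutive logical blocks spread over a 3-disk array.
layout = [locate_block(b, 3) for b in range(6)]
print(layout)
```

Because consecutive blocks land on different drives, a large sequential transfer keeps all drives busy at once, which is where the bandwidth gain comes from.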

Disadvantages
Not a "True" RAID because it is NOT fault-tolerant. The failure of just one drive will result in all data in an array being lost. Should never be used in mission critical environments that involve modification of data. (Some applications work with control information stored on a RAID 1 or 5 filesystem and multimedia data stored on RAID 0 and backed up to tape or optical media.)

Recommended Applications

Video Production and Editing
Image Editing
Pre-Press Applications
Any application requiring high bandwidth

RAID 1: Mirroring and Duplexing (Mirrored)

For highest performance, the controller must be able to perform two concurrent separate Reads per mirrored pair or two duplicate Writes per mirrored pair.

RAID Level 1 requires a minimum of 2 drives to implement.

Characteristics/Advantages
One Write or two Reads possible per mirrored pair. Twice the Read transaction rate of single disks, same Write transaction rate as single disks. 100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk.

Transfer rate per block is equal to that of a single disk. Under certain circumstances, RAID 1 can sustain multiple simultaneous drive failures.

Simplest RAID storage subsystem design.
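The behaviour of a mirrored pair can be sketched with a toy in-memory model. The class MirroredPair and its methods are hypothetical names chosen for this illustration, and each "disk" is just a Python list of blocks. The sketch shows that a write goes to both disks, a read can be served by either, and recovery is a plain copy from the surviving disk.

```python
class MirroredPair:
    """Toy model of a two-disk RAID 1 mirror; each disk is a list of blocks."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.disks = [[None] * num_blocks, [None] * num_blocks]
        self.failed = [False, False]

    def write(self, block, data):
        # One logical write becomes a duplicate write to every working disk.
        for index in range(2):
            if not self.failed[index]:
                self.disks[index][block] = data

    def read(self, block):
        # Either working disk can serve the read.
        for index in range(2):
            if not self.failed[index]:
                return self.disks[index][block]
        raise IOError("both disks of the mirrored pair have failed")

    def fail(self, index):
        # Simulate a drive failure: the disk's contents are lost.
        self.failed[index] = True
        self.disks[index] = [None] * self.num_blocks

    def replace(self, index):
        # No parity rebuild is needed, just a copy from the surviving disk.
        survivor = 1 - index
        if self.failed[survivor]:
            raise IOError("no surviving copy to restore from")
        self.disks[index] = list(self.disks[survivor])
        self.failed[index] = False


pair = MirroredPair(num_blocks=4)
pair.write(0, b"ledger")
pair.fail(0)                 # disk 0 dies
recovered = pair.read(0)     # still served by disk 1
pair.replace(0)              # replacement disk is filled by copying disk 1
```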

Advantages

Since a disk of a mirrored pair has all the information, it can potentially be used without the RAID hardware/software.

Disadvantages
Highest disk overhead of all RAID types (100%), which makes it inefficient in its use of disk space.

Recommended Applications

Accounting
Payroll
Financial
Any application requiring very high availability

RAID 2: Error-Correcting Coding

Redundancy scheme in RAID Level 2 is Hamming code, where the striping unit is a single bit. Striping at the bit level has the implication that in a disk array with D data disks, the smallest unit of transfer for a read is a set of D blocks.
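To give a feel for how a Hamming code protects bit-striped data, the following sketch uses the small Hamming(7,4) code. The choice of this particular code, the function names, and the idea of treating each of the 7 bit positions as one disk (4 data disks plus 3 check disks) are assumptions made for illustration only; the text above does not specify which Hamming code a given array would use.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword positions 1..7 hold: p1, p2, d1, p3, d2, d3, d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based position of the flipped bit or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # the syndrome names the faulty position
    if position:
        c[position - 1] ^= 1
    return c, position


stored = hamming74_encode([1, 0, 1, 1])
damaged = list(stored)
damaged[4] ^= 1                        # corrupt the bit held at position 5
repaired, bad_position = hamming74_correct(damaged)
```

In this sketch a single wrong bit is both located and corrected, which is more than a single parity bit alone can do.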

RAID level 2 is rarely implemented.

RAID 3: Bit-Interleaved Parity

RAID level 3 has a single check disk and only processes one I/O at a time.

RAID level 3 is rarely implemented.

RAID 4: Dedicated parity drive (Block-Interleaved Parity)

Disks are striped, as in RAID 0. Parity information for the stripe is calculated, and stored on a parity disk. If one of the data disks fails, the information is re-built on a spare disk using the parity information. If the parity disk fails, the parity information is recalculated on a spare disk.
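The parity calculation and rebuild described above can be sketched with XOR, the usual way parity is computed. The function name xor_blocks and the sample data are hypothetical, and the blocks are plain byte strings standing in for disk blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three data disks, plus the block for the parity disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Data disk 1 fails: XOR of the surviving data blocks and the parity
# block reproduces the lost block on the spare disk.
rebuilt = xor_blocks([data[0], data[2], parity])

# Parity disk fails instead: parity is simply recalculated from the data.
recalculated = xor_blocks(data)
```

The same function serves both cases because XOR is its own inverse: any one missing block of a stripe equals the XOR of all the others.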

Disadvantages

The parity drive can be a bottleneck during write operations.

RAID 5: Independent Data disks with distributed parity blocks (Block Interleaved Distributed Parity)

Each entire data block is written on a data disk; parity for blocks in the same rank is generated on Writes, recorded in a distributed location and checked on Reads.

RAID Level 5 requires a minimum of 3 drives to implement.
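One way to picture the "distributed location" of the parity blocks is a rotation in which each successive stripe places its parity on a different disk. The rotation rule below is only one possible scheme, assumed here for illustration, and the function names are hypothetical; real controllers may use other layouts.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk holding the parity block of a stripe (assumed rotation scheme)."""
    if num_disks < 3:
        raise ValueError("RAID 5 requires a minimum of 3 drives")
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Label what each disk holds in a stripe: 'P' or a data block 'Dn'."""
    p = parity_disk(stripe, num_disks)
    first_data = stripe * (num_disks - 1)   # each stripe holds num_disks - 1 data blocks
    layout, data_index = [], 0
    for disk in range(num_disks):
        if disk == p:
            layout.append("P")
        else:
            layout.append(f"D{first_data + data_index}")
            data_index += 1
    return layout


for stripe in range(4):
    print(stripe, stripe_layout(stripe, 3))
```

Because the parity block moves from disk to disk, no single drive has to absorb every parity write, which is the bottleneck of a dedicated parity drive.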

Characteristics/Advantages
Highest Read data transaction rate. Medium to poor Write data transaction rate, especially when the host CPU performs software parity checking. Low ratio of ECC (Parity) disks to data disks means high efficiency. Good aggregate transfer rate.
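The efficiency claim can be made concrete with a small calculation. It assumes identically-sized drives and that a RAID 5 array gives up the equivalent of one drive to parity; the function names are hypothetical.

```python
def raid5_efficiency(num_disks: int) -> float:
    """Fraction of raw capacity available for data in a RAID 5 array."""
    if num_disks < 3:
        raise ValueError("RAID 5 requires a minimum of 3 drives")
    return (num_disks - 1) / num_disks   # one drive's worth of space holds parity


def raid1_efficiency() -> float:
    """A mirrored pair stores every block twice, so half the space is usable."""
    return 0.5


for n in (3, 5, 8):
    print(f"RAID 5 with {n} disks: {raid5_efficiency(n):.0%} usable")
print(f"RAID 1 mirrored pair: {raid1_efficiency():.0%} usable")
```

The usable fraction grows as drives are added, whereas a mirror stays at one half regardless of array size.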

Disadvantages
Disk failure has a medium impact on throughput. Most complex controller design. Difficult to rebuild in the event of a disk failure (as compared to RAID level 1). Individual block data transfer rate same as single disk. High overhead for small writes: in the simplest implementation, to change 1 byte in a file, the entire stripe must be read, the byte changed, the parity information re-calculated, and the entire stripe re-written. However, the fact that file systems tend to address disks naturally in clusters partially hides this effect.
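The small-write overhead can be illustrated with the parity arithmetic itself. The sketch below compares recomputing parity from the whole stripe with a common shortcut that reads only the old data block and the old parity block. The shortcut is not described in the text above and is included as an assumption about how an implementation might reduce the cost; the function names and sample data are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(blocks):
    """Recompute parity by reading every data block in the stripe."""
    parity = bytes(len(blocks[0]))       # start from an all-zero block
    for block in blocks:
        parity = xor_bytes(parity, block)
    return parity


def updated_parity(old_parity, old_data, new_data):
    """Shortcut: remove the old block's contribution, then add the new one."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


stripe = [b"AAAA", b"BBBB", b"CCCC"]
old_parity = full_stripe_parity(stripe)

new_block = b"BxBB"                      # a single byte of block 1 changes
shortcut = updated_parity(old_parity, stripe[1], new_block)
recomputed = full_stripe_parity([stripe[0], new_block, stripe[2]])
```

Either way, a one-byte change costs several disk reads and writes, not one, which is why small writes are expensive at this RAID level.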

Recommended Applications

File and Application servers
Database servers
WWW, E-mail, and News servers
Intranet servers
Most versatile RAID level

RAID 10: A Stripe of Mirrors

Multiple RAID 1 mirrors are created, and a RAID 0 stripe is created over these.

Advantages

Can potentially handle multiple simultaneous disk failures, as long as at least one disk of each mirrored pair is working.

Otherwise has the same advantages and disadvantages as RAID 1.
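The survival rule for a stripe of mirrors can be written as a short check. The disk numbering (disks 2i and 2i+1 form mirrored pair i) and the function name are assumptions made for this sketch.

```python
def raid10_survives(failed_disks, num_pairs: int) -> bool:
    """True if at least one disk of every mirrored pair is still working.

    Assumed numbering: disks 2*i and 2*i + 1 form mirrored pair i.
    """
    failed = set(failed_disks)
    for pair in range(num_pairs):
        if {2 * pair, 2 * pair + 1} <= failed:   # both halves of a pair are gone
            return False
    return True


# Four disks arranged as two mirrored pairs: (0, 1) and (2, 3).
print(raid10_survives({0, 2}, num_pairs=2))   # one disk lost from each pair
print(raid10_survives({0, 1}, num_pairs=2))   # a whole pair lost
```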

RAID 0+1: A Mirror of Stripes

Two RAID 0 stripes are created, and a RAID 1 mirror is created over them.

Disadvantages

Is not as robust as RAID 10. Cannot tolerate two simultaneous disk failures unless both failed disks belong to the same stripe.
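The weakness can be checked by enumerating failures in a small array. The sketch assumes two stripes of equal size, with disks 0 to n-1 in the first stripe and disks n to 2n-1 in the second; the function name is hypothetical.

```python
from itertools import combinations


def raid01_survives(failed_disks, disks_per_stripe: int) -> bool:
    """True if at least one of the two RAID 0 stripes has no failed disk.

    Assumed numbering: disks 0..n-1 form stripe A, disks n..2n-1 form stripe B.
    """
    failed = set(failed_disks)
    stripe_a_intact = not any(d < disks_per_stripe for d in failed)
    stripe_b_intact = not any(d >= disks_per_stripe for d in failed)
    return stripe_a_intact or stripe_b_intact


# Four disks: stripe A = (0, 1), stripe B = (2, 3).
two_disk_failures = list(combinations(range(4), 2))
fatal = [pair for pair in two_disk_failures if not raid01_survives(pair, 2)]
print(f"{len(fatal)} of {len(two_disk_failures)} two-disk failures lose the array")
```

A single failed disk takes its whole stripe out of service, so a second failure is survivable only if it hits the stripe that is already down.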

History

RAID was first proposed in 1988 by David A. Patterson, Garth A. Gibson and Randy H. Katz in the paper, "A Case for Redundant Arrays of Inexpensive Disks (RAID)". This was published in the SIGMOD Conference 1988: pp 109-116. The term "RAID" started with this paper.

It was particularly ground-breaking work in that the concepts seem "obvious" in retrospect. This paper spawned the entire disk array industry.

All Wikipedia text is available under the terms of the GNU Free Documentation License
