This section describes how to create and manage software RAIDs with the Enterprise Volume Management System (EVMS). EVMS supports only RAIDs 0, 1, 4, and 5 at this time. For RAID 6 and 10 solutions, see Chapter 7, Managing Software RAIDs 6 and 10 with mdadm.
Understanding Software RAIDs on Linux
Section 6.1.1, “What Is a Software RAID?”
Section 6.1.2, “Overview of RAID Levels”
Section 6.1.3, “Comparison of RAID Performance”
Section 6.1.4, “Comparison of Disk Fault Tolerance”
Section 6.1.5, “Configuration Options for RAIDs”
Section 6.1.6, “Interoperability Issues”
Section 6.1.7, “Guidelines for Component Devices”
Section 6.1.8, “RAID 5 Algorithms for Distributing Stripes and Parity”
Section 6.1.9, “Multi-Disk Plug-In for EVMS”
Section 6.1.10, “Device Mapper Plug-In for EVMS”
What Is a Software RAID?
A RAID combines multiple devices into a multi-disk array to provide resiliency in the storage device and to improve storage capacity and I/O performance. If a disk fails, some RAID levels keep data available in a degraded mode until the failed disk can be replaced and its content reconstructed.
A software RAID provides the same high availability that you find in a hardware RAID. The key operational differences are described in the following table:
Table 6.1. Comparison of Software RAIDs and Hardware RAIDs
| Feature | Linux Software RAID | Hardware RAID |
|---|---|---|
| RAID function | Multi-disk (md) driver or mdadm | RAID controller on the disk array |
| RAID processing | In the host server’s processor | RAID controller on the disk array |
| RAID levels | 0, 1, 4, 5, and 10 plus the mdadm raid10 | Varies by vendor |
| Component devices | Disks from same or different disk array | Same disk array |
Overview of RAID Levels
The following table describes the advantages and disadvantages of the RAID levels supported by EVMS. The description assumes that the component devices reside on different disks and that each disk has its own dedicated I/O capability.
Table 6.2. RAID Levels Supported by EVMS
| RAID Level | Description | Performance and Fault Tolerance |
|---|---|---|
| 0 | Stripes data using a round-robin method to distribute data over the RAID’s component devices. | Improves disk I/O performance for both reads and writes. Actual performance depends on the stripe size, the actual data, and the application. Does not provide disk fault tolerance or data redundancy. Any disk failure causes all data in the RAID to be lost. |
| 1 | Mirrors data by copying blocks of one disk to another and keeping them in continuous synchronization. If disks are different sizes, the smallest disk determines the size of the RAID. | Improves disk reads by making multiple copies of data available via different I/O paths. The write performance is about the same as for a single disk because a copy of the data must be written to each of the disks in the mirror. Provides 100% data redundancy. If one disk fails, the data remains available on its mirror, and processing continues. |
| 4 | Stripes data and records parity to a dedicated disk. If disks are different sizes, the smallest disk determines the size of the RAID. | Improves disk I/O performance for both reads and writes. Write performance is considerably slower than for RAID 0, because parity must be calculated and written. Write performance is slightly slower than for RAID 5. Read performance is slower than for a RAID 1 array with the same number of component devices. The dedicated parity disk can become a bottleneck for writing parity. Provides disk fault tolerance. If a disk fails, performance is degraded while the RAID uses the parity to reconstruct data for the replacement disk. |
| 5 | Stripes data and distributes parity in a round-robin fashion across all disks. If disks are different sizes, the smallest disk determines the size of the RAID. | Improves disk I/O performance for reads and writes. Write performance is considerably slower than for RAID 0, because parity must be calculated and written. Write performance is faster than for RAID 4. Read performance is slower than for a RAID 1 array with the same number of component disks. Actual performance depends on the number of component disks, the stripe size, the actual data, and the application. Provides disk fault tolerance. If a disk fails, performance is degraded while the RAID uses the parity to reconstruct data for the replacement disk. Provides slightly less data redundancy than mirroring because it uses parity to reconstruct the data. |
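The parity-based reconstruction described for RAIDs 4 and 5 can be illustrated with a minimal Python sketch. It assumes simple XOR parity over the chunks of one stripe; the chunk contents and the `xor_blocks` helper are hypothetical and only illustrate the idea, not the md driver's actual code.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data chunks of one stripe (hypothetical contents) and their parity chunk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the disk that held the second chunk, then rebuild that
# chunk from the surviving data chunks plus the parity chunk.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Because every rebuilt chunk requires reading all surviving members of the stripe, this also suggests why performance is degraded while a replacement disk is being reconstructed.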
Comparison of RAID Performance
The following table compares the read and write performance for RAID devices.
Table 6.3. Read and Write Performance for RAIDs
| RAID Level | Read Performance | Write Performance |
|---|---|---|
| 0 | Faster than for a single disk. | Faster than for a single disk and other RAIDs. |
| 1 | Faster than for a single disk, increasing as more mirrors are added. | Slower than for a single disk, declining as more mirrors are added. |
| 4 | Faster than for a single disk. Slower than a RAID 0 because one disk is used for parity. | Faster than for a single disk. Slower than a RAID 0 because of writes for parity. Slower than a RAID 5 because of possible bottlenecks for writes of parity to the dedicated parity disk. |
| 5 | Faster than for a single disk; comparable to a RAID 0. | Faster than for a single disk. Slower than a RAID 0 because of writes for parity. |
Comparison of Disk Fault Tolerance
The following table compares the disk fault tolerance for RAID devices.
Table 6.4. Fault Tolerance for RAIDs
| RAID Level | Number of Disk Failures Tolerated | Data Redundancy |
|---|---|---|
| 0 | None | No |
| 1 | Number of disks minus 1 | 100% redundancy for each mirror |
| 4 | 1 | Dedicated parity disk to reconstruct data. If the parity disk fails, all parity must be recalculated. |
| 5 | 1 | Distributed parity to reconstruct data and parity on the failed disk. |
Configuration Options for RAIDs
In EVMS management tools, the following RAID configuration options are provided:
Table 6.5. Configuration Options in EVMS
| Option | Description |
|---|---|
| Spare Disk | For RAIDs 1, 4, and 5, you can optionally specify a device, segment, or region to use as the replacement for a failed disk (the member device, segment, or region). On failure, the spare disk automatically replaces the failed disk, then reconstructs the data. If the dedicated parity disk of a RAID 4 fails, all parity must be recalculated. |
| Chunk Size (KB) | For RAIDs 0, 4, or 5, specify the stripe size in KB. Consider the intended use of the RAID, such as the file system block size, the applications used, and the actual data (file sizes and typical reads and writes). A typical write size for large files is 128 KB. Default: 32 KB. Range: 4 KB to 4096 KB, in powers of 2. |
| RAID Level | If you selected , specify or (default). |
| RAID Algorithm | For RAID 5, specify the algorithm to use for striping and distributing parity on the disk: Left Asymmetric, Left Symmetric (default), Right Asymmetric, or Right Symmetric. See Table 6.7. |
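As a rough illustration of the Chunk Size constraints in the table above, the following Python sketch checks whether a value is a power of 2 between 4 KB and 4096 KB, and estimates how many chunks a single write touches. The function names are hypothetical, and the chunk count assumes the write starts on a chunk boundary.

```python
def is_valid_chunk_size(kb):
    """True if kb is a power of 2 in the range 4 KB to 4096 KB."""
    return 4 <= kb <= 4096 and (kb & (kb - 1)) == 0

def chunks_touched(write_kb, chunk_kb):
    """Chunks covered by one write, assuming it starts on a chunk boundary."""
    return -(-write_kb // chunk_kb)  # ceiling division

# All chunk sizes the stated range allows.
valid_sizes = [kb for kb in range(4, 4097) if is_valid_chunk_size(kb)]
print(valid_sizes)

# A typical 128 KB write with the default 32 KB chunk size.
print(chunks_touched(128, 32))
```

With the default 32 KB chunk size, a 128 KB write is spread over four chunks, which is one way to reason about whether a given chunk size suits the typical writes of an application.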
Interoperability Issues
Linux software RAID cannot be used underneath clustered file systems because it does not support concurrent activation. If you want RAID and OCFS2, you need the RAID to be handled by the storage subsystem.
Warning: Activating Linux software RAID devices concurrently on multiple servers can result in data corruption or inconsistencies.
Guidelines for Component Devices
For efficient use of space and performance, the disks you use to create the RAID should have the same storage capacity. Typically, if component devices are not of identical storage capacity, then each member of the RAID uses only an amount of space equal to the capacity of the smallest member disk.
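The effect of mixing capacities can be estimated with a short Python sketch. It applies the smallest-member rule of thumb from the paragraph above to every supported level; the function names and the example sizes are hypothetical, and an actual array may allocate space differently.

```python
def usable_capacity(level, sizes):
    """Usable capacity (same unit as sizes) under the smallest-member rule."""
    if level not in (0, 1, 4, 5):
        raise ValueError("only RAID 0, 1, 4, and 5 are handled here")
    smallest = min(sizes)
    count = len(sizes)
    if level == 0:
        data_members = count        # every member holds data
    elif level == 1:
        data_members = 1            # every member holds the same copy
    else:
        data_members = count - 1    # one member's worth of space holds parity
    return smallest * data_members

def unused_space(sizes):
    """Space left unused because members are larger than the smallest one."""
    return sum(sizes) - min(sizes) * len(sizes)

# Hypothetical members: two 500 GB disks and one 750 GB disk.
sizes_gb = [500, 500, 750]
for level in (0, 1, 5):
    print(level, usable_capacity(level, sizes_gb))
print("unused:", unused_space(sizes_gb))
```

In this example, 250 GB of the largest disk goes unused regardless of the RAID level, which is the inefficiency that identically sized members avoid.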
Versions 2.3 and later of mdadm support component devices up to 4 TB in size each. Earlier versions support component devices up to 2 TB in size.
Important: If you have a local disk, external disk arrays, or SAN devices that are larger than the supported device size, use a third-party disk partitioner to carve the devices into smaller logical devices.
You can combine up to 28 component devices to create the RAID array. The md RAID device you create can be up to the maximum device size supported by the file system you plan to use. For information about file system limits for SUSE® Linux Enterprise Server 10, see “Large File System Support” in the SUSE Linux Enterprise Server 10 Installation and Administration Guide.
In general, each storage object included in the RAID should be from a different physical disk to maximize I/O performance and to achieve disk fault tolerance where supported by the RAID level you use. In addition, they should be of the same type (disks, segments, or regions).
Using component devices of differing speeds might introduce a bottleneck during periods of demanding I/O. The best performance can be achieved by using the same brand and model of disks and controllers in your hardware solution. If they are different, try to match disks and controllers with similar technologies, performance, and capacity. Use a low number of drives on each controller to maximize throughput.
Important: As with any hardware solution, using the same brand and model introduces the risk of concurrent failures over the life of the product, so plan maintenance accordingly.
The following table provides recommendations for the minimum and maximum number of storage objects to use when creating a software RAID:
Table 6.6. Recommended Number of Storage Objects to Use in the Software RAID
| RAID Type | Minimum Number of Storage Objects | Recommended Maximum Number of Storage Objects |
|---|---|---|
| RAID 0 (striping) | 2 | 8 |
| RAID 1 (mirroring) | 2 | 4 |
| RAID 4 (striping with dedicated parity) | 3 | 8 |
| RAID 5 (striping with distributed parity) | 3 | 8 |
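The recommendations in Table 6.6, together with the limit of 28 component devices mentioned earlier, could be checked before creating an array. The following Python sketch is only an illustration: the `check_object_count` function and its return strings are hypothetical and are not part of EVMS or mdadm.

```python
# (minimum, recommended maximum) number of storage objects per RAID level,
# taken from Table 6.6.
LIMITS = {0: (2, 8), 1: (2, 4), 4: (3, 8), 5: (3, 8)}

# Upper limit on component devices stated in the guidelines.
MAX_COMPONENTS = 28

def check_object_count(level, count):
    """Classify a planned number of storage objects for a RAID level."""
    minimum, recommended_max = LIMITS[level]
    if count < minimum:
        return "too few"
    if count > MAX_COMPONENTS:
        return "too many"
    if count > recommended_max:
        return "above recommended maximum"
    return "ok"

print(check_object_count(5, 2))   # fewer than the RAID 5 minimum
print(check_object_count(1, 6))   # allowed, but above the recommendation
print(check_object_count(0, 4))   # within the recommended range
```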
Connection fault tolerance can be achieved by having multiple connection paths to each storage object in the RAID. For more information about configuring multipath I/O support before configuring a software RAID, see Chapter 5, Managing Multipath I/O for Devices.
RAID 5 Algorithms for Distributing Stripes and Parity
RAID 5 uses an algorithm to determine the layout of stripes and parity. The following table describes the algorithms.
Table 6.7. RAID 5 Algorithms
| Algorithm | EVMS Type | Description |
|---|---|---|
| Left Asymmetric | 1 | Stripes are written in a round-robin fashion from the first to last member segment. The parity’s position in the striping sequence moves in a round-robin fashion from last to first. Example across sda1, sdb1, sdc1, sde1 (one stripe per group; p is parity): 0 1 2 p; 3 4 p 5; 6 p 7 8; p 9 10 11; 12 13 14 p |
| Left Symmetric | 2 | This is the default setting and is considered the fastest method for large reads. Stripes wrap to follow the parity. The parity’s position in the striping sequence moves in a round-robin fashion from last to first. Example across sda1, sdb1, sdc1, sde1 (one stripe per group; p is parity): 0 1 2 p; 4 5 p 3; 8 p 6 7; p 9 10 11; 12 13 14 p |
| Right Asymmetric | 3 | Stripes are written in a round-robin fashion from the first to last member segment. The parity’s position in the striping sequence moves in a round-robin fashion from first to last. Example across sda1, sdb1, sdc1, sde1 (one stripe per group; p is parity): p 0 1 2; 3 p 4 5; 6 7 p 8; 9 10 11 p; p 12 13 14 |
| Right Symmetric | 4 | Stripes wrap to follow the parity. The parity’s position in the striping sequence moves in a round-robin fashion from first to last. Example across sda1, sdb1, sdc1, sde1 (one stripe per group; p is parity): p 0 1 2; 5 p 3 4; 7 8 p 6; 9 10 11 p; p 12 13 14 |
For information about the layout of stripes and parity with each of these algorithms, see Linux RAID-5 Algorithms.
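The placement rules in Table 6.7 can be expressed as a short Python sketch. It is an illustration derived from the descriptions and examples in the table, not the md driver's implementation; the `raid5_layout` and `show` functions and the algorithm name strings are hypothetical.

```python
def raid5_layout(algorithm, disks, stripes):
    """Return one row of chunk labels per stripe; 'p' marks the parity chunk."""
    side, kind = algorithm.split("-")
    rows = []
    next_chunk = 0
    for stripe in range(stripes):
        # "left": parity moves from the last disk toward the first.
        # "right": parity moves from the first disk toward the last.
        if side == "left":
            parity = disks - 1 - (stripe % disks)
        else:
            parity = stripe % disks
        if kind == "asymmetric":
            # Data fills the non-parity disks from first to last.
            slots = [i for i in range(disks) if i != parity]
        else:
            # Data starts on the disk after the parity and wraps around.
            slots = [(parity + 1 + i) % disks for i in range(disks - 1)]
        row = [None] * disks
        row[parity] = "p"
        for slot in slots:
            row[slot] = str(next_chunk)
            next_chunk += 1
        rows.append(row)
    return rows

def show(algorithm, disks=4, stripes=5):
    """Format a layout with one group per stripe."""
    return "; ".join(" ".join(row) for row in raid5_layout(algorithm, disks, stripes))

for name in ("left-asymmetric", "left-symmetric",
             "right-asymmetric", "right-symmetric"):
    print(name, "->", show(name))
```

For four member segments and five stripes, the sketch yields the same sequences as the examples in the table. In the symmetric variants, consecutive data chunks always land on consecutive disks, which is consistent with the note that Left Symmetric favors large reads.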
Multi-Disk Plug-In for EVMS
The Multi-Disk (MD) plug-in supports creating software RAIDs 0 (striping), 1 (mirroring), 4 (striping with dedicated parity), and 5 (striping with distributed parity). The MD plug-in to EVMS allows you to manage all of these MD features as “regions” with the Regions Manager.
Device Mapper Plug-In for EVMS
The Device Mapper plug-in supports the following features in the EVMS MD Region Manager:
Multipath I/O: Connection fault tolerance and load balancing for connections between the server and disks where multiple paths are available. If you plan to use multipathing, you should configure MPIO for the devices that you plan to use in the RAID before configuring the RAID itself. For information, see Chapter 5, Managing Multipath I/O for Devices.
Important: The EVMS interface manages multipathing under the MD Region Manager, which originally supported the md multipath functions. It uses the legacy md terminology in the interface and in naming of device nodes, but implements the storage objects with Device Mapper.
Linear RAID: A linear concatenation of discontinuous areas of free space from the same or multiple storage devices. Areas can be of different sizes.
Snapshots: Snapshots of a file system at a particular point in time, even while the system is active, thereby allowing a consistent backup.
The Device Mapper driver is not started by default in the rescue system.
Open a terminal console, then log in as the root user or equivalent.
Start the Device Mapper by entering the following at the terminal console prompt:
/etc/init.d/boot.device-mapper start