Managing Software RAIDs with EVMS

This section describes how to create and manage software RAIDs with the Enterprise Volume Management System (EVMS). EVMS supports only RAIDs 0, 1, 4, and 5 at this time. For RAID 6 and 10 solutions, see Chapter 7, Managing Software RAIDs 6 and 10 with mdadm.

Understanding Software RAIDs on Linux

What Is a Software RAID?

A RAID combines multiple devices into a multi-disk array to provide resiliency in the storage device and to improve storage capacity and I/O performance. If a disk fails, some RAID levels keep data available in a degraded mode until the failed disk can be replaced and its content reconstructed.

A software RAID provides the same high availability that you find in a hardware RAID. The key operational differences are described in the following table:

Table 6.1. Comparison of Software RAIDs and Hardware RAIDs

Feature | Linux Software RAID | Hardware RAID
RAID function | Multi-disk (md) driver or mdadm | RAID controller on the disk array
RAID processing | In the host server’s processor | RAID controller on the disk array
RAID levels | 0, 1, 4, 5, and 10 plus the mdadm raid10 | Varies by vendor
Component devices | Disks from same or different disk array | Same disk array


Overview of RAID Levels

The following table describes the advantages and disadvantages of the RAID levels supported by EVMS. The description assumes that the component devices reside on different disks and that each disk has its own dedicated I/O capability.

Important

For information about creating complex or nested RAID devices with mdadm, see Chapter 7, Managing Software RAIDs 6 and 10 with mdadm.

Table 6.2. RAID Levels Supported by EVMS

RAID Level

Description

Performance and Fault Tolerance

0

Stripes data using a round-robin method to distribute data over the RAID’s component devices.

Improves disk I/O performance for both reads and writes. Actual performance depends on the stripe size, the actual data, and the application.

Does not provide disk fault tolerance and data redundancy. Any disk failure causes all data in the RAID to be lost.

1

Mirrors data by copying blocks of one disk to another and keeping them in continuous synchronization. If disks are different sizes, the smallest disk determines the size of the RAID.

Improves disk reads by making multiple copies of data available via different I/O paths. The write performance is about the same as for a single disk because a copy of the data must be written to each of the disks in the mirror.

Provides 100% data redundancy. If one disk fails, the data remains available on its mirror, and processing continues.

4

Stripes data and records parity to a dedicated disk. If disks are different sizes, the smallest disk determines the size of the RAID.

Improves disk I/O performance for both reads and writes. Write performance is considerably slower than for RAID 0, because parity must be calculated and written. Write performance is slightly slower than RAID 5. Read performance is slower than for a RAID 1 array with the same number of component devices. The dedicated parity disk can become a bottleneck for writing parity.

Provides disk fault tolerance. If a disk fails, performance is degraded while the RAID uses the parity to reconstruct data for the replacement disk.

5

Stripes data and distributes parity in a round-robin fashion across all disks. If disks are different sizes, the smallest disk determines the size of the RAID.

Improves disk I/O performance for both reads and writes. Write performance is considerably slower than for RAID 0, because parity must be calculated and written. Write performance is faster than for RAID 4. Read performance is slower than for a RAID 1 array with the same number of component disks. Actual performance depends on the number of component disks, the stripe size, the actual data, and the application.

Provides disk fault tolerance. If a disk fails, performance is degraded while the RAID uses the parity to reconstruct data for the replacement disk. Provides slightly less data redundancy than mirroring because it uses parity to reconstruct the data.
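The parity used by RAID 4 and RAID 5 is conventionally a byte-wise XOR of the data blocks in a stripe, which is what makes it possible to rebuild one missing block from the survivors. The following is a minimal sketch of that idea, not the md driver's code; the function names and the sample block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Rebuild the single missing block of a stripe from the survivors.

    XOR-ing every block of a stripe (data plus parity) gives all zeros,
    so XOR-ing the survivors yields the missing block, whether it held
    data or parity.
    """
    return xor_blocks(surviving_blocks)


# One stripe across three data devices plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second data block, then rebuild it.
survivors = [data[0], data[2], parity]
rebuilt = rebuild_missing(survivors)
```

Because the rebuild must read every surviving block in the stripe, the sketch also suggests why performance is degraded while a replacement disk is being reconstructed.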


Comparison of RAID Performance

The following table compares the read and write performance for RAID devices.

Table 6.3. Read and Write Performance for RAIDs

RAID Level | Read Performance | Write Performance
0 | Faster than for a single disk | Faster than for a single disk and other RAIDs.
1 | Faster than for a single disk, increasing as more mirrors are added | Slower than for a single disk, declining as more mirrors are added.
4 | Faster than for a single disk. Slower than a RAID 0 because one disk is used for parity. | Faster than for a single disk. Slower than a RAID 0 because of writes for parity. Slower than a RAID 5 because of possible bottlenecks for writes of parity to the dedicated parity disk.
5 | Faster than for a single disk; comparable to a RAID 0. | Faster than for a single disk. Slower than a RAID 0 because of writes for parity.


Comparison of Disk Fault Tolerance

The following table compares the disk fault tolerance for RAID devices.

Table 6.4. Fault Tolerance for RAIDs

RAID Level | Number of Disk Failures Tolerated | Data Redundancy
0 | None | No
1 | Number of disks minus 1 | 100% redundancy for each mirror
4 | 1 | Dedicated parity disk to reconstruct data. If the parity disk fails, all parity must be recalculated.
5 | 1 | Distributed parity to reconstruct data and parity on the failed disk.
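The fault-tolerance rules in Table 6.4 can be expressed as a small lookup. This is only a sketch: the function names and the state labels ("clean", "degraded", "failed") are chosen here for illustration and are not EVMS or mdadm output.

```python
def failures_tolerated(level, disks):
    """Number of disk failures a RAID survives, following Table 6.4."""
    if level == 0:
        return 0
    if level == 1:
        return disks - 1
    if level in (4, 5):
        return 1
    raise ValueError(f"unsupported RAID level: {level}")


def array_state(level, disks, failed):
    """Classify an array given how many of its disks have failed.

    The labels are illustrative only.
    """
    if not 0 <= failed <= disks:
        raise ValueError("failed must be between 0 and the number of disks")
    if failed == 0:
        return "clean"
    if failed <= failures_tolerated(level, disks):
        return "degraded"
    return "failed"
```

For example, `array_state(5, 4, 1)` reports a degraded array, while a second failure on the same RAID 5 exceeds what parity can cover.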


Configuration Options for RAIDs

In EVMS management tools, the following RAID configuration options are provided:

Table 6.5. Configuration Options in EVMS

Option

Description

Spare Disk

For RAIDs 1, 4, and 5, you can optionally specify a device, segment, or region to use as the replacement for a failed disk (the member device, segment, or region). On failure, the spare disk automatically replaces the failed disk, then reconstructs the data.

However, if the parity disk fails on a RAID 5, parity cannot be reconstructed.

Chunk Size (KB)

For RAIDs 0, 4, or 5, specify the stripe size in KB.

Consider the intended use of the RAID, such as the file system block size, the applications used, and the actual data (file sizes and typical reads and writes). A typical write size for large files is 128 KB.

Default: 32 KB

Range: 4 KB to 4096 KB, in powers of 2.

RAID Level

If you selected MD RAID 4/5 Region Manager, specify RAID 4 or RAID 5 (default).

RAID Algorithm

For RAID 5, specify one of the following algorithms to use for striping and distributing parity on the disk.

  • Left Asymmetric

  • Left Symmetric (Default, fastest performance for large reads)

  • Right Asymmetric

  • Right Symmetric
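The chunk size constraint above (4 KB to 4096 KB, in powers of 2, with a default of 32 KB) is easy to check before entering a value. The helper below is a hypothetical sketch for that check and is not part of EVMS.

```python
DEFAULT_CHUNK_KB = 32  # default stated in Table 6.5


def is_valid_chunk_size(kb):
    """True if kb is a power of two between 4 and 4096 (inclusive)."""
    return (
        isinstance(kb, int)
        and 4 <= kb <= 4096
        and (kb & (kb - 1)) == 0  # a power of two has exactly one bit set
    )


# Every value the stated range allows: 4, 8, 16, ..., 4096.
valid_sizes = [2 ** exponent for exponent in range(2, 13)]
```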


Guidelines for Component Devices

For efficient use of space and performance, the disks you use to create the RAID should have the same storage capacity. Typically, if component devices are not of identical storage capacity, then each member of the RAID uses only an amount of space equal to the capacity of the smallest member disk.
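To see the effect of mismatched sizes, the sketch below estimates usable capacity by applying the smallest-member rule above to every RAID level. That is a simplification, since the paragraph only says "typically", and the function name is hypothetical. It also assumes one disk's worth of space goes to parity on RAID 4 and 5, and takes its minimum device counts from Table 6.6.

```python
def usable_capacity(level, sizes_gb):
    """Estimate usable capacity (same unit as the input sizes).

    Assumes each member contributes only as much space as the
    smallest member, which is a simplification.
    """
    count = len(sizes_gb)
    minimum = 2 if level in (0, 1) else 3
    if level not in (0, 1, 4, 5):
        raise ValueError(f"unsupported RAID level: {level}")
    if count < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} devices")
    smallest = min(sizes_gb)
    if level == 0:
        return count * smallest        # striping, no redundancy
    if level == 1:
        return smallest                # every member holds a full copy
    return (count - 1) * smallest      # one member's worth holds parity
```

With members of 500, 500, and 400 GB, a RAID 5 yields 800 GB under these assumptions, so 200 GB across the two larger disks goes unused.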

Versions 2.3 and later of mdadm support component devices up to 4 TB in size each. Earlier versions support component devices up to 2 TB in size.

Important

If you have a local disk, external disk arrays, or SAN devices that are larger than the supported device size, use a third-party disk partitioner to carve the devices into smaller logical devices.

You can combine up to 28 component devices to create the RAID array. The md RAID device you create can be up to the maximum device size supported by the file system you plan to use. For information about file system limits for SUSE® Linux Enterprise Server 10, see “Large File System Support” in the SUSE Linux Enterprise Server 10 Installation and Administration Guide.

In general, each storage object included in the RAID should be from a different physical disk to maximize I/O performance and to achieve disk fault tolerance where supported by the RAID level you use. In addition, they should be of the same type (disks, segments, or regions).

Using component devices of differing speeds might introduce a bottleneck during periods of demanding I/O. The best performance can be achieved by using the same brand and models of disks and controllers in your hardware solution. If they are different, you should try to match disks and controllers with similar technologies, performance, and capacity. Use a low number of drives on each controller to maximize throughput.

Important

As with any hardware solution, using the same brand and model introduces the risk of concurrent failures over the life of the product, so plan maintenance accordingly.

The following table provides recommendations for the minimum and maximum number of storage objects to use when creating a software RAID:

Table 6.6. Recommended Number of Storage Objects to Use in the Software RAID

RAID Type | Minimum Number of Storage Objects | Recommended Maximum Number of Storage Objects
RAID 0 (striping) | 2 | 8
RAID 1 (mirroring) | 2 | 4
RAID 4 (striping with dedicated parity) | 3 | 8
RAID 5 (striping with distributed parity) | 3 | 8
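The recommendations in Table 6.6, together with the limit of 28 component devices stated earlier, can be checked with a short helper. This is an illustrative sketch: the function name and the returned strings are hypothetical, not messages from EVMS.

```python
# (minimum, recommended maximum) per RAID level, from Table 6.6.
RECOMMENDED = {0: (2, 8), 1: (2, 4), 4: (3, 8), 5: (3, 8)}
MAX_COMPONENTS = 28  # upper limit on component devices per array


def check_object_count(level, count):
    """Compare a planned number of storage objects with the guidelines."""
    if level not in RECOMMENDED:
        raise ValueError(f"unsupported RAID level: {level}")
    minimum, recommended_max = RECOMMENDED[level]
    if count < minimum:
        return "too few"
    if count > MAX_COMPONENTS:
        return "too many"
    if count > recommended_max:
        return "above recommended maximum"
    return "ok"
```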


Connection fault tolerance can be achieved by having multiple connection paths to each storage object in the RAID. For more information about configuring multipath I/O support before configuring a software RAID, see Chapter 5, Managing Multipath I/O for Devices.

RAID 5 Algorithms for Distributing Stripes and Parity

RAID 5 uses an algorithm to determine the layout of stripes and parity. The following table describes the algorithms.

Table 6.7. RAID 5 Algorithms

Algorithm

EVMS Type

Description

Left Asymmetric

1

Stripes are written in a round-robin fashion from the first to last member segment. The parity’s position in the striping sequence moves in a round-robin fashion from last to first. For example:

sda1 sdb1 sdc1 sde1
0 1 2 p
3 4 p 5
6 p 7 8
p 9 10 11
12 13 14 p

Left Symmetric

2

This is the default setting and is considered the fastest method for large reads.

Stripes wrap to follow the parity. The parity’s position in the striping sequence moves in a round-robin fashion from last to first. For example:

sda1 sdb1 sdc1 sde1
0 1 2 p
4 5 p 3
8 p 6 7
p 9 10 11
12 13 14 p

Right Asymmetric

3

Stripes are written in a round-robin fashion from the first to last member segment. The parity’s position in the striping sequence moves in a round-robin fashion from first to last. For example:

sda1 sdb1 sdc1 sde1
p 0 1 2
3 p 4 5
6 7 p 8
9 10 11 p
p 12 13 14

Right Symmetric

4

Stripes wrap to follow the parity. The parity’s position in the striping sequence moves in a round-robin fashion from first to last. For example:

sda1 sdb1 sdc1 sde1
p 0 1 2
5 p 3 4
7 8 p 6
9 10 11 p
p 12 13 14

For information about the layout of stripes and parity with each of these algorithms, see Linux RAID-5 Algorithms.
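The four layouts in Table 6.7 follow two simple rules: where the parity block sits in each row, and whether data blocks keep device order or wrap to start just after the parity. The sketch below generates the example grids from those rules. It is an illustration rather than the md driver's implementation, and the algorithm name strings such as "left-symmetric" are chosen here for convenience.

```python
def raid5_layout(algorithm, disks, rows):
    """Return a grid of rows; each cell is a data block number or 'p'."""
    side, _, kind = algorithm.partition("-")
    if side not in ("left", "right") or kind not in ("asymmetric", "symmetric"):
        raise ValueError(f"unknown algorithm: {algorithm}")
    layout = []
    block = 0
    for row in range(rows):
        # Left: parity moves from last device to first; right: first to last.
        if side == "left":
            parity = (disks - 1) - (row % disks)
        else:
            parity = row % disks
        if kind == "asymmetric":
            # Data fills devices first to last, skipping the parity device.
            order = [d for d in range(disks) if d != parity]
        else:
            # Data starts on the device after parity and wraps around.
            order = [(parity + 1 + i) % disks for i in range(disks - 1)]
        cells = [None] * disks
        cells[parity] = "p"
        for device in order:
            cells[device] = block
            block += 1
        layout.append(cells)
    return layout


def format_layout(layout):
    """Render a layout in the same style as the examples in Table 6.7."""
    return "\n".join(" ".join(str(cell) for cell in row) for row in layout)
```

Calling `format_layout(raid5_layout("left-symmetric", 4, 5))` produces the same five rows of block numbers shown for Left Symmetric in the table; the device-name header row is not generated.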

Multi-Disk Plug-In for EVMS

The Multi-Disk (MD) plug-in supports creating software RAIDs 0 (striping), 1 (mirror), 4 (striping with dedicated parity), and 5 (striping with distributed parity). The MD plug-in to EVMS allows you to manage all of these MD features as “regions” with the Regions Manager.

Device Mapper Plug-In for EVMS

The Device Mapper plug-in supports the following features in the EVMS MD Region Manager:

  • Multipath I/O: Connection fault tolerance and load balancing for connections between the server and disks where multiple paths are available. If you plan to use multipathing, you should configure MPIO for the devices that you plan to use in the RAID before configuring the RAID itself. For information, see Chapter 5, Managing Multipath I/O for Devices.

    Important

    The EVMS interface manages multipathing under the MD Region Manager, which originally supported the md multipath functions. It uses the legacy md terminology in the interface and in naming of device nodes, but implements the storage objects with Device Mapper.

  • Linear RAID: A linear concatenation of discontinuous areas of free space from the same or multiple storage devices. Areas can be of different sizes.

  • Snapshots: Snapshots of a file system at a particular point in time, even while the system is active, thereby allowing a consistent backup.

The Device Mapper driver is not started by default in the rescue system. To start it manually:

  1. Open a terminal console, then log in as the root user or equivalent.

  2. Start the Device Mapper by entering the following at the terminal console prompt:

    /etc/init.d/boot.device-mapper start
    

SUSE® Linux Enterprise Server Storage Administration Guide 10