On a Linux host with multiple paths to a storage controller, each path appears as a separate block device, so a single LUN shows up as multiple block devices. The Device Mapper Multipath service detects multiple paths with the same LUN ID and creates a new multipath device with that ID. For example, a host with two HBAs attached to a storage controller with two ports via a single unzoned Fibre Channel switch sees four block devices: /dev/sda, /dev/sdb, /dev/sdc, and /dev/sdd. The Device Mapper Multipath service creates a single block device, /dev/mpath/mpath1, that reroutes I/O through those four underlying block devices.
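The grouping step described above can be pictured with a short Python sketch. It is only an illustration of the idea, not how the Device Mapper Multipath service is implemented; the function name group_paths_by_wwid and the sample ID are assumptions made for this example.

```python
from collections import defaultdict


def group_paths_by_wwid(paths):
    """Map each LUN ID to the list of block devices that reach it."""
    devices = defaultdict(list)
    for node, wwid in paths:
        devices[wwid].append(node)
    return dict(devices)


# Hypothetical discovery result: two HBAs x two controller ports = four paths
# to the same LUN. The ID below is an illustrative value only.
wwid = "3600a0b8000122c6d00000000453174fc"
paths = [
    ("/dev/sda", wwid),
    ("/dev/sdb", wwid),
    ("/dev/sdc", wwid),
    ("/dev/sdd", wwid),
]

multipath_devices = group_paths_by_wwid(paths)
print(multipath_devices)
```

The four block devices collapse into a single entry keyed by the shared ID, which corresponds to the single multipath device in the example above.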
This section describes how to specify policies for failover and configure priorities for the paths.
Use the multipath command with the -p option to set the path failover policy:
multipath devicename -p policy
Replace policy with one of the following policy options:
Table 5.4. Group Policy Options for the multipath -p Command
You must manually enter the failover priorities for the device in the /etc/multipath.conf file. Examples for all settings and options can be found in the /usr/share/doc/packages/multipath-tools/multipath.conf.annotated file.
A priority group is a collection of paths that go to the same physical LUN. By default, I/O is distributed in a round-robin fashion across all paths in the group. The multipath command automatically creates priority groups for each LUN in the SAN based on the path_grouping_policy setting for that SAN. To determine which group is the primary, the multipath command multiplies the number of paths in a group by the group’s priority; the group with the highest calculated value is the primary. When all paths in the primary group have failed, the priority group with the next highest value becomes active.
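As a rough sketch of this selection rule, the following Python example scores each group as the number of paths times the group priority and falls back to the next group when every path in the primary has failed. The data layout, the function names, and the sample priorities are assumptions for illustration; the multipath command itself is not implemented this way in Python.

```python
def group_score(group):
    """Number of paths in the group multiplied by the group's priority."""
    return len(group["paths"]) * group["priority"]


def active_group(groups, failed=frozenset()):
    """Return the highest-scoring group that still has at least one usable path."""
    for group in sorted(groups, key=group_score, reverse=True):
        if any(path not in failed for path in group["paths"]):
            return group
    return None  # every path in every group has failed


# Hypothetical priority groups for one LUN.
groups = [
    {"name": "A", "priority": 50, "paths": ["sdb", "sdd"]},  # score 100
    {"name": "B", "priority": 10, "paths": ["sdi", "sdk"]},  # score 20
]

primary = active_group(groups)
fallback = active_group(groups, failed={"sdb", "sdd"})
print(primary["name"], fallback["name"])
```

With both paths of group A marked as failed, the sketch selects group B, mirroring the failover to the group with the next highest value.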
A path priority is an integer value assigned to a path. The higher the value, the higher the priority. An external program assigns the priority for each path. For a given device, paths with the same priority belong to the same priority group.
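The relationship between path priorities and priority groups can be sketched as follows. This is a minimal Python illustration under the assumption that priorities have already been obtained from the external program; the function name group_by_prio is borrowed from the policy name for readability, and the device names and values are hypothetical.

```python
from collections import defaultdict


def group_by_prio(path_priorities):
    """Collect paths that share a priority value into one priority group."""
    groups = defaultdict(list)
    for path, prio in path_priorities.items():
        groups[prio].append(path)
    # Listed highest priority first purely for readability.
    return [(prio, sorted(groups[prio])) for prio in sorted(groups, reverse=True)]


# Hypothetical priorities reported for four paths to one device.
path_priorities = {"sdb": 50, "sdd": 50, "sdi": 5, "sdk": 5}

priority_groups = group_by_prio(path_priorities)
print(priority_groups)
```

Each distinct priority value yields exactly one group, so four paths with two distinct priorities produce two priority groups.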
Table 5.5. Multipath Attributes
| Multipath Attribute | Description | Values |
|---|---|---|
| user_friendly_names | Specifies whether to use IDs or to use the | yes. Autogenerate user-friendly names as aliases for the multipath devices instead of the actual ID. no. Default. Use the WWIDs shown in the |
| blacklist | Specifies the list of device names to ignore as non-multipathed devices, such as cciss, fd, hd, md, dm, sr, scd, st, ram, raw, loop. | For an example, see Section 5.4.5.4, “Blacklisting Non-Multipathed Devices in /etc/multipath.conf”. |
| blacklist_exceptions | Specifies the list of device names to treat as multipath devices even if they are included in the blacklist. | For an example, see the |
| getuid_callout | The default program and arguments to call out to obtain a unique path identifier. Should be specified with an absolute path. | /lib/udev/scsi_id -g -u -s. This is the default location and arguments. |
| path_grouping_policy | Specifies the path grouping policy for a multipath device hosted by a given controller. | failover. One path is assigned per priority group so that only one path at a time is used. multibus. (Default) All valid paths are in one priority group. Traffic is load-balanced across all active paths in the group. group_by_prio. One priority group exists for each path priority value. Paths with the same priority are in the same priority group. Priorities are assigned by an external program. group_by_serial. Paths are grouped by the SCSI target serial number (controller node WWN). group_by_node_name. One priority group is assigned per target node name. Target node names are fetched in |
| path_checker | Determines the state of the path. | directio. (Default in readsector0. (Default in tur. Issues a SCSI test unit ready command to the device. This is the preferred setting if the LUN supports it. The command does not fill up Some SAN vendors provide custom |
| path_selector | Specifies the path-selector algorithm to use for load-balancing. | round-robin 0. (Default) The load-balancing algorithm used to balance traffic across all active paths in a priority group. This is currently the only algorithm available. |
| pg_timeout | Specifies path group timeout handling. | NONE (internal default) |
| prio_callout | Specifies the program and arguments to use to determine the layout of the multipath map. When queried by the multipath command, the specified mpath_prio_* callout program returns the priority for a given path in relation to the entire multipath layout. When it is used with the path_grouping_policy of group_by_prio, all paths with the same priority are grouped into one multipath group. The group with the highest aggregate priority becomes the active group. When all paths in a group fail, the group with the next highest aggregate priority becomes active. Additionally, a failover command (as determined by the hardware handler) might be sent to the target. The mpath_prio_* program can also be a custom script created by a vendor or administrator for a specified setup. A %n in the command line expands to the device name in the A %b expands to the device number in A %d expands to the device ID in the If devices are hot-pluggable, use the %d flag instead of %n. This addresses the short time that elapses between the time when devices are available and when udev creates the device nodes. | If no /bin/true. Use this value when group_by_prio is not being used. The prioritizer programs generate path priorities when queried by the multipath command. The program names must begin with /sbin/mpath_prio_alua %n. Generates path priorities based on the SCSI-3 ALUA settings. /sbin/mpath_prio_balance_units. Generates the same priority for all paths. /sbin/mpath_prio_emc %n. Generates the path priority for EMC arrays. /sbin/mpath_prio_hds_modular %b. Generates the path priority for Hitachi HDS Modular storage arrays. /sbin/mpath_prio_hp_sw %n. Generates the path priority for Compaq/HP controller in active/standby mode. /sbin/mpath_prio_netapp %n. Generates the path priority for NetApp arrays. /sbin/mpath_prio_random %n. Generates a random priority for each path. /sbin/mpath_prio_rdac %n. Generates the path priority for LSI/Engenio RDAC controller. /sbin/mpath_prio_tpc %n. You can optionally use a script created by a vendor or administrator that gets the priorities from a file where you specify priorities to use for each path. /usr/local/sbin/mpath_prio_spec.sh %n. Provides the path of a user-created script that generates the priorities for multipathing based on information contained in a second data file. (This path and filename are provided as an example. Specify the location of your script instead.) The script can be created by a vendor or administrator. The script’s target file identifies each path for all multipathed devices and specifies a priority for each path. For an example, see Section 5.6.3, “Using a Script to Set Path Priorities”. |
| rr_min_io | Specifies the number of I/O transactions to route to a path before switching to the next path in the same path group, as determined by the specified algorithm in the | |
| rr_weight | Specifies the weighting method to use for paths. | uniform. Default. All paths have the same round-robin weightings. priorities. Each path’s weighting is determined by the path’s priority times the rr_min_io setting. |
| no_path_retry | Specifies the behaviors to use on path failure. | n (> 0). Specifies the number of retries until multipath stops the queuing and fails the path. Specify an integer value greater than 0. fail. Specifies immediate failure (no queuing). queue. Never stop queuing (queue forever until the path comes alive). |
| failback | Specifies whether to monitor the failed path recovery, and indicates the timing for group failback after failed paths return to service. When the failed path recovers, the path is added back into the multipath enabled path list based on this setting. Multipath evaluates the priority groups, and changes the active priority group when the priority of the primary path exceeds that of the secondary group. | immediate. When a path recovers, enable the path immediately. n (> 0). When the path recovers, wait n seconds before enabling the path. Specify an integer value greater than 0. manual. (Default) The failed path is not monitored for recovery. The administrator runs the multipath command to update enabled paths and priority groups. |
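To show how a few of these attributes might appear together, the following Python sketch renders a defaults section using only the default values listed in the table. It assumes the brace-delimited section layout used by /etc/multipath.conf and quotes values that contain spaces; the helper name render_section is hypothetical, and the annotated sample file remains the reference for the exact syntax on your system.

```python
def render_section(name, settings):
    """Render one configuration section; values containing spaces are quoted."""
    lines = [f"{name} {{"]
    for key, value in settings.items():
        text = str(value)
        if " " in text:
            text = f'"{text}"'
        lines.append(f"    {key} {text}")
    lines.append("}")
    return "\n".join(lines)


# Default values taken from the attribute table above.
defaults = {
    "user_friendly_names": "no",
    "path_grouping_policy": "multibus",
    "path_selector": "round-robin 0",
    "rr_weight": "uniform",
    "failback": "manual",
}

config_text = render_section("defaults", defaults)
print(config_text)
```

Generating the section from a dictionary keeps the attribute names and values in one place, which can help when comparing a planned configuration against the table.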
All paths are active. I/O is configured for some number of seconds or some number of I/O transactions before moving to the next open path in the sequence.
A single path with the highest priority (lowest value setting) is active for traffic. Other paths are available for failover, but are not used unless failover occurs.
You can create a script that interacts with DM-MP to provide priorities for paths to the LUN when set as a resource for the prio_callout setting.
First, set up a text file that lists information about each device and the priority values you want to assign to each path. For example, name the file /usr/local/etc/primary-paths. Enter one line for each path in the following format:
host_wwpn target_wwpn scsi_id priority_value
Return a priority value for each path on the device. Make sure that the variable FILE_PRIMARY_PATHS resolves to a real file with appropriate data (host wwpn, target wwpn, scsi_id and priority value) for each device.
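The contents of the primary-paths file for eight paths might look like the following. In this sample, each line also carries the SCSI address, the device node name, and the device ID between the target WWPN and the priority value, which is the last column: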
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:0 sdb 3600a0b8000122c6d00000000453174fc 50
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:1 sdc 3600a0b80000fd6320000000045317563 2
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:2 sdd 3600a0b8000122c6d0000000345317524 50
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:3 sde 3600a0b80000fd6320000000245317593 2
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:0 sdi 3600a0b8000122c6d00000000453174fc 5
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:1 sdj 3600a0b80000fd6320000000045317563 51
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:2 sdk 3600a0b8000122c6d0000000345317524 5
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:3 sdl 3600a0b80000fd6320000000245317593 51
To continue the example mentioned in Table 5.5, “Multipath Attributes”, create a script named /usr/local/sbin/path_prio.sh. You can use any path and filename. The script does the following:
On query from multipath, grep the device and its path from the /usr/local/etc/primary-paths file.
Return to multipath the priority value in the last column for that entry in the file.
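The two steps above can be sketched as a lookup function. The document describes a shell script that uses grep; the Python version below is a substitute written for illustration, and it matches whole fields instead of substrings so that a name such as sdb cannot accidentally match a longer device name. The function name lookup_priority is an assumption, and the sample lines are copied from the primary-paths example above.

```python
def lookup_priority(lines, device):
    """Return the last-column priority of the first entry that names `device`.

    `device` is compared against every field except the priority column.
    Returns None when no entry matches.
    """
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and device in fields[:-1]:
            return int(fields[-1])
    return None


# Two entries taken from the sample primary-paths file.
sample = [
    "0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:0 sdb 3600a0b8000122c6d00000000453174fc 50",
    "0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:1 sdj 3600a0b80000fd6320000000045317563 51",
]

print(lookup_priority(sample, "sdb"))
print(lookup_priority(sample, "sdj"))
```

A real callout would presumably read the lines from the file referenced by FILE_PRIMARY_PATHS and print the resulting value for multipath to consume; that wiring is left out here to keep the sketch self-contained.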
The mpath_prio_alua(8) command is used as a priority callout for the Linux multipath(8) command. It returns a number that is used by DM-MP to group SCSI devices with the same priority together. This path priority tool is based on ALUA (Asymmetric Logical Unit Access).
-d directory: Specifies the Linux directory path where the listed device node names can be found. The default directory is /dev. When used, specify the device node name only (such as sda) for the device or devices you want to manage.
Displays help for this command, then exits.
Turns on verbose output to display status in human-readable format. Output includes information about which port group the specified device is in and its current state.
Displays the version number of this tool, then exits.
device: Specifies the SCSI device you want to manage. The device must be a SCSI device that supports the Report Target Port Groups (sg_rtpg(8)) command. Use one of the following formats for the device node name:
The full Linux directory path, such as /dev/sda. Do not use with the -d option.
The device node name only, such as sda. Specify the directory path using the -d option.
The major and minor number of the device separated by a colon (:) with no spaces, such as 8:0. This creates a temporary device node in the /dev directory with a name in the format of tmpdev-<major>:<minor>-<pid>. For example, /dev/tmpdev-8:0-<pid>.
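The three accepted device formats can be summarized with a small Python model. This is only an illustration of the rules listed above, not the tool's actual implementation; the function resolve_device and its parameters are hypothetical.

```python
import os
import re


def resolve_device(device, directory=None, pid=None):
    """Return the device node path implied by one of the three argument forms."""
    if re.fullmatch(r"\d+:\d+", device):
        # major:minor form: a temporary node named tmpdev-<major>:<minor>-<pid>
        pid = os.getpid() if pid is None else pid
        return f"/dev/tmpdev-{device}-{pid}"
    if device.startswith("/"):
        # Full path form: must not be combined with the -d option.
        if directory is not None:
            raise ValueError("a full path must not be combined with -d")
        return device
    # Node name only: look it up in the -d directory, or /dev by default.
    return os.path.join(directory or "/dev", device)


print(resolve_device("/dev/sda"))
print(resolve_device("sda", directory="/dev"))
print(resolve_device("8:0", pid=1234))
```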
On success, returns a value of 0 and the priority value for the group. Table 5.6, “ALUA Priorities for Device Mapper Multipath” shows the priority values returned by the mpath_prio_alua command.
Values are widely spaced because of the way the multipath command handles them. It multiplies the number of paths in a group by the priority value for the group, then selects the group with the highest result. For example, if a non-optimized path group has six paths (6 x 10 = 60) and the optimized path group has a single path (1 x 50 = 50), the non-optimized group has the highest score, so multipath chooses the non-optimized group. Traffic to the device uses all six paths in the group in a round-robin fashion.
On failure, returns a value of 1 to 5 indicating the cause for the command’s failure. For information, see the man page for mpath_prio_alua.