Moger, Babu [Wed, 14 Mar 2012 21:20:22 +0000 (21:20 +0000)]
multipath: blacklist all the management Luns by default
This patch adds the blacklisting for all the management luns. Otherwise
user has to manually add blacklisting in multipath.conf for these luns.
Signed-off-by: Babu Moger <babu.moger@netapp.com>
Jun'ichi Nomura [Mon, 12 Mar 2012 11:56:52 +0000 (20:56 +0900)]
Fix fast_io_fail capping
Hi Christophe,
fast_io_fail is only meaningful if it is smaller than dev_loss_tmo.
Setting dev_loss_tmo value to fast_io_fail ends up with -EINVAL.
If the fast_io_fail is not configured properly, turning it off
seems to be the right behavior.
MP_FAST_IO_FAIL_OFF is -1, defined in the following patch:
[PATCH] Fix for setting '0' to fast_io_fail
http://www.redhat.com/archives/dm-devel/2012-March/msg00047.html
--
Jun'ichi Nomura, NEC Corporation
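Illustration only, not the patch itself: the rule described above can be sketched as a small helper. The function name effective_fast_io_fail() and its signature are assumptions made for this example; only the value of MP_FAST_IO_FAIL_OFF comes from the message.

```c
#include <assert.h>

#define MP_FAST_IO_FAIL_OFF (-1) /* value given in the commit message */

/* Hypothetical helper, not the project's function: pick the fast_io_fail
 * value to apply for a given dev_loss_tmo (both in seconds). */
int effective_fast_io_fail(int fast_io_fail, unsigned int dev_loss_tmo)
{
	/* Only a positive timeout can conflict with dev_loss_tmo. */
	if (fast_io_fail > 0 && (unsigned int)fast_io_fail >= dev_loss_tmo)
		return MP_FAST_IO_FAIL_OFF; /* turn off rather than cap */
	return fast_io_fail;
}

/* Small self-check showing the intended behaviour. */
void fast_io_fail_example(void)
{
	assert(effective_fast_io_fail(5, 30) == 5);
	assert(effective_fast_io_fail(30, 30) == MP_FAST_IO_FAIL_OFF);
}
```

A value that is already off (-1) passes through unchanged, so the helper can be applied unconditionally before writing the timeouts.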
Jun'ichi Nomura [Mon, 12 Mar 2012 11:43:55 +0000 (20:43 +0900)]
Fix for setting '0' to fast_io_fail
Hi Christophe,
In kernel, '0' is valid value for fast_io_fail, meaning immediate
termination of ios on rport delete.
However, '0' is treated as 'not-configured' in various places of
multipath-tools and it is not possible to set 0 to fast_io_fail.
Attached patch fixes that by introducing MP_FAST_IO_FAIL_ZERO
as internal representation of zero value.
--
Jun'ichi Nomura, NEC Corporation
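Illustration only: a minimal sketch of keeping "0" distinct from "not configured" by mapping it to a sentinel. MP_FAST_IO_FAIL_ZERO is named in the message, but its value here (-2), the name MP_FAST_IO_FAIL_UNSET, and both helper functions are assumptions for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define MP_FAST_IO_FAIL_UNSET 0     /* assumed name: 0 still means "not configured" */
#define MP_FAST_IO_FAIL_OFF   (-1)  /* value quoted in the fast_io_fail capping message */
#define MP_FAST_IO_FAIL_ZERO  (-2)  /* assumed value: any otherwise unused sentinel */

/* Hypothetical parser: configured string -> internal representation. */
int parse_fast_io_fail(const char *s)
{
	char *end;
	long v;

	assert(s != NULL);
	if (strcmp(s, "off") == 0)
		return MP_FAST_IO_FAIL_OFF;
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0' || v < 0 || v > INT_MAX)
		return MP_FAST_IO_FAIL_UNSET; /* not a usable number */
	return v == 0 ? MP_FAST_IO_FAIL_ZERO : (int)v;
}

/* Seconds to hand to the kernel for a configured, numeric value. */
int fast_io_fail_seconds(int internal)
{
	assert(internal > 0 || internal == MP_FAST_IO_FAIL_ZERO);
	return internal == MP_FAST_IO_FAIL_ZERO ? 0 : internal;
}
```

With this mapping, code that tests "is the value nonzero?" to detect a configured setting keeps working, and the zero is only translated back at the point where the value is written out.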
Martin George [Mon, 12 Mar 2012 08:22:11 +0000 (13:52 +0530)]
multipath: Set 'tur' as the default path checker for NetApp LUNs
In our tests, we've noticed that the 'tur' checker provides
better performance compared to 'directio' primarily because 'tur'
does not use FS-based requests unlike 'directio'. Moreover with
Hannes' recent async tur enhancement, the 'tur' checker is more
efficient now than before.
So we'd prefer using 'tur' as the default path checker for NetApp
LUNs now. The below patch enables the same by updating the
.checker_name in the hwtable for NetApp LUNs.
Signed-off-by: Martin George <marting@netapp.com>
Chauhan, Vijay [Tue, 6 Mar 2012 15:11:38 +0000 (15:11 +0000)]
multipath-tools: cleanup for all unused-but-set-variable variables in mpathpersist
This patch is a cleanup for all unused-but-set-variable variables
in mpathpersist.
Signed-off-by: Vijay Chauhan <vijay.chauhan@netapp.com>
Chauhan, Vijay [Tue, 6 Mar 2012 15:10:15 +0000 (15:10 +0000)]
multipath-tools: Implementation for hex output (-H) for mpathpersist
Adding missing implementation for hex output (-H).
Signed-off-by: Vijay Chauhan <vijay.chauhan@netapp.com>
Moger, Babu [Wed, 22 Feb 2012 18:09:10 +0000 (18:09 +0000)]
multipath-tools: Generalizing the vpd 0x83 processing with correct buffer length
Right now the buffer length for inquiry vpd 0x83 is hardcoded to 128 bytes.
This can cause problems if the total length of all the designation descriptors
exceeds 128 bytes. This was causing me issues while configuring my storage
with alua. I have generalized the processing with correct buffer length.
Patch has been tested with NetApp E-series storage.
Signed-off-by: Babu Moger <babu.moger@netapp.com>
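Illustration only, not the patch code: one way to avoid a hardcoded length is to read the 4-byte page header first and size the full request from it. The sketch assumes the SPC-3 style header, where bytes 2 and 3 hold a big-endian PAGE LENGTH that counts the bytes following the header; the function name is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VPD_HEADER_LEN 4 /* device type, page code, 2-byte page length */

/* Total size of a VPD page, computed from its 4-byte header. */
size_t vpd_page_total_len(const uint8_t *hdr, size_t hdr_len)
{
	assert(hdr != NULL && hdr_len >= VPD_HEADER_LEN);
	size_t page_len = ((size_t)hdr[2] << 8) | hdr[3];
	return VPD_HEADER_LEN + page_len;
}
```

A caller would issue a short inquiry for the header, call this helper, then allocate and re-issue the inquiry with the returned length.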
Moger, Babu [Wed, 22 Feb 2012 18:09:00 +0000 (18:09 +0000)]
multipath-tools: fix the bug while processing vpd 0x83 designation descriptors
This patch fixes the bug while processing the vpd 0x83 designation descriptors.
Removing the buggy check (> sizeof(buf)) while looping over the descriptors. sizeof(buf) will
always return 8 (on a 64-bit machine). Descriptor length can be more than 8 bytes in
some cases. This was causing problems while configuring my storage with alua.
Signed-off-by: Babu Moger <babu.moger@netapp.com>
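Illustration only: the pitfall described above, reduced to two hypothetical helpers. When buf is a pointer, sizeof(buf) is the size of the pointer, not of the data it points to, so the length has to be passed separately.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy pattern: buf is a pointer here, so sizeof(buf) is the pointer
 * size (8 on typical 64-bit machines), not the buffer size. */
int fits_buggy(const unsigned char *buf, size_t desc_len)
{
	return desc_len <= sizeof(buf);
}

/* Safer pattern: the caller passes the real buffer length. */
int fits_checked(size_t buf_len, size_t desc_len)
{
	return desc_len <= buf_len;
}

/* Small self-check: sizeof on the array itself gives the full size. */
void sizeof_example(void)
{
	unsigned char data[128] = {0};
	assert(sizeof(data) == 128);
	assert(fits_checked(sizeof(data), 20));
}
```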
Christophe Varoqui [Sat, 11 Feb 2012 08:33:45 +0000 (09:33 +0100)]
mpathpersist build fix
remove -lsysfs from Makefiles. sysfs.h is provided through
-lmultipath.
Benjamin Marzinski [Fri, 10 Feb 2012 18:18:38 +0000 (12:18 -0600)]
multipath: another manpage update
Missed an option with my last manpage update.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 10 Feb 2012 18:16:50 +0000 (12:16 -0600)]
multipath: adjust messages
Stop the rport_id messages from being displayed all the time, and add a message
alerting users when multipath tries to setup a map and fails, or ends up
removing the map.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 10 Feb 2012 18:14:39 +0000 (12:14 -0600)]
kpartx: verify GUID partition entry size
This patch pulls in some kernel code to catch a corrupt GUID partition
table with the wrong size.
Signed-off-by: Boris Ranto <branto@redhat.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 10 Feb 2012 18:13:12 +0000 (12:13 -0600)]
multipath: don't remove map twice
If setup_multipath fails, it removes the map itself, so don't try to remove it again.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 10 Feb 2012 18:11:37 +0000 (12:11 -0600)]
multipath: cleanup dev_loss_tmo issues
There are a couple of issues with the dev_loss_tmo code. First, the
comparison between fast_io_fail and dev_loss was failing for
fast_io_fail = -1. Second, if fast_io_fail_tmo was set to off, and
dev_loss was greater than 600, dev_loss_tmo would not be set. Finally,
verify_paths was calling sysfs_set_scsi_tmo without ever calling
select_fast_io_fail. However, this hasn't been causing problems since
setup_map is always called immediately after verify_paths, and it calls
all the select_ functions correctly. This patch fixes all these. Now,
if setting dev_loss_tmo fails, and fast_io_fail is set to off, it will
retry with dev_loss_tmo set to 600. Also, the calls that are duplicated
between verify_paths and setup_map have been removed.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 10 Feb 2012 18:10:11 +0000 (12:10 -0600)]
multipath: fix shutdown crashes
A number of processes don't reach a pthread cancellation point
before they use the pathvec or mpvec vectors, after they've
locked the vecs lock. This can cause crashes on shutdown, since
these vectors are deallocated. Also, the log thread accesses a
number of resources which may have been deallocated during shutdown
without holding any locks. This patch avoids these issues by
adding pthread_testcancel() checks after acquiring the vecs lock,
and having the child process make sure the log thread has exited
before deallocating the resources.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Vijay Chauhan [Thu, 9 Feb 2012 15:00:20 +0000 (10:00 -0500)]
mpathpersist: Add new utility for managing persistent reservation on dm multipath device
Persistent reservation management utility (mpathpersist) allows cluster management software to manage
persistent reservation through mpath device. It processes management request from caller
and hides the management task details. It also handles persistent reservation management of
data path life cycle and state changes.
Signed-off-by: Vijay Chauhan <vijay.chauhan@netapp.com>
Phillip Susi [Thu, 9 Feb 2012 20:16:21 +0000 (21:16 +0100)]
[kpartx] Don't add 'p' delimiter when you shouldn't
The 'p' delimiter is supposed to be added when the base disk name
ends in a digit. This decision was based on the name given on the
command line, not the canonical device name, so giving /dev/dm-0
instead of /dev/mapper/foo triggered the digit test and added the
'p'. Changed test to use the canonical name rather than the given
name.
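Illustration only: the digit test described above as a hypothetical helper. It assumes the caller has already resolved the canonical device name; that resolution step is not shown here.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return "p" if the canonical device name ends in a digit, else "". */
const char *partition_delimiter(const char *canonical_name)
{
	assert(canonical_name != NULL);
	size_t len = strlen(canonical_name);
	if (len > 0 && isdigit((unsigned char)canonical_name[len - 1]))
		return "p";
	return "";
}
```

With the canonical name "foo" no delimiter is added, whichever path the user typed on the command line.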
Gerhard Wichert [Wed, 8 Feb 2012 20:52:26 +0000 (21:52 +0100)]
Add Fujitsu Eternus defaults
Benjamin Marzinski [Fri, 27 Jan 2012 20:41:49 +0000 (14:41 -0600)]
multipath: Update multipath device on show topology
When multipathd's show_map_topology or show_maps_topology commands are
called, multipathd doesn't update its device state from the kernel. So,
if you do something like call disablequeueing first, show_map_topology won't
register the change. This patch makes multipathd update the device before
printing the topology. This also requires a change to setup_multipath, to
allow it to just read the kernel state, and not update anything.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 27 Jan 2012 20:42:45 +0000 (14:42 -0600)]
multipath: Update multipath.conf man page
Update the multipath.conf man page.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 13 Jan 2012 04:17:21 +0000 (22:17 -0600)]
multipath: don't remove dm device on remove uevent
multipathd gets remove uevents for dm devices when the devices have
been removed. It shouldn't try to actually remove the device itself,
since that has already been done, or it wouldn't have gotten the uevent.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 13 Jan 2012 04:14:56 +0000 (22:14 -0600)]
multipath: make tgt_node_name work for iscsi devices
tgt_node_name wasn't displaying anything for iscsi devices. With this
change, if multipath can't get the node_name, it will check
sys/devices/platform/hostX/sessionY/iscsi_session/sessionY/targetname
and if this is available, it will get the node name from there.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Oren Held [Fri, 30 Dec 2011 11:45:09 +0000 (12:45 +0100)]
Add default values for IBM XIV Storage System.
Benjamin Marzinski [Mon, 19 Dec 2011 21:41:57 +0000 (15:41 -0600)]
multipath: add option to change the number of error messages
This patch adds a new default config parameter, log_checker_err. It accepts
two values, "once" and "always", and defaults to "always". It controls
how multipathd logs checker error messages. If it's set to "once", only the
first checker error message is logged at logging level 2. All future messages
are logged at level 3, until the device is restored or removed. If it's set
to "always", all messages are logged at level 2, like multipathd currently does.
This version actually compiles.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
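Illustration only: the level selection described above as a small function. The enum, its member names, and the already_logged flag are assumptions for this sketch; only the levels 2 and 3 and the "once"/"always" behaviour come from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed names for the two accepted settings. */
enum log_checker_err { LOG_CHKR_ERR_ALWAYS, LOG_CHKR_ERR_ONCE };

/* Log level for a checker error message. already_logged is true when an
 * error was already logged for this path since it was last restored. */
int checker_err_loglevel(enum log_checker_err mode, bool already_logged)
{
	if (mode == LOG_CHKR_ERR_ONCE && already_logged)
		return 3;
	return 2;
}

/* Small self-check of the two modes. */
void log_checker_err_example(void)
{
	assert(checker_err_loglevel(LOG_CHKR_ERR_ONCE, false) == 2);
	assert(checker_err_loglevel(LOG_CHKR_ERR_ONCE, true) == 3);
	assert(checker_err_loglevel(LOG_CHKR_ERR_ALWAYS, true) == 2);
}
```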
Benjamin Marzinski [Mon, 19 Dec 2011 22:19:56 +0000 (16:19 -0600)]
multipath: fix scsi timeout code
sysfs_attr_set_value() returns the amount written on success, or -1 on
failure. sysfs_set_scsi_tmo() was checking if the return was nonzero, and
failing if it was. This meant that it always failed out silently after writing
the first value. I've changed the check, and added some error messages. I also
made sysfs_attr_set_value return -1 for all errors.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
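Illustration only: the return-value mix-up described above, using a stand-in for the sysfs write. fake_attr_set_value() and the two callers are hypothetical; they only mimic the "bytes written or -1" convention from the message.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for a sysfs write: returns bytes "written", or -1 on error. */
int fake_attr_set_value(const char *value)
{
	if (value == NULL || value[0] == '\0')
		return -1;
	return (int)strlen(value);
}

/* Buggy check: any nonzero return, including a byte count, is an error. */
int set_tmo_buggy(const char *value)
{
	if (fake_attr_set_value(value))
		return 1; /* wrongly fails after a successful write */
	return 0;
}

/* Fixed check: only a negative return is an error. */
int set_tmo_fixed(const char *value)
{
	if (fake_attr_set_value(value) < 0)
		return 1;
	return 0;
}

/* Small self-check contrasting the two callers. */
void set_tmo_example(void)
{
	assert(set_tmo_buggy("30") == 1); /* success reported as failure */
	assert(set_tmo_fixed("30") == 0);
}
```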
Phillip Susi [Tue, 6 Dec 2011 16:12:58 +0000 (11:12 -0500)]
multipath-tools: Remove bad udev rules
This sample udev rules file contains some rules relating to dmraid
that both should not be there and are broken anyhow. They should
not be there because firstly, what is a dmraid rule doing in a
kpartx rule file, and secondly, dmraid already activates partitions
itself, so there is no need to run kpartx to do that. The rule is
broken because it is matching on the DM_UUID starting with "dmraid-",
but this comparison is case sensitive, and it actually starts with
"DMRAID-".
Signed-off-by: Phillip Susi <psusi@cfl.rr.com>
Christophe Varoqui [Tue, 15 Nov 2011 20:39:34 +0000 (21:39 +0100)]
Revert "multipath: rlookup WWIDs with spaces by alias"
This reverts commit
1620040c3b1a4c4f6762d7e606a83c9f5ab8ebff.
A wwid cannot contain whitespace anyway; scsi_id makes sure of that.
Aruna Balakrishnaiah [Tue, 15 Nov 2011 15:09:57 +0000 (20:39 +0530)]
Update man page for multipath -r
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
hegdevasant [Tue, 15 Nov 2011 20:33:34 +0000 (21:33 +0100)]
kpartx man page update
This patch updates the kpartx man page.
Signed-off-by: Vasant Hegde <hegdevasant@in.ibm.com>
Christophe Varoqui [Sat, 12 Nov 2011 13:38:26 +0000 (14:38 +0100)]
Fix prio default value in multipath.conf.annotated
"none" to "const"
Christophe Varoqui [Sat, 12 Nov 2011 12:04:19 +0000 (13:04 +0100)]
Fix polling interval reported by multipath -t
polling interval default value was not set in the multipath
code path, but only in multipathd (where it is used).
Move the default value setting to load_config, where it belongs,
to have it set in both multipath and multipathd.
Olivier Lambert [Thu, 10 Nov 2011 11:36:23 +0000 (12:36 +0100)]
update prioritizer for iet target
Add missing free(), remove spurious whitespaces
Benjamin Marzinski [Sat, 12 Nov 2011 05:12:49 +0000 (23:12 -0600)]
multipath: don't print so many add map messages
Whenever a dm device gets a change uevent, multipathd prints an add map
message. This can get confusing for users, so change that message to
not print at the default log level, and add a new message that only
prints if multipathd will actually try to add a map
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Sat, 12 Nov 2011 05:10:21 +0000 (23:10 -0600)]
multipath: Set the default max_fds to the system max
Since many people don't realize that they need to set max_fds until they run
out of file descriptors, default to the system max.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Sat, 12 Nov 2011 04:54:26 +0000 (22:54 -0600)]
multipath: rlookup WWIDs with spaces by alias
If a WWID contained spaces, the rlookup code wasn't able to look it up
by its user_friendly_name, since the code was only reading the wwid till
the first space. It now reads to the end of the line.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
hegdevasant [Wed, 9 Nov 2011 23:00:01 +0000 (00:00 +0100)]
Add missing break statement in kpartx
Signed-off-by: Vasant Hegde <vahegde1@linux.vnet.ibm.com>
Christophe Varoqui [Wed, 9 Nov 2011 22:53:03 +0000 (23:53 +0100)]
Merge a prioritizer for IET scsi software target
Written by Olivier Lambert <lambert.olivier@gmail.com>
Anton Blanchard [Wed, 9 Nov 2011 22:48:12 +0000 (23:48 +0100)]
Vendor/product comparisons are too broad
We have a POWER machine with a broken multipath setup. Analysis shows
that the RDAC driver is being used even though it shouldn't.
The vendor/product is:
IBM,IPR-0
65C61818
There is an entry for this device:
/* IBM IPR */
.vendor = "IBM",
.product = "IPR.*",
Unfortunately it looks like a previous entry is matching against this
(since we do a regex match):
/* IBM DS5000 */
.vendor = "IBM",
.product = "1818",
There are a number of IBM entries that have this issue. The following
patch ensures we match against the entire product ID.
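Illustration only: why an unanchored regex is too broad, shown with POSIX regcomp()/regexec(). The helper name, the exact product string (the two lines above joined with spaces), and the choice of "^...$" anchoring are assumptions for this sketch and may differ from what the patch does.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the POSIX extended regex `pattern` matches anywhere in `str`. */
bool product_matches(const char *pattern, const char *str)
{
	regex_t re;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false; /* treat an invalid pattern as "no match" */
	bool hit = (regexec(&re, str, 0, NULL, 0) == 0);
	regfree(&re);
	return hit;
}

/* Small self-check with the product ID from the report (spacing assumed). */
void product_match_example(void)
{
	const char *ipr = "IPR-0   65C61818";

	assert(product_matches("1818", ipr));    /* substring match: too broad */
	assert(!product_matches("^1818$", ipr)); /* anchored: no longer matches */
	assert(product_matches("^IPR.*$", ipr)); /* intended entry still matches */
}
```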
bmarzins@sourceware.org [Wed, 2 Nov 2011 21:44:58 +0000 (22:44 +0100)]
Fix for Red Hat bz #737072.
CVSROOT: /cvs/dm
Module name: multipath-tools
Branch: RHEL5_FC6
Changes by: bmarzins@sourceware.org 2011-10-24 13:41:32
Modified files:
path_priority/pp_alua: rtpg.c
Log message:
Fix for bz #737072. Shorten the timeout for the alua prio callout function
from 5 minutes to 1 minute.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/multipath-tools/path_priority/pp_alua/rtpg.c.diff?cvsroot=dm&only_with_tag=RHEL5_FC6&r1=1.3.2.5&r2=1.3.2.6
bmarzins@sourceware.org [Mon, 24 Oct 2011 13:37:18 +0000 (13:37 +0000)]
multipath-tools/kpartx gpt.c
CVSROOT: /cvs/dm
Module name: multipath-tools
Branch: RHEL5_FC6
Changes by: bmarzins@sourceware.org 2011-10-24 13:37:18
Modified files:
kpartx : gpt.c
Log message:
Fix for bz #719575. Validate size of GPT partitions.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/multipath-tools/kpartx/gpt.c.diff?cvsroot=dm&only_with_tag=RHEL5_FC6&r1=1.3&r2=1.3.2.1
Benjamin Marzinski [Mon, 17 Oct 2011 21:16:02 +0000 (16:16 -0500)]
multipath: handle offlined paths
The kernel does not allow multipath to load tables containing offline
devices. Because of this, if you try to add a path to a multipath device with
an offline path, multipathd will continually retry and fail to reload
the table. I've limited the retries to three to avoid livelocking.
Also, if your map included an offline path, multipath was crashing because
it couldn't get the required sysfs information before calling get_state().
It now checks for this in multipath, like it does in multipathd.
Lastly, multipathd would keep reprinting the last checker message for
offlined paths, instead of something useful. It now prints a "path offline"
message.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 11 Oct 2011 03:18:09 +0000 (22:18 -0500)]
multipath: better argument type checking
The way that multipath decided what you passed in as an argument didn't
always work. If the argument was the name of a file, then multipath
assumed that it was a path. That meant if you were in /dev/mapper and ran
# multipath -f <mpath_device_name>
It would fail, since it thought you gave it a path name, instead of a
multipath device name. Now multipath will only treat the argument
as a path name if it is a block device with a different major number than
device-mapper's. Also, I've switched the MAJOR/MINOR/MKDEV macros to
work like kpartx, so that they can handle minor numbers over 255.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
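Illustration only: device-number macros that keep minor numbers above 255. The layout below (12-bit major in bits 8-19, low 8 minor bits in bits 0-7, remaining minor bits from bit 20 up) is the Linux 32-bit encoding and is assumed here; the EX_ prefix is made up to avoid clashing with system headers.

```c
#include <assert.h>

/* Assumed encoding: major in bits 8-19, minor split around it. */
#define EX_MAJOR(dev)    (((dev) & 0xfff00u) >> 8)
#define EX_MINOR(dev)    (((dev) & 0xffu) | (((dev) >> 12) & 0xfff00u))
#define EX_MKDEV(ma, mi) (((mi) & 0xffu) | ((ma) << 8) | (((mi) & ~0xffu) << 12))

/* Small self-check: a minor above 255 survives the round trip. */
void devnum_example(void)
{
	unsigned int dev = EX_MKDEV(253u, 300u);

	assert(EX_MAJOR(dev) == 253u);
	assert(EX_MINOR(dev) == 300u);
}
```

A scheme that keeps the minor in a single byte would truncate 300 to 44, which is the kind of failure the wider encoding avoids.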
Benjamin Marzinski [Tue, 11 Oct 2011 03:19:13 +0000 (22:19 -0500)]
multipath: set ACT_RESIZE when the size has changed
When the multipath path devices change size, multipath can't be reloaded
with noflush set. So, don't set the action to ACT_RELOAD, which will
cause the multipath device to get stuck in SUSPEND. Use ACT_RESIZE.
Also, I was seeing some messages that were getting cut off with the
128 byte messages size, so I doubled that, and the log area size.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 11 Oct 2011 03:16:51 +0000 (22:16 -0500)]
multipath: better check for daemon mode
With the existing check, if a multipath device gets created with a
blacklisted path (because, for instance, the path was unblacklisted,
but multipathd was not reconfigured), multipathd will crash. This is
because multipathd will add the path when it adds the multipath device,
but it won't have all the necessary information to use the path. The
new check makes sure multipathd won't add blacklisted paths, simply
because they are part of a multipath device.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Aruna Balakrishnaiah [Wed, 5 Oct 2011 14:36:14 +0000 (20:06 +0530)]
'multipath' with -h and -t option, it returns '1' (fail) for successful command execution
Fix exit status for -h and -t options in multipath command
Benjamin Marzinski [Wed, 5 Oct 2011 04:13:49 +0000 (23:13 -0500)]
multipath: make sure all the hwe attributes get merged
Not all of the hwe attributes were getting merged. Also,
multipathd show config was putting an extra set of quotes around the entries
in the devices section.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 5 Oct 2011 04:15:05 +0000 (23:15 -0500)]
multipath: get right sysfs value for checker_timeout
sysfs_get_timeout() wasn't looking in the correct directory for the
checker timeout value. It was looking at .../block/<devname>/timeout,
instead of .../block/<devname>/device/timeout
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 27 Sep 2011 20:50:52 +0000 (15:50 -0500)]
multipath: add default hardware configs.
Here are some hardware configs I've received from vendors, that haven't made it
upstream yet, along with a little bit of cleanup. The changes come from Redhat
BZ #622569, #636213, and #694602
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Mon, 26 Sep 2011 14:50:58 +0000 (09:50 -0500)]
multipath: don't set queue_if_no_path without multipathd
If multipathd is not running, when all paths to a device have failed, there's
no way for them to automatically get restored. If the device is set to queue,
whatever is accessing it will hang forever. This can lead to problems if it
happens at boot-up. This patch unsets queue_if_no_path for all devices created
when multipathd is not running. When multipathd starts, it will automatically
reset queue_if_no_path to the proper value. This new behaviour can be
overridden using the new "-q" option to multipath.
This version of the patch contacts multipathd's client socket to tell if it's
running.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 23 Sep 2011 14:35:59 +0000 (09:35 -0500)]
multipath: add support for setting oom_score_adj
The oom_adj procfs interface is deprecated. I've added support for using the
new oom_score_adj interface. The code still falls back to using oom_adj
if oom_score_adj doesn't exist.
Resending, since I was obviously working far too late last night.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
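Illustration only: the fallback described above, reduced to choosing which procfs file to write. pick_oom_file() is a hypothetical helper, and the existence test via access() is an assumption about how one might detect the newer interface.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Return the first of the two control files that exists, or NULL. */
const char *pick_oom_file(const char *preferred, const char *fallback)
{
	assert(preferred != NULL && fallback != NULL);
	if (access(preferred, F_OK) == 0)
		return preferred;
	if (access(fallback, F_OK) == 0)
		return fallback;
	return NULL;
}
```

A daemon would call it as pick_oom_file("/proc/self/oom_score_adj", "/proc/self/oom_adj"). The value written must match the file chosen, since the two interfaces use different scales (the kernel's "never kill" value is -1000 for oom_score_adj and -17 for oom_adj).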
Chauhan, Vijay [Fri, 2 Sep 2011 13:22:46 +0000 (18:52 +0530)]
multipath-tools: Adding Netapp as a brand name for RDAC
Resending this patch. Previous post had indentation issue due to my mail settings.
Christophe Varoqui [Thu, 1 Sep 2011 19:43:27 +0000 (21:43 +0200)]
Remove useless alias pointer reset to NULL
This is already done by free_multipath()
Benjamin Marzinski [Thu, 1 Sep 2011 17:08:29 +0000 (12:08 -0500)]
multipath: systemd unit file
Here is a systemd unit file for managing multipathd.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 13 Jul 2011 18:30:42 +0000 (13:30 -0500)]
multipath: check setup_multipath return value.
When setup_multipath() fails, it removes the map. So update_path_groups()
needs to check the return value, and fail without touching the map anymore if
setup_multipath() fails.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 1 Sep 2011 06:50:54 +0000 (08:50 +0200)]
multipath: strdup multipath alias, so that it isn't deleted
When a multipath device is added to multipathd with ev_add_map(),
the alias is not duplicated, and is freed immediately after ev_add_map()
returns, causing a memory error. This patch corrects that.
Moger, Babu [Mon, 29 Aug 2011 16:24:37 +0000 (12:24 -0400)]
multipath-tools: service mode changes for RDAC storage
This patch was not picked up yet, so I am resubmitting it. There is no change except that it was generated on top of the latest file.
This patch handles the recent changes in NetApp RDAC storage firmware to report service mode. The firmware changed inquiry page 0xc9 to report service mode. The purpose of this change is to keep DMMP from going into an infinite loop of switching back and forth between controllers when a controller is placed in service mode. This patch fixes the problem by reporting the path as failed if the controller is placed in service mode.
Signed-off-by: Babu Moger <babu.moger@netapp.com>
Reviewed-by: Yanling Qi <yanling.qi@netapp.com>
Reviewed-by: Somasundaram Krishnasamy <Somasundaram.Krishnasamy@netapp.com>
Ritesh Raj Sarraf [Wed, 17 Aug 2011 17:40:37 +0000 (23:10 +0530)]
Add kpartx example to manpage
Thanks: Lars Wirzenius
Closes: 637538
Signed-off-by: Ritesh Raj Sarraf <rrs@debian.org>
Craig [Fri, 19 Aug 2011 22:16:55 +0000 (00:16 +0200)]
fix-linebreaks
Hi,
error messages (DM message failed) in the logs look like this in multipath:
Jul 24 06:17:47 onosendai multipathd: DM message failed [queue_if_no_path
Jul 24 06:17:47 onosendai ]
Jul 24 06:17:47 onosendai multipathd: sdc: alua not supported
Jul 24 06:17:47 onosendai multipathd: sdd: alua not supported
Jul 24 06:17:47 onosendai multipathd: DM message failed [queue_if_no_path
Jul 24 06:17:47 onosendai ]
The patch fixes the unnecessary \n in libmultipath/devmapper.c
Best regards,
Craig
>From
bb1354f917b7bd205605a41016e0a3e1aff6feac Mon Sep 17 00:00:00 2001
From: craig <craig@haquarter.de>
Date: Sun, 24 Jul 2011 06:38:55 +0200
Subject: [PATCH] fix linebreaks
Oren Held [Sun, 21 Aug 2011 10:05:22 +0000 (13:05 +0300)]
Remove prio_callout as a valid option from man page
Signed-off-by: Oren Held <orenhe@il.ibm.com>
Moger, Babu [Fri, 27 May 2011 14:30:19 +0000 (10:30 -0400)]
multipath-tools: Manual failback fix when priority changes
Current code switches the path-group when there is a change in priority. However,
this is not the right thing to do when failback is set to manual. This patch fixes
this problem. Call update_path_groups only if failback is immediate.
Signed-off-by: Babu Moger <babu.moger@netapp.com>
Christophe Varoqui [Wed, 25 May 2011 21:21:42 +0000 (23:21 +0200)]
Fix hang on reconfigure CLI command
Restore the vector locking outside the reconfigure() function.
Moving it inside caused a double-lock hang situation. The
first locker being uxsock_trigger(), caller of reconfigure().
Discussion on-going on whether we'd better stop locking from
uxsock_trigger().
Hannes Reinecke [Wed, 25 May 2011 12:40:42 +0000 (14:40 +0200)]
Use refcounting for sysfs devices
As we're caching sysfs devices we need to introduce some sort
of refcounting here. Otherwise the device might be removed from
other threads while we're still accessing it.
References: bnc#642846
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 25 May 2011 12:40:31 +0000 (14:40 +0200)]
multipathd: Do not attempt to rename a device
If a device-mapper device got renamed we should be notified
via the waiter thread; no need to do it in the main loop.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 25 May 2011 12:40:19 +0000 (14:40 +0200)]
Race condition when calling stop_waiter_thread()
We cannot access the waiter structure from other threads as
the lifetime is totally different and it might be deleted
at any time.
So we better store the pthread id in the calling thread and
just send a signal to the thread.
References: bnc#642846
Signed-off-by: Hannes Reinecke <hare@suse.de>
Christophe Varoqui [Wed, 25 May 2011 12:04:35 +0000 (14:04 +0200)]
Quick and dirty adaption of the debian startup file
For the pidfile is no longer created.
Christophe Varoqui [Wed, 25 May 2011 12:00:52 +0000 (14:00 +0200)]
Fix segfault in dm reassign code path
alias is allocated and freed in multipathd/main.c:uev_add_map().
Don't free it in ev_add_map() called from uev_add_map() to avoid
double free.
Christophe Varoqui [Wed, 25 May 2011 11:59:53 +0000 (13:59 +0200)]
Set an internal default for feature
Fix segfault when no /etc/multipath.conf is present.
Christophe Varoqui [Wed, 25 May 2011 09:43:58 +0000 (11:43 +0200)]
Revert "Add 'max_polling_interval' config variable"
This reverts commit
efc8ace4b335e752a7d28aca6040af0f9fe37530.
Spurious patch in Hannes branch
Christophe Varoqui [Wed, 25 May 2011 06:35:34 +0000 (08:35 +0200)]
Merge remote-tracking branch 'hannes/for-christophe'
Conflicts:
multipathd/main.c
Hannes Reinecke [Thu, 4 Dec 2008 13:20:06 +0000 (14:20 +0100)]
Reload map for device read-only setting changes
Whenever the read-only setting for a device changes we have
to reload the map. This patch implements the required cli command
and also a uevent handler if a uevent with 'DISK_RO=' setting
has been received.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 20 Mar 2009 09:30:04 +0000 (10:30 +0100)]
Increase priority value for emc priority callout
For non-default paths the emc priority callout should
not return '0', as this will prevent the daemon from switching
paths. And we should be returning 'PRIO_UNDEF' in the
case of failure.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 13:33:18 +0000 (15:33 +0200)]
Allow dev_loss to be set to 'infinity'
With this patch we can set dev_loss to infinity, so that
failed devices will never be removed from the system.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 12:29:07 +0000 (14:29 +0200)]
Update manpages
The man pages are in dire need of updating. Do it now.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Mon, 30 Apr 2007 07:37:38 +0000 (09:37 +0200)]
multipath: add '-t' option to dump internal hwtable
This patch adds an option '-t' to dump the internal hardware table.
Quite handy if you want to know the default settings.
In doing so it also fixes the keyword allocation; currently the
keywords are only initialised if a configuration file is used.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 12:02:00 +0000 (14:02 +0200)]
Reassign existing device-mapper maps
When a multipath device is created other maps might already be
in place pointing to the same block device. To ensure uninterrupted
access these maps should be reassigned to point to the
multipath devices instead.
This patch also adds a configuration variable 'reassign_maps'
to toggle this behaviour.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Mon, 13 Sep 2010 09:06:13 +0000 (11:06 +0200)]
alua: Handle LBA_DEPENDENT state
SPC-4 added another state, LBA_DEPENDENT. This patch
adds basic support for it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 11:28:04 +0000 (13:28 +0200)]
Update multipathd init script for SuSE
As we now have the 'show daemon' CLI command we can be using
it to track startup and shutdown.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 11:22:13 +0000 (13:22 +0200)]
multipathd: Add PID to 'show daemon' cli command
We might want to know the PID of the daemon itself.
So adding it to the 'show daemon' cli command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 11:12:24 +0000 (13:12 +0200)]
Update pid file handling
As we now have a shutdown CLI command we don't actually need the
pid file anymore. So any errors on creation can be ignored.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 4 May 2010 09:02:37 +0000 (11:02 +0200)]
Serialize startup on large machines
On large installations the startup can take quite long.
So for better integration with the init scripts I've added
the CLI command 'show daemon' which returns the internal
running state of the daemon.
With this the init scripts can wait until the daemon
is properly started.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 28 Apr 2009 09:11:23 +0000 (11:11 +0200)]
Add 'shutdown' cli command
Rather than sending a signal to the process (which might get caught
by any thread, possibly blocking it) we can as well implement a
'shutdown' cli command, making sure the daemon will be terminated.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Thu, 27 May 2010 08:37:58 +0000 (10:37 +0200)]
multipathd: Update 'max_fds' handling
We don't need to update the 'max_fds' setting if it's already
higher than we need. And we should be issuing a debug message
when we did.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 18 Feb 2011 08:34:18 +0000 (09:34 +0100)]
multipathd: crash in reconfigure CLI command
The 'reconfigure' CLI command doesn't take the vector lock,
so if multipathd is processing a table / udev event at the
same time it'll crash.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Mon, 1 Feb 2010 08:46:57 +0000 (09:46 +0100)]
Add 'max_polling_interval' config variable
We should be able to set the 'max_polling_interval' variable
manually. Especially systems requiring precise failover timing
will want to disable the automatic polling interval increase.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Thu, 10 Feb 2011 09:35:28 +0000 (10:35 +0100)]
libmultipath: Only count UP and GHOST paths for prio update
When calculating the priority of a pathgroup we should be
counting only UP and GHOST paths; all other values might
be stale.
References: bnc#665289
Signed-off-by: Hannes Reinecke <hare@suse.de>
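Illustration only: counting just UP and GHOST paths when computing a path group's priority. The struct, the enum values, and the choice to sum the priorities are assumptions for this sketch; only the state names PATH_UP, PATH_GHOST and PATH_DOWN appear in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of path states; numeric values are arbitrary here. */
enum path_state { PATH_DOWN, PATH_UP, PATH_GHOST, PATH_SHAKY };

struct path {
	enum path_state state;
	int priority;
};

/* Sum the priorities of usable paths only; others may be stale. */
int pathgroup_priority(const struct path *paths, size_t n)
{
	int prio = 0;

	assert(paths != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (paths[i].state == PATH_UP || paths[i].state == PATH_GHOST)
			prio += paths[i].priority;
	return prio;
}
```

In this sketch a PATH_DOWN path with a high but outdated priority no longer inflates the group's value.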
Hannes Reinecke [Thu, 2 Jul 2009 12:53:53 +0000 (14:53 +0200)]
Always synchronize with dm state
When running on iSCSI the connection might suffer intermittent
errors, causing the path to fail. But by the time the path checker
runs these errors will be cleared by the iSCSI internal connection
recovery, which means the daemon will never see any error and will not
reinstate any failed paths.
References: bnc#447887
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Mon, 9 Feb 2009 13:04:25 +0000 (14:04 +0100)]
multipathd is not starting waitevent checker for single paths
After multipathd was started, any SCSI disks added afterwards
would not trigger multipathd to create a waitevent thread.
The waitevent thread listens for the kernel's offline/online events, thoroughly
checks what the kernel sees against what multipathd thinks and, if something is
off, whacks multipathd to the right state.
For devices which did not have a kernel device mapper helper (hp_sw, rdac,
etc) and only have one single path, when the link experiences a momentary blip
with I/O on it the path would be marked as failed _only_ by the kernel. This
event would _not_ be propagated to multipathd (b/c it did not have a waitevent
thread created). Multipathd would only do the path checker which would provide a
PATH_UP event (rightly so - as the path would only be down for a second or so).
However, the device mapper path group would be marked as failed, and any
incoming I/O would be blocked (if queue_if_no_path was set) or fail.
The end result was that multipathd would think everything was peachy while the
kernel would be failing (or queueing) the I/O to the multipath device.
References: bnc#473841
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 21 Jan 2009 14:01:38 +0000 (15:01 +0100)]
More debugging output when synchronizing path states
When synchronizing path state we might end up removing a path.
However, if this path is not in state PATH_DOWN there is an
error somewhere. So modify the messages accordingly.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 12 May 2010 06:01:18 +0000 (08:01 +0200)]
libmultipath: Make path state message unique
There are two identical messages 'state =', so we
should modify them so that they can be told apart.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Mon, 11 Jan 2010 14:06:08 +0000 (15:06 +0100)]
Update dev_loss_tmo for no_path_retry
When 'no_path_retry' is active we have to update the dev_loss_tmo
setting accordingly to ensure that no devices are removed during
an all-paths-down scenario.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Mon, 26 Apr 2010 10:01:40 +0000 (12:01 +0200)]
Check for valid argument in update_multipath_strings()
We need to check for a valid argument here.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Thu, 27 May 2010 11:53:43 +0000 (13:53 +0200)]
libmultipath: Remove duplicate calls to path_offline()
When calling pathinfo() path_offline() is called several times
in a row, which is quite unnecessary.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 07:35:34 +0000 (09:35 +0200)]
Check for offline path in get_prio()
We need to check for an offline path in get_prio(), otherwise
the priority callout might stall.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 07:34:13 +0000 (09:34 +0200)]
Use state name in get_state()
Rather than displaying the numerical value, we should return
the path status string.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 07:32:15 +0000 (09:32 +0200)]
Only check offline status for SCSI devices
Only SCSI devices can be checked for offline status, so we
should return PATH_UP for every other type.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 18 May 2011 07:24:29 +0000 (09:24 +0200)]
libmultipath: zero out sense buffer in do_inq()
We should zero out the sense buffer when doing an inquiry
so as not to have invalid contents being passed up.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 17 May 2011 12:46:56 +0000 (14:46 +0200)]
multipathd: fix memory issues in cli.c
Some memory issues in cli.c have been found by valgrind.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 17 May 2011 12:42:31 +0000 (14:42 +0200)]
multipathd: Remove handling of 'umount' events
umount uevents are gone, so we should remove the
handling for them, too.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 28 Apr 2009 09:05:39 +0000 (11:05 +0200)]
Safe memory allocation in cli_handlers
Valgrind pointed out that the memory returned from realloc() is
not initialized. So do that explicitly.
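
One way to do this is sketched below. The helper realloc_zeroed() is
hypothetical and is not the function used in cli_handlers; it only shows
zeroing the newly added tail after a successful realloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: grow a buffer from old_size to new_size bytes
 * and zero the newly added tail, which realloc() leaves uninitialized.
 * Returns NULL on failure; the original buffer is then still valid.
 */
static void *realloc_zeroed(void *ptr, size_t old_size, size_t new_size)
{
	char *p;

	assert(new_size > 0 && new_size >= old_size);

	p = realloc(ptr, new_size);
	if (!p)
		return NULL;

	/* Existing contents are preserved; only the new bytes are cleared. */
	memset(p + old_size, 0, new_size - old_size);
	return p;
}
```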
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Wed, 12 Jan 2011 09:13:04 +0000 (10:13 +0100)]
multipathd: Fix uxlsnr race condition on shutdown
The multipath daemon deallocates some memory structures
upon shutdown which have been allocated in the thread
context of uxlsnr. Upon shutdown this thread is already
done for, taking its memory structures with it.
So we need to establish a proper pthread cleanup
handler here to ensure the memory structures are
freed correctly.
Signed-off-by: Hannes Reinecke <hare@suse.de>