Benjamin Marzinski [Fri, 19 Nov 2010 03:33:18 +0000 (21:33 -0600)]
multipath: update default configurations
Here are some default configuration changes that I've been sent.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Sun, 14 Nov 2010 21:02:14 +0000 (15:02 -0600)]
multipath: clean up path orphaning and adoption
Make sure that multipathd orphans paths when they don't get included in maps,
to reset them to a consistent state, and make sure that multipath adopts paths
that get picked up during a table reload. However, multipathd shouldn't change
the state or priority of paths when it's updating due to a table reload,
since this can interfere with the checkerloop.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 10 Nov 2010 18:52:20 +0000 (12:52 -0600)]
multipath: sort all pathgroups by priority
Right now, the only path grouping policy that sorts pathgroups by priority is
group_by_prio. For the others, the pathgroups are set up in the order that
their paths are discovered. This can cause a problem when the kernel needs to
switch pathgroups, such as when a path fails. The kernel will simply pick the
first valid pathgroup. If failback isn't set to manual, this can cause
needless pathgroup switching. Even worse, with failback set to manual, which
is the default, the kernel will just continue to use the non-optimal pathgroup
until it fails. This patch makes all path grouping policies except multibus
sort the pathgroups by priority.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
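A minimal sketch of the sorting step described above, assuming a hypothetical struct pathgroup_sketch in place of the real libmultipath types. The names sort_pathgroups and cmp_prio_desc are invented for illustration, and the multibus exception is not modelled:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a pathgroup; not the libmultipath struct. */
struct pathgroup_sketch {
    int id;
    int priority;
};

/* qsort comparator: higher priority sorts first. */
static int cmp_prio_desc(const void *a, const void *b)
{
    const struct pathgroup_sketch *x = a;
    const struct pathgroup_sketch *y = b;

    return (y->priority > x->priority) - (y->priority < x->priority);
}

/* Sort groups so that the first entry is the highest-priority group. */
void sort_pathgroups(struct pathgroup_sketch *pgs, size_t n)
{
    qsort(pgs, n, sizeof(*pgs), cmp_prio_desc);
}
```

With the groups ordered this way, a kernel that simply picks the first valid pathgroup also picks the best one.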
Malahal Naineni [Wed, 28 Jul 2010 00:46:19 +0000 (17:46 -0700)]
dm-multipath: typo, strncat instead of strncpy
strncat here doesn't seem right, strncpy should be correct!
Signed-off-by: Malahal Naineni (malahal@us.ibm.com)
PS: But why bother correcting an entry that is going to be deleted
anyway? IMO, just deleting the strncat/strncpy line should be fine too.
Any comments???
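For reference, the difference between the two calls can be sketched as follows. The helper set_name is hypothetical and is not the code touched by this commit; it only shows that strncpy() overwrites the destination, whereas strncat() would append to whatever the buffer already holds:

```c
#include <string.h>

/*
 * Overwrite dst with src, always NUL-terminating.
 * strncat() would instead append to the existing contents of dst.
 */
void set_name(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return;
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';   /* strncpy() does not terminate on truncation */
}
```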
Hannes Reinecke [Fri, 18 Jun 2010 10:32:21 +0000 (12:32 +0200)]
Set geometry information for multipath maps
Some programs (most notably grub) try to get the device geometry
information via the HDIO_GETGEO ioctl. While device-mapper provides
this ioctl, its values have to be set beforehand.
So we can just use the geometry information from the first path
to set this information for the multipath map.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Tim Harder [Sun, 28 Nov 2010 21:12:30 +0000 (13:12 -0800)]
libmultipath: fix snprintf buffer overflows
Hi,
I've attached a patch against the latest git version of multipath-tools
to fix a couple snprintf buffer overflows.
Thanks,
Tim
From 2d51194853342ba24f3b76e9343b8467a8e55400 Mon Sep 17 00:00:00 2001
From: Tim Harder <radhermit@gentoo.org>
Date: Sun, 28 Nov 2010 12:37:08 -0800
Subject: [PATCH] libmultipath: fix snprintf buffer overflows
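The message does not say which calls overflowed. As a general illustration only, a truncation-checked wrapper around vsnprintf() might look like the sketch below; fmt_checked is a hypothetical name, not a libmultipath function:

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Format into buf; return 0 on success, 1 if the output was truncated
 * or an output error occurred.
 */
int fmt_checked(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    /* vsnprintf() returns the length it wanted to write, excluding NUL. */
    return n < 0 || (size_t)n >= size;
}
```

The key points are passing the real size of the destination buffer and checking the return value against it.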
Arkadiusz Miskiewicz [Sat, 27 Nov 2010 18:21:21 +0000 (19:21 +0100)]
multipath-tools overflow
On Saturday 27 of November 2010, you wrote:
[...]
> the whole logarea is memset to 0 by logarea_init(), and each dequeued
> message is also memset to 0 by log_dequeue(), so it seems normal that
> msg->str value is 0x0, but it's really its address that matters.
Ok, got it. Pointers, memory areas in my debugging session - are looking
good then.
>
> It's not clear to me : are you actually hitting a bug or is it your
> debug session that puzzles you ?
I'm hitting a bug. multipathd dies for me at that strcpy(). Now I think
the bug is strcpy usage instead of memcpy because I'm building with
-O2 -D_FORTIFY_SOURCE=2 which turns on special glibc overflow
detection.
That detection seems to be smart enough to know that &str area is not
a string memory and aborts the program.
Found similar problem discussed here
http://sourceware.org/ml/binutils/2005-11/msg00308.html
glibc aborts the program:
[pid 13432] writev(2, [{"*** ", 4}, {"buffer overflow detected", 24},
{" ***: ", 6}, {"/home/users/arekm/rpm/BUILD/multipath-tools-0.4.9
/multipathd/multipathd", 71}, {" terminated\n", 12}], 5) = 117
same for valgrind:
**13436** *** strcpy_chk: buffer overflow detected ***: program terminated
==13436== at 0x4024997: VALGRIND_PRINTF_BACKTRACE (valgrind.h:4477)
==13436== by 0x40265F8: __strcpy_chk (mc_replace_strmem.c:781)
==13436== by 0x40EDC06: log_enqueue (string3.h:107)
==13436== by 0x40ED68A: log_safe (log_pthread.c:24)
==13436== by 0x40E296A: dlog (debug.c:36)
==13436== by 0x804ECEC: pidfile_create (pidfile.c:37)
==13436== by 0x804E731: main (main.c:1424)
The bug is not visible if I run multipathd in debug mode (-d).
This patch fixes the problem for me by avoiding false positive on strcpy_chk.
Christophe Varoqui [Fri, 5 Nov 2010 07:09:44 +0000 (08:09 +0100)]
Merge branch 'master' of git+ssh:///linux/storage/multipath-tools
Benjamin Marzinski [Fri, 5 Nov 2010 04:59:46 +0000 (23:59 -0500)]
multipath: fix multipath locking
In lock_multipath(), if multipathd fails halfway through locking the path
devices, it doesn't unlock the ones that it already has locked. This
patch fixes that.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
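The all-or-nothing locking described above can be sketched as follows. Everything here is hypothetical (struct path_sketch, try_lock_path, unlock_path, lock_all_paths); the real lock_multipath() works on path file descriptors, which this sketch replaces with simple flags:

```c
#include <stddef.h>

/* Hypothetical path record; "fail" simulates a lock that cannot be taken. */
struct path_sketch {
    int locked;
    int fail;
};

static int try_lock_path(struct path_sketch *p)
{
    if (p->fail)
        return -1;
    p->locked = 1;
    return 0;
}

static void unlock_path(struct path_sketch *p)
{
    p->locked = 0;
}

/* Lock every path, or none: on failure, release the locks already held. */
int lock_all_paths(struct path_sketch *paths, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (try_lock_path(&paths[i]) != 0) {
            while (i > 0)
                unlock_path(&paths[--i]);
            return -1;
        }
    }
    return 0;
}
```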
Konrad Rzeszutek [Wed, 3 Nov 2010 22:13:02 +0000 (23:13 +0100)]
de-alloc the buf variable if we fail on the first do_rtpg call
Benjamin Marzinski [Thu, 2 Sep 2010 08:07:15 +0000 (10:07 +0200)]
Fix for bz #566685.
Multipathd now will reload the device if it notices that the
priorities have changed during the checkerloop.
Benjamin Marzinski [Thu, 2 Sep 2010 07:59:26 +0000 (09:59 +0200)]
Fix for bz #560892.
multipath now prints some warnings if it notices problems
with /etc/multipath.conf.
Malahal Naineni [Mon, 16 Aug 2010 22:57:02 +0000 (15:57 -0700)]
option to multipath to not modify the bindings file
initramfs is mounted read-write causing multipath to update the
initramfs bindings file and name all multipath devices it finds using
friendly names. The actual changes to the file are thrown away as they
are only written to the memory image rather than to the disk image. This
may leave the updated in-memory initramfs bindings file inconsistent
with the actual bindings file in the active root file system image when
devices are added or removed.
In other words, the boot time updated initramfs bindings file may have
'uuid1 map to mpatha' and 'uuid2 map to mpathb', but the active root fs
bindings file may have 'uuid1 map to mpathb' and 'uuid2 map to mpatha'
The option, -B, will not modify the bindings file. It will only use the
bindings file if needed. This option to multipath should be used when
invoked in the initramfs context to avoid the inconsistency.
Signed-off-by: Malahal Naineni (malahal@us.ibm.com)
bmarzins@sourceware.org [Wed, 11 Aug 2010 23:18:45 +0000 (23:18 +0000)]
multipath-tools/multipath multipath.conf.5
CVSROOT: /cvs/dm
Module name: multipath-tools
Branch: RHEL5_FC6
Changes by: bmarzins@sourceware.org 2010-08-11 23:18:43
Modified files:
multipath : multipath.conf.5
Log message:
Fix for bz #599686
Small manpage clarification.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/multipath-tools/multipath/multipath.conf.5.diff?cvsroot=dm&only_with_tag=RHEL5_FC6&r1=1.1.2.3&r2=1.1.2.4
bmarzins@sourceware.org [Mon, 9 Aug 2010 21:35:58 +0000 (21:35 +0000)]
multipath-tools ./multipath.conf.defaults libm ...
CVSROOT: /cvs/dm
Module name: multipath-tools
Branch: RHEL5_FC6
Changes by: bmarzins@sourceware.org 2010-08-09 21:35:58
Modified files:
. : multipath.conf.defaults
libmultipath : hwtable.c
Log message:
Fix for bz565579 New default configurations.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/multipath-tools/multipath.conf.defaults.diff?cvsroot=dm&only_with_tag=RHEL5_FC6&r1=1.5.4.21&r2=1.5.4.22
http://sourceware.org/cgi-bin/cvsweb.cgi/multipath-tools/libmultipath/hwtable.c.diff?cvsroot=dm&only_with_tag=RHEL5_FC6&r1=1.20.2.28&r2=1.20.2.29
Jun'ichi Nomura [Fri, 30 Jul 2010 09:13:14 +0000 (18:13 +0900)]
multipath: add fast_io_fail and dev_loss_tmo config parameters
Hi,
(03/23/10 11:44), Benjamin Marzinski wrote:
> This patch adds two new configuration parameters to multipath.conf,
> fast_io_fail_tmo and dev_loss_tmo which set
>
> /sys/class/fc_remote_ports/rport-<host>:<channel>-<rport_id>/fast_io_fail_tmo and
> /sys/class/fc_remote_ports/rport-<host>:<channel>-<rport_id>/dev_loss_tmo
...
This is a nice feature, but the code uses scsi_id instead of rport_id:
> +sysfs_set_scsi_tmo (struct multipath *mpp)
...
> + vector_foreach_slot(mpp->paths, pp, i) {
> + if (safe_snprintf(attr_path, SYSFS_PATH_SIZE,
> + "/class/fc_remote_ports/rport-%d:%d-%d",
> + pp->sg_id.host_no, pp->sg_id.channel,
> + pp->sg_id.scsi_id)) {
> + condlog(0, "attr_path '/class/fc_remote_ports/rport-%d:%d-%d' too large", pp->sg_id.host_no, pp->sg_id.channel, pp->sg_id.scsi_id);
> + return 1;
> + }
So it sets fast_io_fail_tmo/dev_loss_tmo for the wrong rport.
For example, I have a storage with node_id 0x2000003013842bcb
connected via switch, whose node_id is 0x100000051e09ee30.
When I set 'fast_io_fail_tmo = 8' in multipath.conf,
multipath command sets the timeout like this:
# for f in /sys/class/fc_remote_ports/rport-*/fast_io_fail_tmo; do d=$(dirname $f); echo $(basename $d):$(cat $d/node_name):$(cat $f); done
rport-0:0-0:0x100000051e09ee30:8
rport-0:0-1:0x100000051e09ee30:8
rport-0:0-2:0x2000003013842bcb:off
rport-0:0-3:0x2000003013842bcb:off
rport-1:0-0:0x100000051e09ee30:8
rport-1:0-1:0x100000051e09ee30:8
rport-1:0-2:0x2000003013842bcb:off
rport-1:0-3:0x2000003013842bcb:off
As a result, when a link is down for the storage and fast_io_fail_tmo
has passed, I/O will still be blocked.
Attached is a quick patch for this problem.
With this patch, fast_io_fail_tmo is set like this:
rport-0:0-0:0x100000051e09ee30:8
rport-0:0-1:0x100000051e09ee30:8
rport-0:0-2:0x2000003013842bcb:off
rport-0:0-3:0x2000003013842bcb:off
rport-1:0-0:0x100000051e09ee30:8
rport-1:0-1:0x100000051e09ee30:8
rport-1:0-2:0x2000003013842bcb:off
rport-1:0-3:0x2000003013842bcb:off
Others might have a better idea about resolving rport_id from the target.
Mike, Hannes, any comments?
Thanks,
--
Jun'ichi Nomura, NEC Corporation
rport_id != scsi_id
multipath should find rport_id from the target_id.
Heath Kehoe [Mon, 2 Aug 2010 18:33:59 +0000 (20:33 +0200)]
Build fixes
the -L argument for a library must come before the library itself
(otherwise it's going to link against the library in /usr/lib and
not the one in the build directory)
Hannes Reinecke [Tue, 20 Jul 2010 08:10:29 +0000 (10:10 +0200)]
libmultipath: simplify dm_get_name()
dm_get_name() should just return the device-mapper table
name corresponding to a given uuid. Happily the uuid is
another primary key in the device-mapper hash table, so
we can just use it directly and do not have to loop over
all existing tables.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 20 Jul 2010 08:10:25 +0000 (10:10 +0200)]
Fix typo in coalesce_paths
We allocate space for the alias, but never use it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 20 Jul 2010 08:10:14 +0000 (10:10 +0200)]
libmultipath: always allocate space for alias
We should always allocate space for the alias. This makes
freeing up and allocation tracking far easier.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Christof Schmitt [Thu, 22 Jul 2010 16:36:48 +0000 (18:36 +0200)]
multipath-tools: Initialize pointer passed to get_cmdvec
get_cmdvec can return before the vector argument has been initialized. Fix this
by initializing the pointer before passing it to get_cmdvec. This fixes a
segfault in the interactive mode when hitting the tab key directly on the
command prompt.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Christof Schmitt [Thu, 22 Jul 2010 16:36:47 +0000 (18:36 +0200)]
multipath-tools: Assign correct pointer from REALLOC
Assign the pointer returned from REALLOC to v->slot; this is the memory area to
be changed.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
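A sketch of the pattern this fix is about, under the assumption of a simple vector with a slot array. The struct vector_sketch layout and vector_grow are illustrative and use plain realloc() rather than the project's REALLOC macro:

```c
#include <stdlib.h>

/* Hypothetical vector layout, used only for this sketch. */
struct vector_sketch {
    int allocated;   /* number of slots */
    void **slot;     /* slot array */
};

/* Grow the slot array by one; return 0 on success, -1 on failure. */
int vector_grow(struct vector_sketch *v)
{
    void **tmp = realloc(v->slot, sizeof(void *) * (v->allocated + 1));

    if (!tmp)
        return -1;      /* v->slot is still valid and unchanged */
    v->slot = tmp;      /* the result belongs in v->slot, the area that grew */
    v->slot[v->allocated] = NULL;
    v->allocated++;
    return 0;
}
```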
Christophe Varoqui [Sun, 18 Jul 2010 22:31:02 +0000 (00:31 +0200)]
Reorder final multipathd linking command to satisfy --as-needed ld flag
Some distributions use --as-needed by default, which makes the linker
complain about unresolved dependencies if the libs are positioned
before the object files using their symbols.
Reported by ptchinster (at) archlinux (dot) us
Malahal Naineni [Wed, 23 Jun 2010 02:27:33 +0000 (19:27 -0700)]
Add alias_prefix to get multipath names based on storage type
The current multipath tools use "mpath" prefix for all LUNs when
user_friendly_names is set. It would be nice if the names are generated
based on the storage subsystem. For example, all EMC LUNs would be named
emc_a, emc_b, emc_c etc., and all IBM's SVC LUNs would be named svc_a,
svc_b, svc_c. This patch attempts to do that using only multipath.conf.
Patches can be added to the internal hardware table, if needed.
Signed-off-by: Malahal Naineni (malahal@us.ibm.com)
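The naming scheme in the example (emc_a, emc_b, emc_c) can be sketched like this. The function format_alias is hypothetical, and the rollover from "z" to "aa" is an assumption of this sketch rather than something stated in the message:

```c
#include <string.h>

/*
 * Write "<prefix><letters>" into buf, where index 0 -> "a", 25 -> "z",
 * 26 -> "aa", and so on. Returns 0 on success, -1 if buf is too small.
 */
int format_alias(char *buf, size_t size, const char *prefix, long index)
{
    char letters[16];
    size_t n = 0, plen = strlen(prefix), i;

    if (index < 0)
        return -1;
    do {
        if (n == sizeof(letters))
            return -1;
        letters[n++] = (char)('a' + index % 26);
        index = index / 26 - 1;
    } while (index >= 0);

    if (plen + n + 1 > size)
        return -1;
    memcpy(buf, prefix, plen);
    for (i = 0; i < n; i++)          /* letters were produced in reverse */
        buf[plen + i] = letters[n - 1 - i];
    buf[plen + n] = '\0';
    return 0;
}
```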
Christophe Varoqui [Sat, 22 May 2010 12:01:58 +0000 (14:01 +0200)]
Merge the 'path count' cli command with 'show status'.
Document 'show status' in the multipathd manpage.
Christophe Varoqui [Fri, 21 May 2010 14:41:30 +0000 (16:41 +0200)]
Prioritizers enhancement
1/ add the 'prio_args' config keyword to allow passing arguments
to the getprio function
2/ merge the datacore prioritizer. Adapt the legacy datacore
prioritizer callout to the libprio framework. First use of the
'prio_args'
3/ fix the 'show config' multipathd cli command to display the
prio and prio_args values. Also fix a bunch of other values
affected by the same bug (features, ...).
4/ update docs
5/ remove some heading whitespaces
6/ remove useless prioritizers include files
Benjamin Marzinski [Thu, 20 May 2010 04:00:45 +0000 (23:00 -0500)]
multipath: don't clear daemon setting on reconfigure
When you reconfigure multipathd, it needs to set the daemon flag in the
new config structure, so that the daemon only-code will still work.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 20 May 2010 03:59:57 +0000 (22:59 -0500)]
multipath: close sysfs file after setting value
We need to close the sysfs file after setting its value.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 20 May 2010 03:57:52 +0000 (22:57 -0500)]
multipath: fix fast_io_fail_tmo typo
The name of the sysfs file is actually fast_io_fail_tmo, not fast_io_fail
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Brian King [Wed, 19 May 2010 20:26:24 +0000 (15:26 -0500)]
multipath_tools: Fixup IBM Virtual SCSI hwtable entries
Removes a hwtable entry for an IBM Virtual SCSI vendor/device ID that
is no longer going to be released. Adds a new vendor/device ID for
a new IBM Virtual SCSI vendor/device ID. This is needed since path
switching on this device type is expensive, so we don't want to do
round robin. Additionally, the default health checker (direct IO)
does not work for this device type.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Benjamin Marzinski [Mon, 17 May 2010 19:04:27 +0000 (14:04 -0500)]
multipath: add "count paths" multipathd command
This adds a new multipathd command, "count paths", which returns information in
the format
Paths: <nr_of_paths>
Busy: <True|False>
where "Paths" is the number of monitored paths, and "Busy" is set when
multipathd is currently handling uevents. With this, it is possible to quickly
get the number of paths being monitored, as well as an idea if more paths may
be showing up shortly.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Mon, 17 May 2010 19:03:38 +0000 (14:03 -0500)]
multipath: add udev sync support.
device-mapper is now able to synchronize operations through udev. This patch
allows multipath and kpartx to make use of this feature. If kpartx is run with
"-s", it waits for the partitions to be created before returning. Multipath
will now always wait for the devices to be created before returning.
This feature requires dm_task_set_cookie() which was finalized in device-mapper
version 1.2.38
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 20 May 2010 04:46:34 +0000 (06:46 +0200)]
update redhat multipathd init script
This patch adds the requested force-stop and force-restart to the init
script. It also keeps the init script from printing annoying error messages
while checking the root device.
Moger, Babu [Fri, 9 Apr 2010 20:30:40 +0000 (14:30 -0600)]
multipath_tools: minor rdac message fix and code cleanup
This patch fixes a minor rdac message issue: the rdac message is not passed in one of the path-down cases. It also rearranges the code for better readability.
Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Vijay Chauhan <Vijay.Chauhan@lsi.com>
Moger, Babu [Tue, 13 Apr 2010 15:21:12 +0000 (09:21 -0600)]
multipath_tools: add alias while printing checker_message
This patch adds alias while printing checker messages.
Example of current Checker message.
Apr 11 04:03:53 mymachine multipathd: sde: rdac checker reports path is down
Most of the time "sde" is meaningless when debugging past failures.
This patch adds the alias before the checker message.
Example of the new message:
Apr 12 16:55:54 mymachine multipathd: mpathb: sde - rdac checker reports path is down
Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Vijay Chauhan <Vijay.Chauhan@lsi.com>
Brian King [Fri, 26 Mar 2010 19:28:14 +0000 (14:28 -0500)]
multipath_tools: Add IBM Virtual SCSI ALUA device type to hwtable
Add entry to the hwtable for IBM Power Virtual SCSI ALUA devices
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Benjamin Marzinski [Fri, 26 Mar 2010 22:10:24 +0000 (17:10 -0500)]
multipath: move bindings file location
The current bindings file location (/var/lib/multipath/bindings) can be
problematic, since multipath can start up before /var/lib is mounted in
late boot. In this case, multipath will create its own bindings file which
will be covered up by /var when it is mounted. This means that the device
names that you get on startup might be different from the device names that you
get when you run multipath on a system during normal operation. Since /etc is
always available when multipath starts up in late boot, moving the bindings
file there fixes the problem.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 25 Mar 2010 21:48:42 +0000 (16:48 -0500)]
multipath: fix offset for contained slices.
For contained slices, the offset of the new device should be from the start
of the containing device, which is what you are creating the new device on top
of. It should not be the offset from the start of the entire disk.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
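The arithmetic behind this fix, as a sketch. The function name and the use of sector units are assumptions; the commit itself does not show the code:

```c
/*
 * Offset (in sectors) to use when mapping a slice on top of its containing
 * partition. Both arguments are absolute offsets from the start of the disk.
 */
unsigned long long contained_offset(unsigned long long container_start,
                                    unsigned long long slice_start)
{
    return slice_start - container_start;
}
```

For example, a slice that starts at sector 2112 of the disk, inside a container that starts at sector 2048, is mapped at offset 64 of the container device.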
Benjamin Marzinski [Thu, 25 Mar 2010 17:52:06 +0000 (12:52 -0500)]
multipath: don't let init script stop multipathd for root devices
This patch modifies the redhat init script, so that it doesn't allow
multipathd to be stopped when the root device is on it.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 25 Mar 2010 05:44:17 +0000 (00:44 -0500)]
multipath: patch checker consolidation
This patch does two things. First, it allows the tur checker to retry when it
fails with DID_TRANSPORT_DISRUPTED. Second, it makes both calls to check a path
use get_state, to avoid duplicated code.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 23 Mar 2010 22:04:00 +0000 (17:04 -0500)]
multipath: add queue_without_daemon config parameter
This patch adds a new multipath.conf default parameter, queue_without_daemon.
If this is set to "no", when multipathd stops, queueing will be turned off for
all devices. This is useful for devices that set no_path_retry. If a machine
is shut down while all paths to a device are down, it is possible to hang
waiting for IO to return from the device after multipathd has been stopped.
Without multipathd running, access to the paths cannot be restored, and the
kernel cannot be told to stop queueing IO. Setting queue_without_daemon to "no"
makes multipathd turn off queueing on all devices when it stops, avoiding the
problem.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 23 Mar 2010 19:48:43 +0000 (14:48 -0500)]
multipath: add some default configurations.
This patch adds some default configurations that have been requested.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Christophe Varoqui [Thu, 25 Mar 2010 19:16:23 +0000 (20:16 +0100)]
Correct whitespace before tabs warnings
Benjamin Marzinski [Tue, 23 Mar 2010 02:44:39 +0000 (21:44 -0500)]
multipath: add fast_io_fail and dev_loss_tmo config parameters
This patch adds two new configuration parameters to multipath.conf,
fast_io_fail_tmo and dev_loss_tmo which set
/sys/class/fc_remote_ports/rport-<host>:<channel>-<rport_id>/fast_io_fail_tmo and
/sys/class/fc_remote_ports/rport-<host>:<channel>-<rport_id>/dev_loss_tmo
for all the capable paths in a multipath device.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Nikanth Karthikesan [Wed, 17 Mar 2010 07:14:25 +0000 (12:44 +0530)]
multipath: display average priority as group priority
Display avg priority as group priority
Now average priority is used as path group priority, instead of sum of
priorities of the paths. But while displaying group priority, sum is
being displayed. Change it to print the average priority.
When there are no enabled paths, print 0 as priority.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
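A sketch of the displayed value, assuming a hypothetical struct prio_path and integer division (the rounding behaviour is not stated in the message):

```c
#include <stddef.h>

/* Hypothetical path record for this sketch. */
struct prio_path {
    int priority;
    int enabled;
};

/* Average priority of the enabled paths, or 0 when none are enabled. */
int group_priority(const struct prio_path *paths, size_t n)
{
    long sum = 0;
    size_t enabled = 0, i;

    for (i = 0; i < n; i++) {
        if (paths[i].enabled) {
            sum += paths[i].priority;
            enabled++;
        }
    }
    return enabled ? (int)(sum / (long)enabled) : 0;
}
```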
Colin Watson [Fri, 12 Feb 2010 12:18:59 +0000 (12:18 +0000)]
Honour ALUA preference indicator
SPC defines the preference indicator (bit 7 of the first byte returned
by REPORT TARGET PORT GROUPS) as indicating a preferred primary target
port group, and says that applications may use it to influence path
selection. Choose TPGs with this bit set over TPGs with it unset.
This fixes failback handling with the Intel Modular Server.
Signed-off-by: Yingying Zhao <yingying.zhao@intel.com>
Signed-off-by: Colin Watson <cjwatson@canonical.com>
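Testing the preference indicator amounts to a single bit mask. The sketch below uses hypothetical helper names (tpg_is_preferred, tpg_better) and takes the "first byte" wording from the message above at face value:

```c
#include <stdint.h>

/* Bit 7 of the first byte is the preference indicator. */
int tpg_is_preferred(uint8_t first_byte)
{
    return (first_byte & 0x80) != 0;
}

/* Nonzero if group a should be chosen over group b on preference alone. */
int tpg_better(uint8_t a_first_byte, uint8_t b_first_byte)
{
    return tpg_is_preferred(a_first_byte) && !tpg_is_preferred(b_first_byte);
}
```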
Christophe Varoqui [Sat, 6 Feb 2010 22:21:01 +0000 (23:21 +0100)]
Mail address change
Rumko [Fri, 5 Feb 2010 20:02:23 +0000 (21:02 +0100)]
Latest git -master is not compilable
Hi!
In latest git -master on line 1443 of multipathd/main.c lock() is called on
exit_mutex, but since exit_mutex is a pthread_mutex_t, pthread_mutex_lock()
is needed.
Attached is the one-liner patch, tested it on a gentoo machine and seems to be
working.
--
Regards,
Rumko
From a6bf54d588c2d0c9d3a97541bcb7b605fd1f3ae0 Mon Sep 17 00:00:00 2001
From: Rumko <rumcic@gmail.com>
Date: Fri, 5 Feb 2010 20:59:21 +0100
Subject: [PATCH] Use pthread_mutex_lock() instead of lock() since we are dealing with a mutex directly.
Hannes Reinecke [Wed, 3 Feb 2010 13:22:22 +0000 (14:22 +0100)]
Update path_offline() to return device status
A SCSI device can have far more states than just 'offline' and
'running'. In fact, any device _not_ in state 'running' is
inaccessible to I/O, so running a path checker on these devices
will cause the checker to be delayed and hence stall the entire
daemon.
This patch updates the path_offline() function to return the
actual device state. Path checkers will only be run if the
state is PATH_UP. A 'blocked' device state will be translated
into PATH_PENDING, causing the checkerloop to skip this device
and recheck as soon as possible.
Signed-off-by: Hannes Reinecke <hare@suse.de>
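The translation described above can be sketched as a small mapping function. The enum values and the function name are hypothetical, and treating every state other than 'running' and 'blocked' as down is an assumption of this sketch:

```c
#include <string.h>

/* Hypothetical state values; the real constants live in libmultipath. */
enum path_state_sketch { SK_PATH_DOWN, SK_PATH_UP, SK_PATH_PENDING };

/* Translate a SCSI device state string into a path state. */
enum path_state_sketch device_state_to_path_state(const char *state)
{
    if (strcmp(state, "running") == 0)
        return SK_PATH_UP;          /* only state in which checkers run */
    if (strcmp(state, "blocked") == 0)
        return SK_PATH_PENDING;     /* skip now, recheck as soon as possible */
    return SK_PATH_DOWN;            /* assumption: everything else is down */
}
```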
Hannes Reinecke [Wed, 3 Feb 2010 14:26:11 +0000 (15:26 +0100)]
[libmultipath] the entry for EMC Envista is broken
It is referring to an unknown field 'getprio'.
Fix that up and use 'prio_name = DEFAULT_PRIO' instead
Alex Zeffertt [Wed, 3 Feb 2010 12:29:37 +0000 (13:29 +0100)]
[multipathd] Fix pthread bug in multipath-tools.
You should lock the mutex before doing a pthread_cond_wait otherwise
undefined results occur. In fact we get away with this with glibc,
but with uclibc it causes a segfault.
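The usual shape of a correct condition wait is sketched below. The variable and function names are invented for illustration and do not come from multipathd:

```c
#include <pthread.h>

static pthread_mutex_t ready_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int ready;

/* Block until another thread calls mark_ready(). */
void wait_until_ready(void)
{
    pthread_mutex_lock(&ready_mutex);       /* must hold the mutex first */
    while (!ready)                          /* guards against spurious wakeups */
        pthread_cond_wait(&ready_cond, &ready_mutex);
    pthread_mutex_unlock(&ready_mutex);
}

/* Set the flag and wake one waiter. */
void mark_ready(void)
{
    pthread_mutex_lock(&ready_mutex);
    ready = 1;
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&ready_mutex);
}
```

pthread_cond_wait() releases the mutex while it sleeps and reacquires it before returning, which is why the caller has to own the mutex on entry.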
Benjamin Marzinski [Tue, 26 Jan 2010 13:01:49 +0000 (14:01 +0100)]
Add hardware defaults for EMC Invista
Christophe Varoqui [Fri, 22 Jan 2010 17:45:39 +0000 (18:45 +0100)]
Add Nexenta COMSTAR hardware defaults.
Testing was sponsored by Alyseo SARL.
Christophe Varoqui [Fri, 22 Jan 2010 11:26:39 +0000 (12:26 +0100)]
Add checks before using conf->xxx now that they can be null
Christophe Varoqui [Fri, 22 Jan 2010 09:51:29 +0000 (10:51 +0100)]
Add %z path wildcard to display path serial
Christophe Varoqui [Thu, 21 Jan 2010 22:46:19 +0000 (23:46 +0100)]
[lib] don't pretend config file has setup parameters
we already have fallbacks coming from propsel.c functions.
This change makes 'multipath -v3' correctly report what set
the value of a parameter (mpe, hwe, cf or internal)
Christophe Varoqui [Thu, 21 Jan 2010 22:26:25 +0000 (23:26 +0100)]
[git] Ignore libmultipath/libmultipath.so.0
Guido Günther [Sun, 10 Jan 2010 17:14:59 +0000 (18:14 +0100)]
Dots are special in groff
so don't start a line with it or the rest of the line will be dropped.
Benjamin Marzinski [Thu, 3 Dec 2009 01:45:56 +0000 (19:45 -0600)]
multipath-tools: fix soname
I assume that the soname wasn't supposed to have a 'd' at the end.
-Ben
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Christophe Varoqui [Sun, 22 Nov 2009 00:16:00 +0000 (01:16 +0100)]
[multipathd] give ev_add_path() separated failure messages
when the multipath already exists and
1/ new path size is 0
2/ new path size is different than the multipath known size
as per Chandra Seetharaman recommendation.
Chauhan, Vijay [Fri, 20 Nov 2009 15:08:18 +0000 (20:38 +0530)]
multipath-tools: multipath should allow only path with valid size to get added in the map
Hi,
If READ_CAPACITY fails during device discovery, the sd device gets attached with device size 0. Currently multipath discovers these paths and adds them to the map. The RDAC path checker sends an inquiry on each path to check path status, which eventually marks this path as up. If this path is from the owning controller, then a mode select will be issued to switch the pathgroup. But any I/O sent to this path (a path with size 0) will eventually fail in sd_prep_fn due to the incorrect device size, resulting in ping-pong between pathgroups. We should only allow valid paths to be added to the map. The patch below checks two cases before adding paths:
1) device size of path is not 0
2) there is no mismatch between mpp size and new path size.
Thanks,
Vijay
----
multipath should only add paths with a valid size to the map. If there is a mismatch between the map size and the path size, the path should not be added. This patch also checks that the device size is not 0 before adding a path. If READ_CAPACITY fails during device discovery, the sd device gets attached with device size 0. multipath should not allow such a device to be added to the map.
Signed-off-by: Vijay Chauhan <vijay.chauhan@lsi.com>
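The two checks can be sketched as one predicate. The function name path_size_ok is hypothetical, and treating a map size of 0 as "no map yet" is an assumption of this sketch:

```c
/*
 * Return 1 if a path of path_size may join a map of map_size, else 0.
 * Sizes are in sectors; a map_size of 0 is taken to mean "no map yet"
 * (an assumption of this sketch).
 */
int path_size_ok(unsigned long long map_size, unsigned long long path_size)
{
    if (path_size == 0)
        return 0;                    /* e.g. READ_CAPACITY failed */
    if (map_size != 0 && map_size != path_size)
        return 0;                    /* size mismatch with the map */
    return 1;
}
```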
Guido Günther [Sat, 21 Nov 2009 20:29:52 +0000 (21:29 +0100)]
multipath-tools: Use current name of the device node ($name)
instead of the kernel's name ($kernel). Otherwise we might end up
looking at a wrong or nonexistent node.
Guido Günther [Sat, 21 Nov 2009 17:10:10 +0000 (18:10 +0100)]
multipath-tools: fix udev rule for dmraid
Hi,
attached patch fixes the udev rule for dmraid by not abusing the mpath
prefix. It also drops the superfluous path from kpartx_id.
Cheers,
-- Guido
From: Hannes Reinecke <hare@suse.de>
Date: Tue, 24 Jun 2008 16:38:37 +0200
Subject: [PATCH] Fix udev rules for dmraid
The kpartx_id program is located under /lib/udev, so we don't need to
call it with the full pathname.
And we should also create persistent links for dmraid tables.
Guido Günther [Sat, 21 Nov 2009 17:19:55 +0000 (18:19 +0100)]
multipath-tools: add library dependencies
Hi,
attached patch adds dependent libraries when building the shared lib.
This allows other tools like dpkg-shlibdeps to deduce the needed
dependencies automatically.
Cheers,
-- Guido
From: Guido Günther <agx@sigxcpu.org>
Date: Sun, 30 Aug 2009 14:18:21 +0200
Subject: [PATCH] add library dependencies
Guido Günther [Sat, 21 Nov 2009 17:17:20 +0000 (18:17 +0100)]
multipath-tools: add a soname to the library
Hi,
attached patch adds a fake soname to the created lib making tools such
as lintian happy. I can keep this debian specific if need be but having
a soname certainly won't hurt.
Cheers,
-- Guido
From: Guido Günther <agx@sigxcpu.org>
Date: Sun, 30 Aug 2009 14:30:34 +0200
Subject: [PATCH] set a soname
Guido Günther [Sat, 21 Nov 2009 17:15:25 +0000 (18:15 +0100)]
multipath-tools: check header file instead of installed lib
Hi,
attached patch checks the header file instead of an installed lib for
dm_task_struct. Since distros have this lib at different paths the
check should be more reliable.
Cheers,
-- Guido
From: Guido Günther <agx@sigxcpu.org>
Date: Sun, 30 Aug 2009 13:38:55 +0200
Subject: [PATCH] check header file for definition of dm_task_no_flush
instead of checking the so for the symbol (which seems to be hard to
find).
Guido Günther [Sat, 21 Nov 2009 17:12:26 +0000 (18:12 +0100)]
multipath-tools: fix path to FAQ
Hi,
attached patch fixes the URL to the FAQ.
Cheers,
-- Guido
From: Vincent McIntyre <Vincent.McIntyre@csiro.au>
Date: Fri, 9 Jan 2009 18:18:46 +0100
Subject: [PATCH] fix URL to FAQ
Benjamin Marzinski [Tue, 20 Oct 2009 19:49:47 +0000 (14:49 -0500)]
multipath-tools: Default configuration changes
Add support for some more MSA arrays.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 20 Oct 2009 19:48:54 +0000 (14:48 -0500)]
multipath-tools: Minor doc fix
The polling_interval increases caused some confusion, so I'm adding it
to the documentation.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 20 Oct 2009 18:20:42 +0000 (13:20 -0500)]
multipath-tools: Change default configs to look for "hp_sw" not "hp-sw"
The kernel hp_sw handler now wants the hwhandler string to be "hp_sw", not
"hp-sw". This fixes the default configs to match.
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Chauhan, Vijay [Tue, 20 Oct 2009 13:09:15 +0000 (18:39 +0530)]
multipath-tools: rdac path checker leads to I/O hang when volumes are unmapped from storage.
Hi,
We are seeing an I/O hang when volumes (configured with the rdac path checker) are unmapped from the storage. The expected behavior is that I/O should fail. Please see the syslog snippet below:
Oct 15 14:40:43 linux kernel: sd 6:0:0:0: queueing MODE_SELECT command.
Oct 15 14:40:43 linux kernel: sd 6:0:0:0: MODE_SELECT failed with sense 0x59100.
Oct 15 14:40:43 linux kernel: end_request: I/O error, dev sdh, sector 18509824
Oct 15 14:40:43 linux kernel: device-mapper: multipath: Failing path 8:112.
Oct 15 14:40:43 linux kernel: end_request: I/O error, dev sdh, sector 0
Oct 15 14:40:43 linux kernel: end_request: I/O error, dev sdh, sector 18512896
Oct 15 14:40:43 linux kernel: end_request: I/O error, dev sdh, sector 18511872
Oct 15 14:40:43 linux kernel: end_request: I/O error, dev sdh, sector 18510848
Oct 15 14:40:43 linux multipathd: 8:112: mark as failed
Oct 15 14:40:43 linux multipathd: 3600a0b800029ea52000097bc4acde51e: Entering recovery mode: max_retries=30
Oct 15 14:40:44 linux multipathd: 3600a0b800029eb0a0000f2af4acde4d1: queue_if_no_path enabled
Oct 15 14:40:44 linux multipathd: 3600a0b800029eb0a0000f2af4acde4d1: Recovered to normal mode
Oct 15 14:40:44 linux kernel: sd 5:0:0:1: queueing MODE_SELECT command.
Oct 15 14:40:44 linux kernel: sd 5:0:0:1: MODE_SELECT failed with sense 0x59100.
Oct 15 14:40:44 linux kernel: end_request: I/O error, dev sdc, sector 0
Oct 15 14:40:44 linux kernel: device-mapper: multipath: Failing path 8:32.
Oct 15 14:40:44 linux kernel: end_request: I/O error, dev sdc, sector 16089088
Oct 15 14:40:44 linux kernel: end_request: I/O error, dev sdc, sector 16090112
Oct 15 14:40:44 linux kernel: end_request: I/O error, dev sdc, sector 16091136
Oct 15 14:40:44 linux kernel: end_request: I/O error, dev sdc, sector 16092160
Oct 15 14:40:44 linux multipathd: 8:32: mark as failed
Below is the patch that fixes this issue. When devices are unmapped from storage, the rdac path checker sets the path state for those devices to ghost. As a result, dm issues a mode select to fail over the path group, which fails with 0x59100 and eventually ends in a ping-pong between path groups, resulting in the I/O hang. The rdac path checker needs to check whether a device is not connected and, if so, mark the path as failed. This patch adds a check of the Peripheral Qualifier (PQ) and Peripheral Device Type (PDT) fields of the Inquiry data and fails the path if either 1) PQ is set to 0x1 or 2) PQ is set to 0x11 and PDT is set to 0x1F.
Signed-off-by: Vijay Chauhan <vijay.chauhan@lsi.com>
Reviewed-by: Babu Moger <babu.moger@lsi.com>
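
A minimal sketch of the PQ/PDT check described in the message above, not the patch itself. It assumes the standard SCSI INQUIRY layout, where byte 0 carries the Peripheral Qualifier in bits 7-5 and the Peripheral Device Type in bits 4-0. The function name rdac_lun_unmapped is hypothetical, and reading "PQ 0x11" as the binary qualifier 011b is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when byte 0 of the INQUIRY data says
 * that no usable LUN is behind this path, 0 otherwise. */
int rdac_lun_unmapped(const uint8_t *inq)
{
	assert(inq != NULL);

	uint8_t pq  = (inq[0] >> 5) & 0x07;  /* Peripheral Qualifier, bits 7-5 */
	uint8_t pdt = inq[0] & 0x1f;         /* Peripheral Device Type, bits 4-0 */

	if (pq == 0x1)                       /* case 1: device not connected */
		return 1;
	if (pq == 0x3 && pdt == 0x1f)        /* case 2: assumed reading of "PQ 0x11" as 011b */
		return 1;
	return 0;
}
```

A checker built along these lines would report the path as failed, rather than ghost, whenever the helper returns 1.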
Benjamin Marzinski [Mon, 28 Sep 2009 17:19:00 +0000 (12:19 -0500)]
multipath-tools: Fix dry-run output
When multipath checks whether or not the multipath device needs to be renamed,
it only does the check if dry-run isn't selected. This means that you will
instead see all your renames as creates during a dry-run. The attached patch
fixes this.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Christophe Varoqui [Fri, 2 Oct 2009 20:06:21 +0000 (22:06 +0200)]
[lib] remove spurious and obsolete kdev_t.h include in directio checker
Benjamin Marzinski [Fri, 2 Oct 2009 20:01:24 +0000 (22:01 +0200)]
[kpartx] make kpartx deal with more than 256 minor numbers
Fix for bz #526550. Fix kpartx MAKEDEV macro so it can deal with more
than 256 minor numbers.
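
For illustration only, here is a sketch of a device-number encoding that keeps minors above 255 intact. It assumes the Linux layout for majors below 4096 (low 8 minor bits in bits 0-7, major in bits 8-19, remaining minor bits in bits 20-31); the helper names are hypothetical and this is not the kpartx macro itself.

```c
#include <assert.h>
#include <stdint.h>

/* Pack major/minor so that minor bits above bit 7 are not lost. */
uint32_t encode_dev(uint32_t major, uint32_t minor)
{
	assert(major < (1u << 12) && minor < (1u << 20));
	return (minor & 0xffu) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Major number lives in bits 8-19. */
uint32_t dev_major(uint32_t dev)
{
	return (dev >> 8) & 0xfffu;
}

/* Minor number is split: bits 0-7 stay put, bits 8-19 sit in bits 20-31. */
uint32_t dev_minor(uint32_t dev)
{
	return (dev & 0xffu) | ((dev >> 12) & 0xfff00u);
}
```

An encoding that only computes (major << 8) | minor lets any minor above 255 spill into the major field, which is the kind of failure a 256-minor limit produces.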
Kiyoshi Ueda [Fri, 18 Sep 2009 01:37:53 +0000 (10:37 +0900)]
queue-length/service-time path selectors map parser fix
Actual device configuration seems to be working fine, but getting
information from the configured device seems to be failing due to a
table parsing problem in disassemble_map().
I guess the attached patch works around the problem. Please try it.
Please note that this patch may *NOT* be a complete fix.
Some more code for the new dynamic load balancers may be needed.
Benjamin Marzinski [Wed, 2 Sep 2009 19:40:17 +0000 (14:40 -0500)]
multipath-tools: Fix rtpg buffer length calculation
Since you use scsi_buflen to allocate the correct size for the
SCSI RTPG request buffer, you need to have the "+4" in its
calculation, not just in the check to see if the request buffer is
already big enough. Otherwise, you'll fail the check, but you
won't allocate a new buffer that is big enough.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
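
A hedged sketch of the point above, assuming the RTPG response starts with a 4-byte big-endian length that counts only the data following it. The function name rtpg_grow_buffer and its return convention are hypothetical; only scsi_buflen and the "+4" come from the message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Grow *buf so it can hold the whole response.
 * Returns 0 if it was already big enough, 1 if it was grown, -1 on error. */
int rtpg_grow_buffer(uint8_t **buf, size_t *buflen)
{
	assert(buf != NULL && *buf != NULL && buflen != NULL && *buflen >= 4);

	size_t data_len = ((size_t)(*buf)[0] << 24) | ((size_t)(*buf)[1] << 16) |
			  ((size_t)(*buf)[2] << 8) | (size_t)(*buf)[3];
	/* The "+4" for the length header is part of the size itself, so the
	 * check and the allocation below both see the same number. */
	size_t scsi_buflen = data_len + 4;

	if (*buflen >= scsi_buflen)
		return 0;

	uint8_t *bigger = realloc(*buf, scsi_buflen);
	if (bigger == NULL)
		return -1;
	*buf = bigger;
	*buflen = scsi_buflen;
	return 1;
}
```

If the "+4" appeared only in the comparison, the buffer would be reallocated to data_len bytes and fail the same check on the next pass.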
Ritesh Raj Sarraf [Mon, 31 Aug 2009 05:12:41 +0000 (10:42 +0530)]
Update multipathd manpage
New commands were added to the multipathd -k command mode.
Document them in the manpage.
Signed-off-by: Ritesh Raj Sarraf <rsarraf@netapp.com>
Benjamin Marzinski [Mon, 31 Aug 2009 20:38:42 +0000 (22:38 +0200)]
[doc] comment to not set the alias config option to mpath<n>
... since it will conflict with user_friendly_names.
(redhat bz #481239)
Bryn M.Reeves [Sun, 30 Aug 2009 20:36:22 +0000 (22:36 +0200)]
[FAQ] document directio error when aio-max-nr is exhausted
Charlie Brady [Sun, 30 Aug 2009 20:30:47 +0000 (22:30 +0200)]
[lib] Add default config for Sun StorageTek 2500 and 2530
Yanqing Liu [Sun, 30 Aug 2009 20:27:50 +0000 (22:27 +0200)]
[lib] Add Dell 32xx/i support into hardware table
Benjamin Marzinski [Mon, 3 Aug 2009 22:03:22 +0000 (17:03 -0500)]
multipath-tools: miscellaneous code cleanups
io_getevents can return < 0 if it is interrupted, but it doesn't set errno.
This patch sets errno to zero first to avoid printing garbage. Also the
log_thread and uevq_thread functions need to return NULL to avoid compiler
warnings.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Mon, 3 Aug 2009 22:02:29 +0000 (17:02 -0500)]
multipath-tools: install libraries to /lib64 where appropriate
Multipath currently installs all of its libraries to /lib, even on 64-bit
machines with a /lib64 directory. With this patch, multipath will install
the libraries under /lib64 if it exists. This can be overridden by running
# make LIB=<lib>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Mon, 3 Aug 2009 22:01:33 +0000 (17:01 -0500)]
multipath-tools: uninstall libraries correctly
The uninstall action for checker libraries doesn't work. Also, the uninstall
action for the prioritizer libraries runs the risk of uninstalling something
that we didn't install. This patch changes them to correctly uninstall the
files listed in LIBS.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Mon, 3 Aug 2009 22:00:03 +0000 (17:00 -0500)]
multipath-tools: Log all messages
When the log thread pulls the last message off the buffer, it sets
la->empty. However, then it returns la->empty, which means that the log is
empty, instead of 0, which means that it found a message. This means that
multipathd is not logging the last message in the buffer when the log thread
runs. This patch changes the return code, so that multipathd logs all the
messages in the buffer.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
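
The return-code fix above can be sketched roughly as follows. The structure and function names are hypothetical stand-ins; only the la->empty flag and the meaning of the return values (0 for "found a message", nonzero for "log is empty") come from the message.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_SLOTS 4

/* Hypothetical, simplified log area: a small ring of message pointers. */
struct logarea_sketch {
	const char *msgs[LOG_SLOTS];
	int head;   /* next slot to read */
	int count;  /* messages still queued */
	int empty;  /* 1 when nothing is queued */
};

/* Returns 0 and stores a message in *out when one was found,
 * 1 when the log was already empty on entry. */
int log_dequeue_sketch(struct logarea_sketch *la, const char **out)
{
	assert(la != NULL && out != NULL);

	if (la->empty)
		return 1;

	*out = la->msgs[la->head];
	la->head = (la->head + 1) % LOG_SLOTS;
	if (--la->count == 0)
		la->empty = 1;  /* remembered for the next call */

	/* Returning la->empty here would report "empty" for the last
	 * message even though *out was just filled in. */
	return 0;
}
```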
Benjamin Marzinski [Mon, 3 Aug 2009 21:59:05 +0000 (16:59 -0500)]
multipath-tools: Fix uevent handling code
Multipathd wasn't setting buflen when it read in a uevent message. This was
causing buflen to be used uninitialized, and would often keep multipathd from
processing uevents. This patch correctly initializes buflen to the size of the
buffer received.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Christophe Varoqui [Tue, 4 Aug 2009 21:29:29 +0000 (23:29 +0200)]
[lib] correct tab-before-whitespace errors introduced by the previous commit
Chandra Seetharaman [Thu, 30 Jul 2009 20:13:55 +0000 (13:13 -0700)]
Use Average path priority value for path switching
Hi Christophe,
I submitted this patch on Jul 2
(http://marc.info/?l=dm-devel&m=124658334721911&w=2). Resending it.
Only change is a field name from up_paths to enabled_paths.
Hi Hannes,
Need an ACK from you :-).
regards,
chandra
-----------------------------------------------------------------------
Failback happens only when the sum of priorities of all paths
(on the higher priority path group) is greater than the sum
of priorities of all paths on the lower priority path group.
This leads to problems when there is more than one path
in each of the path groups, and the sum of the priorities of all
paths in the lower priority path group is greater than the
priority of a single high priority path.
This patch fixes the problem by using average priority of
the path group to decide on which path group to switch over.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
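
To make the difference concrete, here is a small sketch comparing path groups by average priority. The struct and function names are hypothetical; only the enabled_paths field name is taken from the message, and priorities are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified path group. */
struct pathgroup_sketch {
	int priority;       /* sum of the priorities of the enabled paths */
	int enabled_paths;  /* number of enabled paths in the group */
};

/* Average priority of the enabled paths, 0 for a group with none. */
int group_score(const struct pathgroup_sketch *pg)
{
	assert(pg != NULL);
	return pg->enabled_paths ? pg->priority / pg->enabled_paths : 0;
}

/* Index of the group with the highest average priority, or -1 if no
 * group has an enabled path. The first group wins ties. */
int select_best_group(const struct pathgroup_sketch *pgs, size_t n)
{
	assert(pgs != NULL || n == 0);

	int best = -1;
	int best_score = -1;
	for (size_t i = 0; i < n; i++) {
		int score = group_score(&pgs[i]);
		if (pgs[i].enabled_paths > 0 && score > best_score) {
			best = (int)i;
			best_score = score;
		}
	}
	return best;
}
```

With one enabled path of priority 50 in the first group and two paths of priority 30 in the second, the sums are 50 and 60, so a sum-based comparison prefers the second group, while the averages (50 and 30) prefer the first.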
Ritesh Raj Sarraf [Wed, 29 Jul 2009 12:37:03 +0000 (18:07 +0530)]
Fix minor error in multipath.conf manpage
Signed-off-by: Ritesh Raj Sarraf <rsarraf@netapp.com>
Chauhan, Vijay [Tue, 28 Jul 2009 12:51:13 +0000 (18:21 +0530)]
multipath-tools: no_path_retry for time based queuing is queuing I/O forever
Even though no_path_retry is set for time based queuing (i.e. no_path_retry <N>), I/O is getting queued forever. During an all-paths-failure condition, setup_feature() resets no_path_retry of the multipath structure to NO_PATH_RETRY_QUEUE, which queues I/O forever. This patch skips that reset unless no_path_retry is set to queue.
Signed-off-by: Vijay Chauhan <vijay.chauhan@lsi.com>
Stefan Weinhuber [Tue, 7 Jul 2009 21:51:29 +0000 (23:51 +0200)]
[kpartx] add DASD large volume support
With kernel 2.6.30 the DASD device driver supports devices with more
than 65520 cylinders. As the traditional record layouts and hardware
interfaces only allow for 16-bit cylinder values, the new larger
cylinder addresses have to be partially encoded into the head part
of a cylinder/head address. To make things complicated, old kernels
will recognize a large volume device, but with a maximum of 65535
cylinders, so a large volume that has been formatted with old tools
will only be partially formatted. To handle these issues our disk
layouts and partition detection code had to be extended, and to use
large volumes with the multipath tools, the DASD partition detection
code in kpartx needs to be extended as well.
compatible disk layout (VOL1 label):
We use the same address encoding as the hardware interfaces.
To prevent old tools and kernels from misinterpreting the encoded
partition sizes, the new VTOC entries have the format number 8
instead of 1.
linux disk layout (LNX1 label):
Here we will still create one partition for the whole disk.
To make sure that the whole disk has been formatted, large volumes
use a new version of the disk label, which contains the number of
formatted blocks. If the disk contains an old volume label, we know
it was formatted with the number of cylinders as reported by the
HDIO_GETGEO ioctl.
CMS disk layout (CMS1 label):
Already contains the number of formatted blocks in the label, we
just have to use it.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
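
A rough sketch of the address encoding described above. The exact bit layout is an assumption not spelled out in the message: the head number is taken to occupy the low 4 bits of the 16-bit head field, with cylinder bits 16-27 stored in its upper 12 bits. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cylinder/head address with two 16-bit fields. */
struct cchh_sketch {
	uint16_t cc;  /* low 16 bits of the cylinder number */
	uint16_t hh;  /* assumed: cylinder bits 16-27 in bits 4-15, head in bits 0-3 */
};

/* Encode a large-volume cylinder number and a head into the address. */
struct cchh_sketch cchh_encode(uint32_t cyl, uint8_t head)
{
	assert(cyl < (1u << 28) && head < 16);

	struct cchh_sketch a;
	a.cc = (uint16_t)(cyl & 0xffffu);
	a.hh = (uint16_t)(((cyl >> 16) << 4) | head);
	return a;
}

/* Recover the full cylinder number from both fields. */
uint32_t cchh_cylinder(struct cchh_sketch a)
{
	return ((uint32_t)(a.hh & 0xfff0u) << 12) | a.cc;
}

/* Recover the head number from the low 4 bits of the head field. */
uint8_t cchh_head(struct cchh_sketch a)
{
	return (uint8_t)(a.hh & 0x000fu);
}
```

Under this assumed layout, addresses on volumes with fewer than 65536 cylinders decode to the same values as before, because the upper bits of the head field are zero.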
Charlie Brady [Wed, 24 Jun 2009 22:24:00 +0000 (00:24 +0200)]
[lib] add defaults for SUN/LSI LCSM100_I
same as LCSM100_F
Charlie Brady [Wed, 24 Jun 2009 22:06:03 +0000 (00:06 +0200)]
[lib] rdac checker message fix
When unplugging a link to the controller, the path is seen as down,
but it is then reported as down again when the link is restored.
Here's a sample log:
Jun 23 14:12:25 sun4150node1 iscsid: connection1:0 is operational after recovery (4 attempts)
Jun 23 14:13:10 sun4150node1 kernel: ping timeout of 10 secs expired, last rx 172050, last ping 177050, now 187050
Jun 23 14:13:10 sun4150node1 kernel: connection1:0: iscsi: detected conn error (1011)
Jun 23 14:13:11 sun4150node1 multipathd: sdb: rdac checker reports path is down
Jun 23 14:13:11 sun4150node1 multipathd: checker failed path 8:16 in map mpath0
Jun 23 14:13:11 sun4150node1 kernel: device-mapper: multipath: Failing path 8:16.
Jun 23 14:13:11 sun4150node1 multipathd: mpath0: remaining active paths: 1
Jun 23 14:13:11 sun4150node1 multipathd: dm-2: add map (uevent)
Jun 23 14:13:11 sun4150node1 multipathd: dm-2: devmap already registered
Jun 23 14:13:12 sun4150node1 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Jun 23 14:13:37 sun4150node1 multipathd: sdb: rdac checker reports path is down
Jun 23 14:13:37 sun4150node1 multipathd: 8:16: reinstated
Jun 23 14:13:37 sun4150node1 multipathd: mpath0: remaining active paths: 2
Notice the first message at 14:13:37.
Brian King [Thu, 4 Jun 2009 18:10:38 +0000 (13:10 -0500)]
multipath_tools: Add IBM Virtual SCSI to hwtable
Add entry to the hwtable for IBM Power Virtual SCSI devices.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Benjamin Marzinski [Thu, 14 May 2009 04:38:28 +0000 (23:38 -0500)]
multipath-tools: Improvement to max_fds
Setting max_fds to unlimited doesn't actually work. In the kernel, there is a
fixed limit to the maximum number of open fds a process can have. If you try
to set max_fds to greater than this, it fails. This patch replaces the special
value "unlimited" with a new special value, "max". If you set max_fds to "max",
multipath will use the actual system limit, which it looks up from
/proc/sys/fs/nr_open.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
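
As a sketch of how the "max" value could be resolved, under the assumption that the caller opens /proc/sys/fs/nr_open and passes the stream in. The function name resolve_max_fds and its error convention are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Resolve a max_fds setting to a number of file descriptors.
 * "max" is looked up from the nr_open stream; anything else must be a
 * positive decimal number. Returns -1 on any error. */
long resolve_max_fds(const char *value, FILE *nr_open)
{
	assert(value != NULL);

	if (strcmp(value, "max") == 0) {
		long limit;
		if (nr_open == NULL || fscanf(nr_open, "%ld", &limit) != 1 || limit <= 0)
			return -1;
		return limit;
	}

	char *end;
	long n = strtol(value, &end, 10);
	if (end == value || *end != '\0' || n <= 0)
		return -1;
	return n;
}
```

A caller would typically pass the result of fopen("/proc/sys/fs/nr_open", "r") and then hand the returned number to setrlimit() with RLIMIT_NOFILE. The old "unlimited" keyword is simply rejected in this sketch.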
Benjamin Marzinski [Thu, 30 Apr 2009 16:31:37 +0000 (11:31 -0500)]
Add uid, gid, and mode config attributes
This adds the ability to set uid, gid, and mode for the multipath device
nodes. Also, kpartx created device nodes will inherit the uid, gid, and
mode of their parent device. These attributes can be set in either the
defaults or the multipaths section of /etc/multipath.conf
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Mike Snitzer [Wed, 29 Apr 2009 19:26:32 +0000 (15:26 -0400)]
multipathd: restrict /var/run/multipathd.sock permissions further
Use a more restrictive umask for /var/run/multipathd.sock.
Group and Other do not need to access the socket.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Mike Snitzer [Wed, 29 Apr 2009 19:26:16 +0000 (15:26 -0400)]
fix small issues in cli_handlers
- properly check cli_list_wildcards()'s MALLOC returned pointer
- add missing newline to "blacklisted" reply
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Mike Snitzer [Wed, 29 Apr 2009 19:25:42 +0000 (15:25 -0400)]
cleanup various MALLOC/REALLOC callers
- alloc_hwe and alloc_mpe should've been used
- MALLOC and REALLOC returned pointer must be checked before use
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Mike Snitzer [Wed, 29 Apr 2009 19:25:09 +0000 (15:25 -0400)]
do not allow relative path names to be added to the pathvec
CVE-2009-0115 taught us that such paths should not be tolerated
Signed-off-by: Mike Snitzer <snitzer@redhat.com>