Gerd Hoffmann [Fri, 30 Nov 2012 10:54:44 +0000 (11:54 +0100)]
uas: improve device reset
Add new function to unlink and abort requests from the work
list, call it on bus reset and disconnect where we kill all
in-flight urbs. Also reorder calls in disconnect to first
cancel transfers, then remove the scsi hba.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gerd Hoffmann [Fri, 30 Nov 2012 10:54:43 +0000 (11:54 +0100)]
uas: improve abort handler
Two changes. First we check whenever the request is linked in the work
list and if so take it out. Second check whenever the command is
actually in flight before asking the device to cancel it via task
management, and in case it isn't just zap the data urbs and finish it.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gerd Hoffmann [Fri, 30 Nov 2012 10:54:42 +0000 (11:54 +0100)]
uas: add IS_IN_WORK_LIST flag
Keep track whenever the request is linked into the work list or not.
Needed for request abort.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gerd Hoffmann [Fri, 30 Nov 2012 10:54:41 +0000 (11:54 +0100)]
uas: add UNLINK_DATA_URBS flag
uas_unlink_data_urbs uses this to make sure the the scsi command is
not released while looking at it. This will be needed when we start
calling uas_unlink_data_urbs in the request cancel code paths.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gerd Hoffmann [Fri, 30 Nov 2012 10:54:40 +0000 (11:54 +0100)]
uas: new function to cancel data urbs
Add uas_unlink_data_urbs function to cancel in-flight data urbs.
Moves existing code into a separate function.
[ v2: also drop the locking, just call usb_unlink_urb no matter what,
which is safe because the usb core guarantees the completion
callback is called only once ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dongjin Kim [Fri, 7 Dec 2012 20:18:44 +0000 (05:18 +0900)]
USB: misc: Add USB3503 High-Speed Hub Controller
This patch adds new driver of SMSC USB3503 USB 2.0 hub controller with HSIC
upstream connectivity and three USB 2.0 downstream ports. The specification
can be found from 'http://www.smsc.com/index.php?tid=295&pid=325'.
The current version have been tested very basic features switching the modes,
HUB-MODE and STANDBY-MODE.
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Mon, 7 Jan 2013 18:14:23 +0000 (10:14 -0800)]
Merge tag 'for-usb-next-2013-01-03' of git://git./linux/kernel/git/sarah/xhci into usb-next
Sarah writes:
usb-next: Further warm reset improvements
Hi Greg,
Here's some patches for 3.9. They further improve the warm reset
error handling, but they're too big to go into stable. There's also a
patch to remove an unused variable in the xHCI driver.
As I mentioned, you'll need to merge usb-linus into usb-next before
applying these patches.
Sarah Sharp
Greg Kroah-Hartman [Mon, 7 Jan 2013 18:12:35 +0000 (10:12 -0800)]
Merge tag 'for-usb-linus-2013-01-03' of git://git./linux/kernel/git/sarah/xhci into usb-linus
Sarah says:
usb-linus: USB core fixes for warm reset
Hi Greg,
Happy New Year! Here's some bug fixes for 3.8. I have usb-next
patches that are based on this set, so please merge your usb-linus
branch into usb-next after this set is applied.
The bulk of the patchset (patches 2-7) improve the USB core's warm
reset error handling.
There's also one patch that fixes an arithmetic error in the xHCI
driver, and another to avoid the "dead ports" issue caused by
unhandled port status change events.
These are all marked for stable.
Sarah Sharp
Greg Kroah-Hartman [Mon, 7 Jan 2013 18:09:49 +0000 (10:09 -0800)]
Merge tag 'fixes-for-v3.8-rc2' of git://git./linux/kernel/git/balbi/usb into usb-linus
Felipe says:
usb: fixes for v3.8-rc2
Here is the first set of fixes for v3.8-rc cycle.
There is a build fix for musb's dsps glue layer caused
by some header cleanup on the OMAP tree.
Marvel's USB drivers got a fix up for clk API usage
switching over to clk_prepare() calls.
u_serial has a bug fix for a missing wake_up() which
would make gs_cleanup() wait forever for gs_close()
to finish.
A minor bug fix on dwc3's debugfs interface which
would make us read wrong addresses when dumping
all registers.
dummy_hcd learned how to enumerate g_multi.
s3c-hsotg now understands that we shouldn't kfree()
memory allocated with devm_*.
Other than that, there are a bunch of other minor fixes
on renesas_usbhs, tcm_usb_gadget and amd5536udc.
All patches have been pending on mailing for many weeks
and shouldn't cause any problems.
Javier Martinez Canillas [Sun, 16 Dec 2012 03:21:31 +0000 (04:21 +0100)]
usb: host: xhci: remove unused trb var in xhci_irq()
The union xhci_trb *trb variable is defined and assigned
inside the xHCI IRQ handler function but is never used.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Sarah Sharp [Tue, 11 Dec 2012 18:14:03 +0000 (10:14 -0800)]
USB: Refactor hub_port_wait_reset.
Refactor hub_port_wait_reset into a small loop to wait for the port
reset to be complete, and then a larger block to deal with the final
port status. This patch should not change any current behavior.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Sarah Sharp [Mon, 10 Dec 2012 23:43:08 +0000 (15:43 -0800)]
USB: Use helper function hub_set_port_link_state
Change the code that manually issues a Set Port Feature(Link State) to
use the new helper function hub_set_port_link_state().
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Sarah Sharp [Thu, 1 Nov 2012 18:20:44 +0000 (11:20 -0700)]
USB: Fix connected device switch to Inactive state.
A USB 3.0 device can transition to the Inactive state if a U1 or U2 exit
transition fails. The current code in hub_events simply issues a warm
reset, but does not call any pre-reset or post-reset driver methods (or
unbind/rebind drivers without them). Therefore the drivers won't know
their device has just been reset.
hub_events should instead call usb_reset_device. This means
hub_port_reset now needs to figure out whether it should issue a warm
reset or a hot reset.
Remove the FIXME note about needing disconnect() for a NOTATTACHED
device. This patch fixes that.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Sarah Sharp [Thu, 1 Nov 2012 18:20:44 +0000 (11:20 -0700)]
USB: Rip out recursive call on warm port reset.
When a hot reset fails on a USB 3.0 port, the current port reset code
recursively calls hub_port_reset inside hub_port_wait_reset. This isn't
ideal, since we should avoid recursive calls in the kernel, and it also
doesn't allow us to issue multiple warm resets on reset failures.
Rip out the recursive call. Instead, add code to hub_port_reset to
issue a warm reset if the hot reset fails, and try multiple warm resets
before giving up on the port.
In hub_port_wait_reset, remove the recursive call and re-indent. The
code is basically the same, except:
1. It bails out early if the port has transitioned to Inactive or
Compliance Mode after the reset completed.
2. It doesn't consider a connect status change to be a failed reset. If
multiple warm resets needed to be issued, the connect status may have
changed, so we need to ignore that and look at the port link state
instead. hub_port_reset will now do that.
3. It unconditionally sets udev->speed on all types of successful
resets. The old recursive code would set the port speed when the second
hub_port_reset returned.
The old code did not handle connected devices needing a warm reset well.
There were only two situations that the old code handled correctly: an
empty port needing a warm reset, and a hot reset that migrated to a warm
reset.
When an empty port needed a warm reset, hub_port_reset was called with
the warm variable set. The code in hub_port_finish_reset would skip
telling the USB core and the xHC host that the device was reset, because
otherwise that would result in a NULL pointer dereference.
When a USB 3.0 device reset migrated to a warm reset, the recursive call
made the call stack look like this:
hub_port_reset(warm = false)
hub_wait_port_reset(warm = false)
hub_port_reset(warm = true)
hub_wait_port_reset(warm = true)
hub_port_finish_reset(warm = true)
(return up the call stack to the first wait)
hub_port_finish_reset(warm = false)
The old code didn't want to notify the USB core or the xHC host of device reset
twice, so it only did it in the second call to hub_port_finish_reset,
when warm was set to false. This was necessary because
before patch two ("USB: Ignore xHCI Reset Device status."), the USB core
would pay attention to the xHC Reset Device command error status, and
the second call would always fail.
Now that we no longer have the recursive call, and warm can change from
false to true in hub_port_reset, we need to have hub_port_finish_reset
unconditionally notify the USB core and the xHC of the device reset.
In hub_port_finish_reset, unconditionally clear the connect status
change (CSC) bit for USB 3.0 hubs when the port reset is done. If we
had to issue multiple warm resets for a device, that bit may have been
set if the device went into SS.Inactive and then was successfully warm
reset.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Sarah Sharp [Tue, 4 Dec 2012 23:27:50 +0000 (15:27 -0800)]
USB: Prepare for refactoring by adding extra udev checks.
The next patch will refactor the hub port code to rip out the recursive
call to hub_port_reset on a failed hot reset. In preparation for that,
make sure all code paths can deal with being called with a NULL udev.
The usb_device will not be valid if warm reset was issued because a port
transitioned to the Inactive or Compliance Mode on a device connect.
This patch should have no effect on current behavior.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Sarah Sharp [Thu, 1 Nov 2012 18:20:44 +0000 (11:20 -0700)]
USB: Don't use EHCI port sempahore for USB 3.0 hubs.
The EHCI host controller needs to prevent EHCI initialization when the
UHCI or OHCI companion controller is in the middle of a port reset. It
uses ehci_cf_port_reset_rwsem to do this. USB 3.0 hubs can't be under
an EHCI host controller, so it makes no sense to down the semaphore for
USB 3.0 hubs. It also makes the warm port reset code more complex.
Don't down ehci_cf_port_reset_rwsem for USB 3.0 hubs.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Sarah Sharp [Tue, 27 Nov 2012 20:30:23 +0000 (12:30 -0800)]
xhci: Avoid "dead ports", add roothub port polling.
The USB core hub thread (khubd) is designed with external USB hubs in
mind. It expects that if a port status change bit is set, the hub will
continue to send a notification through the hub status data transfer.
Basically, it expects hub notifications to be level-triggered.
The xHCI host controller is designed to be edge-triggered on the logical
'OR' of all the port status change bits. When all port status change
bits are clear, and a new change bit is set, the xHC will generate a
Port Status Change Event. If another change bit is set in the same port
status register before the first bit is cleared, it will not send
another event.
This means that the hub code may lose port status changes because of
race conditions between clearing change bits. The user sees this as a
"dead port" that doesn't react to device connects.
The fix is to turn on port polling whenever a new change bit is set.
Once the USB core issues a hub status request that shows that no change
bits are set in any USB ports, turn off port polling.
We can't allow the USB core to poll the roothub for port events during
host suspend because if the PCI host is in D3cold, the port registers
will be all f's. Instead, stop the port polling timer, and
unconditionally restart it when the host resumes. If there are no port
change bits set after the resume, the first call to hub_status_data will
disable polling.
This patch should be backported to stable kernels with the first xHCI
support, 2.6.31 and newer, that include the commit
0f2a79300a1471cf92ab43af165ea13555c8b0a5 "USB: xhci: Root hub support."
There will be merge conflicts because the check for HC_STATE_SUSPENDED
was moved into xhci_suspend in 3.8.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
Sarah Sharp [Thu, 15 Nov 2012 01:58:04 +0000 (17:58 -0800)]
USB: Handle warm reset failure on empty port.
An empty port can transition to either Inactive or Compliance Mode if a
newly connected USB 3.0 device fails to link train. In that case, we
issue a warm reset. Some devices, such as John's Roseweil eusb3
enclosure, slip back into Compliance Mode after the warm reset.
The current warm reset code does not check for device connect status on
warm reset completion, and it incorrectly reports the warm reset
succeeded. This causes the USB core to attempt to send a Set Address
control transfer to a port in Compliance Mode, which will always fail.
Make hub_port_wait_reset check the current connect status and link state
after the warm reset completes. Return a failure status if the device
is disconnected or the link state is Compliance Mode or SS.Inactive.
Make hub_events disable the port if warm reset fails. This will disable
the port, and then bring it back into the RxDetect state. Make the USB
core ignore the connect change until the device reconnects.
Note that this patch does NOT handle connected devices slipping into the
Inactive state very well. This is a concern, because devices can go
into the Inactive state on U1/U2 exit failure. However, the fix for
that case is too large for stable, so it will be submitted in a separate
patch.
This patch should be backported to kernels as old as 3.2, contain the
commit ID
75d7cf72ab9fa01dc70877aa5c68e8ef477229dc "usbcore: refine warm
reset logic"
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: John Covici <covici@ccs.covici.com>
Cc: stable@vger.kernel.org
Sarah Sharp [Thu, 15 Nov 2012 22:58:04 +0000 (14:58 -0800)]
USB: Ignore port state until reset completes.
The port reset code bails out early if the current connect status is
cleared (device disconnected). If we're issuing a hot reset, it may
also look at the link state before the reset is finished.
Section 10.14.2.6 of the USB 3.0 spec says that when a port enters the
Error state or Resetting state, the port connection bit retains the
value from the previous state. Therefore we can't trust it until the
reset finishes. Also, the xHCI spec section 4.19.1.2.5 says software
shall ignore the link state while the port is resetting, as it can be in
an unknown state.
The port state during reset is also unknown for USB 2.0 hubs. The hub
sends a reset signal by driving the bus into an SE0 state. This
overwhelms the "connect" signal from the device, so the port can't tell
whether anything is connected or not.
Fix the port reset code to ignore the port link state and current
connect bit until the reset finishes, and USB_PORT_STAT_RESET is
cleared.
Remove the check for USB_PORT_STAT_C_BH_RESET in the warm reset case,
because it's redundant. When the warm reset finishes, the port reset
bit will be cleared at the same time USB_PORT_STAT_C_BH_RESET is set.
Remove the now-redundant check for a cleared USB_PORT_STAT_RESET bit
in the code to deal with the finished reset.
This patch should be backported to all stable kernels.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
Sarah Sharp [Thu, 15 Nov 2012 01:16:52 +0000 (17:16 -0800)]
USB: Increase reset timeout.
John's NEC 0.96 xHCI host controller needs a longer timeout for a warm
reset to complete. The logs show it takes 650ms to complete the warm
reset, so extend the hub reset timeout to 800ms to be on the safe side.
This commit should be backported to kernels as old as 3.2, that contain
the commit
75d7cf72ab9fa01dc70877aa5c68e8ef477229dc "usbcore: refine
warm reset logic".
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: John Covici <covici@ccs.covici.com>
Cc: stable@vger.kernel.org
Sarah Sharp [Thu, 15 Nov 2012 00:42:32 +0000 (16:42 -0800)]
USB: Allow USB 3.0 ports to be disabled.
If hot and warm reset fails, or a port remains in the Compliance Mode,
the USB core needs to be able to disable a USB 3.0 port. Unlike USB 2.0
ports, once the port is placed into the Disabled link state, it will not
report any new device connects. To get device connect notifications, we
need to put the link into the Disabled state, and then the RxDetect
state.
The xHCI driver needs to atomically clear all change bits on USB 3.0
port disable, so that we get Port Status Change Events for future port
changes. We could technically do this in the USB core instead of in the
xHCI roothub code, since the port state machine can't advance out of the
disabled state until we set the link state to RxDetect. However,
external USB 3.0 hubs don't need this code. They are level-triggered,
not edge-triggered like xHCI, so they will continue to send interrupt
events when any change bit is set. Therefore it doesn't make sense to
put this code in the USB core.
This patch is part of a series to fix several reports of infinite loops
on device enumeration failure. This includes John, when he boots with
a USB 3.0 device (Roseweil eusb3 enclosure) attached to his NEC 0.96
host controller. The fix requires warm reset support, so it does not
make sense to backport this patch to stable kernels without warm reset
support.
This patch should be backported to kernels as old as 3.2, contain the
commit ID
75d7cf72ab9fa01dc70877aa5c68e8ef477229dc "usbcore: refine warm
reset logic"
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: John Covici <covici@ccs.covici.com>
Cc: stable@vger.kernel.org
Sarah Sharp [Thu, 15 Nov 2012 00:10:49 +0000 (16:10 -0800)]
USB: Ignore xHCI Reset Device status.
When the USB core finishes reseting a USB device, the xHCI driver sends
a Reset Device command to the host. The xHC then updates its internal
representation of the USB device to the 'Default' device state. If the
device was already in the Default state, the xHC will complete the
command with an error status.
If a device needs to be reset several times during enumeration, the
second reset will always fail because of the xHCI Reset Device command.
This can cause issues during enumeration.
For example, usb_reset_and_verify_device calls into hub_port_init in a
loop. Say that on the first call into hub_port_init, the device is
successfully reset, but doesn't respond to several set address control
transfers. Then the port will be disabled, but the udev will remain in
tact. usb_reset_and_verify_device will call into hub_port_init again.
On the second call into hub_port_init, the device will be reset, and the
xHCI driver will issue a Reset Device command. This command will fail
(because the device is already in the Default state), and
usb_reset_and_verify_device will fail. The port will be disabled, and
the device won't be able to enumerate.
Fix this by ignoring the return value of the HCD reset_device callback.
This commit should be backported to kernels as old as 3.2, that contain
the commit
75d7cf72ab9fa01dc70877aa5c68e8ef477229dc "usbcore: refine
warm reset logic".
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
Sarah Sharp [Wed, 14 Nov 2012 23:58:52 +0000 (15:58 -0800)]
USB: Handle auto-transition from hot to warm reset.
USB 3.0 hubs and roothubs will automatically transition a failed hot
reset to a warm (BH) reset. In that case, the warm reset change bit
will be set, and the link state change bit may also be set. Change
hub_port_finish_reset to unconditionally clear those change bits for USB
3.0 hubs. If these bits are not cleared, we may lose port change events
from the roothub.
This commit should be backported to kernels as old as 3.2, that contain
the commit
75d7cf72ab9fa01dc70877aa5c68e8ef477229dc "usbcore: refine
warm reset logic".
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
Sarah Sharp [Mon, 17 Dec 2012 22:12:35 +0000 (14:12 -0800)]
xhci: Handle HS bulk/ctrl endpoints that don't NAK.
A high speed control or bulk endpoint may have bInterval set to zero,
which means it does not NAK. If bInterval is non-zero, it means the
endpoint NAKs at a rate of 2^(bInterval - 1).
The xHCI code to compute the NAK interval does not handle the special
case of zero properly. The current code unconditionally subtracts one
from bInterval and uses it as an exponent. This causes a very large
bInterval to be used, and warning messages like these will be printed:
usb 1-1: ep 0x1 - rounding interval to 32768 microframes, ep desc says 0 microframes
This may cause the xHCI host hardware to reject the Configure Endpoint
command, which means the HS device will be unusable under xHCI ports.
This patch should be backported to kernels as old as 2.6.31, that contain
commit
dfa49c4ad120a784ef1ff0717168aa79f55a483a "USB: xhci - fix math in
xhci_get_endpoint_interval()".
Reported-by: Vincent Pelletier <plr.vincent@gmail.com>
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: stable@vger.kernel.org
Linus Torvalds [Thu, 3 Jan 2013 02:13:21 +0000 (18:13 -0800)]
Linux 3.8-rc2
Linus Torvalds [Thu, 3 Jan 2013 02:12:35 +0000 (18:12 -0800)]
Merge branch 'fixes-for-3.8' of git://git./linux/kernel/git/cooloney/linux-leds
Pull LED fix from Bryan Wu.
* 'fixes-for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: leds-gpio: set devm_gpio_request_one() flags param correctly
Javier Martinez Canillas [Thu, 20 Dec 2012 12:56:59 +0000 (04:56 -0800)]
leds: leds-gpio: set devm_gpio_request_one() flags param correctly
commit a99d76f leds: leds-gpio: use gpio_request_one
changed the leds-gpio driver to use gpio_request_one() instead
of gpio_request() + gpio_direction_output()
Unfortunately, it also made a semantic change that breaks the
leds-gpio driver.
The gpio_request_one() flags parameter was set to:
GPIOF_DIR_OUT | (led_dat->active_low ^ state)
Since GPIOF_DIR_OUT is 0, the final flags value will just be the
XOR'ed value of led_dat->active_low and state.
This value were used to distinguish between HIGH/LOW output initial
level and call gpio_direction_output() accordingly.
With this new semantic gpio_request_one() will take the flags value
of 1 as a configuration of input direction (GPIOF_DIR_IN) and will
call gpio_direction_input() instead of gpio_direction_output().
int gpio_request_one(unsigned gpio, unsigned long flags, const char *label)
{
..
if (flags & GPIOF_DIR_IN)
err = gpio_direction_input(gpio);
else
err = gpio_direction_output(gpio,
(flags & GPIOF_INIT_HIGH) ? 1 : 0);
..
}
The right semantic is to evaluate led_dat->active_low ^ state and
set the output initial level explicitly.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Reported-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
Linus Torvalds [Thu, 3 Jan 2013 01:46:14 +0000 (17:46 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:
"This fixes some small errors in the new da9055 driver, eliminates a
compiler warning and adds DT support for the twl4030_wdt driver (so
that we can have multiple watchdogs with DT on the omap platforms)."
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: twl4030_wdt: add DT support
watchdog: omap_wdt: eliminate unused variable and a compiler warning
watchdog: da9055: Don't update wdt_dev->timeout in da9055_wdt_set_timeout error path
watchdog: da9055: Fix invalid free of devm_ allocated data
Linus Torvalds [Thu, 3 Jan 2013 01:44:29 +0000 (17:44 -0800)]
Merge tag '3.8-pci-fixes' of git://git./linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Some fixes for v3.8. They include a fix for the new SR-IOV sysfs
management support, an expanded quirk for Ricoh SD card readers, a
Stratus DMI quirk fix, and a PME polling fix."
* tag '3.8-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz
PCI/PM: Do not suspend port if any subordinate device needs PME polling
PCI: Add PCIe Link Capability link speed and width names
PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)
PCI: Remove spurious error for sriov_numvfs store and simplify flow
David Howells [Wed, 2 Jan 2013 15:13:02 +0000 (15:13 +0000)]
UAPI: Strip _UAPI prefix on header install no matter the whitespace
Commit
56c176c9cac9 ("UAPI: strip the _UAPI prefix from header guards
during header installation") strips the _UAPI prefix from header guards,
but only if there's a single space between the cpp directive and the
label.
Make it more flexible and able to handle tabs and multiple white space
characters.
Signed-off-by: David Howells <dhowell@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Wed, 2 Jan 2013 15:12:55 +0000 (15:12 +0000)]
UAPI: Remove empty Kbuild files
Empty files can get deleted by the patch program, so remove empty Kbuild
files and their links from the parent Kbuilds.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 3 Jan 2013 01:33:50 +0000 (17:33 -0800)]
Merge tag 'ecryptfs-3.8-rc2-fixes' of git://git./linux/kernel/git/tyhicks/ecryptfs
Pull ecryptfs fixes from Tyler Hicks:
"Two self-explanatory fixes and a third patch which improves
performance: when overwriting a full page in the eCryptfs page cache,
skip reading in and decrypting the corresponding lower page."
* tag 'ecryptfs-3.8-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
fs/ecryptfs/crypto.c: make ecryptfs_encode_for_filename() static
eCryptfs: fix to use list_for_each_entry_safe() when delete items
eCryptfs: Avoid unnecessary disk read and data decryption during writing
Linus Torvalds [Thu, 3 Jan 2013 01:32:49 +0000 (17:32 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"Two of Alex's patches deal with a race when reseting server
connections for open RBD images, one demotes some non-fatal BUGs to
WARNs, and my patch fixes a protocol feature bit failure path."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: fix protocol feature mismatch failure path
libceph: WARN, don't BUG on unexpected connection states
libceph: always reset osds when kicking
libceph: move linger requests sooner in kick_requests()
Mel Gorman [Fri, 21 Dec 2012 23:10:25 +0000 (23:10 +0000)]
mm: mempolicy: Convert shared_policy mutex to spinlock
Sasha was fuzzing with trinity and reported the following problem:
BUG: sleeping function called from invalid context at kernel/mutex.c:269
in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main
2 locks held by trinity-main/6361:
#0: (&mm->mmap_sem){++++++}, at: [<
ffffffff810aa314>] __do_page_fault+0x1e4/0x4f0
#1: (&(&mm->page_table_lock)->rlock){+.+...}, at: [<
ffffffff8122f017>] handle_pte_fault+0x3f7/0x6a0
Pid: 6361, comm: trinity-main Tainted: G W
3.7.0-rc2-next-
20121024-sasha-00001-gd95ef01-dirty #74
Call Trace:
__might_sleep+0x1c3/0x1e0
mutex_lock_nested+0x29/0x50
mpol_shared_policy_lookup+0x2e/0x90
shmem_get_policy+0x2e/0x30
get_vma_policy+0x5a/0xa0
mpol_misplaced+0x41/0x1d0
handle_pte_fault+0x465/0x6a0
This was triggered by a different version of automatic NUMA balancing
but in theory the current version is vunerable to the same problem.
do_numa_page
-> numa_migrate_prep
-> mpol_misplaced
-> get_vma_policy
-> shmem_get_policy
It's very unlikely this will happen as shared pages are not marked
pte_numa -- see the page_mapcount() check in change_pte_range() -- but
it is possible.
To address this, this patch restores sp->lock as originally implemented
by Kosaki Motohiro. In the path where get_vma_policy() is called, it
should not be calling sp_alloc() so it is not necessary to treat the PTL
specially.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 2 Jan 2013 17:57:34 +0000 (09:57 -0800)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 bug fixes from Ted Ts'o:
"Various bug fixes for ext4. Perhaps the most serious bug fixed is one
which could cause file system corruptions when performing file punch
operations."
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: avoid hang when mounting non-journal filesystems with orphan list
ext4: lock i_mutex when truncating orphan inodes
ext4: do not try to write superblock on ro remount w/o journal
ext4: include journal blocks in df overhead calcs
ext4: remove unaligned AIO warning printk
ext4: fix an incorrect comment about i_mutex
ext4: fix deadlock in journal_unmap_buffer()
ext4: split off ext4_journalled_invalidatepage()
jbd2: fix assertion failure in jbd2_journal_flush()
ext4: check dioread_nolock on remount
ext4: fix extent tree corruption caused by hole punch
Hugh Dickins [Wed, 2 Jan 2013 10:04:23 +0000 (02:04 -0800)]
mempolicy: remove arg from mpol_parse_str, mpol_to_str
Remove the unused argument (formerly no_context) from mpol_parse_str()
and from mpol_to_str().
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 2 Jan 2013 10:01:33 +0000 (02:01 -0800)]
tmpfs mempolicy: fix /proc/mounts corrupting memory
Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA
mempolicy testing. Very nasty. Reading /proc/mounts, /proc/pid/mounts
or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often
in a page table (causing "Bad swap" or "Bad page map" warning or "Bad
pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere
worse. "mpol=prefer" and "mpol=prefer:Node" are equally toxic.
Recent NUMA enhancements are not to blame: this dates back to 2.6.35,
when commit
e17f74af351c "mempolicy: don't call mpol_set_nodemask() when
no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(),
which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags.
With slab poisoning, you can then rely on mpol_to_str() to set the bit
for node 0x6b6b, probably in the next page above the caller's stack.
mpol_parse_str() is only called from shmem_parse_options(): no_context
is always true, so call it unused for now, and remove !no_context code.
Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might
expect. Then mpol_to_str() can ignore its no_context argument also,
the mpol being appropriately initialized whether contextualized or not.
Rename its no_context unused too, and let subsequent patch remove them
(that's not needed for stable backporting, which would involve rejects).
I don't understand why MPOL_LOCAL is described as a pseudo-policy:
it's a reasonable policy which suffers from a confusing implementation
in terms of MPOL_PREFERRED with MPOL_F_LOCAL. I believe this would be
much more robust if MPOL_LOCAL were recognized in switch statements
throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly
empty) nodes mask like everyone else, instead of its preferred_node
variant (I presume an optimization from the days before MPOL_LOCAL).
But that would take me too long to get right and fully tested.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Wong [Tue, 1 Jan 2013 21:20:27 +0000 (21:20 +0000)]
epoll: prevent missed events on EPOLL_CTL_MOD
EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to
ensure events are not missed. Since the modifications to the interest
mask are not protected by the same lock as ep_poll_callback, we need to
ensure the change is visible to other CPUs calling ep_poll_callback.
We also need to ensure f_op->poll() has an up-to-date view of past
events which occured before we modified the interest mask. So this
barrier also pairs with the barrier in wq_has_sleeper().
This should guarantee either ep_poll_callback or f_op->poll() (or both)
will notice the readiness of a recently-ready/modified item.
This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in:
http://thread.gmane.org/gmane.linux.kernel/1408782/
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Voellmy <andreas.voellmy@yale.edu>
Tested-by: "Junchang(Jason) Wang" <junchang.wang@yale.edu>
Cc: netdev@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aaro Koskinen [Sun, 23 Dec 2012 20:03:37 +0000 (22:03 +0200)]
watchdog: twl4030_wdt: add DT support
Add DT support for twl4030_wdt. This is needed to get twl4030_wdt to
probe when booting with DT.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Aaro Koskinen [Sun, 23 Dec 2012 20:03:36 +0000 (22:03 +0200)]
watchdog: omap_wdt: eliminate unused variable and a compiler warning
We forgot to delete this in the commit
4f4753d9 (watchdog: omap_wdt:
convert to devm_ functions), and as a result the following compilation
warning was introduced:
drivers/watchdog/omap_wdt.c: In function 'omap_wdt_remove':
drivers/watchdog/omap_wdt.c:299:19: warning: unused variable 'res' [-Wunused-variable]
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Axel Lin [Sat, 22 Dec 2012 03:07:01 +0000 (11:07 +0800)]
watchdog: da9055: Don't update wdt_dev->timeout in da9055_wdt_set_timeout error path
Otherwise, WDIOC_GETTIMEOUT returns wrong value if set_timeout fails.
This patch also removes unnecessary ret variable in da9055_wdt_ping function.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Axel Lin [Fri, 21 Dec 2012 13:09:06 +0000 (21:09 +0800)]
watchdog: da9055: Fix invalid free of devm_ allocated data
It is not required to free devm_ allocated data. Since kref_put
needs a valid release function, da9055_wdt_release_resources()
is not deleted.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Linus Torvalds [Sun, 30 Dec 2012 18:00:37 +0000 (10:00 -0800)]
Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull DRM update from Dave Airlie:
"This is a bit larger due to me not bothering to do anything since
before Xmas, and other people working too hard after I had clearly
given up.
It's got the 3 main x86 driver fixes pulls, and a bunch of tegra
fixes, doesn't fix the Ironlake bug yet, but that does seem to be
getting closer.
- radeon: gpu reset fixes and userspace packet support
- i915: watermark fixes, workarounds, i830/845 fix,
- nouveau: nvd9/kepler microcode fixes, accel is now enabled and
working, gk106 support
- tegra: misc fixes."
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (34 commits)
Revert "drm: tegra: protect DC register access with mutex"
drm: tegra: program only one window during modeset
drm: tegra: clean out old gem prototypes
drm: tegra: remove redundant tegra2_tmds_config entry
drm: tegra: protect DC register access with mutex
drm: tegra: don't leave clients host1x member uninitialized
drm: tegra: fix front_porch <-> back_porch mixup
drm/nve0/graph: fix fuc, and enable acceleration on all known chipsets
drm/nvc0/graph: fix fuc, and enable acceleration on GF119
drm/nouveau/bios: cache ramcfg strap on later chipsets
drm/nouveau/mxm: silence output if no bios data
drm/nouveau/bios: parse/display extra version component
drm/nouveau/bios: implement opcode 0xa9
drm/nouveau/bios: update gpio parsing apis to match current design
drm/nouveau: initial support for GK106
drm/radeon: add WAIT_UNTIL to evergreen VM safe reg list
drm/i915: disable shrinker lock stealing for create_mmap_offset
drm/i915: optionally disable shrinker lock stealing
drm/i915: fix flags in dma buf exporting
drm/radeon: add support for MEM_WRITE packet
...
Linus Torvalds [Sun, 30 Dec 2012 17:59:21 +0000 (09:59 -0800)]
Merge tag 'omap-late-cleanups' of git://git./linux/kernel/git/arm/arm-soc
Pull late ARM cleanups for omap from Olof Johansson:
"From Tony Lindgren:
Here are few more patches to finish the omap changes for multiplatform
conversion that are not strictly fixes, but were too complex to do
with the dependencies during the merge window. Those are to move of
serial-omap.h to platform_data, and the removal of remaining
cpu_is_omap macro usage outside mach-omap2.
Then there are several trivial fixes for typos and few minimal
omap2plus_defconfig updates."
* tag 'omap-late-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arch/arm/mach-omap2/dpll3xxx.c: drop if around WARN_ON
OMAP2: Fix a typo - replace regist with register.
ARM/omap: use module_platform_driver macro
ARM: OMAP2+: PMU: Remove unused header
ARM: OMAP4: remove duplicated include from omap_hwmod_44xx_data.c
ARM: OMAP2+: omap2plus_defconfig: enable twl4030 SoC audio
ARM: OMAP2+: omap2plus_defconfig: Add tps65217 support
ARM: OMAP2+: enable devtmpfs and devtmpfs automount
ARM: OMAP2+: omap_twl: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER
ARM: OMAP2+: Drop plat/cpu.h for omap2plus
ARM: OMAP: Split fb.c to remove last remaining cpu_is_omap usage
MAINTAINERS: Add an entry for omap related .dts files
Linus Torvalds [Sun, 30 Dec 2012 17:58:36 +0000 (09:58 -0800)]
Merge tag 'fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"It's been quiet over the holidays, but we have had a couple of trivial
fixes coming in for the newly introduced sunxi platform; one to add it
to the multiplatform defconfig for build coverage, and one fixup for
device tree strings."
* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
sunxi: Change the machine compatible string.
ARM: multi_v7_defconfig: Add ARCH_SUNXI
Dave Airlie [Sun, 30 Dec 2012 11:58:20 +0000 (21:58 +1000)]
Revert "drm: tegra: protect DC register access with mutex"
This reverts commit
83c0bcb694be31dcd6c04bdd935b96a95a0af548.
Lucas pointed out this was a mistake, and I missed the discussion,
so just revert it out to save a rebase.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lucas Stach [Wed, 19 Dec 2012 21:38:57 +0000 (21:38 +0000)]
drm: tegra: program only one window during modeset
The intention is to program exactly WIN_A, not WIN_A and possibly
others.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lucas Stach [Wed, 19 Dec 2012 21:38:56 +0000 (21:38 +0000)]
drm: tegra: clean out old gem prototypes
There is no gem.c anymore, those functions are implemented by the
drm_cma_helpers now.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lucas Stach [Wed, 19 Dec 2012 21:38:55 +0000 (21:38 +0000)]
drm: tegra: remove redundant tegra2_tmds_config entry
The 720p and 1080p entries are completely redundant, as we are matching
the table entries against <=pclk.
Also generalize the comment, as we are using those table entries even
when driving other modes than the standard TV ones.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lucas Stach [Wed, 19 Dec 2012 21:38:54 +0000 (21:38 +0000)]
drm: tegra: protect DC register access with mutex
Window properties are programmed through a shared aperture and have to
happen atomically. Also we do the read-update-write dance on some of the
shared regs.
To make sure that different functions don't stumble over each other
protect the register access with a mutex.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lucas Stach [Wed, 19 Dec 2012 21:38:53 +0000 (21:38 +0000)]
drm: tegra: don't leave clients host1x member uninitialized
No real problem for now, as nothing is using this, but leaving it
unitialized is asking for trouble later on.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lucas Stach [Wed, 19 Dec 2012 21:38:52 +0000 (21:38 +0000)]
drm: tegra: fix front_porch <-> back_porch mixup
Fixes wrong picture offset observed when using HDMI output with a
Technisat HD TV.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Acked-by: Mark Zhang <markz@nvidia.com>
Tested-by: Mark Zhang <markz@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 30 Dec 2012 03:54:12 +0000 (13:54 +1000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Some fixes for 3.8:
- Watermark fixups from Chris Wilson (4 pieces).
- 2 snb workarounds, seem to be recently added to our internal DB.
- workaround for the infamous i830/i845 hang, seems now finally solid!
Based on Chris' fix for SNA, now also for UXA/mesa&old SNA.
- Some more fixlets for shrinker-pulls-the-rug issues (Chris&me).
- Fix dma-buf flags when exporting (you).
- Disable the VGA plane if it's enabled on lid open - similar fix in
spirit to the one I've sent you last weeek, BIOS' really like to mess
with the display when closing the lid (awesome debug work from Krzysztof
Mazur).
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: disable shrinker lock stealing for create_mmap_offset
drm/i915: optionally disable shrinker lock stealing
drm/i915: fix flags in dma buf exporting
i915: ensure that VGA plane is disabled
drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm manager
drm: Export routines for inserting preallocated nodes into the mm manager
drm/i915: don't disable disconnected outputs
drm/i915: Implement workaround for broken CS tlb on i830/845
drm/i915: Implement WaSetupGtModeTdRowDispatch
drm/i915: Implement WaDisableHiZPlanesWhenMSAAEnabled
drm/i915: Prefer CRTC 'active' rather than 'enabled' during WM computations
drm/i915: Clear self-refresh watermarks when disabled
drm/i915: Double the cursor self-refresh latency on Valleyview
drm/i915: Fixup cursor latency used for IVB lp3 watermarks
Dave Airlie [Sun, 30 Dec 2012 03:02:48 +0000 (13:02 +1000)]
Merge branch 'drm-fixes-3.8' of git://people.freedesktop.org/~agd5f/linux into drm-next
Misc fixes for reset and new packets for userspace usage.
* 'drm-fixes-3.8' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: add WAIT_UNTIL to evergreen VM safe reg list
drm/radeon: add support for MEM_WRITE packet
drm/radeon: restore modeset late in GPU reset path
drm/radeon: avoid deadlock in pm path when waiting for fence
drm/radeon: don't leave fence blocked process on failed GPU reset
Dave Airlie [Sun, 30 Dec 2012 03:01:52 +0000 (13:01 +1000)]
Merge branch 'drm-nouveau-fixes-3.8' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next
Fixes the accel support for nvd9 + kepler chipsets, also fixes GK106 support.
* 'drm-nouveau-fixes-3.8' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nve0/graph: fix fuc, and enable acceleration on all known chipsets
drm/nvc0/graph: fix fuc, and enable acceleration on GF119
drm/nouveau/bios: cache ramcfg strap on later chipsets
drm/nouveau/mxm: silence output if no bios data
drm/nouveau/bios: parse/display extra version component
drm/nouveau/bios: implement opcode 0xa9
drm/nouveau/bios: update gpio parsing apis to match current design
drm/nouveau: initial support for GK106
Zlatko Calusic [Fri, 28 Dec 2012 02:16:38 +0000 (03:16 +0100)]
mm: fix null pointer dereference in wait_iff_congested()
An unintended consequence of commit
4ae0a48b5efc ("mm: modify
pgdat_balanced() so that it also handles order-0") is that
wait_iff_congested() can now be called with NULL 'struct zone *'
producing kernel oops like this:
BUG: unable to handle kernel NULL pointer dereference
IP: [<
ffffffff811542d9>] wait_iff_congested+0x59/0x140
This trivial patch fixes it.
Reported-by: Zhouping Liu <zliu@redhat.com>
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Olof Johansson [Fri, 28 Dec 2012 07:53:01 +0000 (08:53 +0100)]
Merge tag 'sunxi-fixes-for-3.8-rc2' of git://github.com/mripard/linux into fixes
From Maxime Ripard:
Fixes for the sunxi core to be merged in 3.8-rc2
* tag 'sunxi-fixes-for-3.8-rc2' of git://github.com/mripard/linux:
sunxi: Change the machine compatible string.
ARM: multi_v7_defconfig: Add ARCH_SUNXI
Sage Weil [Fri, 28 Dec 2012 02:27:04 +0000 (20:27 -0600)]
libceph: fix protocol feature mismatch failure path
We should not set con->state to CLOSED here; that happens in
ceph_fault() in the caller, where it first asserts that the state
is not yet CLOSED. Avoids a BUG when the features don't match.
Since the fail_protocol() has become a trivial wrapper, replace
calls to it with direct calls to reset_connection().
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Alex Elder [Wed, 26 Dec 2012 16:43:57 +0000 (10:43 -0600)]
libceph: WARN, don't BUG on unexpected connection states
A number of assertions in the ceph messenger are implemented with
BUG_ON(), killing the system if connection's state doesn't match
what's expected. At this point our state model is (evidently) not
well understood enough for these assertions to trigger a BUG().
Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
so we learn about these issues without killing the machine.
We now recognize that a connection fault can occur due to a socket
closure at any time, regardless of the state of the connection. So
there is really nothing we can assert about the state of the
connection at that point so eliminate that assertion.
Reported-by: Ugis <ugis22@gmail.com>
Tested-by: Ugis <ugis22@gmail.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Alex Elder [Wed, 26 Dec 2012 20:31:40 +0000 (14:31 -0600)]
libceph: always reset osds when kicking
When ceph_osdc_handle_map() is called to process a new osd map,
kick_requests() is called to ensure all affected requests are
updated if necessary to reflect changes in the osd map. This
happens in two cases: whenever an incremental map update is
processed; and when a full map update (or the last one if there is
more than one) gets processed.
In the former case, the kick_requests() call is followed immediately
by a call to reset_changed_osds() to ensure any connections to osds
affected by the map change are reset. But for full map updates
this isn't done.
Both cases should be doing this osd reset.
Rather than duplicating the reset_changed_osds() call, move it into
the end of kick_requests().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Alex Elder [Wed, 19 Dec 2012 21:52:36 +0000 (15:52 -0600)]
libceph: move linger requests sooner in kick_requests()
The kick_requests() function is called by ceph_osdc_handle_map()
when an osd map change has been indicated. Its purpose is to
re-queue any request whose target osd is different from what it
was when it was originally sent.
It is structured as two loops, one for incomplete but registered
requests, and a second for handling completed linger requests.
As a special case, in the first loop if a request marked to linger
has not yet completed, it is moved from the request list to the
linger list. This is as a quick and dirty way to have the second
loop handle sending the request along with all the other linger
requests.
Because of the way it's done now, however, this quick and dirty
solution can result in these incomplete linger requests never
getting re-sent as desired. The problem lies in the fact that
the second loop only arranges for a linger request to be sent
if it appears its target osd has changed. This is the proper
handling for *completed* linger requests (it avoids issuing
the same linger request twice to the same osd).
But although the linger requests added to the list in the first loop
may have been sent, they have not yet completed, so they need to be
re-sent regardless of whether their target osd has changed.
The first required fix is we need to avoid calling __map_request()
on any incomplete linger request. Otherwise the subsequent
__map_request() call in the second loop will find the target osd
has not changed and will therefore not re-send the request.
Second, we need to be sure that a sent but incomplete linger request
gets re-sent. If the target osd is the same with the new osd map as
it was when the request was originally sent, this won't happen.
This can be fixed through careful handling when we move these
requests from the request list to the linger list, by unregistering
the request *before* it is registered as a linger request. This
works because a side-effect of unregistering the request is to make
the request's r_osd pointer be NULL, and *that* will ensure the
second loop actually re-sends the linger request.
Processing of such a request is done at that point, so continue with
the next one once it's been moved.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Linus Torvalds [Thu, 27 Dec 2012 18:46:47 +0000 (10:46 -0800)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Report i2c errors to userspace in lm73 driver
- Fix problem with DIV_ROUND_CLOSEST and unsigned divisors in emc6w201
driver
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (emc6w201) Fix DIV_ROUND_CLOSEST problem with unsigned divisors
hwmon: (lm73} Detect and report i2c bus errors
Linus Torvalds [Thu, 27 Dec 2012 18:42:46 +0000 (10:42 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull namespace fixes from Eric Biederman:
"This tree includes two bug fixes for problems Oleg spotted on his
review of the recent pid namespace work. A small fix to not enable
bottom halves with irqs disabled, and a trivial build fix for f2fs
with user namespaces enabled."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
f2fs: Don't assign e_id in f2fs_acl_from_disk
proc: Allow proc_free_inum to be called from any context
pidns: Stop pid allocation when init dies
pidns: Outlaw thread creation after unshare(CLONE_NEWPID)
Linus Torvalds [Thu, 27 Dec 2012 18:40:30 +0000 (10:40 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) GRE tunnel drivers don't set the transport header properly, they also
blindly deref the inner protocol ipv4 and needs some checks. Fixes
from Isaku Yamahata.
2) Fix sleeps while atomic in netdevice rename code, from Eric Dumazet.
3) Fix double-spinlock in solos-pci driver, from Dan Carpenter.
4) More ARP bug fixes. Fix lockdep splat in arp_solicit() and then the
bug accidentally added by that fix. From Eric Dumazet and Cong Wang.
5) Remove some __dev* annotations that slipped back in, as well as all
HOTPLUG references. From Greg KH
6) RDS protocol uses wrong interfaces to access scatter-gather elements,
causing a regression. From Mike Marciniszyn.
7) Fix build error in cpts driver, from Richard Cochran.
8) Fix arithmetic in packet scheduler, from Stefan Hasko.
9) Similarly, fix association during calculation of random backoff in
batman-adv. From Akinobu Mita.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
ipv6/ip6_gre: set transport header correctly
ipv4/ip_gre: set transport header correctly to gre header
IB/rds: suppress incompatible protocol when version is known
IB/rds: Correct ib_api use with gs_dma_address/sg_dma_len
net/vxlan: Use the underlying device index when joining/leaving multicast groups
tcp: should drop incoming frames without ACK flag set
netprio_cgroup: define sk_cgrp_prioidx only if NETPRIO_CGROUP is enabled
cpts: fix a run time warn_on.
cpts: fix build error by removing useless code.
batman-adv: fix random jitter calculation
arp: fix a regression in arp_solicit()
net: sched: integer overflow fix
CONFIG_HOTPLUG removal from networking core
Drivers: network: more __dev* removal
bridge: call br_netpoll_disable in br_add_if
ipv4: arp: fix a lockdep splat in arp_solicit()
tuntap: dont use a private kmem_cache
net: devnet_rename_seq should be a seqcount
ip_gre: fix possible use after free
ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally
...
Theodore Ts'o [Thu, 27 Dec 2012 06:42:50 +0000 (01:42 -0500)]
ext4: avoid hang when mounting non-journal filesystems with orphan list
When trying to mount a file system which does not contain a journal,
but which does have a orphan list containing an inode which needs to
be truncated, the mount call with hang forever in
ext4_orphan_cleanup() because ext4_orphan_del() will return
immediately without removing the inode from the orphan list, leading
to an uninterruptible loop in kernel code which will busy out one of
the CPU's on the system.
This can be trivially reproduced by trying to mount the file system
found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs
source tree. If a malicious user were to put this on a USB stick, and
mount it on a Linux desktop which has automatic mounts enabled, this
could be considered a potential denial of service attack. (Not a big
deal in practice, but professional paranoids worry about such things,
and have even been known to allocate CVE numbers for such problems.)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Cc: stable@vger.kernel.org
Theodore Ts'o [Thu, 27 Dec 2012 06:42:48 +0000 (01:42 -0500)]
ext4: lock i_mutex when truncating orphan inodes
Commit
c278531d39 added a warning when ext4_flush_unwritten_io() is
called without i_mutex being taken. It had previously not been taken
during orphan cleanup since races weren't possible at that point in
the mount process, but as a result of this
c278531d39, we will now see
a kernel WARN_ON in this case. Take the i_mutex in
ext4_orphan_cleanup() to suppress this warning.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Cc: stable@vger.kernel.org
Isaku Yamahata [Mon, 24 Dec 2012 16:51:04 +0000 (16:51 +0000)]
ipv6/ip6_gre: set transport header correctly
ip6gre_xmit2() incorrectly sets transport header to inner payload
instead of GRE header. It seems copy-and-pasted from ipip.c.
Set transport header to gre header.
(In ipip case the transport header is the inner ip header, so that's
correct.)
Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Isaku Yamahata [Mon, 24 Dec 2012 16:51:03 +0000 (16:51 +0000)]
ipv4/ip_gre: set transport header correctly to gre header
ipgre_tunnel_xmit() incorrectly sets transport header to inner payload
instead of GRE header. It seems copy-and-pasted from ipip.c.
So set transport header to gre header.
(In ipip case the transport header is the inner ip header, so that's
correct.)
Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marciniszyn, Mike [Fri, 21 Dec 2012 08:01:54 +0000 (08:01 +0000)]
IB/rds: suppress incompatible protocol when version is known
Add an else to only print the incompatible protocol message
when version hasn't been established.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marciniszyn, Mike [Fri, 21 Dec 2012 08:01:49 +0000 (08:01 +0000)]
IB/rds: Correct ib_api use with gs_dma_address/sg_dma_len
0b088e00 ("RDS: Use page_remainder_alloc() for recv bufs")
added uses of sg_dma_len() and sg_dma_address(). This makes
RDS DOA with the qib driver.
IB ulps should use ib_sg_dma_len() and ib_sg_dma_address
respectively since some HCAs overload ib_sg_dma* operations.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 20 Dec 2012 03:36:08 +0000 (03:36 +0000)]
net/vxlan: Use the underlying device index when joining/leaving multicast groups
The socket calls from vxlan to join/leave multicast group aren't
using the index of the underlying device, as a result the stack uses
the first interface that is up. This results in vxlan being non functional
over a device which isn't the 1st to be up.
Fix this by providing the iflink field to the vxlan instance
to the multicast calls.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 26 Dec 2012 12:44:34 +0000 (12:44 +0000)]
tcp: should drop incoming frames without ACK flag set
In commit
96e0bf4b5193d (tcp: Discard segments that ack data not yet
sent) John Dykstra enforced a check against ack sequences.
In commit
354e4aa391ed5 (tcp: RFC 5961 5.2 Blind Data Injection Attack
Mitigation) I added more safety tests.
But we missed fact that these tests are not performed if ACK bit is
not set.
RFC 793 3.9 mandates TCP should drop a frame without ACK flag set.
" fifth check the ACK field,
if the ACK bit is off drop the segment and return"
Not doing so permits an attacker to only guess an acceptable sequence
number, evading stronger checks.
Many thanks to Zhiyun Qian for bringing this issue to our attention.
See :
http://web.eecs.umich.edu/~zhiyunq/pub/ccs12_TCP_sequence_number_inference.pdf
Reported-by: Zhiyun Qian <zhiyunq@umich.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoffer Dall [Fri, 21 Dec 2012 18:03:50 +0000 (13:03 -0500)]
mm: Fix PageHead when !CONFIG_PAGEFLAGS_EXTENDED
Unfortunately with !CONFIG_PAGEFLAGS_EXTENDED, (!PageHead) is false, and
(PageHead) is true, for tail pages. If this is indeed the intended
behavior, which I doubt because it breaks cache cleaning on some ARM
systems, then the nomenclature is highly problematic.
This patch makes sure PageHead is only true for head pages and PageTail
is only true for tail pages, and neither is true for non-compound pages.
[ This buglet seems ancient - seems to have been introduced back in Apr
2008 in commit
6a1e7f777f61: "pageflags: convert to the use of new
macros". And the reason nobody noticed is because the PageHead()
tests are almost all about just sanity-checking, and only used on
pages that are actual page heads. The fact that the old code returned
true for tail pages too was thus not really noticeable. - Linus ]
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <Will.Deacon@arm.com>
Cc: Steve Capper <Steve.Capper@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: stable@kernel.org # 2.6.26+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Tue, 25 Dec 2012 20:48:24 +0000 (20:48 +0000)]
netprio_cgroup: define sk_cgrp_prioidx only if NETPRIO_CGROUP is enabled
sock->sk_cgrp_prioidx won't be used at all if CONFIG_NETPRIO_CGROUP=n.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Cochran [Sun, 23 Dec 2012 21:19:10 +0000 (21:19 +0000)]
cpts: fix a run time warn_on.
This patch fixes a warning in clk_enable by calling clk_prepare_enable
instead.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Cochran [Sun, 23 Dec 2012 21:19:09 +0000 (21:19 +0000)]
cpts: fix build error by removing useless code.
The cpts driver tries to obtain the input clock frequency by calling the
clock's internal 'recalc' method. Since <plat/clock.h> has been removed,
this code can no longer compile.
However, the driver never makes use of the frequency value, so this patch
fixes the issue by removing the offending code altogether.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Akinobu Mita [Wed, 26 Dec 2012 02:32:10 +0000 (02:32 +0000)]
batman-adv: fix random jitter calculation
batadv_iv_ogm_emit_send_time() attempts to calculates a random integer
in the range of 'orig_interval +- BATADV_JITTER' by the below lines.
msecs = atomic_read(&bat_priv->orig_interval) - BATADV_JITTER;
msecs += (random32() % 2 * BATADV_JITTER);
But it actually gets 'orig_interval' or 'orig_interval - BATADV_JITTER'
because '%' and '*' have same precedence and associativity is
left-to-right.
This adds the parentheses at the appropriate position so that it matches
original intension.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Cc: Antonio Quartulli <ordex@autistici.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Lutomirski [Sat, 1 Dec 2012 20:37:20 +0000 (12:37 -0800)]
PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz
Otherwise it fails like this on cards like the Transcend 16GB SDHC card:
mmc0: new SDHC card at address b368
mmcblk0: mmc0:b368 SDC 15.0 GiB
mmcblk0: error -110 sending status command, retrying
mmcblk0: error -84 transferring data, sector 0, nr 8, cmd response 0x900, card status 0xb0
Tested on my Lenovo x200 laptop.
[bhelgaas: changelog]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Chris Ball <cjb@laptop.org>
CC: Manoj Iyer <manoj.iyer@canonical.com>
CC: stable@vger.kernel.org
Huang Ying [Wed, 26 Dec 2012 17:39:23 +0000 (10:39 -0700)]
PCI/PM: Do not suspend port if any subordinate device needs PME polling
Ulrich reported that his USB3 cardreader does not work reliably when
connected to the USB3 port. It turns out that USB3 controller failed to
awaken when plugging in the USB3 cardreader. Further experiments found
that the USB3 host controller can only be awakened via polling, not via PME
interrupt. But if the PCIe port to which the USB3 host controller is
connected is suspended, we cannot poll the controller because its config
space is not accessible when the PCIe port is in a low power state.
To solve the issue, the PCIe port will not be suspended if any subordinate
device needs PME polling.
[bhelgaas: use bool consistently rather than mixing int/bool]
Reference: http://lkml.kernel.org/r/
50841CCC.9030809@uli-eckhardt.de
Reported-by: Ulrich Eckhardt <usb@uli-eckhardt.de>
Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v3.6+
Bjorn Helgaas [Wed, 26 Dec 2012 17:39:23 +0000 (10:39 -0700)]
PCI: Add PCIe Link Capability link speed and width names
Add standard #defines for the Supported Link Speeds field in the PCIe
Link Capabilities register.
Note that prior to PCIe spec r3.0, these encodings were defined:
0001b 2.5GT/s Link speed supported
0010b 5.0GT/s and 2.5GT/s Link speed supported
Starting with spec r3.0, these encodings refer to bits 0 and 1 in the
Supported Link Speeds Vector in the Link Capabilities 2 register, and bits
0 and 1 there mean 2.5 GT/s and 5.0 GT/s, respectively. Therefore, code
that followed r2.0 and interpreted 0x1 as 2.5GT/s and 0x2 as 5.0GT/s will
continue to work, and we can identify a device using the new encodings
because it will have a non-zero Link Capabilities 2 register.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Myron Stowe [Wed, 26 Dec 2012 17:39:23 +0000 (10:39 -0700)]
PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)
Commit 284f5f9 was intended to disable the "only_one_child()" optimization
on Stratus ftServer systems, but its DMI check is wrong. It looks for
DMI_SYS_VENDOR that contains "ftServer", when it should look for
DMI_SYS_VENDOR containing "Stratus" and DMI_PRODUCT_NAME containing
"ftServer".
Tested on Stratus ftServer 6400.
Reported-by: Fadeeva Marina <astarta@rat.ru>
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=51331
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.5+
Bjorn Helgaas [Wed, 26 Dec 2012 17:39:22 +0000 (10:39 -0700)]
PCI: Remove spurious error for sriov_numvfs store and simplify flow
If we request "num_vfs" and the driver's sriov_configure() method enables
exactly that number ("num_vfs_enabled"), we complain "Invalid value for
number of VFs to enable" and return an error. We should silently return
success instead.
Also, use kstrtou16() since numVFs is defined to be a 16-bit field and
rework to simplify control flow.
Reported-by: Greg Rose <gregory.v.rose@intel.com>
Reference: http://lkml.kernel.org/r/
20121214101911.
00002f59@unknown
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Donald Dutile <ddutile@redhat.com>
Eric W. Biederman [Sat, 22 Dec 2012 09:52:39 +0000 (01:52 -0800)]
f2fs: Don't assign e_id in f2fs_acl_from_disk
With user namespaces enabled building f2fs fails with:
CC fs/f2fs/acl.o
fs/f2fs/acl.c: In function ‘f2fs_acl_from_disk’:
fs/f2fs/acl.c:85:21: error: ‘struct posix_acl_entry’ has no member named ‘e_id’
make[2]: *** [fs/f2fs/acl.o] Error 1
make[2]: Target `__build' not remade because of errors.
e_id is a backwards compatibility field only used for file systems
that haven't been converted to use kuids and kgids. When the posix
acl tag field is neither ACL_USER nor ACL_GROUP assigning e_id is
unnecessary. Remove the assignment so f2fs will build with user
namespaces enabled.
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Sat, 22 Dec 2012 04:38:00 +0000 (20:38 -0800)]
proc: Allow proc_free_inum to be called from any context
While testing the pid namespace code I hit this nasty warning.
[ 176.262617] ------------[ cut here ]------------
[ 176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0()
[ 176.265145] Hardware name: Bochs
[ 176.265677] Modules linked in:
[ 176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18
[ 176.266564] Call Trace:
[ 176.266564] [<
ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0
[ 176.266564] [<
ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20
[ 176.266564] [<
ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0
[ 176.266564] [<
ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20
[ 176.266564] [<
ffffffff8123dbda>] proc_free_inum+0x3a/0x50
[ 176.266564] [<
ffffffff8111d0dc>] free_pid_ns+0x1c/0x80
[ 176.266564] [<
ffffffff8111d195>] put_pid_ns+0x35/0x50
[ 176.266564] [<
ffffffff810c608a>] put_pid+0x4a/0x60
[ 176.266564] [<
ffffffff8146b177>] tty_ioctl+0x717/0xc10
[ 176.266564] [<
ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90
[ 176.266564] [<
ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10
[ 176.266564] [<
ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70
[ 176.266564] [<
ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550
[ 176.266564] [<
ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60
[ 176.266564] [<
ffffffff810b9127>] ? __set_task_blocked+0x37/0x80
[ 176.266564] [<
ffffffff810ab95b>] ? sys_wait4+0xab/0xf0
[ 176.266564] [<
ffffffff811e3d31>] sys_ioctl+0x91/0xb0
[ 176.266564] [<
ffffffff810a95f0>] ? task_stopped_code+0x50/0x50
[ 176.266564] [<
ffffffff81939199>] system_call_fastpath+0x16/0x1b
[ 176.266564] ---[ end trace
387af88219ad6143 ]---
It turns out that spin_unlock_bh(proc_inum_lock) is not safe when
put_pid is called with another spinlock held and irqs disabled.
For now take the easy path and use spin_lock_irqsave(proc_inum_lock)
in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock).
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Sat, 22 Dec 2012 04:27:12 +0000 (20:27 -0800)]
pidns: Stop pid allocation when init dies
Oleg pointed out that in a pid namespace the sequence.
- pid 1 becomes a zombie
- setns(thepidns), fork,...
- reaping pid 1.
- The injected processes exiting.
Can lead to processes attempting access their child reaper and
instead following a stale pointer.
That waitpid for init can return before all of the processes in
the pid namespace have exited is also unfortunate.
Avoid these problems by disabling the allocation of new pids in a pid
namespace when init dies, instead of when the last process in a pid
namespace is reaped.
Pointed-out-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Michael Tokarev [Tue, 25 Dec 2012 19:08:16 +0000 (14:08 -0500)]
ext4: do not try to write superblock on ro remount w/o journal
When a journal-less ext4 filesystem is mounted on a read-only block
device (blockdev --setro will do), each remount (for other, unrelated,
flags, like suid=>nosuid etc) results in a series of scary messages
from kernel telling about I/O errors on the device.
This is becauese of the following code ext4_remount():
if (sbi->s_journal == NULL)
ext4_commit_super(sb, 1);
at the end of remount procedure, which forces writing (flushing) of
a superblock regardless whenever it is dirty or not, if the filesystem
is readonly or not, and whenever the device itself is readonly or not.
We only need call ext4_commit_super when the file system had been
previously mounted read/write.
Thanks to Eric Sandeen for help in diagnosing this issue.
Signed-off-By: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
Eric Sandeen [Tue, 25 Dec 2012 18:56:01 +0000 (13:56 -0500)]
ext4: include journal blocks in df overhead calcs
To more accurately calculate overhead for "bsd" style
df reporting, we should count the journal blocks as
overhead as well.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Eric Whitney <enwlinux@gmail.com>
Eric Sandeen [Tue, 25 Dec 2012 18:33:13 +0000 (13:33 -0500)]
ext4: remove unaligned AIO warning printk
Although I put this in, I now think it was a bad decision. For most
users, there is very little to be done in this case. They get the
message, once per day, with no real context or proposed action. TBH,
it generates support calls when it probably does not need to; the
message sounds more dire than the situation really is.
Just nuke it. Normal investigation via blktrace or whatnot can
reveal poor IO patterns if bad performance is encountered.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Andy Lutomirski [Tue, 25 Dec 2012 18:31:52 +0000 (13:31 -0500)]
ext4: fix an incorrect comment about i_mutex
i_mutex is not held when ->sync_file is called.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Jan Kara [Tue, 25 Dec 2012 18:29:52 +0000 (13:29 -0500)]
ext4: fix deadlock in journal_unmap_buffer()
We cannot wait for transaction commit in journal_unmap_buffer()
because we hold page lock which ranks below transaction start. We
solve the issue by bailing out of journal_unmap_buffer() and
jbd2_journal_invalidatepage() with -EBUSY. Caller is then responsible
for waiting for transaction commit to finish and try invalidation
again. Since the issue can happen only for page stradding i_size, it
is simple enough to manually call jbd2_journal_invalidatepage() for
such page from ext4_setattr(), check the return value and wait if
necessary.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Jan Kara [Tue, 25 Dec 2012 18:28:54 +0000 (13:28 -0500)]
ext4: split off ext4_journalled_invalidatepage()
In data=journal mode we don't need delalloc or DIO handling in invalidatepage
and similarly in other modes we don't need the journal handling. So split
invalidatepage implementations.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Eric W. Biederman [Fri, 21 Dec 2012 03:26:06 +0000 (19:26 -0800)]
pidns: Outlaw thread creation after unshare(CLONE_NEWPID)
The sequence:
unshare(CLONE_NEWPID)
clone(CLONE_THREAD|CLONE_SIGHAND|CLONE_VM)
Creates a new process in the new pid namespace without setting
pid_ns->child_reaper. After forking this results in a NULL
pointer dereference.
Avoid this and other nonsense scenarios that can show up after
creating a new pid namespace with unshare by adding a new
check in copy_prodcess.
Pointed-out-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cong Wang [Sun, 23 Dec 2012 15:23:16 +0000 (15:23 +0000)]
arp: fix a regression in arp_solicit()
Sedat reported the following commit caused a regression:
commit
9650388b5c56578fdccc79c57a8c82fb92b8e7f1
Author: Eric Dumazet <edumazet@google.com>
Date: Fri Dec 21 07:32:10 2012 +0000
ipv4: arp: fix a lockdep splat in arp_solicit
This is due to the 6th parameter of arp_send() needs to be NULL
for the broadcast case, the above commit changed it to an all-zero
array by mistake.
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 23 Dec 2012 17:48:33 +0000 (09:48 -0800)]
Merge branch 'i2c-embedded/for-next' of git://git.pengutronix.de/git/wsa/linux
Pull i2c __dev* attribute removal from Wolfram Sang:
"The squashed patches from Bill to get rid of the __dev* annotations in
the i2c subsystem. I couldn't include it in my previous pull request
due to some dependency with the mfd subsystem. I had this patch in
linux-next for two days before rc1 and nothing popped up."
* 'i2c-embedded/for-next' of git://git.pengutronix.de/git/wsa/linux:
i2c: remove __dev* attributes from subsystem
Zlatko Calusic [Sun, 23 Dec 2012 14:12:54 +0000 (15:12 +0100)]
mm: modify pgdat_balanced() so that it also handles order-0
Teach pgdat_balanced() about order-0 allocations so that we can simplify
code in a few places in vmstat.c.
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Sun, 23 Dec 2012 13:39:32 +0000 (14:39 +0100)]
Partly revert "[media] uvcvideo: Set error_idx properly for extended controls API failures"
Commit
f0ed2ce840b3 ("[media] uvcvideo: Set error_idx properly for
extended controls API failures") causes user space to behave incorrectly
on one of my test machines (there is no sound under KDE 4.9.4 using
pulseaudio and there is a knotify4 process occupying one of the CPU
cores 100% of the time). Reverting that commit entirely fixes the
problem for me.
However, commit
f0ed2ce840b3 appears to do more than it follows from its
changelog, because the changelog only says about the changes related to
ctrls->error_idx, while the commit additionally changes error codes
returned by various functions in uvc_ctrl.c and uvc_v4l2.c. It turns
out that the changes of the returned error codes confuse the user spce,
so it is sufficient to revert the part of commit
f0ed2ce840b3 not
mentioned in its changelog to fix the problem.
[ 'ENOENT' is not a valid error return from an ioctl to begin with, and
I don't understand how anybody ever even thought it would be. - Linus ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maxime Ripard [Tue, 18 Dec 2012 14:17:12 +0000 (15:17 +0100)]
sunxi: Change the machine compatible string.
Commit
68136b10 ("ARM: sunxi: Change device tree naming scheme for
sunxi") changed the naming scheme and the compatible strings used in the
device trees related to the sunXi platform, but forgot to change the
compatible string in the DT machine definition.
This prevents the kernel from booting on these boards.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Maxime Ripard [Wed, 21 Nov 2012 09:57:49 +0000 (10:57 +0100)]
ARM: multi_v7_defconfig: Add ARCH_SUNXI
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Ben Skeggs [Thu, 20 Dec 2012 02:52:06 +0000 (12:52 +1000)]
drm/nve0/graph: fix fuc, and enable acceleration on all known chipsets
Also adds GK106 to chipsets known by ucode.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 20 Dec 2012 02:50:52 +0000 (12:50 +1000)]
drm/nvc0/graph: fix fuc, and enable acceleration on GF119
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>