Peter Feiner [Thu, 25 Sep 2014 23:05:18 +0000 (16:05 -0700)]
mm: softdirty: addresses before VMAs in PTE holes aren't softdirty
In PTE holes that contain VM_SOFTDIRTY VMAs, unmapped addresses before
VM_SOFTDIRTY VMAs are reported as softdirty by /proc/pid/pagemap. This
bug was introduced in commit
68b5a6524856 ("mm: softdirty: respect
VM_SOFTDIRTY in PTE holes"). That commit made /proc/pid/pagemap look at
VM_SOFTDIRTY in PTE holes but neglected to observe the start of VMAs
returned by find_vma.
Tested:
Wrote a selftest that creates a PMD-sized VMA then unmaps the first
page and asserts that the page is not softdirty. I'm going to send the
pagemap selftest in a later commit.
Signed-off-by: Peter Feiner <pfeiner@google.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Jamie Liu <jamieliu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Thu, 25 Sep 2014 23:05:16 +0000 (16:05 -0700)]
ocfs2/dlm: do not get resource spinlock if lockres is new
There is a deadlock case which reported by Guozhonghua:
https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html
This case is caused by &res->spinlock and &dlm->master_lock
misordering in different threads.
It was introduced by commit
8d400b81cc83 ("ocfs2/dlm: Clean up refmap
helpers"). Since lockres is new, it doesn't not require the
&res->spinlock. So remove it.
Fixes:
8d400b81cc83 ("ocfs2/dlm: Clean up refmap helpers")
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Reported-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andreas Rohner [Thu, 25 Sep 2014 23:05:14 +0000 (16:05 -0700)]
nilfs2: fix data loss with mmap()
This bug leads to reproducible silent data loss, despite the use of
msync(), sync() and a clean unmount of the file system. It is easily
reproducible with the following script:
----------------[BEGIN SCRIPT]--------------------
mkfs.nilfs2 -f /dev/sdb
mount /dev/sdb /mnt
dd if=/dev/zero bs=1M count=30 of=/mnt/testfile
umount /mnt
mount /dev/sdb /mnt
CHECKSUM_BEFORE="$(md5sum /mnt/testfile)"
/root/mmaptest/mmaptest /mnt/testfile 30 10 5
sync
CHECKSUM_AFTER="$(md5sum /mnt/testfile)"
umount /mnt
mount /dev/sdb /mnt
CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)"
umount /mnt
echo "BEFORE MMAP:\t$CHECKSUM_BEFORE"
echo "AFTER MMAP:\t$CHECKSUM_AFTER"
echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT"
----------------[END SCRIPT]--------------------
The mmaptest tool looks something like this (very simplified, with
error checking removed):
----------------[BEGIN mmaptest]--------------------
data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, file_offset);
for (i = 0; i < write_count; ++i) {
memcpy(data + i * 4096, buf, sizeof(buf));
msync(data, file_size - file_offset, MS_SYNC))
}
----------------[END mmaptest]--------------------
The output of the script looks something like this:
BEFORE MMAP:
281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile
AFTER MMAP:
6604a1c31f10780331a6850371b3a313 /mnt/testfile
AFTER REMOUNT:
281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile
So it is clear, that the changes done using mmap() do not survive a
remount. This can be reproduced a 100% of the time. The problem was
introduced in commit
136e8770cd5d ("nilfs2: fix issue of
nilfs_set_page_dirty() for page at EOF boundary").
If the page was read with mpage_readpage() or mpage_readpages() for
example, then it has no buffers attached to it. In that case
page_has_buffers(page) in nilfs_set_page_dirty() will be false.
Therefore nilfs_set_file_dirty() is never called and the pages are never
collected and never written to disk.
This patch fixes the problem by also calling nilfs_set_file_dirty() if the
page has no buffers attached to it.
[akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/]
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Thu, 25 Sep 2014 23:05:11 +0000 (16:05 -0700)]
ocfs2: free vol_label in ocfs2_delete_osb()
osb->vol_label is malloced in ocfs2_initialize_super but not freed if
error occurs or during umount, thus causing a memory leak.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 24 Sep 2014 22:57:55 +0000 (15:57 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Some final radeon and i915 fixes, black screens mostly"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/cik: use a separate counter for CP init timeout
drm/i915/hdmi: fix hdmi audio state readout
drm/i915: Don't leak command parser tables on suspend/resume
drm/radeon: add PX quirk for asus K53TK
drm/radeon: add a backlight quirk for Amilo Xi 2550
drm/radeon: add a module parameter for backlight control (v2)
drm/radeon: Update IH_RB_RPTR register after each processed interrupt
drm/radeon: Make IH ring overflow debugging output more useful
drm/radeon: Clear RB_OVERFLOW bit earlier
Dave Airlie [Wed, 24 Sep 2014 20:49:26 +0000 (06:49 +1000)]
Merge branch 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
- fix a backlight regression resulting in dark screen
- add a PX quirk to avoid a hang with runtime pm
- fix an init issue on the CIK compute rings
- fix IH ring buffer overflows gracefully
* 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon/cik: use a separate counter for CP init timeout
drm/radeon: add PX quirk for asus K53TK
drm/radeon: add a backlight quirk for Amilo Xi 2550
drm/radeon: add a module parameter for backlight control (v2)
drm/radeon: Update IH_RB_RPTR register after each processed interrupt
drm/radeon: Make IH ring overflow debugging output more useful
drm/radeon: Clear RB_OVERFLOW bit earlier
Dave Airlie [Wed, 24 Sep 2014 20:48:26 +0000 (06:48 +1000)]
Merge tag 'drm-intel-fixes-2014-09-24' of git://anongit.freedesktop.org/drm-intel into drm-fixes
a couple of small fixes for 3.17 still.
* tag 'drm-intel-fixes-2014-09-24' of git://anongit.freedesktop.org/drm-intel:
drm/i915/hdmi: fix hdmi audio state readout
drm/i915: Don't leak command parser tables on suspend/resume
Linus Torvalds [Wed, 24 Sep 2014 19:45:24 +0000 (12:45 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"Here is a quick pull request primarily meant to address the deconfig
fallout from changing SCSI_NETLINK from being used via 'select' to
being used via 'depends'.
I applied a set of 5 patches written by Michal Marek, and then I
carefully audited all of the remaining config files, basically:
1) I scanned every arch config file, and if it mentioned CONFIG_INET
or CONFIG_UNIX, I made sure it had CONFIG_NET=y
2) After that, I scanned every arch config file, and if it did not
have CONFIG_NET=y I made sure it did not reference any networking
config options.
Finally, we have some late breaking wireless fixes in here from John
Linville and co"
[ And there's a sparc bpf fix snuck in too ]
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
sparc: bpf_jit: fix loads from negative offsets
parisc: Update defconfigs which were missing CONFIG_NET.
powerpc: Update defconfigs which were missing CONFIG_NET.
s390: Update defconfigs which were missing CONFIG_NET.
mips: Update some more defconfigs which were missing CONFIG_NET.
sparc: Set CONFIG_NET=y in defconfigs
sh: Set CONFIG_NET=y in defconfigs
powerpc: Set CONFIG_NET=y in defconfigs
parisc: Set CONFIG_NET=y in defconfigs
mips: Set CONFIG_NET=y in defconfigs
brcmfmac: Fix off by one bug in brcmf_count_20mhz_channels()
ath9k: Fix NULL pointer dereference on early irq
net: rfkill: gpio: Fix clock status
NFC: st21nfca: Fix potential depmod dependency cycle
NFC: st21nfcb: Fix depmod dependency cycle
NFC: microread: Potential overflows in microread_target_discovered()
Alexei Starovoitov [Tue, 23 Sep 2014 20:50:10 +0000 (13:50 -0700)]
sparc: bpf_jit: fix loads from negative offsets
- fix BPF_LD|ABS|IND from negative offsets:
make sure to sign extend lower 32 bits in 64-bit register
before calling C helpers from JITed code, otherwise 'int k'
argument of bpf_internal_load_pointer_neg_helper() function
will be added as large unsigned integer, causing packet size
check to trigger and abort the program.
It's worth noting that JITed code for 'A = A op K' will affect
upper 32 bits differently depending whether K is simm13 or not.
Since small constants are sign extended, whereas large constants
are stored in temp register and zero extended.
That is ok and we don't have to pay a penalty of sign extension
for every sethi, since all classic BPF instructions have 32-bit
semantics and we only need to set correct upper bits when
transitioning from JITed code into C.
- though instructions 'A &= 0' and 'A *= 0' are odd, JIT compiler
should not optimize them out
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Sep 2014 19:00:12 +0000 (15:00 -0400)]
Merge tag 'master-2014-09-23' of git://git./linux/kernel/git/linville/wireless
John W. Linville says:
====================
pull request: wireless 2014-09-23
Please consider pulling this one last batch of fixes intended for the 3.17 stream!
For the NFC bits, Samuel says:
"Hopefully not too late for a handful of NFC fixes:
- 2 potential build failures for ST21NFCA and ST21NFCB, triggered by a
depmod dependenyc cycle.
- One potential buffer overflow in the microread driver."
On top of that...
Emil Goode provides a fix for a brcmfmac off-by-one regression which
was introduced in the 3.17 cycle.
Loic Poulain fixes a polarity mismatch for a variable assignment
inside of rfkill-gpio.
Wojciech Dubowik prevents a NULL pointer dereference in ath9k.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Sep 2014 17:53:53 +0000 (13:53 -0400)]
parisc: Update defconfigs which were missing CONFIG_NET.
Commit
df568d8e ("scsi: Use 'depends' with LIBFC instead of
'select'.") removed what happened to be the only instance of 'select
NET'. Defconfigs that were relying on the select now lack networking
support.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Sep 2014 17:53:43 +0000 (13:53 -0400)]
powerpc: Update defconfigs which were missing CONFIG_NET.
Commit
df568d8e ("scsi: Use 'depends' with LIBFC instead of
'select'.") removed what happened to be the only instance of 'select
NET'. Defconfigs that were relying on the select now lack networking
support.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Sep 2014 17:44:32 +0000 (13:44 -0400)]
s390: Update defconfigs which were missing CONFIG_NET.
Commit
df568d8e ("scsi: Use 'depends' with LIBFC instead of
'select'.") removed what happened to be the only instance of 'select
NET'. Defconfigs that were relying on the select now lack networking
support.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Sep 2014 17:44:16 +0000 (13:44 -0400)]
mips: Update some more defconfigs which were missing CONFIG_NET.
Commit
df568d8e ("scsi: Use 'depends' with LIBFC instead of
'select'.") removed what happened to be the only instance of 'select
NET'. Defconfigs that were relying on the select now lack networking
support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Marek [Tue, 23 Sep 2014 15:44:04 +0000 (17:44 +0200)]
sparc: Set CONFIG_NET=y in defconfigs
Commit
5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
instead of selecting NET") removed what happened to be the only instance
of 'select NET'. Defconfigs that were relying on the select now lack
networking support.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: sparclinux@vger.kernel.org
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Marek [Tue, 23 Sep 2014 15:44:03 +0000 (17:44 +0200)]
sh: Set CONFIG_NET=y in defconfigs
Commit
5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
instead of selecting NET") removed what happened to be the only instance
of 'select NET'. Defconfigs that were relying on the select now lack
networking support.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Marek [Tue, 23 Sep 2014 15:44:02 +0000 (17:44 +0200)]
powerpc: Set CONFIG_NET=y in defconfigs
Commit
5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
instead of selecting NET") removed what happened to be the only instance
of 'select NET'. Defconfigs that were relying on the select now lack
networking support.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Marek [Tue, 23 Sep 2014 15:44:01 +0000 (17:44 +0200)]
parisc: Set CONFIG_NET=y in defconfigs
Commit
5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
instead of selecting NET") removed what happened to be the only instance
of 'select NET'. Defconfigs that were relying on the select now lack
networking support.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Marek [Tue, 23 Sep 2014 15:44:00 +0000 (17:44 +0200)]
mips: Set CONFIG_NET=y in defconfigs
Commit
5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
instead of selecting NET") removed what happened to be the only instance
of 'select NET'. Defconfigs that were relying on the select now lack
networking support.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-mips@linux-mips.org
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 24 Sep 2014 16:51:50 +0000 (09:51 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull one last block fix from Jens Axboe:
"We've had an issue with scsi-mq where probing takes forever. This was
bisected down to the percpu changes for blk_mq_queue_enter(), and the
fact we now suffer an RCU grace period when killing a queue. SCSI
creates and destroys tons of queues, so this let to 10s of seconds of
stalls at boot for some.
Tejun has a real fix for this, but it's too involved for 3.17. So
this is a temporary workaround to expedite the queue killing until we
can fold in the real fix for 3.18 when that merge window opens"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe
Linus Torvalds [Wed, 24 Sep 2014 16:46:29 +0000 (09:46 -0700)]
Merge tag 'pci-v3.17-fixes-3' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Here are a few fixes that should be in v3.17.
- Reverting "Don't scan random busses" covers up a CardBus regression
having to do with allocating CardBus bus numbers.
- Reverting "Make sure bus numbers stay within parents bounds" covers
up an ACPI _CRS bug that makes us reconfigure a bridge, causing a
broken device behind it to stop responding.
- The pciehp timeout change fixes some code we added in v3.17.
Without the fix, we can send a new hotplug command too early,
before the timeout has expired.
I hope for better fixes for the reverts, but those will have to come
after v3.17"
* tag 'pci-v3.17-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: pciehp: Fix pcie_wait_cmd() timeout
Revert "PCI: Make sure bus number resources stay within their parents bounds"
Revert "PCI: Don't scan random busses in pci_scan_bridge()"
Linus Torvalds [Wed, 24 Sep 2014 16:37:35 +0000 (09:37 -0700)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes three issues:
- if ccp is loaded on a machine without ccp, it will incorrectly
activate causing all requests to fail. Fixed by preventing ccp
from loading if hardware isn't available.
- not all IRQs were enabled for the qat driver, leading to potential
stalls when it is used
- disabled buggy AVX CTR implementation in aesni"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: aesni - disable "by8" AVX CTR optimization
crypto: ccp - Check for CCP before registering crypto algs
crypto: qat - Enable all 32 IRQs
Linus Torvalds [Wed, 24 Sep 2014 16:03:43 +0000 (09:03 -0700)]
Merge tag 'media/v3.17-rc7' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"For some last time fixes:
- a regression detected on Kernel 3.16 related to VBI Teletext
application breakage on drivers using videobuf2 (see
https://bugzilla.kernel.org/show_bug.cgi?id=84401). The bug was
noticed on saa7134 (migrated to VB2 on 3.16), but also affects
em28xx (migrated on 3.9 to VB2);
- two additional sanity checks at videobuf2;
- two fixups to restore proper VBI support at the em28xx driver;
- two Kernel oops fixups (at cx24123 and cx2341x drivers);
- a bug at adv7604 where an if was doing just the opposite as it
would be expected;
- some documentation fixups to match the behavior defined at the
Kernel"
* tag 'media/v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] em28xx-v4l: get rid of field "users" in struct em28xx_v4l2"
[media] em28xx: fix VBI handling logic
[media] DocBook media: improve the poll() documentation
[media] DocBook media: fix the poll() 'no QBUF' documentation
[media] vb2: fix VBI/poll regression
[media] cx2341x: fix kernel oops
[media] cx24123: fix kernel oops due to missing parent pointer
[media] adv7604: fix inverted condition
[media] media/radio: fix radio-miropcm20.c build with io.h header file
[media] vb2: fix plane index sanity check in vb2_plane_cookie()
[media] DocBook media: update version number and V4L2 changes
[media] DocBook media: fix fieldname in struct v4l2_subdev_selection
[media] vb2: fix vb2 state check when start_streaming fails
[media] videobuf2-core.h: fix comment
[media] videobuf2-core: add comments before the WARN_ON
[media] videobuf2-dma-sg: fix for wrong GFP mask to sg_alloc_table_from_pages
Linus Torvalds [Wed, 24 Sep 2014 15:53:33 +0000 (08:53 -0700)]
Merge tag 'md/3.17-more-fixes' of git://git.neil.brown.name/md
Pull bugfixes for md/raid1 from Neil Brown:
"It is amazing how much easier it is to find bugs when you know one is
there. Two bug reports resulted in finding 7 bugs!
All are tagged for -stable. Those that can't cause (rare) data
corruption, cause lockups.
Particularly, but not only, fixing new "resync" code"
* tag 'md/3.17-more-fixes' of git://git.neil.brown.name/md:
md/raid1: fix_read_error should act on all non-faulty devices.
md/raid1: count resync requests in nr_pending.
md/raid1: update next_resync under resync_lock.
md/raid1: Don't use next_resync to determine how far resync has progressed
md/raid1: make sure resync waits for conflicting writes to complete.
md/raid1: clean up request counts properly in close_sync()
md/raid1: be more cautious where we read-balance during resync.
md/raid1: intialise start_next_window for READ case to avoid hang
Tejun Heo [Tue, 23 Sep 2014 19:24:32 +0000 (15:24 -0400)]
blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe
blk-mq uses percpu_ref for its usage counter which tracks the number
of in-flight commands and used to synchronously drain the queue on
freeze. percpu_ref shutdown takes measureable wallclock time as it
involves a sched RCU grace period. This means that draining a blk-mq
takes measureable wallclock time. One would think that this shouldn't
matter as queue shutdown should be a rare event which takes place
asynchronously w.r.t. userland.
Unfortunately, SCSI probing involves synchronously setting up and then
tearing down a lot of request_queues back-to-back for non-existent
LUNs. This means that SCSI probing may take more than ten seconds
when scsi-mq is used.
This will be properly fixed by implementing a mechanism to keep
q->mq_usage_counter in atomic mode till genhd registration; however,
that involves rather big updates to percpu_ref which is difficult to
apply late in the devel cycle (v3.17-rc6 at the moment). As a
stop-gap measure till the proper fix can be implemented in the next
cycle, this patch introduces __percpu_ref_kill_expedited() and makes
blk_mq_freeze_queue() use it. This is heavy-handed but should work
for testing the experimental SCSI blk-mq implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
Fixes:
add703fda981 ("blk-mq: use percpu_ref for mq usage count")
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Mathias Krause [Tue, 23 Sep 2014 20:31:07 +0000 (22:31 +0200)]
crypto: aesni - disable "by8" AVX CTR optimization
The "by8" implementation introduced in commit
22cddcc7df8f ("crypto: aes
- AES CTR x86_64 "by8" AVX optimization") is failing crypto tests as it
handles counter block overflows differently. It only accounts the right
most 32 bit as a counter -- not the whole block as all other
implementations do. This makes it fail the cryptomgr test #4 that
specifically tests this corner case.
As we're quite late in the release cycle, just disable the "by8" variant
for now.
Reported-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tom Lendacky [Fri, 5 Sep 2014 15:31:09 +0000 (10:31 -0500)]
crypto: ccp - Check for CCP before registering crypto algs
If the ccp is built as a built-in module, then ccp-crypto (whether
built as a module or a built-in module) will be able to load and
it will register its crypto algorithms. If the system does not have
a CCP this will result in -ENODEV being returned whenever a command
is attempted to be queued by the registered crypto algorithms.
Add an API, ccp_present(), that checks for the presence of a CCP
on the system. The ccp-crypto module can use this to determine if it
should register it's crypto alogorithms.
Cc: stable@vger.kernel.org
Reported-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Linus Torvalds [Tue, 23 Sep 2014 23:47:34 +0000 (16:47 -0700)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull infiniband/rdma fixes from Roland Dreier:
"Last late set of InfiniBand/RDMA fixes for 3.17:
- fixes for the new memory region re-registration support
- iSER initiator error path fixes
- grab bag of small fixes for the qib and ocrdma hardware drivers
- larger set of fixes for mlx4, especially in RoCE mode"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
IB/mlx4: Fix VF mac handling in RoCE
IB/mlx4: Do not allow APM under RoCE
IB/mlx4: Don't update QP1 in native mode
IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
IB/core: When marshaling uverbs path, clear unused fields
IB/mlx4: Avoid executing gid task when device is being removed
IB/mlx4: Fix lockdep splat for the iboe lock
IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
IB/mlx4: Reorder steps in RoCE GID table initialization
IB/mlx4: Don't duplicate the default RoCE GID
IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
IB/iser: Bump version to 1.4.1
IB/iser: Allow bind only when connection state is UP
IB/iser: Fix RX/TX CQ resource leak on error flow
RDMA/ocrdma: Use right macro in query AH
RDMA/ocrdma: Resolve L2 address when creating user AH
mlx4: Correct error flows in rereg_mr
IB/qib: Correct reference counting in debugfs qp_stats
IPoIB: Remove unnecessary port query
...
Linus Torvalds [Tue, 23 Sep 2014 21:47:11 +0000 (14:47 -0700)]
Merge tag 'sound-3.17-rc7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"One fix is about a buggy computation in PCM API function Clemens
spotted out, but the impact must be really small as no one really uses
it in user-space side.
The rest are a trivial fix for a HD-audio model and a USB-audio
device-specific regression fix, so all look fairly safe to apply"
* tag 'sound-3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb-caiaq: Fix LED commands for Kore controller
ALSA: pcm: fix fifo_size frame calculation
ALSA: hda - Add fixup model name lookup for Lemote A1205
Linus Torvalds [Tue, 23 Sep 2014 21:45:09 +0000 (14:45 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull final block fixes from Jens Axboe:
"This week and last we've been fixing some corner cases related to
blk-mq, mostly. I ended up pulling most of that out of for-linus
yesterday, which is why the branch looks fresh. The rest were
postponed for 3.18.
This pull request contains:
- Fix from Christoph, avoiding a stack overflow when FUA insertion
would recursive infinitely.
- Fix from David Hildenbrand on races between the timeout handler and
uninitialized requests. Fixes a real issue that virtio_blk has run
into.
- A few fixes from me:
- Ensure that request deadline/timeout is ordered before the
request is marked as started.
- A potential oops on out-of-memory, when we scale the queue
depth of the device and retry.
- A hang fix on requeue from SCSI, where the hardware queue
would be stopped when we attempt to re-run it (and hence
nothing would happen, stalling progress).
- A fix for commit
2da78092, where the cleanup path was moved
to RCU, but a debug might_sleep() was inadvertently left in
the code. This causes warnings for people"
* 'for-linus' of git://git.kernel.dk/linux-block:
genhd: fix leftover might_sleep() in blk_free_devt()
blk-mq: use blk_mq_start_hw_queues() when running requeue work
blk-mq: fix potential oops on out-of-memory in __blk_mq_alloc_rq_maps()
blk-mq: avoid infinite recursion with the FUA flag
blk-mq: Avoid race condition with uninitialized requests
blk-mq: request deadline must be visible before marking rq as started
Linus Torvalds [Tue, 23 Sep 2014 21:05:32 +0000 (14:05 -0700)]
Merge branch 'parisc-3.17-7' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"We avoid using -mfast-indirect-calls for 64bit kernel builds to
prevent building an unbootable kernel due to latest gcc changes.
In the pdc_stable/firmware-access driver we fix a few possible stack
overflows and we now call secure_computing_strict() instead of
secure_computing() which fixes upcoming SECCOMP patches in the
for-next trees"
* 'parisc-3.17-7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
parisc: pdc_stable.c: Avoid potential stack overflows
parisc: pdc_stable.c: Cleaning up unnecessary use of memset in conjunction with strncpy
parisc: ptrace: use secure_computing_strict()
John David Anglin [Tue, 23 Sep 2014 00:54:50 +0000 (20:54 -0400)]
parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
In spite of what the GCC manual says, the -mfast-indirect-calls has
never been supported in the 64-bit parisc compiler. Indirect calls have
always been done using function descriptors irrespective of the
-mfast-indirect-calls option.
Recently, it was noticed that a function descriptor was always requested
when the -mfast-indirect-calls option was specified. This caused
problems when the option was used in application code and doesn't make
any sense because the whole point of the option is to avoid using a
function descriptor for indirect calls.
Fixing this broke 64-bit kernel builds.
I will fix GCC but for now we need the attached change. This results in
the same kernel code as before.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v3.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Linus Torvalds [Tue, 23 Sep 2014 19:10:48 +0000 (12:10 -0700)]
Merge tag 'please-pull-defconfig' of git://git./linux/kernel/git/aegl/linux
Pull ia64 defconfig update from Tony Luck:
"Need to rebuild defconfig files to cope with removal of "select NET"
in drivers/scsi/Kconfig"
* tag 'please-pull-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] refresh arch/ia64/configs/* using "make savedefconfig"
Linus Torvalds [Tue, 23 Sep 2014 18:53:28 +0000 (11:53 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Fix a resource leak in tmp103 driver
- Add support for two more processors to fam15h_power driver
- Also fix a bug in the same driver to only report the power level on
chips which actually support reporting it
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (tmp103) Fix resource leak bug in tmp103 temperature sensor driver
hwmon: (fam15h_power) Add support for two more processors
hwmon: (fam15h_power) Make actual power reporting conditional
Tony Luck [Mon, 22 Sep 2014 16:35:11 +0000 (09:35 -0700)]
[IA64] refresh arch/ia64/configs/* using "make savedefconfig"
Prompted by a change to drivers/scsi/Kconfig which used to do a
"select NET" but now does a "depends on NET". This meant that some
configurations ended up without CONFIG_NET=y
Signed-off-by Tony Luck <tony.luck@intel.com>
Linus Torvalds [Tue, 23 Sep 2014 16:13:43 +0000 (09:13 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull another kvm fix from Paolo Bonzini:
"Another fix for 3.17 arrived at just the wrong time, after I had sent
yesterday's pull request. Normally I would have waited for some other
patches to pile up, but since 3.17 might be short here it is"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
arm/arm64: KVM: Fix unaligned access bug on gicv2 access
Linus Torvalds [Tue, 23 Sep 2014 16:06:18 +0000 (09:06 -0700)]
Merge branch 'for-3.17-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"One late fix for cgroup.
I was waiting for another set of fixes for a long-standing obscure
cpuset bug but am not sure whether they'll be ready before v3.17
release. This one is a simple fix for a mutex unlock balance bug in
an allocation failure path in pidlist_array_load().
The bug was introduced in v3.14 and the fix is tagged for -stable"
* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix unbalanced locking
Emil Goode [Mon, 22 Sep 2014 22:49:55 +0000 (00:49 +0200)]
brcmfmac: Fix off by one bug in brcmf_count_20mhz_channels()
In the brcmf_count_20mhz_channels function we are looping through a list
of channels received from firmware. Since the index of the first channel
is 0 the condition leads to an off by one bug. This is causing us to hit
the WARN_ON_ONCE(1) calls in the brcmu_d11n_decchspec function, which is
how I discovered the bug.
Introduced by:
commit
b48d891676f756d48b4d0ee131e4a7a5d43ca417
("brcmfmac: rework wiphy structure setup")
Acked-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Emil Goode <emilgoode@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Alex Deucher [Tue, 23 Sep 2014 14:20:13 +0000 (10:20 -0400)]
drm/radeon/cik: use a separate counter for CP init timeout
Otherwise we may fail to init the second compute ring.
Noticed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Jani Nikula [Wed, 17 Sep 2014 12:34:58 +0000 (15:34 +0300)]
drm/i915/hdmi: fix hdmi audio state readout
Check the correct bit for audio. Seems like a copy-paste error from the
start:
commit
9ed109a7b445e3f073d8ea72f888ec80c0532465
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Apr 24 23:54:52 2014 +0200
drm/i915: Track has_audio in the pipe config
Reported-by: Martin Andersen <martin.x.andersen@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82756
Cc: stable@vger.kernel.org # 3.16+
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Paolo Bonzini [Tue, 23 Sep 2014 13:18:02 +0000 (15:18 +0200)]
Merge tag 'kvm-arm-for-v3.17-rc7-or-final' of git://git./linux/kernel/git/kvmarm/kvmarm into kvm-master
Fixes unaligned access to the gicv2 virtual cpu status.
Brad Volkin [Mon, 22 Sep 2014 15:25:21 +0000 (08:25 -0700)]
drm/i915: Don't leak command parser tables on suspend/resume
Ring init and cleanup are not balanced because we re-init the rings on
resume without having cleaned them up on suspend. This leads to the
driver leaking the parser's hash tables with a kmemleak signature such
as this:
unreferenced object 0xffff880405960980 (size 32):
comm "systemd-udevd", pid 516, jiffies
4294896961 (age 10202.044s)
hex dump (first 32 bytes):
d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F.............
98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(.............
backtrace:
[<
ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0
[<
ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0
[<
ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915]
[<
ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915]
[<
ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915]
[<
ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915]
[<
ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915]
[<
ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915]
[<
ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm]
[<
ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm]
[<
ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915]
[<
ffffffff81436725>] local_pci_probe+0x45/0xa0
[<
ffffffff81437a69>] pci_device_probe+0xd9/0x130
[<
ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0
[<
ffffffff815252d3>] __driver_attach+0x93/0xa0
[<
ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0
This patch extends the current convention of checking whether a
resource is already allocated before allocating it during ring init.
Longer term it might make sense to only init the rings once.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794
Tested-by: Kari Suvanto <kari.tj.suvanto@gmail.com>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Linus Torvalds [Tue, 23 Sep 2014 06:05:49 +0000 (23:05 -0700)]
Revert "x86/efi: Fixup GOT in all boot code paths"
This reverts commit
9cb0e394234d244fe5a97e743ec9dd7ddff7e64b.
It causes my Sony Vaio Pro 11 to immediately reboot at startup.
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Tue, 23 Sep 2014 02:05:45 +0000 (20:05 -0600)]
PCI: pciehp: Fix pcie_wait_cmd() timeout
pcie_poll_cmd() take msecs instead of jiffies, so convert timeout to msecs.
Fixes:
40b960831cfa ("PCI: pciehp: Compute timeout from hotplug command start time")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Linus Torvalds [Tue, 23 Sep 2014 01:23:33 +0000 (18:23 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) If the user gives us a msg_namelen of 0, don't try to interpret
anything pointed to by msg_name. From Ani Sinha.
2) Fix some bnx2i/bnx2fc randconfig compilation errors.
The gist of the issue is that we firstly have drivers that span both
SCSI and networking. And at the top of that chain of dependencies
we have things like SCSI_FC_ATTRS and SCSI_NETLINK which are
selected.
But since select is a sledgehammer and ignores dependencies,
everything to select's SCSI_FC_ATTRS and/or SCSI_NETLINK has to also
explicitly select their dependencies and so on and so forth.
Generally speaking 'select' is supposed to only be used for child
nodes, those which have no dependencies of their own. And this
whole chain of dependencies in the scsi layer violates that rather
strongly.
So just make SCSI_NETLINK depend upon it's dependencies, and so on
and so forth for the things selecting it (either directly or
indirectly).
From Anish Bhatt and Randy Dunlap.
3) Fix generation of blackhole routes in IPSEC, from Steffen Klassert.
4) Actually notice netdev feature changes in rtl_open() code, from
Hayes Wang.
5) Fix divide by zero in bond enslaving, from Nikolay Aleksandrov.
6) Missing memory barrier in sunvnet driver, from David Stevens.
7) Don't leave anycast addresses around when ipv6 interface is
destroyed, from Sabrina Dubroca.
8) Don't call efx_{arch}_filter_sync_rx_mode before addr_list_lock is
initialized in SFC driver, from Edward Cree.
9) Fix missing DMA error checking in 3c59x, from Neal Horman.
10) Openvswitch doesn't emit OVS_FLOW_CMD_NEW notifications accidently,
fix from Samuel Gauthier.
11) pch_gbe needs to select NET_PTP_CLASSIFY otherwise we can get a
build error.
12) Fix macvlan regression wherein we stopped emitting
broadcast/multicast frames over software devices. From Nicolas
Dichtel.
13) Fix infiniband bug due to unintended overflow of skb->cb[], from
Eric Dumazet. And add an assertion so this doesn't happen again.
14) dm9000_parse_dt() should return error pointers, not NULL. From
Tobias Klauser.
15) IP tunneling code uses this_cpu_ptr() in preemptible contexts, fix
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma
net: bcmgenet: fix TX reclaim accounting for fragments
ipv4: do not use this_cpu_ptr() in preemptible context
dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt()
r8169: fix an if condition
r8152: disable ALDPS
ipoib: validate struct ipoib_cb size
net: sched: shrink struct qdisc_skb_cb to 28 bytes
tg3: Work around HW/FW limitations with vlan encapsulated frames
macvlan: allow to enqueue broadcast pkt on virtual device
pch_gbe: 'select' NET_PTP_CLASSIFY.
scsi: Use 'depends' with LIBFC instead of 'select'.
openvswitch: restore OVS_FLOW_CMD_NEW notifications
genetlink: add function genl_has_listeners()
lib: rhashtable: remove second linux/log2.h inclusion
net: allow macvlans to move to net namespace
3c59x: Fix bad offset spec in skb_frag_dma_map
3c59x: Add dma error checking and recovery
sparc: bpf_jit: fix support for ldx/stx mem and SKF_AD_VLAN_TAG
can: at91_can: add missing prepare and unprepare of the clock
...
Linus Torvalds [Tue, 23 Sep 2014 01:18:55 +0000 (18:18 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux
Pull clock layer fixes from Mike Turquette:
"The fixes for the clock tree are mostly run-time bugs in clock
drivers.
The fixes for TI DRA7 remove divide-by-zero errors. The recently
merged AT91 clock driver fixes some bad error checking and the QCOM
driver fix restores audio for that platform, a clear regression. A
list iteration bug in the framework core was hit recently and is fixed
up here. Finally a compilation warning is fixed for efm32gg, which is
also a regression fix"
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
clk/efm32gg: fix dt init prototype
clk: prevent erronous parsing of children during rate change
clk: rockchip: Fix the clocks for i2c1 and i2c2
clk: qcom: Fix sdc 144kHz frequency entry
clk: at91: fix num_parents test in at91sam9260 slow clk implementation
clk: ti: dra7-atl: Provide error check for incoming parameters in set_rate
clk: ti: divider: Provide error check for incoming parameters in set_rate
Linus Torvalds [Tue, 23 Sep 2014 00:52:16 +0000 (17:52 -0700)]
Merge tag 'fscache-fixes-
20140917' of git://git./linux/kernel/git/dhowells/linux-fs
Pull fs-cache fixes from David Howells:
- Put a timeout in releasepage() to deal with a recursive hang between
the memory allocator, writeback, ext4 and fscache under memory
pressure.
- Fix a pair of refcount bugs in the fscache error handling.
- Remove a couple of unused pagevecs.
- The cachefiles requirement that the base directory support rename
should permit rename2 as an alternative - otherwise certain
filesystems cannot now be used as backing stores (such as ext4).
* tag 'fscache-fixes-
20140917' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
CacheFiles: Handle rename2
cachefiles: remove two unused pagevecs.
FS-Cache: refcount becomes corrupt under vma pressure.
FS-Cache: Reduce cookie ref count if submit fails.
FS-Cache: Timeout for releasepage()
David S. Miller [Mon, 22 Sep 2014 22:38:53 +0000 (18:38 -0400)]
Merge branch 'bcmgenet'
Florian Fainelli says:
====================
net: bcmgenet: TX reclaim and DMA fixes
This patch set contains one fix for an accounting problem while reclaiming
transmitted buffers having fragments, and the second fix is to make sure
that the DMA shutdown is properly controlled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 22 Sep 2014 18:54:43 +0000 (11:54 -0700)]
net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma
We should not be manipulaging the DMA_CTRL registers directly by writing
0 to them to disable DMA. This is an operation that needs to be timed to
make sure the DMA engines have been properly stopped since their state
machine stops on a packet boundary, not immediately.
Make sure that tha bcmgenet_fini_dma() calls bcmgenet_dma_teardown() to
ensure a proper DMA engine state. As a result, we need to reorder the
function bodies to resolve the use dependency.
Fixes:
1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 22 Sep 2014 18:54:42 +0000 (11:54 -0700)]
net: bcmgenet: fix TX reclaim accounting for fragments
The GENET driver supports SKB fragments, and succeeds in transmitting
them properly, but when reclaiming these transmitted fragments, we will
only update the count of free buffer descriptors by 1, even for SKBs
with fragments. This leads to the networking stack thinking it has more
room than the hardware has when pushing new SKBs, and backing off
consequently because we return NETDEV_TX_BUSY.
Fix this by accounting for the SKB nr_frags plus one (itself) and update
ring->free_bds accordingly with that value for each iteration loop in
__bcmgenet_tx_reclaim().
Fixes:
1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 22 Sep 2014 17:38:16 +0000 (10:38 -0700)]
ipv4: do not use this_cpu_ptr() in preemptible context
this_cpu_ptr() in preemptible context is generally bad
Sep 22 05:05:55 br kernel: [ 94.608310] BUG: using smp_processor_id()
in
preemptible [
00000000] code: ip/2261
Sep 22 05:05:55 br kernel: [ 94.608316] caller is
tunnel_dst_set.isra.28+0x20/0x60 [ip_tunnel]
Sep 22 05:05:55 br kernel: [ 94.608319] CPU: 3 PID: 2261 Comm: ip Not
tainted
3.17.0-rc5 #82
We can simply use raw_cpu_ptr(), as preemption is safe in these
contexts.
Should fix https://bugzilla.kernel.org/show_bug.cgi?id=84991
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Joe <joe9mail@gmail.com>
Fixes:
9a4aa9af447f ("ipv4: Use percpu Cache route in IP tunnels")
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Deucher [Mon, 22 Sep 2014 21:28:29 +0000 (17:28 -0400)]
drm/radeon: add PX quirk for asus K53TK
Seems to have problems turning the dGPU on/off.
bug:
https://bugzilla.kernel.org/show_bug.cgi?id=51381
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 17 Sep 2014 15:31:02 +0000 (11:31 -0400)]
drm/radeon: add a backlight quirk for Amilo Xi 2550
Only the acpi backlight seems to work. Using the
radeon backlight controller causes the backlight to
go off.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=81382
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 17 Sep 2014 00:57:26 +0000 (20:57 -0400)]
drm/radeon: add a module parameter for backlight control (v2)
Add a module parameter to disable the radeon GPU backlight
controller to override the automatic detection. Some
laptops seems to indicate that they use the integrated
controller, but appear to actually use an external
controller.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=81382
v2: fix module parameter description
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Fri, 19 Sep 2014 03:22:10 +0000 (12:22 +0900)]
drm/radeon: Update IH_RB_RPTR register after each processed interrupt
This might decrease the chance of IH ring buffer overflows.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Fri, 19 Sep 2014 03:22:07 +0000 (12:22 +0900)]
drm/radeon: Make IH ring overflow debugging output more useful
Use the same format for all ring indices, and fix the calculation of the
post-overflow RPTR.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Fri, 19 Sep 2014 03:07:11 +0000 (12:07 +0900)]
drm/radeon: Clear RB_OVERFLOW bit earlier
Otherwise the bit remains set in rdev->ih.rptr, so the wptr can never
match that and we still have an infinite loop.
This fix allows me to successfully recover from an IH ring buffer
overflow.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christoffer Dall [Mon, 22 Sep 2014 20:10:36 +0000 (22:10 +0200)]
arm/arm64: KVM: Fix unaligned access bug on gicv2 access
We were using an atomic bitop on the vgic_v2.vgic_elrsr field which was
not aligned to the natural size on 64-bit platforms. This bug showed up
after QEMU correctly identifies the pl011 line as being level-triggered,
and not edge-triggered.
These data structures are protected by a spinlock so simply use a
non-atomic version of the accessor instead.
Tested-by: Joel Schopp <joel.schopp@amd.com>
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Jens Axboe [Tue, 16 Sep 2014 19:38:51 +0000 (13:38 -0600)]
genhd: fix leftover might_sleep() in blk_free_devt()
Commit
2da78092 changed the locking from a mutex to a spinlock,
so we now longer sleep in this context. But there was a leftover
might_sleep() in there, which now triggers since we do the final
free from an RCU callback. Get rid of it.
Reported-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
David S. Miller [Mon, 22 Sep 2014 20:41:41 +0000 (16:41 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2014-09-22
We generate a blackhole or queueing route if a packet
matches an IPsec policy but a state can't be resolved.
Here we assume that dst_output() is called to kill
these packets. Unfortunately this assumption is not
true in all cases, so it is possible that these packets
leave the system without the necessary transformations.
This pull request contains two patches to fix this issue:
1) Fix for blackhole routed packets.
2) Fix for queue routed packets.
Both patches are serious stable candidates.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Fri, 19 Sep 2014 14:16:25 +0000 (16:16 +0200)]
dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt()
In one error condition dm9000_parse_dt() returns NULL, however the
return value is checked using IS_ERR() in dm9000_probe(), leading to the
error not being properly propagated if CONFIG_OF is not enabled or the
device tree data is not available. Fix this by also returning an
ERR_PTR() in this case.
Fixes:
0b8bf1baabe5 (net: dm9000: Allow instantiation using device tree)
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wojciech Dubowik [Thu, 18 Sep 2014 06:30:41 +0000 (08:30 +0200)]
ath9k: Fix NULL pointer dereference on early irq
The ah struct might not have been initialized when
interrupt comes so check for it.
Signed-off-by: Wojciech Dubowik <Wojciech.Dubowik@neratec.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Loic Poulain [Tue, 16 Sep 2014 12:53:58 +0000 (14:53 +0200)]
net: rfkill: gpio: Fix clock status
Clock is disabled when the device is blocked.
So, clock_enabled is the logical negation of "blocked".
Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
John W. Linville [Mon, 22 Sep 2014 19:57:46 +0000 (15:57 -0400)]
Merge tag 'nfc-fixes-3.17-1' of git://git./linux/kernel/git/sameo/nfc-fixes
Samuel Ortiz <sameo@linux.intel.com> says:
"NFC: 3.17 fixes
We have 3 NFC fixes for 3.17:
- 2 potential build failures for ST21NFCA and ST21NFCB, triggered by a
depmod dependenyc cycle.
- One potential buffer overflow in the microread driver."
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Dan Carpenter [Fri, 19 Sep 2014 10:40:25 +0000 (13:40 +0300)]
r8169: fix an if condition
There is an extra semi-colon so __rtl8169_set_features() is called every
time.
Fixes:
929a031dfd62 ('r8169: adjust __rtl8169_set_features')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Hayes Wang <hayeswang@realtek.com>--
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 22 Sep 2014 18:58:23 +0000 (11:58 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Two very simple bugfixes, affecting all supported architectures"
[ Two? There's three commits in here. Oh well, I guess Paolo didn't
count the preparatory symbol export ]
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: correct null pid check in kvm_vcpu_yield_to()
KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()
mm: export symbol dependencies of is_zero_pfn()
hayeswang [Fri, 19 Sep 2014 07:17:18 +0000 (15:17 +0800)]
r8152: disable ALDPS
If the hw is in ALDPS mode, the hw may have no response for accessing
the most registers. Therefore, the ALDPS should be disabled before
accessing the hw in rtl_ops.init(), rtl_ops.disable(), rtl_ops.up(),
and rtl_ops.down(). Regardless of rtl_ops.enable(), because the hw
wouldn't enter ALDPS mode when linking on. The hw would enter the
ALDPS mode after several seconds when link down occurs and the ALDPS
is enabled.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 18 Sep 2014 18:00:27 +0000 (11:00 -0700)]
ipoib: validate struct ipoib_cb size
To catch future errors sooner.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 18 Sep 2014 15:02:05 +0000 (08:02 -0700)]
net: sched: shrink struct qdisc_skb_cb to 28 bytes
We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
or increasing skb->cb[] size.
Commit
e0f31d849867 ("flow_keys: Record IP layer protocol in
skb_flow_dissect()") broke IPoIB.
Only current offender is sch_choke, and this one do not need an
absolutely precise flow key.
If we store 17 bytes of flow key, its more than enough. (Its the actual
size of flow_keys if it was a packed structure, but we might add new
fields at the end of it later)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes:
e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Thu, 18 Sep 2014 14:31:17 +0000 (10:31 -0400)]
tg3: Work around HW/FW limitations with vlan encapsulated frames
TG3 appears to have an issue performing TSO and checksum offloading
correclty when the frame has been vlan encapsulated (non-accelrated).
In these cases, tcp checksum is not correctly updated.
This patch attempts to work around this issue. After the patch,
802.1ad vlans start working correctly over tg3 devices.
CC: Prashant Sreedharan <prashant@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sundarjdev [Mon, 22 Sep 2014 17:31:39 +0000 (10:31 -0700)]
hwmon: (tmp103) Fix resource leak bug in tmp103 temperature sensor driver
tmp103 temperature sensor driver registers with the hwmon framework by calling
hwmon_device_register_with_groups but does not have a .remove method to call
hwmon_device_unregister to unregister from the framework when the device is no
longer needed. Fix this by calling devm_hwmon_device_register_with_groups.
Signed-off-by: Sundar J Dev <sundarjayakumardev@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Nicolas Dichtel [Wed, 17 Sep 2014 08:08:08 +0000 (10:08 +0200)]
macvlan: allow to enqueue broadcast pkt on virtual device
Since commit
412ca1550cbe ("macvlan: Move broadcasts into a work queue"), the
driver uses tx_queue_len of the master device as the limit of packets enqueuing.
Problem is that virtual drivers have this value set to 0, thus all broadcast
packets were rejected.
Because tx_queue_len was arbitrarily chosen, I replace it with a static limit
of 1000 (also arbitrarily chosen).
CC: Herbert Xu <herbert@gondor.apana.org.au>
Reported-by: Thibaut Collet <thibaut.collet@6wind.com>
Suggested-by: Thibaut Collet <thibaut.collet@6wind.com>
Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jens Axboe [Fri, 19 Sep 2014 19:10:29 +0000 (13:10 -0600)]
blk-mq: use blk_mq_start_hw_queues() when running requeue work
When requests are retried due to hw or sw resource shortages,
we often stop the associated hardware queue. So ensure that we
restart the queues when running the requeue work, otherwise the
queue run will be a no-op.
Signed-off-by: Jens Axboe <axboe@fb.com>
Jens Axboe [Fri, 19 Sep 2014 14:04:53 +0000 (08:04 -0600)]
blk-mq: fix potential oops on out-of-memory in __blk_mq_alloc_rq_maps()
__blk_mq_alloc_rq_maps() can be invoked multiple times, if we scale
back the queue depth if we are low on memory. So don't clear
set->tags when we fail, this is handled directly in
the parent function, blk_mq_alloc_tag_set().
Reported-by: Robert Elliott <Elliott@hp.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Christoph Hellwig [Tue, 16 Sep 2014 21:44:07 +0000 (14:44 -0700)]
blk-mq: avoid infinite recursion with the FUA flag
We should not insert requests into the flush state machine from
blk_mq_insert_request. All incoming flush requests come through
blk_{m,s}q_make_request and are handled there, while blk_execute_rq_nowait
should only be called for BLOCK_PC requests. All other callers
deal with requests that already went through the flush statemchine
and shouldn't be reinserted into it.
Reported-by: Robert Elliott <Elliott@hp.com>
Debugged-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
David Hildenbrand [Thu, 18 Sep 2014 09:04:31 +0000 (11:04 +0200)]
blk-mq: Avoid race condition with uninitialized requests
This patch should fix the bug reported in
https://lkml.org/lkml/2014/9/11/249.
We have to initialize at least the atomic_flags and the cmd_flags when
allocating storage for the requests.
Otherwise blk_mq_timeout_check() might dereference uninitialized
pointers when racing with the creation of a request.
Also move the reset of cmd_flags for the initializing code to the point
where a request is freed. So we will never end up with pending flush
request indicators that might trigger dereferences of invalid pointers
in blk_mq_timeout_check().
Cc: stable@vger.kernel.org
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reported-by: Paulo De Rezende Pinatti <ppinatti@linux.vnet.ibm.com>
Tested-by: Paulo De Rezende Pinatti <ppinatti@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Jens Axboe [Tue, 16 Sep 2014 16:37:37 +0000 (10:37 -0600)]
blk-mq: request deadline must be visible before marking rq as started
When we start the request, we set the deadline and flip the bits
marking the request as started and non-complete. However, it's
important that the deadline store is ordered before flipping the
bits, otherwise we could have a small window where the request is
marked started but with an invalid deadline. This can confuse the
timeout handling.
Suggested-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
David S. Miller [Mon, 22 Sep 2014 17:25:51 +0000 (13:25 -0400)]
pch_gbe: 'select' NET_PTP_CLASSIFY.
Fixes the following randconfig build failure:
> drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c: In function
> ‘pch_ptp_match’:
> drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:130:2: error:
> implicit declaration of function ‘ptp_classify_raw’
> [-Werror=implicit-function-declaration]
> if (ptp_classify_raw(skb) == PTP_CLASS_NONE)
> ^
> cc1: some warnings being treated as errors
> make[5]: *** [drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o] Error 1
Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 22 Sep 2014 17:14:33 +0000 (13:14 -0400)]
scsi: Use 'depends' with LIBFC instead of 'select'.
LIBFC depends upon SCSI_FC_ATTRS and select's CRC32C.
The only alternative would be to 'select' CRC32C and all of
SCSI_FC_ATTRS direct and indirect dependencies in the Kconfig section
for every LIBFCOE user which makes little sense.
Subsequently, use 'depends' instead of 'select' for LIBFCOE too.
Signed-off-by: David S. Miller <davem@davemloft.net>
Roland Dreier [Mon, 22 Sep 2014 17:05:40 +0000 (10:05 -0700)]
Merge branches 'core', 'ipoib', 'iser', 'mlx4', 'ocrdma' and 'qib' into for-next
Jack Morgenstein [Thu, 11 Sep 2014 11:11:20 +0000 (14:11 +0300)]
IB/mlx4: Fix VF mac handling in RoCE
We had several problems here. First, a race condition on QP1 mac
handling between mlx4_ib_update_qps and mlx4_ib_modify_qp, which is
fixed by taking the qp mutex in mlx4_ib_update_qps.
Also, qp->pri.smac_port was not updated in mlx4_ib_update_qps.
Last, in __mlx4_ib_modify_qp we did not properly handle the case where
the mac is zero, but port is non-zero.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Jack Morgenstein [Thu, 11 Sep 2014 11:11:19 +0000 (14:11 +0300)]
IB/mlx4: Do not allow APM under RoCE
Automatic Path Migration is not supported under RoCE. Therefore,
return a "not-supported" error if the caller attempts to set an
alternate path in a QP context.
In addition, if there are no IB ports configured, do not report
APM capability in the device flags returned by mlx4_ib_query_device.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Jack Morgenstein [Thu, 11 Sep 2014 11:11:18 +0000 (14:11 +0300)]
IB/mlx4: Don't update QP1 in native mode
For native functions (non-SR-IOV), there's no reason to update
the smac_index, as QP1 is a GSI QP.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Jack Morgenstein [Thu, 11 Sep 2014 11:11:17 +0000 (14:11 +0300)]
IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
The source MAC is needed in RoCE when building the QP1 header.
Currently, this is obtained from the source net device. However, the net
device may not yet exist, or can be destroyed in parallel to this QP1 send
operation (e.g through the VPI port change flow) so accessing it may cause
a kernel crash.
To fix this, we maintain a source MAC cache per port for the net device in
struct mlx4_ib_roce. This cached MAC is initialized to be the default MAC
address obtained during HCA initialization via QUERY_PORT. This cached MAC
is updated via the netdev event notifier handler.
Since the cached MAC is held in an atomic64 object, we do not need locking
when accessing it.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Jack Morgenstein [Thu, 11 Sep 2014 11:11:16 +0000 (14:11 +0300)]
mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
There is a chance that the VF mlx4 RoCE driver (mlx4_ib) may see a 0-mac
as the current default MAC address when a RoCE interface first comes up.
In this case, the RoCE driver registers the 0-mac to get its MAC index --
used in the INIT2RTR transition when it creates its proxy Q1 qp's.
If we do not allow QP1 to be created, the RoCE driver will not come up.
If we do not register the 0-mac, but simply use a random mac-index,
QP1 will attempt to send packets with an someone's else source MAC which
will get the system into more troubled.
Since a 0-mac was previously used to indicate a free slot, this leads to
errors, both when the 0-mac is registered and when it is unregistered.
The required fix is to check in addition that the slot containing the
0-mac has a reference count of zero.
Additionally, when comparing MAC addresses, need to mask out the 2 MSBs
of the u64 mac on both sides of the comparison.
Note that when the EN driver (mlx4_en) comes up, it set itself a proper
mac --> the RoCE driver gets to be notified on that and further handing
is done with the update qp command, as was added by commit
9433c188915c
("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes").
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Matan Barak [Tue, 2 Sep 2014 12:32:34 +0000 (15:32 +0300)]
IB/core: When marshaling uverbs path, clear unused fields
When marsheling a user path to the kernel struct ib_sa_path, need
to zero smac, dmac and set the vlan id to the "no vlan" value.
Fixes:
dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Reported-by: Aleksey Senin <alekseys@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Moni Shoua [Thu, 21 Aug 2014 11:28:42 +0000 (14:28 +0300)]
IB/mlx4: Avoid executing gid task when device is being removed
When device is being removed (e.g during VPI port link type change
from ETH to IB), tasks for gid table changes should not be executed.
Flush the current queue of tasks and block further tasks from entering the queue.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Jack Morgenstein [Thu, 21 Aug 2014 11:28:41 +0000 (14:28 +0300)]
IB/mlx4: Fix lockdep splat for the iboe lock
Chuck Lever reported the following stack trace:
=================================
[ INFO: inconsistent lock state ]
3.16.0-rc2-00024-g2e78883 #17 Tainted: G E
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&(&iboe->lock)->rlock){+.?...}, at: [<
ffffffffa065f68b>] mlx4_ib_addr_event+0xdb/0x1a0 [mlx4_ib]
{SOFTIRQ-ON-W} state was registered at:
[<
ffffffff810b3110>] mark_irqflags+0x110/0x170
[<
ffffffff810b4806>] __lock_acquire+0x2c6/0x5b0
[<
ffffffff810b4bd9>] lock_acquire+0xe9/0x120
[<
ffffffff815f7f6e>] _raw_spin_lock+0x3e/0x80
[<
ffffffffa0661084>] mlx4_ib_scan_netdevs+0x34/0x260 [mlx4_ib]
[<
ffffffffa06612db>] mlx4_ib_netdev_event+0x2b/0x40 [mlx4_ib]
[<
ffffffff81522219>] register_netdevice_notifier+0x99/0x1e0
[<
ffffffffa06626e3>] mlx4_ib_add+0x743/0xbc0 [mlx4_ib]
[<
ffffffffa05ec168>] mlx4_add_device+0x48/0xa0 [mlx4_core]
[<
ffffffffa05ec2c3>] mlx4_register_interface+0x73/0xb0 [mlx4_core]
[<
ffffffffa05c505e>] cm_req_handler+0x13e/0x460 [ib_cm]
[<
ffffffff810002e2>] do_one_initcall+0x112/0x1c0
[<
ffffffff810e8264>] do_init_module+0x34/0x190
[<
ffffffff810ea62f>] load_module+0x5cf/0x740
[<
ffffffff810ea939>] SyS_init_module+0x99/0xd0
[<
ffffffff815f8fd2>] system_call_fastpath+0x16/0x1b
irq event stamp: 336142
hardirqs last enabled at (336142): [<
ffffffff810612f5>] __local_bh_enable_ip+0xb5/0xc0
hardirqs last disabled at (336141): [<
ffffffff81061296>] __local_bh_enable_ip+0x56/0xc0
softirqs last enabled at (336004): [<
ffffffff8106123a>] _local_bh_enable+0x4a/0x50
softirqs last disabled at (336005): [<
ffffffff810617a4>] irq_exit+0x44/0xd0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&iboe->lock)->rlock);
<Interrupt>
lock(&(&iboe->lock)->rlock);
*** DEADLOCK ***
The above problem was caused by the spin lock being taken both in the process
context and in a soft-irq context (in a netdev notifier handler).
The required fix is to use spin_lock/unlock_bh() instead of spin_lock/unlock
on the iboe lock.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Moni Shoua [Thu, 21 Aug 2014 11:28:40 +0000 (14:28 +0300)]
IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
When a RoCE port becomes active and the netdev of the port has upper
device (e.g bond/team), GIDs derived from the upper dev should appear
in the port's RoCE GID table.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Moni Shoua [Thu, 21 Aug 2014 11:28:39 +0000 (14:28 +0300)]
IB/mlx4: Reorder steps in RoCE GID table initialization
There's no need to reset the gid table twice and we need to do it only
for Ethernet ports. Also, no need to actively scan ndetdevs since it's
being done immediatly after we register netdev notifiers.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Moni Shoua [Thu, 21 Aug 2014 11:28:38 +0000 (14:28 +0300)]
IB/mlx4: Don't duplicate the default RoCE GID
When reading the IPv6 addresses from the net-device, make sure to
avoid adding a duplicate entry to the GID table because of equality
between the default GID we generate and the default IPv6 link-local
address of the device.
Fixes:
acc4fccf4eff ("IB/mlx4: Make sure GID index 0 is always occupied")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Moni Shoua [Thu, 21 Aug 2014 11:28:37 +0000 (14:28 +0300)]
IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
When Ethernet netdev is not present for a port (e.g. when the link
layer type of the port is InfiniBand) it's possible to dereference a
null pointer when we do netdevice scanning.
To fix that, we move a section of code that needs to run only when
netdev is present to a proper if () statement.
Fixes:
ad4885d279b6 ("IB/mlx4: Build the port IBoE GID table properly under bonding")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Or Gerlitz [Tue, 2 Sep 2014 14:08:43 +0000 (17:08 +0300)]
IB/iser: Bump version to 1.4.1
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Sagi Grimberg [Tue, 2 Sep 2014 14:08:42 +0000 (17:08 +0300)]
IB/iser: Allow bind only when connection state is UP
We need to fail the bind operation if the iser connection state != UP
(started teardown) and this should be done under the state lock.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Roi Dayan [Tue, 2 Sep 2014 14:08:41 +0000 (17:08 +0300)]
IB/iser: Fix RX/TX CQ resource leak on error flow
When failing to allocate TX CQ we already allocated RX CQ, so we need to make
sure we release it. Also, when failing to register notification to the RX CQ
we currently leak both RX and TX CQs of the current index, fix that too.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
devesh.sharma@emulex.com [Fri, 5 Sep 2014 09:39:49 +0000 (15:09 +0530)]
RDMA/ocrdma: Use right macro in query AH
ocrdma_query_ah() does not use correct macro, and checks the wrong bit
for the validity of address handle in vector table. Fix this.
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
devesh.sharma@emulex.com [Fri, 5 Sep 2014 09:39:48 +0000 (15:09 +0530)]
RDMA/ocrdma: Resolve L2 address when creating user AH
Because of IP-based GIDs, userspace AHs must have MAC and VLAN ID
resolved separately. Presently, user AHs are broken for ocrdma. This
patch resolves L2 addresses while creating user AH and obtains the
right DMAC and VLAN ID before creating AH.
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Matan Barak [Thu, 11 Sep 2014 10:18:37 +0000 (13:18 +0300)]
mlx4: Correct error flows in rereg_mr
This patch addresses feedback from Sagi Grimberg on the rereg_mr
implementation of mlx4. The following are fixed:
1. Set the correct pd_flags
2. Make sure we change the iova and size MR fields only after
successful write and allocation of the MTTs.
3. Make the error checking more robust
Fixes:
e630664c8383 ("mlx4_core: Add helper functions to support MR re-registration")
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Linus Torvalds [Mon, 22 Sep 2014 15:42:55 +0000 (08:42 -0700)]
Merge branch 'for-3.17-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"create_singlethread_workqueue() is the old interface which is kept
around for backward compatibility - each should be reviewed to
determine whether singlethread usage was to save worker threads or for
ordering guarantee and whether it's depended upon by memory reclaim
path.
While adding NUMA support for unbound workqueues during v3.10, I
forgot to update it breaking the singlethread and ordering properties
on NUMA setups. The breakage was unfortunately rather subtle and went
without being reported until now.
The only missing piece is __WQ_ORDERED flag which makes the unbounded
workqueue use a single backend queue across different NUMA nodes.
It's fixed by making create_singlethread_workqueue() wrap
alloc_ordered_workqueue() so that possible future updates are
inherited automatically"
* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: apply __WQ_ORDERED to create_singlethread_workqueue()
Anton Altaparmakov [Mon, 22 Sep 2014 00:53:03 +0000 (01:53 +0100)]
Fix nasty 32-bit overflow bug in buffer i/o code.
On 32-bit architectures, the legacy buffer_head functions are not always
handling the sector number with the proper 64-bit types, and will thus
fail on 4TB+ disks.
Any code that uses __getblk() (and thus bread(), breadahead(),
sb_bread(), sb_breadahead(), sb_getblk()), and calls it using a 64-bit
block on a 32-bit arch (where "long" is 32-bit) causes an inifinite loop
in __getblk_slow() with an infinite stream of errors logged to dmesg
like this:
__find_get_block_slow() failed. block=
6740375944, b_blocknr=
2445408648
b_state=0x00000020, b_size=512
device sda1 blocksize: 512
Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the
top 32-bits are missing (in this case the 0x1 at the top).
This is because grow_dev_page() is broken and has a 32-bit overflow due
to shifting the page index value (a pgoff_t - which is just 32 bits on
32-bit architectures) left-shifted as the block number. But the top
bits to get lost as the pgoff_t is not type cast to sector_t / 64-bit
before the shift.
This patch fixes this issue by type casting "index" to sector_t before
doing the left shift.
Note this is not a theoretical bug but has been seen in the field on a
4TiB hard drive with logical sector size 512 bytes.
This patch has been verified to fix the infinite loop problem on 3.17-rc5
kernel using a 4TB disk image mounted using "-o loop". Without this patch
doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop
100% reproducibly whilst with the patch it works fine as expected.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>