Len Brown [Fri, 23 Dec 2016 04:57:55 +0000 (23:57 -0500)]
tools/power turbostat: Make extensible via the --add parameter
Create the "--add" parameter. This can be used to teach an existing
turbostat binary about any number of any type of counter.
turbostat(8) details the syntax for --add.
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 2 Dec 2016 04:10:39 +0000 (23:10 -0500)]
tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz
This changes only the TSC frequency decoding line seen with --debug
old: TSC: 1382 MHz (19200000 Hz * 216 / 3 / 1000000)
new: TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000)
Signed-off-by: Len Brown <len.brown@intel.com>
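The old/new debug lines above are plain integer arithmetic: crystal frequency times a ratio numerator, divided by the denominator, scaled to MHz. A minimal sketch of that calculation follows; tsc_mhz() and its parameter names are hypothetical and are not taken from turbostat.c.

```c
#include <stdint.h>

/* Hypothetical helper, not turbostat's actual code: derive the TSC
 * frequency in MHz from the crystal clock and a numerator/denominator
 * ratio, mirroring "crystal_hz * 216 / 3 / 1000000" in the debug line. */
static uint64_t tsc_mhz(uint64_t crystal_hz, uint64_t numerator,
			uint64_t denominator)
{
	if (denominator == 0)
		return 0;	/* ratio not available */
	return crystal_hz * numerator / denominator / 1000000;
}
```

With the values from the message, a 19.2 MHz crystal yields 1382 and a 25 MHz crystal yields 1800, which is the difference this commit corrects.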
Len Brown [Fri, 2 Dec 2016 02:14:38 +0000 (21:14 -0500)]
tools/power turbostat: line up headers when -M is used
The -M option adds an 18-column item, and the header
needs to be wide enough to keep the header aligned
with the columns.
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 2 Dec 2016 01:27:46 +0000 (20:27 -0500)]
tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding
SKX has fewer package C-states than previous generations,
and so the decoding of PKG_CSTATE_LIMIT has changed.
This changes the line ending with pkg-cstate-limit=XXX: pcYYY
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Thu, 1 Dec 2016 06:35:38 +0000 (01:35 -0500)]
tools/power turbostat: Support Knights Mill (KNM)
Original-author: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Srinivas Pandruvada [Fri, 11 Nov 2016 22:29:48 +0000 (14:29 -0800)]
tools/power turbostat: Display HWP OOB status
Display if the HWP is enabled in OOB (Out of band) mode.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Xiaolong Wang [Fri, 30 Sep 2016 09:53:40 +0000 (17:53 +0800)]
tools/power turbostat: fix Denverton BCLK
Add Denverton to the group of SandyBridge and later processors,
so that the bclk is recognized as 100MHz rather than 133MHz.
This avoids wrong values for the frequencies derived from it,
including Bzy_MHz, max efficiency frequency, base frequency,
and turbo mode frequencies.
Signed-off-by: Xiaolong Wang <xiaolong.wang@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 17 Jun 2016 03:22:37 +0000 (23:22 -0400)]
tools/power turbostat: use intel-family.h model strings
All except for model 1F, a Nehalem, which is currently incorrectly
identified as a Westmere in that new header.
Signed-off-by: Len Brown <len.brown@intel.com>
Jacob Pan [Thu, 16 Jun 2016 16:48:22 +0000 (09:48 -0700)]
tools/power/turbostat: Add Denverton RAPL support
The Denverton CPU RAPL supports package, core, and DRAM domains.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Jacob Pan [Thu, 16 Jun 2016 16:48:21 +0000 (09:48 -0700)]
tools/power/turbostat: Add Denverton support
Denverton is an Atom-based microserver which shares the same
Goldmont architecture as Broxton. The available C-states on
Denverton are a subset of Broxton's, with only C1, C1e, and C6.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Jacob Pan [Thu, 16 Jun 2016 16:48:20 +0000 (09:48 -0700)]
tools/power/turbostat: split core MSR support into status + limit
Some CPUs may not have PP0/Core domain power limit MSRs. We
should still allow its domain energy status to be used. This
patch splits PP0/Core RAPL into two separate flags for power
limit and energy status such that energy status can continue
to be reported without power limit.
Without this patch, turbostat will not be able to use the
remaining RAPL features if some PL MSRs are not present.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Colin Ian King [Mon, 25 Apr 2016 12:03:15 +0000 (13:03 +0100)]
tools/power turbostat: fix error case overflow read of slm_freq_table[]
When i >= SLM_BCLK_FREQS, the frequency read from the slm_freq_table
is off the end of the array because msr is set to 3 rather than the
actual array index i. Set i to 3 rather than msr to fix this.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Len Brown <len.brown@intel.com>
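The out-of-range case above can be sketched as follows. The table contents and the helper name slm_bclk_lookup() are illustrative assumptions, not the code in turbostat.c; the point is only that the corrected value must be the one used to index the array.

```c
#define SLM_BCLK_FREQS 5

/* Illustrative values only; the real table lives in turbostat.c. */
static const double slm_freq_table[SLM_BCLK_FREQS] = {
	83.3, 100.0, 133.3, 116.7, 80.0
};

/* Hypothetical lookup: when the index is out of range, fall back to
 * entry 3 by fixing up the index itself, so the array read that
 * follows stays in bounds. */
static double slm_bclk_lookup(unsigned int i)
{
	if (i >= SLM_BCLK_FREQS)
		i = 3;	/* clamp the variable that indexes the table */
	return slm_freq_table[i];
}
```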
Mika Westerberg [Fri, 22 Apr 2016 08:13:23 +0000 (11:13 +0300)]
tools/power turbostat: Allocate correct amount of fd and irq entries
The tool uses topo.max_cpu_num to determine the number of entries needed for
fd_percpu[] and irqs_per_cpu[]. For example, on a system with 4 CPUs
topo.max_cpu_num is 3, so the arrays are too small to hold the per-CPU items.
Fix this by using the right number of entries, which is topo.max_cpu_num + 1.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
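The off-by-one above comes from treating the highest CPU number as a count. A minimal sketch of the corrected sizing, using a hypothetical helper name rather than turbostat's own allocation code:

```c
#include <stdlib.h>

/* Hypothetical helper: CPU numbers run from 0 to max_cpu_num inclusive,
 * so a per-CPU array needs max_cpu_num + 1 zero-initialized slots. */
static int *alloc_percpu_ints(int max_cpu_num)
{
	if (max_cpu_num < 0)
		return NULL;
	return calloc((size_t)max_cpu_num + 1, sizeof(int));
}
```

On the 4-CPU example, max_cpu_num is 3 and the array gets 4 entries, so index 3 is valid.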
Len Brown [Sat, 23 Apr 2016 03:24:27 +0000 (23:24 -0400)]
tools/power turbostat: switch to tab delimited output
Switch to tab-delimited output from fixed-width columns
to make it simpler to import into spreadsheets.
As the fixed-width columns were 8 spaces wide,
the output on the screen should not change.
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Sat, 23 Apr 2016 00:31:46 +0000 (20:31 -0400)]
tools/power turbostat: Gracefully handle ACPI S3
turbostat gives valid results across suspend to idle, aka freeze,
whether invoked in interval mode, or in command mode.
Indeed, this can be used to measure suspend to idle:
turbostat echo freeze > /sys/power/state
But this does not work across suspend to ACPI S3, because the
processor counters, including the TSC, are reset on resume.
Further, when turbostat detects a problem, it doesn't forgive
the hardware, and interval mode will print *'s from there on out.
Instead, upon detecting counters going backwards, simply
reset and start over.
Interval mode across ACPI S3: (observe TSC going backwards)
root@sharkbay:/home/lenb/turbostat-src# ./turbostat -M 0x10
CPU Avg_MHz Busy% Bzy_MHz TSC_MHz MSR 0x010
- 1 0.06 858 2294 0x0000000000000000
0 0 0.06 847 2294 0x0000002a254b98ac
1 1 0.06 878 2294 0x0000002a254efa3a
2 1 0.07 843 2294 0x0000002a2551df65
3 0 0.05 863 2294 0x0000002a2553fea2
turbostat: re-initialized with num_cpus 4
CPU Avg_MHz Busy% Bzy_MHz TSC_MHz MSR 0x010
- 2 0.20 849 2294 0x0000000000000000
0 2 0.26 856 2294 0x0000000449abb60d
1 2 0.20 844 2294 0x0000000449b087ec
2 2 0.21 850 2294 0x0000000449b35d5d
3 1 0.12 839 2294 0x0000000449b5fd5a
^C
Command mode across ACPI S3:
root@sharkbay:/home/lenb/turbostat-src# ./turbostat -M 0x10 sleep 10
./turbostat: Counter reset detected
14.196299 sec
Signed-off-by: Len Brown <len.brown@intel.com>
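The "counters going backwards" condition described above can be sketched as a simple comparison of two samples of a free-running counter. The function name is hypothetical; this is not the check as written in turbostat.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: a free-running counter such as the TSC should
 * never decrease between two samples. If it does, assume the counter
 * was reset (for example by ACPI S3) and that the caller should
 * re-initialize and start over instead of reporting a bogus delta. */
static bool counter_went_backwards(uint64_t before, uint64_t after)
{
	return after < before;
}
```

Using the CPU 0 values from the interval-mode output above, 0x0000002a254b98ac followed by 0x0000000449abb60d is detected as a reset.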
Len Brown [Thu, 7 Apr 2016 03:56:02 +0000 (23:56 -0400)]
tools/power turbostat: tidy up output on Joule counter overflow
The RAPL Joules counter is limited in capacity.
Turbostat estimates how soon it can roll-over
based on the max TDP of the processor --
which tells us the maximum increment rate.
eg.
RAPL: 2759 sec. Joule Counter Range, at 95 Watts
So if a sample duration is longer than 2759 seconds on this system,
'**' replaces the decimal place in the display to indicate
that the results may be suspect.
But the display had an extra ' ' in this case, throwing off the columns.
Also, the -J "Joules" option appended an extra "time" column
to the display. While this may be useful, it printed the interval time,
which may not be the accurate time per processor. Remove this column,
which appeared only when using '-J',
as we plan to add accurate per-cpu interval times in a future commit.
Signed-off-by: Len Brown <len.brown@intel.com>
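The roll-over estimate described above can be sketched as counter capacity divided by the maximum power draw. Two assumptions here are not stated in the commit message: that the energy counter is 32 bits wide, and that it counts in units of 1/16384 Joule. The function name is hypothetical.

```c
#include <stdint.h>

/* Sketch under stated assumptions: a 32-bit energy counter that
 * increments in steps of energy_unit_joules. At a sustained draw of
 * tdp_watts (Joules per second), the counter can cover at most this
 * many seconds before wrapping. */
static double joule_counter_range_sec(double energy_unit_joules,
				      double tdp_watts)
{
	if (tdp_watts <= 0.0)
		return 0.0;
	return (double)UINT32_MAX * energy_unit_joules / tdp_watts;
}
```

Under those assumptions, joule_counter_range_sec(1.0 / 16384, 95.0) comes to a little over 2759 seconds, which is consistent with the example line in the message.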
Linus Torvalds [Thu, 1 Dec 2016 00:33:41 +0000 (16:33 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"7 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: fix false-positive WARN_ON() in truncate/invalidate for hugetlb
kasan: support use-after-scope detection
kasan: update kasan_global for gcc 7
lib/debugobjects: export for use in modules
zram: fix unbalanced idr management at hot removal
thp: fix corner case of munlock() of PTE-mapped THPs
mm, thp: propagation of conditional compilation in khugepaged.c
Kirill A. Shutemov [Wed, 30 Nov 2016 23:54:19 +0000 (15:54 -0800)]
mm: fix false-positive WARN_ON() in truncate/invalidate for hugetlb
Hugetlb pages have ->index in size of the huge pages (PMD_SIZE or
PUD_SIZE), not in PAGE_SIZE as other types of pages. This means we
cannot use page_to_pgoff() to check whether we've got the right page
for the radix-tree index.
Let's introduce page_to_index(), which returns the radix-tree index for
a given page.
We will be able to get rid of this once hugetlb is switched to
multi-order entries.
Fixes: fc127da085c2 ("truncate: handle file thp")
Link: http://lkml.kernel.org/r/20161123093053.mjbnvn5zwxw5e6lk@black.fi.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Doug Nelson <doug.nelson@intel.com>
Tested-by: Doug Nelson <doug.nelson@intel.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitry Vyukov [Wed, 30 Nov 2016 23:54:16 +0000 (15:54 -0800)]
kasan: support use-after-scope detection
Gcc revision 241896 implements use-after-scope detection. Will be
available in gcc 7. Support it in KASAN.
Gcc emits 2 new callbacks to poison/unpoison large stack objects when
they go in/out of scope. Implement the callbacks and add a test.
[dvyukov@google.com: v3]
Link: http://lkml.kernel.org/r/1479998292-144502-1-git-send-email-dvyukov@google.com
Link: http://lkml.kernel.org/r/1479226045-145148-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitry Vyukov [Wed, 30 Nov 2016 23:54:13 +0000 (15:54 -0800)]
kasan: update kasan_global for gcc 7
kasan_global struct is part of compiler/runtime ABI. gcc revision
241983 has added a new field to kasan_global struct. Update kernel
definition of kasan_global struct to include the new field.
Without this patch KASAN is broken with gcc 7.
Link: http://lkml.kernel.org/r/1479219743-28682-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Wilson [Wed, 30 Nov 2016 23:54:10 +0000 (15:54 -0800)]
lib/debugobjects: export for use in modules
Drivers, or other modules, that use a mixture of objects (especially
objects embedded within other objects) would like to take advantage of
the debugobjects facilities to help catch misuse. Currently, the
debugobjects interface is only available to builtin drivers and requires
a set of EXPORT_SYMBOL_GPL for use by modules.
I am using the debugobjects in i915.ko to try and catch some invalid
operations on embedded objects. The problem currently only presents
itself across module unload so forcing i915 to be builtin is not an
option.
Link: http://lkml.kernel.org/r/20161122143039.6433-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Du, Changbin" <changbin.du@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Wed, 30 Nov 2016 23:54:08 +0000 (15:54 -0800)]
zram: fix unbalanced idr management at hot removal
The zram hot removal code calls idr_remove() even when zram_remove()
returns an error (typically -EBUSY). This results in a leftover at the
device release, eventually leading to a crash when the module is
reloaded.
As described in the bug report below, the following procedure would
cause an Oops with zram:
- provision three zram devices via modprobe zram num_devices=3
- configure a size for each device
+ echo "1G" > /sys/block/$zram_name/disksize
- mkfs and mount zram0 only
- attempt to hot remove all three devices
+ echo 2 > /sys/class/zram-control/hot_remove
+ echo 1 > /sys/class/zram-control/hot_remove
+ echo 0 > /sys/class/zram-control/hot_remove
- zram0 removal fails with EBUSY, as expected
- unmount zram0
- try zram0 hot remove again
+ echo 0 > /sys/class/zram-control/hot_remove
- fails with ENODEV (unexpected)
- unload zram kernel module
+ completes successfully
- zram0 device node still exists
- attempt to mount /dev/zram0
+ mount command is killed
+ following BUG is encountered
BUG: unable to handle kernel paging request at ffffffffa0002ba0
IP: get_disk+0x16/0x50
Oops: 0000 [#1] SMP
CPU: 0 PID: 252 Comm: mount Not tainted 4.9.0-rc6 #176
Call Trace:
exact_lock+0xc/0x20
kobj_lookup+0xdc/0x160
get_gendisk+0x2f/0x110
__blkdev_get+0x10c/0x3c0
blkdev_get+0x19d/0x2e0
blkdev_open+0x56/0x70
do_dentry_open.isra.19+0x1ff/0x310
vfs_open+0x43/0x60
path_openat+0x2c9/0xf30
do_filp_open+0x79/0xd0
do_sys_open+0x114/0x1e0
SyS_open+0x19/0x20
entry_SYSCALL_64_fastpath+0x13/0x94
This patch adds the proper error check in hot_remove_store() not to call
idr_remove() unconditionally.
Fixes: 17ec4cd98578 ("zram: don't call idr_remove() from zram_remove()")
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1010970
Link: http://lkml.kernel.org/r/20161121132140.12683-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reported-by: David Disseldorp <ddiss@suse.de>
Tested-by: David Disseldorp <ddiss@suse.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
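The error check described in the zram fix above can be sketched in user space as follows. Every name here (struct fake_dev, fake_remove(), fake_hot_remove()) is a hypothetical stand-in for the zram device, zram_remove(), and hot_remove_store(); the boolean models the idr entry.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the zram device and its idr slot. */
struct fake_dev {
	bool busy;		/* e.g. the device is still mounted */
	bool id_allocated;	/* models the idr entry */
};

/* Models a removal that can fail while the device is in use. */
static int fake_remove(struct fake_dev *dev)
{
	return dev->busy ? -EBUSY : 0;
}

/* Release the id only when removal succeeded, so a failed attempt
 * leaves the device findable for a later retry. */
static int fake_hot_remove(struct fake_dev *dev)
{
	int ret = fake_remove(dev);

	if (ret == 0)
		dev->id_allocated = false;
	return ret;
}
```

In this model a first attempt on a busy device returns -EBUSY and keeps the id, and a retry after the device is released succeeds, which is the behavior the reproduction steps expect.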
Kirill A. Shutemov [Wed, 30 Nov 2016 23:54:05 +0000 (15:54 -0800)]
thp: fix corner case of munlock() of PTE-mapped THPs
The following program triggers BUG() in munlock_vma_pages_range():
// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <sys/mman.h>
int main()
{
	mmap((void*)0x20105000ul, 0xc00000ul, 0x2ul, 0x2172ul, -1, 0);
	mremap((void*)0x201fd000ul, 0x4000ul, 0xc00000ul, 0x3ul, 0x203f0000ul);
	return 0;
}
The test-case constructs the situation where munlock_vma_pages_range()
finds a PTE-mapped THP head in the middle of a page table and, by mistake,
skips HPAGE_PMD_NR pages after that.
As a result, on the next iteration it hits the middle of a PMD-mapped THP
and gets upset seeing an mlocked tail page.
The solution is to skip HPAGE_PMD_NR pages only if the THP was mlocked
during munlock_vma_page(). This guarantees that the page is
PMD-mapped, as we never mlock PTE-mapped THPs.
Fixes: e90309c9f772 ("thp: allow mlocked THP again")
Link: http://lkml.kernel.org/r/20161115132703.7s7rrgmwttegcdh4@black.fi.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jérémy Lefaure [Wed, 30 Nov 2016 23:54:02 +0000 (15:54 -0800)]
mm, thp: propagation of conditional compilation in khugepaged.c
Commit b46e756f5e47 ("thp: extract khugepaged from mm/huge_memory.c")
moved code from huge_memory.c to khugepaged.c. Some of this code should
be compiled only when CONFIG_SYSFS is enabled but the condition around
this code was not moved into khugepaged.c.
The result is a compilation error when CONFIG_SYSFS is disabled:
mm/built-in.o: In function `khugepaged_defrag_store':
khugepaged.c:(.text+0x2d095): undefined reference to `single_hugepage_flag_store'
mm/built-in.o: In function `khugepaged_defrag_show':
khugepaged.c:(.text+0x2d0ab): undefined reference to `single_hugepage_flag_show'
This commit adds the #ifdef CONFIG_SYSFS around the code related to
sysfs.
Link: http://lkml.kernel.org/r/20161114203448.24197-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 30 Nov 2016 23:15:49 +0000 (15:15 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Two small fixes for MIPI PLLs on sunxi devices and a build fix for a
Broadcom clk driver having unmet dependencies"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: bcm: Fix unmet Kconfig dependencies for CLK_BCM_63XX
clk: sunxi-ng: enable so-said LDOs for A33 SoC's pll-mipi clock
clk: sunxi-ng: sun6i-a31: Enable PLL-MIPI LDOs when ungating it
Linus Torvalds [Wed, 30 Nov 2016 19:53:50 +0000 (11:53 -0800)]
Merge tag 'pwm/for-4.9' of git://git./linux/kernel/git/thierry.reding/linux-pwm
Pull pwm fixes from Thierry Reding:
"This contains two one-line fixes for issues that were introduced in
v4.9-rc1"
* tag 'pwm/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: Fix device reference leak
pwm: meson: Add missing spin_lock_init()
Mike Rapoport [Wed, 30 Nov 2016 07:52:01 +0000 (09:52 +0200)]
isofs: add KERN_CONT to printing of ER records
The ER records are printed without explicit log level presuming line
continuation until "\n". After the commit 4bcc595ccd8 (printk:
reinstate KERN_CONT for printing continuation lines), the ER records are
printed a character per line.
Adding KERN_CONT to appropriate printk statements restores the printout
behavior.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 30 Nov 2016 02:30:50 +0000 (18:30 -0800)]
Merge tag 'arc-4.9-final' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- fix PAE40 crash [Yuriy]
- disable IO-Coherency by default
- use a different inline asm constraint for Zero Overhead loops
* tag 'arc-4.9-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: mm: PAE40: Fix crash at munmap
ARC: mm: IOC: Don't enable IOC by default
ARC: Don't use "+l" inline asm constraint
Linus Torvalds [Tue, 29 Nov 2016 23:20:14 +0000 (15:20 -0800)]
Re-enable CONFIG_MODVERSIONS in a slightly weaker form
This enables CONFIG_MODVERSIONS again, but allows for missing symbol CRC
information in order to work around the issue that newer binutils
versions seem to occasionally drop the CRC on the floor. binutils 2.26
seems to work fine, while binutils 2.27 seems to break MODVERSIONS of
symbols that have been defined in assembler files.
[ We've had random missing CRC's before - it may be an old problem that
just is now reliably triggered with the weak asm symbols and a new
version of binutils ]
Some day I really do want to remove MODVERSIONS entirely. Sadly, today
does not appear to be that day: Debian people apparently do want the
option to enable MODVERSIONS to make it easier to have external modules
across kernel versions, and this seems to be a fairly minimal fix for
the annoying problem.
Cc: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 29 Nov 2016 19:33:28 +0000 (11:33 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"A few misc important cifs fixes, including a fix for a 4.9 regression
in posix_acl xattr handling"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: iterate over posix acl xattr entry correctly in ACL_to_cifs_posix()
Call echo service immediately after socket reconnect
CIFS: Fix BUG() in calc_seckey()
Linus Torvalds [Tue, 29 Nov 2016 19:15:37 +0000 (11:15 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four small fixes.
The be2iscsi is a potential device overrun in consistent memory, which
could have nasty consequences if the consistent allocations are
packed.
The hpsa one fixes a regression where older controllers can now get a
numbering clash between the first internal disk and the controller.
The libfc one is a regression in timespec conversions which causes a
user visible issue in a command line tool and the mpt3sas one fixes a
regression where the controller could remain permanently blocked after
an ATA pass through command followed by a reset"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: be2iscsi: allocate enough memory in beiscsi_boot_get_sinfo()
scsi: mpt3sas: Unblock device after controller reset
scsi: hpsa: use bus '3' for legacy HBA devices
scsi: libfc: fix seconds_since_last_reset miscalculation
Yuriy Kolerov [Mon, 28 Nov 2016 04:07:17 +0000 (07:07 +0300)]
ARC: mm: PAE40: Fix crash at munmap
Commit 1c3c90930392 broke PAE40. Macro pfn_pte(pfn, prot) creates paddr
from pfn, but the page shift was getting truncated to 32 bits since we lost
the proper cast to 64 bits (for PAE40).
Instead of reverting that commit, use a better helper that is 32/64-bit
safe, just like the ARM implementation.
Fixes: 1c3c90930392 ("ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS")
Cc: <stable@vger.kernel.org> #4.4+
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
[vgupta: massaged changelog]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
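The truncation described in the PAE40 fix above can be reproduced in plain C. This is a sketch of the general pitfall, not the ARC pfn_pte() macro: the page shift value and both function names are assumptions chosen for illustration.

```c
#include <stdint.h>

#define FAKE_PAGE_SHIFT 13	/* assumed page shift, for illustration only */

/* Buggy form: the shift is evaluated in 32 bits, so any bits that
 * move past bit 31 are lost before the result is widened. */
static uint64_t pfn_to_paddr_truncated(uint32_t pfn)
{
	return pfn << FAKE_PAGE_SHIFT;
}

/* Fixed form: widen to 64 bits first, then shift. */
static uint64_t pfn_to_paddr(uint32_t pfn)
{
	return (uint64_t)pfn << FAKE_PAGE_SHIFT;
}
```

For a page frame number of 0x80000, the truncated form yields 0 while the widened form yields 0x100000000, an address above 4 GiB of the kind a 40-bit physical address extension has to represent.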
Aaron Lu [Tue, 29 Nov 2016 05:27:31 +0000 (13:27 +0800)]
mremap: move_ptes: check pte dirty after its removal
Linus found there still is a race in mremap after commit 5d1904204c99
("mremap: fix race between mremap() and page cleanning").
As described by Linus:
"the issue is that another thread might make the pte be dirty (in the
hardware walker, so no locking of ours will make any difference)
*after* we checked whether it was dirty, but *before* we removed it
from the page tables"
Fix it by moving the check after we removed it from the page table.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
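The ordering issue in the mremap fix above can be modeled in user space with C11 atomics. This is only an analogy: the 32-bit entry, the dirty-bit value, and the function name are hypothetical, and the kernel uses its own page-table helpers rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_PTE_DIRTY 0x1u	/* hypothetical dirty bit */

/* Hypothetical model of an entry that another agent (the hardware
 * walker in the real case) may modify concurrently. Atomically clear
 * the entry first, then test the dirty bit on the value that was
 * actually removed, so a late dirtying cannot be missed. */
static bool remove_and_check_dirty(_Atomic uint32_t *pte, uint32_t *old)
{
	*old = atomic_exchange(pte, 0);
	return (*old & FAKE_PTE_DIRTY) != 0;
}
```

Checking the bit before the exchange would leave a window in which the entry could become dirty unnoticed; checking the exchanged value closes that window.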
Johan Hovold [Tue, 1 Nov 2016 10:46:39 +0000 (11:46 +0100)]
pwm: Fix device reference leak
Make sure to drop the reference to the parent device taken by
class_find_device() after "unexporting" any children when deregistering
a PWM chip.
Fixes: 0733424c9ba9 ("pwm: Unexport children before chip removal")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Eryu Guan [Mon, 24 Oct 2016 12:46:40 +0000 (20:46 +0800)]
CIFS: iterate over posix acl xattr entry correctly in ACL_to_cifs_posix()
Commit 2211d5ba5c6c ("posix_acl: xattr representation cleanups")
removes the typedefs and the zero-length a_entries array in struct
posix_acl_xattr_header, and uses bare struct posix_acl_xattr_header
and struct posix_acl_xattr_entry directly.
But it failed to iterate over posix acl slots when converting posix
acls to CIFS format, which results in several test failures in
xfstests (generic/053 generic/105) when testing against a samba v1
server, starting from v4.9-rc1 kernel. e.g.
[root@localhost xfstests]# diff -u tests/generic/105.out /root/xfstests/results//generic/105.out.bad
--- tests/generic/105.out 2016-09-19 16:33:28.577962575 +0800
+++ /root/xfstests/results//generic/105.out.bad 2016-10-22 15:41:15.201931110 +0800
@@ -1,3 +1,4 @@
QA output created by 105
-rw-r--r-- root
+setfacl: subdir: Invalid argument
-rw-r--r-- root
Fix it by introducing a new "ace" var, like what
cifs_copy_posix_acl() does, and iterating posix acl xattr entries
over it in the for loop.
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Sachin Prabhu [Thu, 20 Oct 2016 23:52:24 +0000 (19:52 -0400)]
Call echo service immediately after socket reconnect
Commit 4fcd1813e640 ("Fix reconnect to not defer smb3 session reconnect
long after socket reconnect") changes the behaviour of the SMB2 echo
service and causes it to renegotiate after a socket reconnect. However
under default settings, the echo service could take up to 120 seconds to
be scheduled.
The patch forces the echo service to be called immediately, resulting in a
negotiate call being made immediately on reconnect.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Sachin Prabhu [Mon, 17 Oct 2016 20:40:22 +0000 (16:40 -0400)]
CIFS: Fix BUG() in calc_seckey()
Andy Lutomirski's new virtually mapped kernel stack allocations move
kernel stacks to the vmalloc area. This triggers the bug
kernel BUG at ./include/linux/scatterlist.h:140!
at calc_seckey()->sg_init()
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Linus Torvalds [Mon, 28 Nov 2016 22:17:10 +0000 (14:17 -0800)]
Merge branch 'for-4.9-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"The recent changes in ahci MSI handling need one more fix. Hopefully,
this restores parity with before.
The other two are minor fixes with both low impact and risk"
* 'for-4.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: always fall back to single-MSI mode
libata-scsi: Fixup ata_gen_passthru_sense()
mvsas: fix error return code in mvs_task_prep()
Linus Torvalds [Mon, 28 Nov 2016 21:53:19 +0000 (13:53 -0800)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"Two ugly build warning fixes"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
dbri: Fix compiler warning
qlogicpti: Fix compiler warnings
Tushar Dave [Thu, 24 Nov 2016 20:35:16 +0000 (12:35 -0800)]
dbri: Fix compiler warning
dbri uses 'u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enabled 64bit
DMA and therefore dma_addr_t became of type u64. This makes
'incompatible pointer type' warnings inevitable.
e.g.
sound/sparc/dbri.c: In function ‘snd_dbri_create’:
sound/sparc/dbri.c:2538: warning: passing argument 3 of ‘dma_zalloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:608: note: expected ‘dma_addr_t *’ but argument is of type ‘u32 *’
For the record, dbri(sbus) driver never executes on sun4v. Therefore
even though 64bit DMA is enabled on SPARC, dbri continues to use
legacy iommu that guarantees DMA address is always in 32bit range.
This patch resolves above compiler warning.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: thomas tai <thomas.tai@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
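The warning above arises when a pointer to a 32-bit variable is passed where a pointer to a 64-bit handle type is expected. One way to keep the types consistent is to declare the handle with the DMA address type itself, sketched here in user space. The typedef, struct, and function names are hypothetical stand-ins and do not reproduce the driver or the kernel DMA API.

```c
#include <stdint.h>

typedef uint64_t fake_dma_addr_t;	/* stands in for a 64-bit dma_addr_t */

/* Hypothetical allocator that reports the bus address through a
 * pointer, in the way the kernel's coherent allocators do. */
static int fake_dma_alloc(fake_dma_addr_t *handle)
{
	*handle = 0x1000;
	return 0;
}

/* The handle field uses the DMA address type, so &dev->dvma has
 * exactly the pointer type the allocator expects. */
struct fake_dev_state {
	fake_dma_addr_t dvma;
};

static int fake_dev_setup(struct fake_dev_state *dev)
{
	return fake_dma_alloc(&dev->dvma);
}
```

Had dvma been declared as a 32-bit integer, the call in fake_dev_setup() would draw the same class of incompatible pointer type diagnostic quoted in the message.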
Tushar Dave [Thu, 24 Nov 2016 02:28:04 +0000 (18:28 -0800)]
qlogicpti: Fix compiler warnings
qlogicpti uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enabled 64bit
DMA and therefore dma_addr_t became of type u64. This makes
'incompatible pointer type' warnings inevitable.
e.g.
drivers/scsi/qlogicpti.c: In function ‘qpti_map_queues’:
drivers/scsi/qlogicpti.c:813: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
drivers/scsi/qlogicpti.c:822: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
For the record, qlogicpti never executes on sun4v. Therefore even
though 64bit DMA is enabled on SPARC, qlogicpti continues to use
legacy iommu that guarantees DMA address is always in 32bit range.
This patch resolves aforementioned compiler warnings.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: thomas tai <thomas.tai@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vineet Gupta [Mon, 28 Nov 2016 17:18:21 +0000 (09:18 -0800)]
ARC: mm: IOC: Don't enable IOC by default
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Vineet Gupta [Thu, 24 Nov 2016 01:43:17 +0000 (17:43 -0800)]
ARC: Don't use "+l" inline asm constraint
Apparently this gets in the way of a gcc fix which inhibits the usage
of LP_COUNT as a gpr.
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Linus Torvalds [Sun, 27 Nov 2016 21:08:04 +0000 (13:08 -0800)]
Linux 4.9-rc7
Linus Torvalds [Sun, 27 Nov 2016 16:24:46 +0000 (08:24 -0800)]
Merge git://git.infradead.org/intel-iommu
Pull IOMMU fixes from David Woodhouse:
"Two minor fixes.
The first fixes the assignment of SR-IOV virtual functions to the
correct IOMMU unit, and the second fixes the excessively large (and
physically contiguous) PASID tables used with SVM"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: Fix PASID table allocation
iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
Linus Torvalds [Sun, 27 Nov 2016 16:22:59 +0000 (08:22 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.9:
- Fix unreadable output in __do_page_fault due to the KERN_CONT
patchset
- Correctly handle MIPS R6 fixes to the c0_wired register"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: mm: Fix output of __do_page_fault
MIPS: Mask out limit field when calculating wired entry count
Linus Torvalds [Sun, 27 Nov 2016 01:21:13 +0000 (17:21 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs splice fix from Al Viro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix default_file_splice_read()
Al Viro [Sun, 27 Nov 2016 01:05:42 +0000 (20:05 -0500)]
fix default_file_splice_read()
Botched calculation of the number of pages. As a result,
we were dropping pieces when doing splice to pipe from
e.g. 9p.
Reported-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sat, 26 Nov 2016 23:28:34 +0000 (15:28 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Here is a revert and two bugfixes for the I2C designware driver.
Please note that we are still hunting down a regression for the
i2c-octeon driver. While there is a fix pending, we have unclear
feedback from the testers currently. An rc8 would be quite helpful
for this case"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Revert "i2c: designware: do not disable adapter after transfer"
i2c: designware: fix rx fifo depth tracking
i2c: designware: report short transfers
Linus Torvalds [Sat, 26 Nov 2016 23:26:20 +0000 (15:26 -0800)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fix from Russell King:
"This resolves the ksyms issues by reverting the commit which
introduced the breakage"
There was what I consider to be a better fix, but it's late in the rc
game, so I'll take the revert.
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
Revert "arm: move exports to definitions"
Linus Torvalds [Sat, 26 Nov 2016 21:05:05 +0000 (13:05 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix leak in fsl/fman driver, from Dan Carpenter.
2) Call flow dissector initcall earlier than any networking driver can
register and start to use it, from Eric Dumazet.
3) Some dup header fixes from Geliang Tang.
4) TIPC link monitoring compat fix from Jon Paul Maloy.
5) Link changes require EEE re-negotiation in bcm_sf2 driver, from
Florian Fainelli.
6) Fix bogus handle ID passed into tfilter_notify_chain(), from Roman
Mashak.
7) Fix dump size calculation in rtnl_calcit(), from Zhang Shengju.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
tipc: resolve connection flow control compatibility problem
mvpp2: use correct size for memset
net/mlx5: drop duplicate header delay.h
net: ieee802154: drop duplicate header delay.h
ibmvnic: drop duplicate header seq_file.h
fsl/fman: fix a leak in tgec_free()
net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS
tipc: improve sanity check for received domain records
tipc: fix compatibility bug in link monitoring
net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented
dwc_eth_qos: drop duplicate headers
net sched filters: fix filter handle ID in tfilter_notify_chain()
net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change
bnxt: do not busy-poll when link is down
udplite: call proper backlog handlers
ipv6: bump genid when the IFA_F_TENTATIVE flag is clear
net/mlx4_en: Free netdev resources under state lock
net: revert "net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit"
rtnetlink: fix the wrong minimal dump size getting from rtnl_calcit()
bnxt_en: Fix a VXLAN vs GENEVE issue
...
Linus Torvalds [Sat, 26 Nov 2016 20:24:47 +0000 (12:24 -0800)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
- Fix a crash that occurs at driver initialization if the memory region
is already busy (request_mem_region() fails).
- Fix a vma validation check that mistakenly allows a private device-
dax mapping to be established. Device-dax explicitly forbids private
mappings so it can guarantee a given fault granularity and backing
memory type.
Both of these fixes have soaked in -next and are tagged for -stable.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
device-dax: fail all private mapping attempts
device-dax: check devm_nsio_enable() return value
Linus Torvalds [Sat, 26 Nov 2016 20:18:59 +0000 (12:18 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"Four fixes for bugs found by syzkaller on x86, all for stable"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: check for pic and ioapic presence before use
KVM: x86: fix out-of-bounds accesses of rtc_eoi map
KVM: x86: drop error recovery in em_jmp_far and em_ret_far
KVM: x86: fix out-of-bounds access in lapic
Linus Torvalds [Sat, 26 Nov 2016 19:24:03 +0000 (11:24 -0800)]
Merge tag 'powerpc-4.9-6' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fixes marked for stable:
- Set missing wakeup bit in LPCR on POWER9
- Fix the early OPAL console wrappers
- Fixup kernel read only mapping
Fixes for code merged this cycle:
- Fix missing CRCs, add more asm-prototypes.h declarations"
* tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fixup kernel read only mapping
powerpc/boot: Fix the early OPAL console wrappers
powerpc: Fix missing CRCs, add more asm-prototypes.h declarations
powerpc: Set missing wakeup bit in LPCR on POWER9
Jon Paul Maloy [Thu, 24 Nov 2016 23:47:07 +0000 (18:47 -0500)]
tipc: resolve connection flow control compatibility problem
In commit 10724cc7bb78 ("tipc: redesign connection-level flow control")
we replaced the previous message based flow control with one based on
1k blocks. In order to ensure backwards compatibility the mechanism
falls back to using message as base unit when it senses that the peer
doesn't support the new algorithm. The default flow control window,
i.e., how many units can be sent before the sender blocks and waits
for an acknowledge (aka advertisement) is 512. This was tested against
the previous version, which uses an acknowledge frequency of one ack per
256 received messages, and found to work fine.
However, we missed the fact that versions older than Linux 3.15 use an
acknowledge frequency of 512, which is exactly the limit where a 4.6+
sender will stop and wait for acknowledge. This would also work fine if
it weren't for the fact that if the first sent message on a 4.6+ server
side is an empty SYNACK, this one is also counted as a sent message,
while it is not counted as a received message on a legacy 3.15-receiver.
This leads to the sender always being one step ahead of the receiver, a
scenario causing the sender to block after 512 sent messages, while the
receiver only has registered 511 read messages. Hence, the legacy
receiver is not triggered to send an acknowledge, with a permanently
blocked sender as result.
We solve this deadlock by simply allowing the sender to send one more
message before it blocks, i.e., by making a minimal change to the
condition used for determining connection congestion.
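The off-by-one described above can be reproduced with a small user-space model. This is only a sketch, not the TIPC code: the names SND_WINDOW, ACK_FREQ, congested_old(), congested_new() and sender_deadlocks() are made up for illustration, and the exact form of the congestion test (">=" before, ">" after) is an assumption drawn from the description "allowing the sender to send one more message before it blocks".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SND_WINDOW 512 /* sender's default flow control window, in messages */
#define ACK_FREQ   512 /* legacy receiver acknowledges once per 512 received messages */

/* Assumed congestion test before the fix: block as soon as the window is reached. */
static bool congested_old(unsigned int unacked)
{
	return unacked >= SND_WINDOW;
}

/* Assumed congestion test after the fix: allow one more message before blocking. */
static bool congested_new(unsigned int unacked)
{
	return unacked > SND_WINDOW;
}

/*
 * Try to send `total` data messages to a legacy receiver.
 * Returns true if the sender blocks while the receiver has not yet seen
 * enough messages to acknowledge, i.e. the connection is stuck for good.
 */
static bool sender_deadlocks(bool (*congested)(unsigned int), unsigned int total)
{
	unsigned int unacked = 1;  /* the empty SYNACK is counted as sent ... */
	unsigned int received = 0; /* ... but not as received by the legacy peer */
	unsigned int sent;

	assert(congested != NULL);

	for (sent = 0; sent < total; sent++) {
		if (congested(unacked))
			return true;
		unacked++;
		received++;
		if (received == ACK_FREQ) {
			/* One acknowledge covers ACK_FREQ messages. */
			unacked -= ACK_FREQ;
			received = 0;
		}
	}
	return false;
}
```

In this model the old test blocks with 512 messages counted as sent and only 511 counted as received, while the new test lets the 512th data message through, which triggers the acknowledge.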
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Thu, 24 Nov 2016 16:28:12 +0000 (17:28 +0100)]
mvpp2: use correct size for memset
gcc-7 detects a short memset in mvpp2, introduced in the original
merge of the driver:
drivers/net/ethernet/marvell/mvpp2.c: In function 'mvpp2_cls_init':
drivers/net/ethernet/marvell/mvpp2.c:3296:2: error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size]
The result seems to be that we write uninitialized data into the
flow table registers, although we did not get any warning about
that uninitialized data usage.
Using sizeof() lets us initialize the entire array instead.
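The class of bug gcc-7 flags here can be shown in isolation. The following is a hedged user-space sketch, not the mvpp2 code: struct flow_entry, FLOW_DATA_WORDS and the helper names are hypothetical, and the 0xff fill merely stands in for uninitialized stack contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLOW_DATA_WORDS 3 /* hypothetical number of 32-bit words per entry */

struct flow_entry {
	uint32_t index;
	uint32_t data[FLOW_DATA_WORDS];
};

/*
 * Fill the entry with a 0xff pattern (standing in for stale stack contents),
 * clear `len` bytes of the data array, and count the bytes left uncleared.
 */
static size_t stale_bytes_after_clear(size_t len)
{
	struct flow_entry fe;
	const unsigned char *p = (const unsigned char *)fe.data;
	size_t i, stale = 0;

	assert(len <= sizeof(fe.data));

	memset(&fe, 0xff, sizeof(fe));
	memset(fe.data, 0, len);

	for (i = 0; i < sizeof(fe.data); i++)
		if (p[i] != 0)
			stale++;
	return stale;
}

/* Short memset: the length is the element count, so only 3 of 12 bytes are cleared. */
static size_t stale_with_element_count(void)
{
	return stale_bytes_after_clear(FLOW_DATA_WORDS);
}

/* Correct memset: sizeof() covers the whole array. */
static size_t stale_with_sizeof(void)
{
	return stale_bytes_after_clear(sizeof(((struct flow_entry *)0)->data));
}
```

With three 32-bit words, the element-count length leaves 9 of the 12 bytes untouched, whereas the sizeof() length leaves none.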
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Thu, 24 Nov 2016 13:58:33 +0000 (21:58 +0800)]
net/mlx5: drop duplicate header delay.h
Drop duplicate header delay.h from mlx5/core/main.c.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Matan Barak <matanb@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Thu, 24 Nov 2016 13:58:32 +0000 (21:58 +0800)]
net: ieee802154: drop duplicate header delay.h
Drop duplicate header delay.h from adf7242.c.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Thu, 24 Nov 2016 13:58:29 +0000 (21:58 +0800)]
ibmvnic: drop duplicate header seq_file.h
Drop duplicate header seq_file.h from ibmvnic.c.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 24 Nov 2016 11:20:43 +0000 (14:20 +0300)]
fsl/fman: fix a leak in tgec_free()
We set "tgec->cfg" to NULL before passing it to kfree(). There is no
need to set it to NULL at all. Let's just delete it.
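Why clearing the pointer first leaks the allocation can be seen in a tiny user-space model. This is a sketch under stated assumptions: mock_alloc() and mock_free() are made-up stand-ins for the kernel allocator that only count live objects, and struct mac is a hypothetical reduction of the driver structure.

```c
#include <assert.h>
#include <stddef.h>

/* Mock allocator standing in for kzalloc()/kfree(); it only counts live objects. */
static int live_objects;
static char backing_store[64];

static void *mock_alloc(void)
{
	live_objects++;
	return backing_store;
}

static void mock_free(void *p)
{
	if (p == NULL)
		return; /* like kfree(NULL): a silent no-op */
	live_objects--;
}

struct mac {
	void *cfg;
};

/* Buggy teardown: the pointer is cleared before it is handed to the free call. */
static void mac_free_buggy(struct mac *m)
{
	m->cfg = NULL;
	mock_free(m->cfg); /* receives NULL, so nothing is released */
}

/* Fixed teardown: drop the assignment and just free the object. */
static void mac_free_fixed(struct mac *m)
{
	mock_free(m->cfg);
}

/* Allocate a config, run the given teardown, and report what is still live. */
static int leaked_objects(void (*teardown)(struct mac *))
{
	struct mac m;

	assert(teardown != NULL);

	live_objects = 0;
	m.cfg = mock_alloc();
	teardown(&m);
	return live_objects;
}
```

The buggy variant leaves one object live after teardown; the fixed variant leaves none.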
Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miroslav Lichvar [Thu, 24 Nov 2016 09:55:06 +0000 (10:55 +0100)]
net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS
The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET
command and likewise it shouldn't require the CAP_NET_ADMIN capability.
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Thu, 24 Nov 2016 04:46:09 +0000 (23:46 -0500)]
tipc: improve sanity check for received domain records
In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we
added a data area to the link monitor STATE messages under the
assumption that previous versions did not use any such data area.
For versions older than Linux 4.3 this assumption is not correct. In
those versions, all STATE messages sent out from a node inadvertently
contain a 16 byte data area containing a string; a leftover from
previous RESET messages which were using this during the setup phase.
This string serves no purpose in STATE messages, and should not be there.
Unfortunately, this data area is delivered to the link monitor
framework, where a sanity check catches that it is not a correct domain
record, and drops it. It also issues a rate limited warning about the
event.
Since such events occur much more frequently than anticipated, we now
choose to remove the warning in order to not fill the kernel log with
useless contents. We also make the sanity check stricter, to further
reduce the risk that such data is inadvertently admitted.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Thu, 24 Nov 2016 02:05:26 +0000 (21:05 -0500)]
tipc: fix compatibility bug in link monitoring
Commit 817298102b0b ("tipc: fix link priority propagation") introduced a
compatibility problem between TIPC versions newer than Linux 4.6 and
those older than Linux 4.4. In versions later than 4.4, link STATE
messages only contain a non-zero link priority value when the sender
wants the receiver to change its priority. This has the effect that the
receiver resets itself in order to apply the new priority. This works
well, and is consistent with the said commit.
However, in versions older than 4.4 a valid link priority is present in
all sent link STATE messages, leading to cyclic link establishment and
reset on the 4.6+ node.
We fix this by adding a test that the received value should not only
be valid, but also differ from the current value in order to cause the
receiving link endpoint to reset.
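The changed condition can be sketched as a pair of predicates. This is an illustration only, not the TIPC implementation: the function names and MAX_LINK_PRIO are hypothetical, and treating "valid" as "non-zero and within an upper bound" is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LINK_PRIO 31 /* hypothetical upper bound for a valid priority */

/* Assumption: zero means "no change requested", anything in 1..MAX is valid. */
static bool prio_is_valid(unsigned int prio)
{
	return prio >= 1 && prio <= MAX_LINK_PRIO;
}

/* Before the fix: any valid received priority makes the endpoint reset. */
static bool resets_link_old(unsigned int rcv_prio, unsigned int cur_prio)
{
	(void)cur_prio; /* the current value was not taken into account */
	return prio_is_valid(rcv_prio);
}

/* After the fix: the received priority must be valid and differ from the current one. */
static bool resets_link_new(unsigned int rcv_prio, unsigned int cur_prio)
{
	assert(prio_is_valid(cur_prio));
	return prio_is_valid(rcv_prio) && rcv_prio != cur_prio;
}
```

An old peer that repeats the unchanged priority in every STATE message keeps triggering resets under the old predicate, but not under the new one.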
Reported-by: Amar Nv <amar.nv005@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Wed, 23 Nov 2016 23:08:13 +0000 (00:08 +0100)]
net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented
The mvneta driver advertises it supports IFF_UNICAST_FLT. However, it
actually does not. The hardware probably does support it, but there is
no code to configure the filter. As a quick and simple fix, remove the
flag. This will cause the core to fall back to promiscuous mode.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: b50b72de2f2f ("net: mvneta: enable features before registering the driver")
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 26 Nov 2016 00:47:15 +0000 (16:47 -0800)]
Merge branch 'parisc-4.9-4' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"On parisc we were still seeing occasional random segmentation faults
and memory corruption on SMP machines. Dave Anglin then looked again
at the TLB related code and found two issues in the PCI DMA and
generic TLB flush functions.
Then, in our startup code we had some timing of the cache and TLB
functions to calculate a threshold when to use a complete TLB/cache
flush or just to flush a specific range. This code produced a race
with newly started CPUs and thus led to occasional kernel crashes
(due to stale TLB/cache entries). The patch by Dave fixes this issue
by flushing the local caches before starting secondary CPUs and by
removing the race.
The last problem fixed by this series is that we quite often suffered
from hung tasks and self-detected stalls on the CPUs. It was somehow
clear that this was related to the (in v4.7) newly introduced cr16
clocksource and our own implementation of sched_clock(). I replaced
the open-coded sched_clock() function and switched to the generic
sched_clock() implementation which seems to have fixed this issue as
well.
All patches have been successfully tested on a variety of machines,
including our debian buildd servers.
All patches (besides the small pr_cont fix) are tagged for stable
releases"
* 'parisc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Also flush data TLB in flush_icache_page_asm
parisc: Fix race in pci-dma.c
parisc: Switch to generic sched_clock implementation
parisc: Fix races in parisc_setup_cache_timing()
parisc: Fix printk continuations in system detection
Linus Torvalds [Fri, 25 Nov 2016 23:53:45 +0000 (15:53 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security
Pull keys fixes from James Morris:
"From David:
- Fix mpi_powm()'s handling of a number with a zero exponent
[CVE-2016-8650].
Integrate my and Andrey's patches for mpi_powm() and use
mpi_resize() instead of RESIZE_IF_NEEDED() - the latter adds a
duplicate check into the execution path of a trivial case we
don't normally expect to be taken.
- Fix double free in X.509 error handling"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
mpi: Fix NULL ptr dereference in mpi_powm() [ver #3]
X.509: Fix double free in x509_cert_parse() [ver #3]
Linus Torvalds [Fri, 25 Nov 2016 23:44:47 +0000 (15:44 -0800)]
Fix subtle CONFIG_MODVERSIONS problems
CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series,
and quite frankly, nobody has cared very deeply. We absolutely know how
to fix it, and it's not _complicated_, but it's not exactly pretty
either.
This oneliner fixes it without the ugliness, and allows for further
future cleanups.
"We've secretly replaced their regular MODVERSIONS with nothing at
all, let's see if they notice"
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 25 Nov 2016 23:16:51 +0000 (15:16 -0800)]
Merge tag 'acpi-4.9-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"Two ACPI fixes for 4.9-rc7.
One of them reverts a recent ACPI commit that attempted to improve
reboot/power-off on some systems, but introduced problems elsewhere,
and the other one fixes kernel builds with the new WDAT watchdog
driver enabled in some configurations.
Specifics:
- Revert the recent commit that caused the ACPI _PTS method to be
executed in the power-off/reboot code path (as per the
specification) in an attempt to improve things on some systems
(apparently expecting _PTS to be executed in that code path), but
broke power-off/reboot on at least one other machine (Rafael
Wysocki).
- Fix kernel builds with the new WDAT watchdog driver enabled in some
configurations by explicitly selecting WATCHDOG_CORE when enabling
the WDAT watchdog driver (Mika Westerberg)"
* tag 'acpi-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
watchdog: wdat_wdt: Select WATCHDOG_CORE
Revert "ACPI: Execute _PTS before system reboot"
Rafael J. Wysocki [Thu, 24 Nov 2016 23:13:56 +0000 (00:13 +0100)]
MAINTAINERS: Add bug tracking system location entry type
Following the kernel Bugzilla discussion during the Kernel Summit
(https://lwn.net/Articles/705245/), add bug tracking system location
entry type (B) to MAINTAINERS and populate it for several subsystems
known to be using the kernel BZ actively (and add the upstream BZ for
ACPICA too).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarkko Nikula [Fri, 25 Nov 2016 15:22:27 +0000 (17:22 +0200)]
Revert "i2c: designware: do not disable adapter after transfer"
This reverts commit 0317e6c0f1dc1ba86b8d9dccc010c5e77b8355fa.
Srinivas recently reported that the touchscreen and touchpad stopped working
on a Haswell-based machine in the Linux 4.9-rc series, with timeout errors from
i2c_designware:
[ 16.508013] i2c_designware INT33C3:00: controller timed out
[ 16.508302] i2c_hid i2c-MSFT0001:02: failed to change power setting.
[ 17.532016] i2c_designware INT33C3:00: controller timed out
[ 18.556022] i2c_designware INT33C3:00: controller timed out
[ 18.556315] i2c_hid i2c-ATML1000:00: failed to retrieve report from device.
I managed to reproduce similar errors on another Haswell-based machine,
where touchscreen initialization fails in roughly 1/5 to 1/2 of boots.
Since the root cause of these errors is not yet clear and debugging is
ongoing, it is better to revert this commit as we are close to release.
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Rafael J. Wysocki [Fri, 25 Nov 2016 21:24:07 +0000 (22:24 +0100)]
Merge branches 'acpi-sleep-fixes' and 'acpi-wdat-fixes'
* acpi-sleep-fixes:
Revert "ACPI: Execute _PTS before system reboot"
* acpi-wdat-fixes:
watchdog: wdat_wdt: Select WATCHDOG_CORE
David S. Miller [Fri, 25 Nov 2016 21:17:12 +0000 (16:17 -0500)]
Merge tag 'linux-can-fixes-for-4.9-20161123' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2016-11-23
this is a pull request for net/master.
The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the
CAN-FD support, which may cause an out-of-bounds access otherwise.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Wed, 23 Nov 2016 14:24:35 +0000 (22:24 +0800)]
dwc_eth_qos: drop duplicate headers
Drop duplicate headers types.h and delay.h from dwc_eth_qos.c.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 25 Nov 2016 19:36:35 +0000 (11:36 -0800)]
Merge tag 'mfd-fixes-4.9.1' of git://git./linux/kernel/git/lee/mfd
Pull MFD fixes from Lee Jones:
"Received a couple of last minute fixes for v4.9.
The patches from Viresh are fixing issues displayed in KernelCI"
* tag 'mfd-fixes-4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: wm8994-core: Don't use managed regulator bulk get API
mfd: wm8994-core: Disable regulators before removing them
mfd: syscon: Support native-endian regmaps
Linus Torvalds [Fri, 25 Nov 2016 19:31:01 +0000 (11:31 -0800)]
Merge tag 'media/v4.9-4' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fix from Mauro Carvalho Chehab:
"Fix for the firmware load logic of the tuner-xc2028 driver"
* tag 'media/v4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
xc2028: Fix use-after-free bug properly
Linus Torvalds [Fri, 25 Nov 2016 18:51:35 +0000 (10:51 -0800)]
Merge tag 'drm-fixes-for-v4.9-rc7' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Seems to be quietening down nicely, a few mediatek, one exynos and one
hdlcd fix, along with two amd fixes"
* tag 'drm-fixes-for-v4.9-rc7' of git://people.freedesktop.org/~airlied/linux:
gpu/drm/exynos/exynos_hdmi - Unmap region obtained by of_iomap
drm/mediatek: fix null pointer dereference
drm/mediatek: fixed the calc method of data rate per lane
drm/mediatek: fix a typo of DISP_OD_CFG to OD_RELAYMODE
drm/radeon: fix power state when port pm is unavailable (v2)
drm/amdgpu: fix power state when port pm is unavailable
drm/arm: hdlcd: fix plane base address update
drm/amd/powerplay: avoid out of bounds access on array ps.
John David Anglin [Fri, 25 Nov 2016 01:18:14 +0000 (20:18 -0500)]
parisc: Also flush data TLB in flush_icache_page_asm
This is the second issue I noticed in reviewing the parisc TLB code.
The fic instruction may use either the instruction or data TLB in
flushing the instruction cache. Thus, on machines with a split TLB, we
should also flush the data TLB after setting up the temporary alias
registers.
Although this has no functional impact, I changed the pdtlb and pitlb
instructions to consistently use the index register %r0. These
instructions do not support integer displacements.
Tested on rp3440 and c8000.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Helge Deller <deller@gmx.de>
John David Anglin [Fri, 25 Nov 2016 01:06:32 +0000 (20:06 -0500)]
parisc: Fix race in pci-dma.c
We are still troubled by occasional random segmentation faults and
memory corruption on SMP machines. This causes quite a few
package builds to fail on the Debian buildd machines for parisc. When
gcc-6 failed to build three times in a row, I looked again at the TLB
related code. I found a couple of issues. This is the first.
In general, we need to ensure page table updates and corresponding TLB
purges are atomic. The attached patch fixes an instance in pci-dma.c
where the page table update was not guarded by the TLB lock.
Tested on rp3440 and c8000. So far, no further random segmentation
faults have been observed.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Tue, 22 Nov 2016 17:08:30 +0000 (18:08 +0100)]
parisc: Switch to generic sched_clock implementation
Drop the open-coded sched_clock() function and replace it by the provided
GENERIC_SCHED_CLOCK implementation. We have seen quite a few hung tasks in the
past, which seem to be fixed by this patch.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Helge Deller <deller@gmx.de>
John David Anglin [Mon, 21 Nov 2016 02:12:36 +0000 (21:12 -0500)]
parisc: Fix races in parisc_setup_cache_timing()
Helge reported to me the following startup crash:
[ 0.000000] Linux version 4.8.0-1-parisc64-smp (debian-kernel@lists.debian.org) (gcc version 5.4.1 20161019 (GCC) ) #1 SMP Debian 4.8.7-1 (2016-11-13)
[ 0.000000] The 64-bit Kernel has started...
[ 0.000000] Kernel default page size is 4 KB. Huge pages enabled with 1 MB physical and 2 MB virtual size.
[ 0.000000] Determining PDC firmware type: System Map.
[ 0.000000] model 9000/785/J5000
[ 0.000000] Total Memory: 2048 MB
[ 0.000000] Memory: 2018528K/2097152K available (9272K kernel code, 3053K rwdata, 1319K rodata, 1024K init, 840K bss, 78624K reserved, 0K cma-reserved)
[ 0.000000] virtual kernel memory layout:
[ 0.000000] vmalloc : 0x0000000000008000 - 0x000000003f000000 (1007 MB)
[ 0.000000] memory : 0x0000000040000000 - 0x00000000c0000000 (2048 MB)
[ 0.000000] .init : 0x0000000040100000 - 0x0000000040200000 (1024 kB)
[ 0.000000] .data : 0x0000000040b0e000 - 0x0000000040f533e0 (4372 kB)
[ 0.000000] .text : 0x0000000040200000 - 0x0000000040b0e000 (9272 kB)
[ 0.768910] Brought up 1 CPUs
[ 0.992465] NET: Registered protocol family 16
[ 2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000
[ 2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online
[ 2.726692] Setting cache flush threshold to 1024 kB
[ 2.729932] Not-handled unaligned insn 0x43ffff80
[ 2.798114] Setting TLB flush threshold to 140 kB
[ 2.928039] Unaligned handler failed, ret = -1
[ 3.000419] _______________________________
[ 3.000419] < Your System ate a SPARC! Gah! >
[ 3.000419] -------------------------------
[ 3.000419] \ ^__^
[ 3.000419] (__)\ )\/\
[ 3.000419] U ||----w |
[ 3.000419] || ||
[ 9.340055] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1
[ 9.448082] task: 00000000bfd48060 task.stack: 00000000bfd50000
[ 9.528040]
[ 10.760029] IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004025d154 000000004025d158
[ 10.868052] IIR: 43ffff80 ISR: 0000000000340000 IOR: 000001ff54150960
[ 10.960029] CPU: 1 CR30: 00000000bfd50000 CR31: 0000000011111111
[ 11.052057] ORIG_R28: 000000004021e3b4
[ 11.100045] IAOQ[0]: irq_exit+0x94/0x120
[ 11.152062] IAOQ[1]: irq_exit+0x98/0x120
[ 11.208031] RP(r2): irq_exit+0xb8/0x120
[ 11.256074] Backtrace:
[ 11.288067] [<00000000402cd944>] cpu_startup_entry+0x1e4/0x598
[ 11.368058] [<0000000040109528>] smp_callin+0x2c0/0x2f0
[ 11.436308] [<00000000402b53fc>] update_curr+0x18c/0x2d0
[ 11.508055] [<00000000402b73b8>] dequeue_entity+0x2c0/0x1030
[ 11.584040] [<00000000402b3cc0>] set_next_entity+0x80/0xd30
[ 11.660069] [<00000000402c1594>] pick_next_task_fair+0x614/0x720
[ 11.740085] [<000000004020dd34>] __schedule+0x394/0xa60
[ 11.808054] [<000000004020e488>] schedule+0x88/0x118
[ 11.876039] [<0000000040283d3c>] rescuer_thread+0x4d4/0x5b0
[ 11.948090] [<000000004028fc4c>] kthread+0x1ec/0x248
[ 12.016053] [<0000000040205020>] end_fault_vector+0x20/0xc0
[ 12.092239] [<00000000402050c0>] _switch_to_ret+0x0/0xf40
[ 12.164044]
[ 12.184036] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1
[ 12.244040] Backtrace:
[ 12.244040] [<000000004021c480>] show_stack+0x68/0x80
[ 12.244040] [<00000000406f332c>] dump_stack+0xec/0x168
[ 12.244040] [<000000004021c74c>] die_if_kernel+0x25c/0x430
[ 12.244040] [<000000004022d320>] handle_unaligned+0xb48/0xb50
[ 12.244040]
[ 12.632066] ---[ end trace 9ca05a7215c7bbb2 ]---
[ 12.692036] Kernel panic - not syncing: Attempted to kill the idle task!
We have the insn 0x43ffff80 in IIR but from IAOQ we should have:
4025d150: 0f f3 20 df ldd,s r19(r31),r31
4025d154: 0f 9f 00 9c ldw r31(ret0),ret0
4025d158: bf 80 20 58 cmpb,*<> r0,ret0,4025d18c <irq_exit+0xcc>
Cpu0 has just completed running parisc_setup_cache_timing:
[ 2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000
[ 2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online
[ 2.726692] Setting cache flush threshold to 1024 kB
[ 2.729932] Not-handled unaligned insn 0x43ffff80
[ 2.798114] Setting TLB flush threshold to 140 kB
[ 2.928039] Unaligned handler failed, ret = -1
From the backtrace, cpu1 is in smp_callin:
void __init smp_callin(void)
{
	int slave_id = cpu_now_booting;

	smp_cpu_init(slave_id);
	preempt_disable();

	flush_cache_all_local(); /* start with known state */
	flush_tlb_all_local(NULL);

	local_irq_enable(); /* Interrupts have been off until now */

	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
So, it has just flushed its caches and the TLB. It would seem either the
flushes in parisc_setup_cache_timing or smp_callin have corrupted kernel
memory.
The attached patch reworks parisc_setup_cache_timing to remove the races
in setting the cache and TLB flush thresholds. It also corrects the
number of bytes flushed in the TLB calculation.
The patch flushes the cache and TLB on cpu0 before starting the
secondary processors so that they are started from a known state.
Tested with a few reboots on c8000.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Helge Deller <deller@gmx.de>
Viresh Kumar [Thu, 27 Oct 2016 10:20:18 +0000 (15:50 +0530)]
mfd: wm8994-core: Don't use managed regulator bulk get API
The kernel WARNs and then crashes today if wm8994_device_init() fails
after calling devm_regulator_bulk_get().
That happens because there are multiple devices involved here and the
order in which managed resources are freed isn't correct.
The regulators are added as children of wm8994->dev, whereas
devm_regulator_bulk_get() also receives wm8994->dev as the device, even
though it gets the same regulators that were added as children of
wm8994->dev earlier.
During failures, the children are removed first and the core eventually
calls regulator_unregister() for them. As regulator_put() was never done
for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at
WARN_ON(rdev->open_count);
And eventually it crashes from debugfs_remove_recursive().
--------x------------------x----------------
wm8994 3-001a: Device is not a WM8994, ID is 0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14)
[<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c)
[<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100)
[<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28)
[<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0)
[<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc)
[<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110)
[<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28)
[<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104)
[<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218)
[<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88)
[<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20)
[<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64)
[<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78)
[<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30)
[<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0)
[<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc)
[<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0)
[<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
[<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
[<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
[<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
[<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
[<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
[<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
[<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
[<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
---[ end trace 0919d3d0bc998260 ]---
[snip..]
Unable to handle kernel NULL pointer dereference at virtual address 00000078
pgd = c0004000
[00000078] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.8.0-rc6-00154-g54fe84cbd50b #41
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
task: ee874000 task.stack: ee878000
PC is at down_write+0x14/0x54
LR is at debugfs_remove_recursive+0x30/0x150
[snip..]
[<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150)
[<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac)
[<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c)
[<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc)
[<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0)
[<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
[<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
[<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
[<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
[<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
[<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
[<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
[<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
[<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)
---[ end trace 0919d3d0bc998262 ]---
--------x------------------x----------------
Fix the kernel warnings and crashes by using regulator_bulk_get()
instead of devm_regulator_bulk_get() and explicitly freeing the supplies
in exit paths.
Tested on Exynos 5250, dual core ARM A15 machine.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Viresh Kumar [Fri, 16 Sep 2016 03:26:59 +0000 (08:56 +0530)]
mfd: wm8994-core: Disable regulators before removing them
The order in which resources are freed in wm8994_device_exit() is not
correct: the regulators are removed before they are disabled.
Fix it by reordering the code, which also makes it the exact opposite of
wm8994_device_init().
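The ordering rule can be illustrated with a small userspace model. This is only a sketch: the step names, the event log and the model_device_init()/model_device_exit() functions are hypothetical stand-ins and are not the wm8994 driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical steps, recorded so the ordering can be inspected. */
enum step { GET_SUPPLIES, ENABLE_SUPPLIES, DISABLE_SUPPLIES, FREE_SUPPLIES };

#define MAX_EVENTS 8
static enum step events[MAX_EVENTS];
static size_t n_events;

static void record(enum step s)
{
	assert(n_events < MAX_EVENTS);
	events[n_events++] = s;
}

/* Init: acquire the supplies first, then enable them. */
static void model_device_init(void)
{
	record(GET_SUPPLIES);
	record(ENABLE_SUPPLIES);
}

/* Exit: the exact opposite of init, so disable before freeing. */
static void model_device_exit(void)
{
	record(DISABLE_SUPPLIES);
	record(FREE_SUPPLIES);
}
```

Keeping the exit path as a mirror image of the init path makes it easy to check by eye that nothing is released while it is still in use.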
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Matt Redfearn [Wed, 9 Nov 2016 13:26:25 +0000 (13:26 +0000)]
MIPS: mm: Fix output of __do_page_fault
Since commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") the output from __do_page_fault on MIPS has been
pretty unreadable due to the lack of KERN_CONT markers. Use pr_cont
to provide the appropriate markers & restore the expected output.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14544/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Fri, 14 Oct 2016 09:17:31 +0000 (10:17 +0100)]
mfd: syscon: Support native-endian regmaps
The regmap devicetree binding documentation states that a native-endian
property should be supported as well as big-endian & little-endian,
however syscon in its duplication of the parsing of these properties
omits support for native-endian. Fix this by setting
REGMAP_ENDIAN_NATIVE when a native-endian property is found.
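A minimal sketch of the property parsing described above, written as a userspace model. The enum, the has_prop() helper and the order in which the properties are checked are assumptions made for illustration; only the three property names come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the endianness choices a regmap can have. */
enum model_endian {
	MODEL_ENDIAN_DEFAULT,
	MODEL_ENDIAN_BIG,
	MODEL_ENDIAN_LITTLE,
	MODEL_ENDIAN_NATIVE,
};

/* Return true if the node's property list contains the given name. */
static bool has_prop(const char *const *props, size_t n, const char *name)
{
	assert(props != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (strcmp(props[i], name) == 0)
			return true;
	return false;
}

/* All three documented properties are honoured, including native-endian. */
static enum model_endian model_parse_endian(const char *const *props, size_t n)
{
	if (has_prop(props, n, "big-endian"))
		return MODEL_ENDIAN_BIG;
	if (has_prop(props, n, "little-endian"))
		return MODEL_ENDIAN_LITTLE;
	if (has_prop(props, n, "native-endian"))
		return MODEL_ENDIAN_NATIVE;
	return MODEL_ENDIAN_DEFAULT;
}
```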
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Dave Airlie [Fri, 25 Nov 2016 04:21:26 +0000 (14:21 +1000)]
Merge branch 'mediatek-drm-fixes-2016-11-24' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes
This branch includes patches that fix a typo, correct the DSI data rate
calculation, and fix a NULL pointer dereference.
* 'mediatek-drm-fixes-2016-11-24' of https://github.com/ckhu-mediatek/linux.git-tags:
drm/mediatek: fix null pointer dereference
drm/mediatek: fixed the calc method of data rate per lane
drm/mediatek: fix a typo of DISP_OD_CFG to OD_RELAYMODE
Aneesh Kumar K.V [Thu, 24 Nov 2016 09:39:54 +0000 (15:09 +0530)]
powerpc/mm: Fixup kernel read only mapping
With commit e58e87adc8bf9 ("powerpc/mm: Update _PAGE_KERNEL_RO") we
started using the ppp value 0b110 to map the kernel read-only. But that
facility was only added as part of ISA 2.04. For earlier ISA versions
the only supported ppp bit value for a read-only mapping is 0b011. (This
implies both user and kernel get mapped using the same ppp bit value for
read-only mappings.)
Update the code so that for earlier architecture versions we use ppp
value 0b011 for read-only mappings. We don't differentiate between power5+
and power5 here and apply the new ppp bits only from power6 (ISA 2.05).
This keeps the changes minimal.
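The selection rule can be sketched as a small function. The macro names, the function name and the "ISA version times 100" encoding are hypothetical; the two ppp values and the power6 (ISA 2.05) cut-off are taken from the description above.

```c
#include <assert.h>

#define MODEL_PPP_RO_SHARED 0x3u /* 0b011: read-only, same value for user and kernel */
#define MODEL_PPP_RO_KERNEL 0x6u /* 0b110: kernel read-only */

/*
 * isa_x100 is the ISA version times 100 (hypothetical encoding),
 * e.g. 205 for ISA 2.05. The new ppp bits are used only from
 * ISA 2.05 (power6) onwards; everything earlier gets 0b011.
 */
static unsigned int model_kernel_ro_ppp(int isa_x100)
{
	assert(isa_x100 > 0);
	return isa_x100 >= 205 ? MODEL_PPP_RO_KERNEL : MODEL_PPP_RO_SHARED;
}
```

Note that ISA 2.04 (power5+) deliberately falls into the older group here, matching the choice not to differentiate between power5 and power5+.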
This fixes an issue with PS3 spu usage reported at
https://lkml.kernel.org/r/rep.1421449714.geoff@infradead.org
Fixes: e58e87adc8bf9 ("powerpc/mm: Update _PAGE_KERNEL_RO")
Cc: stable@vger.kernel.org # v4.7+
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Andrey Ryabinin [Thu, 24 Nov 2016 13:23:10 +0000 (13:23 +0000)]
mpi: Fix NULL ptr dereference in mpi_powm() [ver #3]
This fixes CVE-2016-8650.
If mpi_powm() is given a zero exponent, it wants to immediately return
either 1 or 0, depending on the modulus. However, if the result was
initialised with zero limb space, no limb space is allocated and a
NULL-pointer exception ensues.
Fix this by allocating a minimal amount of limb space for the result in
the 0-exponent case when the result is 1, and by not touching the limb
space when the result is 0.
This affects the use of RSA keys and X.509 certificates that carry them.
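The zero-exponent handling can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel's MPI code: struct model_mpi, model_resize() and model_powm_zero_exp() are hypothetical names, and the modulus is assumed to be non-zero.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long limb_t;

/* Hypothetical, much simplified big number. limbs may be NULL when alloced == 0. */
struct model_mpi {
	limb_t *limbs;
	size_t alloced; /* limbs allocated */
	size_t nlimbs;  /* limbs in use; 0 means the value 0 */
};

/* Grow the limb array to hold at least n limbs. Returns 0 on success, -1 on failure. */
static int model_resize(struct model_mpi *a, size_t n)
{
	limb_t *p;

	if (a->alloced >= n)
		return 0;
	p = realloc(a->limbs, n * sizeof(*p));
	if (!p)
		return -1;
	a->limbs = p;
	a->alloced = n;
	return 0;
}

/* res = base^0 mod m for m != 0: 0 if m == 1, otherwise 1. */
static int model_powm_zero_exp(struct model_mpi *res, const struct model_mpi *mod)
{
	int mod_is_one;

	assert(mod->nlimbs > 0); /* a zero modulus is out of scope for this sketch */
	mod_is_one = mod->nlimbs == 1 && mod->limbs[0] == 1;

	if (mod_is_one) {
		res->nlimbs = 0; /* result 0: the limb space is not touched */
		return 0;
	}
	if (model_resize(res, 1) < 0) /* result 1: make sure one limb exists */
		return -1;
	res->limbs[0] = 1;
	res->nlimbs = 1;
	return 0;
}
```

The important detail is that the write to limbs[0] only happens after the resize has succeeded, so a result that started with no limb space is never dereferenced.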
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8138ce5d>] mpi_powm+0x32/0x7e6
PGD 0
Oops: 0002 [#1] SMP
Modules linked in:
CPU: 3 PID: 3014 Comm: keyctl Not tainted 4.9.0-rc6-fscache+ #278
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8804011944c0 task.stack: ffff880401294000
RIP: 0010:[<ffffffff8138ce5d>] [<ffffffff8138ce5d>] mpi_powm+0x32/0x7e6
RSP: 0018:ffff880401297ad8 EFLAGS: 00010212
RAX: 0000000000000000 RBX: ffff88040868bec0 RCX: ffff88040868bba0
RDX: ffff88040868b260 RSI: ffff88040868bec0 RDI: ffff88040868bee0
RBP: ffff880401297ba8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000047 R11: ffffffff8183b210 R12: 0000000000000000
R13: ffff8804087c7600 R14: 000000000000001f R15: ffff880401297c50
FS: 00007f7a7918c700(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000401250000 CR4: 00000000001406e0
Stack:
ffff88040868bec0 0000000000000020 ffff880401297b00 ffffffff81376cd4
0000000000000100 ffff880401297b10 ffffffff81376d12 ffff880401297b30
ffffffff81376f37 0000000000000100 0000000000000000 ffff880401297ba8
Call Trace:
[<ffffffff81376cd4>] ? __sg_page_iter_next+0x43/0x66
[<ffffffff81376d12>] ? sg_miter_get_next_page+0x1b/0x5d
[<ffffffff81376f37>] ? sg_miter_next+0x17/0xbd
[<ffffffff8138ba3a>] ? mpi_read_raw_from_sgl+0xf2/0x146
[<ffffffff8132a95c>] rsa_verify+0x9d/0xee
[<ffffffff8132acca>] ? pkcs1pad_sg_set_buf+0x2e/0xbb
[<ffffffff8132af40>] pkcs1pad_verify+0xc0/0xe1
[<ffffffff8133cb5e>] public_key_verify_signature+0x1b0/0x228
[<ffffffff8133d974>] x509_check_for_self_signed+0xa1/0xc4
[<ffffffff8133cdde>] x509_cert_parse+0x167/0x1a1
[<ffffffff8133d609>] x509_key_preparse+0x21/0x1a1
[<ffffffff8133c3d7>] asymmetric_key_preparse+0x34/0x61
[<ffffffff812fc9f3>] key_create_or_update+0x145/0x399
[<ffffffff812fe227>] SyS_add_key+0x154/0x19e
[<ffffffff81001c2b>] do_syscall_64+0x80/0x191
[<ffffffff816825e4>] entry_SYSCALL64_slow_path+0x25/0x25
Code: 56 41 55 41 54 53 48 81 ec a8 00 00 00 44 8b 71 04 8b 42 04 4c 8b 67 18 45 85 f6 89 45 80 0f 84 b4 06 00 00 85 c0 75 2f 41 ff ce <49> c7 04 24 01 00 00 00 b0 01 75 0b 48 8b 41 18 48 83 38 01 0f
RIP [<ffffffff8138ce5d>] mpi_powm+0x32/0x7e6
RSP <ffff880401297ad8>
CR2: 0000000000000000
---[ end trace d82015255d4a5d8d ]---
Basically, this is a backport of a libgcrypt patch:
http://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git;a=patch;h=6e1adb05d290aeeb1c230c763970695f4a538526
Fixes: cdec9cb5167a ("crypto: GnuPG based MPI lib - source files (part 1)")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
cc: linux-ima-devel@lists.sourceforge.net
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
Andrey Ryabinin [Thu, 24 Nov 2016 13:23:03 +0000 (13:23 +0000)]
X.509: Fix double free in x509_cert_parse() [ver #3]
We shouldn't free cert->pub->key in x509_cert_parse() because
x509_free_certificate() also does this:
BUG: Double free or freeing an invalid pointer
...
Call Trace:
[<ffffffff81896c20>] dump_stack+0x63/0x83
[<ffffffff81356571>] kasan_object_err+0x21/0x70
[<ffffffff81356ed9>] kasan_report_double_free+0x49/0x60
[<ffffffff813561ad>] kasan_slab_free+0x9d/0xc0
[<ffffffff81350b7a>] kfree+0x8a/0x1a0
[<ffffffff81844fbf>] public_key_free+0x1f/0x30
[<ffffffff818455d4>] x509_free_certificate+0x24/0x90
[<ffffffff818460bc>] x509_cert_parse+0x2bc/0x300
[<ffffffff81846cae>] x509_key_preparse+0x3e/0x330
[<ffffffff818444cf>] asymmetric_key_preparse+0x6f/0x100
[<ffffffff8178bec0>] key_create_or_update+0x260/0x5f0
[<ffffffff8178e6d9>] SyS_add_key+0x199/0x2a0
[<ffffffff821d823b>] entry_SYSCALL_64_fastpath+0x1e/0xad
Object at ffff880110bd1900, in cache kmalloc-512 size: 512
....
Freed:
PID = 2579
[<ffffffff8104283b>] save_stack_trace+0x1b/0x20
[<ffffffff813558f6>] save_stack+0x46/0xd0
[<ffffffff81356183>] kasan_slab_free+0x73/0xc0
[<ffffffff81350b7a>] kfree+0x8a/0x1a0
[<ffffffff818460a3>] x509_cert_parse+0x2a3/0x300
[<ffffffff81846cae>] x509_key_preparse+0x3e/0x330
[<ffffffff818444cf>] asymmetric_key_preparse+0x6f/0x100
[<ffffffff8178bec0>] key_create_or_update+0x260/0x5f0
[<ffffffff8178e6d9>] SyS_add_key+0x199/0x2a0
[<ffffffff821d823b>] entry_SYSCALL_64_fastpath+0x1e/0xad
Fixes: db6c43bd2132 ("crypto: KEYS: convert public key and digsig asym to the akcipher api")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Arvind Yadav [Wed, 19 Oct 2016 10:04:16 +0000 (15:34 +0530)]
gpu/drm/exynos/exynos_hdmi - Unmap region obtained by of_iomap
Free memory mapping, if hdmi_probe is not successful.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
David S. Miller [Thu, 24 Nov 2016 21:24:20 +0000 (16:24 -0500)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:
====================
pull request: bluetooth 2016-11-23
Sorry about the late pull request for 4.9, but we have one more
important Bluetooth patch that should make it to the release. It fixes
connection creation for Bluetooth LE controllers that do not have a
public address (only a random one).
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Wed, 23 Nov 2016 01:57:04 +0000 (20:57 -0500)]
net sched filters: fix filter handle ID in tfilter_notify_chain()
tfilter_notify_chain() should pass a valid filter handle, not the netlink flags.
Fixes: 30a391a13ab92 ("net sched filters: pass netlink message flags in event notification")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 22 Nov 2016 19:40:58 +0000 (11:40 -0800)]
net: dsa: bcm_sf2: Ensure we re-negotiate EEE after link change
In case the link changes and EEE is enabled or disabled, always try to
re-negotiate this with the link partner.
Fixes: 450b05c15f9c ("net: dsa: bcm_sf2: add support for controlling EEE")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Gospodarek [Tue, 22 Nov 2016 18:14:08 +0000 (13:14 -0500)]
bnxt: do not busy-poll when link is down
When busy polling while a link is down (during a link-flap test), TX
timeouts were observed as well as the following messages in the ring
buffer:
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free tx failed. rc:-1
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free rx failed. rc:-1
These were resolved by checking for link status and returning if link
was not up.
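The guard described above can be sketched as a small userspace model. struct model_nic and model_busy_poll() are hypothetical names and do not correspond to the bnxt_en driver's structures; the sketch only shows the early return when the link is down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state, reduced to what the guard needs. */
struct model_nic {
	bool link_up;
	int pending; /* packets waiting in the ring */
};

/* One busy-poll pass. Returns the number of packets processed. */
static int model_busy_poll(struct model_nic *nic, int budget)
{
	int done;

	assert(nic != NULL && budget >= 0);
	if (!nic->link_up)
		return 0; /* link is down: do not touch the rings at all */

	done = nic->pending < budget ? nic->pending : budget;
	nic->pending -= done;
	return done;
}
```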
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Rob Miller <rob.miller@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 22 Nov 2016 17:06:45 +0000 (09:06 -0800)]
udplite: call proper backlog handlers
In commits 93821778def10 ("udp: Fix rcv socket locking") and
f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into
__udpv6_queue_rcv_skb") UDP backlog handlers were renamed, but UDPlite
was forgotten.
This leads to crashes if the UDPlite header is pulled twice, which happens
starting from commit e6afc8ace6dd ("udp: remove headers from UDP packets
before queueing")
Bug found by syzkaller team, thanks a lot guys!
Note that backlog use in UDP/UDPlite is scheduled to be removed starting
from linux-4.10, so this patch is only needed up to linux-4.9
Fixes: 93821778def1 ("udp: Fix rcv socket locking")
Fixes: f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb")
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 24 Nov 2016 18:51:18 +0000 (10:51 -0800)]
Merge tag 'mmc-v4.9-rc5' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC host:
- sdhci-of-esdhc: Fix card detection
- dw_mmc: Fix DMA error path"
* tag 'mmc-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: dw_mmc: fix the error handling for dma operation
mmc: sdhci-of-esdhc: fixup PRESENT_STATE read
Linus Torvalds [Thu, 24 Nov 2016 18:38:20 +0000 (10:38 -0800)]
Merge tag 'usb-4.9-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a few small USB fixes and new device ids for 4.9-rc7.
The majority of these fixes are in the musb driver, fixing a number of
regressions that have been reported but took a while to resolve. The
other fixes are all small ones, to resolve other reported minor
issues.
All have been in linux-next for a while with no reported issues"
* tag 'usb-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: f_fs: fix wrong parenthesis in ffs_func_req_match()
phy: twl4030-usb: Fix for musb session bit based PM
usb: musb: Drop pointless PM runtime code for dsps glue
usb: musb: Add missing pm_runtime_disable and drop 2430 PM timeout
usb: musb: Fix PM for hub disconnect
usb: musb: Fix sleeping function called from invalid context for hdrc glue
usb: musb: Fix broken use of static variable for multiple instances
USB: serial: cp210x: add ID for the Zone DPMX
usb: chipidea: move the lock initialization to core file
Fix USB CB/CBI storage devices with CONFIG_VMAP_STACK=y
USB: serial: ftdi_sio: add support for TI CC3200 LaunchPad
Linus Torvalds [Thu, 24 Nov 2016 17:40:26 +0000 (09:40 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- DMA-on-stack fixes for a couple drivers, from Benjamin Tissoires
- small memory sanitization fix for sensor-hub driver, from Song
Hongyan
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: hid-sensor-hub: clear memory to avoid random data
HID: rmi: make transfer buffers DMA capable
HID: magicmouse: make transfer buffers DMA capable
HID: lg: make transfer buffers DMA capable
HID: cp2112: make transfer buffers DMA capable
Radim Krčmář [Wed, 23 Nov 2016 20:25:48 +0000 (21:25 +0100)]
KVM: x86: check for pic and ioapic presence before use
Split irqchip allows pic and ioapic routes to be used without them being
created, which results in NULL access. Check for NULL and avoid it.
(The setup is too racy for a nicer solution.)
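A minimal userspace sketch of the NULL check, assuming a VM object whose ioapic pointer stays NULL until the chip is created. struct model_vm, struct model_ioapic, MODEL_IOAPIC_PINS and model_set_ioapic_irq() are hypothetical and only model the shape of the guard, not the KVM code.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_IOAPIC_PINS 24 /* hypothetical pin count */

struct model_ioapic {
	int level[MODEL_IOAPIC_PINS];
};

struct model_vm {
	struct model_ioapic *ioapic; /* NULL when the chip was never created */
};

/* Deliver a level change to an ioapic pin. Returns 0 on success, -1 otherwise. */
static int model_set_ioapic_irq(struct model_vm *vm, unsigned int pin, int level)
{
	assert(vm != NULL);
	if (!vm->ioapic)
		return -1; /* a route exists but the chip does not: refuse */
	if (pin >= MODEL_IOAPIC_PINS)
		return -1;
	vm->ioapic->level[pin] = level;
	return 0;
}
```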
Found by syzkaller:
general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 3 PID: 11923 Comm: kworker/3:2 Not tainted 4.9.0-rc5+ #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events irqfd_inject
task: ffff88006a06c7c0 task.stack: ffff880068638000
RIP: 0010:[...] [...] __lock_acquire+0xb35/0x3380 kernel/locking/lockdep.c:3221
RSP: 0000:ffff88006863ea20 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000039 RSI: 0000000000000000 RDI: 1ffff1000d0c7d9e
RBP: ffff88006863ef58 R08: 0000000000000001 R09: 0000000000000000
R10: 00000000000001c8 R11: 0000000000000000 R12: ffff88006a06c7c0
R13: 0000000000000001 R14: ffffffff8baab1a0 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88006d100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004abdd0 CR3: 000000003e2f2000 CR4: 00000000000026e0
Stack:
ffffffff894d0098 1ffff1000d0c7d56 ffff88006863ecd0 dffffc0000000000
ffff88006a06c7c0 0000000000000000 ffff88006863ecf8 0000000000000082
0000000000000000 ffffffff815dd7c1 ffffffff00000000 ffffffff00000000
Call Trace:
[...] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3746
[...] __raw_spin_lock include/linux/spinlock_api_smp.h:144
[...] _raw_spin_lock+0x38/0x50 kernel/locking/spinlock.c:151
[...] spin_lock include/linux/spinlock.h:302
[...] kvm_ioapic_set_irq+0x4c/0x100 arch/x86/kvm/ioapic.c:379
[...] kvm_set_ioapic_irq+0x8f/0xc0 arch/x86/kvm/irq_comm.c:52
[...] kvm_set_irq+0x239/0x640 arch/x86/kvm/../../../virt/kvm/irqchip.c:101
[...] irqfd_inject+0xb4/0x150 arch/x86/kvm/../../../virt/kvm/eventfd.c:60
[...] process_one_work+0xb40/0x1ba0 kernel/workqueue.c:2096
[...] worker_thread+0x214/0x18a0 kernel/workqueue.c:2230
[...] kthread+0x328/0x3e0 kernel/kthread.c:209
[...] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: 49df6397edfc ("KVM: x86: Split the APIC from the rest of IRQCHIP.")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Radim Krčmář [Wed, 23 Nov 2016 20:15:27 +0000 (21:15 +0100)]
KVM: x86: fix out-of-bounds accesses of rtc_eoi map
KVM was using arrays of size KVM_MAX_VCPUS with vcpu_id, but the ID can be
bigger than the maximum number of VCPUs, resulting in out-of-bounds
access.
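The sizing problem can be sketched with a small model. The constants below are hypothetical small values chosen for readability, and struct model_eoi_map and model_mark_pending() are illustrative names; the point is only that an array indexed by ID must be sized by the ID range rather than by the number of vCPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_VCPUS   4  /* hypothetical: how many vCPUs may exist */
#define MODEL_MAX_VCPU_ID 15 /* hypothetical: largest ID a vCPU may carry */

struct model_eoi_map {
	/* Indexed by vcpu_id, so sized by the ID range, not by MODEL_MAX_VCPUS. */
	bool pending[MODEL_MAX_VCPU_ID + 1];
};

/* Mark a vCPU as having a pending EOI. Returns 0 on success, -1 for a bad ID. */
static int model_mark_pending(struct model_eoi_map *map, unsigned int vcpu_id)
{
	assert(map != NULL);
	if (vcpu_id > MODEL_MAX_VCPU_ID)
		return -1;
	map->pending[vcpu_id] = true;
	return 0;
}
```

With an array of MODEL_MAX_VCPUS entries, an ID such as 9 would be a perfectly valid vCPU ID and still write past the end of the array.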
Found by syzkaller:
BUG: KASAN: slab-out-of-bounds in __apic_accept_irq+0xb33/0xb50 at addr [...]
Write of size 1 by task a.out/27101
CPU: 1 PID: 27101 Comm: a.out Not tainted 4.9.0-rc5+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[...]
Call Trace:
[...] __apic_accept_irq+0xb33/0xb50 arch/x86/kvm/lapic.c:905
[...] kvm_apic_set_irq+0x10e/0x180 arch/x86/kvm/lapic.c:495
[...] kvm_irq_delivery_to_apic+0x732/0xc10 arch/x86/kvm/irq_comm.c:86
[...] ioapic_service+0x41d/0x760 arch/x86/kvm/ioapic.c:360
[...] ioapic_set_irq+0x275/0x6c0 arch/x86/kvm/ioapic.c:222
[...] kvm_ioapic_inject_all arch/x86/kvm/ioapic.c:235
[...] kvm_set_ioapic+0x223/0x310 arch/x86/kvm/ioapic.c:670
[...] kvm_vm_ioctl_set_irqchip arch/x86/kvm/x86.c:3668
[...] kvm_arch_vm_ioctl+0x1a08/0x23c0 arch/x86/kvm/x86.c:3999
[...] kvm_vm_ioctl+0x1fa/0x1a70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: af1bae5497b9 ("KVM: x86: bump KVM_MAX_VCPU_ID to 1023")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Radim Krčmář [Wed, 23 Nov 2016 20:15:00 +0000 (21:15 +0100)]
KVM: x86: drop error recovery in em_jmp_far and em_ret_far
em_jmp_far and em_ret_far assumed that setting IP can only fail in 64-bit
mode, but syzkaller proved otherwise (and the SDM agrees).
The code segment was restored upon failure, but it was left uninitialized
outside of long mode, which could lead to a leak of host kernel stack.
We could have fixed that by always saving and restoring the CS, but we
take a simpler approach and just break any guest that manages to fail,
as the error recovery is error-prone and modern CPUs don't need the
emulator for this.
Found by syzkaller:
WARNING: CPU: 2 PID: 3668 at arch/x86/kvm/emulate.c:2217 em_ret_far+0x428/0x480
Kernel panic - not syncing: panic_on_warn set ...
CPU: 2 PID: 3668 Comm: syz-executor Not tainted 4.9.0-rc4+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[...]
Call Trace:
[...] __dump_stack lib/dump_stack.c:15
[...] dump_stack+0xb3/0x118 lib/dump_stack.c:51
[...] panic+0x1b7/0x3a3 kernel/panic.c:179
[...] __warn+0x1c4/0x1e0 kernel/panic.c:542
[...] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
[...] em_ret_far+0x428/0x480 arch/x86/kvm/emulate.c:2217
[...] em_ret_far_imm+0x17/0x70 arch/x86/kvm/emulate.c:2227
[...] x86_emulate_insn+0x87a/0x3730 arch/x86/kvm/emulate.c:5294
[...] x86_emulate_instruction+0x520/0x1ba0 arch/x86/kvm/x86.c:5545
[...] emulate_instruction arch/x86/include/asm/kvm_host.h:1116
[...] complete_emulated_io arch/x86/kvm/x86.c:6870
[...] complete_emulated_mmio+0x4e9/0x710 arch/x86/kvm/x86.c:6934
[...] kvm_arch_vcpu_ioctl_run+0x3b7a/0x5a90 arch/x86/kvm/x86.c:6978
[...] kvm_vcpu_ioctl+0x61e/0xdd0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2557
[...] vfs_ioctl fs/ioctl.c:43
[...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679
[...] SYSC_ioctl fs/ioctl.c:694
[...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685
[...] entry_SYSCALL_64_fastpath+0x1f/0xc2
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: d1442d85cc30 ("KVM: x86: Handle errors when RIP is set during far jumps")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>