Dong Jia Shi [Wed, 2 May 2018 07:25:59 +0000 (09:25 +0200)]
vfio: ccw: fix error return in vfio_ccw_sch_event
If the device has not been registered, or there is work pending,
we should reschedule a sch_event call again.
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <
20180502072559.50691-1-bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Harald Freudenberger [Wed, 25 Apr 2018 09:43:17 +0000 (11:43 +0200)]
s390/archrandom: Rework arch random implementation.
The arch_get_random_seed_long() invocation done by the random device
driver is done in interrupt context and may be invoked very very
frequently. The existing s390 arch_get_random_seed*() implementation
uses the PRNO(TRNG) instruction which produces excellent high quality
entropy but is relatively slow and thus expensive.
This fix reworks the arch_get_random_seed* implementation. It
introduces a buffer concept to decouple the delivery of random data
via arch_get_random_seed*() from the generation of new random
bytes. The buffer of random data is filled asynchronously by a
workqueue thread.
If there are enough bytes in the buffer the s390_arch_random_generate()
just delivers these bytes. Otherwise false is returned until the worker
thread refills the buffer.
The worker fills the rng buffer by pulling fresh entropy from the
high quality (but slow) true hardware random generator. This entropy
is then spread over the buffer with an pseudo random generator.
As the arch_get_random_seed_long() fetches 8 bytes and the calling
function add_interrupt_randomness() counts this as 1 bit entropy the
distribution needs to make sure there is in fact 1 bit entropy
contained in 8 bytes of the buffer. The current values pull 32 byte
entropy and scatter this into a 2048 byte buffer. So 8 byte in the
buffer will contain 1 bit of entropy.
The worker thread is rescheduled based on the charge level of the
buffer but at least with 500 ms delay to avoid too much cpu consumption.
So the max. amount of rng data delivered via arch_get_random_seed is
limited to 4Kb per second.
Signed-off-by: Harald Freudenberger <freude@de.ibm.com>
Reviewed-by: Patrick Steuer <patrick.steuer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ursula Braun [Tue, 22 May 2018 10:42:57 +0000 (12:42 +0200)]
s390/net: add pnetid support
s390 hardware supports the definition of a so-call Physical NETwork
IDentifier (short PNETID) per network device port. These PNETIDS
can be used to identify network devices that are attached to the same
physical network (broadcast domain).
This patch provides the interface to extract the PNETID of a port of
a device attached to the ccw-bus or pci-bus.
Parts of this patch are based on an initial implementation by
Thomas Richter.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Wed, 16 May 2018 09:25:21 +0000 (11:25 +0200)]
s390/dasd: simplify locking in dasd_times_out
Provide __dasd_cancel_req that is called with the ccw device lock
held to simplify the locking in dasd_times_out. Also this removes
the following sparse warning:
context imbalance in 'dasd_times_out' - different lock contexts for basic block
Note: with this change dasd_schedule_device_bh is now called (via
dasd_cancel_req) with the ccw device lock held. But is is already
the case for other codepaths.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Wed, 15 Feb 2017 10:45:07 +0000 (11:45 +0100)]
s390/cio: add test for ccwgroup device
Add a test to check if a given device is a ccwgroup device.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Thu, 23 Jun 2016 08:58:15 +0000 (10:58 +0200)]
s390/cio: add helper to query utility strings per given ccw device
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Masahiro Yamada [Wed, 9 May 2018 07:51:39 +0000 (16:51 +0900)]
s390: remove no-op macro VMLINUX_SYMBOL()
VMLINUX_SYMBOL() is no-op unless CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
is defined. It has ever been selected only by BLACKFIN and METAG.
VMLINUX_SYMBOL() is unneeded for s390-specific code.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Wed, 9 May 2018 07:47:53 +0000 (09:47 +0200)]
s390: remove closung punctuation from spectre messages
There should not be a '.' at the end of the spectre syslog messages.
Remove them.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Vasily Gorbik [Thu, 3 May 2018 14:40:13 +0000 (16:40 +0200)]
s390: introduce compile time check for empty .bss section
Introduce compile time check for files which should avoid using .bss
section, because of the following reasons:
- .bss section has not been zeroed yet,
- initrd has not been moved to a safe location and could be overlapping
with .bss section.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Fri, 4 May 2018 11:22:12 +0000 (13:22 +0200)]
s390/early: move functions which may not access bss section to extra file
Move functions which may not access bss section to extra file. This
makes it easier to verify that all early functions which may not rely
on an initialized bss section are not accessing it.
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Fri, 4 May 2018 10:58:48 +0000 (12:58 +0200)]
s390/early: get rid of #ifdef CONFIG_BLK_DEV_INITRD
Use IS_ENABLED to get rid of an #ifdef statement.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Fri, 4 May 2018 10:58:00 +0000 (12:58 +0200)]
s390/early: get rid of memmove_early
memmove_early was introduced with commit
d543a106f96d6 ("s390: fix
initrd corruptions with gcov/kcov instrumented kernels"). The reason
for writing this extra memmove implementation was to be able to move
memory from an unknown location (aka SCSI IPL parameter block) to a
known location.
This had to done early before it was known if the SCSI IPL parameter
block pointer was valid or not, and therefore the memmove
implementation was supposed to be able to handle program checks.
The code has been changed and especially with commit
d08091ac9654
("s390/ipl: rely on diag308 store to get ipl info") and
commit
b4623d4e5b23 ("s390: provide memmove implementation") there
is no need to have a memmove version that can handle program checks,
and in addition it cannot be gcov/kcov instrumented anymore.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Thomas Richter [Tue, 8 May 2018 08:18:39 +0000 (10:18 +0200)]
s390/cpum_sf: Add data entry sizes to sampling trailer entry
The CPU Measurement sampling facility creates a trailer entry for each
Sample-Data-Block of stored samples. The trailer entry contains the sizes
(in bytes) of the stored sampling types:
- basic-sampling data entry size
- diagnostic-sampling data entry size
Both sizes are 2 bytes long.
This patch changes the trailer entry definition to reflect this.
Fixes:
fcc77f507333 ("s390/cpum_sf: Atomically reset trailer entry fields of sample-data-blocks")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Thomas Richter [Tue, 8 May 2018 05:53:39 +0000 (07:53 +0200)]
perf: fix invalid bit in diagnostic entry
The s390 CPU measurement facility sampling mode supports basic entries
and diagnostic entries. Each entry has a valid bit to indicate the
status of the entry as valid or invalid.
This bit is bit 31 in the diagnostic entry, but the bit mask definition
refers to bit 30.
Fix this by making the reserved field one bit larger.
Fixes:
7e75fc3ff4cf ("s390/cpum_sf: Add raw data sampling to support the diagnostic-sampling function")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Hendrik Brueckner [Thu, 3 May 2018 13:56:15 +0000 (15:56 +0200)]
s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
Correct a trinity finding for the perf_event_open() system call with
a perf event attribute structure that uses a frequency but has the
sampling frequency set to zero. This causes a FP divide exception during
the sample rate initialization for the hardware sampling facility.
Fixes:
8c069ff4bd606 ("s390/perf: add support for the CPU-Measurement Sampling Facility")
Cc: stable@vger.kernel.org # 3.14+
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Mon, 23 Apr 2018 12:31:36 +0000 (14:31 +0200)]
s390: use expoline thunks in the BPF JIT
The BPF JIT need safe guarding against spectre v2 in the sk_load_xxx
assembler stubs and the indirect branches generated by the JIT itself
need to be converted to expolines.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Tue, 24 Apr 2018 13:32:08 +0000 (15:32 +0200)]
s390: extend expoline to BC instructions
The BPF JIT uses a 'b <disp>(%r<x>)' instruction in the definition
of the sk_load_word and sk_load_half functions.
Add support for branch-on-condition instructions contained in the
thunk code of an expoline.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Tue, 24 Apr 2018 09:18:49 +0000 (11:18 +0200)]
s390: remove indirect branch from do_softirq_own_stack
The inline assembly to call __do_softirq on the irq stack uses
an indirect branch. This can be replaced with a normal relative
branch.
Cc: stable@vger.kernel.org # 4.16
Fixes:
f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Tue, 24 Apr 2018 06:23:54 +0000 (08:23 +0200)]
s390: move spectre sysfs attribute code
The nospec-branch.c file is compiled without the gcc options to
generate expoline thunks. The return branch of the sysfs show
functions cpu_show_spectre_v1 and cpu_show_spectre_v2 is an indirect
branch as well. These need to be compiled with expolines.
Move the sysfs functions for spectre reporting to a separate file
and loose an '.' for one of the messages.
Cc: stable@vger.kernel.org # 4.16
Fixes:
d424986f1d ("s390: add sysfs attributes for spectre")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Wed, 25 Apr 2018 16:41:30 +0000 (18:41 +0200)]
s390/kernel: use expoline for indirect branches
The assember code in arch/s390/kernel uses a few more indirect branches
which need to be done with execute trampolines for CONFIG_EXPOLINE=y.
Cc: stable@vger.kernel.org # 4.16
Fixes:
f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Wed, 25 Apr 2018 16:35:26 +0000 (18:35 +0200)]
s390/ftrace: use expoline for indirect branches
The return from the ftrace_stub, _mcount, ftrace_caller and
return_to_handler functions is done with "br %r14" and "br %r1".
These are indirect branches as well and need to use execute
trampolines for CONFIG_EXPOLINE=y.
The ftrace_caller function is a special case as it returns to the
start of a function and may only use %r0 and %r1. For a pre z10
machine the standard execute trampoline uses a LARL + EX to do
this, but this requires *two* registers in the range %r1..%r15.
To get around this the 'br %r1' located in the lowcore is used,
then the EX instruction does not need an address register.
But the lowcore trick may only be used for pre z14 machines,
with noexec=on the mapping for the first page may not contain
instructions. The solution for that is an ALTERNATIVE in the
expoline THUNK generated by 'GEN_BR_THUNK %r1' to switch to
EXRL, this relies on the fact that a machine that supports
noexec=on has EXRL as well.
Cc: stable@vger.kernel.org # 4.16
Fixes:
f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Mon, 23 Apr 2018 12:31:36 +0000 (14:31 +0200)]
s390/lib: use expoline for indirect branches
The return from the memmove, memset, memcpy, __memset16, __memset32 and
__memset64 functions are done with "br %r14". These are indirect branches
as well and need to use execute trampolines for CONFIG_EXPOLINE=y.
Cc: stable@vger.kernel.org # 4.16
Fixes:
f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Mon, 23 Apr 2018 12:31:36 +0000 (14:31 +0200)]
s390/crc32-vx: use expoline for indirect branches
The return from the crc32_le_vgfm_16/crc32c_le_vgfm_16 and the
crc32_be_vgfm_16 functions are done with "br %r14". These are indirect
branches as well and need to use execute trampolines for CONFIG_EXPOLINE=y.
Cc: stable@vger.kernel.org # 4.16
Fixes:
f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Fri, 20 Apr 2018 14:49:46 +0000 (16:49 +0200)]
s390: move expoline assembler macros to a header
To be able to use the expoline branches in different assembler
files move the associated macros from entry.S to a new header
nospec-insn.h.
While we are at it make the macros a bit nicer to use.
Cc: stable@vger.kernel.org # 4.16
Fixes:
f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Halil Pasic [Tue, 24 Apr 2018 11:26:56 +0000 (13:26 +0200)]
vfio: ccw: fix cleanup if cp_prefetch fails
If the translation of a channel program fails, we may end up attempting
to clean up (free, unpin) stuff that never got translated (and allocated,
pinned) in the first place.
By adjusting the lengths of the chains accordingly (so the element that
failed, and all subsequent elements are excluded) cleanup activities
based on false assumptions can be avoided.
Let's make sure cp_free works properly after cp_prefetch returns with an
error by setting ch_len of a ccw chain to the number of the translated
CCWs on that chain.
Cc: stable@vger.kernel.org #v4.12+
Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <
20180423110113.59385-2-bjsdjshi@linux.vnet.ibm.com>
[CH: fixed typos]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Wed, 25 Apr 2018 10:39:06 +0000 (12:39 +0200)]
s390/kexec_file: add declaration of purgatory related globals
Fix the following sparse complaints:
arch/s390/purgatory/purgatory.c:18:5: warning: symbol 'kernel_entry' was not declared. Should it be static?
arch/s390/purgatory/purgatory.c:19:5: warning: symbol 'kernel_type' was not declared. Should it be static?
arch/s390/purgatory/purgatory.c:21:5: warning: symbol 'crash_start' was not declared. Should it be static?
arch/s390/purgatory/purgatory.c:22:5: warning: symbol 'crash_size' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Thu, 26 Apr 2018 12:31:52 +0000 (14:31 +0200)]
s390: update defconfigs
Change the following to y:
arch/s390/configs/performance_defconfig:262:warning: symbol value 'm' invalid for NF_TABLES_IPV4
arch/s390/configs/performance_defconfig:264:warning: symbol value 'm' invalid for NF_TABLES_ARP
arch/s390/configs/performance_defconfig:285:warning: symbol value 'm' invalid for NF_TABLES_IPV6
arch/s390/configs/performance_defconfig:306:warning: symbol value 'm' invalid for NF_TABLES_BRIDGE
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Harald Freudenberger [Mon, 23 Apr 2018 09:14:28 +0000 (11:14 +0200)]
MAINTAINERS: update s390 zcrypt maintainers email address
Signed-off-by: Harald Freudenberger <freude@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Linus Torvalds [Thu, 26 Apr 2018 23:36:11 +0000 (16:36 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio fixups from Michael Tsirkin:
- Latest header update will break QEMU (if it's rebuilt with the new
header) - and it seems that the code there is so fragile that any
change in this header will break it. Add a better interface so users
do not need to change their code every time that header changes.
- Fix virtio console for spec compliance.
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_console: reset on out of memory
virtio_console: move removal code
virtio_console: drop custom control queue cleanup
virtio_console: free buffers after reset
virtio: add ability to iterate over vqs
virtio_console: don't tie bufs to a vq
virtio_balloon: add array of stat names
Linus Torvalds [Thu, 26 Apr 2018 23:33:54 +0000 (16:33 -0700)]
Merge tag 'hwmon-for-linus-v4.17-rc3' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Add support for new Ryzen chips to k10temp driver
... making Phoronix happy
- Fix inconsistent chip access in nct6683 driver
- Handle absence of few types of sensors in scmi driver
* tag 'hwmon-for-linus-v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (k10temp) Add support for AMD Ryzen w/ Vega graphics
hwmon: (k10temp) Add temperature offset for Ryzen 2700X
hwmon: (nct6683) Enable EC access if disabled at boot
hwmon: (scmi) handle absence of few types of sensors
Linus Torvalds [Thu, 26 Apr 2018 23:28:24 +0000 (16:28 -0700)]
Merge tag 'pci-v4.17-fixes-1' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- fix Aardvark MRRS setting (Evan Wang)
- clarify "bandwidth available" link status message (Jakub Kicinski)
- update Kirin GPIO name to fix probe failure (Loic Poulain)
- fix Aardvark IRQ usage (Victor Gu)
- fix Aardvark config accessor issues (Victor Gu)
* tag 'pci-v4.17-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Add "PCIe" to pcie_print_link_status() messages
PCI: kirin: Fix reset gpio name
PCI: aardvark: Fix PCIe Max Read Request Size setting
PCI: aardvark: Use ISR1 instead of ISR0 interrupt in legacy irq mode
PCI: aardvark: Set PIO_ADDR_LS correctly in advk_pcie_rd_conf()
PCI: aardvark: Fix logic in advk_pcie_{rd,wr}_conf()
Linus Torvalds [Thu, 26 Apr 2018 23:22:47 +0000 (16:22 -0700)]
Merge tag 'trace-v4.17-rc1' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Add workqueue forward declaration (for new work, but a nice clean up)
- seftest fixes for the new histogram code
- Print output fix for hwlat tracer
- Fix missing system call events - due to change in x86 syscall naming
- Fix kprobe address being used by perf being hashed
* tag 'trace-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix missing tab for hwlat_detector print format
selftests: ftrace: Add a testcase for multiple actions on trigger
selftests: ftrace: Fix trigger extended error testcase
kprobes: Fix random address output of blacklist file
tracing: Fix kernel crash while using empty filter with perf
tracing/x86: Update syscall trace events to handle new prefixed syscall func names
tracing: Add missing forward declaration
Linus Torvalds [Thu, 26 Apr 2018 18:06:36 +0000 (11:06 -0700)]
Merge tag 'acpi-4.17-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These are two watchdog-related fixes, fix for a backlight regression
from the 4.16 cycle that unfortunately was propagated to -stable and a
button module modification to prevent graphics driver modules from
failing to load due to unmet dependencies if ACPI is disabled from the
kernel command line.
Specifics:
- Change the ACPI subsystem initialization ordering to initialize the
WDAT watchodg before reserving PNP motherboard resources so as to
allow the watchdog to allocate its resources before the PNP code
gets to them and prevents it from working correctly (Mika
Westerberg).
- Add a quirk for Lenovo Z50-70 to use the iTCO watchdog instead of
the WDAT one which conflicts with the RTC on that platform (Mika
Westerberg).
- Avoid breaking backlight handling on Dell XPS 13 2013 model by
allowing laptops to use the ACPI backlight by default even if they
are Windows 8-ready in principle (Hans de Goede)"
* tag 'acpi-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Only default only_lcd to true on Win8-ready _desktops_
ACPI / button: make module loadable when booted in non-ACPI mode
ACPI / watchdog: Prefer iTCO_wdt on Lenovo Z50-70
ACPI / scan: Initialize watchdog before PNP
Linus Torvalds [Thu, 26 Apr 2018 18:03:02 +0000 (11:03 -0700)]
Merge tag 'pm-4.17-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These are a Low Power S0 Idle quirk, a hibernation handling fix for
the PCI bus type and a brcmstb-avs-cpufreq driver fixup removing
development debug code from it.
Specifics:
- Blacklist the Low Power S0 Idle _DSM on ThinkPad X1 Tablet(2016)
where it causes issues and make it use ACPI S3 which works instead
of the non-working suspend-to-idle by default (Chen Yu).
- Fix the handling of hibernation in the PCI core for devices with
the DPM_FLAG_SMART_SUSPEND flag set to fix a regression affecting
intel-lpss I2C devices (Mika Westerberg).
- Drop development debug code from the brcmstb-avs-cpufreq driver
(Markus Mayer)"
* tag 'pm-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: brcmstb-avs-cpufreq: remove development debug support
PCI / PM: Do not clear state_saved in pci_pm_freeze() when smart suspend is set
ACPI / PM: Blacklist Low Power S0 Idle _DSM for ThinkPad X1 Tablet(2016)
Linus Torvalds [Thu, 26 Apr 2018 17:59:56 +0000 (10:59 -0700)]
Merge tag 'random_for_linus_stable' of git://git./linux/kernel/git/tytso/random
Pull /dev/random fixes from Ted Ts'o:
"Fix a regression on NUMA kernels and suppress excess unseeded entropy
pool warnings"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: rate limit unseeded randomness warnings
random: fix possible sleeping allocation from irq context
Linus Torvalds [Thu, 26 Apr 2018 17:29:46 +0000 (10:29 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of bug fixes:
- correct some CPU-MF counter names for z13 and z14
- correct locking in the vfio-ccw fsm_io_helper function
- provide arch_uretprobe_is_alive to avoid sigsegv with uretprobes
- fix a corner case with CPU-MF sampling in regard to execve
- fix expoline code revert for loadable modules
- update chpid descriptor for resource accessibility events
- fix dasd I/O errors due to outdated device alias infomation"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: correct module section names for expoline code revert
vfio: ccw: process ssch with interrupts disabled
s390: update sampling tag after task pid change
s390/cpum_cf: rename IBM z13/z14 counter names
s390/dasd: fix IO error for newly defined devices
s390/uprobes: implement arch_uretprobe_is_alive()
s390/cio: update chpid descriptor after resource accessibility event
Rafael J. Wysocki [Thu, 26 Apr 2018 13:11:39 +0000 (15:11 +0200)]
Merge branches 'acpi-watchdog', 'acpi-button' and 'acpi-video'
* acpi-watchdog:
ACPI / watchdog: Prefer iTCO_wdt on Lenovo Z50-70
* acpi-button:
ACPI / button: make module loadable when booted in non-ACPI mode
* acpi-video:
ACPI / video: Only default only_lcd to true on Win8-ready _desktops_
Rafael J. Wysocki [Thu, 26 Apr 2018 13:10:25 +0000 (15:10 +0200)]
Merge branches 'acpi-pm' and 'pm-cpufreq'
* acpi-pm:
ACPI / PM: Blacklist Low Power S0 Idle _DSM for ThinkPad X1 Tablet(2016)
* pm-cpufreq:
cpufreq: brcmstb-avs-cpufreq: remove development debug support
Linus Torvalds [Thu, 26 Apr 2018 04:23:38 +0000 (21:23 -0700)]
Merge tag 'for_v4.17-rc3' of git://git./linux/kernel/git/jack/linux-fs
Pull fsnotify fix from Jan Kara:
"A fix of a fsnotify race causing panics / softlockups"
* tag 'for_v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: Fix fsnotify_mark_connector race
Linus Torvalds [Thu, 26 Apr 2018 04:13:40 +0000 (21:13 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Eight bug fixes, one spelling update and one tracepoint addition.
The most serious is probably the mptsas write same fix because it
means anyone using these controllers sees errors when modern
filesystems try to issue discards"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: fix crash with iscsi target and dvd
scsi: sd_zbc: Avoid that resetting a zone fails sporadically
scsi: sd: Defer spinning up drive while SANITIZE is in progress
scsi: megaraid_sas: Do not log an error if FW successfully initializes.
scsi: ufs: add trace event for ufs upiu
scsi: core: remove reference to scsi_show_extd_sense()
scsi: mptsas: Disable WRITE SAME
scsi: fnic: fix spelling mistake in fnic stats "Abord" -> "Abort"
scsi: scsi_debug: IMMED related delay adjustments
scsi: iscsi: respond to netlink with unicast when appropriate
Linus Torvalds [Thu, 26 Apr 2018 04:05:15 +0000 (21:05 -0700)]
Merge tag 'for-linus-
20180425' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"I ended up sitting on this about a week longer than I wanted to, since
we were hashing out details with a timeout change. I've now killed
that patch, so we can flush the existing queue in due time.
This contains:
- Fix for an old regression, where entering the queue can be
disturbed by a signal to the process. This can cause spurious EIO.
Fix from Alan Jenkins.
- cdrom information leak fix from Dan.
- Trivial helper for testing queue FUA from Dave Chinner, part of his
O_DIRECT FUA series.
- Series of swim fixes from Finn that actually makes it work again.
- Loop O_DIRECT corruption fix, which caused data corruption in
production for us. From me.
- BFQ crash fix from me.
- bcache maintainer update. Michael no longer has the time to do it,
Coly has stepped up to serve as the new maintainer.
- blkcg locking fixes from Jiang Biao.
- Revert of a change from this merge window from Ming, that causes an
issue on some hardware.
- Minor clarification doc addition from Linus Walleij"
* tag 'for-linus-
20180425' of git://git.kernel.dk/linux-block: (22 commits)
Revert "blk-mq: remove code for dealing with remapping queue"
block: mq: Add some minor doc for core structs
bcache: mark Coly Li as bcache maintainer
MAINTAINERS: Remove me as maintainer of bcache
blkcg: init root blkcg_gq under lock
blkcg: small fix on comment in blkcg_init_queue
blkcg: don't hold blkcg lock when deactivating policy
block: add blk_queue_fua() helper function
cdrom: information leak in cdrom_ioctl_media_changed()
bfq-iosched: ensure to clear bic/bfqq pointers when preparing request
blk-mq: start request gstate with gen 1
block/swim: Select appropriate drive on device open
block/swim: Fix IO error at end of medium
block/swim: Check drive type
block/swim: Rename macros to avoid inconsistent inverted logic
block/swim: Don't log an error message for an invalid ioctl
block/swim: Remove extra put_disk() call from error path
block/swim: Fix array bounds check
m68k/mac: Don't remap SWIM MMIO region
loop: handle short DIO reads
...
Linus Torvalds [Thu, 26 Apr 2018 03:27:23 +0000 (20:27 -0700)]
Merge tag 'riscv-for-linus-4.17-rc3' of git://git./linux/kernel/git/palmer/riscv-linux
Pull RISC-V fixes from Palmer Dabbelt:
"This contains three small fixes related to the RISC-V port that I'd
like to target for 4.17-rc3:
- a Kconfig cleanup to select DMA_DIRECT_OPS instead of redefining it
in arch/riscv
- the removal of asm/handle_irq.h, which doesn't exist, from our arch
header list
- the addition of "-no-pie" the link rules for our VDSO-related
files, which fixes the build on systems where PIE is enabled by
default"
* tag 'riscv-for-linus-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
RISC-V: build vdso-dummy.o with -no-pie
riscv: there is no <asm/handle_irq.h>
riscv: select DMA_DIRECT_OPS instead of redefining it
Linus Torvalds [Wed, 25 Apr 2018 18:48:09 +0000 (11:48 -0700)]
Merge tag 'dma-mapping-4.17-3' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
"A few small dma-mapping fixes for Linux 4.17-rc3:
- don't loop to try GFP_DMA allocations if ZONE_DMA is not actually
enabled (regression in 4.16)
- don't try to do virt_to_page before we know we actuall have a valid
page in dma_common_mmap
- a comment fixup related to the above fix"
* tag 'dma-mapping-4.17-3' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: postpone cpu addr translation on mmap
dma-coherent: clarify dma_mmap_from_dev_coherent documentation
dma-direct: don't retry allocation for no-op GFP_DMA
Michael S. Tsirkin [Fri, 20 Apr 2018 18:00:13 +0000 (21:00 +0300)]
virtio_console: reset on out of memory
When out of memory and we can't add ctrl vq buffers,
probe fails. Unfortunately the error handling is
out of spec: it calls del_vqs without bothering
to reset the device first.
To fix, call the full cleanup function in this case.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Fri, 20 Apr 2018 17:51:18 +0000 (20:51 +0300)]
virtio_console: move removal code
Will make it reusable for error handling.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Fri, 20 Apr 2018 17:49:04 +0000 (20:49 +0300)]
virtio_console: drop custom control queue cleanup
We now cleanup all VQs on device removal - no need
to handle the control VQ specially.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Fri, 20 Apr 2018 17:24:23 +0000 (20:24 +0300)]
virtio_console: free buffers after reset
Console driver is out of spec. The spec says:
A driver MUST NOT decrement the available idx on a live
virtqueue (ie. there is no way to “unexpose” buffers).
and it does exactly that by trying to detach unused buffers
without doing a device reset first.
Defer detaching the buffers until device unplug.
Of course this means we might get an interrupt for
a vq without an attached port now. Handle that by
discarding the consumed buffer.
Reported-by: Tiwei Bie <tiwei.bie@intel.com>
Fixes:
b3258ff1d6 ("virtio: Decrement avail idx on buffer detach")
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Fri, 20 Apr 2018 17:22:40 +0000 (20:22 +0300)]
virtio: add ability to iterate over vqs
For cleanup it's helpful to be able to simply scan all vqs and discard
all data. Add an iterator to do that.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Fri, 20 Apr 2018 16:54:23 +0000 (19:54 +0300)]
virtio_console: don't tie bufs to a vq
an allocated buffer doesn't need to be tied to a vq -
only vq->vdev is ever used. Pass the function the
just what it needs - the vdev.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ming Lei [Tue, 24 Apr 2018 20:01:44 +0000 (04:01 +0800)]
Revert "blk-mq: remove code for dealing with remapping queue"
This reverts commit
37c7c6c76d431dd7ef9c29d95f6052bd425f004c.
Turns out some drivers(most are FC drivers) may not use managed
IRQ affinity, and has their customized .map_queues meantime, so
still keep this code for avoiding regression.
Reported-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Cc: Ewan Milne <emilne@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Peter Xu [Thu, 15 Mar 2018 06:06:39 +0000 (14:06 +0800)]
tracing: Fix missing tab for hwlat_detector print format
It's been missing for a while but no one is touching that up. Fix it.
Link: http://lkml.kernel.org/r/20180315060639.9578-1-peterx@redhat.com
CC: Ingo Molnar <mingo@kernel.org>
Cc:stable@vger.kernel.org
Fixes:
7b2c86250122d ("tracing: Add NMI tracing in hwlat detector")
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 5 Apr 2018 09:29:12 +0000 (18:29 +0900)]
selftests: ftrace: Add a testcase for multiple actions on trigger
Add a testcase for multiple actions with different
parameters on an event trigger, which has been fixed
by commit
192c283e93bd ("tracing: Add action comparisons
when testing matching hist triggers").
Link: http://lkml.kernel.org/r/152292055227.15769.6327959816123227152.stgit@devbox
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 5 Apr 2018 09:28:42 +0000 (18:28 +0900)]
selftests: ftrace: Fix trigger extended error testcase
Previous testcase redirects echo-out into /dev/null
using "&>" as below
echo "trigger-command" >> trigger &> /dev/null
But this means redirecting both stdout and stderr into
/dev/null because it is same as below
echo "trigger-command" >> trigger > /dev/null 2>&1
So ">> trigger" redirects stdout to trigger file, but
next "> /dev/null" redirects stdout to /dev/null again
and the last "2>/&1" redirects stderr to stdout (/dev/null)
This fixes it by "2> /dev/null". And also, since it
must fail, add "!" to echo command.
Link: http://lkml.kernel.org/r/152292052250.15769.12565292689264162435.stgit@devbox
Fixes:
f06eec4d0f2c ("selftests: ftrace: Add inter-event hist triggers testcases")
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Thomas Richter [Thu, 19 Apr 2018 10:55:56 +0000 (12:55 +0200)]
kprobes: Fix random address output of blacklist file
File /sys/kernel/debug/kprobes/blacklist displays random addresses:
[root@s8360046 linux]# cat /sys/kernel/debug/kprobes/blacklist
0x0000000047149a90-0x00000000bfcb099a print_type_x8
....
This breaks 'perf probe' which uses the blacklist file to prohibit
probes on certain functions by checking the address range.
Fix this by printing the correct (unhashed) address.
The file mode is read all but this is not an issue as the file
hierarchy points out:
# ls -ld /sys/ /sys/kernel/ /sys/kernel/debug/ /sys/kernel/debug/kprobes/
/sys/kernel/debug/kprobes/blacklist
dr-xr-xr-x 12 root root 0 Apr 19 07:56 /sys/
drwxr-xr-x 8 root root 0 Apr 19 07:56 /sys/kernel/
drwx------ 16 root root 0 Apr 19 06:56 /sys/kernel/debug/
drwxr-xr-x 2 root root 0 Apr 19 06:56 /sys/kernel/debug/kprobes/
-r--r--r-- 1 root root 0 Apr 19 06:56 /sys/kernel/debug/kprobes/blacklist
Everything in and below /sys/kernel/debug is rwx to root only,
no group or others have access.
Background:
Directory /sys/kernel/debug/kprobes is created by debugfs_create_dir()
which sets the mode bits to rwxr-xr-x. Maybe change that to use the
parent's directory mode bits instead?
Link: http://lkml.kernel.org/r/20180419105556.86664-1-tmricht@linux.ibm.com
Fixes:
ad67b74d2469 ("printk: hash addresses printed with %p")
Cc: stable@vger.kernel.org
Cc: <stable@vger.kernel.org> # v4.15+
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: acme@kernel.org
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Ravi Bangoria [Fri, 20 Apr 2018 15:07:58 +0000 (20:37 +0530)]
tracing: Fix kernel crash while using empty filter with perf
Kernel is crashing when user tries to record 'ftrace:function' event
with empty filter:
# perf record -e ftrace:function --filter="" ls
# dmesg
BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
Oops: 0000 [#1] SMP PTI
...
RIP: 0010:ftrace_profile_set_filter+0x14b/0x2d0
RSP: 0018:
ffffa4a7c0da7d20 EFLAGS:
00010246
RAX:
ffffa4a7c0da7d64 RBX:
0000000000000000 RCX:
0000000000000006
RDX:
0000000000000000 RSI:
0000000000000092 RDI:
ffff8c48ffc968f0
...
Call Trace:
_perf_ioctl+0x54a/0x6b0
? rcu_all_qs+0x5/0x30
...
After patch:
# perf record -e ftrace:function --filter="" ls
failed to set filter "" on event ftrace:function with 22 (Invalid argument)
Also, if user tries to echo "" > filter, it used to throw an error.
This behavior got changed by commit
80765597bc58 ("tracing: Rewrite
filter logic to be simpler and faster"). This patch restores the
behavior as a side effect:
Before patch:
# echo "" > filter
#
After patch:
# echo "" > filter
bash: echo: write error: Invalid argument
#
Link: http://lkml.kernel.org/r/20180420150758.19787-1-ravi.bangoria@linux.ibm.com
Fixes:
80765597bc58 ("tracing: Rewrite filter logic to be simpler and faster")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Tue, 17 Apr 2018 21:41:28 +0000 (17:41 -0400)]
tracing/x86: Update syscall trace events to handle new prefixed syscall func names
Arnaldo noticed that the latest kernel is missing the syscall event system
directory in x86. I bisected it down to
d5a00528b58c ("syscalls/core,
syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()").
The system call trace events are special, as there is only one trace event
for all system calls (the raw_syscalls). But a macro that wraps the system
calls creates meta data for them that copies the name to find the system
call that maps to the system call table (the number). At boot up, it does a
kallsyms lookup of the system call table to find the function that maps to
the meta data of the system call. If it does not find a function, then that
system call is ignored.
Because the x86 system calls had "__x64_", or "__ia32_" prefixed to the
"sys" for the names, they do not match the default compare algorithm. As
this was a problem for power pc, the algorithm can be overwritten by the
architecture. The solution is to have x86 have its own algorithm to do the
compare and this brings back the system call trace events.
Link: http://lkml.kernel.org/r/20180417174128.0f3457f0@gandalf.local.home
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Fixes:
d5a00528b58c ("syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Linus Walleij [Fri, 20 Apr 2018 08:29:51 +0000 (10:29 +0200)]
block: mq: Add some minor doc for core structs
As it came up in discussion on the mailing list that the semantic
meaning of 'blk_mq_ctx' and 'blk_mq_hw_ctx' isn't completely
obvious to everyone, let's add some minimal kerneldoc for a
starter.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 20 Apr 2018 03:07:14 +0000 (21:07 -0600)]
bcache: mark Coly Li as bcache maintainer
Since Michael had to step back, Coly has agreed to be the new
maintainer. Mark him as such.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Michael Lyle [Thu, 19 Apr 2018 17:59:31 +0000 (10:59 -0700)]
MAINTAINERS: Remove me as maintainer of bcache
Too much to do with other projects. I've enjoyed working with everyone
here, and hope to occasionally contribute on bcache.
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Guenter Roeck [Tue, 24 Apr 2018 15:59:45 +0000 (08:59 -0700)]
hwmon: (k10temp) Add support for AMD Ryzen w/ Vega graphics
Enable k10temp for AMD Ryzen APUs w/ Vega Mobile Gfx.
Based on patch from René Rebe <rene@exactcode.de>. Dropped temperature
offsets since those are not supposed to apply for the affected CPUs.
Cc: stable@vger.kernel.org # v4.16+
Cc: René Rebe <rene@exactcode.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Guenter Roeck [Tue, 24 Apr 2018 13:55:55 +0000 (06:55 -0700)]
hwmon: (k10temp) Add temperature offset for Ryzen 2700X
Ryzen 2700X has a temperature offset of 10 degrees C. If bit 19 of the
Temperature Control register is set, there is an additional offset of
49 degrees C. Take this into account as well.
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Theodore Ts'o [Wed, 25 Apr 2018 05:12:32 +0000 (01:12 -0400)]
random: rate limit unseeded randomness warnings
On systems without sufficient boot randomness, no point spamming dmesg.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Linus Torvalds [Wed, 25 Apr 2018 00:58:51 +0000 (17:58 -0700)]
Merge branch 'userns-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull userns bug fix from Eric Biederman:
"Just a small fix to properly set the return code on error"
* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
commoncap: Handle memory allocation failure.
Linus Torvalds [Tue, 24 Apr 2018 21:16:40 +0000 (14:16 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix rtnl deadlock in ipvs, from Julian Anastasov.
2) s390 qeth fixes from Julian Wiedmann (control IO completion stalls,
bad MAC address update sequence, request side races on command IO
timeouts).
3) Handle seq_file overflow properly in l2tp, from Guillaume Nault.
4) Fix VLAN priority mappings in cpsw driver, from Ivan Khoronzhuk.
5) Packet scheduler ife action fixes (malformed TLV lengths, etc.) from
Alexander Aring.
6) Fix out of bounds access in tcp md5 option parser, from Jann Horn.
7) Missing netlink attribute policies in rtm_ipv6_policy table, from
Eric Dumazet.
8) Missing socket address length checks in l2tp and pppoe connect, from
Guillaume Nault.
9) Fix netconsole over team and bonding, from Xin Long.
10) Fix race with AF_PACKET socket state bitfields, from Willem de
Bruijn.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits)
ice: Fix insufficient memory issue in ice_aq_manage_mac_read
sfc: ARFS filter IDs
net: ethtool: Add missing kernel doc for FEC parameters
packet: fix bitfield update race
ice: Do not check INTEVENT bit for OICR interrupts
ice: Fix incorrect comment for action type
ice: Fix initialization for num_nodes_added
igb: Fix the transmission mode of queue 0 for Qav mode
ixgbevf: ensure xdp_ring resources are free'd on error exit
team: fix netconsole setup over team
amd-xgbe: Only use the SFP supported transceiver signals
amd-xgbe: Improve KR auto-negotiation and training
amd-xgbe: Add pre/post auto-negotiation phy hooks
pppoe: check sockaddr length in pppoe_connect()
l2tp: check sockaddr length in pppol2tp_connect()
net: phy: marvell: clear wol event before setting it
ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
tcp: don't read out-of-bounds opsize
ibmvnic: Clean actual number of RX or TX pools
...
Hans de Goede [Tue, 17 Apr 2018 16:23:50 +0000 (18:23 +0200)]
ACPI / video: Only default only_lcd to true on Win8-ready _desktops_
Commit
5928c281524f (ACPI / video: Default lcd_only to true on Win8-ready
and newer machines) made only_lcd default to true on all machines where
acpi_osi_is_win8() returns true, including laptops.
The purpose of this is to avoid the bogus / non-working acpi backlight
interface which many newer BIOS-es define on desktop machines.
But this is causing a regression on some laptops, specifically on the
Dell XPS 13 2013 model, which does not have the LCD flag set for its
fully functional ACPI backlight interface.
Rather then DMI quirking our way out of this, this commits changes the
logic for setting only_lcd to true, to only do this on machines with
a desktop (or server) dmi chassis-type.
Note that we cannot simply only check the chassis-type and not register
the backlight interface based on that as there are some laptops and
tablets which have their chassis-type set to "3" aka desktop. Hopefully
the combination of checking the LCD flag, but only on devices with
a desktop(ish) chassis-type will avoid the needs for DMI quirks for this,
or at least limit the amount of DMI quirks which we need to a minimum.
Fixes:
5928c281524f (ACPI / video: Default lcd_only to true on Win8-ready and newer machines)
Reported-and-tested-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
David S. Miller [Tue, 24 Apr 2018 20:17:59 +0000 (16:17 -0400)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2018-04-24
This series contains fixes to ixgbevf, igb and ice drivers.
Colin Ian King fixes the return value on error for the new XDP support
that went into ixgbevf for 4.17.
Vinicius provides a fix for queue 0 for igb, which was not receiving all
the credits it needed when QAV mode was enabled.
Anirudh provides several fixes for the new ice driver, starting with
properly initializing num_nodes_added to zero. Fixed up a code comment
to better reflect what is really going on in the code. Fixed how to
detect if an OICR interrupt has occurred to a more reliable method.
Md Fahad fixes the ice driver to allocate the right amount of memory
when reading and storing the devices MAC addresses. The device can have
up to 2 MAC addresses (LAN and WoL), while WoL is currently not
supported, we need to ensure it can be properly handled when support is
added.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Md Fahad Iqbal Polash [Mon, 16 Apr 2018 17:07:03 +0000 (10:07 -0700)]
ice: Fix insufficient memory issue in ice_aq_manage_mac_read
For the MAC read operation, the device can return up to two (LAN and WoL)
MAC addresses. Without access to adequate memory, the device will return
an error. Fixed this by allocating the right amount of memory. Also, logic
to detect and copy the LAN MAC address into the port_info structure has
been added. Note that the WoL MAC address is ignored currently as the WoL
feature isn't supported yet.
Fixes:
dc49c7723676 ("ice: Get MAC/PHY/link info and scheduler topology")
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Michael S. Tsirkin [Fri, 13 Apr 2018 13:37:04 +0000 (16:37 +0300)]
virtio_balloon: add array of stat names
Jason Wang points out that it's very hard for users to build an array of
stat names. The naive thing is to use VIRTIO_BALLOON_S_NR but that
breaks if we add more stats - as done e.g. recently by commit
6c64fe7f2
("virtio_balloon: export hugetlb page allocation counts").
Let's add an array of reasonably readable names.
Fixes:
6c64fe7f2 ("virtio_balloon: export hugetlb page allocation counts")
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jonathan Helman <jonathan.helman@oracle.com>
Aurelien Jarno [Wed, 21 Mar 2018 21:26:31 +0000 (22:26 +0100)]
RISC-V: build vdso-dummy.o with -no-pie
Debian toolcahin defaults to PIE, and I guess that will also be the case
of most distributions. This causes the following build failure:
AS arch/riscv/kernel/vdso/getcpu.o
AS arch/riscv/kernel/vdso/flush_icache.o
VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg
OBJCOPY arch/riscv/kernel/vdso/vdso.so
AS arch/riscv/kernel/vdso/vdso.o
VDSOLD arch/riscv/kernel/vdso/vdso-dummy.o
LD arch/riscv/kernel/vdso/vdso-syms.o
riscv64-linux-gnu-ld: attempted static link of dynamic object `arch/riscv/kernel/vdso/vdso-dummy.o'
make[2]: *** [arch/riscv/kernel/vdso/Makefile:43: arch/riscv/kernel/vdso/vdso-syms.o] Error 1
make[1]: *** [scripts/Makefile.build:575: arch/riscv/kernel/vdso] Error 2
make: *** [Makefile:1018: arch/riscv/kernel] Error 2
While the root Makefile correctly passes "-fno-PIE" to build individual
object files, the RISC-V kernel also builds vdso-dummy.o as an
executable, which is therefore linked as PIE. Fix that by updating this
specific link rule to also include "-no-pie".
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Christoph Hellwig [Mon, 16 Apr 2018 06:57:52 +0000 (08:57 +0200)]
riscv: there is no <asm/handle_irq.h>
So don't list it as generic-y.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Christoph Hellwig [Mon, 16 Apr 2018 12:53:51 +0000 (14:53 +0200)]
riscv: select DMA_DIRECT_OPS instead of redefining it
DMA_DIRECT_OPS is defined in lib/Kconfig, so don't duplicate it in
arch/riscv/Kconfig.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Edward Cree [Tue, 24 Apr 2018 16:09:30 +0000 (17:09 +0100)]
sfc: ARFS filter IDs
Associate an arbitrary ID with each ARFS filter, allowing to properly query
for expiry. The association is maintained in a hash table, which is
protected by a spinlock.
v3: fix build warnings when CONFIG_RFS_ACCEL is disabled (thanks lkp-robot).
v2: fixed uninitialised variable (thanks davem and lkp-robot).
Fixes:
3af0f34290f6 ("sfc: replace asynchronous filter operations")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 23 Apr 2018 22:51:38 +0000 (15:51 -0700)]
net: ethtool: Add missing kernel doc for FEC parameters
While adding support for ethtool::get_fecparam and set_fecparam, kernel
doc for these functions was missed, add those.
Fixes:
1a5f3da20bd9 ("net: ethtool: add support for forward error correction modes")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Mon, 23 Apr 2018 21:37:03 +0000 (17:37 -0400)]
packet: fix bitfield update race
Updates to the bitfields in struct packet_sock are not atomic.
Serialize these read-modify-write cycles.
Move po->running into a separate variable. Its writes are protected by
po->bind_lock (except for one startup case at packet_create). Also
replace a textual precondition warning with lockdep annotation.
All others are set only in packet_setsockopt. Serialize these
updates by holding the socket lock. Analogous to other field updates,
also hold the lock when testing whether a ring is active (pg_vec).
Fixes:
8dc419447415 ("[PACKET]: Add optional checksum computation for recvmsg")
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Reported-by: Byoungyoung Lee <byoungyoung@purdue.edu>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Shelton [Wed, 11 Apr 2018 19:21:33 +0000 (12:21 -0700)]
ice: Do not check INTEVENT bit for OICR interrupts
According to the hardware spec, checking the INTEVENT bit isn't a
reliable way to detect if an OICR interrupt has occurred. This is
because this bit can be cleared by the hardware/firmware before the
interrupt service routine has run. So instead, just check for OICR
events every time.
Fixes:
940b61af02f4 ("ice: Initialize PF and setup miscellaneous interrupt")
Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Theodore Ts'o [Mon, 23 Apr 2018 22:51:28 +0000 (18:51 -0400)]
random: fix possible sleeping allocation from irq context
We can do a sleeping allocation from an irq context when CONFIG_NUMA
is enabled. Fix this by initializing the NUMA crng instances in a
workqueue.
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot+9de458f6a5e713ee8c1a@syzkaller.appspotmail.com
Fixes:
8ef35c866f8862df ("random: set up the NUMA crng instances...")
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Anirudh Venkataramanan [Wed, 11 Apr 2018 17:41:47 +0000 (10:41 -0700)]
ice: Fix incorrect comment for action type
Action type 5 defines large action generic values. Fix comment to
reflect that better.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Tue, 10 Apr 2018 17:49:49 +0000 (10:49 -0700)]
ice: Fix initialization for num_nodes_added
ice_sched_add_nodes_to_layer is used recursively, and so we start
with num_nodes_added being 0. This way, in case of an error or if
num_nodes is NULL, the function just returns 0 to indicate that no
nodes were added.
Fixes:
5513b920a4f7 ("ice: Update Tx scheduler tree for VSI multi-Tx queue support")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Vinicius Costa Gomes [Sat, 31 Mar 2018 00:06:52 +0000 (17:06 -0700)]
igb: Fix the transmission mode of queue 0 for Qav mode
When Qav mode is enabled, queue 0 should be kept on Stream Reservation
mode. From the i210 datasheet, section 8.12.19:
"Note: Queue0 QueueMode must be set to 1b when TransmitMode is set to
Qav." ("QueueMode 1b" represents the Stream Reservation mode)
The solution is to give queue 0 the all the credits it might need, so
it has priority over queue 1.
A situation where this can happen is when cbs is "installed" only on
queue 1, leaving queue 0 alone. For example:
$ tc qdisc replace dev enp2s0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0
$ tc qdisc replace dev enp2s0 parent 100:2 cbs locredit -1470 \
hicredit 30 sendslope -980000 idleslope 20000 offload 1
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Colin Ian King [Tue, 27 Mar 2018 14:21:48 +0000 (15:21 +0100)]
ixgbevf: ensure xdp_ring resources are free'd on error exit
The current error handling for failed resource setup for xdp_ring
data is a break out of the loop and returning 0 indicated everything
was OK, when in fact it is not. Fix this by exiting via the
error exit label err_setup_tx that will clean up the resources
correctly and return and error status.
Detected by CoverityScan, CID#1466879 ("Logically dead code")
Fixes:
21092e9ce8b1 ("ixgbevf: Add support for XDP_TX action")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Xin Long [Tue, 24 Apr 2018 06:33:37 +0000 (14:33 +0800)]
team: fix netconsole setup over team
The same fix in Commit
dbe173079ab5 ("bridge: fix netconsole
setup over bridge") is also needed for team driver.
While at it, remove the unnecessary parameter *team from
team_port_enable_netpoll().
v1->v2:
- fix it in a better way, as does bridge.
Fixes:
0fb52a27a04a ("team: cleanup netpoll clode")
Reported-by: João Avelino Bellomo Filho <jbellomo@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ard Biesheuvel [Mon, 23 Apr 2018 09:16:56 +0000 (11:16 +0200)]
ACPI / button: make module loadable when booted in non-ACPI mode
Modules such as nouveau.ko and i915.ko have a link time dependency on
acpi_lid_open(), and due to its use of acpi_bus_register_driver(),
the button.ko module that provides it is only loadable when booted in
ACPI mode. However, the ACPI button driver can be built into the core
kernel as well, in which case the dependency can always be satisfied,
and the dependent modules can be loaded regardless of whether the
system was booted in ACPI mode or not.
So let's fix this asymmetry by making the ACPI button driver loadable
as a module even if not booted in ACPI mode, so it can provide the
acpi_lid_open() symbol in the same way as when built into the kernel.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[ rjw: Minor adjustments of comments, whitespace and names. ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Markus Mayer [Wed, 18 Apr 2018 15:56:42 +0000 (08:56 -0700)]
cpufreq: brcmstb-avs-cpufreq: remove development debug support
This debug code was helpful while developing the driver, but it isn't
being used for anything anymore.
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Mika Westerberg [Mon, 23 Apr 2018 11:16:03 +0000 (14:16 +0300)]
ACPI / watchdog: Prefer iTCO_wdt on Lenovo Z50-70
WDAT table on Lenovo Z50-70 is using RTC SRAM (ports 0x70 and 0x71) to
store state of the timer. This conflicts with Linux RTC driver
(rtc-cmos.c) who fails to reserve those ports for itself preventing RTC
from functioning. In addition the WDAT table seems not to be fully
functional because it does not reset the system when the watchdog times
out.
On this system iTCO_wdt works just fine so we simply prefer to use it
instead of WDAT. This makes RTC working again and also results working
watchdog via iTCO_wdt.
Reported-by: Peter Milley <pbmilley@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199033
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
David S. Miller [Tue, 24 Apr 2018 01:24:23 +0000 (21:24 -0400)]
Merge branch 'amd-xgbe-fixes'
aTom Lendacky says:
====================
amd-xgbe: AMD XGBE driver fixes 2018-04-23
This patch series addresses some issues in the AMD XGBE driver.
The following fixes are included in this driver update series:
- Improve KR auto-negotiation and training (2 patches)
- Add pre and post auto-negotiation hooks
- Use the pre and post auto-negotiation hooks to disable CDR tracking
during auto-negotiation page exchange in KR mode
- Check for SFP tranceiver signal support and only use the signal if the
SFP indicates that it is supported
This patch series is based on net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Lendacky [Mon, 23 Apr 2018 16:43:34 +0000 (11:43 -0500)]
amd-xgbe: Only use the SFP supported transceiver signals
The SFP eeprom indicates the transceiver signals (Rx LOS, Tx Fault, etc.)
that it supports. Update the driver to include checking the eeprom data
when deciding whether to use a transceiver signal.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Lendacky [Mon, 23 Apr 2018 16:43:17 +0000 (11:43 -0500)]
amd-xgbe: Improve KR auto-negotiation and training
Update xgbe-phy-v2.c to make use of the auto-negotiation (AN) phy hooks
to improve the ability to successfully complete Clause 73 AN when running
at 10gbps. Hardware can sometimes have issues with CDR lock when the
AN DME page exchange is being performed.
The AN and KR training hooks are used as follows:
- The pre AN hook is used to disable CDR tracking in the PHY so that the
DME page exchange can be successfully and consistently completed.
- The post KR training hook is used to re-enable the CDR tracking so that
KR training can successfully complete.
- The post AN hook is used to check for an unsuccessful AN which will
increase a CDR tracking enablement delay (up to a maximum value).
Add two debugfs entries to allow control over use of the CDR tracking
workaround. The debugfs entries allow the CDR tracking workaround to
be disabled and determine whether to re-enable CDR tracking before or
after link training has been initiated.
Also, with these changes the receiver reset cycle that is performed during
the link status check can be performed less often.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Lendacky [Mon, 23 Apr 2018 16:43:08 +0000 (11:43 -0500)]
amd-xgbe: Add pre/post auto-negotiation phy hooks
Add hooks to the driver auto-negotiation (AN) flow to allow the different
phy implementations to perform any steps necessary to improve AN.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 23 Apr 2018 14:38:27 +0000 (16:38 +0200)]
pppoe: check sockaddr length in pppoe_connect()
We must validate sockaddr_len, otherwise userspace can pass fewer data
than we expect and we end up accessing invalid data.
Fixes:
224cf5ad14c0 ("ppp: Move the PPP drivers")
Reported-by: syzbot+4f03bdf92fdf9ef5ddab@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 23 Apr 2018 14:15:14 +0000 (16:15 +0200)]
l2tp: check sockaddr length in pppol2tp_connect()
Check sockaddr_len before dereferencing sp->sa_protocol, to ensure that
it actually points to valid data.
Fixes:
fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Reported-by: syzbot+a70ac890b23b1bf29f5c@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingju Hou [Mon, 23 Apr 2018 07:22:49 +0000 (15:22 +0800)]
net: phy: marvell: clear wol event before setting it
If WOL event happened once, the LED[2] interrupt pin will not be
cleared unless we read the CSISR register. If interrupts are in use,
the normal interrupt handling will clear the WOL event. Let's clear the
WOL event before enabling it if !phy_interrupt_is_valid().
Signed-off-by: Jingju Hou <Jingju.Hou@synaptics.com>
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 23 Apr 2018 20:22:24 +0000 (16:22 -0400)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) Fix SIP conntrack with phones sending session descriptions for different
media types but same port numbers, from Florian Westphal.
2) Fix incorrect rtnl_lock mutex logic from IPVS sync thread, from Julian
Anastasov.
3) Skip compat array allocation in ebtables if there is no entries, also
from Florian.
4) Do not lose left/right bits when shifting marks from xt_connmark, from
Jack Ma.
5) Silence false positive memleak in conntrack extensions, from Cong Wang.
6) Fix CONFIG_NF_REJECT_IPV6=m link problems, from Arnd Bergmann.
7) Cannot kfree rule that is already in list in nf_tables, switch order
so this error handling is not required, from Florian Westphal.
8) Release set name in error path, from Florian.
9) include kmemleak.h in nf_conntrack_extend.c, from Stepheh Rothwell.
10) NAT chain and extensions depend on NF_TABLES.
11) Out of bound access when renaming chains, from Taehee Yoo.
12) Incorrect casting in xt_connmark leads to wrong bitshifting.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 23 Apr 2018 01:29:23 +0000 (18:29 -0700)]
ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
KMSAN reported use of uninit-value that I tracked to lack
of proper size check on RTA_TABLE attribute.
I also believe RTA_PREFSRC lacks a similar check.
Fixes:
86872cb57925 ("[IPv6] route: FIB6 configuration using struct fib6_config")
Fixes:
c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sun, 22 Apr 2018 11:11:50 +0000 (19:11 +0800)]
bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
After Commit
8a8efa22f51b ("bonding: sync netpoll code with bridge"), it
would set slave_dev npinfo in slave_enable_netpoll when enslaving a dev
if bond->dev->npinfo was set.
However now slave_dev npinfo is set with bond->dev->npinfo before calling
slave_enable_netpoll. With slave_dev npinfo set, __netpoll_setup called
in slave_enable_netpoll will not call slave dev's .ndo_netpoll_setup().
It causes that the lower dev of this slave dev can't set its npinfo.
One way to reproduce it:
# modprobe bonding
# brctl addbr br0
# brctl addif br0 eth1
# ifconfig bond0 192.168.122.1/24 up
# ifenslave bond0 eth2
# systemctl restart netconsole
# ifenslave bond0 br0
# ifconfig eth2 down
# systemctl restart netconsole
The netpoll won't really work.
This patch is to remove that slave_dev npinfo setting in bond_enslave().
Fixes:
8a8efa22f51b ("bonding: sync netpoll code with bridge")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jann Horn [Fri, 20 Apr 2018 13:57:30 +0000 (15:57 +0200)]
tcp: don't read out-of-bounds opsize
The old code reads the "opsize" variable from out-of-bounds memory (first
byte behind the segment) if a broken TCP segment ends directly after an
opcode that is neither EOL nor NOP.
The result of the read isn't used for anything, so the worst thing that
could theoretically happen is a pagefault; and since the physmap is usually
mostly contiguous, even that seems pretty unlikely.
The following C reproducer triggers the uninitialized read - however, you
can't actually see anything happen unless you put something like a
pr_warn() in tcp_parse_md5sig_option() to print the opsize.
====================================
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <stdlib.h>
#include <errno.h>
#include <stdarg.h>
#include <net/if.h>
#include <linux/if.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/in.h>
#include <linux/if_tun.h>
#include <err.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <assert.h>
void systemf(const char *command, ...) {
char *full_command;
va_list ap;
va_start(ap, command);
if (vasprintf(&full_command, command, ap) == -1)
err(1, "vasprintf");
va_end(ap);
printf("systemf: <<<%s>>>\n", full_command);
system(full_command);
}
char *devname;
int tun_alloc(char *name) {
int fd = open("/dev/net/tun", O_RDWR);
if (fd == -1)
err(1, "open tun dev");
static struct ifreq req = { .ifr_flags = IFF_TUN|IFF_NO_PI };
strcpy(req.ifr_name, name);
if (ioctl(fd, TUNSETIFF, &req))
err(1, "TUNSETIFF");
devname = req.ifr_name;
printf("device name: %s\n", devname);
return fd;
}
#define IPADDR(a,b,c,d) (((a)<<0)+((b)<<8)+((c)<<16)+((d)<<24))
void sum_accumulate(unsigned int *sum, void *data, int len) {
assert((len&2)==0);
for (int i=0; i<len/2; i++) {
*sum += ntohs(((unsigned short *)data)[i]);
}
}
unsigned short sum_final(unsigned int sum) {
sum = (sum >> 16) + (sum & 0xffff);
sum = (sum >> 16) + (sum & 0xffff);
return htons(~sum);
}
void fix_ip_sum(struct iphdr *ip) {
unsigned int sum = 0;
sum_accumulate(&sum, ip, sizeof(*ip));
ip->check = sum_final(sum);
}
void fix_tcp_sum(struct iphdr *ip, struct tcphdr *tcp) {
unsigned int sum = 0;
struct {
unsigned int saddr;
unsigned int daddr;
unsigned char pad;
unsigned char proto_num;
unsigned short tcp_len;
} fakehdr = {
.saddr = ip->saddr,
.daddr = ip->daddr,
.proto_num = ip->protocol,
.tcp_len = htons(ntohs(ip->tot_len) - ip->ihl*4)
};
sum_accumulate(&sum, &fakehdr, sizeof(fakehdr));
sum_accumulate(&sum, tcp, tcp->doff*4);
tcp->check = sum_final(sum);
}
int main(void) {
int tun_fd = tun_alloc("inject_dev%d");
systemf("ip link set %s up", devname);
systemf("ip addr add 192.168.42.1/24 dev %s", devname);
struct {
struct iphdr ip;
struct tcphdr tcp;
unsigned char tcp_opts[20];
} __attribute__((packed)) syn_packet = {
.ip = {
.ihl = sizeof(struct iphdr)/4,
.version = 4,
.tot_len = htons(sizeof(syn_packet)),
.ttl = 30,
.protocol = IPPROTO_TCP,
/* FIXUP check */
.saddr = IPADDR(192,168,42,2),
.daddr = IPADDR(192,168,42,1)
},
.tcp = {
.source = htons(1),
.dest = htons(1337),
.seq = 0x12345678,
.doff = (sizeof(syn_packet.tcp)+sizeof(syn_packet.tcp_opts))/4,
.syn = 1,
.window = htons(64),
.check = 0 /*FIXUP*/
},
.tcp_opts = {
/* INVALID: trailing MD5SIG opcode after NOPs */
1, 1, 1, 1, 1,
1, 1, 1, 1, 1,
1, 1, 1, 1, 1,
1, 1, 1, 1, 19
}
};
fix_ip_sum(&syn_packet.ip);
fix_tcp_sum(&syn_packet.ip, &syn_packet.tcp);
while (1) {
int write_res = write(tun_fd, &syn_packet, sizeof(syn_packet));
if (write_res != sizeof(syn_packet))
err(1, "packet write failed");
}
}
====================================
Fixes:
cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Mon, 23 Apr 2018 01:16:54 +0000 (18:16 -0700)]
hwmon: (nct6683) Enable EC access if disabled at boot
On Asrock Z370M Pro4, it was observed that EC access was disabled after
initially booting the system. As a result, the driver failed to load
with
nct6683: EC is disabled
After a suspend/resume cycle, the driver loaded correctly.
nct6683: Found NCT6683D or compatible chip at 0x2e:0xa20
nct6683 nct6683.2592: NCT6683D EC firmware version 1.0 build 07/18/16
Enable EC access after identifying the chip if disabled to fix the problem.
Warn the user that the data it reports may be unusable, similar to other
drivers for chips from Nuvoton.
Fixes:
41082d66bfd6f ("hwmon: Driver for NCT6683D")
Reported-by: Jonathan Sims <jonathan.625266@earthlink.net>
Tested-by: Jonathan Sims <jonathan.625266@earthlink.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Jacopo Mondi [Fri, 13 Apr 2018 17:25:37 +0000 (19:25 +0200)]
dma-mapping: postpone cpu addr translation on mmap
Postpone calling virt_to_page() translation on memory locations not
guaranteed to be backed by a struct page. Try first to map memory from
the device coherent memory pool, then perform translation if that fails.
On some architectures, specifically SH when configured with the SPARSEMEM
memory model, assuming a struct page is always assigned to a memory
address lead to unexpected hangs during the virtual to page address
translation. This patch fixes that specific issue but applies in the
general case too.
Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Robin Murphy [Mon, 9 Apr 2018 17:59:14 +0000 (18:59 +0100)]
dma-coherent: clarify dma_mmap_from_dev_coherent documentation
The use of "correctly mapped" here is misleading, since it can give the
wrong expectation in the case that the memory *should* have been mapped
from the per-device pool, but doing so failed for other reasons.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Takashi Iwai [Sun, 15 Apr 2018 09:08:07 +0000 (11:08 +0200)]
dma-direct: don't retry allocation for no-op GFP_DMA
When an allocation with lower dma_coherent mask fails, dma_direct_alloc()
retries the allocation with GFP_DMA. But, this is useless for
architectures that hav no ZONE_DMA.
Fix it by adding the check of CONFIG_ZONE_DMA before retrying the
allocation.
Fixes:
95f183916d4b ("dma-direct: retry allocations using GFP_DMA for small masks")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Mika Westerberg [Fri, 20 Apr 2018 12:22:02 +0000 (15:22 +0300)]
PCI / PM: Do not clear state_saved in pci_pm_freeze() when smart suspend is set
If a driver uses DPM_FLAG_SMART_SUSPEND and the device is already
runtime suspended when hibernate is started PCI core skips runtime
resuming the device but still clears pci_dev->state_saved. After the
hibernation image is written pci_pm_thaw_noirq() makes sure subsequent
thaw phases for the device are also skipped leaving it runtime suspended
with pci_dev->state_saved == false.
When the device is eventually runtime resumed pci_pm_runtime_resume()
restores config space by calling pci_restore_standard_config(), however
because pci_dev->state_saved == false pci_restore_state() never actually
restores the config space leaving the device in a state that is not what
the driver might expect.
For example here is what happens for intel-lpss I2C devices once the
hibernation snapshot is taken:
intel-lpss 0000:00:15.0: power state changed by ACPI to D0
intel-lpss 0000:00:1e.0: power state changed by ACPI to D3cold
video LNXVIDEO:00: Restoring backlight state
PM: hibernation exit
i2c_designware i2c_designware.1: Unknown Synopsys component type: 0xffffffff
i2c_designware i2c_designware.0: Unknown Synopsys component type: 0xffffffff
i2c_designware i2c_designware.1: timeout in disabling adapter
i2c_designware i2c_designware.0: timeout in disabling adapter
Since PCI config space is not restored the device is still in D3hot
making MMIO register reads return 0xffffffff.
Fix this by clearing pci_dev->state_saved only if we actually end up
runtime resuming the device.
Fixes:
c4b65157aeef (PCI / PM: Take SMART_SUSPEND driver flag into account)
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>