review.tizen.org Git - platform/kernel/linux-starfive.git/log

x86/tsc: Force TSC_ADJUST register to value >= zero

Roland reported that his DELL T5810 sports a value add BIOS which
completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
random negative TSC_ADJUST values, different on all CPUs. That renders the
TSC useless because the sycnchronization check fails.

Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
the TSCs he needs to disable the TSC deadline timer, otherwise the machine
just stops booting.

Deeper investigation unearthed that the TSC deadline timer is sensitive to
the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
interrupt storm caused by the TSC deadline timer.

This does not make any sense and it's hard to imagine what kind of hardware
wreckage is behind that misfeature, but it's reliably reproducible on other
systems which have TSC_ADJUST and TSC deadline timer.

While it would be understandable that a big enough negative value which
moves the resulting TSC readout into the negative space could have the
described effect, this happens even with a adjust value of -1, which keeps
the TSC readout definitely in the positive space. The compare register for
the TSC deadline timer is set to a positive value larger than the TSC, but
despite not having reached the deadline the interrupt is raised
immediately. If this happens on the boot CPU, then the machine dies
silently because this setup happens before the NMI watchdog is armed.

Further experiments showed that any other adjustment of TSC_ADJUST works as
expected as long as it stays in the positive range. The direction of the
adjustment has no influence either. See the lkml link for further analysis.

Yet another proof for the theory that timers are designed by janitors and
the underlying (obviously undocumented) mechanisms which allow BIOSes to
wreckage them are considered a feature. Well done Intel - NOT!

To address this wreckage add the following sanity measures:

- If the TSC_ADJUST value on the boot cpu is not 0, set it to 0

- If the TSC_ADJUST value on any cpu is negative, set it to 0

- Prevent the cross package synchronization mechanism from setting negative
TSC_ADJUST values.

Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Allen Hung <allen_hung@dell.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Validate TSC_ADJUST after resume

Some 'feature' BIOSes fiddle with the TSC_ADJUST register during
suspend/resume which renders the TSC unusable.

Add sanity checks into the resume path and restore the
original value if it was adjusted.

Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Allen Hung <allen_hung@dell.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Validate cpumask pointer before accessing it

0-day testing encountered a NULL pointer dereference in a cpumask access
from tsc_store_and_check_tsc_adjust().

This happens when the function is called on the boot CPU and the topology
masks are not yet available due to CPUMASK_OFFSTACK=y.

Add a NULL pointer check for the mask pointer. If NULL it's safe to assume
that the CPU is the boot CPU and the first one in the package.

Fixes: 8b223bc7abe0 ("x86/tsc: Store and check TSC ADJUST MSR")
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Fix broken CONFIG_X86_TSC=n build

Add the missing return statement to the inline stub
tsc_store_and_check_tsc_adjust() and add the other stubs to make a
SMP=y,TSC=n build happy.

While at it, remove the unused variable from the UP variant of
tsc_store_and_check_tsc_adjust().

Fixes: commit ba75fb646931 ("x86/tsc: Sync test only for the first cpu in a package")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Try to adjust TSC if sync test fails

If the first CPU of a package comes online, it is necessary to test whether
the TSC is in sync with a CPU on some other package. When a deviation is
observed (time going backwards between the two CPUs) the TSC is marked
unstable, which is a problem on large machines as they have to fall back to
the HPET clocksource, which is insanely slow.

It has been attempted to compensate the TSC by adding the offset to the TSC
and writing it back some time ago, but this never was merged because it did
not turn out to be stable, especially not on older systems.

Modern systems have become more stable in that regard and the TSC_ADJUST
MSR allows us to compensate for the time deviation in a sane way. If it's
available allow up to three synchronization runs and if a time warp is
detected the starting CPU can compensate the time warp via the TSC_ADJUST
MSR and retry. If the third run still shows a deviation or when random time
warps are detected the test terminally fails.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134018.048237517@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Prepare warp test for TSC adjustment

To allow TSC compensation cross nodes its necessary to know in which
direction the TSC warp was observed. Return the maximum observed value on
the calling CPU so the caller can determine the direction later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134017.970859287@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Move sync cleanup to a safe place

Cleaning up the stop marker on the control CPU is wrong when we want to add
retry support. Move the cleanup to the starting CPU.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134017.892095627@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Sync test only for the first cpu in a package

If the TSC_ADJUST MSR is available all CPUs in a package are forced to the
same value. So TSCs cannot be out of sync when the first CPU in the package
was in sync.

That allows to skip the sync test for all CPUs except the first starting
CPU in a package.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134017.809901363@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Verify TSC_ADJUST from idle

When entering idle, it's a good oportunity to verify that the TSC_ADJUST
MSR has not been tampered with (BIOS hiding SMM cycles). If tampering is
detected, emit a warning and restore it to the previous value.

This is especially important for machines, which mark the TSC reliable
because there is no watchdog clocksource available (SoCs).

This is not sufficient for HPC (NOHZ_FULL) situations where a CPU never
goes idle, but adding a timer to do the check periodically is not an option
either. On a machine, which has this issue, the check triggeres right
during boot, so there is a decent chance that the sysadmin will notice.

Rate limit the check to once per second and warn only once per cpu.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134017.732180441@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Store and check TSC ADJUST MSR

The TSC_ADJUST MSR shows whether the TSC has been modified. This is helpful
in a two aspects:

1) It allows to detect BIOS wreckage, where SMM code tries to 'hide' the
   cycles spent by storing the TSC value at SMM entry and restoring it at
   SMM exit. On affected machines the TSCs run slowly out of sync up to the
   point where the clocksource watchdog (if available) detects it.

   The TSC_ADJUST MSR allows to detect the TSC modification before that and
   eventually restore it. This is also important for SoCs which have no
   watchdog clocksource and therefore TSC wreckage cannot be detected and
   acted upon.

2) All threads in a package are required to have the same TSC_ADJUST
   value. Broken BIOSes break that and as a result the TSC synchronization
   check fails.

   The TSC_ADJUST MSR allows to detect the deviation when a CPU comes
   online. If detected set it to the value of an already online CPU in the
   same package. This also allows to reduce the number of sync tests
   because with that in place the test is only required for the first CPU
   in a package.

   In principle all CPUs in a system should have the same TSC_ADJUST value
   even across packages, but with physical CPU hotplug this assumption is
   not true because the TSC starts with power on, so physical hotplug has
   to do some trickery to bring the TSC into sync with already running
   packages, which requires to use an TSC_ADJUST value different from CPUs
   which got powered earlier.

   A final enhancement is the opportunity to compensate for unsynced TSCs
   accross nodes at boot time and make the TSC usable that way. It won't
   help for TSCs which run apart due to frequency skew between packages,
   but this gets detected by the clocksource watchdog later.

The first step toward this is to store the TSC_ADJUST value of a starting
CPU and compare it with the value of an already online CPU in the same
package. If they differ, emit a warning and adjust it to the reference
value. The !SMP version just stores the boot value for later verification.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134017.655323776@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Detect random warps

If time warps can be observed then they should only ever be observed on one
CPU. If they are observed on both CPUs then the system is completely hosed.

Add a check for this condition and notify if it happens.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134017.574838461@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Use X86_FEATURE_TSC_ADJUST in detect_art()

The art detection uses rdmsrl_safe() to detect the availablity of the
TSC_ADJUST MSR.

That's pointless because we have a feature bit for this. Use it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161119134017.483561692@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Finalize the split of the TSC_RELIABLE flag

All places which used the TSC_RELIABLE to skip the delayed calibration
have been converted to use the TSC_KNOWN_FREQ flag.

Make the immeditate clocksource registration, which skips the long term
calibration, solely depend on TSC_KNOWN_FREQ.

The TSC_RELIABLE now merily removes the requirement for a watchdog
clocksource.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>

x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCs

TSC on Intel Atom SoCs capable of determining TSC frequency by MSR is
reliable and the frequency is known (provided by HW).

On these platforms PIT/HPET is generally not available so calibration won't
work at all and there is no other clocksource to act as a watchdog for the
TSC, so we have no other choice than to trust it.

Set both X86_FEATURE_TSC_KNOWN_FREQ and X86_FEATURE_TSC_RELIABLE flags to
make sure the calibration is skipped and no watchdog is required.

Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-5-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Mark Intel ATOM_GOLDMONT TSC reliable

On Intel GOLDMONT Atom SoC TSC is the only available clocksource, so there
is no way to do software calibration or have a watchdog clocksource for it.
Software calibration is already disabled via the TSC_KNOWN_FREQ flag, but
the watchdog requirement still persists, so such systems cannot switch to
high resolution/nohz mode.

Mark it reliable, so it becomes usable. Hardware teams confirmed that this
is safe on that SoC.

Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-4-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Mark TSC frequency determined by CPUID as known

CPUs/SoCs with CPUID leaf 0x15 come with a known frequency and will report
the frequency to software via CPUID instruction. This hardware provided
frequency is the "real" frequency of TSC.

Set the X86_FEATURE_TSC_KNOWN_FREQ flag for such systems to skip the
software calibration process.

A 24 hours test on one of the CPUID 0x15 capable platforms was
conducted. PIT calibrated frequency resulted in more than 3 seconds drift
whereas the CPUID determined frequency showed less than 0.5 second
drift.

Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-3-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/tsc: Add X86_FEATURE_TSC_KNOWN_FREQ flag

The X86_FEATURE_TSC_RELIABLE flag in Linux kernel implies both reliable
(at runtime) and trustable (at calibration). But reliable running and
trustable calibration independent of each other.

Add a new flag X86_FEATURE_TSC_KNOWN_FREQ, which denotes that the frequency
is known (via MSR/CPUID). This flag is only meant to skip the long term
calibration on systems which have a known frequency.

Add X86_FEATURE_TSC_KNOWN_FREQ to the skip the delayed calibration and
leave X86_FEATURE_TSC_RELIABLE in place.

After converting the existing users of X86_FEATURE_TSC_RELIABLE to use
either both flags or just X86_FEATURE_TSC_KNOWN_FREQ we can seperate the
functionality.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-2-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Linux 4.9-rc5

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"ARM fixes.  There are a couple pending x86 patches but they'll have to
  wait for next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: arm/arm64: vgic: Kick VCPUs when queueing already pending IRQs
  KVM: arm/arm64: vgic: Prevent access to invalid SPIs
  arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU

Merge branch 'media-fixes' (patches from Mauro)

Merge media fixes from Mauro Carvalho Chehab:
"This contains two patches fixing problems with my patch series meant
  to make USB drivers to work again after the DMA on stack changes.

  The last patch on this series is actually not related to DMA on stack.
  It solves a longstanding bug affecting module unload, causing
  module_put() to be called twice. It was reported by the user who
  reported and tested the issues with the gp8psk driver with the DMA
  fixup patches. As we're late at -rc cycle, maybe you prefer to not
  apply it right now. If this is the case, I'll add to the pile of
  patches for 4.10.

  Exceptionally this time, I'm sending the patches via e-mail, because
  I'm on another trip, and won't be able to use the usual procedure
  until Monday. Also, it is only three patches, and you followed already
  the discussions about the first one"

* emailed patches from Mauro Carvalho Chehab <mchehab@osg.samsung.com>:
  gp8psk: Fix DVB frontend attach
  gp8psk: fix gp8psk_usb_in_op() logic
  dvb-usb: move data_mutex to struct dvb_usb_device

Merge tag 'char-misc-4.9-rc5' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
"Here are three small driver fixes for some reported issues for
  4.9-rc5.

  One for the hyper-v subsystem, fixing up a naming issue that showed up
  in 4.9-rc1, one mei driver fix, and one fix for parallel ports,
  resolving a reported regression.

  All have been in linux-next with no reported issues"

* tag 'char-misc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  ppdev: fix double-free of pp->pdev->name
  vmbus: make sysfs names consistent with PCI
  mei: bus: fix received data size check in NFC fixup

Merge tag 'driver-core-4.9-rc5' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
"Here are two driver core fixes for 4.9-rc5.

  The first resolves an issue with some drivers not liking to be unbound
  and bound again (if CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled), which
  solves some reported problems with graphics and storage drivers. The
  other resolves a smatch error with the 4.9-rc1 driver core changes
  around this feature.

  Both have been in linux-next with no reported issues"

* tag 'driver-core-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: fix smatch warning on dev->bus check
  driver core: skip removal test for non-removable drivers

Merge tag 'staging-4.9-rc5' of git://git./linux/kernel/git/gregkh/staging

Pull staging/IIO fixes from Grek KH:
"Here are a few small staging and iio driver fixes for reported issues.

  The last one was cherry-picked from my -next branch to resolve a build
  warning that Arnd fixed, in his quest to be able to turn
  -Wmaybe-uninitialized back on again. That patch, and all of the
  others, have been in linux-next for a while with no reported issues"

* tag 'staging-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: maxim_thermocouple: detect invalid storage size in read()
  staging: nvec: remove managed resource from PS2 driver
  Revert "staging: nvec: ps2: change serio type to passthrough"
  drivers: staging: nvec: remove bogus reset command for PS/2 interface
  staging: greybus: arche-platform: fix device reference leak
  staging: comedi: ni_tio: fix buggy ni_tio_clock_period_ps() return value
  staging: sm750fb: Fix bugs introduced by early commits
  iio: hid-sensors: Increase the precision of scale to fix wrong reading interpretation.
  iio: orientation: hid-sensor-rotation: Add PM function (fix non working driver)
  iio: st_sensors: fix scale configuration for h3lis331dl
  staging: iio: ad5933: avoid uninitialized variable in error case

Merge tag 'usb-4.9-rc5' of git://git./linux/kernel/git/gregkh/usb

Pull USB / PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 4.9-rc5

  Nothing major, just small fixes for reported issues, all of these have
  been in linux-next for a while with no reported issues"

* tag 'usb-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: cdc-acm: fix TIOCMIWAIT
  cdc-acm: fix uninitialized variable
  drivers/usb: Skip auto handoff for TI and RENESAS usb controllers
  usb: musb: remove duplicated actions
  usb: musb: da8xx: Don't print phy error on -EPROBE_DEFER
  phy: sun4i: check PMU presence when poking unknown bit of pmu
  phy-rockchip-pcie: remove deassert of phy_rst from exit callback
  phy: da8xx-usb: rename the ohci device to ohci-da8xx
  phy: Add reset callback for not generic phy
  uwb: fix device reference leaks
  usb: gadget: u_ether: remove interrupt throttling
  usb: dwc3: st: add missing <linux/pinctrl/consumer.h> include
  usb: dwc3: Fix error handling for core init

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull more block fixes from Jens Axboe:
"Since I mistakenly left out the lightnvm regression fix yesterday and
  the aoeblk seems adequately tested at this point, might as well send
  out another pull to make -rc5"

* 'for-linus' of git://git.kernel.dk/linux-block:
  aoe: fix crash in page count manipulation
  lightnvm: invalid offset calculation for lba_shift

Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"The megaraid_sas patch in here fixes a major regression in the last
  fix set that made all megaraid_sas cards unusable. It turns out no-one
  had actually tested such an "obvious" fix, sigh. The fix for the fix
  has been tested ...

  The next most serious is the vmw_pvscsi abort problem which basically
  means that aborts don't work on the vmware paravirt devices and error
  handling always escalates to reset.

  The rest are an assortment of missed reference counting in certain
  paths and corner case bugs that show up on some architectures"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
  scsi: qla2xxx: fix invalid DMA access after command aborts in PCI device remove
  scsi: qla2xxx: do not queue commands when unloading
  scsi: libcxgbi: fix incorrect DDP resource cleanup
  scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
  scsi: scsi_dh_alua: Fix a reference counting bug
  scsi: vmw_pvscsi: return SUCCESS for successful command aborts
  scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk
  scsi: scsi_dh_alua: fix missing kref_put() in alua_rtpg_work()

Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"The typical collection of minor bug fixes in clk drivers. We don't
  have anything in the core framework here, just driver fixes.

  There's a boot fix for Samsung devices and a safety measure for qoriq
  to prevent CPUs from running too fast. There's also a fix for i.MX6Q
  to properly handle audio clock rates. We also have some "that's
  obviously wrong" fixes like bad NULL pointer checks in the MPP driver
  and a poor usage of __pa in the xgene clk driver that are fixed here"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: mmp: pxa910: fix return value check in pxa910_clk_init()
  clk: mmp: pxa168: fix return value check in pxa168_clk_init()
  clk: mmp: mmp2: fix return value check in mmp2_clk_init()
  clk: qoriq: Don't allow CPU clocks higher than starting value
  clk: imx: fix integer overflow in AV PLL round rate
  clk: xgene: Don't call __pa on ioremaped address
  clk/samsung: Use CLK_OF_DECLARE_DRIVER initialization method for CLKOUT
  clk: rockchip: don't return NULL when failing to register ddrclk branch

gp8psk: Fix DVB frontend attach

The DVB binding schema at the DVB core assumes that the frontend is a
separate driver.  Faling to do that causes OOPS when the module is
removed, as it tries to do a symbol_put_addr on an internal symbol,
causing craches like:

    WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
    Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
    CPU: 1 PID: 28102 Comm: rmmod Tainted: P        WC O 4.8.4-build.1 #1
    Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
    Call Trace:
       dump_stack+0x44/0x64
       __warn+0xfa/0x120
       module_put+0x57/0x70
       module_put+0x57/0x70
       warn_slowpath_null+0x23/0x30
       module_put+0x57/0x70
       gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
       symbol_put_addr+0x27/0x50
       dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]

From Derek's tests:
    "Attach bug is fixed, tuning works, module unloads without
     crashing. Everything seems ok!"

Reported-by: Derek <user.vdr@gmail.com>
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gp8psk: fix gp8psk_usb_in_op() logic

Commit bc29131ecb10 ("[media] gp8psk: don't do DMA on stack") fixed the
usage of DMA on stack, but the memcpy was wrong for gp8psk_usb_in_op().
Fix it.

From Derek's email:
"Fix confirmed using 2 different Skywalker models with
HD mpeg4, SD mpeg2."

Suggested-by: Johannes Stezenbach <js@linuxtv.org>
Fixes: bc29131ecb10 ("[media] gp8psk: don't do DMA on stack")
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dvb-usb: move data_mutex to struct dvb_usb_device

The data_mutex is initialized too late, as it is needed for
each device driver's power control, causing an OOPS:

    dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 PGD 0
    Oops: 0002 [#1] SMP
    Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
    CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24
    Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014
    task: ffff88020e943840 task.stack: ffff8801f36ec000
    RIP: 0010:[<ffffffff846617af>]  [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100
    RSP: 0018:ffff8801f36efb10  EFLAGS: 00010282
    RAX: 0000000000000000 RBX: ffff88021509bdc8 RCX: 00000000c0000100
    RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88021509bdcc
    RBP: ffff8801f36efb58 R08: ffff88021f216320 R09: 0000000000100000
    R10: ffff88021f216320 R11: 00000023fee6c5a1 R12: ffff88020e943840
    R13: ffff88021509bdcc R14: 00000000ffffffff R15: ffff88021509bdd0
    FS:  00007f21adb86740(0000) GS:ffff88021f200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000215bce000 CR4: 00000000001406f0
    Call Trace:
       mutex_lock+0x16/0x25
       cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2]
       dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
       cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2]
       usb_probe_interface+0xf3/0x2a0
       driver_probe_device+0x208/0x2b0
       __driver_attach+0x87/0x90
       driver_probe_device+0x2b0/0x2b0
       bus_for_each_dev+0x52/0x80
       bus_add_driver+0x1a3/0x220
       driver_register+0x56/0xd0
       usb_register_driver+0x77/0x130
       do_one_initcall+0x46/0x180
       free_vmap_area_noflush+0x38/0x70
       kmem_cache_alloc+0x84/0xc0
       do_init_module+0x50/0x1be
       load_module+0x1d8b/0x2100
       find_symbol_in_section+0xa0/0xa0
       SyS_finit_module+0x89/0x90
       entry_SYSCALL_64_fastpath+0x13/0x94
    Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43 RIP  [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 RSP <ffff8801f36efb10>
    CR2: 0000000000000000

So, move it to the struct dvb_usb_device and initialize it
before calling the driver's callbacks.

Reported-by: Jörg Otte <jrg.otte@gmail.com>
Tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

iio: maxim_thermocouple: detect invalid storage size in read()

As found by gcc -Wmaybe-uninitialized, having a storage_bytes value other
than 2 or 4 will result in undefined behavior:

drivers/iio/temperature/maxim_thermocouple.c: In function 'maxim_thermocouple_read':
drivers/iio/temperature/maxim_thermocouple.c:141:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This probably cannot happen, but returning -EINVAL here is appropriate
and makes gcc happy and the code more robust.

Fixes: 231147ee77f3 ("iio: maxim_thermocouple: Align 16 bit big endian value of raw reads")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
(cherry picked from commit 32cb7d27e65df9daa7cee8f1fdf7b259f214bee2)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

aoe: fix crash in page count manipulation

aoeblk contains some mysterious code, that wants to elevate the bio
vec page counts while it's under IO. That is not needed, it's
fragile, and it's causing kernel oopses for some.

Reported-by: Tested-by: Don Koch <kochd@us.ibm.com>
Tested-by: Tested-by: Don Koch <kochd@us.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

lightnvm: invalid offset calculation for lba_shift

The ns->lba_shift assumes its value to be the logarithmic of the
LA size. A previous patch duplicated the lba_shift calculation into
lightnvm. It prematurely also subtracted a 512byte shift, which commonly
is applied per-command. The 512byte shift being subtracted twice led to
data loss when restoring the logical to physical mapping table from
device and when issuing I/O commands using rrpc.

Fix offset by removing the 512byte shift subtraction when calculating
lba_shift.

Fixes: b0b4e09c1ae7 "lightnvm: control life of nvm_dev in driver"
Reported-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

Merge tag 'acpi-4.9-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
"Fix a recent regression in the 8250_dw serial driver introduced by
  adding a quirk for the APM X-Gene SoC to it which uncovered an issue
  related to the handling of built-in device properties in the core ACPI
  device enumeration code (Heikki Krogerus)"

* tag 'acpi-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / platform: Add support for build-in properties

Merge tag 'pm-4.9-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix two bugs in error code paths in the PM core (system-wide
  suspend of devices), a device reference leak in the boot-time suspend
  test code and a cpupower utility regression from the 4.7 cycle.

  Specifics:

   - Prevent the PM core from attempting to suspend parent devices if
     any of their children, whose suspend callbacks were invoked
     asynchronously, have failed to suspend during the "late" and
     "noirq" phases of system-wide suspend of devices (Brian Norris).

   - Prevent the boot-time system suspend test code from leaking a
     reference to the RTC device used by it (Johan Hovold).

   - Fix cpupower to use the return value of one of its library
     functions correctly and restore the correct behavior of it when
     used for setting cpufreq tunables broken during the 4.7 development
     cycle (Laura Abbott)"

* tag 'pm-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
  PM / sleep: fix device reference leak in test_suspend
  cpupower: Correct return type of cpu_power_is_cpu_online() in cpufreq-set

Merge tag 'arc-4.9-rc5' of git://git./linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

- mmap handler for dma ops as generic handler no longer works for us
   [Alexey]

- Fixes for EZChip platform [Noam]

- Fix RTC clocksource driver build issue

- ARC IRQ handling fixes [Yuriy]

- Revert a recent makefile change which doesn't go well with oldish
   tools out in the wild

* tag 'arc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARCv2: MCIP: Use IDU_M_DISTRI_DEST mode if there is only 1 destination core
  ARC: IRQ: Do not use hwirq as virq and vice versa
  ARC: [plat-eznps] set default baud for early console
  ARC: [plat-eznps] remove IPI clear from SMP operations
  Revert "ARC: build: retire old toggles"
  ARC: timer: rtc: implement read loop in "C" vs. inline asm
  ARC: change return value of userspace cmpxchg assist syscall
  arc: Implement arch-specific dma_map_ops.mmap
  ARC: [SMP] avoid overriding present cpumask
  ARC: Enable PERF_EVENTS in nSIM driven platforms

Merge tag 'platform-drivers-x86-v4.9-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull x86 platform driver fixes from Darren Hart:
"Minor doc fix, a DMI match for ideapad and a fix to toshiba-wmi to
  avoid loading on non-toshiba systems.

  Documentation/ABI:
   - ibm_rtl: The "What:" fields are incomplete

  toshiba-wmi:
   - Fix loading the driver on non Toshiba laptops

  ideapad-laptop:
   - Add another DMI entry for Yoga 900"

* tag 'platform-drivers-x86-v4.9-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
  Documentation/ABI: ibm_rtl: The "What:" fields are incomplete
  toshiba-wmi: Fix loading the driver on non Toshiba laptops
  ideapad-laptop: Add another DMI entry for Yoga 900

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Two small (really, one liners both of them!) fixes that should go into
  this series:

   - Request allocation error handling fix for nbd, from Christophe,
     fixing a regression in this series.

   - An oops fix for drbd. Not a regression in this series, but stable
     material. From Richard"

* 'for-linus' of git://git.kernel.dk/linux-block:
  drbd: Fix kernel_sendmsg() usage - potential NULL deref
  nbd: Fix error handling

Merge tag 'pci-v4.9-fixes-3' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

- Update MAINTAINERS for Intel VMD driver filename

- Update Rockchip rk3399 host bridge driver DTS and resets

- Fix ROM shadow problem that made some video device initialization
   fail

* tag 'pci-v4.9-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: VMD: Update filename to reflect move
  arm64: dts: rockchip: add three new resets for rk3399 PCIe controller
  PCI: rockchip: Add three new resets as required properties
  PCI: Don't attempt to claim shadow copies of ROM

Merge tag 'drm-fixes-for-v4.9-rc5' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"AMD, radeon, i915, imx, msm and udl fixes:

   - amdgpu/radeon have a number of power management regressions and
     fixes along with some better error checking

   - imx has a single regression fix

   - udl has a single kmalloc instead of stack for usb control msg fix

   - msm has some fixes for modesetting bugs and regressions

   - i915 has a one fix for a Sandybridge regression along with some
     others for DP audio.

  They all seem pretty okay at this stage, we've got one MST fix I know
  going through process for i915, but I expect it'll be next week"

* tag 'drm-fixes-for-v4.9-rc5' of git://people.freedesktop.org/~airlied/linux: (30 commits)
  drm/udl: make control msg static const. (v2)
  drm/amd/powerplay: implement get_clock_by_type for iceland.
  drm/amd/powerplay/smu7: fix checks in smu7_get_evv_voltages (v2)
  drm/amd/powerplay: update phm_get_voltage_evv_on_sclk for iceland
  drm/amd/powerplay: propagate errors in phm_get_voltage_evv_on_sclk
  drm/imx: disable planes before DC
  drm/amd/powerplay: return false instead of -EINVAL
  drm/amdgpu/powerplay/smu7: fix unintialized data usage
  drm/amdgpu: fix crash in acp_hw_fini
  drm/i915: Limit Valleyview and earlier to only using mappable scanout
  drm/i915: Round tile chunks up for constructing partial VMAs
  drm/i915/dp: Extend BDW DP audio workaround to GEN9 platforms
  drm/i915/dp: BDW cdclk fix for DP audio
  drm/i915/vlv: Prevent enabling hpd polling in late suspend
  drm/i915: Respect alternate_ddc_pin for all DDI ports
  drm/msm: Fix error handling crashes seen when VRAM allocation fails
  drm/msm/mdp5: 8x16 actually has 8 mixer stages
  drm/msm/mdp5: no scaling support on RGBn pipes for 8x16
  drm/msm/mdp5: handle non-fullscreen base plane case
  drm/msm: Set CLK_IGNORE_UNUSED flag for PLL clocks
  ...

Merge tag 'mmc-v4.9-rc4' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
"MMC core:
   - Fix mmc card initialization for hosts not supporting HW busy
     detection
   - Fix mmc_test for sending commands during non-blocking write

  MMC host:
   - mxs: Avoid using an uninitialized
   - sdhci: Restore enhanced strobe setting during runtime resume
   - sdhci: Fix a couple of reset related issues
   - dw_mmc: Fix a reset controller issue"

* tag 'mmc-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: mxs: Initialize the spinlock prior to using it
  mmc: mmc: Use 500ms as the default generic CMD6 timeout
  mmc: mmc_test: Fix "Commands during non-blocking write" tests
  mmc: sdhci: Fix missing enhanced strobe setting during runtime resume
  mmc: sdhci: Reset cmd and data circuits after tuning failure
  mmc: sdhci: Fix unexpected data interrupt handling
  mmc: sdhci: Fix CMD line reset interfering with ongoing data transfer
  mmc: dw_mmc: add the "reset" as name of reset controller
  Documentation: synopsys-dw-mshc: add binding for reset-names

Merge tag 'pinctrl-v4.9-3' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"All is about drivers, no core business going on.

   - Fix a host of runtime problems with the Intel Cherryview driver:
     suspend/resume needs to be marshalled properly, and strange effects
     from BIOS interaction during suspend/resume need to be dealt with.

   - A single bit was being set wrong in the Aspeed driver.

   - Fix an iProc probe ordering fallout resulting from v4.9
     refactorings for bus population.

   - Do not specify a default trigger in the ST Micro cascaded GPIO IRQ
     controller: the kernel will moan.

   - Make IRQs optional altogether on the STM32 driver, it turns out not
     all systems have them or want them.

   - Fix a re-probe bug in the i.MX driver, it will eventually crash if
     probed repeatedly, not good"

* tag 'pinctrl-v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl-aspeed-g5: Never set SCU90[6]
  pinctrl: cherryview: Prevent possible interrupt storm on resume
  pinctrl: cherryview: Serialize register access in suspend/resume
  pinctrl: imx: reset group index on probe
  pinctrl: stm32: move gpio irqs binding to optional
  pinctrl: stm32: remove dependency with interrupt controller
  pinctrl: st: don't specify default interrupt trigger
  pinctrl: iproc: Fix iProc and NSP GPIO support

Merge branches 'pm-tools-fixes' and 'pm-sleep-fixes'

* pm-tools-fixes:
  cpupower: Correct return type of cpu_power_is_cpu_online() in cpufreq-set

* pm-sleep-fixes:
  PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
  PM / sleep: fix device reference leak in test_suspend

Merge branch 'device-properties'

* device-properties:
ACPI / platform: Add support for build-in properties

Merge branch 'maybe-uninitialized' (patches from Arnd)

Merge fixes for -Wmaybe-uninitialized from Arnd Bergmann:
"It took a while for some patches to make it into mainline through
  maintainer trees, but the 28-patch series is now reduced to 10, with
  one tiny patch added at the end.

  Aside from patches that are no longer required, I did these changes
  compared to version 1:

   - Dropped "iio: maxim_thermocouple: detect invalid storage size in
     read()", which is currently in linux-next as commit 32cb7d27e65d.
     This is the only remaining warning I see for a couple of corner
     cases (kbuild bot reports it on blackfin, kernelci bot and arm-soc
     bot both report it on arm64)

   - Dropped "brcmfmac: avoid maybe-uninitialized warning in
     brcmf_cfg80211_start_ap", which is currently in net/master merge
     pending.

   - Dropped two x86 patches, "x86: math-emu: possible uninitialized
     variable use" and "x86: mark target address as output in 'insb'
     asm" as they do not seem to trigger for a default build, and I got
     no feedback on them. Both of these are ancient issues and seem
     harmless, I will send them again to the x86 maintainers once the
     rest is merged.

   - Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on
     feedback from Ilya Dryomov, who already has a different fix queued
     up for v4.10. The kbuild bot reports this as a warning for xtensa.

   - Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with
     a simpler patch, this one always triggers but my first solution
     would not be safe for linux-4.9 any more at this point. I'll follow
     up with the larger patch as a cleanup for 4.10.

   - Replaced "dib0700: fix nec repeat handling" with a better one,
     contributed by Sean Young"

* -Wmaybe-uninitialized fixes:
  Kbuild: enable -Wmaybe-uninitialized warnings by default
  pcmcia: fix return value of soc_pcmcia_regulator_set
  infiniband: shut up a maybe-uninitialized warning
  crypto: aesni: shut up -Wmaybe-uninitialized warning
  rc: print correct variable for z8f0811
  dib0700: fix nec repeat handling
  s390: pci: don't print uninitialized data for debugging
  nios2: fix timer initcall return value
  x86: apm: avoid uninitialized data
  NFSv4.1: work around -Wmaybe-uninitialized warning
  Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"15 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib/stackdepot: export save/fetch stack for drivers
  mm: kmemleak: scan .data.ro_after_init
  memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB
  coredump: fix unfreezable coredumping task
  mm/filemap: don't allow partially uptodate page for pipes
  mm/hugetlb: fix huge page reservation leak in private mapping error paths
  ocfs2: fix not enough credit panic
  Revert "console: don't prefer first registered if DT specifies stdout-path"
  mm: hwpoison: fix thp split handling in memory_failure()
  swapfile: fix memory corruption via malformed swapfile
  mm/cma.c: check the max limit for cma allocation
  scripts/bloat-o-meter: fix SIGPIPE
  shmem: fix pageflags after swapping DMA32 object
  mm, frontswap: make sure allocated frontswap map is assigned
  mm: remove extra newline from allocation stall warning

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull VFS fixes from Al Viro:
"Christoph's and Jan's aio fixes, fixup for generic_file_splice_read
  (removal of pointless detritus that actually breaks it when used for
  gfs2 ->splice_read()) and fixup for generic_file_read_iter()
  interaction with ITER_PIPE destinations."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  splice: remove detritus from generic_file_splice_read()
  mm/filemap: don't allow partially uptodate page for pipes
  aio: fix freeze protection of aio writes
  fs: remove aio_run_iocb
  fs: remove the never implemented aio_fsync file operation
  aio: hold an extra file reference over AIO read/write operations

Merge tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client

Pull Ceph fixes from Ilya Dryomov:
"Ceph's ->read_iter() implementation is incompatible with the new
  generic_file_splice_read() code that went into -rc1.  Switch to the
  less efficient default_file_splice_read() for now; the proper fix is
  being held for 4.10.

  We also have a fix for a 4.8 regression and a trival libceph fixup"

* tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client:
  libceph: initialize last_linger_id with a large integer
  libceph: fix legacy layout decode with pool 0
  ceph: use default file splice read callback

Merge tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client bugfixes from Anna Schumaker:
"Most of these fix regressions in 4.9, and none are going to stable
  this time around.

  Bugfixes:
   - Trim extra slashes in v4 nfs_paths to fix tools that use this
   - Fix a -Wmaybe-uninitialized warnings
   - Fix suspicious RCU usages
   - Fix Oops when mounting multiple servers at once
   - Suppress a false-positive pNFS error
   - Fix a DMAR failure in NFS over RDMA"

* tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect
  fs/nfs: Fix used uninitialized warn in nfs4_slot_seqid_in_use()
  NFS: Don't print a pNFS error if we aren't using pNFS
  NFS: Ignore connections that have cl_rpcclient uninitialized
  SUNRPC: Fix suspicious RCU usage
  NFSv4.1: work around -Wmaybe-uninitialized warning
  NFS: Trim extra slash in v4 nfs_path

Merge tag 'xfs-fixes-for-linus-4.9-rc5' of git://git./linux/kernel/git/dgc/linux-xfs

Pull xfs fix from Dave Chinner:
"This is a fix for an unmount hang (regression) when the filesystem is
  shutdown.  It was supposed to go to you for -rc3, but I accidentally
  tagged the commit prior to it in that pullreq.

  Summary:

   - fix for aborting deferred transactions on filesystem shutdown"

* tag 'xfs-fixes-for-linus-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: defer should abort intent items if the trans roll fails

Kbuild: enable -Wmaybe-uninitialized warnings by default

Previously the warnings were added back at the W=1 level and above, this
now turns them on again by default, assuming that we have addressed all
warnings and again have a clean build for v4.10.

I found a number of new warnings in linux-next already and submitted
bugfixes for those. Hopefully they are caught by the 0day builder in
the future as soon as this patch is merged.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pcmcia: fix return value of soc_pcmcia_regulator_set

The newly introduced soc_pcmcia_regulator_set() function sometimes
returns without setting its return code, as shown by this warning:

drivers/pcmcia/soc_common.c: In function 'soc_pcmcia_regulator_set':
drivers/pcmcia/soc_common.c:112:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This changes it to propagate the regulator_disable() result instead.

Fixes: ac61b6001a63 ("pcmcia: soc_common: add support for Vcc and Vpp regulators")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

infiniband: shut up a maybe-uninitialized warning

Some configurations produce this harmless warning when built with gcc
-Wmaybe-uninitialized:

  infiniband/core/cma.c: In function 'cma_get_net_dev':
  infiniband/core/cma.c:1242:12: warning: 'src_addr_storage.sin_addr.s_addr' may be used uninitialized in this function [-Wmaybe-uninitialized]

I previously reported this for the powerpc64 defconfig, but have now
reproduced the same thing for x86 as well, using gcc-5 or higher.

The code looks correct to me, and this change just rearranges it by
making sure we alway initialize the entire address structure to make the
warning disappear.  My first approach added an initialization at the
time of the declaration, which Doug commented may be too costly, so I
hope this version doesn't add overhead.

Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.7-rc6/buildall.powerpc.ppc64_defconfig.log.passed
Link: https://patchwork.kernel.org/patch/9212825/
Acked-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

crypto: aesni: shut up -Wmaybe-uninitialized warning

The rfc4106 encrypy/decrypt helper functions cause an annoying
false-positive warning in allmodconfig if we turn on
-Wmaybe-uninitialized warnings again:

  arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’:
  include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized]

The problem seems to be that the compiler doesn't track the state of the
'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end
section.

This takes the easy way out by adding a bogus initialization, which
should be harmless enough to get the patch into v4.9 so we can turn on
this warning again by default without producing useless output.  A
follow-up patch for v4.10 rearranges the code to make the warning go
away.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rc: print correct variable for z8f0811

A recent rework accidentally left a debugging printk untouched while
changing the meaning of the variables, leading to an uninitialized
variable being printed:

drivers/media/i2c/ir-kbd-i2c.c: In function 'get_key_haup_common':
drivers/media/i2c/ir-kbd-i2c.c:62:2: error: 'toggle' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This prints the correct one instead, as we did before the patch.

Fixes: 00bb820755ed ("[media] rc: Hauppauge z8f0811 can decode RC6")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dib0700: fix nec repeat handling

When receiving a nec repeat, ensure the correct scancode is repeated
rather than a random value from the stack.  This removes the need for
the bogus uninitialized_var() and also fixes the warnings:

    drivers/media/usb/dvb-usb/dib0700_core.c: In function ‘dib0700_rc_urb_completion’:
    drivers/media/usb/dvb-usb/dib0700_core.c:679: warning: ‘protocol’ may be used uninitialized in this function

[sean addon: So after writing the patch and submitting it, I've bought the
             hardware on ebay. Without this patch you get random scancodes
             on nec repeats, which the patch indeed fixes.]

Signed-off-by: Sean Young <sean@mess.org>
Tested-by: Sean Young <sean@mess.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

s390: pci: don't print uninitialized data for debugging

gcc correctly warns about an incorrect use of the 'pa' variable in case
we pass an empty scatterlist to __s390_dma_map_sg:

  arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg':
  arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized]

This adds a bogus initialization to the function to sanitize the debug
output.  I would have preferred a solution without the initialization,
but I only got the report from the kbuild bot after turning on the
warning again, and didn't manage to reproduce it myself.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nios2: fix timer initcall return value

When called more than twice, the nios2_time_init() function return an
uninitialized value, as detected by gcc -Wmaybe-uninitialized

arch/nios2/kernel/time.c: warning: 'ret' may be used uninitialized in this function

This makes it return '0' here, matching the comment above the function.

Acked-by: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86: apm: avoid uninitialized data

apm_bios_call() can fail, and return a status in its argument structure.
If that status however is zero during a call from
apm_get_power_status(), we end up using data that may have never been
set, as reported by "gcc -Wmaybe-uninitialized":

  arch/x86/kernel/apm_32.c: In function ‘apm’:
  arch/x86/kernel/apm_32.c:1729:17: error: ‘bx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  arch/x86/kernel/apm_32.c:1835:5: error: ‘cx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  arch/x86/kernel/apm_32.c:1730:17: note: ‘cx’ was declared here
  arch/x86/kernel/apm_32.c:1842:27: error: ‘dx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  arch/x86/kernel/apm_32.c:1731:17: note: ‘dx’ was declared here

This changes the function to return "APM_NO_ERROR" here, which makes the
code more robust to broken BIOS versions, and avoids the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

NFSv4.1: work around -Wmaybe-uninitialized warning

A bugfix introduced a harmless gcc warning in nfs4_slot_seqid_in_use if
we enable -Wmaybe-uninitialized again:

fs/nfs/nfs4session.c:203:54: error: 'cur_seq' may be used uninitialized in this function [-Werror=maybe-uninitialized]

gcc is not smart enough to conclude that the IS_ERR/PTR_ERR pair results
in a nonzero return value here. Using PTR_ERR_OR_ZERO() instead makes
this clear to the compiler.

Fixes: e09c978aae5b ("NFSv4.1: Fix Oopsable condition in server callback races")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"

Traditionally, we have always had warnings about uninitialized variables
enabled, as this is part of -Wall, and generally a good idea [1], but it
also always produced false positives, mainly because this is a variation
of the halting problem and provably impossible to get right in all cases
[2].

Various people have identified cases that are particularly bad for false
positives, and in commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized
when building with -Os"), I turned off the warning for any build that
was done with CC_OPTIMIZE_FOR_SIZE.  This drastically reduced the number
of false positive warnings in the default build but unfortunately had
the side effect of turning the warning off completely in 'allmodconfig'
builds, which in turn led to a lot of warnings (both actual bugs, and
remaining false positives) to go in unnoticed.

With commit 877417e6ffb9 ("Kbuild: change CC_OPTIMIZE_FOR_SIZE
definition") enabled the warning again for allmodconfig builds in v4.7
and in v4.8-rc1, I had finally managed to address all warnings I get in
an ARM allmodconfig build and most other maybe-uninitialized warnings
for ARM randconfig builds.

However, commit 6e8d666e9253 ("Disable "maybe-uninitialized" warning
globally") was merged at the same time and disabled it completely for
all configurations, because of false-positive warnings on x86 that I had
not addressed until then.  This caused a lot of actual bugs to get
merged into mainline, and I sent several dozen patches for these during
the v4.9 development cycle.  Most of these are actual bugs, some are for
correct code that is safe because it is only called under external
constraints that make it impossible to run into the case that gcc sees,
and in a few cases gcc is just stupid and finds something that can
obviously never happen.

I have now done a few thousand randconfig builds on x86 and collected
all patches that I needed to address every single warning I got (I can
provide the combined patch for the other warnings if anyone is
interested), so I hope we can get the warning back and let people catch
the actual bugs earlier.

This reverts the change to disable the warning completely and for now
brings it back at the "make W=1" level, so we can get it merged into
mainline without introducing false positives.  A follow-up patch enables
it on all levels unless some configuration option turns it off because
of false-positives.

Link: https://rusty.ozlabs.org/?p=232
Link: https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/stackdepot: export save/fetch stack for drivers

Some drivers would like to record stacktraces in order to aide leak
tracing.  As stackdepot already provides a facility for only storing the
unique traces, thereby reducing the memory required, export that
functionality for use by drivers.

The code was originally created for KASAN and moved under lib in commit
cd11016e5f521 ("mm, kasan: stackdepot implementation.  Enable stackdepot
for SLAB") so that it could be shared with mm/.  In turn, we want to
share it now with drivers.

Link: http://lkml.kernel.org/r/20161108133209.22704-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: kmemleak: scan .data.ro_after_init

Limit the number of kmemleak false positives by including
.data.ro_after_init in memory scanning. To achieve this we need to add
symbols for start and end of the section to the linker scripts.

The problem was been uncovered by commit 56989f6d8568 ("genetlink: mark
families as __ro_after_init").

Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

While testing OBJFREELIST_SLAB integration with pagealloc, we found a
bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
CFLGS_OBJFREELIST_SLAB. When it happened, critical allocations needed
for loading drivers or creating new caches will fail.

The original kmem_cache is created early making OFF_SLAB not possible.
When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
is enabled it will try to enable it first under certain conditions.
Given kmem_cache(sys) reuses the original flag, you can have both flags
at the same time resulting in allocation failures and odd behaviors.

This fix discards allocator specific flags from memcg before calling
create_cache.

The bug exists since 4.6-rc1 and affects testing debug pagealloc
configurations.

Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
Link: http://lkml.kernel.org/r/1478553075-120242-1-git-send-email-thgarnie@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Tested-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

coredump: fix unfreezable coredumping task

It could be not possible to freeze coredumping task when it waits for
'core_state->startup' completion, because threads are frozen in
get_signal() before they got a chance to complete 'core_state->startup'.

Inability to freeze a task during suspend will cause suspend to fail.
Also CRIU uses cgroup freezer during dump operation. So with an
unfreezable task the CRIU dump will fail because it waits for a
transition from 'FREEZING' to 'FROZEN' state which will never happen.

Use freezer_do_not_count() to tell freezer to ignore coredumping task
while it waits for core_state->startup completion.

Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/filemap: don't allow partially uptodate page for pipes

Starting from 4.9-rc1 kernel, I started noticing some test failures of
sendfile(2) and splice(2) (sendfile0N and splice01 from LTP) when
testing on sub-page block size filesystems (tested both XFS and ext4),
these syscalls start to return EIO in the tests.  e.g.

  sendfile02    1  TFAIL  :  sendfile02.c:133: sendfile(2) failed to return expected value, expected: 26, got: -1
  sendfile02    2  TFAIL  :  sendfile02.c:133: sendfile(2) failed to return expected value, expected: 24, got: -1
  sendfile02    3  TFAIL  :  sendfile02.c:133: sendfile(2) failed to return expected value, expected: 22, got: -1
  sendfile02    4  TFAIL  :  sendfile02.c:133: sendfile(2) failed to return expected value, expected: 20, got: -1

This is because that in sub-page block size cases, we don't need the
whole page to be uptodate, only the part we care about is uptodate is OK
(if fs has ->is_partially_uptodate defined).

But page_cache_pipe_buf_confirm() doesn't have the ability to check the
partially-uptodate case, it needs the whole page to be uptodate.  So it
returns EIO in this case.

This is a regression introduced by commit 82c156f85384 ("switch
generic_file_splice_read() to use of ->read_iter()").  Prior to the
change, generic_file_splice_read() doesn't allow partially-uptodate page
either, so it worked fine.

Fix it by skipping the partially-uptodate check if we're working on a
pipe in do_generic_file_read(), so we read the whole page from disk as
long as the page is not uptodate.

I think the other way to fix it is to add the ability to check & allow
partially-uptodate page to page_cache_pipe_buf_confirm(), but that is
much harder to do and seems gain little.

Link: http://lkml.kernel.org/r/1477986187-12717-1-git-send-email-guaneryu@gmail.com
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/hugetlb: fix huge page reservation leak in private mapping error paths

Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly
allocated huge page.

If a reservation was associated with the huge page, alloc_huge_page()
consumed the reservation while allocating.  When the newly allocated
page is freed in free_huge_page(), it will increment the global
reservation count.  However, the reservation entry in the reserve map
will remain.

This is not an issue for shared mappings as the entry in the reserve map
indicates a reservation exists.  But, an entry in a private mapping
reserve map indicates the reservation was consumed and no longer exists.
This results in an inconsistency between the reserve map and the global
reservation count.  This 'leaks' a reserved huge page.

Create a new routine restore_reserve_on_error() to restore the reserve
entry in these specific error paths.  This routine makes use of a new
function vma_add_reservation() which will add a reserve entry for a
specific address/page.

In general, these error paths were rarely (if ever) taken on most
architectures.  However, powerpc contained arch specific code that that
resulted in an extra fault and execution of these error paths on all
private mappings.

Fixes: 67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private mappings)
Link: http://lkml.kernel.org/r/1476933077-23091-2-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A . Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix not enough credit panic

The following panic was caught when run ocfs2 disconfig single test
(block size 512 and cluster size 8192).  ocfs2_journal_dirty() return
-ENOSPC, that means credits were used up.

The total credit should include 3 times of "num_dx_leaves" from
ocfs2_dx_dir_rebalance(), because 2 times will be consumed in
ocfs2_dx_dir_transfer_leaf() and 1 time will be consumed in
ocfs2_dx_dir_new_cluster() -> __ocfs2_dx_dir_new_cluster() ->
ocfs2_dx_dir_format_cluster().  But only two times is included in
ocfs2_dx_dir_rebalance_credits(), fix it.

This can cause read-only fs(v4.1+) or panic for mainline linux depending
on mount option.

  ------------[ cut here ]------------
  kernel BUG at fs/ocfs2/journal.c:775!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport acpi_cpufreq i2c_piix4 i2c_core pcspkr ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
  CPU: 2 PID: 10601 Comm: dd Not tainted 4.1.12-71.el6uek.bug24939243.x86_64 #2
  Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016
  task: ffff8800b6de6200 ti: ffff8800a7d48000 task.ti: ffff8800a7d48000
  RIP: ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
  RSP: 0018:ffff8800a7d4b6d8  EFLAGS: 00010286
  RAX: 00000000ffffffe4 RBX: 00000000814d0a9c RCX: 00000000000004f9
  RDX: ffffffffa008e990 RSI: ffffffffa008f1ee RDI: ffff8800622b6460
  RBP: ffff8800a7d4b6f8 R08: ffffffffa008f288 R09: ffff8800622b6460
  R10: 0000000000000000 R11: 0000000000000282 R12: 0000000002c8421e
  R13: ffff88006d0cad00 R14: ffff880092beef60 R15: 0000000000000070
  FS:  00007f9b83e92700(0000) GS:ffff8800be880000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fb2c0d1a000 CR3: 0000000008f80000 CR4: 00000000000406e0
  Call Trace:
    ocfs2_dx_dir_transfer_leaf+0x159/0x1a0 [ocfs2]
    ocfs2_dx_dir_rebalance+0xd9b/0xea0 [ocfs2]
    ocfs2_find_dir_space_dx+0xd3/0x300 [ocfs2]
    ocfs2_prepare_dx_dir_for_insert+0x219/0x450 [ocfs2]
    ocfs2_prepare_dir_for_insert+0x1d6/0x580 [ocfs2]
    ocfs2_mknod+0x5a2/0x1400 [ocfs2]
    ocfs2_create+0x73/0x180 [ocfs2]
    vfs_create+0xd8/0x100
    lookup_open+0x185/0x1c0
    do_last+0x36d/0x780
    path_openat+0x92/0x470
    do_filp_open+0x4a/0xa0
    do_sys_open+0x11a/0x230
    SyS_open+0x1e/0x20
    system_call_fastpath+0x12/0x71
  Code: 1d 3f 29 09 00 48 85 db 74 1f 48 8b 03 0f 1f 80 00 00 00 00 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 eb eb 90 <0f> 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54
  RIP  ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
  ---[ end trace 91ac5312a6ee1288 ]---
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: disabled

Link: http://lkml.kernel.org/r/1478248135-31963-1-git-send-email-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "console: don't prefer first registered if DT specifies stdout-path"

This reverts commit 05fd007e4629 ("console: don't prefer first
registered if DT specifies stdout-path").

The reverted commit changes existing behavior on which many ARM boards
rely.  Many ARM small-board-computers, like e.g.  the Raspberry Pi have
both a video output and a serial console.  Depending on whether the user
is using the device as a more regular computer; or as a headless device
we need to have the console on either one or the other.

Many users rely on the kernel behavior of the console being present on
both outputs, before the reverted commit the console setup with no
console= kernel arguments on an ARM board which sets stdout-path in dt
would look like this:

  [root@localhost ~]# cat /proc/consoles
  ttyS0                -W- (EC p a)    4:64
  tty0                 -WU (E  p  )    4:1

Where as after the reverted commit, it looks like this:

  [root@localhost ~]# cat /proc/consoles
  ttyS0                -W- (EC p a)    4:64

This commit reverts commit 05fd007e4629 ("console: don't prefer first
registered if DT specifies stdout-path") restoring the original
behavior.

Fixes: 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path")
Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: hwpoison: fix thp split handling in memory_failure()

When memory_failure() runs on a thp tail page after pmd is split, we
trigger the following VM_BUG_ON_PAGE():

   page:ffffd7cd819b0040 count:0 mapcount:0 mapping:         (null) index:0x1
   flags: 0x1fffc000400000(hwpoison)
   page dumped because: VM_BUG_ON_PAGE(!page_count(p))
   ------------[ cut here ]------------
   kernel BUG at /src/linux-dev/mm/memory-failure.c:1132!

memory_failure() passed refcount and page lock from tail page to head
page, which is not needed because we can pass any subpage to
split_huge_page().

Fixes: 61f5d698cc97 ("mm: re-enable THP")
Link: http://lkml.kernel.org/r/1477961577-7183-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

swapfile: fix memory corruption via malformed swapfile

When root activates a swap partition whose header has the wrong
endianness, nr_badpages elements of badpages are swabbed before
nr_badpages has been checked, leading to a buffer overrun of up to 8GB.

This normally is not a security issue because it can only be exploited
by root (more specifically, a process with CAP_SYS_ADMIN or the ability
to modify a swap file/partition), and such a process can already e.g.
modify swapped-out memory of any other userspace process on the system.

Link: http://lkml.kernel.org/r/1477949533-2509-1-git-send-email-jann@thejh.net
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/cma.c: check the max limit for cma allocation

CMA allocation request size is represented by size_t that gets truncated
when same is passed as int to bitmap_find_next_zero_area_off.

We observe that during fuzz testing when cma allocation request is too
high, bitmap_find_next_zero_area_off still returns success due to the
truncation. This leads to kernel crash, as subsequent code assumes that
requested memory is available.

Fail cma allocation in case the request breaches the corresponding cma
region size.

Link: http://lkml.kernel.org/r/1478189211-3467-1-git-send-email-shashim@codeaurora.org
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/bloat-o-meter: fix SIGPIPE

Fix piping output to a program which quickly exits (read: head -n1)

$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux | head -n1
add/remove: 0/0 grow/shrink: 9/60 up/down: 124/-305 (-181)
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr

Link: http://lkml.kernel.org/r/20161028204618.GA29923@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

shmem: fix pageflags after swapping DMA32 object

If shmem_alloc_page() does not set PageLocked and PageSwapBacked, then
shmem_replace_page() needs to do so for itself. Without this, it puts
newpage on the wrong lru, re-unlocks the unlocked newpage, and system
descends into "Bad page" reports and freeze; or if CONFIG_DEBUG_VM=y, it
hits an earlier VM_BUG_ON_PAGE(!PageLocked), depending on config.

But shmem_replace_page() is not a common path: it's only called when
swapin (or swapoff) finds the page was already read into an unsuitable
zone: usually all zones are suitable, but gem objects for a few drm
devices (gma500, omapdrm, crestline, broadwater) require zone DMA32 if
there's more than 4GB of ram.

Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1611062003510.11253@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [4.8.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, frontswap: make sure allocated frontswap map is assigned

Christian Borntraeger reports:

With commit 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to
static key") kmemleak complains about a memory leak in swapon

    unreferenced object 0x3e09ba56000 (size 32112640):
      comm "swapon", pid 7852, jiffies 4294968787 (age 1490.770s)
      hex dump (first 32 bytes):
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
         __vmalloc_node_range+0x194/0x2d8
         vzalloc+0x58/0x68
         SyS_swapon+0xd60/0x12f8
         system_call+0xd6/0x270

Turns out kmemleak is right.  We now allocate the frontswap map
depending on the kernel config (and no longer on the enablement)

  swapfile.c:
  [...]
      if (IS_ENABLED(CONFIG_FRONTSWAP))
                frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long));

but later on this is passed along
  --> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map);

and ignored if frontswap is disabled
  --> frontswap_init(p->type, frontswap_map);

  static inline void frontswap_init(unsigned type, unsigned long *map)
  {
        if (frontswap_enabled())
                __frontswap_init(type, map);
  }

Thing is, that frontswap map is never freed.

The leakage is relatively not that bad, because swapon is an infrequent
and privileged operation.  However, if the first frontswap backend is
registered after a swap type has been already enabled, it will WARN_ON
in frontswap_register_ops() and frontswap will not be available for the
swap type.

Fix this by making sure the map is assigned by frontswap_init() as long
as CONFIG_FRONTSWAP is enabled.

Fixes: 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key")
Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: remove extra newline from allocation stall warning

Commit 63f53dea0c98 ("mm: warn about allocations which stall for too
long") by error embedded "\n" in the format string, resulting in strange
output.

  [  722.876655] kworker/0:1: page alloction stalls for 160001ms, order:0
  [  722.876656] , mode:0x2400000(GFP_NOIO)
  [  722.876657] CPU: 0 PID: 6966 Comm: kworker/0:1 Not tainted 4.8.0+ #69

Link: http://lkml.kernel.org/r/1476026219-7974-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'kvm-arm-for-v4.9-rc4' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM updates for v4.9-rc4

- Kick the vcpu when a pending interrupt becomes pending again
- Prevent access to invalid interrupt registers
- Invalid TLBs when two vcpus from the same VM share a CPU

PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails

Consider two devices, A and B, where B is a child of A, and B utilizes
asynchronous suspend (it does not matter whether A is sync or async). If
B fails to suspend_noirq() or suspend_late(), or is interrupted by a
wakeup (pm_wakeup_pending()), then it aborts and sets the async_error
variable. However, device A does not (immediately) check the async_error
variable; it may continue to run its own suspend_noirq()/suspend_late()
callback. This is bad.

We can resolve this problem by doing our error and wakeup checking
(particularly, for the async_error flag) after waiting for children to
suspend, instead of before. This also helps align the logic for the noirq and
late suspend cases with the logic in __device_suspend().

It's easy to observe this erroneous behavior by, for example, forcing a
device to sleep a bit in its suspend_noirq() (to ensure the parent is
waiting for the child to complete), then return an error, and watch the
parent suspend_noirq() still get called. (Or similarly, fake a wakeup
event at the right (or is it wrong?) time.)

Fixes: de377b397272 (PM / sleep: Asynchronous threads for suspend_late)
Fixes: 28b6fd6e3779 (PM / sleep: Asynchronous threads for suspend_noirq)
Reported-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

splice: remove detritus from generic_file_splice_read()

i_size check is a leftover from the horrors that used to play with
the page cache in that function. With the switch to ->read_iter(),
it's neither needed nor correct - for gfs2 it ends up being buggy,
since i_size is not guaranteed to be correct until later (inside
->read_iter()).

Spotted-by: Abhi Das <adas@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge tag 'imx-drm-fixes-2016-11-10' of git://git.pengutronix.de/git/pza/linux into drm-fixes

imx-drm: fix possible hangup when disabling crtcs

- only ever disable the display controller (DC) module after all plane
  IDMAC channels are stopped. This fixes a regression introduced by the
  atomic modeset conversion.

* tag 'imx-drm-fixes-2016-11-10' of git://git.pengutronix.de/git/pza/linux:
  drm/imx: disable planes before DC

Merge branch 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Regression fix for powerplay on some iceland boards.

* 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/powerplay: implement get_clock_by_type for iceland.
  drm/amd/powerplay/smu7: fix checks in smu7_get_evv_voltages (v2)
  drm/amd/powerplay: update phm_get_voltage_evv_on_sclk for iceland
  drm/amd/powerplay: propagate errors in phm_get_voltage_evv_on_sclk

drm/udl: make control msg static const. (v2)

Thou shall not send control msg from the stack,
does that mean I can send it from the RO memory area?

and it looks like the answer is no, so here's
v2 which kmemdups.

Reported-by: poma
Tested-by: poma <poma@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

PCI: VMD: Update filename to reflect move

Updating MAINTAINERS to reflect the new location of the VMD driver.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

libceph: initialize last_linger_id with a large integer

osdc->last_linger_id is a counter for lreq->linger_id, which is used
for watch cookies. Starting with a large integer should ease the task
of telling apart kernel and userspace clients.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: fix legacy layout decode with pool 0

If your data pool was pool 0, ceph_file_layout_from_legacy()
transform that to -1 unconditionally, which broke upgrades.
We only want do that for a fully zeroed ceph_file_layout,
so that it still maps to a file_layout_t. If any fields
are set, though, we trust the fl_pgpool to be a valid pool.

Fixes: 7627151ea30bc ("libceph: define new ceph_file_layout structure")
Link: http://tracker.ceph.com/issues/17825
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: use default file splice read callback

Splice read/write implementation changed recently. When using
generic_file_splice_read(), iov_iter with type == ITER_PIPE is
passed to filesystem's read_iter callback. But ceph_sync_read()
can't serve ITER_PIPE iov_iter correctly (ITER_PIPE iov_iter
expects pages from page cache).

Fixing ceph_sync_read() requires a big patch. So use default
splice read callback for now.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

drm/amd/powerplay: implement get_clock_by_type for iceland.

iceland use pptable v0.

bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=185681
https://bugs.freedesktop.org/show_bug.cgi?id=98357

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge remote-tracking branch 'mkp-scsi/4.9/scsi-fixes' into fixes

arm64: dts: rockchip: add three new resets for rk3399 PCIe controller

pm_rst, aclk_rst and pclk_rst should be controlled by driver, so we
need to add these three resets for PCIe controller.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>

PCI: rockchip: Add three new resets as required properties

pm_rst, aclk_rst, pclk_rst was controlled by ROM code so the software
wasn't needed to control it again in theory. But it didn't work properly,
so we do need to do it again and add enough delay between the assert of
pm_rst and the deassert of pm_rst. The Soc intergrated with this
controller, rk3399, is still under MP test internally, so the backward
compatibility won't be a big deal.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Rob Herring <robh@kernel.org>

drm/amd/powerplay/smu7: fix checks in smu7_get_evv_voltages (v2)

Only check if the tables exist in relevant configs. This
fixes a failure on V0 tables.

v2: fix version check as suggested by Rex

bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=185681
https://bugs.freedesktop.org/show_bug.cgi?id=98357

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: update phm_get_voltage_evv_on_sclk for iceland

Was missing the handling for iceland.

bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=185681
https://bugs.freedesktop.org/show_bug.cgi?id=98357

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: propagate errors in phm_get_voltage_evv_on_sclk

Missing for one case.

bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=185681
https://bugs.freedesktop.org/show_bug.cgi?id=98357

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect

When a LOCALINV WR is flushed, the frmr is marked STALE, then
frwr_op_unmap_sync DMA-unmaps the frmr's SGL. These STALE frmrs
are then recovered when frwr_op_map hunts for an INVALID frmr to
use.

All other cases that need frmr recovery leave that SGL DMA-mapped.
The FRMR recovery path unconditionally DMA-unmaps the frmr's SGL.

To avoid DMA unmapping the SGL twice for flushed LOCAL_INV WRs,
alter the recovery logic (rather than the hot frwr_op_unmap_sync
path) to distinguish among these cases. This solution also takes
care of the case where multiple LOCAL_INV WRs are issued for the
same rpcrdma_req, some complete successfully, but some are flushed.

Reported-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

ppdev: fix double-free of pp->pdev->name

free_pardevice() is called by parport_unregister_device() and already frees
pp->pdev->name, don't try to do it again.

This bug causes kernel crashes.

I found and verified this with KASAN and some added pr_emerg()s:

[   60.316568] pp_release: pp->pdev->name == ffff88039cb264c0
[   60.316692] free_pardevice: freeing par_dev->name at ffff88039cb264c0
[   60.316706] pp_release: kfree(ffff88039cb264c0)
[   60.316714] ==========================================================
[   60.316722] BUG: Double free or freeing an invalid pointer
[   60.316731] Unexpected shadow byte: 0xFB
[   60.316801] Object at ffff88039cb264c0, in cache kmalloc-32 size: 32
[   60.316813] Allocated:
[   60.316824] PID = 1695
[   60.316869] Freed:
[   60.316880] PID = 1695
[   60.316935] ==========================================================

Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: cdc-acm: fix TIOCMIWAIT

The TIOCMIWAIT implementation would return -EINVAL if any of the three
supported signals were included in the mask.

Instead of returning an error in case TIOCM_CTS is included, simply
drop the mask check completely, which is in accordance with how other
drivers implement this ioctl.

Fixes: 5a6a62bdb925 ("cdc-acm: add TIOCMIWAIT")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drbd: Fix kernel_sendmsg() usage - potential NULL deref

Don't pass a size larger than iov_len to kernel_sendmsg().
Otherwise it will cause a NULL pointer deref when kernel_sendmsg()
returns with rv < size.

DRBD as external module has been around in the kernel 2.4 days already.
We used to be compatible to 2.4 and very early 2.6 kernels,
we used to use
rv = sock_sendmsg(sock, &msg, iov.iov_len);
then later changed to
rv = kernel_sendmsg(sock, &msg, &iov, 1, size);
when we should have used
rv = kernel_sendmsg(sock, &msg, &iov, 1, iov.iov_len);

tcp_sendmsg() used to totally ignore the size parameter.
57be5bd ip: convert tcp_sendmsg() to iov_iter primitives
changes that, and exposes our long standing error.

Even with this error exposed, to trigger the bug, we would need to have
an environment (config or otherwise) causing us to not use sendpage()
for larger transfers, a failing connection, and have it fail "just at the
right time". Apparently that was unlikely enough for most, so this went
unnoticed for years.

Still, it is known to trigger at least some of these,
and suspected for the others:
[0] http://lists.linbit.com/pipermail/drbd-user/2016-July/023112.html
[1] http://lists.linbit.com/pipermail/drbd-dev/2016-March/003362.html
[2] https://forums.grsecurity.net/viewtopic.php?f=3&t=4546
[3] https://ubuntuforums.org/showthread.php?t=2336150
[4] http://e2.howsolveproblem.com/i/1175162/

This should go into 4.9,
and into all stable branches since and including v4.0,
which is the first to contain the exposing change.

It is correct for all stable branches older than that as well
(which contain the DRBD driver; which is 2.6.33 and up).

It requires a small "conflict" resolution for v4.4 and earlier, with v4.5
we dropped the comment block immediately preceding the kernel_sendmsg().

Fixes: b411b3637fa7 ("The DRBD driver")
Cc: <stable@vger.kernel.org> # 2.6.33.x-
Cc: viro@zeniv.linux.org.uk
Cc: christoph.lechleitner@iteg.at
Cc: wolfgang.glas@iteg.at
Reported-by: Christoph Lechleitner <christoph.lechleitner@iteg.at>
Tested-by: Christoph Lechleitner <christoph.lechleitner@iteg.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
[changed oneliner to be "obvious" without context; more verbose message]
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

ACPI / platform: Add support for build-in properties

We have a couple of drivers, acpi_apd.c and acpi_lpss.c,
that need to pass extra build-in properties to the devices
they create. Previously the drivers added those properties
to the struct device which is member of the struct
acpi_device, but that does not work. Those properties need
to be assigned to the struct device of the platform device
instead in order for them to become available to the
drivers.

To fix this, this patch changes acpi_create_platform_device
function to take struct property_entry pointer as parameter.

Fixes: 20a875e2e86e (serial: 8250_dw: Add quirk for APM X-Gene SoC)
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Tested-by: Jérôme de Bretagne <jerome.debretagne@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Merge branch 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

3 more amdgpu fixes.

* 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/powerplay: return false instead of -EINVAL
  drm/amdgpu/powerplay/smu7: fix unintialized data usage
  drm/amdgpu: fix crash in acp_hw_fini

Merge tag 'drm-intel-fixes-2016-11-09' of git://anongit.freedesktop.org/drm-intel into drm-fixes

i915 fixes, include Sandybridge rendering regression fix.

* tag 'drm-intel-fixes-2016-11-09' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Limit Valleyview and earlier to only using mappable scanout
  drm/i915: Round tile chunks up for constructing partial VMAs
  drm/i915/dp: Extend BDW DP audio workaround to GEN9 platforms
  drm/i915/dp: BDW cdclk fix for DP audio
  drm/i915/vlv: Prevent enabling hpd polling in late suspend
  drm/i915: Respect alternate_ddc_pin for all DDI ports