review.tizen.org Git - platform/kernel/linux-starfive.git/log

scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns()

When gate_work/ungate_work experience an error during hibern8_enter or exit
we can livelock:

ufshcd_err_handler()
   ufshcd_scsi_block_requests()
   ufshcd_reset_and_restore()
     ufshcd_clear_ua_wluns() -> stuck
   ufshcd_scsi_unblock_requests()

In order to avoid this, ufshcd_clear_ua_wluns() can be called per recovery
flows such as suspend/resume, link_recovery, and error_handler.

Link: https://lore.kernel.org/r/20210107185316.788815-2-jaegeuk@kernel.org
Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd resets")
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Fix missing cast of ibmvfc_event pointer to u64 handle

Commit 2aa0102c6688 ("scsi: ibmvfc: Use correlation token to tag commands")
sets the vfcFrame correlation token to the pointer handle of the associated
ibmvfc_event. However, that commit failed to cast the pointer to an
appropriate type which in this case is a u64. As such sparse warnings are
generated for both correlation token assignments.

ibmvfc.c:2375:36: sparse: incorrect type in argument 1 (different base types)
ibmvfc.c:2375:36: sparse: expected unsigned long long [usertype] val
ibmvfc.c:2375:36: sparse: got struct ibmvfc_event *[assigned] evt

Add the appropriate u64 casts when assigning an ibmvfc_event as a
correlation token.

Link: https://lore.kernel.org/r/20210106203721.1054693-1-tyreld@linux.ibm.com
Fixes: 2aa0102c6688 ("scsi: ibmvfc: Use correlation token to tag commands")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: ufshcd-pltfrm depends on HAS_IOMEM

Building ufshcd-pltfrm.c on arch/s390/ has a linker error since S390 does
not support IOMEM, so add a dependency on HAS_IOMEM.

s390-linux-ld: drivers/scsi/ufs/ufshcd-pltfrm.o: in function `ufshcd_pltfrm_init':
ufshcd-pltfrm.c:(.text+0x38e): undefined reference to `devm_platform_ioremap_resource'

where that devm_ function is inside an #ifdef CONFIG_HAS_IOMEM/#endif
block.

Link: lore.kernel.org/r/202101031125.ZEFCUiKi-lkp@intel.com
Link: https://lore.kernel.org/r/20210106040822.933-1-rdunlap@infradead.org
Fixes: 03b1781aa978 ("[SCSI] ufs: Add Platform glue driver for ufshcd")
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: linux-scsi@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression

Phil Oester reported that a fix for a possible buffer overrun that I sent
caused a regression that manifests in this output:

Event Message: A PCI parity error was detected on a component at bus 0 device 5 function 0.
Severity: Critical
Message ID: PCI1308

The original code tried to handle the sense data pointer differently when
using 32-bit 64-bit DMA addressing, which would lead to a 32-bit dma_addr_t
value of 0x11223344 to get stored

32-bit kernel:       44 33 22 11 ?? ?? ?? ??
64-bit LE kernel:    44 33 22 11 00 00 00 00
64-bit BE kernel:    00 00 00 00 44 33 22 11

or a 64-bit dma_addr_t value of 0x1122334455667788 to get stored as

32-bit kernel:       88 77 66 55 ?? ?? ?? ??
64-bit kernel:       88 77 66 55 44 33 22 11

In my patch, I tried to ensure that the same value is used on both 32-bit
and 64-bit kernels, and picked what seemed to be the most sensible
combination, storing 32-bit addresses in the first four bytes (as 32-bit
kernels already did), and 64-bit addresses in eight consecutive bytes (as
64-bit kernels already did), but evidently this was incorrect.

Always storing the dma_addr_t pointer as 64-bit little-endian,
i.e. initializing the second four bytes to zero in case of 32-bit
addressing, apparently solved the problem for Phil, and is consistent with
what all 64-bit little-endian machines did before.

I also checked in the history that in previous versions of the code, the
pointer was always in the first four bytes without padding, and that
previous attempts to fix 64-bit user space, big-endian architectures and
64-bit DMA were clearly flawed and seem to have introduced made this worse.

Link: https://lore.kernel.org/r/20210104234137.438275-1-arnd@kernel.org
Fixes: 381d34e376e3 ("scsi: megaraid_sas: Check user-provided offsets")
Fixes: 107a60dd71b5 ("scsi: megaraid_sas: Add support for 64bit consistent DMA")
Fixes: 94cd65ddf4d7 ("[SCSI] megaraid_sas: addded support for big endian architecture")
Fixes: 7b2519afa1ab ("[SCSI] megaraid_sas: fix 64 bit sense pointer truncation")
Reported-by: Phil Oester <kernel@linuxace.com>
Tested-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode

Update sysfs documentation for addition of DeepSleep power mode.

Link: https://lore.kernel.org/r/20210104155026.16417-1-adrian.hunter@intel.com
Fixes: fe1d4c2ebcae ("scsi: ufs: Add DeepSleep feature")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd: Remove obsolete variable in sd_remove()

Commit 996e509bbc95 ("sd: use __register_blkdev to avoid a modprobe for an
unregistered dev_t") removed blk_register_region(devt, ...) in sd_remove()
and since then, devt is unused in sd_remove().

Hence, make W=1 warns:

drivers/scsi/sd.c:3516:8:
warning: variable 'devt' set but not used [-Wunused-but-set-variable]

Simply remove this obsolete variable.

[mkp: fixed commit sha]

Link: https://lore.kernel.org/r/20201214095424.12479-1-lukas.bulwahn@gmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd: Suppress spurious errors when WRITE SAME is being disabled

The block layer code will split a large zeroout request into multiple bios
and if WRITE SAME is disabled because the storage device reports that it
does not support it (or support the length used), we can get an error
message from the block layer despite the setting of RQF_QUIET on the first
request. This is because more than one request may have already been
submitted.

Fix this by setting RQF_QUIET when BLK_STS_TARGET is returned to fail the
request early, we don't need to log a message because we did not actually
submit the command to the device, and the block layer code will handle the
error by submitting individual write bios.

Link: https://lore.kernel.org/r/20201207221021.28243-1-emilne@redhat.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: scsi_debug: Fix memleak in scsi_debug_init()

When sdeb_zbc_model does not match BLK_ZONED_NONE, BLK_ZONED_HA or
BLK_ZONED_HM, we should free sdebug_q_arr to prevent memleak. Also there is
no need to execute sdebug_erase_store() on failure of sdeb_zbc_model_str().

Link: https://lore.kernel.org/r/20201226061503.20050-1-dinghao.liu@zju.edu.cn
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> "compatibility"

There is a spelling mistake in the Kconfig help text. Fix it.

Link: https://lore.kernel.org/r/20201217172019.57768-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedi: Correct max length of CHAP secret

The CHAP secret displayed garbage characters causing iSCSI login
authentication failure. Correct the CHAP password max length.

Link: https://lore.kernel.org/r/20201217105144.8055-1-njavali@marvell.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback

Users can initiate resets to specific SCSI device/target/host through
IOCTL. When this happens, the SCSI cmd passed to eh_device/target/host
_reset_handler() callbacks is initialized with a request whose tag is -1.
In this case it is not right for eh_device_reset_handler() callback to
count on the LUN get from hba->lrb[-1]. Fix it by getting LUN from the SCSI
device associated with the SCSI cmd.

Link: https://lore.kernel.org/r/1609157080-26283-1-git-send-email-cang@codeaurora.org
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Relocate flush of exceptional event

The current flush location does not guarantee disabling BKOPS for the case
of requesting device power off.

1) The exceptional event handler is queued

2) ufs suspend starts with a request of device power off

3) BKOPS is disabled in ufs suspend

4) The queued work for the handler is done and BKOPS is re-enabled

Relocate the flush statement to ensure BKOPS remain disabled.

Link: https://lore.kernel.org/r/1608360039-16390-1-git-send-email-kwmad.kim@samsung.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL

UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL is intended to skip enabling
fWriteBoosterBufferFlushEn while WriteBooster is initializing. Therefore
it is better to apply the checking during WriteBooster initialization only.

Link: https://lore.kernel.org/r/20201222072905.32221-3-stanley.chu@mediatek.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Fix possible power drain during system suspend

Currently if device needs to do flush or BKOP operations, the device VCC
power is kept during runtime-suspend period.

However, if system suspend is happening while device is runtime-suspended,
such power may not be disabled successfully.

The reasons may be,

1. If current PM level is the same as SPM level, device will keep
   runtime-suspended by ufshcd_system_suspend().

2. Flush recheck work may not be scheduled successfully during system
   suspend period. If it can wake up the system, this is also not the
   intention of the recheck work.

To fix this issue, simply runtime-resume the device if the flush is allowed
during runtime suspend period. Flush capability will be disabled while
leaving runtime suspend, and also not be allowed in system suspend period.

Link: https://lore.kernel.org/r/20201222072905.32221-2-stanley.chu@mediatek.com
Fixes: 51dd905bd2f6 ("scsi: ufs: Fix WriteBooster flush during runtime suspend")
Reviewed-by: Chaotian Jing <chaotian.jing@mediatek.com>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge branch '5.11/scsi-postmerge' into 5.11/scsi-fixes

Merge two commits that had dependencies on other 5.11 trees (the block
and the irq trees respectively).

- We reverted a megaraid_sas change in 5.10 due to missing block
layer plumbing. Now that this is in place, reinstate the change.

- The hisi_sas driver had a dependency on a driver core irq change
that went in through Thomas' tree.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Linux 5.11-rc2

Merge tag 's390-5.11-3' of git://git./linux/kernel/git/s390/linux

Pull s390 cleanups from Vasily Gorbik:
"Update defconfigs and sort config select list"

* tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/Kconfig: sort config S390 select list once again
s390: update defconfigs

Merge tag 'pm-5.11-rc2' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix a crash in intel_pstate during resume from suspend-to-RAM
  that may occur after recent changes and two resource leaks in error
  paths in the operating performance points (OPP) framework, add a new
  C-states table to intel_idle and update the cpuidle MAINTAINERS entry
  to cover the governors too.

  Specifics:

   - Fix recently introduced crash in the intel_pstate driver that
     occurs if scale-invariance is disabled during resume from
     suspend-to-RAM due to inconsistent changes of APERF or MPERF MSR
     values made by the platform firmware (Rafael Wysocki).

   - Fix a memory leak and add a missing clk_put() in error paths in the
     OPP framework (Quanyang Wang, Viresh Kumar).

   - Add new C-states table for SnowRidge processors to the intel_idle
     driver (Artem Bityutskiy).

   - Update the MAINTAINERS entry for cpuidle to make it clear that the
     governors are covered by it too (Lukas Bulwahn)"

* tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_idle: add SnowRidge C-state table
  cpufreq: intel_pstate: Fix fast-switch fallback path
  opp: Call the missing clk_put() on error
  opp: fix memory leak in _allocate_opp_table
  MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK

Merge branches 'pm-cpufreq' and 'pm-cpuidle'

* pm-cpufreq:
  cpufreq: intel_pstate: Fix fast-switch fallback path

* pm-cpuidle:
  intel_idle: add SnowRidge C-state table
  MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK

Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).

  The big core two fixes are for power management ("block: Do not accept
  any requests while suspended" and "block: Fix a race in the runtime
  power management code") which finally sorts out the resume problems
  we've occasionally been having.

  To make the resume fix, there are seven necessary precursors which
  effectively renames REQ_PREEMPT to REQ_PM, so every "special" request
  in block is automatically a power management exempt one.

  All of the non-PM preempt cases are removed except for the one in the
  SCSI Parallel Interface (spi) domain validation which is a genuine
  case where we have to run requests at high priority to validate the
  bus so this becomes an autopm get/put protected request"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits)
  scsi: cxgb4i: Fix TLS dependency
  scsi: ufs: Un-inline ufshcd_vops_device_reset function
  scsi: ufs: Re-enable WriteBooster after device reset
  scsi: ufs-mediatek: Use correct path to fix compile error
  scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
  scsi: block: Do not accept any requests while suspended
  scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
  scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
  scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
  scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
  scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
  scsi: block: Introduce BLK_MQ_REQ_PM
  scsi: block: Fix a race in the runtime power management code
  scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
  scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
  scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
  scsi: ufs-pci: Fix restore from S4 for Intel controllers
  scsi: ufs-mediatek: Keep VCC always-on for specific devices
  scsi: ufs: Allow regulators being always-on
  scsi: ufs: Clear UAC for RPMB after ufshcd resets
  ...

Merge tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Two minor block fixes from this last week that should go into 5.11:

   - Add missing NOWAIT debugfs definition (Andres)

   - Fix kerneldoc warning introduced this merge window (Randy)"

* tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
  block: add debugfs stanza for QUEUE_FLAG_NOWAIT
  fs: block_dev.c: fix kernel-doc warnings from struct block_device changes

Merge tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
"A few fixes that should go into 5.11, all marked for stable as well:

   - Fix issue around identity COW'ing and users that share a ring
     across processes

   - Fix a hang associated with unregistering fixed files (Pavel)

   - Move the 'process is exiting' cancelation a bit earlier, so
     task_works aren't affected by it (Pavel)"

* tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
  kernel/io_uring: cancel io_uring before task works
  io_uring: fix io_sqe_files_unregister() hangs
  io_uring: add a helper for setting a ref node
  io_uring: don't assume mm is constant across submits

depmod: handle the case of /sbin/depmod without /sbin in PATH

Commit 436e980e2ed5 ("kbuild: don't hardcode depmod path") stopped
hard-coding the path of depmod, but in the process caused trouble for
distributions that had that /sbin location, but didn't have it in the
PATH (generally because /sbin is limited to the super-user path).

Work around it for now by just adding /sbin to the end of PATH in the
depmod.sh script.

Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/io_uring: cancel io_uring before task works

For cancelling io_uring requests it needs either to be able to run
currently enqueued task_works or having it shut down by that moment.
Otherwise io_uring_cancel_files() may be waiting for requests that won't
ever complete.

Go with the first way and do cancellations before setting PF_EXITING and
so before putting the task_work infrastructure into a transition state
where task_work_run() would better not be called.

Cc: stable@vger.kernel.org # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

io_uring: fix io_sqe_files_unregister() hangs

io_sqe_files_unregister() uninterruptibly waits for enqueued ref nodes,
however requests keeping them may never complete, e.g. because of some
userspace dependency. Make sure it's interruptible otherwise it would
hang forever.

Cc: stable@vger.kernel.org # 5.6+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

io_uring: add a helper for setting a ref node

Setting a new reference node to a file data is not trivial, don't repeat
it, add and use a helper.

Cc: stable@vger.kernel.org # 5.6+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
"A fix for an edge case in MClientRequest encoding and a couple of
  trivial fixups for the new msgr2 support"

* tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client:
  libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE
  libceph: align session_key and con_secret to 16 bytes
  libceph: fix auth_signature buffer allocation in secure mode
  ceph: reencode gid_list when reconnecting

intel_idle: add SnowRidge C-state table

Add C-state table for the SnowRidge SoC which is found on Intel Jacobsville
platforms.

The following has been changed.

1. C1E latency changed from 10us to 15us. It was measured using the
    open source "wult" tool (the "nic" method, 15us is the 99.99th
    percentile).

2. C1E power break even changed from 20us to 25us, which may result
    in less C1E residency in some workloads.

3. C6 latency changed from 50us to 130us. Measured the same way as C1E.

The C6 C-state is supported only by some SnowRidge revisions, so add a C-state
table commentary about this.

On SnowRidge, C6 support is enumerated via the usual mechanism: "mwait" leaf of
the "cpuid" instruction. The 'intel_idle' driver does check this leaf, so even
though C6 is present in the table, the driver will only use it if the CPU does
support it.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: intel_pstate: Fix fast-switch fallback path

When sugov_update_single_perf() falls back to the "frequency"
path due to the missing scale-invariance, it will call
cpufreq_driver_fast_switch() via sugov_fast_switch()
and the driver's ->fast_switch() callback will be invoked,
so it must not be NULL.

However, after commit a365ab6b9dfb ("cpufreq: intel_pstate: Implement
the ->adjust_perf() callback") intel_pstate sets ->fast_switch() to
NULL when it is going to use intel_cpufreq_adjust_perf(), which is a
mistake, because on x86 the scale-invariance may be turned off
dynamically, so modify it to retain the original ->adjust_perf()
callback pointer.

Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Merge branch 'opp/linux-next' of git://git./linux/kernel/git/vireshk/pm

Pull operating performance points (OPP) framework fixes for 5.11-rc2
from Viresh Kumar:

"This contains two patches to fix freeing of resources in error paths."

* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Call the missing clk_put() on error
opp: fix memory leak in _allocate_opp_table

s390/Kconfig: sort config S390 select list once again

...and add comments at the top and bottom.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390: update defconfigs

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

block: add debugfs stanza for QUEUE_FLAG_NOWAIT

This was missed in 021a24460dc2. Leads to the numeric value of
QUEUE_FLAG_NOWAIT (i.e. 29) showing up in
/sys/kernel/debug/block/*/state.

Fixes: 021a24460dc28e7412aecfae89f60e1847e685c0
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fs: block_dev.c: fix kernel-doc warnings from struct block_device changes

Fix new kernel-doc warnings in fs/block_dev.c:

../fs/block_dev.c:1066: warning: Excess function parameter 'whole' description in 'bd_abort_claiming'
../fs/block_dev.c:1837: warning: Function parameter or member 'dev' not described in 'lookup_bdev'

Fixes: 4e7b5671c6a8 ("block: remove i_bdev")
Fixes: 37c3fc9abb25 ("block: simplify the block device claiming interface")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-fsdevel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"16 patches

  Subsystems affected by this patch series: mm (selftests, hugetlb,
  pagecache, mremap, kasan, and slub), kbuild, checkpatch, misc, and
  lib"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: slub: call account_slab_page() after slab page initialization
  zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c
  lib/zlib: fix inflating zlib streams on s390
  lib/genalloc: fix the overflow when size is too big
  kdev_t: always inline major/minor helper functions
  sizes.h: add SZ_8G/SZ_16G/SZ_32G macros
  local64.h: make <asm/local64.h> mandatory
  kasan: fix null pointer dereference in kasan_record_aux_stack
  mm: generalise COW SMC TLB flushing race comment
  mm/mremap.c: fix extent calculation
  mm: memmap defer init doesn't work as expected
  mm: add prototype for __add_to_page_cache_locked()
  checkpatch: prefer strscpy to strlcpy
  Revert "kbuild: avoid static_assert for genksyms"
  mm/hugetlb: fix deadlock in hugetlb_cow error path
  selftests/vm: fix building protection keys test

mm: slub: call account_slab_page() after slab page initialization

It's convenient to have page->objects initialized before calling into
account_slab_page(). In particular, this information can be used to
pre-alloc the obj_cgroup vector.

Let's call account_slab_page() a bit later, after the initialization of
page->objects.

This commit doesn't bring any functional change, but is required for
further optimizations.

[akpm@linux-foundation.org: undo changes needed by forthcoming mm-memcg-slab-pre-allocate-obj_cgroups-for-slab-caches-with-slab_account.patch]

Link: https://lkml.kernel.org/r/20201110195753.530157-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c

In commit 11fb479ff5d9 ("zlib: export S390 symbols for zlib modules"), I
added EXPORT_SYMBOL()s to dfltcc_inflate.c but then Mikhail said that
these should probably be in dfltcc_syms.c with the other
EXPORT_SYMBOL()s.

However, that is contrary to the current kernel style, which places
EXPORT_SYMBOL() immediately after the function that it applies to, so
move all EXPORT_SYMBOL()s to their respective function locations and
drop the dfltcc_syms.c file. Also move MODULE_LICENSE() from the
deleted file to dfltcc.c.

[rdunlap@infradead.org: remove dfltcc_syms.o from Makefile]
Link: https://lkml.kernel.org/r/20201227171837.15492-1-rdunlap@infradead.org
Link: https://lkml.kernel.org/r/20201219052530.28461-1-rdunlap@infradead.org
Fixes: 11fb479ff5d9 ("zlib: export S390 symbols for zlib modules")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Zaslonko Mikhail <zaslonko@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/zlib: fix inflating zlib streams on s390

Decompressing zlib streams on s390 fails with "incorrect data check"
error.

Userspace zlib checks inflate_state.flags in order to byteswap checksums
only for zlib streams, and s390 hardware inflate code, which was ported
from there, tries to match this behavior. At the same time, kernel zlib
does not use inflate_state.flags, so it contains essentially random
values. For many use cases either zlib stream is zeroed out or checksum
is not used, so this problem is masked, but at least SquashFS is still
affected.

Fix by always passing a checksum to and from the hardware as is, which
matches zlib_inflate()'s expectations.

Link: https://lkml.kernel.org/r/20201215155551.894884-1-iii@linux.ibm.com
Fixes: 126196100063 ("lib/zlib: add s390 hardware support for kernel zlib_inflate")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Cc: <stable@vger.kernel.org> [5.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/genalloc: fix the overflow when size is too big

Some graphic card has very big memory on chip, such as 32G bytes.

In the following case, it will cause overflow:

    pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
    ret = gen_pool_add(pool, 0x1000000, SZ_32G, NUMA_NO_NODE);

    va = gen_pool_alloc(pool, SZ_4G);

The overflow occurs in gen_pool_alloc_algo_owner():

....
size = nbits << order;
....

The @nbits is "int" type, so it will overflow.
Then the gen_pool_avail() will return the wrong value.

This patch converts some "int" to "unsigned long", and
changes the compare code in while.

Link: https://lkml.kernel.org/r/20201229060657.3389-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Reported-by: Shi Jiasheng <jiasheng.shi@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kdev_t: always inline major/minor helper functions

Silly GCC doesn't always inline these trivial functions.

Fixes the following warning:

arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled

Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> [build-tested]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sizes.h: add SZ_8G/SZ_16G/SZ_32G macros

Add these macros, since we can use them in drivers.

Link: https://lkml.kernel.org/r/20201229072819.11183-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

local64.h: make <asm/local64.h> mandatory

Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and
remove all arch/*/include/asm/local64.h arch-specific files since they
only #include <asm-generic/local64.h>.

This fixes build errors on arch/c6x/ and arch/nios2/ for
block/blk-iocost.c.

Build-tested on 21 of 25 arch-es. (tools problems on the others)

Yes, we could even rename <asm-generic/local64.h> to
<linux/local64.h> and change all #includes to use
<linux/local64.h> instead.

Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: fix null pointer dereference in kasan_record_aux_stack

Syzbot reported the following [1]:

  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 2d993067 P4D 2d993067 PUD 19a3c067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP KASAN
  CPU: 1 PID: 3852 Comm: kworker/1:2 Not tainted 5.10.0-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Workqueue: events free_ipc
  RIP: 0010:kasan_record_aux_stack+0x77/0xb0

Add null checking slab object from kasan_get_alloc_meta() in order to
avoid null pointer dereference.

[1] https://syzkaller.appspot.com/x/log.txt?x=10a82a50d00000

Link: https://lkml.kernel.org/r/20201228080018.23041-1-walter-zh.wu@mediatek.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: generalise COW SMC TLB flushing race comment

I'm not sure if I'm completely missing something here, but AFAIKS the
reference to the mysterious "COW SMC race" confuses the issue. The
original changelog and mailing list thread didn't help me either.

This SMC race is where the problem was detected, but isn't the general
problem bigger and more obvious: that the new PTE could be picked up at
any time by any TLB while entries for the old PTE exist in other TLBs
before the TLB flush takes effect?

The case where the iTLB and dTLB of a CPU are pointing at different pages
is an interesting one but follows from the general problem.

The other (minor) thing with the comment I think it makes it a bit clearer
to say what the old code was doing (i.e., it avoids the race as opposed to
what?).

References: 4ce072f1faf29 ("mm: fix a race condition under SMC + COW")
Link: https://lkml.kernel.org/r/20201215121119.351650-1-npiggin@gmail.com
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mremap.c: fix extent calculation

When `next < old_addr`, `next - old_addr` arithmetic underflows causing
`extent` to be incorrect.

Make `extent` the smaller of `next - old_addr` or `old_end - old_addr`.

Link: https://lkml.kernel.org/r/20201219170433.2418867-1-kaleshsingh@google.com
Fixes: c49dd34018026 ("mm: speedup mremap on 1GB or larger regions")
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: memmap defer init doesn't work as expected

VMware observed a performance regression during memmap init on their
platform, and bisected to commit 73a6e474cb376 ("mm: memmap_init:
iterate over memblock regions rather that check each PFN") causing it.

Before the commit:

  [0.033176] Normal zone: 1445888 pages used for memmap
  [0.033176] Normal zone: 89391104 pages, LIFO batch:63
  [0.035851] ACPI: PM-Timer IO Port: 0x448

With commit

  [0.026874] Normal zone: 1445888 pages used for memmap
  [0.026875] Normal zone: 89391104 pages, LIFO batch:63
  [2.028450] ACPI: PM-Timer IO Port: 0x448

The root cause is the current memmap defer init doesn't work as expected.

Before, memmap_init_zone() was used to do memmap init of one whole zone,
to initialize all low zones of one numa node, but defer memmap init of
the last zone in that numa node.  However, since commit 73a6e474cb376,
function memmap_init() is adapted to iterater over memblock regions
inside one zone, then call memmap_init_zone() to do memmap init for each
region.

E.g, on VMware's system, the memory layout is as below, there are two
memory regions in node 2.  The current code will mistakenly initialize the
whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to
iniatialize only one memmory section on the 2nd region [mem
0x10000000000-0x1033fffffff].  In fact, we only expect to see that there's
only one memory section's memmap initialized.  That's why more time is
costed at the time.

[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff]
[    0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff]
[    0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff]
[    0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff]
[    0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff]

Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass
down the real zone end pfn so that defer_init() can use it to judge
whether defer need be taken in zone wide.

Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com
Fixes: commit 73a6e474cb376 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Rahul Gopakumar <gopakumarr@vmware.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add prototype for __add_to_page_cache_locked()

Otherwise it causes a gcc warning:

mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes]

A previous attempt to make this function static led to compilation
errors when CONFIG_DEBUG_INFO_BTF is enabled because
__add_to_page_cache_locked() is referred to by BPF code.

Adding a prototype will silence the warning.

Link: https://lkml.kernel.org/r/1608693702-4665-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch: prefer strscpy to strlcpy

Prefer strscpy over the deprecated strlcpy function.

Link: https://lkml.kernel.org/r/19fe91084890e2c16fe56f960de6c570a93fa99b.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "kbuild: avoid static_assert for genksyms"

This reverts commit 14dc3983b5dff513a90bd5a8cc90acaf7867c3d0.

Macro Elver had sent a fix proper fix earlier, and also pointed out
corner cases:
"I guess what you propose is simpler, but might still have corner cases
  where we still get warnings. In particular, if some file (for whatever
  reason) does not include build_bug.h and uses a raw _Static_assert(),
  then we still get warnings. E.g. I see 1 user of raw _Static_assert()
  (drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h )."

I believe the raw use of _Static_assert() should be allowed, so this
should be fixed in genksyms.

Even after commit 14dc3983b5df ("kbuild: avoid static_assert for
genksyms"), I confirmed the following test code emits the warning.

  ---------------->8----------------
  #include <linux/export.h>

  _Static_assert((1 ?: 0), "");

  void foo(void) { }
  EXPORT_SYMBOL(foo);
  ---------------->8----------------

  WARNING: modpost: EXPORT symbol "foo" [vmlinux] version generation failed, symbol will not be versioned.

Now that commit 869b91992bce ("genksyms: Ignore module scoped
_Static_assert()") fixed this issue properly, the workaround should
be reverted.

Link: https://lkml.org/lkml/2020/12/10/845
Link: https://lkml.kernel.org/r/20201219183911.181442-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/hugetlb: fix deadlock in hugetlb_cow error path

syzbot reported the deadlock here [1].  The issue is in hugetlb cow
error handling when there are not enough huge pages for the faulting
task which took the original reservation.  It is possible that other
(child) tasks could have consumed pages associated with the reservation.
In this case, we want the task which took the original reservation to
succeed.  So, we unmap any associated pages in children so that they can
be used by the faulting task that owns the reservation.

The unmapping code needs to hold i_mmap_rwsem in write mode.  However,
due to commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd
sharing synchronization") we are already holding i_mmap_rwsem in read
mode when hugetlb_cow is called.

Technically, i_mmap_rwsem does not need to be held in read mode for COW
mappings as they can not share pmd's.  Modifying the fault code to not
take i_mmap_rwsem in read mode for COW (and other non-sharable) mappings
is too involved for a stable fix.

Instead, we simply drop the hugetlb_fault_mutex and i_mmap_rwsem before
unmapping.  This is OK as it is technically not needed.  They are
reacquired after unmapping as expected by calling code.  Since this is
done in an uncommon error path, the overhead of dropping and reacquiring
mutexes is acceptable.

While making changes, remove redundant BUG_ON after unmap_ref_private.

[1] https://lkml.kernel.org/r/000000000000b73ccc05b5cf8558@google.com

Link: https://lkml.kernel.org/r/4c5781b8-3b00-761e-c0c7-c5edebb6ec1a@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: syzbot+5eee4145df3c15e96625@syzkaller.appspotmail.com
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

selftests/vm: fix building protection keys test

Commit d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error") tried
to include a ARCH check for powerpc, however ARCH is not defined in the
Makefile before including lib.mk. This makes test building to skip on
both x86 and powerpc.

Fix the arch check by replacing it using machine type as it is already
defined and used in the test.

Link: https://lkml.kernel.org/r/20201215100402.257376-1-harish@linux.ibm.com
Fixes: d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error")
Signed-off-by: Harish <harish@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

io_uring: don't assume mm is constant across submits

If we COW the identity, we assume that ->mm never changes. But this
isn't true of multiple processes end up sharing the ring. Hence treat
id->mm like like any other process compontent when it comes to the
identity mapping. This is pretty trivial, just moving the existing grab
into io_grab_identity(), and including a check for the match.

Cc: stable@vger.kernel.org # 5.10
Fixes: 1e6fa5216a0e ("io_uring: COW io_identity on mismatch")
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>:
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>:
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'for-5.11/dm-fix' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
"Revert WQ_SYSFS change that broke reencryption (and all other
functionality that requires reloading a dm-crypt DM table)"

* tag 'for-5.11/dm-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
Revert "dm crypt: export sysfs of kcryptd workqueue"

Revert "dm crypt: export sysfs of kcryptd workqueue"

This reverts commit a2b8b2d975673b1a50ab0bcce5d146b9335edfad.

WQ_SYSFS breaks the ability to reload a DM table due to sysfs kobject
collision (due to active and inactive table). Given lack of
demonstrated need for exposing this workqueue via sysfs: revert
exposing it.

Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE

Avoid -Wunused-const-variable warnings for "make W=1".

Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: align session_key and con_secret to 16 bytes

crypto_shash_setkey() and crypto_aead_setkey() will do a (small)
GFP_ATOMIC allocation to align the key if it isn't suitably aligned.
It's not a big deal, but at the same time easy to avoid.

The actual alignment requirement is dynamic, queryable with
crypto_shash_alignmask() and crypto_aead_alignmask(), but shouldn't
be stricter than 16 bytes for our algorithms.

Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: fix auth_signature buffer allocation in secure mode

auth_signature frame is 68 bytes in plain mode and 96 bytes in
secure mode but we are requesting 68 bytes in both modes. By luck,
this doesn't actually result in any invalid memory accesses because
the allocation is satisfied out of kmalloc-96 slab and so exactly
96 bytes are allocated, but KASAN rightfully complains.

Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Reported-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: reencode gid_list when reconnecting

On reconnect, cap and dentry releases are dropped and the fields
that follow must be reencoded into the freed space.  Currently these
are timestamp and gid_list, but gid_list isn't reencoded.  This
results in

  failed to decode message of type 24 v4: End of buffer

errors on the MDS.

While at it, make a change to encode gid_list unconditionally,
without regard to what head/which version was used as a result
of checking whether CEPH_FEATURE_FS_BTIME is supported or not.

URL: https://tracker.ceph.com/issues/48618
Fixes: 4f1ddb1ea874 ("ceph: implement updated ceph_mds_request_head structure")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

Merge branch 'for-5.11' of git://git./linux/kernel/git/tj/wq

Pull workqueue update from Tejun Heo:
"The same as the cgroup tree - one commit which was scheduled for the
  5.11 merge window.

  All the commit does is avoding spurious worker wakeups from workqueue
  allocation / config change path to help cpuisol use cases"

* 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Kick a worker based on the actual activation of delayed works

Merge branch 'for-5.11' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
"These three patches were scheduled for the merge window but I forgot
  to send them out. Sorry about that.

  None of them are significant and they fit well in a fix pull request
  too - two are cosmetic and one fixes a memory leak in the mount option
  parsing path"

* 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Fix memory leak when parsing multiple source parameters
  cgroup/cgroup.c: replace 'of->kn->priv' with of_cft()
  kernel: cgroup: Mundane spelling fixes throughout the file

opp: Call the missing clk_put() on error

Fix the clock reference counting by calling the missing clk_put() in the
error path.

Cc: v5.10 <stable@vger.kernel.org> # v5.10
Fixes: dd461cd9183f ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

opp: fix memory leak in _allocate_opp_table

In function _allocate_opp_table, opp_dev is allocated and referenced
by opp_table via _add_opp_dev. But in the case that the subsequent calls
return -EPROBE_DEFER, it will jump to err label and opp_table will be
freed. Then opp_dev becomes an unreferenced object to cause memory leak.
So let's call _remove_opp_dev to do the cleanup.

This fixes the following kmemleak report:

unreferenced object 0xffff000801524a00 (size 128):
  comm "swapper/0", pid 1, jiffies 4294892465 (age 84.616s)
  hex dump (first 32 bytes):
    40 00 56 01 08 00 ff ff 40 00 56 01 08 00 ff ff  @.V.....@.V.....
    b8 52 77 7f 08 00 ff ff 00 3c 4c 00 08 00 ff ff  .Rw......<L.....
  backtrace:
    [<00000000b1289fb1>] kmemleak_alloc+0x30/0x40
    [<0000000056da48f0>] kmem_cache_alloc+0x3d4/0x588
    [<00000000a84b3b0e>] _add_opp_dev+0x2c/0x88
    [<0000000062a380cd>] _add_opp_table_indexed+0x124/0x268
    [<000000008b4c8f1f>] dev_pm_opp_of_add_table+0x20/0x1d8
    [<00000000e5316798>] dev_pm_opp_of_cpumask_add_table+0x48/0xf0
    [<00000000db0a8ec2>] dt_cpufreq_probe+0x20c/0x448
    [<0000000030a3a26c>] platform_probe+0x68/0xd8
    [<00000000c618e78d>] really_probe+0xd0/0x3a0
    [<00000000642e856f>] driver_probe_device+0x58/0xb8
    [<00000000f10f5307>] device_driver_attach+0x74/0x80
    [<0000000004f254b8>] __driver_attach+0x58/0xe0
    [<0000000009d5d19e>] bus_for_each_dev+0x70/0xc8
    [<0000000000d22e1c>] driver_attach+0x24/0x30
    [<0000000001d4e952>] bus_add_driver+0x14c/0x1f0
    [<0000000089928aaa>] driver_register+0x64/0x120

Cc: v5.10 <stable@vger.kernel.org> # v5.10
Fixes: dd461cd9183f ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
[ Viresh: Added the stable tag ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Linux 5.11-rc1

proc mountinfo: make splice available again

Since commit 36e2c7421f02 ("fs: don't allow splice read/write without
explicit ops") we've required that file operation structures explicitly
enable splice support, rather than falling back to the default handlers.

Most /proc files use the indirect 'struct proc_ops' to describe their
file operations, and were fixed up to support splice earlier in commits
40be821d627c..b24c30c67863, but the mountinfo files interact with the
VFS directly using their own 'struct file_operations' and got missed as
a result.

This adds the necessary support for splice to work for /proc/*/mountinfo
and friends.

Reported-by: Joan Bruguera Micó <joanbrugueram@gmail.com>
Reported-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209971
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'ntb-5.11' of git://github.com/jonmason/ntb

Pull NTB fixes from Jon Mason:
"Bug fix for IDT NTB and Intel NTB LTR management support"

* tag 'ntb-5.11' of git://github.com/jonmason/ntb:
ntb: intel: add Intel NTB LTR vendor support for gen4 NTB
ntb: idt: fix error check in ntb_hw_idt.c

Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"Fix a number of autobuild failures due to missing Kconfig
  dependencies"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: qat - add CRYPTO_AES to Kconfig dependencies
  crypto: keembay - Add dependency on HAS_IOMEM
  crypto: keembay - CRYPTO_DEV_KEEMBAY_OCS_AES_SM4 should depend on ARCH_KEEMBAY

Merge tag 'objtool-urgent-2020-12-27' of git://git./linux/kernel/git/tip/tip

Pull objtool fix from Ingo Molnar:
"Fix a segfault that occurs when built with Clang"

* tag 'objtool-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix seg fault with Clang non-section symbols

Merge tag 'locking-urgent-2020-12-27' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:
"Misc fixes/updates:

   - Fix static keys usage in module __init sections

   - Add separate MAINTAINERS entry for static branches/calls

   - Fix lockdep splat with CONFIG_PREEMPTIRQ_EVENTS=y tracing"

* tag 'locking-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  softirq: Avoid bad tracing / lockdep interaction
  jump_label/static_call: Add MAINTAINERS
  jump_label: Fix usage in module __init

Merge tag 'timers-urgent-2020-12-27' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Ingo Molnar:
"Update/fix two CPU sanity checks in the hotplug and the boot code, and
  fix a typo in the Kconfig help text.

  [ Context: the first two commits are the result of an ongoing
    annotation+review work of (intentional) tick_do_timer_cpu() data
    races reported by KCSAN, but the annotations aren't fully cooked
    yet ]"

* tag 'timers-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill"
  tick/sched: Remove bogus boot "safety" check
  tick: Remove pointless cpu valid check in hotplug code

Merge tag 'sched-urgent-2020-12-27' of git://git./linux/kernel/git/tip/tip

Pull scheduler fix from Ingo Molnar:
"Fix a context switch performance regression"

* tag 'sched-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Optimize finish_lock_switch()

mfd: ab8500-debugfs: Remove extraneous seq_putc

Commit c9a3c4e637ac ("mfd: ab8500-debugfs: Remove extraneous curly
brace") removed a left-over curly brace that caused build failures, but
Joe Perches points out that the subsequent 'seq_putc()' should also be
removed, because the commit that caused all these problems already added
the final '\n' to the seq_printf() above it.

Reported-by: Joe Perches <joe@perches.com>
Fixes: 886c8121659d ("mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'pci-v5.11-fixes-1' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

- Fix a tegra enumeration regression (Rob Herring)

- Fix a designware-host check that warned on *success*, not failure
   (Alexander Lobakin)

* tag 'pci-v5.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: dwc: Fix inverted condition of DMA mask setup warning
  PCI: tegra: Fix host link initialization

mfd: ab8500-debugfs: Remove extraneous curly brace

Clang errors:

  drivers/mfd/ab8500-debugfs.c:1526:2: error: non-void function does not return a value [-Werror,-Wreturn-type]
          }
          ^
  drivers/mfd/ab8500-debugfs.c:1528:2: error: expected identifier or '('
  return 0;
          ^
  drivers/mfd/ab8500-debugfs.c:1529:1: error: extraneous closing brace ('}')
  }
  ^
  3 errors generated.

The cleanup in ab8500_interrupts_show left a curly brace around, remove
it to fix the error.

Fixes: 886c8121659d ("mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

PCI: dwc: Fix inverted condition of DMA mask setup warning

Commit 660c486590aa ("PCI: dwc: Set 32-bit DMA mask for MSI target address
allocation") added dma_mask_set() call to explicitly set 32-bit DMA mask
for MSI message mapping, but for now it throws a warning on ret == 0, while
dma_set_mask() returns 0 in case of success.

Fix this by inverting the condition.

[bhelgaas: join string to make it greppable]
Fixes: 660c486590aa ("PCI: dwc: Set 32-bit DMA mask for MSI target address allocation")
Link: https://lore.kernel.org/r/20201222150708.67983-1-alobakin@pm.me
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: tegra: Fix host link initialization

Commit b9ac0f9dc8ea ("PCI: dwc: Move dw_pcie_setup_rc() to DWC common
code") broke enumeration of downstream devices on Tegra:

In non-working case (next-20201211):

  0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
  0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
  0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)

In working case (v5.10-rc7):

  0001:00:00.0 PCI bridge: Molex Incorporated Device 1ad2 (rev a1)
  0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
  0005:00:00.0 PCI bridge: Molex Incorporated Device 1ad0 (rev a1)
  0005:01:00.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
  0005:02:02.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
  0005:03:00.0 USB controller: PLX Technology, Inc. Device 3380 (rev ab)

The problem seems to be dw_pcie_setup_rc() is now called twice before and
after the link up handling. The fix is to move Tegra's link up handling to
.start_link() function like other DWC drivers. Tegra is a bit more
complicated than others as it re-inits the whole DWC controller to retry
the link. With this, the initialization ordering is restored to match the
prior sequence.

Fixes: b9ac0f9dc8ea ("PCI: dwc: Move dw_pcie_setup_rc() to DWC common code")
Link: https://lore.kernel.org/r/20201218143905.1614098-1-robh@kernel.org
Reported-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: Vidya Sagar <vidyas@nvidia.com>

drm/amd/display: avoid uninitialized variable warning

clang (quite rightly) complains fairly loudly about the newly added
mpc1_get_mpc_out_mux() function returning an uninitialized value if the
'opp_id' checks don't pass.

This may not happen in practice, but the code really shouldn't return
garbage if the sanity checks don't pass.

So just initialize 'val' to zero to avoid the issue.

Fixes: 110b055b2827 ("drm/amd/display: add getter routine to retrieve mpcc mux")
Cc: Josip Pavic <Josip.Pavic@amd.com>
Cc: Bindu Ramamurthy <bindu.r@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'perf-tools-2020-12-24' of git://git./linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:

- Refactor 'perf stat' per CPU/socket/die/thread aggregation fixing use
   cases in ARM machines.

- Fix memory leak when synthesizing SDT probes in 'perf probe'.

- Update kernel header copies related to KVM, epol_pwait. msr-index and
   powerpc and s390 syscall tables.

* tag 'perf-tools-2020-12-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (24 commits)
  perf probe: Fix memory leak when synthesizing SDT probes
  perf stat aggregation: Add separate thread member
  perf stat aggregation: Add separate core member
  perf stat aggregation: Add separate die member
  perf stat aggregation: Add separate socket member
  perf stat aggregation: Add separate node member
  perf stat aggregation: Start using cpu_aggr_id in map
  perf cpumap: Drop in cpu_aggr_map struct
  perf cpumap: Add new map type for aggregation
  perf stat: Replace aggregation ID with a struct
  perf cpumap: Add new struct for cpu aggregation
  perf cpumap: Use existing allocator to avoid using malloc
  perf tests: Improve topology test to check all aggregation types
  perf tools: Update s390's syscall.tbl copy from the kernel sources
  perf tools: Update powerpc's syscall.tbl copy from the kernel sources
  perf s390: Move syscall.tbl check into check-headers.sh
  perf powerpc: Move syscall.tbl check to check-headers.sh
  tools headers UAPI: Synch KVM's svm.h header with the kernel
  tools kvm headers: Update KVM headers from the kernel sources
  tools headers UAPI: Sync KVM's vmx.h header with the kernel sources
  ...

Merge branch 'for-5.11' of git://git./linux/kernel/git/jlawall/linux

Pull coccinelle updates from Julia Lawall.

* 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  scripts: coccicheck: Correct usage of make coccicheck
  coccinelle: update expiring email addresses
  coccinnelle: Remove ptr_ret script
  kbuild: do not use scripts/ld-version.sh for checking spatch version
  remove boolinit.cocci

genirq: Fix export of irq_to_desc() for powerpc KVM

Commit 64a1b95bb9fe ("genirq: Restrict export of irq_to_desc()") removed
the export of irq_to_desc() unless powerpc KVM is being built, because
there is still a use of irq_to_desc() in modular code there.

However it used:

#ifdef CONFIG_KVM_BOOK3S_64_HV

Which doesn't work when that symbol is =m, leading to a build failure:

ERROR: modpost: "irq_to_desc" [arch/powerpc/kvm/kvm-hv.ko] undefined!

Fix it by checking for the definedness of the correct symbol which is
CONFIG_KVM_BOOK3S_64_HV_MODULE.

Fixes: 64a1b95bb9fe ("genirq: Restrict export of irq_to_desc()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'work.misc' of git://git./linux/kernel/git/viro/vfs

Pull misc vfs updates from Al Viro:
"Assorted patches from previous cycle(s)..."

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix hostfs_open() use of ->f_path.dentry
  Make sure that make_create_in_sticky() never sees uninitialized value of dir_mode
  fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
  fs: Handle I_DONTCACHE in iput_final() instead of generic_drop_inode()
  fs/namespace.c: WARN if mnt_count has become negative

Merge tag 'docs-5.11-2' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
"A small set of late-arriving, small documentation fixes"

* tag 'docs-5.11-2' of git://git.lwn.net/linux:
  docs: admin-guide: Fix default value of max_map_count in sysctl/vm.rst
  Documentation/submitting-patches: Document the SoB chain
  Documentation: process: Correct numbering
  docs: submitting-patches: Trivial - fix grammatical error

Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
"Various bug fixes and cleanups for ext4; no new features this cycle"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (29 commits)
  ext4: remove unnecessary wbc parameter from ext4_bio_write_page
  ext4: avoid s_mb_prefetch to be zero in individual scenarios
  ext4: defer saving error info from atomic context
  ext4: simplify ext4 error translation
  ext4: move functions in super.c
  ext4: make ext4_abort() use __ext4_error()
  ext4: standardize error message in ext4_protect_reserved_inode()
  ext4: remove redundant sb checksum recomputation
  ext4: don't remount read-only with errors=continue on reboot
  ext4: fix deadlock with fs freezing and EA inodes
  jbd2: add a helper to find out number of fast commit blocks
  ext4: make fast_commit.h byte identical with e2fsprogs/fast_commit.h
  ext4: fix fall-through warnings for Clang
  ext4: add docs about fast commit idempotence
  ext4: remove the unused EXT4_CURRENT_REV macro
  ext4: fix an IS_ERR() vs NULL check
  ext4: check for invalid block size early when mounting a file system
  ext4: fix a memory leak of ext4_free_data
  ext4: delete nonsensical (commented-out) code inside ext4_xattr_block_set()
  ext4: update ext4_data_block_valid related comments
  ...

Merge tag 'Smack-for-5.11-io_uring-fix' of git://github.com/cschaufler/smack-next

Pull smack fix from Casey Schaufler:
"Provide a fix for the incorrect handling of privilege in the face of
  io_uring's use of kernel threads. That invalidated an long standing
  assumption regarding the privilege of kernel threads.

  The fix is simple and safe. It was provided by Jens Axboe and has been
  tested"

* tag 'Smack-for-5.11-io_uring-fix' of git://github.com/cschaufler/smack-next:
  Smack: Handle io_uring kernel thread privileges

Merge tag 'riscv-for-linus-5.11-rc1' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fix from Palmer Dabbelt
"Avoid trying to initialize memory regions outside the usable range"

* tag 'riscv-for-linus-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Fix usage of memblock_enforce_memory_limit

Merge tag 'powerpc-5.11-2' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

- Four commits fixing various things in the new C VDSO code

- One fix for a 32-bit VMAP stack bug

- Two minor build fixes

Thanks to Cédric Le Goater, Christophe Leroy, and Will Springer.

* tag 'powerpc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/32: Fix vmap stack - Properly set r1 before activating MMU on syscall too
  powerpc/vdso: Fix DOTSYM for 32-bit LE VDSO
  powerpc/vdso: Don't pass 64-bit ABI cflags to 32-bit VDSO
  powerpc/vdso: Block R_PPC_REL24 relocations
  powerpc/smp: Add __init to init_big_cores()
  powerpc/time: Force inlining of get_tb()
  powerpc/boot: Fix build of dts/fsl

Merge tag 'irq-core-2020-12-23' of git://git./linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
"This is the second attempt after the first one failed miserably and
  got zapped to unblock the rest of the interrupt related patches.

  A treewide cleanup of interrupt descriptor (ab)use with all sorts of
  racy accesses, inefficient and disfunctional code. The goal is to
  remove the export of irq_to_desc() to prevent these things from
  creeping up again"

* tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  genirq: Restrict export of irq_to_desc()
  xen/events: Implement irq distribution
  xen/events: Reduce irq_info:: Spurious_cnt storage size
  xen/events: Only force affinity mask for percpu interrupts
  xen/events: Use immediate affinity setting
  xen/events: Remove disfunct affinity spreading
  xen/events: Remove unused bind_evtchn_to_irq_lateeoi()
  net/mlx5: Use effective interrupt affinity
  net/mlx5: Replace irq_to_desc() abuse
  net/mlx4: Use effective interrupt affinity
  net/mlx4: Replace irq_to_desc() abuse
  PCI: mobiveil: Use irq_data_get_irq_chip_data()
  PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
  NTB/msi: Use irq_has_action()
  mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc
  pinctrl: nomadik: Use irq_has_action()
  drm/i915/pmu: Replace open coded kstat_irqs() copy
  drm/i915/lpe_audio: Remove pointless irq_to_desc() usage
  s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt()
  parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts()
  ...

Merge tag 'efi_updates_for_v5.11' of git://git./linux/kernel/git/tip/tip

Pull EFI updates from Borislav Petkov:
"These got delayed due to a last minute ia64 build issue which got
  fixed in the meantime.

  EFI updates collected by Ard Biesheuvel:

   - Don't move BSS section around pointlessly in the x86 decompressor

   - Refactor helper for discovering the EFI secure boot mode

   - Wire up EFI secure boot to IMA for arm64

   - Some fixes for the capsule loader

   - Expose the RT_PROP table via the EFI test module

   - Relax DT and kernel placement restrictions on ARM

  with a few followup fixes:

   - fix the build breakage on IA64 caused by recent capsule loader
     changes

   - suppress a type mismatch build warning in the expansion of
     EFI_PHYS_ALIGN on ARM"

* tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: arm: force use of unsigned type for EFI_PHYS_ALIGN
  efi: ia64: disable the capsule loader
  efi: stub: get rid of efi_get_max_fdt_addr()
  efi/efi_test: read RuntimeServicesSupported
  efi: arm: reduce minimum alignment of uncompressed kernel
  efi: capsule: clean scatter-gather entries from the D-cache
  efi: capsule: use atomic kmap for transient sglist mappings
  efi: x86/xen: switch to efi_get_secureboot_mode helper
  arm64/ima: add ima_arch support
  ima: generalize x86/EFI arch glue for other EFI architectures
  efi: generalize efi_get_secureboot
  efi/libstub: EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER should not default to yes
  efi/x86: Only copy the compressed kernel image in efi_relocate_kernel()
  efi/libstub/x86: simplify efi_is_native()

Merge tag 'io_uring-5.11-2020-12-23' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
"All straight fixes, or a prep patch for a fix, either bound for stable
  or fixing issues from this merge window. In particular:

   - Fix new shutdown op not breaking links on failure

   - Hold mm->mmap_sem for mm->locked_vm manipulation

   - Various cancelation fixes (me, Pavel)

   - Fix error path potential double ctx free (Pavel)

   - IOPOLL fixes (Xiaoguang)"

* tag 'io_uring-5.11-2020-12-23' of git://git.kernel.dk/linux-block:
  io_uring: hold uring_lock while completing failed polled io in io_wq_submit_work()
  io_uring: fix double io_uring free
  io_uring: fix ignoring xa_store errors
  io_uring: end waiting before task cancel attempts
  io_uring: always progress task_work on task cancel
  io-wq: kill now unused io_wq_cancel_all()
  io_uring: make ctx cancel on exit targeted to actual ctx
  io_uring: fix 0-iov read buffer select
  io_uring: close a small race gap for files cancel
  io_uring: fix io_wqe->work_list corruption
  io_uring: limit {io|sq}poll submit locking scope
  io_uring: inline io_cqring_mark_overflow()
  io_uring: consolidate CQ nr events calculation
  io_uring: remove racy overflow list fast checks
  io_uring: cancel reqs shouldn't kill overflow list
  io_uring: hold mmap_sem for mm->locked_vm manipulation
  io_uring: break links on shutdown failure

Merge tag 'block-5.11-2020-12-23' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A few stragglers in here, but mostly just straight fixes. In
  particular:

   - Set of rnbd fixes for issues around changes for the merge window
     (Gioh, Jack, Md Haris Iqbal)

   - iocost tracepoint addition (Baolin)

   - Copyright/maintainers update (Christoph)

   - Remove old blk-mq fast path CPU warning (Daniel)

   - loop max_part fix (Josh)

   - Remote IPI threaded IRQ fix (Sebastian)

   - dasd stable fixes (Stefan)

   - bcache merge window fixup and style fixup (Yi, Zheng)"

* tag 'block-5.11-2020-12-23' of git://git.kernel.dk/linux-block:
  md/bcache: convert comma to semicolon
  bcache:remove a superfluous check in register_bcache
  block: update some copyrights
  block: remove a pointless self-reference in block_dev.c
  MAINTAINERS: add fs/block_dev.c to the block section
  blk-mq: Don't complete on a remote CPU in force threaded mode
  s390/dasd: fix list corruption of lcu list
  s390/dasd: fix list corruption of pavgroup group list
  s390/dasd: prevent inconsistent LCU device data
  s390/dasd: fix hanging device offline processing
  blk-iocost: Add iocg idle state tracepoint
  nbd: Respect max_part for all partition scans
  block/rnbd-clt: Does not request pdu to rtrs-clt
  block/rnbd-clt: Dynamically allocate sglist for rnbd_iu
  block/rnbd: Set write-back cache and fua same to the target device
  block/rnbd: Fix typos
  block/rnbd-srv: Protect dev session sysfs removal
  block/rnbd-clt: Fix possible memleak
  block/rnbd-clt: Get rid of warning regarding size argument in strlcpy
  blk-mq: Remove 'running from the wrong CPU' warning

Merge tag 'libnvdimm-for-5.11' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
"Twas the day before Christmas and the only thing stirring in libnvdimm
  / device-dax land is a pile of miscellaneous fixups and cleanups.

  The bulk of it has appeared in -next save the last two patches to
  device-dax that have passed my build and unit tests.

   - Fix a long standing block-window-namespace issue surfaced by the
     ndctl change to attempt to preserve the kernel device name over
     a 'reconfigure'

   - Fix a few error path memory leaks in nfit and device-dax

   - Silence a smatch warning in the ioctl path

   - Miscellaneous cleanups"

* tag 'libnvdimm-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: Avoid an unnecessary check in alloc_dev_dax_range()
  device-dax: Fix range release
  device-dax: delete a redundancy check in dev_dax_validate_align()
  libnvdimm/label: Return -ENXIO for no slot in __blk_label_update
  device-dax/core: Fix memory leak when rmmod dax.ko
  device-dax/pmem: Convert comma to semicolon
  libnvdimm: Cleanup include of badblocks.h
  ACPI: NFIT: Fix input validation of bus-family
  libnvdimm/namespace: Fix reaping of invalidated block-window-namespace labels
  ACPI/nfit: avoid accessing uninitialized memory in acpi_nfit_ctl()

Merge tag 'drm-next-2020-12-24' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Xmas eve pull request present.

  Just some fixes that trickled in this past week: Mostly amdgpu fixes,
  with a dma-buf/mips build fix and some misc komeda fixes.

  dma-buf:
   - fix build on mips

  komeda:
   - fix commit tail operation order
   - NULL pointer fix
   - out of bounds access fix

  ttm:
   - remove an unused function

  amdgpu:
   - Vangogh SMU fixes
   - Arcturus gfx9 fixes
   - Misc display fixes
   - Sienna Cichlid SMU update
   - Fix S3 display memory leak
   - Fix regression caused by DP sub-connector support

  amdkfd:
   - Properly require pcie atomics for gfx10"

* tag 'drm-next-2020-12-24' of git://anongit.freedesktop.org/drm/drm: (31 commits)
  drm/amd/display: Fix memory leaks in S3 resume
  drm/amdgpu: Fix a copy-pasta comment
  drm/amdgpu: only set DP subconnector type on DP and eDP connectors
  drm/amd/pm: bump Sienna Cichlid smu_driver_if version to match latest pmfw
  drm/amd/display: add getter routine to retrieve mpcc mux
  drm/amd/display: always program DPPDTO unless not safe to lower
  drm/amd/display: [FW Promotion] Release 0.0.47
  drm/amd/display: updated wm table for Renoir
  drm/amd/display: Acquire DSC during split stream for ODM only if top_pipe
  drm/amd/display: Multi-display underflow observed
  drm/amd/display: Remove unnecessary NULL check
  drm/amd/display: Update RN/VGH active display count workaround
  drm/amd/display: change SMU repsonse timeout to 2s.
  drm/amd/display: gradually ramp ABM intensity
  drm/amd/display: To modify the condition in indicating branch device
  drm/amd/display: Modify the hdcp device count check condition
  drm/amd/display: Interfaces for hubp blank and soft reset
  drm/amd/display: handler not correctly checked at remove_irq_handler
  drm/amdgpu: check gfx pipe availability before toggling its interrupts
  drm/amdgpu: remove unnecessary asic type check
  ...

Merge tag 'devicetree-fixes-for-5.11-1' of git://git./linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

- Correct the JSON pointer syntax in binding schemas

- Drop unnecessary *-supply schema constraints

- Drop redundant maxItems/items on array schemas

- Fix various yamllint warnings

- Fix various missing 'additionalProperties' properties

* tag 'devicetree-fixes-for-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: Drop redundant maxItems/items
  dt-bindings: net: qcom,ipa: Drop unnecessary type ref on 'memory-region'
  dt-bindings: Drop unnecessary *-supply schemas properties
  dt-bindings/display: abt,y030xx067a: Fix binding
  dt-bindings: clock: imx8qxp-lpcg: eliminate yamllint warnings
  dt-bindings: display: eliminate yamllint warnings
  dt-bindings: media: nokia,smia: eliminate yamllint warnings
  dt-bindings: devapc: add the required property 'additionalProperties'
  dt-bindings: soc: add the required property 'additionalProperties'
  dt-bindings: serial: add the required property 'additionalProperties'
  dt-bindings: xlnx,vcu-settings: fix dt_binding_check warnings
  media: dt-bindings: coda: Add missing 'additionalProperties'
  dt-bindings: Fix JSON pointers

Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:

- vdpa sim refactoring

- virtio mem: Big Block Mode support

- misc cleanus, fixes

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (61 commits)
  vdpa: Use simpler version of ida allocation
  vdpa: Add missing comment for virtqueue count
  uapi: virtio_ids: add missing device type IDs from OASIS spec
  uapi: virtio_ids.h: consistent indentions
  vhost scsi: fix error return code in vhost_scsi_set_endpoint()
  virtio_ring: Fix two use after free bugs
  virtio_net: Fix error code in probe()
  virtio_ring: Cut and paste bugs in vring_create_virtqueue_packed()
  tools/virtio: add barrier for aarch64
  tools/virtio: add krealloc_array
  tools/virtio: include asm/bug.h
  vdpa/mlx5: Use write memory barrier after updating CQ index
  vdpa: split vdpasim to core and net modules
  vdpa_sim: split vdpasim_virtqueue's iov field in out_iov and in_iov
  vdpa_sim: make vdpasim->buffer size configurable
  vdpa_sim: use kvmalloc to allocate vdpasim->buffer
  vdpa_sim: set vringh notify callback
  vdpa_sim: add set_config callback in vdpasim_dev_attr
  vdpa_sim: add get_config callback in vdpasim_dev_attr
  vdpa_sim: make 'config' generic and usable for any device type
  ...

Merge branch 'for-5.11/dax' into for-5.11/libnvdimm

Pull in miscellaneous device-dax fixups and cleanups for v5.11.

device-dax: Avoid an unnecessary check in alloc_dev_dax_range()

Swap the calling sequence of krealloc() and __request_region(), call the
latter first. In this way, the value of dev_dax->nr_range does not need to
be considered when __request_region() failed.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20201219081840.1149-2-thunder.leizhen@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

device-dax: Fix range release

There are multiple locations that open-code the release of the last
range in a device-dax instance. Consolidate this into a new
dev_dax_trim_range() helper.

This also addresses a kmemleak report:

# cat /sys/kernel/debug/kmemleak
[..]
unreferenced object 0xffff976bd46f6240 (size 64):
   comm "ndctl", pid 23556, jiffies 4299514316 (age 5406.733s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00  .......... .7...
     ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00  ....8...........
   backtrace:
     [<00000000064003cf>] __kmalloc_track_caller+0x136/0x379
     [<00000000d85e3c52>] krealloc+0x67/0x92
     [<00000000d7d3ba8a>] __alloc_dev_dax_range+0x73/0x25c
     [<0000000027d58626>] devm_create_dev_dax+0x27d/0x416
     [<00000000434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core]
     [<0000000083726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem]
     [<00000000b5f2319c>] nvdimm_bus_probe+0x9d/0x340 [libnvdimm]
     [<00000000c055e544>] really_probe+0x230/0x48d
     [<000000006cabd38e>] driver_probe_device+0x122/0x13b
     [<0000000029c7b95a>] device_driver_attach+0x5b/0x60
     [<0000000053e5659b>] bind_store+0xb7/0xc3
     [<00000000d3bdaadc>] drv_attr_store+0x27/0x31
     [<00000000949069c5>] sysfs_kf_write+0x4a/0x57
     [<000000004a8b5adf>] kernfs_fop_write+0x150/0x1e5
     [<00000000bded60f0>] __vfs_write+0x1b/0x34
     [<00000000b92900f0>] vfs_write+0xd8/0x1d1

Reported-by: Jane Chu <jane.chu@oracle.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/160834570161.1791850.14911670304441510419.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

perf probe: Fix memory leak when synthesizing SDT probes

The argv_split() function must be paired with argv_free(), else we must
keep a reference to the argv array received or do the freeing ourselves,
in synthesize_sdt_probe_command() we were simply leaking that argv[]
array.

Fixes: 3b1f8311f6963cd1 ("perf probe: Add sdt probes arguments into the uprobe cmd string")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: He Zhe <zhe.he@windriver.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201224135139.GF477817@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat aggregation: Add separate thread member

A separate field isn't strictly required. The core field could be
re-used for thread IDs as a single field was used previously.

But separating them will avoid confusion and catch potential errors
where core IDs are read as thread IDs and vice versa.

Also remove the placeholder id field which is now no longer used.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20201126141328.6509-13-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat aggregation: Add separate core member

Add core as a separate member so that it doesn't have to be packed into
the int value.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20201126141328.6509-12-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat aggregation: Add separate die member

Add die as a separate member so that it doesn't have to be packed into
the int value.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20201126141328.6509-11-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>