Dennis Zhou (Facebook) [Fri, 31 Aug 2018 20:22:43 +0000 (16:22 -0400)]
blkcg: delay blkg destruction until after writeback has finished
Currently, blkcg destruction relies on a sequence of events:
1. Destruction starts. blkcg_css_offline() is called and blkgs
release their reference to the blkcg. This immediately destroys
the cgwbs (writeback).
2. With blkgs giving up their reference, the blkcg ref count should
become zero and eventually call blkcg_css_free() which finally
frees the blkcg.
Jiufei Xue reported that there is a race between blkcg_bio_issue_check()
and cgroup_rmdir(). To remedy this, blkg destruction becomes contingent
on the completion of all writeback associated with the blkcg. A count of
the number of cgwbs is maintained and once that goes to zero, blkg
destruction can follow. This should prevent premature blkg destruction
related to writeback.
The new process for blkcg cleanup is as follows:
1. Destruction starts. blkcg_css_offline() is called which offlines
writeback. Blkg destruction is delayed on the cgwb_refcnt count to
avoid punting potentially large amounts of outstanding writeback
to root while maintaining any ongoing policies. Here, the base
cgwb_refcnt is put back.
2. When the cgwb_refcnt becomes zero, blkcg_destroy_blkgs() is called
and handles destruction of blkgs. This is where the css reference
held by each blkg is released.
3. Once the blkcg ref count goes to zero, blkcg_css_free() is called.
This finally frees the blkg.
It seems in the past blk-throttle didn't do the most understandable
things with taking data from a blkg while associating with current. So,
the simplification and unification of what blk-throttle is doing caused
this.
Fixes:
08e18eab0c579 ("block: add bi_blkg to the bio for cgroups")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Dennis Zhou (Facebook) [Fri, 31 Aug 2018 20:22:42 +0000 (16:22 -0400)]
Revert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()"
This reverts commit
4c6994806f708559c2812b73501406e21ae5dcd0.
Destroying blkgs is tricky because of the nature of the relationship. A
blkg should go away when either a blkcg or a request_queue goes away.
However, blkg's pin the blkcg to ensure they remain valid. To break this
cycle, when a blkcg is offlined, blkgs put back their css ref. This
eventually lets css_free() get called which frees the blkcg.
The above commit (
4c6994806f70) breaks this order of events by trying to
destroy blkgs in css_free(). As the blkgs still hold references to the
blkcg, css_free() is never called.
The race between blkcg_bio_issue_check() and cgroup_rmdir() will be
addressed in the following patch by delaying destruction of a blkg until
all writeback associated with the blkcg has been finished.
Fixes:
4c6994806f70 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 29 Aug 2018 17:05:20 +0000 (11:05 -0600)]
Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph.
* 'nvme-4.19' of git://git.infradead.org/nvme:
nvmet: free workqueue object if module init fails
nvme-fcloop: Fix dropped LS's to removed target port
nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
Scott Bauer [Thu, 26 Apr 2018 17:51:08 +0000 (11:51 -0600)]
cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
Like
d88b6d04: "cdrom: information leak in cdrom_ioctl_media_changed()"
There is another cast from unsigned long to int which causes
a bounds check to fail with specially crafted input. The value is
then used as an index in the slot array in cdrom_slot_status().
Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Scott Bauer <sbauer@plzdonthack.me>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Chaitanya Kulkarni [Thu, 16 Aug 2018 01:48:25 +0000 (18:48 -0700)]
nvmet: free workqueue object if module init fails
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
James Smart [Thu, 9 Aug 2018 23:00:14 +0000 (16:00 -0700)]
nvme-fcloop: Fix dropped LS's to removed target port
When a targetport is removed from the config, fcloop will avoid calling
the LS done() routine thinking the targetport is gone. This leaves the
initiator reset/reconnect hanging as it waits for a status on the
Create_Association LS for the reconnect.
Change the filter in the LS callback path. If tport null (set when
failed validation before "sending to remote port"), be sure to call
done. This was the main bug. But, continue the logic that only calls
done if tport was set but there is no remoteport (e.g. case where
remoteport has been removed, thus host doesn't expect a completion).
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Michal Wnukowski [Wed, 15 Aug 2018 22:51:57 +0000 (15:51 -0700)]
nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
In many architectures loads may be reordered with older stores to
different locations. In the nvme driver the following two operations
could be reordered:
- Write shadow doorbell (dbbuf_db) into memory.
- Read EventIdx (dbbuf_ei) from memory.
This can result in a potential race condition between driver and VM host
processing requests (if given virtual NVMe controller has a support for
shadow doorbell). If that occurs, then the NVMe controller may decide to
wait for MMIO doorbell from guest operating system, and guest driver may
decide not to issue MMIO doorbell on any of subsequent commands.
This issue is purely timing-dependent one, so there is no easy way to
reproduce it. Currently the easiest known approach is to run "Oracle IO
Numbers" (orion) that is shipped with Oracle DB:
orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
concat -write 40 -duration 120 -matrix row -testname nvme_test
Where nvme_test is a .lun file that contains a list of NVMe block
devices to run test against. Limiting number of vCPUs assigned to given
VM instance seems to increase chances for this bug to occur. On test
environment with VM that got 4 NVMe drives and 1 vCPU assigned the
virtual NVMe controller hang could be observed within 10-20 minutes.
That correspond to about 400-500k IO operations processed (or about
100GB of IO read/writes).
Orion tool was used as a validation and set to run in a loop for 36
hours (equivalent of pushing 550M IO operations). No issues were
observed. That suggest that the patch fixes the issue.
Fixes:
f9f38e33389c ("nvme: improve performance for virtual NVMe devices")
Signed-off-by: Michal Wnukowski <wnukowski@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: updated changelog and comment a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
John Pittman [Mon, 27 Aug 2018 18:33:05 +0000 (14:33 -0400)]
block: bsg: move atomic_t ref_count variable to refcount API
Currently, variable ref_count within the bsg_device struct is of
type atomic_t. For variables being used as reference counters,
the refcount API should be used instead of atomic. The newer
refcount API works to prevent counter overflows and use-after-free
bugs. So, move this varable from the atomic API to refcount,
potentially avoiding the issues mentioned.
Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Chengguang Xu [Mon, 27 Aug 2018 23:31:11 +0000 (07:31 +0800)]
block: remove unnecessary condition check
kmem_cache_destroy() can handle NULL pointer correctly, so there is
no need to check e->icq_cache before calling kmem_cache_destroy().
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Walleij [Sun, 15 Jul 2018 20:09:29 +0000 (22:09 +0200)]
ata: ftide010: Add a quirk for SQ201
The DMA is broken on this specific device for some unknown
reason (probably badly designed or plain broken interface
electronics) and will only work with PIO. Other users of
the same hardware does not have this problem.
Add a specific quirk so that this Gemini device gets
DMA turned off. Also fix up some code around passing the
port information around in probe while we're at it.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 27 Aug 2018 19:32:12 +0000 (13:32 -0600)]
blk-wbt: remove dead code
We already note and mark discard and swap IO from bio_to_wbt_flags().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 27 Aug 2018 17:27:32 +0000 (11:27 -0600)]
Merge branch 'stable/for-jens-4.19' of git://git./linux/kernel/git/konrad/xen into for-linus
Pull Xen block driver fixes from Konrad:
"Fix for flushing out persistent pages at a deterministic rate"
* 'stable/for-jens-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/blkback: remove unused pers_gnts_lock from struct xen_blkif_ring
xen/blkback: move persistent grants flags to bool
xen/blkfront: reorder tests in xlblk_init()
xen/blkfront: cleanup stale persistent grants
xen/blkback: don't keep persistent grants too long
Jens Axboe [Sun, 26 Aug 2018 16:10:05 +0000 (10:10 -0600)]
blk-wbt: improve waking of tasks
We have two potential issues:
1) After commit
2887e41b910b, we only wake one process at the time when
we finish an IO. We really want to wake up as many tasks as can
queue IO. Before this commit, we woke up everyone, which could cause
a thundering herd issue.
2) A task can potentially consume two wakeups, causing us to (in
practice) miss a wakeup.
Fix both by providing our own wakeup function, which stops
__wake_up_common() from waking up more tasks if we fail to get a
queueing token. With the strict ordering we have on the wait list, this
wakes the right tasks and the right amount of tasks.
Based on a patch from Jianchao Wang <jianchao.w.wang@oracle.com>.
Tested-by: Agarwal, Anchal <anchalag@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Sun, 26 Aug 2018 16:09:06 +0000 (10:09 -0600)]
blk-wbt: abstract out end IO completion handler
Prep patch for calling the handler from a different context,
no functional changes in this patch.
Tested-by: Agarwal, Anchal <anchalag@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Juergen Gross [Mon, 13 Aug 2018 14:01:14 +0000 (16:01 +0200)]
xen/blkback: remove unused pers_gnts_lock from struct xen_blkif_ring
pers_gnts_lock isn't being used anywhere. Remove it.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Juergen Gross [Mon, 13 Aug 2018 14:01:13 +0000 (16:01 +0200)]
xen/blkback: move persistent grants flags to bool
The struct persistent_gnt flags member is meant to be a bitfield of
different flags. There is only PERSISTENT_GNT_ACTIVE flag left, so
convert it to a bool named "active".
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Juergen Gross [Mon, 13 Aug 2018 14:01:12 +0000 (16:01 +0200)]
xen/blkfront: reorder tests in xlblk_init()
In case we don't want pv block devices we should not test parameters
for sanity and eventually print out error messages. So test precluding
conditions before checking parameters.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Juergen Gross [Mon, 13 Aug 2018 14:01:11 +0000 (16:01 +0200)]
xen/blkfront: cleanup stale persistent grants
Add a periodic cleanup function to remove old persistent grants which
are no longer in use on the backend side. This avoids starvation in
case there are lots of persistent grants for a device which no longer
is involved in I/O business.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Juergen Gross [Mon, 13 Aug 2018 14:01:10 +0000 (16:01 +0200)]
xen/blkback: don't keep persistent grants too long
Persistent grants are allocated until a threshold per ring is being
reached. Those grants won't be freed until the ring is being destroyed
meaning there will be resources kept busy which might no longer be
used.
Instead of freeing only persistent grants until the threshold is
reached add a timestamp and remove all persistent grants not having
been in use for a minute.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Linus Torvalds [Sat, 25 Aug 2018 20:30:23 +0000 (13:30 -0700)]
Merge tag 'for-linus-
20180825' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few small fixes for this merge window:
- Locking imbalance fix for bcache (Shan Hai)
- A few small fixes for wbt. One is a cleanup/prep, one is a fix for
an existing issue, and the last two are fixes for changes that went
into this merge window (me)"
* tag 'for-linus-
20180825' of git://git.kernel.dk/linux-block:
blk-wbt: don't maintain inflight counts if disabled
blk-wbt: fix has-sleeper queueing check
blk-wbt: use wq_has_sleeper() for wq active check
blk-wbt: move disable check into get_limit()
bcache: release dc->writeback_lock properly in bch_writeback_thread()
Linus Torvalds [Sat, 25 Aug 2018 20:27:35 +0000 (13:27 -0700)]
Merge tag 'upstream-4.19-rc1-fix' of git://git.infradead.org/linux-ubifs
Pull UBIFS fix from Richard Weinberger:
"Remove an empty file from UBIFS source"
* tag 'upstream-4.19-rc1-fix' of git://git.infradead.org/linux-ubifs:
ubifs: Remove empty file.h
Linus Torvalds [Sat, 25 Aug 2018 20:17:53 +0000 (13:17 -0700)]
Merge tag '4.19-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three small SMB3 fixes, one for stable"
* tag '4.19-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number for cifs.ko to 2.12
cifs: check kmalloc before use
cifs: check if SMB2 PDU size has been padded and suppress the warning
cifs: create a define for how many iovs we need for an SMB2_open()
Linus Torvalds [Mon, 9 Jul 2018 20:19:49 +0000 (13:19 -0700)]
mm/cow: don't bother write protecting already write-protected pages
This is not normally noticeable, but repeated forks are unnecessarily
expensive because they repeatedly dirty the parent page tables during
the page table copy operation.
It's trivial to just avoid write protecting the page table entry if it
was already not writable.
This patch was inspired by
https://bugzilla.kernel.org/show_bug.cgi?id=200447
which points to an ancient "waste time re-doing fork" issue in the
presence of lots of signals.
That bug was fixed by Eric Biederman's signal handling series
culminating in commit
c3ad2c3b02e9 ("signal: Don't restart fork when
signals come in"), but the unnecessary work for repeated forks is still
work just fixing, particularly since the fix is trivial.
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Colin Ian King [Sat, 25 Aug 2018 10:24:31 +0000 (12:24 +0200)]
hpfs: remove unnecessary checks on the value of r when assigning error code
At the point where r is being checked for different values, r is always
going to be equal to 2 as the previous if statements jump to end or end1
if r is not 2. Hence the assignment to err can be simplified to just
err an assignment without any checks on the value or r.
Detected by CoverityScan, CID#1226737 ("Logically dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jens Axboe [Sat, 25 Aug 2018 16:10:33 +0000 (10:10 -0600)]
libata: maintainership update
Tejun Heo wrote:
>
> I asked Jens whether he could take care of the libata tree and he
> thankfully agreed, so, from now on, Jens will be the libata
> maintainer.
>
> Thanks a lot!
Thanks for your work in this area. I still remember the first linux
storage summit we did in Vancouver 2001, Tejun was invited to talk about
his libata error handling work. Before that, it was basically a crap
shoot if we recovered properly or not... A lot of water has flown under
the bridge since then!
Here's an "official" patch. Linus, can you apply it?
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 24 Aug 2018 20:20:33 +0000 (13:20 -0700)]
Merge branch 'for-4.19' of git://git./linux/kernel/git/tj/libata
Pull libata updates from Tejun Heo:
"Nothing too interesting. Mostly ahci and ahci_platform changes, many
around power management"
* 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (22 commits)
ata: ahci_platform: enable to get and control reset
ata: libahci_platform: add reset control support
ata: add an extra argument to ahci_platform_get_resources()
ata: sata_rcar: Add r8a77965 support
ata: sata_rcar: exclude setting of PHY registers in Gen3
ata: sata_rcar: really mask all interrupts on Gen2 and later
Revert "ata: ahci_platform: allow disabling of hotplug to save power"
ata: libahci: Allow reconfigure of DEVSLP register
ata: libahci: Correct setting of DEVSLP register
ata: ahci: Enable DEVSLP by default on x86 with SLP_S0
ata: ahci: Support state with min power but Partial low power state
Revert "ata: ahci_platform: convert kcalloc to devm_kcalloc"
ata: sata_rcar: Add rudimentary Runtime PM support
ata: sata_rcar: Provide a short-hand for &pdev->dev
ata: Only output sg element mapped number in verbose debug
ata: Guard ata_scsi_dump_cdb() by ATA_VERBOSE_DEBUG
ata: ahci_platform: convert kcalloc to devm_kcalloc
ata: ahci_platform: convert kzallloc to kcalloc
ata: ahci_platform: correct parameter documentation for ahci_platform_shutdown
libata: remove ata_sff_data_xfer_noirq()
...
Linus Torvalds [Fri, 24 Aug 2018 20:19:27 +0000 (13:19 -0700)]
Merge branch 'for-4.19' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
"Just one commit from Steven to take out spin lock from trace event
handlers"
* 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/tracing: Move taking of spin lock out of trace event handlers
Linus Torvalds [Fri, 24 Aug 2018 20:16:36 +0000 (13:16 -0700)]
Merge branch 'for-4.19' of git://git./linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
"Over the lockdep cross-release churn, workqueue lost some of the
existing annotations. Johannes Berg restored it and also improved
them"
* 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: re-add lockdep dependencies for flushing
workqueue: skip lockdep wq dependency in cancel_work_sync()
Linus Torvalds [Fri, 24 Aug 2018 20:10:38 +0000 (13:10 -0700)]
Merge tag 'iommu-updates-v4.19' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- PASID table handling updates for the Intel VT-d driver. It implements
a global PASID space now so that applications usings multiple devices
will just have one PASID.
- A new config option to make iommu passthroug mode the default.
- New sysfs attribute for iommu groups to export the type of the
default domain.
- A debugfs interface (for debug only) usable by IOMMU drivers to
export internals to user-space.
- R-Car Gen3 SoCs support for the ipmmu-vmsa driver
- The ARM-SMMU now aborts transactions from unknown devices and devices
not attached to any domain.
- Various cleanups and smaller fixes all over the place.
* tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (42 commits)
iommu/omap: Fix cache flushes on L2 table entries
iommu: Remove the ->map_sg indirection
iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel
iommu/arm-smmu-v3: Prevent any devices access to memory without registration
iommu/ipmmu-vmsa: Don't register as BUS IOMMU if machine doesn't have IPMMU-VMSA
iommu/ipmmu-vmsa: Clarify supported platforms
iommu/ipmmu-vmsa: Fix allocation in atomic context
iommu: Add config option to set passthrough as default
iommu: Add sysfs attribyte for domain type
iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register
iommu/arm-smmu: Error out only if not enough context interrupts
iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE
iommu/io-pgtable-arm: Fix pgtable allocation in selftest
iommu/vt-d: Remove the obsolete per iommu pasid tables
iommu/vt-d: Apply per pci device pasid table in SVA
iommu/vt-d: Allocate and free pasid table
iommu/vt-d: Per PCI device pasid table interfaces
iommu/vt-d: Add for_each_device_domain() helper
iommu/vt-d: Move device_domain_info to header
iommu/vt-d: Apply global PASID in SVA
...
Linus Torvalds [Fri, 24 Aug 2018 20:03:51 +0000 (13:03 -0700)]
Merge branch 'next' of git://git./linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
- Add Daniel Lezcano as the reviewer of thermal framework and SoC
driver changes (Daniel Lezcano).
- Fix a bug in intel_dts_soc_thermal driver, which does not translate
IO-APIC GSI (Global System Interrupt) into Linux irq number (Hans de
Goede).
- For device tree bindings, allow cooling devices sharing same trip
point with same contribution value to share cooling map (Viresh
Kumar).
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
dt-bindings: thermal: Allow multiple devices to share cooling map
MAINTAINERS: Add Daniel Lezcano as designated reviewer for thermal
Thermal: Intel SoC DTS: Translate IO-APIC GSI number to linux irq number
Linus Torvalds [Fri, 24 Aug 2018 20:00:33 +0000 (13:00 -0700)]
Merge tag 'apparmor-pr-2018-08-23' of git://git./linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"There is nothing major this time just four bug fixes and a patch to
remove some dead code:
Cleanups:
- remove no-op permission check in policy_unpack
Bug fixes:
- fix an error code in __aa_create_ns()
- fix failure to audit context info in build_change_hat
- check buffer bounds when mapping permissions mask
- fully initialize aa_perms struct when answering userspace query"
* tag 'apparmor-pr-2018-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: remove no-op permission check in policy_unpack
apparmor: fix an error code in __aa_create_ns()
apparmor: Fix failure to audit context info in build_change_hat
apparmor: Fully initialize aa_perms struct when answering userspace query
apparmor: Check buffer bounds when mapping permissions mask
Linus Torvalds [Fri, 24 Aug 2018 16:34:23 +0000 (09:34 -0700)]
Merge tag 'powerpc-4.19-2' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- An implementation for the newly added hv_ops->flush() for the OPAL
hvc console driver backends, I forgot to apply this after merging the
hvc driver changes before the merge window.
- Enable all PCI bridges at boot on powernv, to avoid races when
multiple children of a bridge try to enable it simultaneously. This
is a workaround until the PCI core can be enhanced to fix the races.
- A fix to query PowerVM for the correct system topology at boot before
initialising sched domains, seen in some configurations to cause
broken scheduling etc.
- A fix for pte_access_permitted() on "nohash" platforms.
- Two commits to fix SIGBUS when using remap_pfn_range() seen on Power9
due to a workaround when using the nest MMU (GPUs, accelerators).
- Another fix to the VFIO code used by KVM, the previous fix had some
bugs which caused guests to not start in some configurations.
- A handful of other minor fixes.
Thanks to: Aneesh Kumar K.V, Benjamin Herrenschmidt, Christophe Leroy,
Hari Bathini, Luke Dashjr, Mahesh Salgaonkar, Nicholas Piggin, Paul
Mackerras, Srikar Dronamraju.
* tag 'powerpc-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mce: Fix SLB rebolting during MCE recovery path.
KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pages
powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition
powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.
powerpc/nohash: fix pte_access_permitted()
powerpc/topology: Get topology for shared processors at boot
powerpc64/ftrace: Include ftrace.h needed for enable/disable calls
powerpc/powernv/pci: Work around races in PCI bridge enabling
powerpc/fadump: cleanup crash memory ranges support
powerpc/powernv: provide a console flush operation for opal hvc driver
powerpc/traps: Avoid rate limit messages from show unhandled signals
powerpc/64s: Fix PACA_IRQ_HARD_DIS accounting in idle_power4()
Linus Torvalds [Fri, 24 Aug 2018 16:31:34 +0000 (09:31 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- A couple of patches for the zcrypt driver:
+ Add two masks to determine which AP cards and queues are host
devices, this will be useful for KVM AP device passthrough
+ Add-on patch to improve the parsing of the new apmask and aqmask
+ Some code beautification
- Second try to reenable the GCC plugins, the first patch set had a
patch to do this but the merge somehow missed this
- Remove the s390 specific GCC version check and use the generic one
- Three patches for kdump, two bug fixes and one cleanup
- Three patches for the PCI layer, one bug fix and two cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: remove gcc version check (4.3 or newer)
s390/zcrypt: hex string mask improvements for apmask and aqmask.
s390/zcrypt: AP bus support for alternate driver(s)
s390/zcrypt: code beautify
s390/zcrypt: switch return type to bool for ap_instructions_available()
s390/kdump: Remove kzalloc_panic
s390/kdump: Fix memleak in nt_vmcoreinfo
s390/kdump: Make elfcorehdr size calculation ABI compliant
s390/pci: remove fmb address from debug output
s390/pci: remove stale rc
s390/pci: fix out of bounds access during irq setup
s390/zcrypt: fix ap_instructions_available() returncodes
s390: reenable gcc plugins for real
Linus Torvalds [Fri, 24 Aug 2018 16:29:44 +0000 (09:29 -0700)]
Merge tag 'acpi-4.19-rc1-3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI Kconfig fix from Rafael Wysocki:
"Fix recent menuconfig breakage causing it to present ACPI-specific
options incorrectly (Arnd Bergmann)"
* tag 'acpi-4.19-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: fix menuconfig presentation of ACPI submenu
Linus Torvalds [Fri, 24 Aug 2018 16:25:39 +0000 (09:25 -0700)]
Merge branch 'userns-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull namespace fixes from Eric Biederman:
"This is a set of four fairly obvious bug fixes:
- a switch from d_find_alias to d_find_any_alias because the xattr
code perversely takes a dentry
- two mutex vs copy_to_user fixes from Jann Horn
- a fix to use a sanitized size not the size userspace passed in from
Christian Brauner"
* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
getxattr: use correct xattr length
sys: don't hold uts_sem while accessing userspace memory
userns: move user access out of the mutex
cap_inode_getsecurity: use d_find_any_alias() instead of d_find_alias()
Linus Torvalds [Fri, 24 Aug 2018 16:22:54 +0000 (09:22 -0700)]
Merge tag 'drm-next-2018-08-24' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Just a couple of fixes"
One MAINTAINERS address change, two panels fixes, and set of amdgpu
fixes (build fixes, display fixes and some others)"
* tag 'drm-next-2018-08-24' of git://anongit.freedesktop.org/drm/drm:
drm/edid: Add 6 bpc quirk for SDC panel in Lenovo B50-80
drm/amd/display: Don't build DCN1 when kcov is enabled
Revert "drm/amdgpu/display: Replace CONFIG_DRM_AMD_DC_DCN1_0 with CONFIG_X86"
drm/amdgpu/display: disable eDP fast boot optimization on DCE8
drm/amdgpu: fix amdgpu_amdkfd_remove_eviction_fence v3
drm/amdgpu: fix incorrect use of drm_file->pid
drm/amdgpu: fix incorrect use of fcheck
drm/powerplay: enable dpm under pass-through
drm/amdgpu: access register without KIQ
drm/amdgpu: set correct base for THM/NBIF/MP1 IP
drm/amd/display: fix dentist did ranges
drm/amd/display: make dp_ss_off optional
drm/amd/display: fix dp_ss_control vbios flag parsing
drm/amd/display: Do not retain link settings
MAINTAINERS: drm-misc: Change seanpaul's email address
drm/panel: simple: tv123wam: Add unprepare delay
Linus Torvalds [Fri, 24 Aug 2018 15:57:26 +0000 (08:57 -0700)]
Merge branch 'i2c/for-next' of git://git./linux/kernel/git/wsa/linux
Pull second i2c update from Wolfram Sang:
"As promised, here is my 2nd pull request for I2C, containing:
- removal of the attach_adapter callback, converting its last user
- removal of any __deprecated usage within I2C
- one email address update
- some SPDX conversion"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: don't use any __deprecated handling anymore
i2c: use SPDX identifier for Renesas drivers
i2c: ocores: update my email address
i2c: remove deprecated attach_adapter callback
macintosh: therm_windtunnel: drop using attach_adapter
Linus Torvalds [Fri, 24 Aug 2018 15:45:19 +0000 (08:45 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"virtio, vhost: fixes, tweaks
No new features but a bunch of tweaks such as switching balloon from
oom notifier to shrinker"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048
vhost: allow vhost-scsi driver to be built-in
virtio: pci-legacy: Validate queue pfn
virtio: mmio-v1: Validate queue PFN
virtio_balloon: replace oom notifier with shrinker
virtio-balloon: kzalloc the vb struct
virtio-balloon: remove BUG() in init_vqs
Sedat Dilek [Sun, 19 Aug 2018 13:51:35 +0000 (15:51 +0200)]
i2c: don't use any __deprecated handling anymore
This can be dropped with commit
771c035372a036f83353eef46dbb829780330234
("deprecate the '__deprecated' attribute warnings entirely and for good")
now in upstream.
And we got rid of the last __deprecated use, too.
Signed-off-by: Sedat Dilek <sedat.dilek@credativ.de>
[wsa: shortened commit message to reflect the current situation]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Tue, 21 Aug 2018 22:02:16 +0000 (00:02 +0200)]
i2c: use SPDX identifier for Renesas drivers
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Peter Korsgaard [Wed, 22 Aug 2018 16:16:07 +0000 (18:16 +0200)]
i2c: ocores: update my email address
The old @sunsite.dk address is no longer active, so update the references.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Tue, 21 Aug 2018 15:02:40 +0000 (17:02 +0200)]
i2c: remove deprecated attach_adapter callback
There aren't any users left. Remove this callback from the 2.4 times.
Phew, finally, that took years to reach...
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Tue, 21 Aug 2018 15:02:39 +0000 (17:02 +0200)]
macintosh: therm_windtunnel: drop using attach_adapter
As we now have deferred probing, we can use a custom mechanism and
finally get rid of the legacy interface from the i2c core.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Richard Weinberger [Fri, 24 Aug 2018 06:55:15 +0000 (08:55 +0200)]
ubifs: Remove empty file.h
This empty file sneaked into the tree by mistake.
Remove it.
Fixes:
6eb61d587f45 ("ubifs: Pass struct ubifs_info to ubifs_assert()")
Signed-off-by: Richard Weinberger <richard@nod.at>
Dave Airlie [Fri, 24 Aug 2018 03:40:58 +0000 (13:40 +1000)]
Merge tag 'drm-misc-next-fixes-2018-08-23-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
- Add quirk to Lenovo B50-80 to use 6 bpc instead of 8 (Feng)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20180823205434.GA137644@art_vandelay
Linus Torvalds [Fri, 24 Aug 2018 02:20:12 +0000 (19:20 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
- the rest of MM
- various misc fixes and tweaks
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
mm: Change return type int to vm_fault_t for fault handlers
lib/fonts: convert comments to utf-8
s390: ebcdic: convert comments to UTF-8
treewide: convert ISO_8859-1 text comments to utf-8
drivers/gpu/drm/gma500/: change return type to vm_fault_t
docs/core-api: mm-api: add section about GFP flags
docs/mm: make GFP flags descriptions usable as kernel-doc
docs/core-api: split memory management API to a separate file
docs/core-api: move *{str,mem}dup* to "String Manipulation"
docs/core-api: kill trailing whitespace in kernel-api.rst
mm/util: add kernel-doc for kvfree
mm/util: make strndup_user description a kernel-doc comment
fs/proc/vmcore.c: hide vmcoredd_mmap_dumps() for nommu builds
treewide: correct "differenciate" and "instanciate" typos
fs/afs: use new return type vm_fault_t
drivers/hwtracing/intel_th/msu.c: change return type to vm_fault_t
mm: soft-offline: close the race against page allocation
mm: fix race on soft-offlining free huge pages
namei: allow restricted O_CREAT of FIFOs and regular files
hfs: prevent crash on exit from failed search
...
Souptick Joarder [Fri, 24 Aug 2018 00:01:36 +0000 (17:01 -0700)]
mm: Change return type int to vm_fault_t for fault handlers
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
Ref-> commit
1c8f422059ae ("mm: change return type to vm_fault_t")
The aim is to change the return type of finish_fault() and
handle_mm_fault() to vm_fault_t type. As part of that clean up return
type of all other recursively called functions have been changed to
vm_fault_t type.
The places from where handle_mm_fault() is getting invoked will be
change to vm_fault_t type but in a separate patch.
vmf_error() is the newly introduce inline function in 4.17-rc6.
[akpm@linux-foundation.org: don't shadow outer local `ret' in __do_huge_pmd_anonymous_page()]
Link: http://lkml.kernel.org/r/20180604171727.GA20279@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 24 Aug 2018 00:01:32 +0000 (17:01 -0700)]
lib/fonts: convert comments to utf-8
The font files contain bit masks for characters in the cp437 character
set, and comments showing what character this is supposed to be.
This only makes sense when the terminal used to view the files is set to
the same codepage, but all other files in the kernel now use utf-8
encoding.
This changes those comments to utf-8 as well, for consistency.
Link: http://lkml.kernel.org/r/20180724111600.4158975-3-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 24 Aug 2018 00:01:29 +0000 (17:01 -0700)]
s390: ebcdic: convert comments to UTF-8
The ebcdic.c file contains tables for converting between ebcdic and PC
codepage 437. I could however not identify which encoding was used for
the comments. This seems to be some variation of ISO_8859-1 with
non-UTF-8 escape characters.
I have converted this to UTF-8 by manually removing the escape
characters and then running it through recode, to get the same encoding
that we use for the rest of the kernel.
Link: http://lkml.kernel.org/r/20180724111600.4158975-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 24 Aug 2018 00:01:26 +0000 (17:01 -0700)]
treewide: convert ISO_8859-1 text comments to utf-8
Almost all files in the kernel are either plain text or UTF-8 encoded. A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.
This converts them all to UTF-8 for consistency.
Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Simon Horman <horms@verge.net.au> [IPVS portion]
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [IIO]
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Acked-by: Rob Herring <robh@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Souptick Joarder [Fri, 24 Aug 2018 00:01:22 +0000 (17:01 -0700)]
drivers/gpu/drm/gma500/: change return type to vm_fault_t
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
Ref->
1c8f422059ae ("mm: change return type to vm_fault_t")
Previously vm_insert_{pfn,mixed} returns err which driver mapped into
VM_FAULT_* type. The new function vmf_insert_{pfn,mixed} will replace
this inefficiency by returning VM_FAULT_* type.
vmf_error() is the newly introduce inline function in 4.17-rc6.
Link: http://lkml.kernel.org/r/20180713154541.GA3345@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:19 +0000 (17:01 -0700)]
docs/core-api: mm-api: add section about GFP flags
Link: http://lkml.kernel.org/r/1532626360-16650-8-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:15 +0000 (17:01 -0700)]
docs/mm: make GFP flags descriptions usable as kernel-doc
This patch adds DOC: headings for GFP flag descriptions and adjusts the
formatting to fit sphinx expectations of paragraphs.
Link: http://lkml.kernel.org/r/1532626360-16650-7-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:12 +0000 (17:01 -0700)]
docs/core-api: split memory management API to a separate file
This is basically copy-paste of the memory management section from
kernel-api.rst with some minor adjustments:
* The "User Space Memory Access" is moved to the beginning
* The get_user_pages_fast reference is now a part of "User Space Memory
Access"
* And, of course, headings are adjusted with section being promoted to
chapters
Link: http://lkml.kernel.org/r/1532626360-16650-6-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:09 +0000 (17:01 -0700)]
docs/core-api: move *{str,mem}dup* to "String Manipulation"
The string and memory duplication routines fit better to the "String
Manipulation" section than to "The SLAB Cache".
Link: http://lkml.kernel.org/r/1532626360-16650-5-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:05 +0000 (17:01 -0700)]
docs/core-api: kill trailing whitespace in kernel-api.rst
Link: http://lkml.kernel.org/r/1532626360-16650-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:02 +0000 (17:01 -0700)]
mm/util: add kernel-doc for kvfree
Link: http://lkml.kernel.org/r/1532626360-16650-3-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:00:59 +0000 (17:00 -0700)]
mm/util: make strndup_user description a kernel-doc comment
Patch series "memory management documentation updates", v3.
Here are several updates to the mm documentation.
Aside from really minor changes in the first three patches, the updates
are:
* move the documentation of kstrdup and friends to "String Manipulation"
section
* split memory management API into a separate .rst file
* adjust formating of the GFP flags description and include it in the
reference documentation.
This patch (of 7):
The description of the strndup_user function misses '*' character at the
beginning of the comment to be proper kernel-doc. Add the missing
character.
Link: http://lkml.kernel.org/r/1532626360-16650-2-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 24 Aug 2018 00:00:55 +0000 (17:00 -0700)]
fs/proc/vmcore.c: hide vmcoredd_mmap_dumps() for nommu builds
Without CONFIG_MMU, we get a build warning:
fs/proc/vmcore.c:228:12: error: 'vmcoredd_mmap_dumps' defined but not used [-Werror=unused-function]
static int vmcoredd_mmap_dumps(struct vm_area_struct *vma, unsigned long dst,
The function is only referenced from an #ifdef'ed caller, so
this uses the same #ifdef around it.
Link: http://lkml.kernel.org/r/20180525213526.2117790-1-arnd@arndb.de
Fixes:
7efe48df8a3d ("vmcore: append device dumps to vmcore as elf notes")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ganesh Goudar <ganeshgr@chelsio.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Finn Thain [Fri, 24 Aug 2018 00:00:52 +0000 (17:00 -0700)]
treewide: correct "differenciate" and "instanciate" typos
Also add these typos to spelling.txt so checkpatch.pl will look for them.
Link: http://lkml.kernel.org/r/88af06b9de34d870cb0afc46cfd24e0458be2575.1529471371.git.fthain@telegraphics.com.au
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Souptick Joarder [Fri, 24 Aug 2018 00:00:48 +0000 (17:00 -0700)]
fs/afs: use new return type vm_fault_t
Use new return type vm_fault_t for fault handler in struct
vm_operations_struct. For now, this is just documenting that the
function returns a VM_FAULT value rather than an errno. Once all
instances are converted, vm_fault_t will become a distinct type.
See
1c8f422059ae ("mm: change return type to vm_fault_t") for reference.
Link: http://lkml.kernel.org/r/20180702152017.GA3780@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Souptick Joarder [Fri, 24 Aug 2018 00:00:45 +0000 (17:00 -0700)]
drivers/hwtracing/intel_th/msu.c: change return type to vm_fault_t
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
See
1c8f422059ae ("mm: change return type to vm_fault_t") for reference.
Link: http://lkml.kernel.org/r/20180702155801.GA4010@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Fri, 24 Aug 2018 00:00:42 +0000 (17:00 -0700)]
mm: soft-offline: close the race against page allocation
A process can be killed with SIGBUS(BUS_MCEERR_AR) when it tries to
allocate a page that was just freed on the way of soft-offline. This is
undesirable because soft-offline (which is about corrected error) is
less aggressive than hard-offline (which is about uncorrected error),
and we can make soft-offline fail and keep using the page for good
reason like "system is busy."
Two main changes of this patch are:
- setting migrate type of the target page to MIGRATE_ISOLATE. As done
in free_unref_page_commit(), this makes kernel bypass pcplist when
freeing the page. So we can assume that the page is in freelist just
after put_page() returns,
- setting PG_hwpoison on free page under zone->lock which protects
freelists, so this allows us to avoid setting PG_hwpoison on a page
that is decided to be allocated soon.
[akpm@linux-foundation.org: tweak set_hwpoison_free_buddy_page() comment]
Link: http://lkml.kernel.org/r/1531452366-11661-3-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Tested-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <zy.zhengyi@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Fri, 24 Aug 2018 00:00:38 +0000 (17:00 -0700)]
mm: fix race on soft-offlining free huge pages
Patch series "mm: soft-offline: fix race against page allocation".
Xishi recently reported the issue about race on reusing the target pages
of soft offlining. Discussion and analysis showed that we need make
sure that setting PG_hwpoison should be done in the right place under
zone->lock for soft offline. 1/2 handles free hugepage's case, and 2/2
hanldes free buddy page's case.
This patch (of 2):
There's a race condition between soft offline and hugetlb_fault which
causes unexpected process killing and/or hugetlb allocation failure.
The process killing is caused by the following flow:
CPU 0 CPU 1 CPU 2
soft offline
get_any_page
// find the hugetlb is free
mmap a hugetlb file
page fault
...
hugetlb_fault
hugetlb_no_page
alloc_huge_page
// succeed
soft_offline_free_page
// set hwpoison flag
mmap the hugetlb file
page fault
...
hugetlb_fault
hugetlb_no_page
find_lock_page
return VM_FAULT_HWPOISON
mm_fault_error
do_sigbus
// kill the process
The hugetlb allocation failure comes from the following flow:
CPU 0 CPU 1
mmap a hugetlb file
// reserve all free page but don't fault-in
soft offline
get_any_page
// find the hugetlb is free
soft_offline_free_page
// set hwpoison flag
dissolve_free_huge_page
// fail because all free hugepages are reserved
page fault
...
hugetlb_fault
hugetlb_no_page
alloc_huge_page
...
dequeue_huge_page_node_exact
// ignore hwpoisoned hugepage
// and finally fail due to no-mem
The root cause of this is that current soft-offline code is written based
on an assumption that PageHWPoison flag should be set at first to avoid
accessing the corrupted data. This makes sense for memory_failure() or
hard offline, but does not for soft offline because soft offline is about
corrected (not uncorrected) error and is safe from data lost. This patch
changes soft offline semantics where it sets PageHWPoison flag only after
containment of the error page completes successfully.
Link: http://lkml.kernel.org/r/1531452366-11661-2-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Suggested-by: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Tested-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <zy.zhengyi@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Salvatore Mesoraca [Fri, 24 Aug 2018 00:00:35 +0000 (17:00 -0700)]
namei: allow restricted O_CREAT of FIFOs and regular files
Disallows open of FIFOs or regular files not owned by the user in world
writable sticky directories, unless the owner is the same as that of the
directory or the file is opened without the O_CREAT flag. The purpose
is to make data spoofing attacks harder. This protection can be turned
on and off separately for FIFOs and regular files via sysctl, just like
the symlinks/hardlinks protection. This patch is based on Openwall's
"HARDEN_FIFO" feature by Solar Designer.
This is a brief list of old vulnerabilities that could have been prevented
by this feature, some of them even allow for privilege escalation:
CVE-2000-1134
CVE-2007-3852
CVE-2008-0525
CVE-2009-0416
CVE-2011-4834
CVE-2015-1838
CVE-2015-7442
CVE-2016-7489
This list is not meant to be complete. It's difficult to track down all
vulnerabilities of this kind because they were often reported without any
mention of this particular attack vector. In fact, before
hardlinks/symlinks restrictions, fifos/regular files weren't the favorite
vehicle to exploit them.
[s.mesoraca16@gmail.com: fix bug reported by Dan Carpenter]
Link: https://lkml.kernel.org/r/20180426081456.GA7060@mwanda
Link: http://lkml.kernel.org/r/1524829819-11275-1-git-send-email-s.mesoraca16@gmail.com
[keescook@chromium.org: drop pr_warn_ratelimited() in favor of audit changes in the future]
[keescook@chromium.org: adjust commit subjet]
Link: http://lkml.kernel.org/r/20180416175918.GA13494@beast
Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Suggested-by: Solar Designer <solar@openwall.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ernesto A. Fernández [Fri, 24 Aug 2018 00:00:31 +0000 (17:00 -0700)]
hfs: prevent crash on exit from failed search
hfs_find_exit() expects fd->bnode to be NULL after a search has failed.
hfs_brec_insert() may instead set it to an error-valued pointer. Fix
this to prevent a crash.
Link: http://lkml.kernel.org/r/53d9749a029c41b4016c495fc5838c9dba3afc52.1530294815.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Cc: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ernesto A. Fernandez [Fri, 24 Aug 2018 00:00:28 +0000 (17:00 -0700)]
hfsplus: prevent crash on exit from failed search
hfs_find_exit() expects fd->bnode to be NULL after a search has failed.
hfs_brec_insert() may instead set it to an error-valued pointer. Fix
this to prevent a crash.
Link: http://lkml.kernel.org/r/803590a35221fbf411b2c141419aea3233a6e990.1530294813.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ernesto A. Fernández [Fri, 24 Aug 2018 00:00:25 +0000 (17:00 -0700)]
hfsplus: fix NULL dereference in hfsplus_lookup()
An HFS+ filesystem can be mounted read-only without having a metadata
directory, which is needed to support hardlinks. But if the catalog
data is corrupted, a directory lookup may still find dentries claiming
to be hardlinks.
hfsplus_lookup() does check that ->hidden_dir is not NULL in such a
situation, but mistakenly does so after dereferencing it for the first
time. Reorder this check to prevent a crash.
This happens when looking up corrupted catalog data (dentry) on a
filesystem with no metadata directory (this could only ever happen on a
read-only mount). Wen Xu sent the replication steps in detail to the
fsdevel list: https://bugzilla.kernel.org/show_bug.cgi?id=200297
Link: http://lkml.kernel.org/r/20180712215344.q44dyrhymm4ajkao@eaf
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reported-by: Wen Xu <wen.xu@gatech.edu>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Thu, 23 Aug 2018 23:23:04 +0000 (00:23 +0100)]
arm64: tlb: Provide forward declaration of tlb_flush() before including tlb.h
As of commit
fd1102f0aade ("mm: mmu_notifier fix for tlb_end_vma"),
asm-generic/tlb.h now calls tlb_flush() from a static inline function,
so we need to make sure that it's declared before #including the
asm-generic header in the arch header.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 23 Aug 2018 23:03:58 +0000 (16:03 -0700)]
Merge tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"These patches include adding async support for the v4.2 COPY
operation. I think Bruce is planning to send the server patches for
the next release, but I figured we could get the client side out of
the way now since it's been in my tree for a while. This shouldn't
cause any problems, since the server will still respond with
synchronous copies even if the client requests async.
Features:
- Add support for asynchronous server-side COPY operations
Stable bufixes:
- Fix an off-by-one in bl_map_stripe() (v3.17+)
- NFSv4 client live hangs after live data migration recovery (v4.9+)
- xprtrdma: Fix disconnect regression (v4.18+)
- Fix locking in pnfs_generic_recover_commit_reqs (v4.14+)
- Fix a sleep in atomic context in nfs4_callback_sequence() (v4.9+)
Other bugfixes and cleanups:
- Optimizations and fixes involving NFS v4.1 / pNFS layout handling
- Optimize lseek(fd, SEEK_CUR, 0) on directories to avoid locking
- Immediately reschedule writeback when the server replies with an
error
- Fix excessive attribute revalidation in nfs_execute_ok()
- Add error checking to nfs_idmap_prepare_message()
- Use new vm_fault_t return type
- Return a delegation when reclaiming one that the server has
recalled
- Referrals should inherit proto setting from parents
- Make rpc_auth_create_args a const
- Improvements to rpc_iostats tracking
- Fix a potential reference leak when there is an error processing a
callback
- Fix rmdir / mkdir / rename nlink accounting
- Fix updating inode change attribute
- Fix error handling in nfsn4_sp4_select_mode()
- Use an appropriate work queue for direct-write completion
- Don't busy wait if NFSv4 session draining is interrupted"
* tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (54 commits)
pNFS: Remove unwanted optimisation of layoutget
pNFS/flexfiles: ff_layout_pg_init_read should exit on error
pNFS: Treat RECALLCONFLICT like DELAY...
pNFS: When updating the stateid in layoutreturn, also update the recall range
NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence()
NFSv4: Fix locking in pnfs_generic_recover_commit_reqs
NFSv4: Fix a typo in nfs4_init_channel_attrs()
NFSv4: Don't busy wait if NFSv4 session draining is interrupted
NFS recover from destination server reboot for copies
NFS add a simple sync nfs4_proc_commit after async COPY
NFS handle COPY ERR_OFFLOAD_NO_REQS
NFS send OFFLOAD_CANCEL when COPY killed
NFS export nfs4_async_handle_error
NFS handle COPY reply CB_OFFLOAD call race
NFS add support for asynchronous COPY
NFS COPY xdr handle async reply
NFS OFFLOAD_CANCEL xdr
NFS CB_OFFLOAD xdr
NFS: Use an appropriate work queue for direct-write completion
NFSv4: Fix error handling in nfs4_sp4_select_mode()
...
Linus Torvalds [Thu, 23 Aug 2018 23:00:10 +0000 (16:00 -0700)]
Merge tag 'nfsd-4.19-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Chuck Lever fixed a problem with NFSv4.0 callbacks over GSS from
multi-homed servers.
The only new feature is a minor bit of protocol (change_attr_type)
which the client doesn't even use yet.
Other than that, various bugfixes and cleanup"
* tag 'nfsd-4.19-1' of git://linux-nfs.org/~bfields/linux: (27 commits)
sunrpc: Add comment defining gssd upcall API keywords
nfsd: Remove callback_cred
nfsd: Use correct credential for NFSv4.0 callback with GSS
sunrpc: Extract target name into svc_cred
sunrpc: Enable the kernel to specify the hostname part of service principals
sunrpc: Don't use stack buffer with scatterlist
rpc: remove unneeded variable 'ret' in rdma_listen_handler
nfsd: use true and false for boolean values
nfsd: constify write_op[]
fs/nfsd: Delete invalid assignment statements in nfsd4_decode_exchange_id
NFSD: Handle full-length symlinks
NFSD: Refactor the generic write vector fill helper
svcrdma: Clean up Read chunk path
svcrdma: Avoid releasing a page in svc_xprt_release()
nfsd: Mark expected switch fall-through
sunrpc: remove redundant variables 'checksumlen','blocksize' and 'data'
nfsd: fix leaked file lock with nfs exported overlayfs
nfsd: don't advertise a SCSI layout for an unsupported request_queue
nfsd: fix corrupted reply to badly ordered compound
nfsd: clarify check_op_ordering
...
Linus Torvalds [Thu, 23 Aug 2018 22:58:04 +0000 (15:58 -0700)]
Merge tag 'upstream-4.19-rc1' of git://git.infradead.org/linux-ubifs
Pull UBI/UBIFS updates from Richard Weinberger:
- Year 2038 preparations
- New UBI feature to skip CRC checks of static volumes
- A new Kconfig option to disable xattrs in UBIFS
- Lots of fixes in UBIFS, found by our new test framework
* tag 'upstream-4.19-rc1' of git://git.infradead.org/linux-ubifs: (21 commits)
ubifs: Set default assert action to read-only
ubifs: Allow setting assert action as mount parameter
ubifs: Rework ubifs_assert()
ubifs: Pass struct ubifs_info to ubifs_assert()
ubifs: Turn two ubifs_assert() into a WARN_ON()
ubi: expose the volume CRC check skip flag
ubi: provide a way to skip CRC checks
ubifs: Use kmalloc_array()
ubifs: Check data node size before truncate
Revert "UBIFS: Fix potential integer overflow in allocation"
ubifs: Add comment on c->commit_sem
ubifs: introduce Kconfig symbol for xattr support
ubifs: use swap macro in swap_dirty_idx
ubifs: tnc: use monotonic znode timestamp
ubifs: use timespec64 for inode timestamps
ubifs: xattr: Don't operate on deleted inodes
ubifs: gc: Fix typo
ubifs: Fix memory leak in lprobs self-check
ubi: Initialize Fastmap checkmapping correctly
ubifs: Fix synced_i_size calculation for xattr inodes
...
Linus Torvalds [Thu, 23 Aug 2018 22:51:09 +0000 (15:51 -0700)]
Merge tag 'pwm/for-4.19-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"This contains mostly minor bug fixes as well as some new chip support
for existing drivers"
* tag 'pwm/for-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: mediatek: Add MT7628 support
dt-bindings: pwm: Add MT7628 information
dt-bindings: pwm: rcar: Add bindings for R-Car E3 support
pwm: meson: Fix mux clock names
pwm: stm32-lp: Remove useless loop in stm32_pwm_lp_remove()
pwm: omap-dmtimer: Return -EPROBE_DEFER if no dmtimer platform data
pwm: mxs: Switch to SPDX identifier
dt-bindings: pwm: fsl-ftm: Add compatible string for i.MX8QM
pwm: fsl-ftm: Enable support for the new SoC i.MX8QM
pwm: fsl-ftm: Added the support of per-compatible data
pwm: fsl-ftm: Added a dedicated IP interface clock
pwm: cros-ec: Switch to SPDX identifier
pwm: imx: Switch to SPDX identifier
pwm: tiehrpwm: Fix disabling of output of PWMs
pwm: tiehrpwm: Don't use emulation mode bits to control PWM output
pwm: berlin: Don't use broken prescaler values
Linus Torvalds [Thu, 23 Aug 2018 22:44:58 +0000 (15:44 -0700)]
Merge tag 'fbdev-v4.19' of https://github.com/bzolnier/linux
Pull fbdev updates from Bartlomiej Zolnierkiewicz:
"Mostly small fixes and cleanups for fb drivers (the biggest updates
are for udlfb and pxafb drivers). This also adds deferred console
takeover support to the console code and efifb driver.
Summary:
- add support for deferred console takeover, when enabled defers
fbcon taking over the console from the dummy console until the
first text is displayed on the console - together with the "quiet"
kernel commandline option this allows fbcon to still be used
together with a smooth graphical bootup (Hans de Goede)
- improve console locking debugging code (Thomas Zimmermann)
- copy the ACPI BGRT boot graphics to the framebuffer when deferred
console takeover support is used in efifb driver (Hans de Goede)
- update udlfb driver - fix lost console when the user unplugs a USB
adapter, fix the screen corruption issue, fix locking and add some
performance optimizations (Mikulas Patocka)
- update pxafb driver - fix using uninitialized memory, switch to
devm_* API, handle initialization errors and add support for
lcd-supply regulator (Daniel Mack)
- add support for boards booted with a DeviceTree in pxa3xx_gcu
driver (Daniel Mack)
- rename omap2 module to omap2fb.ko to avoid conflicts with omap1
driver (Arnd Bergmann)
- enable ACPI-based enumeration for goldfishfb driver (Yu Ning)
- fix goldfishfb driver to make user space Android code use 60 fps
(Christoffer Dall)
- print big fat warning when nomodeset kernel parameter is used in
vgacon driver (Lyude Paul)
- remove VLA usage from fsl-diu-fb driver (Kees Cook)
- misc fixes (Julia Lawall, Geert Uytterhoeven, Fredrik Noring,
Yisheng Xie, Dan Carpenter, Daniel Vetter, Anton Vasilyev, Randy
Dunlap, Gustavo A. R. Silva, Colin Ian King, Fengguang Wu)
- misc cleanups (Roman Kiryanov, Yisheng Xie, Colin Ian King)"
* tag 'fbdev-v4.19' of https://github.com/bzolnier/linux: (54 commits)
Documentation/fb: corrections for fbcon.txt
fbcon: Do not takeover the console from atomic context
dummycon: Stop exporting dummycon_[un]register_output_notifier
fbcon: Only defer console takeover if the current console driver is the dummycon
fbcon: Only allow FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER if fbdev is builtin
fbdev: omap2: omapfb: fix ifnullfree.cocci warnings
fbdev: omap2: omapfb: fix bugon.cocci warnings
fbdev: omap2: omapfb: fix boolreturn.cocci warnings
fb: amifb: fix build warnings when not builtin
fbdev/core: Disable console-lock warnings when fb.lockless_register_fb is set
console: Replace #if 0 with atomic var 'ignore_console_lock_warning'
udlfb: use spin_lock_irq instead of spin_lock_irqsave
udlfb: avoid prefetch
udlfb: optimization - test the backing buffer
udlfb: allow reallocating the framebuffer
udlfb: set line_length in dlfb_ops_set_par
udlfb: handle allocation failure
udlfb: set optimal write delay
udlfb: make a local copy of fb_ops
udlfb: don't switch if we are switching to the same videomode
...
Linus Torvalds [Thu, 23 Aug 2018 22:37:24 +0000 (15:37 -0700)]
Merge tag 'sound-fix-4.19-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"No surprises here: a regression fix for virmidi code refactoring,
three fixes for the new AC97 bus compat and runtime PM, and a usual
HD-audio quirk"
* tag 'sound-fix-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Fix HP Headset Mic can't record
ALSA: ac97: fix unbalanced pm_runtime_enable
ALSA: ac97: fix check of pm_runtime_get_sync failure
ALSA: ac97: fix device initialization in the compat layer
ALSA: seq: virmidi: Fix discarding the unsubscribed output
Linus Torvalds [Thu, 23 Aug 2018 22:34:48 +0000 (15:34 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull more rdma updates from Jason Gunthorpe:
"This is the SMC cleanup promised, a randconfig regression fix, and
kernel oops fix.
Summary:
- Switch SMC over to rdma_get_gid_attr and remove the compat
- Fix a crash in HFI1 with some BIOS's
- Fix a randconfig failure"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/ucm: fix UCM link error
IB/hfi1: Invalid NUMA node information can cause a divide by zero
RDMA/smc: Replace ib_query_gid with rdma_get_gid_attr
Linus Torvalds [Thu, 23 Aug 2018 21:55:01 +0000 (14:55 -0700)]
Merge branch 'tlb-fixes'
Merge fixes for missing TLB shootdowns.
This fixes a couple of cases that involved us possibly freeing page
table structures before the required TLB shootdown had been done.
There are a few cleanup patches to make the code easier to follow, and
to avoid some of the more problematic cases entirely when not necessary.
To make this easier for backports, it undoes the recent lazy TLB
patches, because the cleanups and fixes are more important, and Rik is
ok with re-doing them later when things have calmed down.
The missing TLB flush was only delayed, and the wrong ordering only
happened under memory pressure (and in theory under a couple of other
fairly theoretical situations), so this may have been all very unlikely
to have hit people in practice.
But getting the TLB shootdown wrong is _so_ hard to debug and see that I
consider this a crticial fix.
Many thanks to Jann Horn for having debugged this.
* tlb-fixes:
x86/mm: Only use tlb_remove_table() for paravirt
mm: mmu_notifier fix for tlb_end_vma
mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
mm/tlb: Remove tlb_remove_table() non-concurrent condition
mm: move tlb_table_flush to tlb_flush_mmu_free
x86/mm/tlb: Revert the recent lazy TLB patches
Linus Torvalds [Thu, 23 Aug 2018 21:52:23 +0000 (14:52 -0700)]
Merge tag 'for-linus-4.19b-rc1b-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes and cleanups from Juergen Gross:
"Some cleanups, some minor fixes and a fix for a bug introduced in this
merge window hitting 32-bit PV guests"
* tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: enable early use of set_fixmap in 32-bit Xen PV guest
xen: remove unused hypercall functions
x86/xen: remove unused function xen_auto_xlated_memory_setup()
xen/ACPI: don't upload Px/Cx data for disabled processors
x86/Xen: further refine add_preferred_console() invocations
xen/mcelog: eliminate redundant setting of interface version
x86/Xen: mark xen_setup_gdt() __init
Linus Torvalds [Thu, 23 Aug 2018 21:23:08 +0000 (14:23 -0700)]
Merge tag 'mips_4.19_2' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
- Fix microMIPS build failures by adding a .insn directive to the
barrier_before_unreachable() asm statement in order to convince the
toolchain that the asm statement is a valid branch target rather
than a bogus attempt to switch ISA.
- Clean up our declarations of TLB functions that we overwrite with
generated code in order to prevent the compiler making assumptions
about alignment that cause microMIPS kernels built with GCC 7 &
above to die early during boot.
- Fix up a regression for MIPS32 kernels which slipped into the main
MIPS pull for 4.19, causing CONFIG_32BIT=y kernels to contain
inappropriate MIPS64 instructions.
- Extend our existing workaround for MIPSr6 builds that end up using
the __multi3 intrinsic to GCC 7 & below, rather than just GCC 7.
* tag 'mips_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: lib: Provide MIPS64r6 __multi3() for GCC < 7
MIPS: Workaround GCC __builtin_unreachable reordering bug
compiler.h: Allow arch-specific asm/compiler.h
MIPS: Avoid move psuedo-instruction whilst using MIPS_ISA_LEVEL
MIPS: Consistently declare TLB functions
MIPS: Export tlbmiss_handler_setup_pgd near its definition
Linus Torvalds [Thu, 23 Aug 2018 21:09:37 +0000 (14:09 -0700)]
Merge tag 'for-linus' of git://github.com/openrisc/linux
Pull OpenRISC update from Stafford Horne:
"Just one change for 4.19: refactoring from Christoph Hellwig to use
generic DMA facilities"
* tag 'for-linus' of git://github.com/openrisc/linux:
openrisc: use generic dma_noncoherent_ops
openrisc: fix cache maintainance the the sync_single_for_device DMA operation
openrisc: remove the no-op unmap_page and unmap_sg DMA operations
openrisc: remove the sync_single_for_cpu DMA operation
Linus Torvalds [Thu, 23 Aug 2018 21:02:22 +0000 (14:02 -0700)]
Merge tag 'armsoc-dt' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM device-tree updates from Olof Johansson:
"Business as usual -- the bulk of our changes are to devicetree files
with new hardware support, new SoCs and platforms, and new board
types.
New SoCs/platforms:
- Raspberry Pi Compute Module (CM1) and IO board
- i.MX6SSL from NXP
- Renesas RZ/N1D SoC (R9A06G032), Dual Cortex-A7 with Ethernet, CAN
and PLC interfaces
- TI AM654 SoC, Quad Cortex-A53, safety subsystem with Cortex-R5
controllers, communication and PRU subsystem and lots of other
interfaces (PCIe, USB3, etc).
New boards and systems:
- Several Atmel at91-based boards from Laird
- Marvell Armada388-based Helios4 board from SolidRun
- Samsung Aires-based phones (s5pv210)
- Allwinner A64-based Pinebook laptop
In addition to the above, there's the usual amount of new devices
described on existing platforms, fixes and tweaks and new minor
variants of boards/platforms"
* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (478 commits)
arm64: dts: sdm845: Add tsens nodes
arm64: dts: msm8996: thermal: Initialise via DT and add second controller
arm64: dts: sprd: Add one suspend timer
arm64: dts: sprd: Add SC27XX ADC device
arm64: dts: sprd: Add SC27XX eFuse device
arm64: dts: sprd: Add SC27XX vibrator device
arm64: dts: sprd: Add SC27XX breathing light controller device
arm64: dts: meson-axg: add spdif-dit codec
arm64: dts: meson-axg: add lineout codec
arm64: dts: meson-axg: add linein codec
arm64: dts: meson-axg: add tdm interfaces
arm64: dts: meson-axg: add tdmout formatters
arm64: dts: meson-axg: add tdmin formatters
arm64: dts: meson-axg: add spdifout
arm64: dts: rockchip: add led support for Firefly-RK3399
arm64: dts: rockchip: remove deprecated Type-C PHY properties on rk3399
arm64: dts: rockchip: add power button support for Firefly-RK3399
ARM: dts: aspeed: Add coprocessor interrupt controller
arm64: dts: meson-axg: add audio arb reset controller
arm64: dts: meson-axg: add usb power regulator
...
Linus Torvalds [Thu, 23 Aug 2018 21:00:05 +0000 (14:00 -0700)]
Merge tag 'armsoc-defconfig' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC defconfig updates from Olof Johansson:
"We keep these separate since some files are shared and conflict-prone,
but there isn't really much to write about here.
Some of the churnier pieces is for the Aspeed platforms, which did an
overdue refresh of the defconfig, and enabled USB gadget and some
drivers from there. Most of the rest are minor additions here and
there to turn on drivers that are needed or useful on the various
platforms"
* tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
ARM: multi_v7_defconfig: add CONFIG_UNIPHIER_THERMAL and CONFIG_SNI_AVE
ARM: config: aspeed: Enable new FSI drivers
ARM: config: multi_v5: Enable ASPEED drivers
ARM: config: multi_v5: Refresh configuration
ARM: config: aspeed: Update defconfig
ARM: multi_v7_defconfig: Enable support for RZN1D-DB
ARM: shmobile: defconfig: Disable /sbin/hotplug fork-bomb
ARM: shmobile: defconfig: Enable support for RZN1D-DB
ARM: shmobile: defconfig: Enable reset controller support
ARM: shmobile: defconfig: Drop NET_VENDOR_<FOO>=n
arm64: defconfig: Enable more peripherals for Samsung Chromebook Plus.
arm64: defconfig: Enable CONFIG_MTD_NAND_QCOM for IPQ8074
ARM: qcom_defconfig: Enable QCOM NAND related configs
ARM: imx_v6_v7_defconfig: add DMATEST support
ARM: mvebu_v7_defconfig: enable SFP support
ARM: mvebu_v7_defconfig: sync defconfig
ARM: multi_v7_defconfig: Add Marvell NAND controller support
arm: configs: Add USB gadget to Aspeed G5 defconfig
arm: configs: Add USB gadget to Aspeed G4 defconfig
arm64: defconfig: enable HiSilicon PMU driver
...
Linus Torvalds [Thu, 23 Aug 2018 20:52:46 +0000 (13:52 -0700)]
Merge tag 'armsoc-drivers' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Olof Johansson:
"Some of the larger changes this merge window:
- Removal of drivers for Exynos5440, a Samsung SoC that never saw
widespread use.
- Uniphier support for USB3 and SPI reset handling
- Syste control and SRAM drivers and bindings for Allwinner platforms
- Qualcomm AOSS (Always-on subsystem) reset controller drivers
- Raspberry Pi hwmon driver for voltage
- Mediatek pwrap (pmic) support for MT6797 SoC"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (52 commits)
drivers/firmware: psci_checker: stash and use topology_core_cpumask for hotplug tests
soc: fsl: cleanup Kconfig menu
soc: fsl: dpio: Convert DPIO documentation to .rst
staging: fsl-mc: Remove remaining files
staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl
staging: fsl-dpaa2: eth: move generic FD defines to DPIO
soc: fsl: qe: gpio: Add qe_gpio_set_multiple
usb: host: exynos: Remove support for Exynos5440
clk: samsung: Remove support for Exynos5440
soc: sunxi: Add the A13, A23 and H3 system control compatibles
reset: uniphier: add reset control support for SPI
cpufreq: exynos: Remove support for Exynos5440
ata: ahci-platform: Remove support for Exynos5440
soc: imx6qp: Use GENPD_FLAG_ALWAYS_ON for PU errata
soc: mediatek: pwrap: add mt6351 driver for mt6797 SoCs
soc: mediatek: pwrap: add pwrap driver for mt6797 SoCs
soc: mediatek: pwrap: fix cipher init setting error
dt-bindings: pwrap: mediatek: add pwrap support for MT6797
reset: uniphier: add USB3 core reset control
dt-bindings: reset: uniphier: add USB3 core reset support
...
Linus Torvalds [Thu, 23 Aug 2018 20:44:43 +0000 (13:44 -0700)]
Merge tag 'armsoc-soc' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM 32-bit SoC platform updates from Olof Johansson:
"Most of the SoC updates in this cycle are cleanups and moves to more
modern infrastructure:
- Davinci was moved to common clock framework
- OMAP1-based Amstrad E3 "Superphone" saw a bunch of cleanups to the
keyboard interface (bitbanged AT keyboard via GPIO).
- Removal of some stale code for Renesas platforms
- Power management improvements for i.MX6LL"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (112 commits)
ARM: uniphier: select RESET_CONTROLLER
arm64: uniphier: select RESET_CONTROLLER
ARM: uniphier: remove empty Makefile
ARM: exynos: Clear global variable on init error path
ARM: exynos: Remove outdated maintainer information
ARM: shmobile: Always enable ARCH_TIMER on SoCs with A7 and/or A15
ARM: shmobile: r8a7779: hide unused r8a7779_platform_cpu_kill
soc: r9a06g032: don't build SMP files for non-SMP config
ARM: shmobile: Add the R9A06G032 SMP enabler driver
ARM: at91: pm: configure wakeup sources for ULP1 mode
ARM: at91: pm: add PMC fast startup registers defines
ARM: at91: pm: Add ULP1 mode support
ARM: at91: pm: Use ULP0 naming instead of slow clock
ARM: hisi: handle of_iomap and fix missing of_node_put
ARM: hisi: check of_iomap and fix missing of_node_put
ARM: hisi: fix error handling and missing of_node_put
ARM: mx5: Set the DBGEN bit in ARM_GPC register
ARM: imx51: Configure M4IF to avoid visual artifacts
ARM: imx: call imx6sx_cpuidle_init() conditionally for 6sll
ARM: imx: fix i.MX6SLL build
...
Linus Torvalds [Thu, 23 Aug 2018 20:37:01 +0000 (13:37 -0700)]
Merge tag 'riscv-for-linus-4.19-mw1' of git://git./linux/kernel/git/palmer/riscv-linux
Pull RISC-V fixes from Palmer Dabbelt:
"This contains a pair of fixes to the RISC-V port:
- The removal of our compat.h, which didn't do anything.
- Fixes to sys_riscv_flush_icache to ensure it actually shows up.
We're going to just call this a bug in the ABI, as it was always
supposed to be there.
I've given these a simple build+boot test, both individually and as
the actual tag"
* tag 'riscv-for-linus-4.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
riscv: Delete asm/compat.h
RISC-V: Don't use a global include guard for uapi/asm/syscalls.h
RISC-V: Define sys_riscv_flush_icache when SMP=n
Steve French [Wed, 22 Aug 2018 02:37:53 +0000 (21:37 -0500)]
cifs: update internal module version number for cifs.ko to 2.12
Signed-off-by: Steve French <stfrench@microsoft.com>
Nicholas Mc Guire [Thu, 23 Aug 2018 10:24:02 +0000 (12:24 +0200)]
cifs: check kmalloc before use
The kmalloc was not being checked - if it fails issue a warning
and return -ENOMEM to the caller.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Fixes:
b8da344b74c8 ("cifs: dynamic allocation of ntlmssp blob")
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
cc: Stable <stable@vger.kernel.org>`
Ronnie Sahlberg [Wed, 22 Aug 2018 02:19:24 +0000 (12:19 +1000)]
cifs: check if SMB2 PDU size has been padded and suppress the warning
Some SMB2/3 servers, Win2016 but possibly others too, adds padding
not only between PDUs in a compound but also to the final PDU.
This padding extends the PDU to a multiple of 8 bytes.
Check if the unexpected length looks like this might be the case
and avoid triggering the log messages for :
"SMB2 server sent bad RFC1001 len %d not %d\n"
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Ronnie Sahlberg [Tue, 21 Aug 2018 01:49:21 +0000 (11:49 +1000)]
cifs: create a define for how many iovs we need for an SMB2_open()
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Thu, 23 Aug 2018 20:07:00 +0000 (13:07 -0700)]
Merge tag 'trace-v4.19-1' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Masami found an off by one bug in the code that keeps "notrace"
functions from being traced by kprobes. During my testing, I found
that there's places that we may want to add kprobes to notrace, thus
we may end up changing this code before 4.19 is released.
The history behind this change is that we found that adding kprobes to
various notrace functions caused the kernel to crashed. We took the
safe route and decided not to allow kprobes to trace any notrace
function.
But because notrace is added to functions that just cause weird side
effects to the function tracer, but are still safe, preventing kprobes
for all notrace functios may be too much of a big hammer.
One such place is __schedule() is marked notrace, to keep function
tracer from doing strange recursive loops when it gets traced with
NEED_RESCHED set. With this change, one can not add kprobes to the
scheduler.
Masami also added code to use gcov on ftrace"
* tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Fix to check notrace function with correct range
tracing: Allow gcov profiling on only ftrace subsystem
Peter Zijlstra [Wed, 22 Aug 2018 15:30:16 +0000 (17:30 +0200)]
x86/mm: Only use tlb_remove_table() for paravirt
If we don't use paravirt; don't play unnecessary and complicated games
to free page-tables.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicholas Piggin [Thu, 23 Aug 2018 08:47:09 +0000 (18:47 +1000)]
mm: mmu_notifier fix for tlb_end_vma
The generic tlb_end_vma does not call invalidate_range mmu notifier, and
it resets resets the mmu_gather range, which means the notifier won't be
called on part of the range in case of an unmap that spans multiple
vmas.
ARM64 seems to be the only arch I could see that has notifiers and uses
the generic tlb_end_vma. I have not actually tested it.
[ Catalin and Will point out that ARM64 currently only uses the
notifiers for KVM, which doesn't use the ->invalidate_range()
callback right now, so it's a bug, but one that happens to
not affect them. So not necessary for stable. - Linus ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 22 Aug 2018 15:30:15 +0000 (17:30 +0200)]
mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
Jann reported that x86 was missing required TLB invalidates when he
hit the !*batch slow path in tlb_remove_table().
This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache)
invalidates, the PowerPC-hash where this code originated and the
Sparc-hash where this was subsequently used did not need that. ARM
which later used this put an explicit TLB invalidate in their
__p*_free_tlb() functions, and PowerPC-radix followed that example.
But when we hooked up x86 we failed to consider this. Fix this by
(optionally) hooking tlb_remove_table() into the TLB invalidate code.
NOTE: s390 was also needing something like this and might now
be able to use the generic code again.
[ Modified to be on top of Nick's cleanups, which simplified this patch
now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]
Fixes:
9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 22 Aug 2018 15:30:14 +0000 (17:30 +0200)]
mm/tlb: Remove tlb_remove_table() non-concurrent condition
Will noted that only checking mm_users is incorrect; we should also
check mm_count in order to cover CPUs that have a lazy reference to
this mm (and could do speculative TLB operations).
If removing this turns out to be a performance issue, we can
re-instate a more complete check, but in tlb_table_flush() eliding the
call_rcu_sched().
Fixes:
267239116987 ("mm, powerpc: move the RCU page-table freeing into generic code")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicholas Piggin [Thu, 23 Aug 2018 08:47:08 +0000 (18:47 +1000)]
mm: move tlb_table_flush to tlb_flush_mmu_free
There is no need to call this from tlb_flush_mmu_tlbonly, it logically
belongs with tlb_flush_mmu_free. This makes future fixes simpler.
[ This was originally done to allow code consolidation for the
mmu_notifier fix, but it also ends up helping simplify the
HAVE_RCU_TABLE_INVALIDATE fix. - Linus ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian Brauner [Thu, 7 Jun 2018 11:43:48 +0000 (13:43 +0200)]
getxattr: use correct xattr length
When running in a container with a user namespace, if you call getxattr
with name = "system.posix_acl_access" and size % 8 != 4, then getxattr
silently skips the user namespace fixup that it normally does resulting in
un-fixed-up data being returned.
This is caused by posix_acl_fix_xattr_to_user() being passed the total
buffer size and not the actual size of the xattr as returned by
vfs_getxattr().
This commit passes the actual length of the xattr as returned by
vfs_getxattr() down.
A reproducer for the issue is:
touch acl_posix
setfacl -m user:0:rwx acl_posix
and the compile:
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <attr/xattr.h>
/* Run in user namespace with nsuid 0 mapped to uid != 0 on the host. */
int main(int argc, void **argv)
{
ssize_t ret1, ret2;
char buf1[128], buf2[132];
int fret = EXIT_SUCCESS;
char *file;
if (argc < 2) {
fprintf(stderr,
"Please specify a file with "
"\"system.posix_acl_access\" permissions set\n");
_exit(EXIT_FAILURE);
}
file = argv[1];
ret1 = getxattr(file, "system.posix_acl_access",
buf1, sizeof(buf1));
if (ret1 < 0) {
fprintf(stderr, "%s - Failed to retrieve "
"\"system.posix_acl_access\" "
"from \"%s\"\n", strerror(errno), file);
_exit(EXIT_FAILURE);
}
ret2 = getxattr(file, "system.posix_acl_access",
buf2, sizeof(buf2));
if (ret2 < 0) {
fprintf(stderr, "%s - Failed to retrieve "
"\"system.posix_acl_access\" "
"from \"%s\"\n", strerror(errno), file);
_exit(EXIT_FAILURE);
}
if (ret1 != ret2) {
fprintf(stderr, "The value of \"system.posix_acl_"
"access\" for file \"%s\" changed "
"between two successive calls\n", file);
_exit(EXIT_FAILURE);
}
for (ssize_t i = 0; i < ret2; i++) {
if (buf1[i] == buf2[i])
continue;
fprintf(stderr,
"Unexpected different in byte %zd: "
"%02x != %02x\n", i, buf1[i], buf2[i]);
fret = EXIT_FAILURE;
}
if (fret == EXIT_SUCCESS)
fprintf(stderr, "Test passed\n");
else
fprintf(stderr, "Test failed\n");
_exit(fret);
}
and run:
./tester acl_posix
On a non-fixed up kernel this should return something like:
root@c1:/# ./t
Unexpected different in byte 16:
ffffffa0 != 00
Unexpected different in byte 17:
ffffff86 != 00
Unexpected different in byte 18: 01 != 00
and on a fixed kernel:
root@c1:~# ./t
Test passed
Cc: stable@vger.kernel.org
Fixes:
2f6f0654ab61 ("userns: Convert vfs posix_acl support to use kuids and kgids")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199945
Reported-by: Colin Watson <cjwatson@ubuntu.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Jens Axboe [Thu, 23 Aug 2018 15:34:46 +0000 (09:34 -0600)]
blk-wbt: don't maintain inflight counts if disabled
A previous commit removed the ability to have per-rq flags. We used
those flags to maintain inflight counts. Since we don't have those
anymore, we have to always maintain inflight counts, even if wbt is
disabled. This is clearly suboptimal.
Add a queue quiesce around changing the wbt latency settings from sysfs
to work around this. With that, we can reliably put the enabled check in
our bio_to_wbt_flags(), since we know the WBT_TRACKED flag will be
consistent for the lifetime of the request.
Fixes:
c1c80384c8f ("block: remove external dependency on wbt_flags")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mahesh Salgaonkar [Thu, 23 Aug 2018 04:56:08 +0000 (10:26 +0530)]
powerpc/mce: Fix SLB rebolting during MCE recovery path.
The commit
e7e81847478 ("powerpc/64s: move machine check SLB flushing
to mm/slb.c") introduced a bug in reloading bolted SLB entries. Unused
bolted entries are stored with .esid=0 in the slb_shadow area, and
that value is now used directly as the RB input to slbmte, which means
the RB[52:63] index field is set to 0, which causes SLB entry 0 to be
cleared.
Fix this by storing the index bits in the unused bolted entries, which
directs the slbmte to the right place.
The SLB shadow area is also used by the hypervisor, but PAPR is okay
with that, from LoPAPR v1.1, 14.11.1.3 SLB Shadow Buffer:
Note: SLB is filled sequentially starting at index 0
from the shadow buffer ignoring the contents of
RB field bits 52-63
Fixes:
e7e81847478b ("powerpc/64s: move machine check SLB flushing to mm/slb.c")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Paul Mackerras [Thu, 23 Aug 2018 00:08:58 +0000 (10:08 +1000)]
KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pages
Commit
76fa4975f3ed ("KVM: PPC: Check if IOMMU page is contained in
the pinned physical page", 2018-07-17) added some checks to ensure
that guest DMA mappings don't attempt to map more than the guest is
entitled to access. However, errors in the logic mean that legitimate
guest requests to map pages for DMA are being denied in some
situations. Specifically, if the first page of the range passed to
mm_iommu_get() is mapped with a normal page, and subsequent pages are
mapped with transparent huge pages, we end up with mem->pageshift ==
0. That means that the page size checks in mm_iommu_ua_to_hpa() and
mm_iommu_up_to_hpa_rm() will always fail for every page in that
region, and thus the guest can never map any memory in that region for
DMA, typically leading to a flood of error messages like this:
qemu-system-ppc64: VFIO_MAP_DMA: -22
qemu-system-ppc64: vfio_dma_map(0x10005f47780, 0x800000000000000, 0x10000, 0x7fff63ff0000) = -22 (Invalid argument)
The logic errors in mm_iommu_get() are:
(a) use of 'ua' not 'ua + (i << PAGE_SHIFT)' in the find_linux_pte()
call (meaning that find_linux_pte() returns the pte for the
first address in the range, not the address we are currently up
to);
(b) use of 'pageshift' as the variable to receive the hugepage shift
returned by find_linux_pte() - for a normal page this gets set
to 0, leading to us setting mem->pageshift to 0 when we conclude
that the pte returned by find_linux_pte() didn't match the page
we were looking at;
(c) comparing 'compshift', which is a page order, i.e. log base 2 of
the number of pages, with 'pageshift', which is a log base 2 of
the number of bytes.
To fix these problems, this patch introduces 'cur_ua' to hold the
current user address and uses that in the find_linux_pte() call;
introduces 'pteshift' to hold the hugepage shift found by
find_linux_pte(); and compares 'pteshift' with 'compshift +
PAGE_SHIFT' rather than 'compshift'.
The patch also moves the local_irq_restore to the point after the PTE
pointer returned by find_linux_pte() has been dereferenced because
otherwise the PTE could change underneath us, and adds a check to
avoid doing the find_linux_pte() call once mem->pageshift has been
reduced to PAGE_SHIFT, as an optimization.
Fixes:
76fa4975f3ed ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Aneesh Kumar K.V [Wed, 22 Aug 2018 17:16:05 +0000 (22:46 +0530)]
powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition
The Nest MMU workaround is only needed for RW upgrades. Avoid doing
that for other PTE updates.
We also avoid clearing the PTE while marking it invalid. This is
because other page table walkers will find this PTE none and can
result in unexpected behaviour due to that. Instead we clear
_PAGE_PRESENT and set the software PTE bit _PAGE_INVALID.
pte_present() is already updated to check for both bits. This makes
sure page table walkers will find the PTE present and things like
pte_pfn(pte) returns the right value.
Based on an original patch from Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>