Muchun Song [Sat, 26 Sep 2020 04:19:05 +0000 (21:19 -0700)]
mm: memcontrol: fix missing suffix of workingset_restore
We forget to add the suffix to the workingset_restore string, so fix it.
And also update the documentation of cgroup-v2.rst.
Fixes: 170b04b7ae49 ("mm/workingset: prepare the workingset detection infrastructure for anon LRU")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: https://lkml.kernel.org/r/20200916100030.71698-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gao Xiang [Sat, 26 Sep 2020 04:19:01 +0000 (21:19 -0700)]
mm, THP, swap: fix allocating cluster for swapfile by mistake
SWP_FS is used to make swap_{read,write}page() go through the
filesystem, and it's only used for swap files over NFS. So, !SWP_FS
means non NFS for now, it could be either file backed or device backed.
Something similar goes with legacy SWP_FILE.
So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.
FS corruption can be observed with SSD device + XFS + fragmented
swapfile due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
CONFIG_BLK_DEV_NVME=y
CONFIG_THP_SWAP=y
Some reproducible steps:
mkfs.xfs -f /dev/nvme0n1
mkdir /tmp/mnt
mount /dev/nvme0n1 /tmp/mnt
bs="32k"
sz="1024m" # doesn't matter too much, I also tried 16m
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
mkswap /tmp/mnt/sw
swapon /tmp/mnt/sw
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
- FS corruption (e.g. checksum failure)
- memory corruption at: 0xd2808010
- segfault
Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Eric Sandeen <esandeen@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 26 Sep 2020 00:15:19 +0000 (17:15 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull more kvm fixes from Paolo Bonzini:
"Five small fixes.
The nested migration bug will be fixed with a better API in 5.10 or
5.11, for now this is a fix that works with existing userspace but
keeps the current ugly API"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Add a dedicated INVD intercept routine
KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
KVM: x86: fix MSR_IA32_TSC read for nested migration
selftests: kvm: Fix assert failure in single-step test
KVM: x86: VMX: Make smaller physical guest address space support user-configurable
Linus Torvalds [Fri, 25 Sep 2020 22:24:04 +0000 (15:24 -0700)]
Merge tag 'mips_fixes_5.9_3' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fixed FP register access on Loongsoon-3
- added missing 1074 cpu handling
- fixed Loongson2ef build error
* tag 'mips_fixes_5.9_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: BCM47XX: Remove the needless check with the 1074K
MIPS: Add the missing 'CPU_1074K' into __get_cpu_type()
MIPS: Loongson2ef: Disable Loongson MMI instructions
MIPS: Loongson-3: Fix fp register access if MSA enabled
Linus Torvalds [Fri, 25 Sep 2020 22:21:54 +0000 (15:21 -0700)]
Merge tag 'spi-fix-v5.9-rc6' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A small collection of driver specific fixes, the fsl-espi and bcm-qspi
changes in particular have been causing breakage for users"
* tag 'spi-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: bcm-qspi: Fix probe regression on iProc platforms
spi: fsl-dspi: fix use-after-free in remove path
spi: fsl-espi: Only process interrupts for expected events
spi: bcm2835: Make polling_limit_us static
spi: spi-fsl-dspi: use XSPI mode instead of DMA for DPAA2 SoCs
Linus Torvalds [Fri, 25 Sep 2020 22:16:01 +0000 (15:16 -0700)]
Merge tag 'regulator-fix-v5.9-rc6' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A single fix for incorrect specification of some of the register
fields on axp20x devices which would break voltage setting on affected
systems"
* tag 'regulator-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: axp20x: fix LDO2/4 description
Linus Torvalds [Fri, 25 Sep 2020 22:11:24 +0000 (15:11 -0700)]
Merge tag 'regmap-fix-v5.9-rc6' of git://git./linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Two issues here - one is a fix for use after free issues in the case
where a regmap overrides its name using something dynamically
generated, the other is that we weren't handling access checks
non-incrementing I/O on registers within paged register regions
correctly resulting in spurious errors.
Both of these are quite rare but serious if they occur"
* tag 'regmap-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: fix page selection for noinc writes
regmap: fix page selection for noinc reads
regmap: debugfs: Add back in erroneously removed initialisation of ret
regmap: debugfs: Fix handling of name string for debugfs init delays
Linus Torvalds [Fri, 25 Sep 2020 17:46:11 +0000 (10:46 -0700)]
Merge tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6
Pull NFS server fix from Chuck Lever:
"Fix incorrect calculation on platforms that implement
flush_dcache_page()"
* tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6:
SUNRPC: Fix svc_flush_dcache()
Linus Torvalds [Fri, 25 Sep 2020 17:39:22 +0000 (10:39 -0700)]
Merge tag 'pm-5.9-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix more fallout of recent RCU-lockdep changes in CPU idle code
and two devfreq issues.
Specifics:
- Export rcu_idle_{enter,exit} to modules to fix build issues
introduced by recent RCU-lockdep fixes (Borislav Petkov)
- Add missing return statement to a stub function in the ACPI
processor driver to fix a build issue introduced by recent
RCU-lockdep fixes (Rafael Wysocki)
- Fix recently introduced suspicious RCU usage warnings in the PSCI
cpuidle driver and drop stale comments regarding RCU_NONIDLE()
usage from enter_s2idle_proper() (Ulf Hansson)
- Fix error code path in the tegra30 devfreq driver (Dan Carpenter)
- Add missing information to devfreq_summary debugfs (Chanwoo Choi)"
* tag 'pm-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
PM / devfreq: tegra30: Disable clock on error in probe
PM / devfreq: Add timer type to devfreq_summary debugfs
cpuidle: Drop misleading comments about RCU usage
cpuidle: psci: Fix suspicious RCU usage
rcu/tree: Export rcu_idle_{enter,exit} to modules
Tom Lendacky [Thu, 24 Sep 2020 18:41:57 +0000 (13:41 -0500)]
KVM: SVM: Add a dedicated INVD intercept routine
The INVD instruction intercept performs emulation. Emulation can't be done
on an SEV guest because the guest memory is encrypted.
Provide a dedicated intercept routine for the INVD intercept. And since
the instruction is emulated as a NOP, just skip it instead.
Fixes: 1654efcbc431 ("KVM: SVM: Add KVM_SEV_INIT command")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <
a0b9a19ffa7fef86a3cc700c7ea01cb2731e04e5.
1600972918.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Fri, 25 Sep 2020 16:49:19 +0000 (09:49 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fix from Jason Gunthorpe:
"One fix for a bug that blktests hits when using rxe: tear down the CQ
pool before waiting for all references to go away"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Fix ordering of CQ pool destruction
Linus Torvalds [Fri, 25 Sep 2020 16:41:57 +0000 (09:41 -0700)]
Merge tag 'drm-fixes-2020-09-25' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Fairly quiet, a couple of i915 fixes, one dma-buf fix, one vc4 and two
sun4i changes
dma-buf:
- Single null pointer deref fix
i915:
- Fix selftest reference to stack data out of scope
- Fix GVT null pointer dereference
vc4:
- fill asoc card owner
sun4i:
- program secondary CSC correctly"
* tag 'drm-fixes-2020-09-25' of git://anongit.freedesktop.org/drm/drm:
drm/i915/selftests: Push the fake iommu device from the stack to data
dmabuf: fix NULL pointer dereference in dma_buf_release()
drm/i915/gvt: Fix port number for BDW on EDID region setup
drm/sun4i: mixer: Extend regmap max_register
drm/sun4i: sun8i-csc: Secondary CSC register correction
drm/vc4/vc4_hdmi: fill ASoC card owner
Rafael J. Wysocki [Fri, 25 Sep 2020 16:33:46 +0000 (18:33 +0200)]
Merge branch 'pm-cpuidle'
* pm-cpuidle:
ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
cpuidle: Drop misleading comments about RCU usage
cpuidle: psci: Fix suspicious RCU usage
rcu/tree: Export rcu_idle_{enter,exit} to modules
Rafael J. Wysocki [Fri, 25 Sep 2020 14:33:19 +0000 (16:33 +0200)]
Merge tag 'devfreq-fixes-for-5.9-rc7' of git://git./linux/kernel/git/chanwoo/linux
Pull devfreq updates for 5.9-rc7 from Chanwoo Choi:
"1. Update devfreq core
- Add missing timer type to devfreq_summary debugfs node.
2. Fix devfreq device driver
- Fix the exception handling about clock on tegra30-devfreq.c"
* tag 'devfreq-fixes-for-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: tegra30: Disable clock on error in probe
PM / devfreq: Add timer type to devfreq_summary debugfs
Sean Christopherson [Wed, 23 Sep 2020 21:53:52 +0000 (14:53 -0700)]
KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
Reset the MMU context during kvm_set_cr4() if SMAP or PKE is toggled.
Recent commits to (correctly) not reload PDPTRs when SMAP/PKE are
toggled inadvertantly skipped the MMU context reset due to the mask
of bits that triggers PDPTR loads also being used to trigger MMU context
resets.
Fixes: 427890aff855 ("kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode")
Fixes: cb957adb4ea4 ("kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode")
Cc: Jim Mattson <jmattson@google.com>
Cc: Peter Shier <pshier@google.com>
Cc: Oliver Upton <oupton@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <
20200923215352.17756-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Dave Airlie [Fri, 25 Sep 2020 01:28:36 +0000 (11:28 +1000)]
Merge tag 'drm-misc-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.9:
- Single null pointer deref fix for dma-buf.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4106c21e-f52c-4c05-6cdb-daa743bb8617@linux.intel.com
Dave Airlie [Fri, 25 Sep 2020 01:07:01 +0000 (11:07 +1000)]
Merge tag 'drm-intel-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.9-rc7:
- Fix selftest reference to stack data out of scope
- Fix GVT null pointer dereference
- Backmerge from Linus' master to fix build
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87zh5fpmha.fsf@intel.com
Dave Airlie [Fri, 25 Sep 2020 01:06:18 +0000 (11:06 +1000)]
BackMerge commit '
98477740630f270aecf648f1d6a9dbc6027d4ff1' into drm-fixes
The dax mess had some fallout, and i915 used a later base to fix their CI.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Maxim Levitsky [Mon, 21 Sep 2020 10:38:05 +0000 (13:38 +0300)]
KVM: x86: fix MSR_IA32_TSC read for nested migration
MSR reads/writes should always access the L1 state, since the (nested)
hypervisor should intercept all the msrs it wants to adjust, and these
that it doesn't should be read by the guest as if the host had read it.
However IA32_TSC is an exception. Even when not intercepted, guest still
reads the value + TSC offset.
The write however does not take any TSC offset into account.
This is documented in Intel's SDM and seems also to happen on AMD as well.
This creates a problem when userspace wants to read the IA32_TSC value and then
write it. (e.g for migration)
In this case it reads L2 value but write is interpreted as an L1 value.
To fix this make the userspace initiated reads of IA32_TSC return L1 value
as well.
Huge thanks to Dave Gilbert for helping me understand this very confusing
semantic of MSR writes.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20200921103805.9102-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Thu, 24 Sep 2020 16:09:47 +0000 (09:09 -0700)]
Merge tag 'mmc-v5.9-rc4-2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
"Fix build warning in mmc_spi when CONFIG_HAS_DMA is unset"
* tag 'mmc-v5.9-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: mmc_spi: Fix mmc_spi_dma_alloc() return type for !HAS_DMA
Linus Torvalds [Thu, 24 Sep 2020 16:05:04 +0000 (09:05 -0700)]
Merge tag 'media/v5.9-3' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- fix a regression at the CEC adapter core
- two uAPI patches (one revert) for changes in this development cycle
* tag 'media/v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: dt-bindings: media: imx274: Convert to json-schema
media: media/v4l2: remove V4L2_FLAG_MEMORY_NON_CONSISTENT flag
media: cec-adap.c: don't use flush_scheduled_work()
Linus Torvalds [Thu, 24 Sep 2020 16:00:05 +0000 (09:00 -0700)]
Merge tag 'sound-5.9-rc7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a handful small device-specific fixes including a couple of
reverts"
* tag 'sound-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
Revert "ALSA: usb-audio: Disable Lenovo P620 Rear line-in volume control"
Revert "ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO"
ALSA: usb-audio: Add delay quirk for H570e USB headsets
ALSA: hda/realtek: Enable front panel headset LED on Lenovo ThinkStation P520
ALSA: hda/realtek - Couldn't detect Mic if booting with headset plugged
ALSA: asihpi: fix iounmap in error handler
Linus Torvalds [Thu, 24 Sep 2020 15:41:32 +0000 (08:41 -0700)]
mm: fix misplaced unlock_page in do_wp_page()
Commit
09854ba94c6a ("mm: do_wp_page() simplification") reorganized all
the code around the page re-use vs copy, but in the process also moved
the final unlock_page() around to after the wp_page_reuse() call.
That normally doesn't matter - but it means that the unlock_page() is
now done after releasing the page table lock. Again, not a big deal,
you'd think.
But it turns out that it's very wrong indeed, because once we've
released the page table lock, we've basically lost our only reference to
the page - the page tables - and it could now be free'd at any time. We
do hold the mmap_sem, so no actual unmap() can happen, but madvise can
come in and a MADV_DONTNEED will zap the page range - and free the page.
So now the page may be free'd just as we're unlocking it, which in turn
will usually trigger a "Bad page state" error in the freeing path. To
make matters more confusing, by the time the debug code prints out the
page state, the unlock has typically completed and everything looks fine
again.
This all doesn't happen in any normal situations, but it does trigger
with the dirtyc0w_child LTP test. And it seems to trigger much more
easily (but not expclusively) on s390 than elsewhere, probably because
s390 doesn't do the "batch pages up for freeing after the TLB flush"
that gives the unlock_page() more time to complete and makes the race
harder to hit.
Fixes: 09854ba94c6a ("mm: do_wp_page() simplification")
Link: https://lore.kernel.org/lkml/a46e9bbef2ed4e17778f5615e818526ef848d791.camel@redhat.com/
Link: https://lore.kernel.org/linux-mm/c41149a8-211e-390b-af1d-d5eee690fecb@linux.alibaba.com/
Reported-by: Qian Cai <cai@redhat.com>
Reported-by: Alex Shi <alex.shi@linux.alibaba.com>
Bisected-and-analyzed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ray Jui [Thu, 10 Sep 2020 15:25:38 +0000 (08:25 -0700)]
spi: bcm-qspi: Fix probe regression on iProc platforms
iProc chips have QSPI controller that does not have the MSPI_REV
offset. Reading from that offset will cause a bus error. Fix it by
having MSPI_REV query disabled in the generic compatible string.
Fixes: 3a01f04d74ef ("spi: bcm-qspi: Handle lack of MSPI_REV offset")
Link: https://lore.kernel.org/linux-arm-kernel/20200909211857.4144718-1-f.fainelli@gmail.com/T/#u
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20200910152539.45584-3-ray.jui@broadcom.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Linus Torvalds [Wed, 23 Sep 2020 21:52:22 +0000 (14:52 -0700)]
Merge tag 'trace-v5.9-rc5-2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull bootconfig fixes from Steven Rostedt:
"A couple of fixes for bootconfig.
Masami discovered two bugs which this fixes and he added tests to
cover these issues.
- Fix a bug that breaks bootconfig tree nodes
- Fix a bug that does not truncate whitespace properly
- Add tests to cover the above two cases"
* tag 'trace-v5.9-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tools/bootconfig: Add testcase for tailing space
tools/bootconfig: Add testcases for repeated key with brace
lib/bootconfig: Fix to remove tailing spaces after value
lib/bootconfig: Fix a bug of breaking existing tree nodes
Linus Torvalds [Wed, 23 Sep 2020 21:38:21 +0000 (14:38 -0700)]
Merge tag 'for-5.9/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- DM core fix for incorrect double bio splitting. Keep "fixing" this
because past attempts didn't fully appreciate the liability relative
to recursive bio splitting. This fix limits DM's bio splitting to a
single method and does _not_ use blk_queue_split() for normal IO.
- DM crypt Documentation updates for features added during 5.9 merge.
* tag 'for-5.9/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm crypt: document encrypted keyring key option
dm crypt: document new no_workqueue flags
dm: fix comment in dm_process_bio()
dm: fix bio splitting and its bio completion order for regular IO
Linus Torvalds [Wed, 23 Sep 2020 21:32:23 +0000 (14:32 -0700)]
Merge tag 'for-5.9-rc6-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"syzkaller started to hit us with reports, here's a fix for one type
(stack overflow when printing checksums on read error).
The other patch is a fix for sysfs object, we have a test for that and
it leads to a crash."
* tag 'for-5.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix put of uninitialized kobject after seed device delete
btrfs: fix overflow when copying corrupt csums for a message
Linus Torvalds [Wed, 23 Sep 2020 17:04:16 +0000 (10:04 -0700)]
mm: move the copy_one_pte() pte_present check into the caller
This completes the split of the non-present and present pte cases by
moving the check for the source pte being present into the single
caller, which also means that we clearly separate out the very different
return value case for a non-present pte.
The present pte case currently always succeeds.
This is a pure code re-organization with no semantic change: the intent
is to make it much easier to add a new return case to the present pte
case for when we do early COW at page table copy time.
This was split out from the previous commit simply to make it easy to
visually see that there were no semantic changes from this code
re-organization.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 23 Sep 2020 16:56:59 +0000 (09:56 -0700)]
mm: split out the non-present case from copy_one_pte()
This is a purely mechanical split of the copy_one_pte() function. It's
not immediately obvious when looking at the diff because of the
indentation change, but the way to see what is going on in this commit
is to use the "-w" flag to not show pure whitespace changes, and you see
how the first part of copy_one_pte() is simply lifted out into a
separate function.
And since the non-present case is marked unlikely, don't make the new
function be inlined. Not that gcc really seems to care, since it looks
like it will inline it anyway due to the whole "single callsite for
static function" logic. In fact, code generation with the function
split is almost identical to before. But not marking it inline is the
right thing to do.
This is pure prep-work and cleanup for subsequent changes.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sascha Hauer [Wed, 23 Sep 2020 13:10:26 +0000 (15:10 +0200)]
spi: fsl-dspi: fix use-after-free in remove path
spi_unregister_controller() not only unregisters the controller, but
also frees the controller. This will free the driver data with it, so
we must not access it later dspi_remove().
Solve this by allocating the driver data separately from the SPI
controller.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20200923131026.20707-1-s.hauer@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Icenowy Zheng [Wed, 23 Sep 2020 00:51:42 +0000 (08:51 +0800)]
regulator: axp20x: fix LDO2/4 description
Currently we wrongly set the mask of value of LDO2/4 both to the mask of
LDO2, and the LDO4 voltage configuration is left untouched. This leads
to conflict when LDO2/4 are both in use.
Fix this issue by setting different vsel_mask to both regulators.
Fixes: db4a555f7c4c ("regulator: axp20x: use defines for masks")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Link: https://lore.kernel.org/r/20200923005142.147135-1-icenowy@aosc.io
Signed-off-by: Mark Brown <broonie@kernel.org>
Yang Weijiang [Wed, 26 Aug 2020 01:55:24 +0000 (09:55 +0800)]
selftests: kvm: Fix assert failure in single-step test
This is a follow-up patch to fix an issue left in commit:
98b0bf02738004829d7e26d6cb47b2e469aaba86
selftests: kvm: Use a shorter encoding to clear RAX
With the change in the commit, we also need to modify "xor" instruction
length from 3 to 2 in array ss_size accordingly to pass below check:
for (i = 0; i < (sizeof(ss_size) / sizeof(ss_size[0])); i++) {
target_rip += ss_size[i];
CLEAR_DEBUG();
debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
debug.arch.debugreg[7] = 0x00000400;
APPLY_DEBUG();
vcpu_run(vm, VCPU_ID);
TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
run->debug.arch.exception == DB_VECTOR &&
run->debug.arch.pc == target_rip &&
run->debug.arch.dr6 == target_dr6,
"SINGLE_STEP[%d]: exit %d exception %d rip 0x%llx "
"(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
i, run->exit_reason, run->debug.arch.exception,
run->debug.arch.pc, target_rip, run->debug.arch.dr6,
target_dr6);
}
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <
20200826015524.13251-1-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mohammed Gamal [Thu, 3 Sep 2020 14:11:22 +0000 (16:11 +0200)]
KVM: x86: VMX: Make smaller physical guest address space support user-configurable
This patch exposes allow_smaller_maxphyaddr to the user as a module parameter.
Since smaller physical address spaces are only supported on VMX, the
parameter is only exposed in the kvm_intel module.
For now disable support by default, and let the user decide if they want
to enable it.
Modifications to VMX page fault and EPT violation handling will depend
on whether that parameter is enabled.
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Message-Id: <
20200903141122.72908-1-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Wei Li [Wed, 23 Sep 2020 06:53:26 +0000 (14:53 +0800)]
MIPS: BCM47XX: Remove the needless check with the 1074K
As there is no known soc powered by mips 1074K in bcm47xx series,
the check with 1074K is needless. So just remove it.
Link: https://wireless.wiki.kernel.org/en/users/Drivers/b43/soc
Fixes: 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.")
Signed-off-by: Wei Li <liwei391@huawei.com>
Acked-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Wei Li [Wed, 23 Sep 2020 06:53:12 +0000 (14:53 +0800)]
MIPS: Add the missing 'CPU_1074K' into __get_cpu_type()
Commit
442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.") split
1074K from the 74K as an unique CPU type, while it missed to add the
'CPU_1074K' in __get_cpu_type(). So let's add it back.
Fixes: 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.")
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Jiaxun Yang [Wed, 23 Sep 2020 10:33:12 +0000 (18:33 +0800)]
MIPS: Loongson2ef: Disable Loongson MMI instructions
It was missed when I was forking Loongson2ef from Loongson64 but
should be applied to Loongson2ef as march=loongson2f
will also enable Loongson MMI in GCC-9+.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Fixes: 71e2f4dd5a65 ("MIPS: Fork loongson2ef from loongson64")
Reported-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Rafael J. Wysocki [Wed, 23 Sep 2020 11:50:12 +0000 (13:50 +0200)]
ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
Fix the lapic_timer_needs_broadcast() stub for
ARCH_APICTIMER_STOPS_ON_C3 unset to actually return
a value.
Fixes: aa6b43d57f99 ("ACPI: processor: Use CPUIDLE_FLAG_TIMER_STOP")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Chris Wilson [Wed, 16 Sep 2020 10:50:22 +0000 (11:50 +0100)]
drm/i915/selftests: Push the fake iommu device from the stack to data
Since we store a pointer to the fake iommu device that is allocated on
the stack, as soon as we leave the function it goes out of scope and any
future dereference is undefined behaviour. Just in case we may need to
look at the fake iommu device after initialiation, move the allocation
from the stack into the data.
Fixes: 01b9d4e21148 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916105022.28316-2-chris@chris-wilson.co.uk
(cherry picked from commit
9f9f4101fc98db56714e71676d5a1e2d27e01f7e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Dan Carpenter [Tue, 8 Sep 2020 07:25:57 +0000 (10:25 +0300)]
PM / devfreq: tegra30: Disable clock on error in probe
This error path needs to call clk_disable_unprepare().
Fixes: 7296443b900e ("PM / devfreq: tegra30: Handle possible round-rate error")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Tue, 8 Sep 2020 10:57:07 +0000 (19:57 +0900)]
PM / devfreq: Add timer type to devfreq_summary debugfs
The commit
4dc3bab8687f ("PM / devfreq: Add support delayed timer for
polling mode") supports the delayed timer but this commit missed
the adding the timer type to devfreq_summary debugfs node.
Add the timer type to devfreq_summary debugfs.
Fixes: 4dc3bab8687f ("PM / devfreq: Add support delayed timer for polling mode")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Dave Airlie [Tue, 22 Sep 2020 23:30:29 +0000 (09:30 +1000)]
Merge tag 'drm-misc-fixes-2020-09-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.9-rc6:
- Fill asoc card owner in vc4.
- Program secondary CSC correctly in sun4i, and extend
register mapping to cover secondary CSC registers.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e3ab56cf-3b8e-9b21-f1b6-9a4989a52996@linux.intel.com
Linus Torvalds [Tue, 22 Sep 2020 22:08:41 +0000 (15:08 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"No common topic, just assorted fixes"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fuse: fix the ->direct_IO() treatment of iov_iter
fs: fix cast in fsparam_u32hex() macro
vboxsf: Fix the check for the old binary mount-arguments struct
Linus Torvalds [Tue, 22 Sep 2020 21:43:50 +0000 (14:43 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
- fix failure to add bond interfaces to a bridge, the offload-handling
code was too defensive there and recent refactoring unearthed that.
Users complained (Ido)
- fix unnecessarily reflecting ECN bits within TOS values / QoS marking
in TCP ACK and reset packets (Wei)
- fix a deadlock with bpf iterator. Hopefully we're in the clear on
this front now... (Yonghong)
- BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel)
- fix AQL on mt76 devices with FW rate control and add a couple of AQL
issues in mac80211 code (Felix)
- fix authentication issue with mwifiex (Maximilian)
- WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro)
- fix exception handling for multipath routes via same device (David
Ahern)
- revert back to a BH spin lock flavor for nsid_lock: there are paths
which do require the BH context protection (Taehee)
- fix interrupt / queue / NAPI handling in the lantiq driver (Hauke)
- fix ife module load deadlock (Cong)
- make an adjustment to netlink reply message type for code added in
this release (the sole change touching uAPI here) (Michal)
- a number of fixes for small NXP and Microchip switches (Vladimir)
[ Pull request acked by David: "you can expect more of this in the
future as I try to delegate more things to Jakub" ]
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits)
net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
net: Update MAINTAINERS for MediaTek switch driver
net/mlx5e: mlx5e_fec_in_caps() returns a boolean
net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
net/mlx5e: kTLS, Fix leak on resync error flow
net/mlx5e: kTLS, Add missing dma_unmap in RX resync
net/mlx5e: kTLS, Fix napi sync and possible use-after-free
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
net/mlx5e: Fix multicast counter not up-to-date in "ip -s"
net/mlx5e: Fix endianness when calculating pedit mask first bit
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
net/mlx5e: CT: Fix freeing ct_label mapping
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
net/mlx5e: Use synchronize_rcu to sync with NAPI
net/mlx5e: Use RCU to protect rq->xdp_prog
...
Linus Torvalds [Tue, 22 Sep 2020 21:36:50 +0000 (14:36 -0700)]
Merge tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A few fixes - most of them regression fixes from this cycle, but also
a few stable heading fixes, and a build fix for the included demo tool
since some systems now actually have gettid() available"
* tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
io_uring: fix openat/openat2 unified prep handling
io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
tools/io_uring: fix compile breakage
io_uring: don't use retry based buffered reads for non-async bdev
io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there
io_uring: don't run task work on an exiting task
io_uring: drop 'ctx' ref on task work cancelation
io_uring: grab any needed state during defer prep
Linus Torvalds [Tue, 22 Sep 2020 21:31:38 +0000 (14:31 -0700)]
Merge tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few NVMe fixes, and a dasd write zero fix"
* tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
nvmet: get transport reference for passthru ctrl
nvme-core: get/put ctrl and transport module in nvme_dev_open/release()
nvme-tcp: fix kconfig dependency warning when !CRYPTO
nvme-pci: disable the write zeros command for Intel 600P/P3100
s390/dasd: Fix zero write for FBA devices
Ulf Hansson [Tue, 22 Sep 2020 09:15:50 +0000 (11:15 +0200)]
cpuidle: Drop misleading comments about RCU usage
The commit
1098582a0f6c ("sched,idle,rcu: Push rcu_idle deeper into the
idle path"), moved the calls rcu_idle_enter|exit() into the cpuidle core.
However, it forgot to remove a couple of comments in enter_s2idle_proper()
about why RCU_NONIDLE earlier was needed. So, let's drop them as they have
become a bit misleading.
Fixes: 1098582a0f6c ("sched,idle,rcu: Push rcu_idle deeper into the idle path")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Milan Broz [Thu, 20 Aug 2020 19:20:26 +0000 (21:20 +0200)]
dm crypt: document encrypted keyring key option
Commit
27f5411a718c4 ("dm crypt: support using encrypted keys")
introduced support for encrypted keyring type.
Fix documentation in admin guide to mention this type.
Fixes: 27f5411a718c4 ("dm crypt: support using encrypted keys")
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Jani Nikula [Tue, 22 Sep 2020 17:25:11 +0000 (20:25 +0300)]
Merge tag 'gvt-fixes-2020-09-17' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2020-09-17
- Fix kernel oops for VFIO edid on BDW (Zhenyu)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917064208.GF11592@zhen-hp.sh.intel.com
Milan Broz [Thu, 20 Aug 2020 17:45:38 +0000 (19:45 +0200)]
dm crypt: document new no_workqueue flags
Commit
39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd
workqueues") introduced new dm-crypt 'no_read_workqueue' and
'no_write_workqueue' flags.
Add documentation to admin guide for them.
Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues")
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Linus Torvalds [Tue, 22 Sep 2020 16:08:33 +0000 (09:08 -0700)]
Merge tag 'trace-v5.9-rc5' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Check kprobe is enabled before unregistering from ftrace as it isn't
registered when disabled.
- Remove kprobes enabled via command-line that is on init text when
freed.
- Add missing RCU synchronization for ftrace trampoline symbols removed
from kallsyms.
- Free trampoline on error path if ftrace_startup() fails.
- Give more space for the longer PID numbers in trace output.
- Fix a possible double free in the histogram code.
- A couple of fixes that were discovered by sparse.
* tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
bootconfig: init: make xbc_namebuf static
kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
tracing: fix double free
ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
tracing: Make the space reserved for the pid wider
ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
ftrace: Free the trampoline when ftrace_startup() fails
kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
Kai-Heng Feng [Tue, 15 Sep 2020 10:39:23 +0000 (18:39 +0800)]
Revert "ALSA: usb-audio: Disable Lenovo P620 Rear line-in volume control"
This reverts commit
34dedd2a83b241ba6aeb290260313c65dc58660e.
According to Realtek, volume FU works for line-in.
I can confirm volume control works after device firmware is updated.
Fixes: 34dedd2a83b2 ("ALSA: usb-audio: Disable Lenovo P620 Rear line-in volume control")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Link: https://lore.kernel.org/r/20200915103925.12777-1-kai.heng.feng@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Anand Jain [Fri, 4 Sep 2020 17:34:21 +0000 (01:34 +0800)]
btrfs: fix put of uninitialized kobject after seed device delete
The following test case leads to NULL kobject free error:
mount seed /mnt
add sprout to /mnt
umount /mnt
mount sprout to /mnt
delete seed
kobject: '(null)' (
00000000dd2b87e4): is not initialized, yet kobject_put() is being called.
WARNING: CPU: 1 PID: 15784 at lib/kobject.c:736 kobject_put+0x80/0x350
RIP: 0010:kobject_put+0x80/0x350
::
Call Trace:
btrfs_sysfs_remove_devices_dir+0x6e/0x160 [btrfs]
btrfs_rm_device.cold+0xa8/0x298 [btrfs]
btrfs_ioctl+0x206c/0x22a0 [btrfs]
ksys_ioctl+0xe2/0x140
__x64_sys_ioctl+0x1e/0x29
do_syscall_64+0x96/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4047c6288b
::
This is because, at the end of the seed device-delete, we try to remove
the seed's devid sysfs entry. But for the seed devices under the sprout
fs, we don't initialize the devid kobject yet. So add a kobject state
check, which takes care of the bug.
Fixes: 668e48af7a94 ("btrfs: sysfs, add devid/dev_state kobject and device attributes")
CC: stable@vger.kernel.org # 5.6+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Huacai Chen [Mon, 24 Aug 2020 07:44:03 +0000 (15:44 +0800)]
MIPS: Loongson-3: Fix fp register access if MSA enabled
If MSA is enabled, FPU_REG_WIDTH is 128 rather than 64, then get_fpr64()
/set_fpr64() in the original unaligned instruction emulation code access
the wrong fp registers. This is because the current code doesn't specify
the correct index field, so fix it.
Fixes: f83e4f9896eff614d0f2547a ("MIPS: Loongson-3: Add some unaligned instructions emulation")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Pei Huang <huangpei@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Jani Nikula [Tue, 22 Sep 2020 10:15:19 +0000 (13:15 +0300)]
Merge remote-tracking branch 'origin/master' into drm-intel-fixes
Direct backmerge to get commit
88b67edd7247 ("dax: Fix compilation for
CONFIG_DAX && !CONFIG_FS_DAX") to fix CI build issues.
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jacopo Mondi [Sat, 12 Sep 2020 10:30:45 +0000 (12:30 +0200)]
media: dt-bindings: media: imx274: Convert to json-schema
Convert the imx274 bindings document to json-schema and update
the MAINTAINERS file accordingly.
Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Masami Hiramatsu [Mon, 21 Sep 2020 09:45:11 +0000 (18:45 +0900)]
tools/bootconfig: Add testcase for tailing space
Add testcases for removing/keeping tailing space
in the value.
Link: https://lkml.kernel.org/r/160068151151.1088739.3469541807296024227.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 21 Sep 2020 09:45:02 +0000 (18:45 +0900)]
tools/bootconfig: Add testcases for repeated key with brace
Add a testcase for repeated key with brace parsing issue.
Link: https://lkml.kernel.org/r/160068150176.1088739.409481347784771987.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 21 Sep 2020 09:44:51 +0000 (18:44 +0900)]
lib/bootconfig: Fix to remove tailing spaces after value
Fix to remove tailing spaces after value. If there is a space
after value, the bootconfig failed to remove it because it
applies strim() before replacing the delimiter with null.
For example,
foo = var # comment
was parsed as below.
foo="var "
but user will expect
foo="var"
This fixes it by applying strim() after removing the delimiter.
Link: https://lkml.kernel.org/r/160068149134.1088739.8868306567670058853.stgit@devnote2
Fixes: 76db5a27a827 ("bootconfig: Add Extra Boot Config support")
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 21 Sep 2020 09:44:42 +0000 (18:44 +0900)]
lib/bootconfig: Fix a bug of breaking existing tree nodes
Fix a bug of breaking existing tree nodes by parsing the second
and subsequent braces. Since the bootconfig parser uses the
node.next field as a flag of current parent node, but this will
break the existing tree if the same key node is specified again
in the bootconfig.
For example, the following bootconfig should be foo.buz and bar.
foo
bar
foo { buz }
However, when parsing the brace "{", it breaks foo->bar link
by marking open-brace node. So the bootconfig unlinks bar
from the bootconfig internal tree.
This introduces a stack outside of the tree and record the
last open-brace on the stack instead of using node.next field.
Link: https://lkml.kernel.org/r/160068148267.1088739.8264704338030168660.stgit@devnote2
Fixes: 76db5a27a827 ("bootconfig: Add Extra Boot Config support")
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
David S. Miller [Tue, 22 Sep 2020 00:40:53 +0000 (17:40 -0700)]
Merge branch 'Fix-broken-tc-flower-rules-for-mscc_ocelot-switches'
Vladimir Oltean says:
====================
Fix broken tc-flower rules for mscc_ocelot switches
All 3 switch drivers from the Ocelot family have the same bug in the
VCAP IS2 key offsets, which is that some keys are in the incorrect
order.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 21 Sep 2020 22:56:38 +0000 (01:56 +0300)]
net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
The IS2 IP4_TCP_UDP key offsets do not correspond to the VSC7514
datasheet. Whether they work or not is unknown to me. On VSC9959 and
VSC9953, with the same mistake and same discrepancy from the
documentation, tc-flower src_port and dst_port rules did not work, so I
am assuming the same is true here.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 21 Sep 2020 22:56:37 +0000 (01:56 +0300)]
net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Since these were copied from the Felix VCAP IS2 code, and only the
offsets were adjusted, the order of the bit fields is still wrong.
Fix it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaoliang Yang [Mon, 21 Sep 2020 22:56:36 +0000 (01:56 +0300)]
net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT,
L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from
matching on src_port and dst_port for TCP and UDP packets.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 21 Sep 2020 14:27:20 +0000 (07:27 -0700)]
inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
User space could send an invalid INET_DIAG_REQ_PROTOCOL attribute
as caught by syzbot.
BUG: KMSAN: uninit-value in inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
BUG: KMSAN: uninit-value in __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
CPU: 0 PID: 8505 Comm: syz-executor174 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x21c/0x280 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:122
__msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:219
inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
__inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
inet_diag_dump_compat+0x2a5/0x380 net/ipv4/inet_diag.c:1254
netlink_dump+0xb73/0x1cb0 net/netlink/af_netlink.c:2246
__netlink_dump_start+0xcf2/0xea0 net/netlink/af_netlink.c:2354
netlink_dump_start include/linux/netlink.h:246 [inline]
inet_diag_rcv_msg_compat+0x5da/0x6c0 net/ipv4/inet_diag.c:1288
sock_diag_rcv_msg+0x24f/0x620 net/core/sock_diag.c:256
netlink_rcv_skb+0x6d7/0x7e0 net/netlink/af_netlink.c:2470
sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x11c8/0x1490 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x173a/0x1840 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d1/0x820 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x441389
Code: e8 fc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007fff3b02ce98 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
0000000000441389
RDX:
0000000000000000 RSI:
0000000020001500 RDI:
0000000000000003
RBP:
00000000006cb018 R08:
00000000004002c8 R09:
00000000004002c8
R10:
0000000000000004 R11:
0000000000000246 R12:
0000000000402130
R13:
00000000004021c0 R14:
0000000000000000 R15:
0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:143 [inline]
kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:126
kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:80
slab_alloc_node mm/slub.c:2907 [inline]
__kmalloc_node_track_caller+0x9aa/0x12f0 mm/slub.c:4511
__kmalloc_reserve net/core/skbuff.c:142 [inline]
__alloc_skb+0x35f/0xb30 net/core/skbuff.c:210
alloc_skb include/linux/skbuff.h:1094 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
netlink_sendmsg+0xdb9/0x1840 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d1/0x820 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 3f935c75eb52 ("inet_diag: support for wider protocol numbers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 21 Sep 2020 22:07:09 +0000 (01:07 +0300)]
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
When calling the RCU brother of br_vlan_get_pvid(), lockdep warns:
=============================
WARNING: suspicious RCU usage
5.9.0-rc3-01631-g13c17acb8e38-dirty #814 Not tainted
-----------------------------
net/bridge/br_private.h:1054 suspicious rcu_dereference_protected() usage!
Call trace:
lockdep_rcu_suspicious+0xd4/0xf8
__br_vlan_get_pvid+0xc0/0x100
br_vlan_get_pvid_rcu+0x78/0x108
The warning is because br_vlan_get_pvid_rcu() calls nbp_vlan_group()
which calls rtnl_dereference() instead of rcu_dereference(). In turn,
rtnl_dereference() calls rcu_dereference_protected() which assumes
operation under an RCU write-side critical section, which obviously is
not the case here. So, when the incorrect primitive is used to access
the RCU-protected VLAN group pointer, READ_ONCE() is not used, which may
cause various unexpected problems.
I'm sad to say that br_vlan_get_pvid() and br_vlan_get_pvid_rcu() cannot
share the same implementation. So fix the bug by splitting the 2
functions, and making br_vlan_get_pvid_rcu() retrieve the VLAN groups
under proper locking annotations.
Fixes: 7582f5b70f9a ("bridge: add br_vlan_get_pvid_rcu()")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 22 Sep 2020 00:32:42 +0000 (17:32 -0700)]
Merge tag 'mlx5-fixes-2020-09-18' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes-2020-09-18
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
v1->v2:
Remove missing patch from -stable list.
For -stable v5.1
('net/mlx5: Fix FTE cleanup')
For -stable v5.3
('net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported')
('net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported')
For -stable v5.7
('net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready')
For -stable v5.8
('net/mlx5e: Use RCU to protect rq->xdp_prog')
('net/mlx5e: Fix endianness when calculating pedit mask first bit')
('net/mlx5e: Use synchronize_rcu to sync with NAPI')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Mon, 21 Sep 2020 23:09:23 +0000 (07:09 +0800)]
net: Update MAINTAINERS for MediaTek switch driver
Update maintainers for MediaTek switch driver with Landen Chao who is
familiar with MediaTek MT753x switch devices and will help maintenance
from the vendor side.
Cc: Steven Liu <steven.liu@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Landen Chao <Landen.Chao@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed [Fri, 11 Sep 2020 18:00:06 +0000 (11:00 -0700)]
net/mlx5e: mlx5e_fec_in_caps() returns a boolean
Returning errno is a bug, fix that.
Also fixes smatch warnings:
drivers/net/ethernet/mellanox/mlx5/core/en/port.c:453
mlx5e_fec_in_caps() warn: signedness bug returning '(-95)'
Fixes: 2132b71f78d2 ("net/mlx5e: Advertise globaly supported FEC modes")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Saeed Mahameed [Tue, 8 Sep 2020 05:54:43 +0000 (22:54 -0700)]
net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
The spinlock only needed when accessing the channel's icosq, grab the lock
after the buf allocation in resync_post_get_progress_params() to avoid
kzalloc(GFP_KERNEL) in atomic context.
Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Reported-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Saeed Mahameed [Fri, 11 Sep 2020 04:16:04 +0000 (21:16 -0700)]
net/mlx5e: kTLS, Fix leak on resync error flow
Resync progress params buffer and dma weren't released on error,
Add missing error unwinding for resync_post_get_progress_params().
Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Saeed Mahameed [Tue, 8 Sep 2020 05:58:50 +0000 (22:58 -0700)]
net/mlx5e: kTLS, Add missing dma_unmap in RX resync
Progress params dma address is never unmapped, unmap it when completion
handling is over.
Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Tariq Toukan [Mon, 10 Aug 2020 12:59:41 +0000 (15:59 +0300)]
net/mlx5e: kTLS, Fix napi sync and possible use-after-free
Using synchronize_rcu() is sufficient to wait until running NAPI quits.
See similar upstream fix with detailed explanation:
("net/mlx5e: Use synchronize_rcu to sync with NAPI")
This change also fixes a possible use-after-free as the NAPI
might be already released at this stage.
Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Tariq Toukan [Sun, 28 Jun 2020 10:06:06 +0000 (13:06 +0300)]
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
The set of TLS TX global SW counters in mlx5e_tls_sw_stats_desc
is updated from all rings by using atomic ops.
This set of stats is used only in the FPGA TLS use case, not in
the Connect-X TLS one, where regular per-ring counters are used.
Do not expose them in the Connect-X use case, as this would cause
counter duplication. For example, tx_tls_drop_no_sync_data would
appear twice in the ethtool stats.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Alaa Hleihel [Tue, 25 Aug 2020 07:41:50 +0000 (10:41 +0300)]
net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
The cited commit started to reuse function mlx5e_update_ndo_stats() for
the representors as well.
However, the function is hard-coded to work on mlx5e_nic_stats_grps only.
Due to this issue, the representors statistics were not updated in the
output of "ip -s".
Fix it to work with the correct group by extracting it from the caller's
profile.
Also, while at it and since this function became generic, move it to
en_stats.c and rename it accordingly.
Fixes: 8a236b15144b ("net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infra")
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ron Diskin [Sun, 10 May 2020 11:39:51 +0000 (14:39 +0300)]
net/mlx5e: Fix multicast counter not up-to-date in "ip -s"
Currently the FW does not generate events for counters other than error
counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip -s
uses) might run in atomic context, while the FW interface is non atomic.
Thus, 'ip' is not allowed to issue FW commands, so it will only display
cached counters in the driver.
Add a SW counter (mcast_packets) in the driver to count rx multicast
packets. The counter also counts broadcast packets, as we consider it a
special case of multicast.
Use the counter value when calling "ip -s"/"ifconfig".
Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Dickman [Wed, 2 Sep 2020 13:49:52 +0000 (16:49 +0300)]
net/mlx5e: Fix endianness when calculating pedit mask first bit
The field mask value is provided in network byte order and has to
be converted to host byte order before calculating pedit mask
first bit.
Fixes: 88f30bbcbaaa ("net/mlx5e: Bit sized fields rewrite support")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maor Dickman [Wed, 5 Aug 2020 14:56:04 +0000 (17:56 +0300)]
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
The cited commit creates peer miss group during switchdev mode
initialization in order to handle miss packets correctly while in VF
LAG mode. This is done regardless of FW support of such groups which
could cause rules setups failure later on.
Fix by adding FW capability check before creating peer groups/rule.
Fixes: ac004b832128 ("net/mlx5e: E-Switch, Add peer miss rules")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Roi Dayan [Sun, 26 Jul 2020 13:37:47 +0000 (16:37 +0300)]
net/mlx5e: CT: Fix freeing ct_label mapping
Add missing mapping remove call when removing ct rule,
as the mapping was allocated when ct rule was adding with ct_label.
Also there is a missing mapping remove call in error flow.
Fixes: 54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Jianbo Liu [Tue, 7 Jul 2020 06:16:24 +0000 (06:16 +0000)]
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
When deleting vxlan flow rule under multipath, tun_info in parse_attr is
not freed when the rule is not ready.
Fixes: ef06c9ee8933 ("net/mlx5e: Allow one failure when offloading tc encap rules under multipath")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maxim Mikityanskiy [Thu, 11 Jun 2020 11:25:19 +0000 (14:25 +0300)]
net/mlx5e: Use synchronize_rcu to sync with NAPI
As described in the previous commit, napi_synchronize doesn't quite fit
the purpose when we just need to wait until the currently running NAPI
quits. Its implementation waits until NAPI is not running by polling and
waiting for 1ms in between. In cases where we need to deactivate one
queue (e.g., recovery flows) or where we deactivate them one-by-one
(deactivate channel flow), we may get stuck in napi_synchronize forever
if other queues keep NAPI active, causing a soft lockup. Depending on
kernel configuration (CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC), it may result
in a kernel panic.
To fix the issue, use synchronize_rcu to wait for NAPI to quit, and wrap
the whole NAPI in rcu_read_lock.
Fixes: acc6c5953af1 ("net/mlx5e: Split open/close channels to stages")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maxim Mikityanskiy [Thu, 11 Jun 2020 10:55:19 +0000 (13:55 +0300)]
net/mlx5e: Use RCU to protect rq->xdp_prog
Currently, the RQs are temporarily deactivated while hot-replacing the
XDP program, and napi_synchronize is used to make sure rq->xdp_prog is
not in use. However, napi_synchronize is not ideal: instead of waiting
till the end of a NAPI cycle, it polls and waits until NAPI is not
running, sleeping for 1ms between the periodic checks. Under heavy
workloads, this loop will never end, which may even lead to a kernel
panic if the kernel detects the hangup. Such workloads include XSK TX
and possibly also heavy RX (XSK or normal).
The fix is inspired by commit
326fe02d1ed6 ("net/mlx4_en: protect
ring->xdp_prog with rcu_read_lock"). As mlx5e_xdp_handle is already
protected by rcu_read_lock, and bpf_prog_put uses call_rcu to free the
program, there is no need for additional synchronization if proper RCU
functions are used to access the pointer. This patch converts all
accesses to rq->xdp_prog to use RCU functions.
Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Mon, 31 Aug 2020 17:50:42 +0000 (20:50 +0300)]
net/mlx5: Fix FTE cleanup
Currently, when an FTE is allocated, its refcount is decreased to 0
with the purpose it will not be a stand alone steering object and every
rule (destination) of the FTE would increase the refcount.
When mlx5_cleanup_fs is called while not all rules were deleted by the
steering users, it hit refcount underflow on the FTE once clean_tree
calls to tree_remove_node after the deleted rules already decreased
the refcount to 0.
FTE is no longer destroyed implicitly when the last rule (destination)
is deleted. mlx5_del_flow_rules avoids it by increasing the refcount on
the FTE and destroy it explicitly after all rules were deleted. So we
can avoid the refcount underflow by making FTE as stand alone object.
In addition need to set del_hw_func to FTE so the HW object will be
destroyed when the FTE is deleted from the cleanup_tree flow.
refcount_t: underflow; use-after-free.
WARNING: CPU: 2 PID: 15715 at lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
tree_put_node+0xf2/0x140 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x5f/0xf0 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x5f/0xf0 [mlx5_core]
mlx5_cleanup_fs+0x26/0x270 [mlx5_core]
mlx5_unload+0x2e/0xa0 [mlx5_core]
mlx5_unload_one+0x51/0x120 [mlx5_core]
mlx5_devlink_reload_down+0x51/0x90 [mlx5_core]
devlink_reload+0x39/0x120
? devlink_nl_cmd_reload+0x43/0x220
genl_rcv_msg+0x1e4/0x420
? genl_family_rcv_msg_attrs_parse+0x100/0x100
netlink_rcv_skb+0x47/0x110
genl_rcv+0x24/0x40
netlink_unicast+0x217/0x2f0
netlink_sendmsg+0x30f/0x430
sock_sendmsg+0x30/0x40
__sys_sendto+0x10e/0x140
? handle_mm_fault+0xc4/0x1f0
? do_page_fault+0x33f/0x630
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x48/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes")
Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mike Snitzer [Mon, 21 Sep 2020 23:08:30 +0000 (19:08 -0400)]
dm: fix comment in dm_process_bio()
Refer to the correct function (->submit_bio instead of ->queue_bio).
Also, add details about why using blk_queue_split() isn't needed for
dm_wq_work()'s call to dm_process_bio().
Fixes: c62b37d96b6eb ("block: move ->make_request_fn to struct block_device_operations")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Mike Snitzer [Mon, 14 Sep 2020 17:04:19 +0000 (13:04 -0400)]
dm: fix bio splitting and its bio completion order for regular IO
dm_queue_split() is removed because __split_and_process_bio() _must_
handle splitting bios to ensure proper bio submission and completion
ordering as a bio is split.
Otherwise, multiple recursive calls to ->submit_bio will cause multiple
split bios to be allocated from the same ->bio_split mempool at the same
time. This would result in deadlock in low memory conditions because no
progress could be made (only one bio is available in ->bio_split
mempool).
This fix has been verified to still fix the loss of performance, due
to excess splitting, that commit
120c9257f5f1 provided.
Fixes: 120c9257f5f1 ("Revert "dm: always call blk_queue_split() in dm_process_bio()"")
Cc: stable@vger.kernel.org # 5.0+, requires custom backport due to 5.9 changes
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
David S. Miller [Mon, 21 Sep 2020 21:54:35 +0000 (14:54 -0700)]
Merge tag 'mac80211-for-net-2020-09-21' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just a few fixes:
* fix using HE on 2.4 GHz
* AQL (airtime queue limit) estimation & VHT160 fix
* do not oversize A-MPDUs if local capability is smaller than peer's
* fix radiotap on 6 GHz to not put 2.4 GHz flag
* fix Kconfig for lib80211
* little fixlet for 6 GHz channel number / frequency conversion
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xu Wang [Mon, 21 Sep 2020 06:38:56 +0000 (06:38 +0000)]
ipv6: route: convert comma to semicolon
Replace a comma between expression statements by a semicolon.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 18 Sep 2020 14:33:11 +0000 (17:33 +0300)]
sfc: Fix error code in probe
This failure path should return a negative error code but it currently
returns success.
Fixes: 51b35a454efd ("sfc: skeleton EF100 PF driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Packham [Fri, 4 Sep 2020 00:28:12 +0000 (12:28 +1200)]
spi: fsl-espi: Only process interrupts for expected events
The SPIE register contains counts for the TX FIFO so any time the irq
handler was invoked we would attempt to process the RX/TX fifos. Use the
SPIM value to mask the events so that we only process interrupts that
were expected.
This was a latent issue exposed by commit
3282a3da25bd ("powerpc/64:
Implement soft interrupt replay in C").
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Link: https://lore.kernel.org/r/20200904002812.7300-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Mark Brown <broonie@kernel.org>
Dmitry Baryshkov [Thu, 17 Sep 2020 15:34:05 +0000 (18:34 +0300)]
regmap: fix page selection for noinc writes
Non-incrementing writes can fail if register + length crosses page
border. However for non-incrementing writes we should not check for page
border crossing. Fix this by passing additional flag to _regmap_raw_write
and passing length to _regmap_select_page basing on the flag.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: cdf6b11daa77 ("regmap: Add regmap_noinc_write API")
Link: https://lore.kernel.org/r/20200917153405.3139200-2-dmitry.baryshkov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Dmitry Baryshkov [Thu, 17 Sep 2020 15:34:04 +0000 (18:34 +0300)]
regmap: fix page selection for noinc reads
Non-incrementing reads can fail if register + length crosses page
border. However for non-incrementing reads we should not check for page
border crossing. Fix this by passing additional flag to _regmap_raw_read
and passing length to _regmap_select_page basing on the flag.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 74fe7b551f33 ("regmap: Add regmap_noinc_read API")
Link: https://lore.kernel.org/r/20200917153405.3139200-1-dmitry.baryshkov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Linus Torvalds [Mon, 21 Sep 2020 19:42:31 +0000 (12:42 -0700)]
Merge branch 'rcu/urgent' of git://git./linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
"This contains a single commit that fixes a bug that was introduced in
the last merge window. This bug causes a compiler warning complaining
about show_rcu_tasks_classic_gp_kthread() being an unused static
function in !SMP kernels.
The fix is straightforward, just adding an 'inline' to make this a
static inline function, thus avoiding the warning.
This bug was reported by Laurent Pinchart, who would like it fixed
sooner rather than later"
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread()
Linus Torvalds [Mon, 21 Sep 2020 15:53:48 +0000 (08:53 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- fix fault on page table writes during instruction fetch
s390:
- doc improvement
x86:
- The obvious patches are always the ones that turn out to be
completely broken. /me hangs his head in shame"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
Revert "KVM: Check the allocation of pv cpu mask"
KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()
KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
docs: kvm: add documentation for KVM_CAP_S390_DIAG318
Linus Torvalds [Mon, 21 Sep 2020 15:46:20 +0000 (08:46 -0700)]
Merge tag 'libnvdimm-fixes-5.9-rc7' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Dan Williams:
"Fix compilation for the new dax_supported() exported helper"
* tag 'libnvdimm-fixes-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX
Chuck Lever [Sun, 20 Sep 2020 17:46:25 +0000 (13:46 -0400)]
SUNRPC: Fix svc_flush_dcache()
On platforms that implement flush_dcache_page(), a large NFS WRITE
triggers the WARN_ONCE in bvec_iter_advance():
Sep 20 14:01:05 klimt.1015granger.net kernel: Attempted to advance past end of bvec iter
Sep 20 14:01:05 klimt.1015granger.net kernel: WARNING: CPU: 0 PID: 1032 at include/linux/bvec.h:101 bvec_iter_advance.isra.0+0xa7/0x158 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: Call Trace:
Sep 20 14:01:05 klimt.1015granger.net kernel: svc_tcp_recvfrom+0x60c/0x12c7 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? bvec_iter_advance.isra.0+0x158/0x158 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? del_timer_sync+0x4b/0x55
Sep 20 14:01:05 klimt.1015granger.net kernel: ? test_bit+0x1d/0x27 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: svc_recv+0x1193/0x15e4 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? try_to_freeze.isra.0+0x6f/0x6f [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? refcount_sub_and_test.constprop.0+0x13/0x40 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? svc_xprt_put+0x1e/0x29f [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? svc_send+0x39f/0x3c1 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel: nfsd+0x282/0x345 [nfsd]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? __kthread_parkme+0x74/0xba
Sep 20 14:01:05 klimt.1015granger.net kernel: kthread+0x2ad/0x2bc
Sep 20 14:01:05 klimt.1015granger.net kernel: ? nfsd_destroy+0x124/0x124 [nfsd]
Sep 20 14:01:05 klimt.1015granger.net kernel: ? test_bit+0x1d/0x27
Sep 20 14:01:05 klimt.1015granger.net kernel: ? kthread_mod_delayed_work+0x115/0x115
Sep 20 14:01:05 klimt.1015granger.net kernel: ret_from_fork+0x22/0x30
Reported-by: He Zhe <zhe.he@windriver.com>
Fixes: ca07eda33e01 ("SUNRPC: Refactor svc_recvfrom()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Ulf Hansson [Fri, 4 Sep 2020 06:47:05 +0000 (08:47 +0200)]
cpuidle: psci: Fix suspicious RCU usage
The commit
eb1f00237aca ("lockdep,trace: Expose tracepoints"), started to
expose us for tracepoints. This lead to the following RCU splat on an ARM64
Qcom board.
[ 5.529634] WARNING: suspicious RCU usage
[ 5.537307] sdhci-pltfm: SDHCI platform and OF driver helper
[ 5.541092] 5.9.0-rc3 #86 Not tainted
[ 5.541098] -----------------------------
[ 5.541105] ../include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
[ 5.541110]
[ 5.541110] other info that might help us debug this:
[ 5.541110]
[ 5.541116]
[ 5.541116] rcu_scheduler_active = 2, debug_locks = 1
[ 5.541122] RCU used illegally from extended quiescent state!
[ 5.541129] no locks held by swapper/0/0.
[ 5.541134]
[ 5.541134] stack backtrace:
[ 5.541143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #86
[ 5.541149] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 5.541157] Call trace:
[ 5.568185] sdhci_msm
7864900.sdhci: Got CD GPIO
[ 5.574186] dump_backtrace+0x0/0x1c8
[ 5.574206] show_stack+0x14/0x20
[ 5.574229] dump_stack+0xe8/0x154
[ 5.574250] lockdep_rcu_suspicious+0xd4/0xf8
[ 5.574269] lock_acquire+0x3f0/0x460
[ 5.574292] _raw_spin_lock_irqsave+0x80/0xb0
[ 5.574314] __pm_runtime_suspend+0x4c/0x188
[ 5.574341] psci_enter_domain_idle_state+0x40/0xa0
[ 5.574362] cpuidle_enter_state+0xc0/0x610
[ 5.646487] cpuidle_enter+0x38/0x50
[ 5.650651] call_cpuidle+0x18/0x40
[ 5.654467] do_idle+0x228/0x278
[ 5.657678] cpu_startup_entry+0x24/0x70
[ 5.661153] rest_init+0x1a4/0x278
[ 5.665061] arch_call_rest_init+0xc/0x14
[ 5.668272] start_kernel+0x508/0x540
Following the path in pm_runtime_put_sync_suspend() from
psci_enter_domain_idle_state(), it seems like we end up using the RCU.
Therefore, let's simply silence the splat by informing the RCU about it
with RCU_NONIDLE.
Note that, this is a temporary solution. Instead we should strive to avoid
using RCU_NONIDLE (and similar), but rather push rcu_idle_enter|exit()
further down, closer to the arch specific code. However, as the CPU PM
notifiers are also using the RCU, additional rework is needed.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Jan Kara [Mon, 21 Sep 2020 09:33:23 +0000 (11:33 +0200)]
dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX
dax_supported() is defined whenever CONFIG_DAX is enabled. So dummy
implementation should be defined only in !CONFIG_DAX case, not in
!CONFIG_FS_DAX case.
Fixes: e2ec51282545 ("dm: Call proper helper to determine dax support")
Cc: <stable@vger.kernel.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jens Axboe [Sat, 19 Sep 2020 01:36:24 +0000 (19:36 -0600)]
io_uring: fix openat/openat2 unified prep handling
A previous commit unified how we handle prep for these two functions,
but this means that we check the allowed context (SQPOLL, specifically)
later than we should. Move the ring type checking into the two parent
functions, instead of doing it after we've done some setup work.
Fixes: ec65fea5a8d7 ("io_uring: deduplicate io_openat{,2}_prep()")
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 18 Sep 2020 22:51:19 +0000 (16:51 -0600)]
io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
These will naturally fail when attempted through SQPOLL, but either
with -EFAULT or -EBADF. Make it explicit that these are not workable
through SQPOLL and return -EINVAL, just like other ops that need to
use ->files.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Douglas Gilbert [Mon, 14 Sep 2020 21:36:09 +0000 (17:36 -0400)]
tools/io_uring: fix compile breakage
It would seem none of the kernel continuous integration does this:
$ cd tools/io_uring
$ make
Otherwise it may have noticed:
cc -Wall -Wextra -g -D_GNU_SOURCE -c -o io_uring-bench.o
io_uring-bench.c
io_uring-bench.c:133:12: error: static declaration of ‘gettid’
follows non-static declaration
133 | static int gettid(void)
| ^~~~~~
In file included from /usr/include/unistd.h:1170,
from io_uring-bench.c:27:
/usr/include/x86_64-linux-gnu/bits/unistd_ext.h:34:16: note:
previous declaration of ‘gettid’ was here
34 | extern __pid_t gettid (void) __THROW;
| ^~~~~~
make: *** [<builtin>: io_uring-bench.o] Error 1
The problem on Ubuntu 20.04 (with lk 5.9.0-rc5) is that unistd.h
already defines gettid(). So prefix the local definition with
"lk_".
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 14 Sep 2020 15:30:38 +0000 (09:30 -0600)]
io_uring: don't use retry based buffered reads for non-async bdev
Some block devices, like dm, bubble back -EAGAIN through the completion
handler. We check for this in io_read(), but don't honor it for when
we have copied the iov. Return -EAGAIN for this case before retrying,
to force punt to io-wq.
Fixes: bcf5a06304d6 ("io_uring: support true async buffered reads, if file provides it")
Reported-by: Zorro Lang <zlang@redhat.com>
Tested-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>