platform/kernel/linux-starfive.git
3 years agoMerge remote-tracking branch 'arm64/for-next/perf' into for-next/core
Catalin Marinas [Wed, 9 Dec 2020 18:04:48 +0000 (18:04 +0000)]
Merge remote-tracking branch 'arm64/for-next/perf' into for-next/core

* arm64/for-next/perf:
  perf/imx_ddr: Add system PMU identifier for userspace
  bindings: perf: imx-ddr: add compatible string
  arm64: Fix build failure when HARDLOCKUP_DETECTOR_PERF is enabled
  arm64: Enable perf events based hard lockup detector
  perf/imx_ddr: Add stop event counters support for i.MX8MP
  perf/smmuv3: Support sysfs identifier file
  drivers/perf: hisi: Add identifier sysfs file
  perf: remove duplicate check on fwnode
  driver/perf: Add PMU driver for the ARM DMC-620 memory controller

3 years agoMerge branch 'for-next/misc' into for-next/core
Catalin Marinas [Wed, 9 Dec 2020 18:04:48 +0000 (18:04 +0000)]
Merge branch 'for-next/misc' into for-next/core

* for-next/misc:
  : Miscellaneous patches
  arm64: vmlinux.lds.S: Drop redundant *.init.rodata.*
  kasan: arm64: set TCR_EL1.TBID1 when enabled
  arm64: mte: optimize asynchronous tag check fault flag check
  arm64/mm: add fallback option to allocate virtually contiguous memory
  arm64/smp: Drop the macro S(x,s)
  arm64: consistently use reserved_pg_dir
  arm64: kprobes: Remove redundant kprobe_step_ctx

# Conflicts:
# arch/arm64/kernel/vmlinux.lds.S

3 years agoMerge branch 'for-next/uaccess' into for-next/core
Catalin Marinas [Wed, 9 Dec 2020 18:04:42 +0000 (18:04 +0000)]
Merge branch 'for-next/uaccess' into for-next/core

* for-next/uaccess:
  : uaccess routines clean-up and set_fs() removal
  arm64: mark __system_matches_cap as __maybe_unused
  arm64: uaccess: remove vestigal UAO support
  arm64: uaccess: remove redundant PAN toggling
  arm64: uaccess: remove addr_limit_user_check()
  arm64: uaccess: remove set_fs()
  arm64: uaccess cleanup macro naming
  arm64: uaccess: split user/kernel routines
  arm64: uaccess: refactor __{get,put}_user
  arm64: uaccess: simplify __copy_user_flushcache()
  arm64: uaccess: rename privileged uaccess routines
  arm64: sdei: explicitly simulate PAN/UAO entry
  arm64: sdei: move uaccess logic to arch/arm64/
  arm64: head.S: always initialize PSTATE
  arm64: head.S: cleanup SCTLR_ELx initialization
  arm64: head.S: rename el2_setup -> init_kernel_el
  arm64: add C wrappers for SET_PSTATE_*()
  arm64: ensure ERET from kthread is illegal

3 years agoMerge branches 'for-next/kvm-build-fix', 'for-next/va-refactor', 'for-next/lto',...
Catalin Marinas [Wed, 9 Dec 2020 18:04:35 +0000 (18:04 +0000)]
Merge branches 'for-next/kvm-build-fix', 'for-next/va-refactor', 'for-next/lto', 'for-next/mem-hotplug', 'for-next/cppc-ffh', 'for-next/pad-image-header', 'for-next/zone-dma-default-32-bit', 'for-next/signal-tag-bits' and 'for-next/cmdline-extended' into for-next/core

* for-next/kvm-build-fix:
  : Fix KVM build issues with 64K pages
  KVM: arm64: Fix build error in user_mem_abort()

* for-next/va-refactor:
  : VA layout changes
  arm64: mm: don't assume struct page is always 64 bytes
  Documentation/arm64: fix RST layout of memory.rst
  arm64: mm: tidy up top of kernel VA space
  arm64: mm: make vmemmap region a projection of the linear region
  arm64: mm: extend linear region for 52-bit VA configurations

* for-next/lto:
  : Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO
  arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y
  arm64: alternatives: Remove READ_ONCE() usage during patch operation
  arm64: cpufeatures: Add capability for LDAPR instruction
  arm64: alternatives: Split up alternative.h
  arm64: uaccess: move uao_* alternatives to asm-uaccess.h

* for-next/mem-hotplug:
  : Memory hotplug improvements
  arm64/mm/hotplug: Ensure early memory sections are all online
  arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  arm64: mm: account for hotplug memory when randomizing the linear region

* for-next/cppc-ffh:
  : Add CPPC FFH support using arm64 AMU counters
  arm64: abort counter_read_on_cpu() when irqs_disabled()
  arm64: implement CPPC FFH support using AMUs
  arm64: split counter validation function
  arm64: wrap and generalise counter read functions

* for-next/pad-image-header:
  : Pad Image header to 64KB and unmap it
  arm64: head: tidy up the Image header definition
  arm64/head: avoid symbol names pointing into first 64 KB of kernel image
  arm64: omit [_text, _stext) from permanent kernel mapping

* for-next/zone-dma-default-32-bit:
  : Default to 32-bit wide ZONE_DMA (previously reduced to 1GB for RPi4)
  of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS
  mm: Remove examples from enum zone_type comment
  arm64: mm: Set ZONE_DMA size based on early IORT scan
  arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
  of: unittest: Add test for of_dma_get_max_cpu_address()
  of/address: Introduce of_dma_get_max_cpu_address()
  arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
  arm64: mm: Move reserve_crashkernel() into mem_init()
  arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
  arm64: Ignore any DMA offsets in the max_zone_phys() calculation

* for-next/signal-tag-bits:
  : Expose the FAR_EL1 tag bits in siginfo
  arm64: expose FAR_EL1 tag bits in siginfo
  signal: define the SA_EXPOSE_TAGBITS bit in sa_flags
  signal: define the SA_UNSUPPORTED bit in sa_flags
  arch: provide better documentation for the arch-specific SA_* flags
  signal: clear non-uapi flag bits when passing/returning sa_flags
  arch: move SA_* definitions to generic headers
  parisc: start using signal-defs.h
  parisc: Drop parisc special case for __sighandler_t

* for-next/cmdline-extended:
  : Add support for CONFIG_CMDLINE_EXTENDED
  arm64: Extend the kernel command line from the bootloader
  arm64: kaslr: Refactor early init command line parsing

3 years agoperf/imx_ddr: Add system PMU identifier for userspace
Joakim Zhang [Mon, 30 Nov 2020 11:42:02 +0000 (19:42 +0800)]
perf/imx_ddr: Add system PMU identifier for userspace

The DDR Perf for i.MX8 is a system PMU whose AXI ID would different from
SoC to SoC. Need expose system PMU identifier for userspace which refer
to /sys/bus/event_source/devices/<PMU DEVICE>/identifier.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20201130114202.26057-3-qiangqing.zhang@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agobindings: perf: imx-ddr: add compatible string
Joakim Zhang [Mon, 30 Nov 2020 11:42:01 +0000 (19:42 +0800)]
bindings: perf: imx-ddr: add compatible string

Add extra compabile string to support driver.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20201130114202.26057-2-qiangqing.zhang@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: Fix build failure when HARDLOCKUP_DETECTOR_PERF is enabled
Will Deacon [Fri, 4 Dec 2020 09:19:35 +0000 (09:19 +0000)]
arm64: Fix build failure when HARDLOCKUP_DETECTOR_PERF is enabled

If HARDLOCKUP_DETECTOR_PERF is selected but HW_PERF_EVENTS is not, then
the associated watchdog driver will fail to link:

  |    aarch64-linux-ld: Unexpected GOT/PLT entries detected!
  |    aarch64-linux-ld: Unexpected run-time procedure linkages detected!
  |    aarch64-linux-ld: kernel/watchdog_hld.o: in function `hardlockup_detector_event_create':
  | >> watchdog_hld.c:(.text+0x68): undefined reference to `hw_nmi_get_sample_period

Change the Kconfig dependencies so that HAVE_PERF_EVENTS_NMI requires
the hardware PMU driver to be enabled, ensuring that the required
symbols are present.

Cc: Sumit Garg <sumit.garg@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202012031509.4O5ZoWNI-lkp@intel.com
Fixes: 367c820ef080 ("arm64: Enable perf events based hard lockup detector")
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: mark __system_matches_cap as __maybe_unused
Mark Rutland [Thu, 3 Dec 2020 15:24:03 +0000 (15:24 +0000)]
arm64: mark __system_matches_cap as __maybe_unused

Now that the PAN toggling has been removed, the only user of
__system_matches_cap() is has_generic_auth(), which is only built when
CONFIG_ARM64_PTR_AUTH is selected, and Qian reports that this results in
a build-time warning when CONFIG_ARM64_PTR_AUTH is not selected:

| arch/arm64/kernel/cpufeature.c:2649:13: warning: '__system_matches_cap' defined but not used [-Wunused-function]
|  static bool __system_matches_cap(unsigned int n)
|              ^~~~~~~~~~~~~~~~~~~~

It's tricky to restructure things to prevent this, so let's mark
__system_matches_cap() as __maybe_unused, as we used to do for the other
user of __system_matches_cap() which we just removed.

Reported-by: Qian Cai <qcai@redhat.com>
Suggested-by: Qian Cai <qcai@redhat.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20201203152403.26100-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: remove vestigal UAO support
Mark Rutland [Wed, 2 Dec 2020 13:15:58 +0000 (13:15 +0000)]
arm64: uaccess: remove vestigal UAO support

Now that arm64 no longer uses UAO, remove the vestigal feature detection
code and Kconfig text.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-13-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: remove redundant PAN toggling
Mark Rutland [Wed, 2 Dec 2020 13:15:57 +0000 (13:15 +0000)]
arm64: uaccess: remove redundant PAN toggling

Some code (e.g. futex) needs to make privileged accesses to userspace
memory, and uses uaccess_{enable,disable}_privileged() in order to
permit this. All other uaccess primitives use LDTR/STTR, and never need
to toggle PAN.

Remove the redundant PAN toggling.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-12-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: remove addr_limit_user_check()
Mark Rutland [Wed, 2 Dec 2020 13:15:56 +0000 (13:15 +0000)]
arm64: uaccess: remove addr_limit_user_check()

Now that set_fs() is gone, addr_limit_user_check() is redundant. Remove
the checks and associated thread flag.

To ensure that _TIF_WORK_MASK can be used as an immediate value in an
AND instruction (as it is in `ret_to_user`), TIF_MTE_ASYNC_FAULT is
renumbered to keep the constituent bits of _TIF_WORK_MASK contiguous.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-11-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: remove set_fs()
Mark Rutland [Wed, 2 Dec 2020 13:15:55 +0000 (13:15 +0000)]
arm64: uaccess: remove set_fs()

Now that the uaccess primitives dont take addr_limit into account, we
have no need to manipulate this via set_fs() and get_fs(). Remove
support for these, along with some infrastructure this renders
redundant.

We no longer need to flip UAO to access kernel memory under KERNEL_DS,
and head.S unconditionally clears UAO for all kernel configurations via
an ERET in init_kernel_el. Thus, we don't need to dynamically flip UAO,
nor do we need to context-switch it. However, we still need to adjust
PAN during SDEI entry.

Masking of __user pointers no longer needs to use the dynamic value of
addr_limit, and can use a constant derived from the maximum possible
userspace task size. A new TASK_SIZE_MAX constant is introduced for
this, which is also used by core code. In configurations supporting
52-bit VAs, this may include a region of unusable VA space above a
48-bit TTBR0 limit, but never includes any portion of TTBR1.

Note that TASK_SIZE_MAX is an exclusive limit, while USER_DS and
KERNEL_DS were inclusive limits, and is converted to a mask by
subtracting one.

As the SDEI entry code repurposes the otherwise unnecessary
pt_regs::orig_addr_limit field to store the TTBR1 of the interrupted
context, for now we rename that to pt_regs::sdei_ttbr1. In future we can
consider factoring that out.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-10-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess cleanup macro naming
Mark Rutland [Wed, 2 Dec 2020 13:15:54 +0000 (13:15 +0000)]
arm64: uaccess cleanup macro naming

Now the uaccess primitives use LDTR/STTR unconditionally, the
uao_{ldp,stp,user_alternative} asm macros are misnamed, and have a
redundant argument. Let's remove the redundant argument and rename these
to user_{ldp,stp,ldst} respectively to clean this up.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Robin Murohy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: split user/kernel routines
Mark Rutland [Wed, 2 Dec 2020 13:15:53 +0000 (13:15 +0000)]
arm64: uaccess: split user/kernel routines

This patch separates arm64's user and kernel memory access primitives
into distinct routines, adding new __{get,put}_kernel_nofault() helpers
to access kernel memory, upon which core code builds larger copy
routines.

The kernel access routines (using LDR/STR) are not affected by PAN (when
legitimately accessing kernel memory), nor are they affected by UAO.
Switching to KERNEL_DS may set UAO, but this does not adversely affect
the kernel access routines.

The user access routines (using LDTR/STTR) are not affected by PAN (when
legitimately accessing user memory), but are affected by UAO. As these
are only legitimate to use under USER_DS with UAO clear, this should not
be problematic.

Routines performing atomics to user memory (futex and deprecated
instruction emulation) still need to transiently clear PAN, and these
are left as-is. These are never used on kernel memory.

Subsequent patches will refactor the uaccess helpers to remove redundant
code, and will also remove the redundant PAN/UAO manipulation.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-8-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: refactor __{get,put}_user
Mark Rutland [Wed, 2 Dec 2020 13:15:52 +0000 (13:15 +0000)]
arm64: uaccess: refactor __{get,put}_user

As a step towards implementing __{get,put}_kernel_nofault(), this patch
splits most user-memory specific logic out of __{get,put}_user(), with
the memory access and fault handling in new __{raw_get,put}_mem()
helpers.

For now the LDR/LDTR patching is left within the *get_mem() helpers, and
will be removed in a subsequent patch.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: simplify __copy_user_flushcache()
Mark Rutland [Wed, 2 Dec 2020 13:15:51 +0000 (13:15 +0000)]
arm64: uaccess: simplify __copy_user_flushcache()

Currently __copy_user_flushcache() open-codes raw_copy_from_user(), and
doesn't use uaccess_mask_ptr() on the user address. Let's have it call
raw_copy_from_user(), which is both a simplification and ensures that
user pointers are masked under speculation.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: uaccess: rename privileged uaccess routines
Mark Rutland [Wed, 2 Dec 2020 13:15:50 +0000 (13:15 +0000)]
arm64: uaccess: rename privileged uaccess routines

We currently have many uaccess_*{enable,disable}*() variants, which
subsequent patches will cut down as part of removing set_fs() and
friends. Once this simplification is made, most uaccess routines will
only need to ensure that the user page tables are mapped in TTBR0, as is
currently dealt with by uaccess_ttbr0_{enable,disable}().

The existing uaccess_{enable,disable}() routines ensure that user page
tables are mapped in TTBR0, and also disable PAN protections, which is
necessary to be able to use atomics on user memory, but also permit
unrelated privileged accesses to access user memory.

As preparatory step, let's rename uaccess_{enable,disable}() to
uaccess_{enable,disable}_privileged(), highlighting this caveat and
discouraging wider misuse. Subsequent patches can reuse the
uaccess_{enable,disable}() naming for the common case of ensuring the
user page tables are mapped in TTBR0.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: sdei: explicitly simulate PAN/UAO entry
Mark Rutland [Wed, 2 Dec 2020 13:15:48 +0000 (13:15 +0000)]
arm64: sdei: explicitly simulate PAN/UAO entry

In preparation for removing addr_limit and set_fs() we must decouple the
SDEI PAN/UAO manipulation from the uaccess code, and explicitly
reinitialize these as required.

SDEI enters the kernel with a non-architectural exception, and prior to
the most recent revision of the specification (ARM DEN 0054B), PSTATE
bits (e.g. PAN, UAO) are not manipulated in the same way as for
architectural exceptions. Notably, older versions of the spec can be
read ambiguously as to whether PSTATE bits are inherited unchanged from
the interrupted context or whether they are generated from scratch, with
TF-A doing the latter.

We have three cases to consider:

1) The existing TF-A implementation of SDEI will clear PAN and clear UAO
   (along with other bits in PSTATE) when delivering an SDEI exception.

2) In theory, implementations of SDEI prior to revision B could inherit
   PAN and UAO (along with other bits in PSTATE) unchanged from the
   interrupted context. However, in practice such implementations do not
   exist.

3) Going forward, new implementations of SDEI must clear UAO, and
   depending on SCTLR_ELx.SPAN must either inherit or set PAN.

As we can ignore (2) we can assume that upon SDEI entry, UAO is always
clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN.
Therefore, we must explicitly initialize PAN, but do not need to do
anything for UAO.

Considering what we need to do:

* When set_fs() is removed, force_uaccess_begin() will have no HW
  side-effects. As this only clears UAO, which we can assume has already
  been cleared upon entry, this is not a problem. We do not need to add
  code to manipulate UAO explicitly.

* PAN may be cleared upon entry (in case 1 above), so where a kernel is
  built to use PAN and this is supported by all CPUs, the kernel must
  set PAN upon entry to ensure expected behaviour.

* PAN may be inherited from the interrupted context (in case 3 above),
  and so where a kernel is not built to use PAN or where PAN support is
  not uniform across CPUs, the kernel must clear PAN to ensure expected
  behaviour.

This patch reworks the SDEI code accordingly, explicitly setting PAN to
the expected state in all cases. To cater for the cases where the kernel
does not use PAN or this is not uniformly supported by hardware we add a
new cpu_has_pan() helper which can be used regardless of whether the
kernel is built to use PAN.

The existing system_uses_ttbr0_pan() is redefined in terms of
system_uses_hw_pan() both for clarity and as a minor optimization when
HW PAN is not selected.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: sdei: move uaccess logic to arch/arm64/
Mark Rutland [Wed, 2 Dec 2020 13:15:47 +0000 (13:15 +0000)]
arm64: sdei: move uaccess logic to arch/arm64/

The SDEI support code is split across arch/arm64/ and drivers/firmware/,
largley this is split so that the arch-specific portions are under
arch/arm64, and the management logic is under drivers/firmware/.
However, exception entry fixups are currently under drivers/firmware.

Let's move the exception entry fixups under arch/arm64/. This
de-clutters the management logic, and puts all the arch-specific
portions in one place. Doing this also allows the fixups to be applied
earlier, so things like PAN and UAO will be in a known good state before
we run other logic. This will also make subsequent refactoring easier.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: head.S: always initialize PSTATE
Mark Rutland [Fri, 13 Nov 2020 12:49:25 +0000 (12:49 +0000)]
arm64: head.S: always initialize PSTATE

As with SCTLR_ELx and other control registers, some PSTATE bits are
UNKNOWN out-of-reset, and we may not be able to rely on hardware or
firmware to initialize them to our liking prior to entry to the kernel,
e.g. in the primary/secondary boot paths and return from idle/suspend.

It would be more robust (and easier to reason about) if we consistently
initialized PSTATE to a default value, as we do with control registers.
This will ensure that the kernel is not adversely affected by bits it is
not aware of, e.g. when support for a feature such as PAN/UAO is
disabled.

This patch ensures that PSTATE is consistently initialized at boot time
via an ERET. This is not intended to relax the existing requirements
(e.g. DAIF bits must still be set prior to entering the kernel). For
features detected dynamically (which may require system-wide support),
it is still necessary to subsequently modify PSTATE.

As ERET is not always a Context Synchronization Event, an ISB is placed
before each exception return to ensure updates to control registers have
taken effect. This handles the kernel being entered with SCTLR_ELx.EOS
clear (or any future control bits being in an UNKNOWN state).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201113124937.20574-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: head.S: cleanup SCTLR_ELx initialization
Mark Rutland [Fri, 13 Nov 2020 12:49:24 +0000 (12:49 +0000)]
arm64: head.S: cleanup SCTLR_ELx initialization

Let's make SCTLR_ELx initialization a bit clearer by using meaningful
names for the initialization values, following the same scheme for
SCTLR_EL1 and SCTLR_EL2.

These definitions will be used more widely in subsequent patches.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201113124937.20574-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: head.S: rename el2_setup -> init_kernel_el
Mark Rutland [Fri, 13 Nov 2020 12:49:23 +0000 (12:49 +0000)]
arm64: head.S: rename el2_setup -> init_kernel_el

For a while now el2_setup has performed some basic initialization of EL1
even when the kernel is booted at EL1, so the name is a little
misleading. Further, some comments are stale as with VHE it doesn't drop
the CPU to EL1.

To clarify things, rename el2_setup to init_kernel_el, and update
comments to be clearer as to the function's purpose.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201113124937.20574-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: add C wrappers for SET_PSTATE_*()
Mark Rutland [Fri, 13 Nov 2020 12:49:22 +0000 (12:49 +0000)]
arm64: add C wrappers for SET_PSTATE_*()

To make callsites easier to read, add trivial C wrappers for the
SET_PSTATE_*() helpers, and convert trivial uses over to these. The new
wrappers will be used further in subsequent patches.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201113124937.20574-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: ensure ERET from kthread is illegal
Mark Rutland [Fri, 13 Nov 2020 12:49:21 +0000 (12:49 +0000)]
arm64: ensure ERET from kthread is illegal

For consistency, all tasks have a pt_regs reserved at the highest
portion of their task stack. Among other things, this ensures that a
task's SP is always pointing within its stack rather than pointing
immediately past the end.

While it is never legitimate to ERET from a kthread, we take pains to
initialize pt_regs for kthreads as if this were legitimate. As this is
never legitimate, the effects of an erroneous return are rarely tested.

Let's simplify things by initializing a kthread's pt_regs such that an
ERET is caught as an illegal exception return, and removing the explicit
initialization of other exception context. Note that as
spectre_v4_enable_task_mitigation() only manipulates the PSTATE within
the unused regs this is safe to remove.

As user tasks will have their exception context initialized via
start_thread() or start_compat_thread(), this should only impact cases
where something has gone very wrong and we'd like that to be clearly
indicated.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201113124937.20574-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoof: unittest: Fix build on architectures without CONFIG_OF_ADDRESS
Catalin Marinas [Tue, 1 Dec 2020 12:47:25 +0000 (12:47 +0000)]
of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS

of_dma_get_max_cpu_address() is not defined if !CONFIG_OF_ADDRESS, so
return early in of_unittest_dma_get_max_cpu_address().

Fixes: 07d13a1d6120 ("of: unittest: Add test for of_dma_get_max_cpu_address()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: vmlinux.lds.S: Drop redundant *.init.rodata.*
Youling Tang [Thu, 19 Nov 2020 01:45:40 +0000 (09:45 +0800)]
arm64: vmlinux.lds.S: Drop redundant *.init.rodata.*

We currently try to emit *.init.rodata.* twice, once in INIT_DATA, and once
in the line immediately following it. As the two section definitions are
identical, the latter is redundant and can be dropped.

This patch drops the redundant *.init.rodata.* section definition.

Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1605750340-910-1-git-send-email-tangyouling@loongson.cn
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: Extend the kernel command line from the bootloader
Tyler Hicks [Mon, 21 Sep 2020 19:15:57 +0000 (14:15 -0500)]
arm64: Extend the kernel command line from the bootloader

Provide support for additional kernel command line parameters to be
concatenated onto the end of the command line provided by the
bootloader. Additional parameters are specified in the CONFIG_CMDLINE
option when CONFIG_CMDLINE_EXTEND is selected, matching other
architectures and leveraging existing support in the FDT and EFI stub
code.

Special care must be taken for the arch-specific nokaslr parsing. Search
the bootargs FDT property and the CONFIG_CMDLINE when
CONFIG_CMDLINE_EXTEND is in use.

There are a couple of known use cases for this feature:

1) Switching between stable and development kernel versions, where one
   of the versions benefits from additional command line parameters,
   such as debugging options.
2) Specifying additional command line parameters, for additional tuning
   or debugging, when the bootloader does not offer an interactive mode.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Link: https://lore.kernel.org/r/20200921191557.350256-3-tyhicks@linux.microsoft.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: kaslr: Refactor early init command line parsing
Tyler Hicks [Mon, 21 Sep 2020 19:15:56 +0000 (14:15 -0500)]
arm64: kaslr: Refactor early init command line parsing

Don't ask for *the* command line string to search for "nokaslr" in
kaslr_early_init(). Instead, tell a helper function to search all the
appropriate command line strings for "nokaslr" and return the result.

This paves the way for searching multiple command line strings without
having to concatenate the strings in early init.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Link: https://lore.kernel.org/r/20200921191557.350256-2-tyhicks@linux.microsoft.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agokasan: arm64: set TCR_EL1.TBID1 when enabled
Peter Collingbourne [Sat, 21 Nov 2020 09:59:02 +0000 (01:59 -0800)]
kasan: arm64: set TCR_EL1.TBID1 when enabled

On hardware supporting pointer authentication, we previously ended up
enabling TBI on instruction accesses when tag-based ASAN was enabled,
but this was costing us 8 bits of PAC entropy, which was unnecessary
since tag-based ASAN does not require TBI on instruction accesses. Get
them back by setting TCR_EL1.TBID1.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://linux-review.googlesource.com/id/I3dded7824be2e70ea64df0aabab9598d5aebfcc4
Link: https://lore.kernel.org/r/20f64e26fc8a1309caa446fffcb1b4e2fe9e229f.1605952129.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: Enable perf events based hard lockup detector
Sumit Garg [Wed, 7 Oct 2020 08:51:43 +0000 (14:21 +0530)]
arm64: Enable perf events based hard lockup detector

With the recent feature added to enable perf events to use pseudo NMIs
as interrupts on platforms which support GICv3 or later, its now been
possible to enable hard lockup detector (or NMI watchdog) on arm64
platforms. So enable corresponding support.

One thing to note here is that normally lockup detector is initialized
just after the early initcalls but PMU on arm64 comes up much later as
device_initcall(). So we need to re-initialize lockup detection once
PMU has been initialized.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Acked-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/1602060704-10921-1-git-send-email-sumit.garg@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoperf/imx_ddr: Add stop event counters support for i.MX8MP
Joakim Zhang [Tue, 27 Oct 2020 10:44:51 +0000 (18:44 +0800)]
perf/imx_ddr: Add stop event counters support for i.MX8MP

DDR Perf driver only supports free-running event counters(counter1/2/3)
now, this patch adds support for stop event counters.

Legacy SoCs:
Cycle counter(counter0) is a special counter, only count cycles. When
cycle counter overflow, it will lock all counters and generate an
interrupt. In ddr_perf_irq_handler, disable cycle counter then all
counters would stop at the same time, update all counters' count, then
enable cycle counter that all counters count again. During this process,
only clear cycle counter, no need to clear event counters since they are
free-running counters. They would continue counting after overflow and
do/while loop from ddr_perf_event_update can handle event counters
overflow case.

i.MX8MP:
Almost all is the same as legacy SoCs, the only difference is that, event
counters are not free-running any more. Like cycle counter, when event
counters overflow, they would stop counting unless clear the counter,
and no interrupt generate for event counters. So we should clear event
counters that let them re-count when cycle counter overflow, which ensure
event counters will not lose data.

This patch adds stop event counters support which would be compatible to
free-running event counters. We use the cycle counter to stop overflow
of the event counters.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20201027104451.15434-1-qiangqing.zhang@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoperf/smmuv3: Support sysfs identifier file
John Garry [Thu, 8 Oct 2020 09:26:21 +0000 (17:26 +0800)]
perf/smmuv3: Support sysfs identifier file

SMMU_PMCG_IIDR was added in the SMMUv3.3 spec.

For the perf tool to know the specific HW implementation, expose the
PMCG_IIDR contents only when set.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1602149181-237415-5-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agodrivers/perf: hisi: Add identifier sysfs file
John Garry [Thu, 8 Oct 2020 09:26:18 +0000 (17:26 +0800)]
drivers/perf: hisi: Add identifier sysfs file

To allow userspace to identify the specific implementation of the device,
add an "identifier" sysfs file.

Encoding is as follows (same for all uncore drivers):
hi1620: 0x0
hi1630: 0x30

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1602149181-237415-2-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoperf: remove duplicate check on fwnode
Wang Qing [Fri, 6 Nov 2020 06:41:42 +0000 (14:41 +0800)]
perf: remove duplicate check on fwnode

fwnode is checked IS_ERR_OR_NULL in following check by
is_of_node() or is_acpi_device_node(), remove duplicate check.

Signed-off-by: Wang Qing <wangqing@vivo.com>
Link: https://lore.kernel.org/r/1604644902-29655-1-git-send-email-wangqing@vivo.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agodriver/perf: Add PMU driver for the ARM DMC-620 memory controller
Tuan Phan [Wed, 4 Nov 2020 19:30:43 +0000 (11:30 -0800)]
driver/perf: Add PMU driver for the ARM DMC-620 memory controller

DMC-620 PMU supports total 10 counters which each is
independently programmable to different events and can
be started and stopped individually.

Currently, it only supports ACPI. Other platforms feel free to test and add
support for device tree.

Usage example:
  #perf stat -e arm_dmc620_10008c000/clk_cycle_count/ -C 0
  Get perf event for clk_cycle_count counter.

  #perf stat -e arm_dmc620_10008c000/clkdiv2_allocate,mask=0x1f,match=0x2f,
  incr=2,invert=1/ -C 0
  The above example shows how to specify mask, match, incr,
  invert parameters for clkdiv2_allocate event.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Link: https://lore.kernel.org/r/1604518246-6198-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: expose FAR_EL1 tag bits in siginfo
Peter Collingbourne [Fri, 20 Nov 2020 20:33:46 +0000 (12:33 -0800)]
arm64: expose FAR_EL1 tag bits in siginfo

The kernel currently clears the tag bits (i.e. bits 56-63) in the fault
address exposed via siginfo.si_addr and sigcontext.fault_address. However,
the tag bits may be needed by tools in order to accurately diagnose
memory errors, such as HWASan [1] or future tools based on the Memory
Tagging Extension (MTE).

Expose these bits via the arch_untagged_si_addr mechanism, so that
they are only exposed to signal handlers with the SA_EXPOSE_TAGBITS
flag set.

[1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://linux-review.googlesource.com/id/Ia8876bad8c798e0a32df7c2ce1256c4771c81446
Link: https://lore.kernel.org/r/0010296597784267472fa13b39f8238d87a72cf8.1605904350.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agosignal: define the SA_EXPOSE_TAGBITS bit in sa_flags
Peter Collingbourne [Fri, 20 Nov 2020 20:33:45 +0000 (12:33 -0800)]
signal: define the SA_EXPOSE_TAGBITS bit in sa_flags

Architectures that support address tagging, such as arm64, may want to
expose fault address tag bits to the signal handler to help diagnose
memory errors. However, these bits have not been previously set,
and their presence may confuse unaware user applications. Therefore,
introduce a SA_EXPOSE_TAGBITS flag bit in sa_flags that a signal
handler may use to explicitly request that the bits are set.

The generic signal handler APIs expect to receive tagged addresses.
Architectures may specify how to untag addresses in the case where
SA_EXPOSE_TAGBITS is clear by defining the arch_untagged_si_addr
function.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://linux-review.googlesource.com/id/I16dd0ed2081f091fce97be0190cb8caa874c26cb
Link: https://lkml.kernel.org/r/13cf24d00ebdd8e1f55caf1821c7c29d54100191.1605904350.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
3 years agosignal: define the SA_UNSUPPORTED bit in sa_flags
Peter Collingbourne [Tue, 17 Nov 2020 03:17:25 +0000 (19:17 -0800)]
signal: define the SA_UNSUPPORTED bit in sa_flags

Define a sa_flags bit, SA_UNSUPPORTED, which will never be supported
in the uapi. The purpose of this flag bit is to allow userspace to
distinguish an old kernel that does not clear unknown sa_flags bits
from a kernel that supports every flag bit.

In other words, if userspace does something like:

  act.sa_flags |= SA_UNSUPPORTED;
  sigaction(SIGSEGV, &act, 0);
  sigaction(SIGSEGV, 0, &oldact);

and finds that SA_UNSUPPORTED remains set in oldact.sa_flags, it means
that the kernel cannot be trusted to have cleared unknown flag bits
from sa_flags, so no assumptions about flag bit support can be made.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://linux-review.googlesource.com/id/Ic2501ad150a3a79c1cf27fb8c99be342e9dffbcb
Link: https://lkml.kernel.org/r/bda7ddff8895a9bc4ffc5f3cf3d4d37a32118077.1605582887.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
3 years agoarch: provide better documentation for the arch-specific SA_* flags
Peter Collingbourne [Tue, 17 Nov 2020 03:17:24 +0000 (19:17 -0800)]
arch: provide better documentation for the arch-specific SA_* flags

Instead of documenting the arch-specific flag values in a comment at
the top where they may be easily overlooked, document them in comments
inline with the definitions in numerical order so that it is clear
why specific values must be chosen for new generic flags and to reduce
the likelihood of conflicts between generic and arch-specific flags.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I40a129cf7c3a71ba1bfd6d936c544072ee3b7ce6
Link: https://lkml.kernel.org/r/198c8b68c76bf3ed73117d817c7cdf9bc0eb174f.1605582887.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
3 years agosignal: clear non-uapi flag bits when passing/returning sa_flags
Peter Collingbourne [Fri, 13 Nov 2020 02:53:34 +0000 (18:53 -0800)]
signal: clear non-uapi flag bits when passing/returning sa_flags

Previously we were not clearing non-uapi flag bits in
sigaction.sa_flags when storing the userspace-provided sa_flags or
when returning them via oldact. Start doing so.

This allows userspace to detect missing support for flag bits and
allows the kernel to use non-uapi bits internally, as we are already
doing in arch/x86 for two flag bits. Now that this change is in
place, we no longer need the code in arch/x86 that was hiding these
bits from userspace, so remove it.

This is technically a userspace-visible behavior change for sigaction, as
the unknown bits returned via oldact.sa_flags are no longer set. However,
we are free to define the behavior for unknown bits exactly because
their behavior is currently undefined, so for now we can define the
meaning of each of them to be "clear the bit in oldact.sa_flags unless
the bit becomes known in the future". Furthermore, this behavior is
consistent with OpenBSD [1], illumos [2] and XNU [3] (FreeBSD [4] and
NetBSD [5] fail the syscall if unknown bits are set). So there is some
precedent for this behavior in other kernels, and in particular in XNU,
which is probably the most popular kernel among those that I looked at,
which means that this change is less likely to be a compatibility issue.

Link: [1] https://github.com/openbsd/src/blob/f634a6a4b5bf832e9c1de77f7894ae2625e74484/sys/kern/kern_sig.c#L278
Link: [2] https://github.com/illumos/illumos-gate/blob/76f19f5fdc974fe5be5c82a556e43a4df93f1de1/usr/src/uts/common/syscall/sigaction.c#L86
Link: [3] https://github.com/apple/darwin-xnu/blob/a449c6a3b8014d9406c2ddbdc81795da24aa7443/bsd/kern/kern_sig.c#L480
Link: [4] https://github.com/freebsd/freebsd/blob/eded70c37057857c6e23fae51f86b8f8f43cd2d0/sys/kern/kern_sig.c#L699
Link: [5] https://github.com/NetBSD/src/blob/3365779becdcedfca206091a645a0e8e22b2946e/sys/kern/sys_sig.c#L473
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://linux-review.googlesource.com/id/I35aab6f5be932505d90f3b3450c083b4db1eca86
Link: https://lkml.kernel.org/r/878dbcb5f47bc9b11881c81f745c0bef5c23f97f.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
3 years agoarch: move SA_* definitions to generic headers
Peter Collingbourne [Fri, 13 Nov 2020 02:53:33 +0000 (18:53 -0800)]
arch: move SA_* definitions to generic headers

Most architectures with the exception of alpha, mips, parisc and
sparc use the same values for these flags. Move their definitions into
asm-generic/signal-defs.h and allow the architectures with non-standard
values to override them. Also, document the non-standard flag values
in order to make it easier to add new generic flags in the future.

A consequence of this change is that on powerpc and x86, the constants'
values aside from SA_RESETHAND change signedness from unsigned
to signed. This is not expected to impact realistic use of these
constants. In particular the typical use of the constants where they
are or'ed together and assigned to sa_flags (or another int variable)
would not be affected.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://linux-review.googlesource.com/id/Ia3849f18b8009bf41faca374e701cdca36974528
Link: https://lkml.kernel.org/r/b6d0d1ec34f9ee93e1105f14f288fba5f89d1f24.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
3 years agoparisc: start using signal-defs.h
Peter Collingbourne [Fri, 13 Nov 2020 02:53:32 +0000 (18:53 -0800)]
parisc: start using signal-defs.h

We currently include signal-defs.h on all architectures except parisc.
Make parisc fall in line. This will make maintenance easier once the
flag bits are moved here.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Helge Deller <deller@gmx.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://linux-review.googlesource.com/id/If03a5135fb514fe96548fb74610e6c3586a04064
Link: https://lkml.kernel.org/r/be8f3680ef2d0a1a120994e3ae0b11d82f373279.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
3 years agoparisc: Drop parisc special case for __sighandler_t
Helge Deller [Fri, 13 Nov 2020 02:53:31 +0000 (18:53 -0800)]
parisc: Drop parisc special case for __sighandler_t

I believe we can and *should* drop this parisc-specific typedef for
__sighandler_t when compiling a 64-bit kernel. The reasons:

1. We don't have a 64-bit userspace yet, so nothing (on userspace side)
can break.

2. Inside the Linux kernel, this is only used in kernel/signal.c, in
function kernel_sigaction() where the signal handler is compared against
SIG_IGN.  SIG_IGN is defined as (__sighandler_t)1), so only the pointers
are compared.

3. Even when a 64-bit userspace gets added at some point, I think
__sighandler_t should be defined what it is: a function pointer struct.

I compiled kernel/signal.c with and without the patch, and the produced code
is identical in both cases.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I21c43f21b264f339e3aa395626af838646f62d97
Link: https://lkml.kernel.org/r/a75b8eb7bb9eac1cf73fb119eb53e5892d6e9656.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
3 years agomm: Remove examples from enum zone_type comment
Nicolas Saenz Julienne [Thu, 19 Nov 2020 17:53:59 +0000 (18:53 +0100)]
mm: Remove examples from enum zone_type comment

We can't really list every setup in common code. On top of that they are
unlikely to stay true for long as things change in the arch trees
independently of this comment.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20201119175400.9995-8-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: Set ZONE_DMA size based on early IORT scan
Ard Biesheuvel [Thu, 19 Nov 2020 17:53:58 +0000 (18:53 +0100)]
arm64: mm: Set ZONE_DMA size based on early IORT scan

We recently introduced a 1 GB sized ZONE_DMA to cater for platforms
incorporating masters that can address less than 32 bits of DMA, in
particular the Raspberry Pi 4, which has 4 or 8 GB of DRAM, but has
peripherals that can only address up to 1 GB (and its PCIe host
bridge can only access the bottom 3 GB)

Instructing the DMA layer about these limitations is straight-forward,
even though we had to fix some issues regarding memory limits set in
the IORT for named components, and regarding the handling of ACPI _DMA
methods. However, the DMA layer also needs to be able to allocate
memory that is guaranteed to meet those DMA constraints, for bounce
buffering as well as allocating the backing for consistent mappings.

This is why the 1 GB ZONE_DMA was introduced recently. Unfortunately,
it turns out the having a 1 GB ZONE_DMA as well as a ZONE_DMA32 causes
problems with kdump, and potentially in other places where allocations
cannot cross zone boundaries. Therefore, we should avoid having two
separate DMA zones when possible.

So let's do an early scan of the IORT, and only create the ZONE_DMA
if we encounter any devices that need it. This puts the burden on
the firmware to describe such limitations in the IORT, which may be
redundant (and less precise) if _DMA methods are also being provided.
However, it should be noted that this situation is highly unusual for
arm64 ACPI machines. Also, the DMA subsystem still gives precedence to
the _DMA method if implemented, and so we will not lose the ability to
perform streaming DMA outside the ZONE_DMA if the _DMA method permits
it.

[nsaenz: unified implementation with DT's counterpart]

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20201119175400.9995-7-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
Nicolas Saenz Julienne [Thu, 19 Nov 2020 17:53:57 +0000 (18:53 +0100)]
arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges

We recently introduced a 1 GB sized ZONE_DMA to cater for platforms
incorporating masters that can address less than 32 bits of DMA, in
particular the Raspberry Pi 4, which has 4 or 8 GB of DRAM, but has
peripherals that can only address up to 1 GB (and its PCIe host
bridge can only access the bottom 3 GB)

The DMA layer also needs to be able to allocate memory that is
guaranteed to meet those DMA constraints, for bounce buffering as well
as allocating the backing for consistent mappings. This is why the 1 GB
ZONE_DMA was introduced recently. Unfortunately, it turns out the having
a 1 GB ZONE_DMA as well as a ZONE_DMA32 causes problems with kdump, and
potentially in other places where allocations cannot cross zone
boundaries. Therefore, we should avoid having two separate DMA zones
when possible.

So, with the help of of_dma_get_max_cpu_address() get the topmost
physical address accessible to all DMA masters in system and use that
information to fine-tune ZONE_DMA's size. In the absence of addressing
limited masters ZONE_DMA will span the whole 32-bit address space,
otherwise, in the case of the Raspberry Pi 4 it'll only span the 30-bit
address space, and have ZONE_DMA32 cover the rest of the 32-bit address
space.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Link: https://lore.kernel.org/r/20201119175400.9995-6-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoof: unittest: Add test for of_dma_get_max_cpu_address()
Nicolas Saenz Julienne [Thu, 19 Nov 2020 17:53:56 +0000 (18:53 +0100)]
of: unittest: Add test for of_dma_get_max_cpu_address()

Introduce a test for of_dma_get_max_cup_address(), it uses the same DT
data as the rest of dma-ranges unit tests.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20201119175400.9995-5-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoof/address: Introduce of_dma_get_max_cpu_address()
Nicolas Saenz Julienne [Thu, 19 Nov 2020 17:53:55 +0000 (18:53 +0100)]
of/address: Introduce of_dma_get_max_cpu_address()

Introduce of_dma_get_max_cpu_address(), which provides the highest CPU
physical address addressable by all DMA masters in the system. It's
specially useful for setting memory zones sizes at early boot time.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20201119175400.9995-4-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
Nicolas Saenz Julienne [Thu, 19 Nov 2020 17:53:54 +0000 (18:53 +0100)]
arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()

zone_dma_bits's initialization happens earlier that it's actually
needed, in arm64_memblock_init(). So move it into the more suitable
zone_sizes_init().

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/r/20201119175400.9995-3-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: Move reserve_crashkernel() into mem_init()
Nicolas Saenz Julienne [Thu, 19 Nov 2020 17:53:53 +0000 (18:53 +0100)]
arm64: mm: Move reserve_crashkernel() into mem_init()

crashkernel might reserve memory located in ZONE_DMA. We plan to delay
ZONE_DMA's initialization after unflattening the devicetree and ACPI's
boot table initialization, so move it later in the boot process.
Specifically into bootmem_init() since request_standard_resources()
depends on it.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/r/20201119175400.9995-2-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
Catalin Marinas [Thu, 19 Nov 2020 17:55:56 +0000 (17:55 +0000)]
arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required

mem_init() currently relies on knowing the boundaries of the crashkernel
reservation to map such region with page granularity for later
unmapping via set_memory_valid(..., 0). If the crashkernel reservation
is deferred, such boundaries are not known when the linear mapping is
created. Simply parse the command line for "crashkernel" and, if found,
create the linear map with NO_BLOCK_MAPPINGS.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Acked-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Link: https://lore.kernel.org/r/20201119175556.18681-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: Ignore any DMA offsets in the max_zone_phys() calculation
Catalin Marinas [Wed, 18 Nov 2020 18:58:09 +0000 (18:58 +0000)]
arm64: Ignore any DMA offsets in the max_zone_phys() calculation

Currently, the kernel assumes that if RAM starts above 32-bit (or
zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
such constrained devices have a hardwired DMA offset. In practice, we
haven't noticed any such hardware so let's assume that we can expand
ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
zone_bits.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201118185809.1078362-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mte: optimize asynchronous tag check fault flag check
Peter Collingbourne [Wed, 18 Nov 2020 03:20:51 +0000 (19:20 -0800)]
arm64: mte: optimize asynchronous tag check fault flag check

We don't need to check for MTE support before checking the flag
because it can only be set if the hardware supports MTE. As a result
we can unconditionally check the flag bit which is expected to be in
a register and therefore the check can be done in a single instruction
instead of first needing to load the hwcaps.

On a DragonBoard 845c with a kernel built with CONFIG_ARM64_MTE=y with
the powersave governor this reduces the cost of a kernel entry/exit
(invalid syscall) from 465.1ns to 463.8ns.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/If4dc3501fd4e4f287322f17805509613cfe47d24
Link: https://lore.kernel.org/r/20201118032051.1405907-1-pcc@google.com
[catalin.marinas@arm.com: remove IS_ENABLED(CONFIG_ARM64_MTE)]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64/mm: add fallback option to allocate virtually contiguous memory
Sudarshan Rajagopalan [Thu, 15 Oct 2020 00:51:23 +0000 (17:51 -0700)]
arm64/mm: add fallback option to allocate virtually contiguous memory

When section mappings are enabled, we allocate vmemmap pages from
physically continuous memory of size PMD_SIZE using
vmemmap_alloc_block_buf(). Section mappings are good to reduce TLB
pressure. But when system is highly fragmented and memory blocks are
being hot-added at runtime, its possible that such physically continuous
memory allocations can fail. Rather than failing the memory hot-add
procedure, add a fallback option to allocate vmemmap pages from
discontinuous pages using vmemmap_populate_basepages().

Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/d6c06f2ef39bbe6c715b2f6db76eb16155fdcee6.1602722808.git.sudaraja@codeaurora.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: head: tidy up the Image header definition
Ard Biesheuvel [Tue, 17 Nov 2020 12:47:29 +0000 (13:47 +0100)]
arm64: head: tidy up the Image header definition

Even though support for EFI boot remains entirely optional for arm64,
it is unlikely that we will ever be able to repurpose the image header
fields that the EFI loader relies on, i.e., the magic NOP at offset
0x0 and the PE header address at offset 0x3c.

So let's factor out the differences into a 'efi_signature_nop' macro and
a local symbol representing the PE header address, and move the
conditional definitions into efi-header.S, taking into account whether
CONFIG_EFI is enabled or not. While at it, switch to a signature NOP
that behaves more like a NOP, i.e., one that only clobbers the
flags.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201117124729.12642-4-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64/head: avoid symbol names pointing into first 64 KB of kernel image
Ard Biesheuvel [Tue, 17 Nov 2020 12:47:28 +0000 (13:47 +0100)]
arm64/head: avoid symbol names pointing into first 64 KB of kernel image

We no longer map the first 64 KB of the kernel image, as there is nothing
there that we ever need to refer back to once the kernel has booted. Even
though facilities like kallsyms are very careful to only refer to the
region that starts at _stext when mapping virtual addresses to symbol
names, let's avoid any confusion by switching to local .L prefixed symbol
names for the EFI header, as none of them have any significance to the
rest of the kernel.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201117124729.12642-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: omit [_text, _stext) from permanent kernel mapping
Ard Biesheuvel [Tue, 17 Nov 2020 12:47:27 +0000 (13:47 +0100)]
arm64: omit [_text, _stext) from permanent kernel mapping

In a previous patch, we increased the size of the EFI PE/COFF header
to 64 KB, which resulted in the _stext symbol to appear at a fixed
offset of 64 KB into the image.

Since 64 KB is also the largest page size we support, this completely
removes the need to map the first 64 KB of the kernel image, given that
it only contains the arm64 Image header and the EFI header, neither of
which we ever access again after booting the kernel. More importantly,
we should avoid an executable mapping of non-executable and not entirely
predictable data, to deal with the unlikely event that we inadvertently
emitted something that looks like an opcode that could be used as a
gadget for speculative execution.

So let's limit the kernel mapping of .text to the [_stext, _etext)
region, which matches the view of generic code (such as kallsyms) when
it reasons about the boundaries of the kernel's .text section.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201117124729.12642-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: abort counter_read_on_cpu() when irqs_disabled()
Ionela Voinescu [Fri, 13 Nov 2020 15:53:28 +0000 (15:53 +0000)]
arm64: abort counter_read_on_cpu() when irqs_disabled()

Given that smp_call_function_single() can deadlock when interrupts are
disabled, abort the SMP call if irqs_disabled(). This scenario is
currently not possible given the function's uses, but safeguard this for
potential future uses.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20201113155328.4194-1-ionela.voinescu@arm.com
[catalin.marinas@arm.com: modified following Mark's comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: implement CPPC FFH support using AMUs
Ionela Voinescu [Fri, 6 Nov 2020 12:53:34 +0000 (12:53 +0000)]
arm64: implement CPPC FFH support using AMUs

If Activity Monitors (AMUs) are present, two of the counters can be used
to implement support for CPPC's (Collaborative Processor Performance
Control) delivered and reference performance monitoring functionality
using FFH (Functional Fixed Hardware).

Given that counters for a certain CPU can only be read from that CPU,
while FFH operations can be called from any CPU for any of the CPUs, use
smp_call_function_single() to provide the requested values.

Therefore, depending on the register addresses, the following values
are returned:
 - 0x0 (DeliveredPerformanceCounterRegister): AMU core counter
 - 0x1 (ReferencePerformanceCounterRegister): AMU constant counter

The use of Activity Monitors is hidden behind the generic
cpu_read_{corecnt,constcnt}() functions.

Read functionality for these two registers represents the only current
FFH support for CPPC. Read operations for other register values or write
operation for all registers are unsupported. Therefore, keep CPPC's FFH
unsupported if no CPUs have valid AMU frequency counters. For this
purpose, the get_cpu_with_amu_feat() is introduced.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201106125334.21570-4-ionela.voinescu@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: split counter validation function
Ionela Voinescu [Fri, 6 Nov 2020 12:53:33 +0000 (12:53 +0000)]
arm64: split counter validation function

In order for the counter validation function to be reused, split
validate_cpu_freq_invariance_counters() into:
 - freq_counters_valid(cpu) - check cpu for valid cycle counters
 - freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) -
   generic function that sets the normalization ratio used by
   topology_scale_freq_tick()

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201106125334.21570-3-ionela.voinescu@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: wrap and generalise counter read functions
Ionela Voinescu [Fri, 6 Nov 2020 12:53:32 +0000 (12:53 +0000)]
arm64: wrap and generalise counter read functions

In preparation for other uses of Activity Monitors (AMU) cycle counters,
place counter read functionality in generic functions that can reused:
read_corecnt() and read_constcnt().

As a result, implement update_freq_counters_refs() to replace
init_cpu_freq_invariance_counters() and both initialise and update
the per-cpu reference variables.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201106125334.21570-2-ionela.voinescu@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: don't assume struct page is always 64 bytes
Ard Biesheuvel [Tue, 10 Nov 2020 18:05:11 +0000 (19:05 +0100)]
arm64: mm: don't assume struct page is always 64 bytes

Commit 8c96400d6a39be7 simplified the page-to-virt and virt-to-page
conversions, based on the assumption that struct page is always 64
bytes in size, in which case we can use a single signed shift to
perform the conversion (provided that the vmemmap array is placed
appropriately in the kernel VA space)

Unfortunately, this assumption turns out not to hold, and so we need
to revert part of this commit, and go back to an affine transformation.
Given that all the quantities involved are compile time constants,
this should not make any practical difference.

Fixes: 8c96400d6a39 ("arm64: mm: make vmemmap region a projection of the linear region")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20201110180511.29083-1-ardb@kernel.org
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64/mm/hotplug: Ensure early memory sections are all online
Anshuman Khandual [Mon, 9 Nov 2020 04:28:57 +0000 (09:58 +0530)]
arm64/mm/hotplug: Ensure early memory sections are all online

This adds a validation function that scans the entire boot memory and makes
sure that all early memory sections are online. This check is essential for
the memory notifier to work properly, as it cannot prevent any boot memory
from offlining, if all sections are not online to begin with. Although the
boot section scanning is selectively enabled with DEBUG_VM.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64/mm/hotplug: Enable MEM_OFFLINE event handling
Anshuman Khandual [Mon, 9 Nov 2020 04:28:56 +0000 (09:58 +0530)]
arm64/mm/hotplug: Enable MEM_OFFLINE event handling

This enables MEM_OFFLINE memory event handling. It will help intercept any
possible error condition such as if boot memory some how still got offlined
even after an explicit notifier failure, potentially by a future change in
generic hot plug framework. This would help detect such scenarios and help
debug further. While here, also call out the first section being attempted
for offline or got offlined.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64/mm/hotplug: Register boot memory hot remove notifier earlier
Anshuman Khandual [Mon, 9 Nov 2020 04:28:55 +0000 (09:58 +0530)]
arm64/mm/hotplug: Register boot memory hot remove notifier earlier

This moves memory notifier registration earlier in the boot process from
device_initcall() to early_initcall() which will help in guarding against
potential early boot memory offline requests. Even though there should not
be any actual offlinig requests till memory block devices are initialized
with memory_dev_init() but then generic init sequence might just change in
future. Hence an early registration for the memory event notifier would be
helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
is not enabled and also call out when memory notifier registration fails.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: account for hotplug memory when randomizing the linear region
Ard Biesheuvel [Wed, 14 Oct 2020 08:18:57 +0000 (10:18 +0200)]
arm64: mm: account for hotplug memory when randomizing the linear region

As a hardening measure, we currently randomize the placement of
physical memory inside the linear region when KASLR is in effect.
Since the random offset at which to place the available physical
memory inside the linear region is chosen early at boot, it is
based on the memblock description of memory, which does not cover
hotplug memory. The consequence of this is that the randomization
offset may be chosen such that any hotplugged memory located above
memblock_end_of_DRAM() that appears later is pushed off the end of
the linear region, where it cannot be accessed.

So let's limit this randomization of the linear region to ensure
that this can no longer happen, by using the CPU's addressable PA
range instead. As it is guaranteed that no hotpluggable memory will
appear that falls outside of that range, we can safely put this PA
range sized window anywhere in the linear region.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201014081857.3288-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64/smp: Drop the macro S(x,s)
Anshuman Khandual [Mon, 9 Nov 2020 11:38:36 +0000 (17:08 +0530)]
arm64/smp: Drop the macro S(x,s)

Mapping between IPI type index and its string is direct without requiring
an additional offset. Hence the existing macro S(x, s) is now redundant
and can just be dropped. This also makes the code clean and simple.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604921916-23368-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: consistently use reserved_pg_dir
Mark Rutland [Tue, 3 Nov 2020 10:22:29 +0000 (10:22 +0000)]
arm64: consistently use reserved_pg_dir

Depending on configuration options and specific code paths, we either
use the empty_zero_page or the configuration-dependent reserved_ttbr0
as a reserved value for TTBR{0,1}_EL1.

To simplify this code, let's always allocate and use the same
reserved_pg_dir, replacing reserved_ttbr0. Note that this is allocated
(and hence pre-zeroed), and is also marked as read-only in the kernel
Image mapping.

Keeping this separate from the empty_zero_page potentially helps with
robustness as the empty_zero_page is used in a number of cases where a
failure to map it read-only could allow it to become corrupted.

The (presently unused) swapper_pg_end symbol is also removed, and
comments are added wherever we rely on the offsets between the
pre-allocated pg_dirs to keep these cases easily identifiable.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201103102229.8542-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: kprobes: Remove redundant kprobe_step_ctx
Masami Hiramatsu [Tue, 3 Nov 2020 13:49:04 +0000 (14:49 +0100)]
arm64: kprobes: Remove redundant kprobe_step_ctx

The kprobe_step_ctx (kcb->ss_ctx) has ss_pending and match_addr, but
those are redundant because those can be replaced by KPROBE_HIT_SS and
&cur_kprobe->ainsn.api.insn[1] respectively.
To simplify the code, remove the kprobe_step_ctx.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201103134900.337243-2-jean-philippe@linaro.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoDocumentation/arm64: fix RST layout of memory.rst
Ard Biesheuvel [Tue, 10 Nov 2020 13:08:51 +0000 (14:08 +0100)]
Documentation/arm64: fix RST layout of memory.rst

Stephen reports that commit f4693c2716b3 ("arm64: mm: extend linear region
for 52-bit VA configurations") triggers the following warnings when building
the htmldocs make target of today's linux-next:

  Documentation/arm64/memory.rst:35: WARNING: Literal block ends without a blank line; unexpected unindent.
  Documentation/arm64/memory.rst:53: WARNING: Literal block ends without a blank line; unexpected unindent.

Let's tweak the memory layout table to work around this.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Fixes: f4693c2716b3 ("arm64: mm: extend linear region for 52-bit VA configurations")
Link: https://lore.kernel.org/r/20201110130851.15751-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y
Will Deacon [Tue, 30 Jun 2020 13:02:48 +0000 (14:02 +0100)]
arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y

When building with LTO, there is an increased risk of the compiler
converting an address dependency headed by a READ_ONCE() invocation
into a control dependency and consequently allowing for harmful
reordering by the CPU.

Ensure that such transformations are harmless by overriding the generic
READ_ONCE() definition with one that provides acquire semantics when
building with LTO.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: alternatives: Remove READ_ONCE() usage during patch operation
Will Deacon [Tue, 30 Jun 2020 13:06:04 +0000 (14:06 +0100)]
arm64: alternatives: Remove READ_ONCE() usage during patch operation

In preparation for patching the internals of READ_ONCE() itself, replace
its usage on the alternatives patching patch with a volatile variable
instead.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: cpufeatures: Add capability for LDAPR instruction
Will Deacon [Tue, 30 Jun 2020 13:02:22 +0000 (14:02 +0100)]
arm64: cpufeatures: Add capability for LDAPR instruction

Armv8.3 introduced the LDAPR instruction, which provides weaker memory
ordering semantics than LDARi (RCpc vs RCsc). Generally, we provide an
RCsc implementation when implementing the Linux memory model, but LDAPR
can be used as a useful alternative to dependency ordering, particularly
when the compiler is capable of breaking the dependencies.

Since LDAPR is not available on all CPUs, add a cpufeature to detect it at
runtime and allow the instruction to be used with alternative code
patching.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: alternatives: Split up alternative.h
Will Deacon [Tue, 30 Jun 2020 12:55:59 +0000 (13:55 +0100)]
arm64: alternatives: Split up alternative.h

asm/alternative.h contains both the macros needed to use alternatives,
as well the type definitions and function prototypes for applying them.

Split the header in two, so that alternatives can be used from core
header files such as linux/compiler.h without the risk of circular
includes

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: uaccess: move uao_* alternatives to asm-uaccess.h
Mark Rutland [Mon, 26 Oct 2020 13:31:47 +0000 (13:31 +0000)]
arm64: uaccess: move uao_* alternatives to asm-uaccess.h

The uao_* alternative asm macros are only used by the uaccess assembly
routines in arch/arm64/lib/, where they are included indirectly via
asm-uaccess.h. Since they're specific to the uaccess assembly (and will
lose the alternatives in subsequent patches), let's move them into
asm-uaccess.h.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
[will: update #include in mte.S to pull in uao asm macros]
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoarm64: mm: tidy up top of kernel VA space
Ard Biesheuvel [Thu, 8 Oct 2020 15:36:02 +0000 (17:36 +0200)]
arm64: mm: tidy up top of kernel VA space

Tidy up the way the top of the kernel VA space is organized, by mirroring
the 256 MB region we have below the vmalloc space, and populating it top
down with the PCI I/O space, some guard regions, and the fixmap region.
The latter region is itself populated top down, and today only covers
about 4 MB, and so 224 MB is ample, and no guard region is therefore
required.

The resulting layout is identical between 48-bit/4k and 52-bit/64k
configurations.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Link: https://lore.kernel.org/r/20201008153602.9467-5-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: make vmemmap region a projection of the linear region
Ard Biesheuvel [Thu, 8 Oct 2020 15:36:01 +0000 (17:36 +0200)]
arm64: mm: make vmemmap region a projection of the linear region

Now that we have reverted the introduction of the vmemmap struct page
pointer and the separate physvirt_offset, we can simplify things further,
and place the vmemmap region in the VA space in such a way that virtual
to page translations and vice versa can be implemented using a single
arithmetic shift.

One happy coincidence resulting from this is that the 48-bit/4k and
52-bit/64k configurations (which are assumed to be the two most
prevalent) end up with the same placement of the vmemmap region. In
a subsequent patch, we will take advantage of this, and unify the
memory maps even more.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Link: https://lore.kernel.org/r/20201008153602.9467-4-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoarm64: mm: extend linear region for 52-bit VA configurations
Ard Biesheuvel [Thu, 8 Oct 2020 15:36:00 +0000 (17:36 +0200)]
arm64: mm: extend linear region for 52-bit VA configurations

For historical reasons, the arm64 kernel VA space is configured as two
equally sized halves, i.e., on a 48-bit VA build, the VA space is split
into a 47-bit vmalloc region and a 47-bit linear region.

When support for 52-bit virtual addressing was added, this equal split
was kept, resulting in a substantial waste of virtual address space in
the linear region:

                           48-bit VA                     52-bit VA
  0xffff_ffff_ffff_ffff +-------------+               +-------------+
                        |   vmalloc   |               |   vmalloc   |
  0xffff_8000_0000_0000 +-------------+ _PAGE_END(48) +-------------+
                        |   linear    |               :             :
  0xffff_0000_0000_0000 +-------------+               :             :
                        :             :               :             :
                        :             :               :             :
                        :             :               :             :
                        :             :               :  currently  :
                        :  unusable   :               :             :
                        :             :               :   unused    :
                        :     by      :               :             :
                        :             :               :             :
                        :  hardware   :               :             :
                        :             :               :             :
  0xfff8_0000_0000_0000 :             : _PAGE_END(52) +-------------+
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :  unusable   :               |             |
                        :             :               |   linear    |
                        :     by      :               |             |
                        :             :               |   region    |
                        :  hardware   :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
  0xfff0_0000_0000_0000 +-------------+  PAGE_OFFSET  +-------------+

As illustrated above, the 52-bit VA kernel uses 47 bits for the vmalloc
space (as before), to ensure that a single 64k granule kernel image can
support any 64k granule capable system, regardless of whether it supports
the 52-bit virtual addressing extension. However, due to the fact that
the VA space is still split in equal halves, the linear region is only
2^51 bytes in size, wasting almost half of the 52-bit VA space.

Let's fix this, by abandoning the equal split, and simply assigning all
VA space outside of the vmalloc region to the linear region.

The KASAN shadow region is reconfigured so that it ends at the start of
the vmalloc region, and grows downwards. That way, the arrangement of
the vmalloc space (which contains kernel mappings, modules, BPF region,
the vmemmap array etc) is identical between non-KASAN and KASAN builds,
which aids debugging.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Link: https://lore.kernel.org/r/20201008153602.9467-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoLinux 5.10-rc3
Linus Torvalds [Mon, 9 Nov 2020 00:10:16 +0000 (16:10 -0800)]
Linux 5.10-rc3

3 years agoMerge tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Nov 2020 19:30:25 +0000 (11:30 -0800)]
Merge tag 'driver-core-5.10-rc3' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core documentation fixes from Greg KH:
 "Some small Documentation fixes that were fallout from the larger
  documentation update we did in 5.10-rc2.

  Nothing major here at all, but all of these have been in linux-next
  and resolve build warnings when building the documentation files"

* tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Documentation: remove mic/index from misc-devices/index.rst
  scripts: get_api.pl: Add sub-titles to ABI output
  scripts: get_abi.pl: Don't let ABI files to create subtitles
  docs: leds: index.rst: add a missing file
  docs: ABI: sysfs-class-net: fix a typo
  docs: ABI: sysfs-driver-dma-ioatdma: what starts with /sys

3 years agoMerge tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 8 Nov 2020 19:28:08 +0000 (11:28 -0800)]
Merge tag 'tty-5.10-rc3' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are a small number of small tty and serial fixes for some
  reported problems for the tty core, vt code, and some serial drivers.

  They include fixes for:

   - a buggy and obsolete vt font ioctl removal

   - 8250_mtk serial baudrate runtime warnings

   - imx serial earlycon build configuration fix

   - txx9 serial driver error path cleanup issues

   - tty core fix in release_tty that can be triggered by trying to bind
     an invalid serial port name to a speakup console device

  Almost all of these have been in linux-next without any problems, the
  only one that hasn't, just deletes code :)"

* tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  vt: Disable KD_FONT_OP_COPY
  tty: fix crash in release_tty if tty->port is not set
  serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init
  tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is enabled
  serial: 8250_mtk: Fix uart_get_baud_rate warning

3 years agoMerge tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 8 Nov 2020 19:24:10 +0000 (11:24 -0800)]
Merge tag 'usb-5.10-rc3' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes and new device ids:

   - USB gadget fixes for some reported issues

   - Fixes for the ever-troublesome apple fastcharge driver, hopefully
     we finally have it right.

   - More USB core quirks for odd devices

   - USB serial driver fixes for some long-standing issues that were
     recently found

   - some new USB serial driver device ids

  All have been in linux-next with no reported issues"

* tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: apple-mfi-fastcharge: fix reference leak in apple_mfi_fc_set_property
  usb: mtu3: fix panic in mtu3_gadget_stop()
  USB: serial: option: add Telit FN980 composition 0x1055
  USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231
  USB: serial: cyberjack: fix write-URB completion race
  USB: Add NO_LPM quirk for Kingston flash drive
  USB: serial: option: add Quectel EC200T module support
  usb: raw-gadget: fix memory leak in gadget_setup
  usb: dwc2: Avoid leaving the error_debugfs label unused
  usb: dwc3: ep0: Fix delay status handling
  usb: gadget: fsl: fix null pointer checking
  usb: gadget: goku_udc: fix potential crashes in probe
  usb: dwc3: pci: add support for the Intel Alder Lake-S

3 years agofork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent
Eddy Wu [Sat, 7 Nov 2020 06:47:22 +0000 (14:47 +0800)]
fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent

current->group_leader->exit_signal may change during copy_process() if
current->real_parent exits.

Move the assignment inside tasklist_lock to avoid the race.

Signed-off-by: Eddy Wu <eddy_wu@trendmicro.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agovt: Disable KD_FONT_OP_COPY
Daniel Vetter [Sun, 8 Nov 2020 15:38:06 +0000 (16:38 +0100)]
vt: Disable KD_FONT_OP_COPY

It's buggy:

On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote:
> We recently discovered a slab-out-of-bounds read in fbcon in the latest
> kernel ( v5.10-rc2 for now ).  The root cause of this vulnerability is that
> "fbcon_do_set_font" did not handle "vc->vc_font.data" and
> "vc->vc_font.height" correctly, and the patch
> <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this
> issue.
>
> Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and
> use  KD_FONT_OP_SET again to set a large font.height for tty1. After that,
> we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data
> in "fbcon_do_set_font", while tty1 retains the original larger
> height. Obviously, this will cause an out-of-bounds read, because we can
> access a smaller vc_font.data with a larger vc_font.height.

Further there was only one user ever.
- Android's loadfont, busybox and console-tools only ever use OP_GET
  and OP_SET
- fbset documentation only mentions the kernel cmdline font: option,
  not anything else.
- systemd used OP_COPY before release 232 published in Nov 2016

Now unfortunately the crucial report seems to have gone down with
gmane, and the commit message doesn't say much. But the pull request
hints at OP_COPY being broken

https://github.com/systemd/systemd/pull/3651

So in other words, this never worked, and the only project which
foolishly every tried to use it, realized that rather quickly too.

Instead of trying to fix security issues here on dead code by adding
missing checks, fix the entire thing by removing the functionality.

Note that systemd code using the OP_COPY function ignored the return
value, so it doesn't matter what we're doing here really - just in
case a lone server somewhere happens to be extremely unlucky and
running an affected old version of systemd. The relevant code from
font_copy_to_all_vcs() in systemd was:

/* copy font from active VT, where the font was uploaded to */
cfo.op = KD_FONT_OP_COPY;
cfo.height = vcs.v_active-1; /* tty1 == index 0 */
(void) ioctl(vcfd, KDFONTOP, &cfo);

Note this just disables the ioctl, garbage collecting the now unused
callbacks is left for -next.

v2: Tetsuo found the old mail, which allowed me to find it on another
archive. Add the link too.

Acked-by: Peilin Ye <yepeilin.cs@gmail.com>
Reported-by: Minh Yuan <yuanmingbuaa@gmail.com>
References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html
References: https://github.com/systemd/systemd/pull/3651
Cc: Greg KH <greg@kroah.com>
Cc: Peilin Ye <yepeilin.cs@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://lore.kernel.org/r/20201108153806.3140315-1-daniel.vetter@ffwll.ch
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoMerge tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Sun, 8 Nov 2020 18:23:07 +0000 (10:23 -0800)]
Merge tag 'xfs-5.10-fixes-3' of git://git./fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:

 - Fix an uninitialized struct problem

 - Fix an iomap problem zeroing unwritten EOF blocks

 - Fix some clumsy error handling when writeback fails on filesystems
   with blocksize < pagesize

 - Fix a retry loop not resetting loop variables properly

 - Fix scrub flagging rtinherit inodes on a non-rt fs, since the kernel
   actually does permit that combination

 - Fix excessive page cache flushing when unsharing part of a file

* tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: only flush the unshared range in xfs_reflink_unshare
  xfs: fix scrub flagging rtinherit even if there is no rt device
  xfs: fix missing CoW blocks writeback conversion retry
  iomap: clean up writeback state logic on writepage error
  iomap: support partial page discard on writeback block mapping failure
  xfs: flush new eof page on truncate to avoid post-eof corruption
  xfs: set xefi_discard when creating a deferred agfl free log intent item

3 years agoMerge branch 'hch' (patches from Christoph)
Linus Torvalds [Sun, 8 Nov 2020 18:11:31 +0000 (10:11 -0800)]
Merge branch 'hch' (patches from Christoph)

Merge procfs splice read fixes from Christoph Hellwig:
 "Greg reported a problem due to the fact that Android tests use procfs
  files to test splice, which stopped working with the changes for
  set_fs() removal.

  This series adds read_iter support for seq_file, and uses those for
  various proc files using seq_file to restore splice read support"

[ Side note: Christoph initially had a scripted "move everything over"
  patch, which looks fine, but I personally would prefer us to actively
  discourage splice() on random files.  So this does just the minimal
  basic core set of proc file op conversions.

  For completeness, and in case people care, that script was

     sed -i -e 's/\.proc_read\(\s*=\s*\)seq_read/\.proc_read_iter\1seq_read_iter/g'

  but I'll wait and see if somebody has a strong argument for using
  splice on random small /proc files before I'd run it on the whole
  kernel.   - Linus ]

* emailed patches from Christoph Hellwig <hch@lst.de>:
  proc "seq files": switch to ->read_iter
  proc "single files": switch to ->read_iter
  proc/stat: switch to ->read_iter
  proc/cpuinfo: switch to ->read_iter
  proc: wire up generic_file_splice_read for iter ops
  seq_file: add seq_read_iter

3 years agoMerge tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Nov 2020 18:09:36 +0000 (10:09 -0800)]
Merge tag 'x86-urgent-2020-11-08' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of x86 fixes:

   - Use SYM_FUNC_START_WEAK in the mem* ASM functions instead of a
     combination of .weak and SYM_FUNC_START_LOCAL which makes LLVMs
     integrated assembler upset

   - Correct the mitigation selection logic which prevented the related
     prctl to work correctly

   - Make the UV5 hubless system work correctly by fixing up the
     malformed table entries and adding the missing ones"

* tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/uv: Recognize UV5 hubless system identifier
  x86/platform/uv: Remove spaces from OEM IDs
  x86/platform/uv: Fix missing OEM_TABLE_ID
  x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
  x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S

3 years agoMerge tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Nov 2020 18:05:10 +0000 (10:05 -0800)]
Merge tag 'perf-urgent-2020-11-08' of git://git./linux/kernel/git/tip/tip

Pull perf fix from Thomas Gleixner:
 "A single fix for the perf core plugging a memory leak in the address
  filter parser"

* tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix a memory leak in perf_event_parse_addr_filter()

3 years agoMerge tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 8 Nov 2020 17:56:37 +0000 (09:56 -0800)]
Merge tag 'locking-urgent-2020-11-08' of git://git./linux/kernel/git/tip/tip

Pull futex fix from Thomas Gleixner:
 "A single fix for the futex code where an intermediate state in the
  underlying RT mutex was not handled correctly and triggering a BUG()
  instead of treating it as another variant of retry condition"

* tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Handle transient "ownerless" rtmutex state correctly

3 years agoMerge tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Nov 2020 17:52:57 +0000 (09:52 -0800)]
Merge tag 'irq-urgent-2020-11-08' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of fixes for interrupt chip drivers:

   - Fix the fallout of the IPI as interrupt conversion in Kconfig and
     the BCM2836 interrupt chip driver

   - Fixes for interrupt affinity setting and the handling of
     hierarchical irq domains in the SiFive PLIC driver

   - Make the unmapped event handling in the TI SCI driver work
     correctly

   - A few minor fixes and cleanups in various chip drivers and Kconfig"

* tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  dt-bindings: irqchip: ti, sci-inta: Fix diagram indentation for unmapped events
  irqchip/ti-sci-inta: Add support for unmapped event handling
  dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling
  irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm
  irqchip/sifive-plic: Fix chip_data access within a hierarchy
  irqchip/sifive-plic: Fix broken irq_set_affinity() callback
  irqchip/stm32-exti: Add all LP timer exti direct events support
  irqchip/bcm2836: Fix missing __init annotation
  irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY
  irqchip/mst: Make mst_intc_of_init static
  irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7
  genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY

3 years agoMerge tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Nov 2020 17:51:28 +0000 (09:51 -0800)]
Merge tag 'core-urgent-2020-11-08' of git://git./linux/kernel/git/tip/tip

Pull entry code fix from Thomas Gleixner:
 "A single fix for the generic entry code to correct the wrong
  assumption that the lockdep interrupt state needs not to be
  established before calling the RCU check"

* tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Fix the incorrect ordering of lockdep and RCU check

3 years agoMerge tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 8 Nov 2020 17:37:20 +0000 (09:37 -0800)]
Merge tag 'powerpc-5.10-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - fix miscompilation with GCC 4.9 by using asm_goto_volatile for put_user()

 - fix for an RCU splat at boot caused by a recent lockdep change

 - fix for a possible deadlock in our EEH debugfs code

 - several fixes for handling of _PAGE_ACCESSED on 32-bit platforms

 - build fix when CONFIG_NUMA=n

Thanks to Andreas Schwab, Christophe Leroy, Oliver O'Halloran, Qian Cai,
and Scott Cheloha.

* tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/numa: Fix build when CONFIG_NUMA=n
  powerpc/8xx: Manage _PAGE_ACCESSED through APG bits in L1 entry
  powerpc/8xx: Always fault when _PAGE_ACCESSED is not set
  powerpc/40x: Always fault when _PAGE_ACCESSED is not set
  powerpc/603: Always fault when _PAGE_ACCESSED is not set
  powerpc: Use asm_goto_volatile for put_user()
  powerpc/smp: Call rcu_cpu_starting() earlier
  powerpc/eeh_cache: Fix a possible debugfs deadlock

3 years agoMerge tag 'block-5.10-2020-11-07' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 7 Nov 2020 21:56:07 +0000 (13:56 -0800)]
Merge tag 'block-5.10-2020-11-07' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request from Christoph:
    - revert a nvme_queue size optimization (Keith Bush)
    - fabrics timeout races fixes (Chao Leng and Sagi Grimberg)"

 - null_blk zone locking fix (Damien)

* tag 'block-5.10-2020-11-07' of git://git.kernel.dk/linux-block:
  null_blk: Fix scheduling in atomic with zoned mode
  nvme-tcp: avoid repeated request completion
  nvme-rdma: avoid repeated request completion
  nvme-tcp: avoid race between time out and tear down
  nvme-rdma: avoid race between time out and tear down
  nvme: introduce nvme_sync_io_queues
  Revert "nvme-pci: remove last_sq_tail"

3 years agoMerge tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 7 Nov 2020 21:49:24 +0000 (13:49 -0800)]
Merge tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "A set of fixes for io_uring:

   - SQPOLL cancelation fixes

   - Two fixes for the io_identity COW

   - Cancelation overflow fix (Pavel)

   - Drain request cancelation fix (Pavel)

   - Link timeout race fix (Pavel)"

* tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-block:
  io_uring: fix link lookup racing with link timeout
  io_uring: use correct pointer for io_uring_show_cred()
  io_uring: don't forget to task-cancel drained reqs
  io_uring: fix overflowed cancel w/ linked ->files
  io_uring: drop req/tctx io_identity separately
  io_uring: ensure consistent view of original task ->mm from SQPOLL
  io_uring: properly handle SQPOLL request cancelations
  io-wq: cancel request if it's asking for files and we don't have them

3 years agofutex: Handle transient "ownerless" rtmutex state correctly
Mike Galbraith [Wed, 4 Nov 2020 15:12:44 +0000 (16:12 +0100)]
futex: Handle transient "ownerless" rtmutex state correctly

Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner().
This is one possible chain of events leading to this:

Task Prio       Operation
T1   120 lock(F)
T2   120 lock(F)   -> blocks (top waiter)
T3   50 (RT) lock(F)   -> boosts T1 and blocks (new top waiter)
XX    timeout/  -> wakes T2
signal
T1   50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set)
T2   120 cleanup   -> try_to_take_mutex() fails because T3 is the top waiter
           and the lower priority T2 cannot steal the lock.
        -> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON()

The comment states that this is invalid and rt_mutex_real_owner() must
return a non NULL owner when the trylock failed, but in case of a queued
and woken up waiter rt_mutex_real_owner() == NULL is a valid transient
state. The higher priority waiter has simply not yet managed to take over
the rtmutex.

The BUG_ON() is therefore wrong and this is just another retry condition in
fixup_pi_state_owner().

Drop the locks, so that T3 can make progress, and then try the fixup again.

Gratian provided a great analysis, traces and a reproducer. The analysis is
to the point, but it confused the hell out of that tglx dude who had to
page in all the futex horrors again. Condensed version is above.

[ tglx: Wrote comment and changelog ]

Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Reported-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com
Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de
3 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 7 Nov 2020 19:24:03 +0000 (11:24 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Driver bugfixes for I2C.

  Most of them are for the new mlxbf driver which got more exposure
  after rc1. The sh_mobile patch should already have reached you during
  the merge window, but I accidently dropped it. However, since it fixes
  a problem with rebooting, it is still fine for rc3"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: designware: slave should do WRITE_REQUESTED before WRITE_RECEIVED
  i2c: designware: call i2c_dw_read_clear_intrbits_slave() once
  i2c: mlxbf: I2C_MLXBF should depend on MELLANOX_PLATFORM
  i2c: mlxbf: Update author and maintainer email info
  i2c: mlxbf: Update reference clock frequency
  i2c: mlxbf: Remove unecessary wrapper functions
  i2c: mlxbf: Fix resrticted cast warning of sparse
  i2c: mlxbf: Add CONFIG_ACPI to guard ACPI function call
  i2c: sh_mobile: implement atomic transfers
  i2c: mediatek: move dma reset before i2c reset

3 years agoMerge tag 'riscv-for-linus-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 7 Nov 2020 19:16:37 +0000 (11:16 -0800)]
Merge tag 'riscv-for-linus-5.10-rc3' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - SPDX comment style fix

 - ignore memory that is unusable

 - avoid setting a kernel text offset for the !MMU kernels, where
   skipping the first page of memory is both unnecessary and costly

 - avoid passing the flag bits in satp to pfn_to_virt()

 - fix __put_kernel_nofault, where we had the arguments to
   __put_user_nocheck reversed

 - workaround for a bug in the FU540 to avoid triggering PMP issues
   during early boot

 - change to how we pull symbols out of the vDSO. The old mechanism was
   removed from binutils-2.35 (and has been backported to Debian's 2.34)

* tag 'riscv-for-linus-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Fix the VDSO symbol generaton for binutils-2.35+
  RISC-V: Use non-PGD mappings for early DTB access
  riscv: uaccess: fix __put_kernel_nofault()
  riscv: fix pfn_to_virt err in do_page_fault().
  riscv: Set text_offset correctly for M-Mode
  RISC-V: Remove any memblock representing unusable memory area
  risc-v: kernel: ftrace: Fixes improper SPDX comment style

3 years agoMerge tag 'usb-serial-5.10-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Sat, 7 Nov 2020 14:56:37 +0000 (15:56 +0100)]
Merge tag 'usb-serial-5.10-rc3' of https://git./linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for 5.10-rc3

Here's a fix for a long-standing issue with the cyberjack driver and
some new device ids.

All have been in linux-next with no reported issues.

* tag 'usb-serial-5.10-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add Telit FN980 composition 0x1055
  USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231
  USB: serial: cyberjack: fix write-URB completion race
  USB: serial: option: add Quectel EC200T module support

3 years agoperf/core: Fix a memory leak in perf_event_parse_addr_filter()
kiyin(尹亮) [Wed, 4 Nov 2020 05:23:22 +0000 (08:23 +0300)]
perf/core: Fix a memory leak in perf_event_parse_addr_filter()

As shown through runtime testing, the "filename" allocation is not
always freed in perf_event_parse_addr_filter().

There are three possible ways that this could happen:

 - It could be allocated twice on subsequent iterations through the loop,
 - or leaked on the success path,
 - or on the failure path.

Clean up the code flow to make it obvious that 'filename' is always
freed in the reallocation path and in the two return paths as well.

We rely on the fact that kfree(NULL) is NOP and filename is initialized
with NULL.

This fixes the leak. No other side effects expected.

[ Dan Carpenter: cleaned up the code flow & added a changelog. ]
[ Ingo Molnar: updated the changelog some more. ]

Fixes: 375637bc5249 ("perf/core: Introduce address range filtering")
Signed-off-by: "kiyin(尹亮)" <kiyin@tencent.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "Srivatsa S. Bhat" <srivatsa@csail.mit.edu>
Cc: Anthony Liguori <aliguori@amazon.com>
--
 kernel/events/core.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

3 years agox86/platform/uv: Recognize UV5 hubless system identifier
Mike Travis [Thu, 5 Nov 2020 22:27:41 +0000 (16:27 -0600)]
x86/platform/uv: Recognize UV5 hubless system identifier

Testing shows a problem in that UV5 hubless systems were not being
recognized.  Add them to the list of OEM IDs checked.

Fixes: 6c7794423a998 ("Add UV5 direct references")
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201105222741.157029-4-mike.travis@hpe.com