review.tizen.org Git - platform/kernel/linux-rpi.git/log

Revert "drm/amd/display: Fix for otg synchronization logic"

This reverts commit a896f870f8a5f23ec961d16baffd3fda1f8be57c.

It causes odd flickering on my Radeon RX580 (PCI ID 1002:67df rev e7,
subsystem ID 1da2:e353).

Bisected right to this commit, and reverting it fixes things.

Link: https://lore.kernel.org/all/CAHk-=wg9hDde_L3bK9tAfdJ4N=TJJ+SjO3ZDONqH5=bVoy_Mzg@mail.gmail.com/
Cc: Alex Deucher <alexdeucher@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Jun Lei <Jun.Lei@amd.com>
Cc: Mustapha Ghaddar <mustapha.ghaddar@amd.com>
Cc: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Cc: meenakshikumar somasundaram <meenakshikumar.somasundaram@amd.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
"Highlights are support for privacy screens found in new laptops, a
  bunch of nomodeset refactoring, and i915 enables ADL-P systems by
  default, while starting to add RPL-S support.

  vmwgfx adds GEM and support for OpenGL 4.3 features in userspace.

  Lots of internal refactorings around dma reservations, and lots of
  driver refactoring as well.

  Summary:

  core:
   - add privacy screen support
   - move nomodeset option into drm subsystem
   - clean up nomodeset handling in drivers
   - make drm_irq.c legacy
   - fix stack_depot name conflicts
   - remove DMA_BUF_SET_NAME ioctl restrictions
   - sysfs: send hotplug event
   - replace several DRM_* logging macros with drm_*
   - move hashtable to legacy code
   - add error return from gem_create_object
   - cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER
   - kernel.h related include cleanups
   - support XRGB2101010 source buffers

  ttm:
   - don't include drm hashtable
   - stop pruning fences after wait
   - documentation updates

  dma-buf:
   - add dma_resv selftest
   - add debugfs helpers
   - remove dma_resv_get_excl_unlocked
   - documentation
   - make fences mandatory in dma_resv_add_excl_fence

  dp:
   - add link training delay helpers

  gem:
   - link shmem/cma helpers into separate modules
   - use dma_resv iteratior
   - import dma-buf namespace into gem helper modules

  scheduler:
   - fence grab fix
   - lockdep fixes

  bridge:
   - switch to managed MIPI DSI helpers
   - register and attach during probe fixes
   - convert to YAML in several places.

  panel:
   - add bunch of new panesl

  simpledrm:
   - support FB_DAMAGE_CLIPS
   - support virtual screen sizes
   - add Apple M1 support

  amdgpu:
   - enable seamless boot for DCN 3.01
   - runtime PM fixes
   - use drm_kms_helper_connector_hotplug_event
   - get all fences at once
   - use generic drm fb helpers
   - PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes
   - add smart trace buffer (STB) for supported GPUs
   - display debugfs entries
   - new SMU debug option
   - Documentation update

  amdkfd:
   - IP discovery enumeration refactor
   - interface between driver fixes
   - SVM fixes
   - kfd uapi header to define some sysfs bitfields.

  i915:
   - support VESA panel backlights
   - enable ADL-P by default
   - add eDP privacy screen support
   - add Raptor Lake S (RPL-S) support
   - DG2 page table support
   - lots of GuC/HuC fw refactoring
   - refactored i915->gt interfaces
   - CD clock squashing support
   - enable 10-bit gamma support
   - update ADL-P DMC fw to v2.14
   - enable runtime PM autosuspend by default
   - ADL-P DSI support
   - per-lane DP drive settings for ICL+
   - add support for pipe C/D DMC firmware
   - Atomic gamma LUT updates
   - remove CCS FB stride restrictions on ADL-P
   - VRR platform support for display 11
   - add support for display audio codec keepalive
   - lots of display refactoring
   - fix runtime PM handling during PXP suspend
   - improved eviction performance with async TTM moves
   - async VMA unbinding improvements
   - VMA locking refactoring
   - improved error capture robustness
   - use per device iommu checks
   - drop bits stealing from i915_sw_fence function ptr
   - remove dma_resv_prune
   - add IC cache invalidation on DG2

  nouveau:
   - crc fixes
   - validate LUTs in atomic check
   - set HDMI AVI RGB quant to full

  tegra:
   - buffer objects reworks for dma-buf compat
   - NVDEC driver uAPI support
   - power management improvements

  etnaviv:
   - IOMMU enabled system support
   - fix > 4GB command buffer mapping
   - close a DoS vector
   - fix spurious GPU resets

  ast:
   - fix i2c initialization

  rcar-du:
   - DSI output support

  exynos:
   - replace legacy gpio interface
   - implement generic GEM object mmap

  msm:
   - dpu plane state cleanup in prep for multirect
   - dpu debugfs cleanups
   - dp support for sc7280
   - a506 support
   - removal of struct_mutex
   - remove old eDP sub-driver

  anx7625:
   - support MIPI DSI input
   - support HDMI audio
   - fix reading EDID

  lvds:
   - fix bridge DT bindings

  megachips:
   - probe both bridges before registering

  dw-hdmi:
   - allow interlace on bridge

  ps8640:
   - enable runtime PM
   - support aux-bus

  tx358768:
   - enable reference clock
   - add pulse mode support

  ti-sn65dsi86:
   - use regmap bulk write
   - add PWM support

  etnaviv:
   - get all fences at once

  gma500:
   - gem object cleanups

  kmb:
   - enable fb console

  radeon:
   - use dma_resv_wait_timeout

  rockchip:
   - add DSP hold timeout
   - suspend/resume fixes
   - PLL clock fixes
   - implement mmap in GEM object functions
   - use generic fbdev emulation

  sun4i:
   - use CMA helpers without vmap support

  vc4:
   - fix HDMI-CEC hang with display is off
   - power on HDMI controller while disabling
   - support 4K@60Hz modes
   - support 10-bit YUV 4:2:0 output

  vmwgfx:
   - fix leak on probe errors
   - fail probing on broken hosts
   - new placement for MOB page tables
   - hide internal BOs from userspace
   - implement GEM support
   - implement GL 4.3 support

  virtio:
   - overflow fixes

  xen:
   - implement mmap as GEM object function

  omapdrm:
   - fix scatterlist export
   - support virtual planes

  mediatek:
   - MT8192 support
   - CMDQ refinement"

* tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm: (1241 commits)
  drm/amdgpu: no DC support for headless chips
  drm/amd/display: fix dereference before NULL check
  drm/amdgpu: always reset the asic in suspend (v2)
  drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
  drm/amd/display: Fix the uninitialized variable in enable_stream_features()
  drm/amdgpu: fix runpm documentation
  amdgpu/pm: Make sysfs pm attributes as read-only for VFs
  drm/amdgpu: save error count in RAS poison handler
  drm/amdgpu: drop redundant semicolon
  drm/amd/display: get and restore link res map
  drm/amd/display: support dynamic HPO DP link encoder allocation
  drm/amd/display: access hpo dp link encoder only through link resource
  drm/amd/display: populate link res in both detection and validation
  drm/amd/display: define link res and make it accessible to all link interfaces
  drm/amd/display: 3.2.167
  drm/amd/display: [FW Promotion] Release 0.0.98
  drm/amd/display: Undo ODM combine
  drm/amd/display: Add reg defs for DCN303
  drm/amd/display: Changed pipe split policy to allow for multi-display pipe split
  drm/amd/display: Set optimize_pwr_state for DCN31
  ...

Merge tag 'linux-kselftest-kunit-5.17-rc1' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:
"This consists of several fixes and enhancements. A few highlights:

   - Option --kconfig_add option allows easily tweaking kunitconfigs

   - make build subcommand can reconfigure if needed

   - doesn't error on tests without test plans

   - doesn't crash if no parameters are generated

   - defaults --jobs to # of cups

   - reports test parameter results as (K)TAP subtests"

* tag 'linux-kselftest-kunit-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: Default --jobs to number of CPUs
  kunit: tool: fix newly introduced typechecker errors
  kunit: tool: make `build` subcommand also reconfigure if needed
  kunit: tool: delete kunit_parser.TestResult type
  kunit: tool: use dataclass instead of collections.namedtuple
  kunit: tool: suggest using decode_stacktrace.sh on kernel crash
  kunit: tool: reconfigure when the used kunitconfig changes
  kunit: tool: revamp message for invalid kunitconfig
  kunit: tool: add --kconfig_add to allow easily tweaking kunitconfigs
  kunit: tool: move Kconfig read_from_file/parse_from_string to package-level
  kunit: tool: print parsed test results fully incrementally
  kunit: Report test parameter results as (K)TAP subtests
  kunit: Don't crash if no parameters are generated
  kunit: tool: Report an error if any test has no subtests
  kunit: tool: Do not error on tests without test plans
  kunit: add run_checks.py script to validate kunit changes
  Documentation: kunit: remove claims that kunit is a mocking framework
  kunit: tool: fix --json output for skipped tests

Merge tag 'linux-kselftest-next-5.17-rc1' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull Kselftest update from Shuah Khan:
"Fixes to build errors, false negatives, and several code cleanups,
  including the ARRAY_SIZE cleanup that removes 25+ duplicates
  ARRAY_SIZE defines from individual tests"

* tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/vm: remove ARRAY_SIZE define from individual tests
  selftests/timens: remove ARRAY_SIZE define from individual tests
  selftests/sparc64: remove ARRAY_SIZE define from adi-test
  selftests/seccomp: remove ARRAY_SIZE define from seccomp_benchmark
  selftests/rseq: remove ARRAY_SIZE define from individual tests
  selftests/net: remove ARRAY_SIZE define from individual tests
  selftests/landlock: remove ARRAY_SIZE define from common.h
  selftests/ir: remove ARRAY_SIZE define from ir_loopback.c
  selftests/core: remove ARRAY_SIZE define from close_range_test.c
  selftests/cgroup: remove ARRAY_SIZE define from cgroup_util.h
  selftests/arm64: remove ARRAY_SIZE define from vec-syscfg.c
  tools: fix ARRAY_SIZE defines in tools and selftests hdrs
  selftests: cgroup: build error multiple outpt files
  selftests/move_mount_set_group remove unneeded conversion to bool
  selftests/mount: remove unneeded conversion to bool
  selftests: harness: avoid false negatives if test has no ASSERTs
  selftests/ftrace: make kprobe profile testcase description unique
  selftests: clone3: clone3: add case CLONE3_ARGS_NO_TEST
  selftests: timers: Remove unneeded semicolon
  kselftests: timers:Remove unneeded semicolon

Merge tag 'slab-for-5.17' of git://git./linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:

- Separate struct slab from struct page - an offshot of the page folio
   work.

   Struct page fields used by slab allocators are moved from struct page
   to a new struct slab, that uses the same physical storage. Similar to
   struct folio, it always is a head page. This brings better type
   safety, separation of large kmalloc allocations from true slabs, and
   cleanup of related objcg code.

- A SLAB_MERGE_DEFAULT config optimization.

* tag 'slab-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (33 commits)
  mm/slob: Remove unnecessary page_mapcount_reset() function call
  bootmem: Use page->index instead of page->freelist
  zsmalloc: Stop using slab fields in struct page
  mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled
  mm/slub: Simplify struct slab slabs field definition
  mm/sl*b: Differentiate struct slab fields by sl*b implementations
  mm/kfence: Convert kfence_guarded_alloc() to struct slab
  mm/kasan: Convert to struct folio and struct slab
  mm/slob: Convert SLOB to use struct slab and struct folio
  mm/memcg: Convert slab objcgs from struct page to struct slab
  mm: Convert struct page to struct slab in functions used by other subsystems
  mm/slab: Finish struct page to struct slab conversion
  mm/slab: Convert most struct page to struct slab by spatch
  mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
  mm/slub: Finish struct page to struct slab conversion
  mm/slub: Convert most struct page to struct slab by spatch
  mm/slub: Convert pfmemalloc_match() to take a struct slab
  mm/slub: Convert __free_slab() to use struct slab
  mm/slub: Convert alloc_slab_page() to return a struct slab
  mm/slub: Convert print_page_info() to print_slab_info()
  ...

Merge branch 'random-5.17-for-linus' of git://git./linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
"These a bit more numerous than usual for the RNG, due to folks
  resubmitting patches that had been pending prior and generally renewed
  interest.

  There are a few categories of patches in here:

   1) Dominik Brodowski and I traded a series back and forth for a some
      weeks that fixed numerous issues related to seeds being provided
      at extremely early boot by the firmware, before other parts of the
      kernel or of the RNG have been initialized, both fixing some
      crashes and addressing correctness around early boot randomness.
      One of these is marked for stable.

   2) I replaced the RNG's usage of SHA-1 with BLAKE2s in the entropy
      extractor, and made the construction a bit safer and more
      standard. This was sort of a long overdue low hanging fruit, as we
      were supposed to have phased out SHA-1 usage quite some time ago
      (even if all we needed here was non-invertibility). Along the way
      it also made extraction 131% faster. This required a bit of
      Kconfig and symbol plumbing to make things work well with the
      crypto libraries, which is one of the reasons why I'm sending you
      this pull early in the cycle.

   3) I got rid of a truly superfluous call to RDRAND in the hot path,
      which resulted in a whopping 370% increase in performance.

   4) Sebastian Andrzej Siewior sent some patches regarding PREEMPT_RT,
      the full series of which wasn't ready yet, but the first two
      preparatory cleanups were good on their own. One of them touches
      files in kernel/irq/, which is the other reason why I'm sending
      you this pull early in the cycle.

   5) Other assorted correctness fixes from Eric Biggers, Jann Horn,
      Mark Brown, Dominik Brodowski, and myself"

* 'random-5.17-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: don't reset crng_init_cnt on urandom_read()
  random: avoid superfluous call to RDRAND in CRNG extraction
  random: early initialization of ChaCha constants
  random: use IS_ENABLED(CONFIG_NUMA) instead of ifdefs
  random: harmonize "crng init done" messages
  random: mix bootloader randomness into pool
  random: do not throw away excess input to crng_fast_load
  random: do not re-init if crng_reseed completes before primary init
  random: fix crash on multiple early calls to add_bootloader_randomness()
  random: do not sign extend bytes for rotation when mixing
  random: use BLAKE2s instead of SHA1 in extraction
  lib/crypto: blake2s: include as built-in
  random: fix data race on crng init time
  random: fix data race on crng_node_pool
  irq: remove unused flags argument from __handle_irq_event_percpu()
  random: remove unused irq_flags argument from add_interrupt_randomness()
  random: document add_hwgenerator_randomness() with other input functions
  MAINTAINERS: add git tree for random.c

Merge tag 'seccomp-v5.17-rc1' of git://git./linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
"The core seccomp code hasn't changed for this cycle, but the selftests
  were improved while helping to debug the recent signal handling
  refactoring work Eric did.

  Summary:

   - Improve seccomp selftests in support of signal handler refactoring
     (Kees Cook)"

* tag 'seccomp-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  selftests/seccomp: Report event mismatches more clearly
  selftests/seccomp: Stop USER_NOTIF test if kcmp() fails

Merge tag 'pstore-v5.17-rc1' of git://git./linux/kernel/git/kees/linux

Pull pstore update from Kees Cook:

- Add boot param for early ftrace recording in pstore (Uwe
Kleine-König)

* tag 'pstore-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore/ftrace: Allow immediate recording

Merge tag 'edac_updates_for_v5.17_rc1' of git://git./linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

- Add support for version 3 of the Synopsys DDR controller to
   synopsys_edac

- Add support for DRR5 and new models 0x10-0x1f and 0x50-0x5f of AMD
   family 0x19 CPUs to amd64_edac

- The usual set of fixes and cleanups

* tag 'edac_updates_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/amd64: Add support for family 19h, models 50h-5fh
  EDAC/sb_edac: Remove redundant initialization of variable rc
  RAS/CEC: Remove a repeated 'an' in a comment
  EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
  EDAC: Add RDDR5 and LRDDR5 memory types
  EDAC/sifive: Fix non-kernel-doc comment
  dt-bindings: memory: Add entry for version 3.80a
  EDAC/synopsys: Enable the driver on Intel's N5X platform
  EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR
  EDAC/synopsys: Use the quirk for version instead of ddr version

Merge tag 'ras_core_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:
"A relatively big amount of movements in RAS-land this time around:

   - First part of a series to move the AMD address translation code
     from arch/x86/ to amd64_edac as that is its only user anyway

   - Some MCE error injection improvements to the AMD side

   - Reorganization of the #MC handler code and the facilities it calls
     to make it noinstr-safe

   - Add support for new AMD MCA bank types and non-uniform banks layout

   - The usual set of cleanups and fixes"

* tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/mce: Reduce number of machine checks taken during recovery
  x86/mce/inject: Avoid out-of-bounds write when setting flags
  x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
  x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
  x86/mce: Check regs before accessing it
  x86/mce: Mark mce_start() noinstr
  x86/mce: Mark mce_timed_out() noinstr
  x86/mce: Move the tainting outside of the noinstr region
  x86/mce: Mark mce_read_aux() noinstr
  x86/mce: Mark mce_end() noinstr
  x86/mce: Mark mce_panic() noinstr
  x86/mce: Prevent severity computation from being instrumented
  x86/mce: Allow instrumentation during task work queueing
  x86/mce: Remove noinstr annotation from mce_setup()
  x86/mce: Use mce_rdmsrl() in severity checking code
  x86/mce: Remove function-local cpus variables
  x86/mce: Do not use memset to clear the banks bitmaps
  x86/mce/inject: Set the valid bit in MCA_STATUS before error injection
  x86/mce/inject: Check if a bank is populated before injecting
  x86/mce: Get rid of cpu_missing
  ...

Merge tag 'core_entry_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull thread_info flag accessor helper updates from Borislav Petkov:
"Add a set of thread_info.flags accessors which snapshot it before
  accesing it in order to prevent any potential data races, and convert
  all users to those new accessors"

* tag 'core_entry_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  powerpc: Snapshot thread flags
  powerpc: Avoid discarding flags in system_call_exception()
  openrisc: Snapshot thread flags
  microblaze: Snapshot thread flags
  arm64: Snapshot thread flags
  ARM: Snapshot thread flags
  alpha: Snapshot thread flags
  sched: Snapshot thread flags
  entry: Snapshot thread flags
  x86: Snapshot thread flags
  thread_info: Add helpers to snapshot thread flags

Merge tag 'core_core_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull notifier fix from Borislav Petkov:
"Return an error when a notifier callback has been registered already"

* tag 'core_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
notifier: Return an error when a callback has already been registered

Merge tag 'x86_vdso_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 vdso updates from Borislav Petkov:
"Remove -nostdlib compiler flag now that the vDSO uses the linker
  instead of the compiler driver to link files"

* tag 'x86_vdso_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/purgatory: Remove -nostdlib compiler flag
  x86/vdso: Remove -nostdlib compiler flag

Merge tag 'x86_build_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 build fix from Borislav Petkov:
"A fix for cross-compiling the compressed stub on arm64 with clang"

* tag 'x86_build_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/compressed: Move CLANG_FLAGS to beginning of KBUILD_CFLAGS

Merge tag 'x86_cpu_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 cpuid updates from Borislav Petkov:

- Enable the short string copies for CPUs which support them, in
   copy_user_enhanced_fast_string()

- Avoid writing MSR_CSTAR on Intel due to TDX guests raising a #VE trap

* tag 'x86_cpu_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/lib: Add fast-short-rep-movs check to copy_user_enhanced_fast_string()
  x86/cpu: Don't write CSTAR MSR on Intel CPUs

Merge tag 'x86_cleanups_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 cleanups from Borislav Petkov:
"The mandatory set of random minor cleanups all over tip"

* tag 'x86_cleanups_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/events/amd/iommu: Remove redundant assignment to variable shift
  x86/boot/string: Add missing function prototypes
  x86/fpu: Remove duplicate copy_fpstate_to_sigframe() prototype
  x86/uaccess: Move variable into switch case statement

Merge tag 'x86_misc_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull misc x86 updates from Borislav Petkov:
"The pile which we cannot find the proper topic for so we stick it in
  x86/misc:

   - Add support for decoding instructions which do MMIO accesses in
     order to use it in SEV and TDX guests

   - An include fix and reorg to allow for removing set_fs in UML later"

* tag 'x86_misc_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mtrr: Remove the mtrr_bp_init() stub
  x86/sev-es: Use insn_decode_mmio() for MMIO implementation
  x86/insn-eval: Introduce insn_decode_mmio()
  x86/insn-eval: Introduce insn_get_modrm_reg_ptr()
  x86/insn-eval: Handle insn_get_opcode() failure

Merge tag 'x86_mm_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 mm updates from Borislav Petkov:

- Flush *all* mappings from the TLB after switching to the trampoline
   pagetable to prevent any stale entries' presence

- Flush global mappings from the TLB, in addition to the CR3-write,
   after switching off of the trampoline_pgd during boot to clear the
   identity mappings

- Prevent instrumentation issues resulting from the above changes

* tag 'x86_mm_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Prevent early boot triple-faults with instrumentation
  x86/mm: Include spinlock_t definition in pgtable.
  x86/mm: Flush global TLB when switching to trampoline page-table
  x86/mm/64: Flush global TLB on boot and AP bringup
  x86/realmode: Add comment for Global bit usage in trampoline_pgd
  x86/mm: Add missing <asm/cpufeatures.h> dependency to <asm/page_64.h>

Merge tag 'x86_sgx_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 SGX updates from Borislav Petkov:

- Add support for handling hw errors in SGX pages: poisoning,
   recovering from poison memory and error injection into SGX pages

- A bunch of changes to the SGX selftests to simplify and allow of SGX
   features testing without the need of a whole SGX software stack

- Add a sysfs attribute which is supposed to show the amount of SGX
   memory in a NUMA node, similar to what /proc/meminfo is to normal
   memory

- The usual bunch of fixes and cleanups too

* tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/sgx: Fix NULL pointer dereference on non-SGX systems
  selftests/sgx: Fix corrupted cpuid macro invocation
  x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node
  x86/sgx: Fix minor documentation issues
  selftests/sgx: Add test for multiple TCS entry
  selftests/sgx: Enable multiple thread support
  selftests/sgx: Add page permission and exception test
  selftests/sgx: Rename test properties in preparation for more enclave tests
  selftests/sgx: Provide per-op parameter structs for the test enclave
  selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed
  selftests/sgx: Move setup_test_encl() to each TEST_F()
  selftests/sgx: Encpsulate the test enclave creation
  selftests/sgx: Dump segments and /proc/self/maps only on failure
  selftests/sgx: Create a heap for the test enclave
  selftests/sgx: Make data measurement for an enclave segment optional
  selftests/sgx: Assign source for each segment
  selftests/sgx: Fix a benign linker warning
  x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
  x86/sgx: Add hook to error injection address validation
  x86/sgx: Hook arch_memory_failure() into mainline code
  ...

Merge tag 'x86_cache_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 resource control fixlet from Borislav Petkov:
"A minor code cleanup removing a redundant assignment"

* tag 'x86_cache_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Remove redundant assignment to variable chunks

Merge tag 'x86_sev_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 SEV updates from Borislav Petkov:
"The accumulated pile of x86/sev generalizations and cleanups:

   - Share the SEV string unrolling logic with TDX as TDX guests need it
     too

   - Cleanups and generalzation of code shared by SEV and TDX"

* tag 'x86_sev_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sev: Move common memory encryption code to mem_encrypt.c
  x86/sev: Rename mem_encrypt.c to mem_encrypt_amd.c
  x86/sev: Use CC_ATTR attribute to generalize string I/O unroll
  x86/sev: Remove do_early_exception() forward declarations
  x86/head64: Carve out the guest encryption postprocessing into a helper
  x86/sev: Get rid of excessive use of defines
  x86/sev: Shorten GHCB terminate macro names

Merge tag 'x86_platform_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 platform fix from Borislav Petkov:
"A single DT compatibility fix for the Intel media processor CE4100
driver"

* tag 'x86_platform_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ce4100: Replace "ti,pcf8575" by "nxp,pcf8575"

Merge tag 'x86_paravirt_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 paravirtualization fix from Borislav Petkov:
"Define the INTERRUPT_RETURN macro only when CONFIG_XEN_PV is enabled
as it is its only user"

* tag 'x86_paravirt_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/paravirt: Fix build PARAVIRT_XXL=y without XEN_PV

Merge tag 'x86_fpu_for_v5.17_rc1' of git://git./linux/kernel/git/tip/tip

Pull x86 fpu update from Borislav Petkov:
"A single x86/fpu update for 5.17:

   - Exclude AVX opmask registers use from AVX512 state tracking as they
     don't contribute to frequency throttling"

* tag 'x86_fpu_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Correct AVX512 state tracking

Merge tag 'm68k-for-v5.17-tag1' of git://git./linux/kernel/git/geert/linux-m68k

Pull m68k updates from Geert Uytterhoeven:

- enable memtest functionality

- defconfig updates

* tag 'm68k-for-v5.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: defconfig: Update defconfigs for v5.16-rc1
m68k: Enable memtest functionality

Merge tag 's390-5.17-1' of git://git./linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:
"Besides all the small improvements and cleanups the most notable part
  is the fast vector/SIMD implementation of the ChaCha20 stream cipher,
  which is an adaptation of Andy Polyakov's code for the kernel.

  Summary:

   - add fast vector/SIMD implementation of the ChaCha20 stream cipher,
     which mainly adapts Andy Polyakov's code for the kernel

   - add status attribute to AP queue device so users can easily figure
     out its status

   - fix race in page table release code, and and lots of documentation

   - remove uevent suppress from cio device driver, since it turned out
     that it generated more problems than it solved problems

   - quite a lot of virtual vs physical address confusion fixes

   - various other small improvements and cleanups all over the place"

* tag 's390-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits)
  s390/dasd: use default_groups in kobj_type
  s390/sclp_sd: use default_groups in kobj_type
  s390/pci: simplify __pciwb_mio() inline asm
  s390: remove unused TASK_SIZE_OF
  s390/crash_dump: fix virtual vs physical address handling
  s390/crypto: fix compile error for ChaCha20 module
  s390/mm: check 2KB-fragment page on release
  s390/mm: better annotate 2KB pagetable fragments handling
  s390/mm: fix 2KB pgtable release race
  s390/sclp: release SCLP early buffer after kernel initialization
  s390/nmi: disable interrupts on extended save area update
  s390/zcrypt: CCA control CPRB sending
  s390/disassembler: update opcode table
  s390/uv: fix memblock virtual vs physical address confusion
  s390/smp: fix memblock_phys_free() vs memblock_free() confusion
  s390/sclp: fix memblock_phys_free() vs memblock_free() confusion
  s390/exit: remove dead reference to do_exit from copy_thread
  s390/ap: add missing virt_to_phys address conversion
  s390/pgalloc: use pointers instead of unsigned long values
  s390/pgalloc: add virt/phys address handling to base asce functions
  ...

Merge tag 'arm64-upstream' of git://git./linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

- KCSAN enabled for arm64.

- Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD.

- Some more SVE clean-ups and refactoring in preparation for SME
   support (scalable matrix extensions).

- BTI clean-ups (SYM_FUNC macros etc.)

- arm64 atomics clean-up and codegen improvements.

- HWCAPs for FEAT_AFP (alternate floating point behaviour) and
   FEAT_RPRESS (increased precision of reciprocal estimate and
   reciprocal square root estimate).

- Use SHA3 instructions to speed-up XOR.

- arm64 unwind code refactoring/unification.

- Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP ==
   1 (potentially set by a hypervisor; user-space already does this).

- Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU,
   Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups.

- Other fixes and clean-ups; highlights: fix the handling of erratum
   1418040, correct the calculation of the nomap region boundaries,
   introduce io_stop_wc() mapped to the new DGH instruction (data
   gathering hint).

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
  arm64: Use correct method to calculate nomap region boundaries
  arm64: Drop outdated links in comments
  arm64: perf: Don't register user access sysctl handler multiple times
  drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
  perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
  arm64: errata: Fix exec handling in erratum 1418040 workaround
  arm64: Unhash early pointer print plus improve comment
  asm-generic: introduce io_stop_wc() and add implementation for ARM64
  arm64: Ensure that the 'bti' macro is defined where linkage.h is included
  arm64: remove __dma_*_area() aliases
  docs/arm64: delete a space from tagged-address-abi
  arm64: Enable KCSAN
  kselftest/arm64: Add pidbench for floating point syscall cases
  arm64/fp: Add comments documenting the usage of state restore functions
  kselftest/arm64: Add a test program to exercise the syscall ABI
  kselftest/arm64: Allow signal tests to trigger from a function
  kselftest/arm64: Parameterise ptrace vector length information
  arm64/sve: Minor clarification of ABI documentation
  arm64/sve: Generalise vector length configuration prctl() for SME
  arm64/sve: Make sysctl interface for SVE reusable by SME
  ...

Merge tag 'asm-generic-5.17' of git://git./linux/kernel/git/arnd/asm-generic

Pull asm-generic cleanups from Arnd Bergmann:
"A few minor cleanups for cross-architecture code: Alexandre Ghiti
  deals with removing some leftovers from drivers and features that have
  been removed, and Wasin Thonkaew has a cosmetic change"

* tag 'asm-generic-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic/error-injection.h: fix a spelling mistake, and a coding style issue
  arch: Remove leftovers from prism54 wireless driver
  arch: Remove leftovers from mandatory file locking
  Documentation, arch: Remove leftovers from CIFS_WEAK_PW_HASH
  Documentation, arch: Remove leftovers from raw device

Merge tag 'newsoc-5.17' of git://git./linux/kernel/git/soc/soc

Pull RISC-V SoC updates from Arnd Bergmann:
"Add support for StarFive JH7100 RISC-V SoC

  This adds support for the StarFive JH7100, including the necessary
  device drivers and DT files for the BeagleV Starlight prototype board,
  with additional boards to be added later. This SoC promises to be the
  first usable low-cost platform for RISC-V.

  I've taken this through the SoC tree in the anticipation of adding a
  few other Arm based SoCs as well, but those did not pass the review in
  time, so it's only this one"

* tag 'newsoc-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  reset: starfive-jh7100: Fix 32bit compilation
  RISC-V: Add BeagleV Starlight Beta device tree
  RISC-V: Add initial StarFive JH7100 device tree
  serial: 8250_dw: Add StarFive JH7100 quirk
  dt-bindings: serial: snps-dw-apb-uart: Add JH7100 uarts
  pinctrl: starfive: Add pinctrl driver for StarFive SoCs
  dt-bindings: pinctrl: Add StarFive JH7100 bindings
  dt-bindings: pinctrl: Add StarFive pinctrl definitions
  reset: starfive-jh7100: Add StarFive JH7100 reset driver
  dt-bindings: reset: Add Starfive JH7100 reset bindings
  dt-bindings: reset: Add StarFive JH7100 reset definitions
  clk: starfive: Add JH7100 clock generator driver
  dt-bindings: clock: starfive: Add JH7100 bindings
  dt-bindings: clock: starfive: Add JH7100 clock definitions
  dt-bindings: interrupt-controller: Add StarFive JH7100 plic
  dt-bindings: timer: Add StarFive JH7100 clint
  RISC-V: Add StarFive SoC Kconfig option

Merge tag 'dt-5.17' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC devicetree updates from Arnd Bergmann:
"As usual, this is the bulk of the updates for the SoC tree, adding
  more devices to existing files, addressing issues from ever improving
  automated checking, and fixing minor issues.

  The most interesting bits as usual are the new platforms. All the
  newly supported SoCs belong into existing families this time:

   - Qualcomm gets support for two newly announced platforms, both of
     which can now work in production environments: the SDX65 5G modem
     that can run a minimal Linux on its Cortex-A7 core, and the
     Snapdragon 8 Gen 1, their latest high-end phone SoC.

   - Renesas adds support for R-Car S4-8, the most recent automotive
     Server/Communication SoC.

   - TI adds support for J721s2, a new automotive SoC in the K3 family.

   - Mediatek MT7986a/b is a SoC used in Wifi routers, the latest
     generation following their popular MT76xx series. Only basic
     support is added for now.

   - NXP i.MX8 ULP8 is a new low-power variant of the widespread i.MX8
     series.

   - TI SPEAr320s is a minor variant of the old SPEAr320 SoC that we
     have supported for a long time.

  New boards with the existing SoCs include

   - Aspeed AST2500/AST2600 BMCs in TYAN, Facebook and Yadro servers

   - AT91/SAMA5 based evaluation board

   - NXP gains twenty new development and industrial boards for their
     i.MX and Layerscape SoCs

   - Intel IXP4xx now supports the final two machines in device tree
     that were previously only supported in old style board files.

   - Mediatek MT6589 is used in the Fairphone FP1 phone from 2013, while
     MT8183 is used in the Acer Chromebook 314.

   - Qualcomm gains support for the reference machines using the two new
     SoCs, plus a number of Chromebook variants and phones based on the
     Snapdragon 7c, 845 and 888 SoCs, including various Sony Xperia
     devices and the Microsoft Surface Duo 2.

   - ST STM32 now supports the Engicam i.Core STM32MP1 carrier board.

   - Tegra now boots various older Android devices based on 32-bit chips
     out of the box, including a number of ASUS Transformer tablets.

     There is also a new Jetson AGX Orin developer kit.

   - Apple support adds the missing device trees for all the remaining
     M1 Macbook and iMac variants, though not yet the M1 Pro/Max
     versions.

   - Allwinner now supports another version of the Tanix TX6 set-top box
     based on the H6 SoC.

   - Broadcom gains support for the Netgear RAXE500 Wireless router
     based on BCM4908"

* tag 'dt-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (574 commits)
  Revert "ARM: dts: BCM5301X: define RTL8365MB switch on Asus RT-AC88U"
  arm64: dts: qcom: sm6125: Avoid using missing SM6125_VDDCX
  arm64: dts: qcom: sm8450-qrd: Enable USB nodes
  arm64: dts: qcom: sm8450: Add usb nodes
  ARM: dts: aspeed: add LCLK setting into LPC KCS nodes
  dt-bindings: ipmi: bt-bmc: add 'clocks' as a required property
  ARM: dts: aspeed: add LCLK setting into LPC IBT node
  ARM: dts: aspeed: p10: Add TPM device
  ARM: dts: aspeed: p10: Enable USB host ports
  ARM: dts: aspeed: Add TYAN S8036 BMC machine
  ARM: dts: aspeed: tyan-s7106: Add uart_routing and fix vuart config
  ARM: dts: aspeed: Adding Facebook Bletchley BMC
  ARM: dts: aspeed: g220a: Enable secondary flash
  ARM: dts: Add openbmc-flash-layout-64-alt.dtsi
  ARM: dts: aspeed: Add secure boot controller node
  dt-bindings: aspeed: Add Secure Boot Controller bindings
  ARM: dts: Remove "spidev" nodes
  dt-bindings: pinctrl: samsung: Add pin drive definitions for Exynos850
  dt-bindings: arm: samsung: Document E850-96 board binding
  dt-bindings: Add vendor prefix for WinLink
  ...

Merge tag 'drivers-5.17' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC driver updates from Arnd Bergmann:
"There are cleanups and minor bugfixes across several SoC specific
  drivers, for Qualcomm, Samsung, NXP i.MX, AT91, Tegra, Keystone,
  Renesas, ZynqMP

  Noteworthy new features are:

   - The op-tee firmware driver gains support for asynchronous
     notifications from secure-world firmware.

   - Qualcomm platforms gain support for new SoC types in various
     drivers: power domain, cache controller, RPM sleep, soc-info

   - Samsung SoC drivers gain support for new SoCs in ChipID and PMU, as
     well as a new USIv2 driver that handles various types of serial
     communiction (uart, i2c, spi)

   - Renesas adds support for R-Car S4-8 (R8A779F0) in multiple drivers,
     as well as memory controller support for RZ/G2L (R9A07G044).

   - Apple M1 gains support for the PMGR power management driver"

* tag 'drivers-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (94 commits)
  soc: qcom: rpmh-rsc: Fix typo in a comment
  soc: qcom: socinfo: Add SM6350 and SM7225
  dt-bindings: arm: msm: Don't mark LLCC interrupt as required
  dt-bindings: firmware: scm: Add SM6350 compatible
  dt-bindings: arm: msm: Add LLCC for SM6350
  soc: qcom: rpmhpd: Sort power-domain definitions and lists
  soc: qcom: rpmhpd: Remove mx/cx relationship on sc7280
  soc: qcom: rpmhpd: Rename rpmhpd struct names
  soc: qcom: rpmhpd: sm8450: Add the missing .peer for sm8450_cx_ao
  soc: qcom: socinfo: add SM8450 ID
  soc: qcom: rpmhpd: Add SM8450 power domains
  dt-bindings: power: rpmpd: Add SM8450 to rpmpd binding
  soc: qcom: smem: Update max processor count
  dt-bindings: arm: qcom: Document SM8450 SoC and boards
  dt-bindings: firmware: scm: Add SM8450 compatible
  dt-bindings: arm: cpus: Add kryo780 compatible
  soc: qcom: rpmpd: Add support for sm6125
  dt-bindings: qcom-rpmpd: Add sm6125 power domains
  soc: qcom: aoss: constify static struct thermal_cooling_device_ops
  PM: AVS: qcom-cpr: Use div64_ul instead of do_div
  ...

Merge tag 'defconfig-5.17' of git://git./linux/kernel/git/soc/soc

Pull ARM defconfig updates from Arnd Bergmann:
"These are the usual changes to enable newly added driver by default,
  and to do some housekeeping around changing Kconfig symbols"

* tag 'defconfig-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: defconfig: Enable Samsung I2C driver
  ARM: configs: at91: Enable crypto software implementations
  ARM: configs: at91: sama7: Enable SPI NOR and QSPI controller
  ARM: config: multi v7: Enable NVIDIA Tegra20 APB DMA driver
  ARM: config: multi v7: Enable NVIDIA Tegra20 S/PDIF driver
  ARM: tegra_defconfig: Enable S/PDIF driver
  ARM: imx_v6_v7_defconfig: Enable for DHCOM devices required RTC_DRV_RV3029C2
  ARM: config: multi v7: Enable display drivers used by Tegra devices
  ARM: tegra_defconfig: Enable drivers wanted by Acer Chromebooks and ASUS tablets
  ARM: configs: gemini: Activate crypto driver
  arm64: defconfig: enable drivers for booting i.MX8ULP
  arm64: defconfig: Enable R-Car S4-8
  arm64: defconfig: enable drivers for TQ TQMa8MxML-MBa8Mx
  arm64: defconfig: Enable OV5640
  arm64: defconfig: Enable VIDEO_IMX_MEDIA

Merge tag 'soc-5.17' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC updates from Arnd Bergmann:
"These are all minor bug fixes and cleanups to code in arch/arm and
  arch/arm64 that is specific to one SoC, updating Kconfig symbols, the
  MAINTAINERS file, and removing some dead code"

* tag 'soc-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: exynos: Enable Exynos Multi-Core Timer driver
  ARM: ixp4xx: remove unused header file pata_ixp4xx_cf.h
  ARM: ixp4xx: remove dead configs CPU_IXP43X and CPU_IXP46X
  MAINTAINERS: Add Florian as BCM5301X and BCM53573 maintainer
  ARM: samsung: Remove HAVE_S3C2410_I2C and use direct dependencies
  ARM: imx: rename DEBUG_IMX21_IMX27_UART to DEBUG_IMX27_UART
  ARM: imx: remove dead left-over from i.MX{27,31,35} removal
  ARM: s3c: add one more "fallthrough" statement in Jive
  ARM: s3c: include header for prototype of s3c2410_modify_misccr
  ARM: shmobile: rcar-gen2: Add missing of_node_put()

Merge branches 'edac-misc' and 'edac-amd64' into edac-updates-for-v5.17

Signed-off-by: Borislav Petkov <bp@suse.de>

Linux 5.16

Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fix from Dmitry Torokhov:
"A small fixup to the Zinitix touchscreen driver to avoid enabling the
IRQ line before we successfully requested it"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: zinitix - make sure the IRQ is allocated before it gets enabled

Merge tag 'soc-fixes-5.16-5' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fix from Olof Johansson:
"One more fix for 5.16

  I had missed one patch when I sent up what I thought was the last
  batch of fixes for this release. This one fixes issues on the
  Raspberry Pi platforms due to gpio init changes this release, so
  hopefully we can get it merged before final release is cut"

* tag 'soc-fixes-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: dts: gpio-ranges property is now required

Merge tag 'perf-tools-fixes-for-v5.16-2022-01-09' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

- Revert "libtraceevent: Increase libtraceevent logging when verbose",
   breaks the build with libtraceevent-1.3.0, i.e. when building with
   'LIBTRACEEVENT_DYNAMIC=1'.

- Avoid early exit in 'perf trace' due to running SIGCHLD handler
   before it makes sense to. It can happen when using a BPF source code
   event that have to be first built into an object file.

* tag 'perf-tools-fixes-for-v5.16-2022-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  Revert "libtraceevent: Increase libtraceevent logging when verbose"
  perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to

Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"

This reverts commit f7d6779df642720e22bffd449e683bb8690bd3bf.

This bisected regression has impacted suspend-resume stability
since 5.15-rc1. It regressed -stable via 5.14.10.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215315
Fixes: f7d6779df64 ("drm/amdgpu: stop scheduler when calling hw_fini (v2)")
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: <stable@vger.kernel.org> # 5.14+
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Input: zinitix - make sure the IRQ is allocated before it gets enabled

Since irq request is the last thing in the driver probe, it happens
later than the input device registration. This means that there is a
small time window where if the open method is called the driver will
attempt to enable not yet available irq.

Fix that by moving the irq request before the input device registration.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Fixes: 26822652c85e ("Input: add zinitix touchscreen driver")
Signed-off-by: Nikita Travkin <nikita@trvn.ru>
Link: https://lore.kernel.org/r/20220106072840.36851-2-nikita@trvn.ru
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

ARM: dts: gpio-ranges property is now required

Since [1], added in 5.7, the absence of a gpio-ranges property has
prevented GPIOs from being restored to inputs when released.
Add those properties for BCM283x and BCM2711 devices.

[1] commit 2ab73c6d8323 ("gpio: Support GPIO controllers without
pin-ranges")

Link: https://lore.kernel.org/r/20220104170247.956760-1-linus.walleij@linaro.org
Fixes: 2ab73c6d8323 ("gpio: Support GPIO controllers without pin-ranges")
Fixes: 266423e60ea1 ("pinctrl: bcm2835: Change init order for gpio hogs")
Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20211206092237.4105895-3-phil@raspberrypi.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'soc-fixes-5.16-4' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Olof Johansson:
"A few more fixes have come in, nothing overly severe but would be good
  to get in by final release:

   - More specific compatible fields on the qspi controller for socfpga,
     to enable quirks in the driver

   - A runtime PM fix for Renesas to fix mismatched reference counts on
     errors"

* tag 'soc-fixes-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: dts: socfpga: change qspi to "intel,socfpga-qspi"
  dt-bindings: spi: cadence-quadspi: document "intel,socfpga-qspi"
  reset: renesas: Fix Runtime PM usage

Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"Fix the regression with AMD GPU suspend by reverting the
  handling of bus regulators in the I2C core.

  Also, there is a fix for the MPC driver to prevent an
  out-of-bound-access"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Revert "i2c: core: support bus regulator controlling in adapter"
  i2c: mpc: Avoid out of bounds memory access

Merge tag 'for-v5.16-rc' of git://git./linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:
"Three fixes for the 5.16 cycle:

   - Avoid going beyond last capacity in the power-supply core

   - Replace 1E6L with NSEC_PER_MSEC to avoid floating point calculation
     in LLVM resulting in a build failure

   - Fix ADC measurements in bq25890 charger driver"

* tag 'for-v5.16-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: reset: ltc2952: Fix use of floating point literals
  power: bq25890: Enable continuous conversion for ADC at charging
  power: supply: core: Break capacity loop

Merge tag 'xfs-5.16-fixes-4' of git://git./fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:

- Make the old ALLOCSP ioctl behave in a consistent manner with newer
syscalls like fallocate.

* tag 'xfs-5.16-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate

s390/dasd: use default_groups in kobj_type

There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field. Move the s390 dasd sysfs code to use default_groups field
which has been the preferred way since commit aa30f47cf666 ("kobject:
Add support for default attribute groups to kobj_type") so that we can
soon get rid of the obsolete default_attrs field.

Cc: Stefan Haberland <sth@linux.ibm.com>
Cc: Jan Hoeppner <hoeppner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220106095401.3274637-1-gregkh@linuxfoundation.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/sclp_sd: use default_groups in kobj_type

There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field. Move the sclp_sd sysfs code to use default_groups field which
has been the preferred way since commit aa30f47cf666 ("kobject: Add
support for default attribute groups to kobj_type") so that we can
soon get rid of the obsolete default_attrs field.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220106095252.3273905-1-gregkh@linuxfoundation.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'for-5.16-fixes' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
"This contains the cgroup.procs permission check fixes so that they use
  the credentials at the time of open rather than write, which also
  fixes the cgroup namespace lifetime bug"

* 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  selftests: cgroup: Test open-time cgroup namespace usage for migration checks
  selftests: cgroup: Test open-time credential usage for migration checks
  selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
  cgroup: Use open-time cgroup namespace for process migration perm checks
  cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
  cgroup: Use open-time credentials for process migraton perm checks

Merge tag 'block-5.16-2022-01-07' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
"Just the md bitmap regression this time"

* tag 'block-5.16-2022-01-07' of git://git.kernel.dk/linux-block:
md/raid1: fix missing bitmap update w/o WriteMostly devices

Merge tag 'edac_urgent_for_v5.16' of git://git./linux/kernel/git/ras/ras

Pull EDAC fix from Tony Luck:
"Fix 10nm EDAC driver to release and unmap resources on systems without
HBM"

* tag 'edac_urgent_for_v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/i10nm: Release mdev/mbase when failing to detect HBM

Revert "i2c: core: support bus regulator controlling in adapter"

This largely reverts commit 5a7b95fb993ec399c8a685552aa6a8fc995c40bd. It
breaks suspend with AMD GPUs, and we couldn't incrementally fix it. So,
let's remove the code and go back to the drawing board. We keep the
header extension to not break drivers already populating the regulator.
We expect to re-add the code handling it soon.

Fixes: 5a7b95fb993e ("i2c: core: support bus regulator controlling in adapter")
Reported-by: "Tareque Md.Hanif" <tarequemd.hanif@yahoo.com>
Link: https://lore.kernel.org/r/1295184560.182511.1639075777725@mail.yahoo.com
Reported-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Link: https://lore.kernel.org/r/7143a7147978f4104171072d9f5225d2ce355ec1.camel@yandex.ru
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1850
Tested-by: "Tareque Md.Hanif" <tarequemd.hanif@yahoo.com>
Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Cc: <stable@vger.kernel.org> # 5.14+

Revert "libtraceevent: Increase libtraceevent logging when verbose"

This reverts commit 08efcb4a638d260ef7fcbae64ecf7ceceb3f1841.

This breaks the build as it will prefer using libbpf-devel header files,
even when not using LIBBPF_DYNAMIC=1, breaking the build.

This was detected on OpenSuSE Tumbleweed with libtraceevent-devel 1.3.0,
as described by Jiri Slaby:

=======================================================================
It breaks build with LIBTRACEEVENT_DYNAMIC and version 1.3.0:
> util/debug.c: In function ‘perf_debug_option’:
> util/debug.c:243:17: error: implicit declaration of function
‘tep_set_loglevel’ [-Werror=implicit-function-declaration]
>   243 |                 tep_set_loglevel(TEP_LOG_INFO);
>       |                 ^~~~~~~~~~~~~~~~
> util/debug.c:243:34: error: ‘TEP_LOG_INFO’ undeclared (first use in this
function); did you mean ‘TEP_PRINT_INFO’?
>   243 |                 tep_set_loglevel(TEP_LOG_INFO);
>       |                                  ^~~~~~~~~~~~
>       |                                  TEP_PRINT_INFO
> util/debug.c:243:34: note: each undeclared identifier is reported only once
for each function it appears in
> util/debug.c:245:34: error: ‘TEP_LOG_DEBUG’ undeclared (first use in this
function)
>   245 |                 tep_set_loglevel(TEP_LOG_DEBUG);
>       |                                  ^~~~~~~~~~~~~
> util/debug.c:247:34: error: ‘TEP_LOG_ALL’ undeclared (first use in this
function)
>   247 |                 tep_set_loglevel(TEP_LOG_ALL);
>       |                                  ^~~~~~~~~~~

It is because the gcc's command line looks like:
gcc
...
-I/home/abuild/rpmbuild/BUILD/tools/lib/
...
-DLIBTRACEEVENT_VERSION=65790
...
=======================================================================

The proper way to fix this is more involved and so not suitable for this
late in the 5.16-rc stage.

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/lkml/bc2b0786-8965-1bcd-2316-9d9bb37b9c31@kernel.org
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/YddGjjmlMZzxUZbN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to

When running 'perf trace' with an BPF object like:

# perf trace -e openat,tools/perf/examples/bpf/hello.c

the event parsing eventually calls llvm__get_kbuild_opts() that runs a
script and that ends up with SIGCHLD delivered to the 'perf trace'
handler, which assumes the workload process is done and quits 'perf
trace'.

Move the SIGCHLD handler setup directly to trace__run(), where the event
is parsed and the object is already compiled.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Christy Lee <christyc.y.lee@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220106222030.227499-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"Two small fixes for x86:

   - lockdep WARN due to missing lock nesting annotation

   - NULL pointer dereference when accessing debugfs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Check for rmaps allocation
  KVM: SEV: Mark nested locking of kvm->lock

Merge tag 'drm-fixes-2022-01-07' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"There is only the amdgpu runtime pm regression fix in here:

  amdgpu:

   - suspend/resume fix

   - fix runtime PM regression"

* tag 'drm-fixes-2022-01-07' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: disable runpm if we are the primary adapter
  fbdev: fbmem: add a helper to determine if an aperture is used by a fw fb
  drm/amd/pm: keep the BACO feature enabled for suspend

KVM: x86: Check for rmaps allocation

With TDP MMU being the default now, access to mmu_rmaps_stat debugfs
file causes following oops:

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 3185 Comm: cat Not tainted 5.16.0-rc4+ #204
RIP: 0010:pte_list_count+0x6/0x40
Call Trace:
  <TASK>
  ? kvm_mmu_rmaps_stat_show+0x15e/0x320
  seq_read_iter+0x126/0x4b0
  ? aa_file_perm+0x124/0x490
  seq_read+0xf5/0x140
  full_proxy_read+0x5c/0x80
  vfs_read+0x9f/0x1a0
  ksys_read+0x67/0xe0
  __x64_sys_read+0x19/0x20
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fca6fc13912

Return early when rmaps are not present.

Reported-by: Vasant Hegde <vasant.hegde@amd.com>
Tested-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220105040337.4234-1-nikunj@amd.com>
Cc: stable@vger.kernel.org
Fixes: 3bcd0662d66f ("KVM: X86: Introduce mmu_rmaps_stat per-vm debugfs file")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SEV: Mark nested locking of kvm->lock

Both source and dest vms' kvm->locks are held in sev_lock_two_vms.
Mark one with a different subtype to avoid false positives from lockdep.

Fixes: c9d61dcb0bc26 (KVM: SEV: accept signals in sev_lock_two_vms)
Reported-by: Yiru Xu <xyru1999@gmail.com>
Tested-by: Jinrong Liang <cloudliang@tencent.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1641364863-26331-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86/sgx: Fix NULL pointer dereference on non-SGX systems

== Problem ==

Nathan Chancellor reported an oops when aceessing the
'sgx_total_bytes' sysfs file:

https://lore.kernel.org/all/YbzhBrimHGGpddDM@archlinux-ax161/

The sysfs output code accesses the sgx_numa_nodes[] array
unconditionally.  However, this array is allocated during SGX
initialization, which only occurs on systems where SGX is
supported.

If the sysfs file is accessed on systems without SGX support,
sgx_numa_nodes[] is NULL and an oops occurs.

== Solution ==

To fix this, hide the entire nodeX/x86/ attribute group on
systems without SGX support using the ->is_visible attribute
group callback.

Unfortunately, SGX is initialized via a device_initcall() which
occurs _after_ the ->is_visible() callback.  Instead of moving
SGX initialization earlier, call sysfs_update_group() during
SGX initialization to update the group visiblility.

This update requires moving the SGX sysfs code earlier in
sgx/main.c.  There are no code changes other than the addition of
arch_update_sysfs_visibility() and a minor whitespace fixup to
arch_node_attr_is_visible() which checkpatch caught.

CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-sgx@vger.kernel.org
Cc: x86@kernel.org
Fixes: 50468e431335 ("x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/20220104171527.5E8416A8@davehans-spike.ostc.intel.com

s390/pci: simplify __pciwb_mio() inline asm

The PCI Write Barrier instruction ignores the registers encoded in it.
There is thus no need to explicitly set the register to zero or to
associate it with a variable at all. In the resulting binary this removes
an unnecessary lghi and it makes the code simpler.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'for-5.17/struct-slab' into for-linus

Series "Separate struct slab from struct page" v4

This is originally an offshoot of the folio work by Matthew. One of the more
complex parts of the struct page definition are the parts used by the slab
allocators. It would be good for the MM in general if struct slab were its own
data type, and it also helps to prevent tail pages from slipping in anywhere.
As Matthew requested in his proof of concept series, I have taken over the
development of this series, so it's a mix of patches from him (often modified
by me) and my own.

One big difference is the use of coccinelle to perform the relatively trivial
parts of the conversions automatically and at once, instead of a larger number
of smaller incremental reviewable steps. Thanks to Julia Lawall and Luis
Chamberlain for all their help!

Another notable difference is (based also on review feedback) I don't represent
with a struct slab the large kmalloc allocations which are not really a slab,
but use page allocator directly. When going from an object address to a struct
slab, the code tests first folio slab flag, and only if it's set it converts to
struct slab. This makes the struct slab type stronger.

Finally, although Matthew's version didn't use any of the folio work, the
initial support has been merged meanwhile so my version builds on top of it
where appropriate. This eliminates some of the redundant compound_head()
being performed e.g. when testing the slab flag.

To sum up, after this series, struct page fields used by slab allocators are
moved from struct page to a new struct slab, that uses the same physical
storage. The availability of the fields is further distinguished by the
selected slab allocator implementation. The advantages include:

- Similar to folios, if the slab is of order > 0, struct slab always is
  guaranteed to be the head page. Additionally it's guaranteed to be an actual
  slab page, not a large kmalloc. This removes uncertainty and potential for
  bugs.
- It's not possible to accidentally use fields of the slab implementation that's
  not configured.
- Other subsystems cannot use slab's fields in struct page anymore (some
  existing non-slab usages had to be adjusted in this series), so slab
  implementations have more freedom in rearranging them in the struct slab.

Link: https://lore.kernel.org/all/20220104001046.12263-1-vbabka@suse.cz/

Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
"Last pull for 5.16, the reversion has been known for a while now but
  didn't get a proper fix in time. Looks like we will have several
  info-leak bugs to take care of going foward.

   - Revert the patch fixing the DM related crash causing a widespread
     regression for kernel ULPs. A proper fix just didn't appear this
     cycle due to the holidays

   - Missing NULL check on alloc in uverbs

   - Double free in rxe error paths

   - Fix a new kernel-infoleak report when forming ah_attr's without
     GRH's in ucma"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Don't infoleak GRH fields
  RDMA/uverbs: Check for null return of kmalloc_array
  Revert "RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow"
  RDMA/rxe: Prevent double freeing rxe_map_set()

random: don't reset crng_init_cnt on urandom_read()

At the moment, urandom_read() (used for /dev/urandom) resets crng_init_cnt
to zero when it is called at crng_init<2. This is inconsistent: We do it
for /dev/urandom reads, but not for the equivalent
getrandom(GRND_INSECURE).

(And worse, as Jason pointed out, we're only doing this as long as
maxwarn>0.)

crng_init_cnt is only read in crng_fast_load(); it is relevant at
crng_init==0 for determining when to switch to crng_init==1 (and where in
the RNG state array to write).

As far as I understand:

- crng_init==0 means "we have nothing, we might just be returning the same
   exact numbers on every boot on every machine, we don't even have
   non-cryptographic randomness; we should shove every bit of entropy we
   can get into the RNG immediately"
- crng_init==1 means "well we have something, it might not be
   cryptographic, but at least we're not gonna return the same data every
   time or whatever, it's probably good enough for TCP and ASLR and stuff;
   we now have time to build up actual cryptographic entropy in the input
   pool"
- crng_init==2 means "this is supposed to be cryptographically secure now,
   but we'll keep adding more entropy just to be sure".

The current code means that if someone is pulling data from /dev/urandom
fast enough at crng_init==0, we'll keep resetting crng_init_cnt, and we'll
never make forward progress to crng_init==1. It seems to be intended to
prevent an attacker from bruteforcing the contents of small individual RNG
inputs on the way from crng_init==0 to crng_init==1, but that's misguided;
crng_init==1 isn't supposed to provide proper cryptographic security
anyway, RNG users who care about getting secure RNG output have to wait
until crng_init==2.

This code was inconsistent, and it probably made things worse - just get
rid of it.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: avoid superfluous call to RDRAND in CRNG extraction

RDRAND is not fast. RDRAND is actually quite slow. We've known this for
a while, which is why functions like get_random_u{32,64} were converted
to use batching of our ChaCha-based CRNG instead.

Yet CRNG extraction still includes a call to RDRAND, in the hot path of
every call to get_random_bytes(), /dev/urandom, and getrandom(2).

This call to RDRAND here seems quite superfluous. CRNG is already
extracting things based on a 256-bit key, based on good entropy, which
is then reseeded periodically, updated, backtrack-mutated, and so
forth. The CRNG extraction construction is something that we're already
relying on to be secure and solid. If it's not, that's a serious
problem, and it's unlikely that mixing in a measly 32 bits from RDRAND
is going to alleviate things.

And in the case where the CRNG doesn't have enough entropy yet, we're
already initializing the ChaCha key row with RDRAND in
crng_init_try_arch_early().

Removing the call to RDRAND improves performance on an i7-11850H by
370%. In other words, the vast majority of the work done by
extract_crng() prior to this commit was devoted to fetching 32 bits of
RDRAND.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: early initialization of ChaCha constants

Previously, the ChaCha constants for the primary pool were only
initialized in crng_initialize_primary(), called by rand_initialize().
However, some randomness is actually extracted from the primary pool
beforehand, e.g. by kmem_cache_create(). Therefore, statically
initialize the ChaCha constants for the primary pool.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: <linux-crypto@vger.kernel.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: use IS_ENABLED(CONFIG_NUMA) instead of ifdefs

Rather than an awkward combination of ifdefs and __maybe_unused, we can
ensure more source gets parsed, regardless of the configuration, by
using IS_ENABLED for the CONFIG_NUMA conditional code. This makes things
cleaner and easier to follow.

I've confirmed that on !CONFIG_NUMA, we don't wind up with excess code
by accident; the generated object file is the same.

Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: harmonize "crng init done" messages

We print out "crng init done" for !TRUST_CPU, so we should also print
out the same for TRUST_CPU.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: mix bootloader randomness into pool

If we're trusting bootloader randomness, crng_fast_load() is called by
add_hwgenerator_randomness(), which sets us to crng_init==1. However,
usually it is only called once for an initial 64-byte push, so bootloader
entropy will not mix any bytes into the input pool. So it's conceivable
that crng_init==1 when crng_initialize_primary() is called later, but
then the input pool is empty. When that happens, the crng state key will
be overwritten with extracted output from the empty input pool. That's
bad.

In contrast, if we're not trusting bootloader randomness, we call
crng_slow_load() *and* we call mix_pool_bytes(), so that later
crng_initialize_primary() isn't drawing on nothing.

In order to prevent crng_initialize_primary() from extracting an empty
pool, have the trusted bootloader case mirror that of the untrusted
bootloader case, mixing the input into the pool.

[linux@dominikbrodowski.net: rewrite commit message]
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: do not throw away excess input to crng_fast_load

When crng_fast_load() is called by add_hwgenerator_randomness(), we
currently will advance to crng_init==1 once we've acquired 64 bytes, and
then throw away the rest of the buffer. Usually, that is not a problem:
When add_hwgenerator_randomness() gets called via EFI or DT during
setup_arch(), there won't be any IRQ randomness. Therefore, the 64 bytes
passed by EFI exactly matches what is needed to advance to crng_init==1.
Usually, DT seems to pass 64 bytes as well -- with one notable exception
being kexec, which hands over 128 bytes of entropy to the kexec'd kernel.
In that case, we'll advance to crng_init==1 once 64 of those bytes are
consumed by crng_fast_load(), but won't continue onward feeding in bytes
to progress to crng_init==2. This commit fixes the issue by feeding
any leftover bytes into the next phase in add_hwgenerator_randomness().

[linux@dominikbrodowski.net: rewrite commit message]
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: do not re-init if crng_reseed completes before primary init

If the bootloader supplies sufficient material and crng_reseed() is called
very early on, but not too early that wqs aren't available yet, then we
might transition to crng_init==2 before rand_initialize()'s call to
crng_initialize_primary() made. Then, when crng_initialize_primary() is
called, if we're trusting the CPU's RDRAND instructions, we'll
needlessly reinitialize the RNG and emit a message about it. This is
mostly harmless, as numa_crng_init() will allocate and then free what it
just allocated, and excessive calls to invalidate_batched_entropy()
aren't so harmful. But it is funky and the extra message is confusing,
so avoid the re-initialization all together by checking for crng_init <
2 in crng_initialize_primary(), just as we already do in crng_reseed().

Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: fix crash on multiple early calls to add_bootloader_randomness()

Currently, if CONFIG_RANDOM_TRUST_BOOTLOADER is enabled, multiple calls
to add_bootloader_randomness() are broken and can cause a NULL pointer
dereference, as noted by Ivan T. Ivanov. This is not only a hypothetical
problem, as qemu on arm64 may provide bootloader entropy via EFI and via
devicetree.

On the first call to add_hwgenerator_randomness(), crng_fast_load() is
executed, and if the seed is long enough, crng_init will be set to 1.
On subsequent calls to add_bootloader_randomness() and then to
add_hwgenerator_randomness(), crng_fast_load() will be skipped. Instead,
wait_event_interruptible() and then credit_entropy_bits() will be called.
If the entropy count for that second seed is large enough, that proceeds
to crng_reseed().

However, both wait_event_interruptible() and crng_reseed() depends
(at least in numa_crng_init()) on workqueues. Therefore, test whether
system_wq is already initialized, which is a sufficient indicator that
workqueue_init_early() has progressed far enough.

If we wind up hitting the !system_wq case, we later want to do what
would have been done there when wqs are up, so set a flag, and do that
work later from the rand_initialize() call.

Reported-by: Ivan T. Ivanov <iivanov@suse.de>
Fixes: 18b915ac6b0a ("efi/random: Treat EFI_RNG_PROTOCOL output as bootloader randomness")
Cc: stable@vger.kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
[Jason: added crng_need_done state and related logic.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: do not sign extend bytes for rotation when mixing

By using `char` instead of `unsigned char`, certain platforms will sign
extend the byte when `w = rol32(*bytes++, input_rotate)` is called,
meaning that bit 7 is overrepresented when mixing. This isn't a real
problem (unless the mixer itself is already broken) since it's still
invertible, but it's not quite correct either. Fix this by using an
explicit unsigned type.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: use BLAKE2s instead of SHA1 in extraction

This commit addresses one of the lower hanging fruits of the RNG: its
usage of SHA1.

BLAKE2s is generally faster, and certainly more secure, than SHA1, which
has [1] been [2] really [3] very [4] broken [5]. Additionally, the
current construction in the RNG doesn't use the full SHA1 function, as
specified, and allows overwriting the IV with RDRAND output in an
undocumented way, even in the case when RDRAND isn't set to "trusted",
which means potential malicious IV choices. And its short length means
that keeping only half of it secret when feeding back into the mixer
gives us only 2^80 bits of forward secrecy. In other words, not only is
the choice of hash function dated, but the use of it isn't really great
either.

This commit aims to fix both of these issues while also keeping the
general structure and semantics as close to the original as possible.
Specifically:

   a) Rather than overwriting the hash IV with RDRAND, we put it into
      BLAKE2's documented "salt" and "personal" fields, which were
      specifically created for this type of usage.
   b) Since this function feeds the full hash result back into the
      entropy collector, we only return from it half the length of the
      hash, just as it was done before. This increases the
      construction's forward secrecy from 2^80 to a much more
      comfortable 2^128.
   c) Rather than using the raw "sha1_transform" function alone, we
      instead use the full proper BLAKE2s function, with finalization.

This also has the advantage of supplying 16 bytes at a time rather than
SHA1's 10 bytes, which, in addition to having a faster compression
function to begin with, means faster extraction in general. On an Intel
i7-11850H, this commit makes initial seeding around 131% faster.

BLAKE2s itself has the nice property of internally being based on the
ChaCha permutation, which the RNG is already using for expansion, so
there shouldn't be any issue with newness, funkiness, or surprising CPU
behavior, since it's based on something already in use.

[1] https://eprint.iacr.org/2005/010.pdf
[2] https://www.iacr.org/archive/crypto2005/36210017/36210017.pdf
[3] https://eprint.iacr.org/2015/967.pdf
[4] https://shattered.io/static/shattered.pdf
[5] https://www.usenix.org/system/files/sec20-leurent.pdf

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

lib/crypto: blake2s: include as built-in

In preparation for using blake2s in the RNG, we change the way that it
is wired-in to the build system. Instead of using ifdefs to select the
right symbol, we use weak symbols. And because ARM doesn't need the
generic implementation, we make the generic one default only if an arch
library doesn't need it already, and then have arch libraries that do
need it opt-in. So that the arch libraries can remain tristate rather
than bool, we then split the shash part from the glue code.

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: linux-kbuild@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: fix data race on crng init time

_extract_crng() does plain loads of crng->init_time and
crng_global_init_time, which causes undefined behavior if
crng_reseed() and RNDRESEEDCRNG modify these corrently.

Use READ_ONCE() and WRITE_ONCE() to make the behavior defined.

Don't fix the race on crng->init_time by protecting it with crng->lock,
since it's not a problem for duplicate reseedings to occur. I.e., the
lockless access with READ_ONCE() is fine.

Fixes: d848e5f8e1eb ("random: add new ioctl RNDRESEEDCRNG")
Fixes: e192be9d9a30 ("random: replace non-blocking pool with a Chacha20-based CRNG")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: fix data race on crng_node_pool

extract_crng() and crng_backtrack_protect() load crng_node_pool with a
plain load, which causes undefined behavior if do_numa_crng_init()
modifies it concurrently.

Fix this by using READ_ONCE(). Note: as per the previous discussion
https://lore.kernel.org/lkml/20211219025139.31085-1-ebiggers@kernel.org/T/#u,
READ_ONCE() is believed to be sufficient here, and it was requested that
it be used here instead of smp_load_acquire().

Also change do_numa_crng_init() to set crng_node_pool using
cmpxchg_release() instead of mb() + cmpxchg(), as the former is
sufficient here but is more lightweight.

Fixes: 1e7f583af67b ("random: make /dev/urandom scalable for silly userspace programs")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

irq: remove unused flags argument from __handle_irq_event_percpu()

The __IRQF_TIMER bit from the flags argument was used in
add_interrupt_randomness() to distinguish the timer interrupt from other
interrupts. This is no longer the case.

Remove the flags argument from __handle_irq_event_percpu().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: remove unused irq_flags argument from add_interrupt_randomness()

Since commit
ee3e00e9e7101 ("random: use registers from interrupted code for CPU's w/o a cycle counter")

the irq_flags argument is no longer used.

Remove unused irq_flags.

Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: linux-hyperv@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: document add_hwgenerator_randomness() with other input functions

The section at the top of random.c which documents the input functions
available does not document add_hwgenerator_randomness() which might lead
a reader to overlook it. Add a brief note about it.

Signed-off-by: Mark Brown <broonie@kernel.org>
[Jason: reorganize position of function in doc comment and also document
add_bootloader_randomness() while we're at it.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

MAINTAINERS: add git tree for random.c

This is handy not just for humans, but also so that the 0-day bot can
automatically test posted mailing list patches against the right tree.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Merge tag 'trace-v5.16-rc8' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"Three minor tracing fixes:

   - Fix missing prototypes in sample module for direct functions

   - Fix check of valid buffer in get_trace_buf()

   - Fix annotations of percpu pointers"

* tag 'trace-v5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Tag trace_percpu_buffer as a percpu pointer
  tracing: Fix check for trace_percpu_buffer validity in get_trace_buf()
  ftrace/samples: Add missing prototypes direct functions

selftests: cgroup: Test open-time cgroup namespace usage for migration checks

When a task is writing to an fd opened by a different task, the perm check
should use the cgroup namespace of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

selftests: cgroup: Test open-time credential usage for migration checks

When a task is writing to an fd opened by a different task, the perm check
should use the credentials of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644

0644 is an odd perm to create a cgroup which is a directory. Use the regular
0755 instead. This is necessary for euid switching test case.

Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: Use open-time cgroup namespace for process migration perm checks

cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's cgroup namespace which is
a potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.

This patch makes cgroup remember the cgroup namespace at the time of open
and uses it for migration permission checks instad of current's. Note that
this only applies to cgroup2 as cgroup1 doesn't have namespace support.

This also fixes a use-after-free bug on cgroupns reported in

https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com

Note that backporting this fix also requires the preceding patch.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com
Fixes: 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option")
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv

of->priv is currently used by each interface file implementation to store
private information. This patch collects the current two private data usages
into struct cgroup_file_ctx which is allocated and freed by the common path.
This allows generic private data which applies to multiple files, which will
be used to in the following patch.

Note that cgroup_procs iterator is now embedded as procs.iter in the new
cgroup_file_ctx so that it doesn't need to be allocated and freed
separately.

v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in
    cgroup_file_ctx as suggested by Linus.

v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too.
    Converted. Didn't change to embedded allocation as cgroup1 pidlists get
    stored for caching.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michal Koutný <mkoutny@suse.com>

cgroup: Use open-time credentials for process migraton perm checks

cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.

This patch makes both cgroup2 and cgroup1 process migration interfaces to
use the credentials saved at the time of open (file->f_cred) instead of
current's.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Fixes: 187fe84067bd ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy")
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

Merge tag 'amd-drm-fixes-5.16-2021-12-31' of ssh://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.16-2021-12-31:

amdgpu:
- Suspend/resume fix
- Restore runtime pm behavior with efifb

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211231143825.11479-1-alexander.deucher@amd.com

i2c: mpc: Avoid out of bounds memory access

When performing an I2C transfer where the last message was a write KASAN
would complain:

  BUG: KASAN: slab-out-of-bounds in mpc_i2c_do_action+0x154/0x630
  Read of size 2 at addr c814e310 by task swapper/2/0

  CPU: 2 PID: 0 Comm: swapper/2 Tainted: G    B             5.16.0-rc8 #1
  Call Trace:
  [e5ee9d50] [c08418e8] dump_stack_lvl+0x4c/0x6c (unreliable)
  [e5ee9d70] [c02f8a14] print_address_description.constprop.13+0x64/0x3b0
  [e5ee9da0] [c02f9030] kasan_report+0x1f0/0x204
  [e5ee9de0] [c0c76ee4] mpc_i2c_do_action+0x154/0x630
  [e5ee9e30] [c0c782c4] mpc_i2c_isr+0x164/0x240
  [e5ee9e60] [c00f3a04] __handle_irq_event_percpu+0xf4/0x3b0
  [e5ee9ec0] [c00f3d40] handle_irq_event_percpu+0x80/0x110
  [e5ee9f40] [c00f3e48] handle_irq_event+0x78/0xd0
  [e5ee9f60] [c00fcfec] handle_fasteoi_irq+0x19c/0x370
  [e5ee9fa0] [c00f1d84] generic_handle_irq+0x54/0x80
  [e5ee9fc0] [c0006b54] __do_irq+0x64/0x200
  [e5ee9ff0] [c0007958] __do_IRQ+0xe8/0x1c0
  [c812dd50] [e3eaab20] 0xe3eaab20
  [c812dd90] [c0007a4c] do_IRQ+0x1c/0x30
  [c812dda0] [c0000c04] ExternalInput+0x144/0x160
  --- interrupt: 500 at arch_cpu_idle+0x34/0x60
  NIP:  c000b684 LR: c000b684 CTR: c0019688
  REGS: c812ddb0 TRAP: 0500   Tainted: G    B              (5.16.0-rc8)
  MSR:  00029002 <CE,EE,ME>  CR: 22000488  XER: 20000000

  GPR00: c10ef7fc c812de90 c80ff200 c2394718 00000001 00000001 c10e3f90 00000003
  GPR08: 00000000 c0019688 c2394718 fc7d625b 22000484 00000000 21e17000 c208228c
  GPR16: e3e99284 00000000 ffffffff c2390000 c001bac0 c2082288 c812df60 c001ba60
  GPR24: c23949c0 00000018 00080000 00000004 c80ff200 00000002 c2348ee4 c2394718
  NIP [c000b684] arch_cpu_idle+0x34/0x60
  LR [c000b684] arch_cpu_idle+0x34/0x60
  --- interrupt: 500
  [c812de90] [c10e3f90] rcu_eqs_enter.isra.60+0xc0/0x110 (unreliable)
  [c812deb0] [c10ef7fc] default_idle_call+0xbc/0x230
  [c812dee0] [c00af0e8] do_idle+0x1c8/0x200
  [c812df10] [c00af3c0] cpu_startup_entry+0x20/0x30
  [c812df20] [c001e010] start_secondary+0x5d0/0xba0
  [c812dff0] [c00028a0] __secondary_start+0x90/0xdc

This happened because we would overrun the i2c->msgs array on the final
interrupt for the I2C STOP. This didn't happen if the last message was a
read because there is no interrupt in that case. Ensure that we only
access the current message if we are not processing a I2C STOP
condition.

Fixes: 1538d82f4647 ("i2c: mpc: Interrupt driven transfer")
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@kernel.org>

mm/slob: Remove unnecessary page_mapcount_reset() function call

After commit 401fb12c68c2 ("mm/sl*b: Differentiate struct slab fields by
sl*b implementations"), we can reorder fields of struct slab depending
on slab allocator.

For now, page_mapcount_reset() is called because page->_mapcount and
slab->units have same offset. But this is not necessary for struct slab.
Use unused field for units instead.

Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20211212065241.GA886691@odroid

bootmem: Use page->index instead of page->freelist

page->freelist is for the use of slab. Using page->index is the same
set of bits as page->freelist, and by using an integer instead of a
pointer, we can avoid casts.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: <x86@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>

zsmalloc: Stop using slab fields in struct page

The ->freelist and ->units members of struct page are for the use of
slab only. I'm not particularly familiar with zsmalloc, so generate the
same code by using page->index to store 'page' (page->index and
page->freelist are at the same offset in struct page). This should be
cleaned up properly at some point by somebody who is familiar with
zsmalloc.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>

mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled

The fields 'next' and 'slabs' are only used when CONFIG_SLUB_CPU_PARTIAL
is enabled. We can put their definition to #ifdef to prevent accidental
use when disabled.

Currenlty show_slab_objects() and slabs_cpu_partial_show() contain code
accessing the slabs field that's effectively dead with
CONFIG_SLUB_CPU_PARTIAL=n through the wrappers slub_percpu_partial() and
slub_percpu_partial_read_once(), but to prevent a compile error, we need
to hide all this code behind #ifdef.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

mm/slub: Simplify struct slab slabs field definition

Before commit b47291ef02b0 ("mm, slub: change percpu partial accounting
from objects to pages") we had to fit two integer fields into a native
word size, so we used short int on 32-bit and int on 64-bit via #ifdef.
After that commit there is only one integer field, so we can simply
define it as int everywhere.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Roman Gushchin <guro@fb.com>

mm/sl*b: Differentiate struct slab fields by sl*b implementations

With a struct slab definition separate from struct page, we can go
further and define only fields that the chosen sl*b implementation uses.
This means everything between __page_flags and __page_refcount
placeholders now depends on the chosen CONFIG_SL*B. Some fields exist in
all implementations (slab_list) but can be part of a union in some, so
it's simpler to repeat them than complicate the definition with ifdefs
even more.

The patch doesn't change physical offsets of the fields, although it
could be done later - for example it's now clear that tighter packing in
SLOB could be possible.

This should also prevent accidental use of fields that don't exist in
given implementation. Before this patch virt_to_cache() and
cache_from_obj() were visible for SLOB (albeit not used), although they
rely on the slab_cache field that isn't set by SLOB. With this patch
it's now a compile error, so these functions are now hidden behind
an #ifndef CONFIG_SLOB.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>
Tested-by: Marco Elver <elver@google.com> # kfence
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <kasan-dev@googlegroups.com>

mm/kfence: Convert kfence_guarded_alloc() to struct slab

The function sets some fields that are being moved from struct page to
struct slab so it needs to be converted.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <kasan-dev@googlegroups.com>

mm/kasan: Convert to struct folio and struct slab

KASAN accesses some slab related struct page fields so we need to
convert it to struct slab. Some places are a bit simplified thanks to
kasan_addr_to_slab() encapsulating the PageSlab flag check through
virt_to_slab(). When resolving object address to either a real slab or
a large kmalloc, use struct folio as the intermediate type for testing
the slab flag to avoid unnecessary implicit compound_head().

[ vbabka@suse.cz: use struct folio, adjust to differences in previous
patches ]

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Tested-by: Hyeongogn Yoo <42.hyeyoo@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <kasan-dev@googlegroups.com>

mm/slob: Convert SLOB to use struct slab and struct folio

Use struct slab throughout the slob allocator. Where non-slab page can
appear use struct folio instead of struct page.

[ vbabka@suse.cz: don't introduce wrappers for PageSlobFree in mm/slab.h
just for the single callers being wrappers in mm/slob.c ]

[ Hyeonggon Yoo <42.hyeyoo@gmail.com>: fix NULL pointer deference ]

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

mm/memcg: Convert slab objcgs from struct page to struct slab

page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages
so convert all the related infrastructure to struct slab. Also use
struct folio instead of struct page when resolving object pointers.

This is not just mechanistic changing of types and names. Now in
mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret
the folio as a real slab instead of a large kmalloc, instead of relying
on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check().
Similarly in memcg_slab_free_hook() where we can encounter
kmalloc_large() pages (here the folio slab flag check is implied by
virt_to_slab()). As a result, page_objcgs_check() can be dropped instead
of converted.

To avoid include cycles, move the inline definition of slab_objcgs()
from memcontrol.h to mm/slab.h.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <cgroups@vger.kernel.org>

mm: Convert struct page to struct slab in functions used by other subsystems

KASAN, KFENCE and memcg interact with SLAB or SLUB internals through
functions nearest_obj(), obj_to_index() and objs_per_slab() that use
struct page as parameter. This patch converts it to struct slab
including all callers, through a coccinelle semantic patch.

// Options: --include-headers --no-includes --smpl-spacing include/linux/slab_def.h include/linux/slub_def.h mm/slab.h mm/kasan/*.c mm/kfence/kfence_test.c mm/memcontrol.c mm/slab.c mm/slub.c
// Note: needs coccinelle 1.1.1 to avoid breaking whitespace

@@
@@

-objs_per_slab_page(
+objs_per_slab(
...
)
{ ... }

@@
@@

-objs_per_slab_page(
+objs_per_slab(
...
)

@@
identifier fn =~ "obj_to_index|objs_per_slab";
@@

fn(...,
-   const struct page *page
+   const struct slab *slab
    ,...)
{
<...
(
- page_address(page)
+ slab_address(slab)
|
- page
+ slab
)
...>
}

@@
identifier fn =~ "nearest_obj";
@@

fn(...,
-   struct page *page
+   const struct slab *slab
    ,...)
{
<...
(
- page_address(page)
+ slab_address(slab)
|
- page
+ slab
)
...>
}

@@
identifier fn =~ "nearest_obj|obj_to_index|objs_per_slab";
expression E;
@@

fn(...,
(
- slab_page(E)
+ E
|
- virt_to_page(E)
+ virt_to_slab(E)
|
- virt_to_head_page(E)
+ virt_to_slab(E)
|
- page
+ page_slab(page)
)
  ,...)

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <kasan-dev@googlegroups.com>
Cc: <cgroups@vger.kernel.org>

mm/slab: Finish struct page to struct slab conversion

Change cache_free_alien() to use slab_nid(virt_to_slab()). Otherwise
just update of comments and some remaining variable names.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>