Linus Torvalds [Tue, 13 Dec 2022 20:32:07 +0000 (12:32 -0800)]
Merge tag 'mtd/for-6.2' of git://git./linux/kernel/git/mtd/linux
Pull mtd updates from Miquel Raynal:
"MTD core changes:
- Fix refcount error in del_mtd_device()
- Fix possible resource leak in init_mtd()
- Set ROOT_DEV for partitions marked as rootfs in DT
- Describe marking rootfs partitions in the bindings
- Fix device name leak when register device fails in add_mtd_device()
- Try to find OF node for every MTD partition
- simplify (a bit) code find partition-matching dynamic OF node
MTD driver changes:
- pxa2xx-flash maps: fix memory leak in probe
- BCM parser: refer to ARCH_BCMBCA instead of ARCH_BCM4908
- lpddr2_nvm: Fix possible null-ptr-deref
- inftlcore: fix repeated words in comments
- lart: remove driver
- tplink:
- Add TP-Link SafeLoader partitions table parser and bindings
- Describe TP-Link SafeLoader parser
- Describe TP-Link SafeLoader dynamic subpartitions
- mtdoops:
- Panic caused mtdoops to call mtdoops_erase function immediately
- Add mtdoops_erase function and move mtdoops_inc_counter after it
- Change printk() to counterpart pr_ functions
MTD binding cleanup:
- Fixed-partitions: Fix 'sercomm,scpart-id' schema
- Standardize the style in the examples
- Drop object types when referencing other files
- Argue in favor of keeping additionalProperties set to true
- NVMEM-cells:
- Inherit from MTD partitions
- Drop range property from example
- Partitions:
- Change qcom,smem-part partition type
- Constrain the list of parsers
- Physmap: Reuse the generic definitions
- SPI-NOR: Drop common properties
- Sunxi-nand: Add an example to validate the bindings
- Onenand: Mention the expected node name
- Ingenic: Mark partitions in the controller node as deprecated
- NAND:
- Standardize the child node name
- Drop common properties already defined in generic files
- nand-chip.yaml should reference mtd.yaml
- Remove useless file about partitions
- Clarify all partition subnodes
SPI NOR core changes:
- Add support for flash reset using the dt reset-gpios property.
- Update hwcaps.mask to include 8D-8D-8D read and page program ops
when xSPI profile 1.0 table is defined.
- Bypass zero erase size in spi_nor_find_best_erase_type().
- Fix select_uniform_erase to skip 0 erase size
- Add generic flash driver. If a flash is not found in the flash_info
array, fall back to the generic flash driver which is described
solely by the flash's SFDP tables.
- Fix the number of bytes for the dummy cycles in
spi_nor_spimem_check_readop().
- Introduce SPI_NOR_QUAD_PP flag, as PP_1_1_4 is not SFDP
discoverable.
SPI NOR manufacturer drivers changes:
- Spansion:
- use PARSE_SFDP for s28hs512t,
- add support for s28hl512t, s28hl01gt, and s28hs01gt.
- Gigadevice: Replace default_init() with post_bfpt() for gd25q256.
- Micron - ST: Enable locking for mt25qu256a.
- Winbond: Add support for W25Q512NW-IQ.
- ISSI: Use PARSE_SFDP and SPI_NOR_QUAD_PP.
Raw NAND core changes:
- Drop obsolete dependencies on COMPILE_TEST
- MAINTAINERS: rectify entry for MESON NAND controller bindings
- Drop EXPORT_SYMBOL_GPL for nanddev_erase()
Raw NAND driver changes:
- marvell: Enable NFC/DEVBUS arbiter
- gpmi: Use pm_runtime_resume_and_get instead of pm_runtime_get_sync
- mpc5121: Replace NO_IRQ by 0
- lpc32xx_{slc,mlc}:
- Switch to using pm_ptr()
- Switch to using gpiod API
- lpc32xx_mlc: Switch to using pm_ptr()
- cadence: Support 64-bit slave dma interface
- rockchip: Describe rk3128-nfc in the bindings
- brcmnand: Update interrupts description in the bindings
SPI-NAND driver changes:
- winbond:
- Add Winbond W25N02KV flash support
- Fix flash identification"
* tag 'mtd/for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (76 commits)
mtd: rawnand: Drop obsolete dependencies on COMPILE_TEST
mtd: maps: pxa2xx-flash: fix memory leak in probe
mtd: core: Fix refcount error in del_mtd_device()
mtd: spi-nor: add SFDP fixups for Quad Page Program
mtd: spi-nor: issi: is25wp256: Init flash based on SFDP
mtd: spi-nor: winbond: add support for W25Q512NW-IQ
mtd: spi-nor: micron-st: Enable locking for mt25qu256a
mtd: spi-nor: Fix the number of bytes for the dummy cycles
mtd: spi-nor: gigadevice: gd25q256: replace gd25q256_default_init with gd25q256_post_bfpt
mtd: spi-nor: Fix formatting in spi_nor_read_raw() kerneldoc comment
mtd: spi-nor: sysfs: print JEDEC ID for generic flash driver
mtd: spi-nor: add generic flash driver
mtd: spi-nor: fix select_uniform_erase to skip 0 erase size
mtd: spi-nor: move function declaration out of sfdp.h
mtd: spi-nor: remember full JEDEC flash ID
mtd: spi-nor: sysfs: hide manufacturer if it is not set
mtd: spi-nor: hide jedec_id sysfs attribute if not present
mtd: spi-nor: Check for zero erase size in spi_nor_find_best_erase_type()
mtd: rawnand: marvell: Enable NFC/DEVBUS arbiter
mtd: parsers: refer to ARCH_BCMBCA instead of ARCH_BCM4908
...
Linus Torvalds [Tue, 13 Dec 2022 19:59:58 +0000 (11:59 -0800)]
Merge tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"The biggest highlight is that the accel subsystem framework is merged.
Hopefully for 6.3 we will be able to line up a driver to use it.
In drivers land, i915 enables DG2 support by default now, and nouveau
has a big stability refactoring and initial ampere support, AMD
includes new hw IP support and should build on ARM again. There is
also an ofdrm driver to take over offb on platforms it's used.
Stuff outside my tree, the dma-buf patches hit a few places, the vc4
firmware changes also do, and i915 has some interactions with MEI for
discrete GPUs. I think all of those should have been acked/reviewed by
relevant parties.
New driver:
- ofdrm - replacement for offb
fbdev:
- add support for nomodeset
fourcc:
- add Vivante tiled modifier
core:
- atomic-helpers: CRTC primary plane test fixes, fb access hooks
- connector: TV API consistency, cmdline parser improvements
- send connector hotplug on cleanup
- sort makefile objects
tests:
- sort kunit tests
- improve DP-MST tests
- add kunit helpers to create a device
sched:
- module param for scheduling policy
- refcounting fix
buddy:
- add back random seed log
ttm:
- convert ttm_resource to size_t
- optimize pool allocations
edid:
- HFVSDB parsing support fixes
- logging/debug improvements
- DSC quirks
dma-buf:
- Add unlocked vmap and attachment mapping
- move drivers to common locking convention
- locking improvements
firmware:
- new API for rPI firmware and vc4
xilinx:
- zynqmp: displayport bridge support
- dpsub fix
bridge:
- adv7533: Remove dynamic lane switching
- it6505: Runtime PM support, sync improvements
- ps8640: Handle AUX defer messages
- tc358775: Drop soft-reset over I2C
panel:
- panel-edp: Add INX N116BGE-EA2 C2 and C4 support.
- Jadard JD9365DA-H3
- NewVision NV3051D
amdgpu:
- DCN support on ARM
- DCN 2.1 secure display
- Sienna Cichlid mode2 reset fixes
- new GC 11.x firmware versions
- drop AMD specific DSC workarounds in favour of drm code
- clang warning fixes
- scheduler rework
- SR-IOV fixes
- GPUVM locking fixes
- fix memory leak in CS IOCTL error path
- flexible array updates
- enable new GC/PSP/SMU/NBIO IP
- GFX preemption support for gfx9
amdkfd:
- cache size fixes
- userptr fixes
- enable cooperative launch on gfx 10.3
- enable GC 11.0.4 KFD support
radeon:
- replace kmap with kmap_local_page
- ACPI ref count fix
- HDA audio notifier support
i915:
- DG2 enabled by default
- MTL enablement work
- hotplug refactoring
- VBT improvements
- Display and watermark refactoring
- ADL-P workaround
- temp disable runtime_pm for discrete-
- fix for A380 as a secondary GPU
- Wa_18017747507 for DG2
- CS timestamp support fixes for gen5 and earlier
- never purge busy TTM objects
- use i915_sg_dma_sizes for all backends
- demote GuC kernel contexts to normal priority
- gvt: refactor for new MDEV interface
- enable DC power states on eDP ports
- fix gen 2/3 workarounds
nouveau:
- fix page fault handling
- Ampere acceleration support
- driver stability improvements
- nva3 backlight support
msm:
- MSM_INFO_GET_FLAGS support
- DPU: XR30 and P010 image formats
- Qualcomm SM6115 support
- DSI PHY support for QCM2290
- HDMI: refactored dev init path
- remove exclusive-fence hack
- fix speed-bin detection
- enable clamp to idle on 7c3
- improved hangcheck detection
vmwgfx:
- fb and cursor refactoring
- convert to generic hashtable
- cursor improvements
etnaviv:
- hw workarounds
- softpin MMU fixes
ast:
- atomic gamma LUT support
- convert to SHMEM
lcdif:
- support YUV planes
- Increase DMA burst size
- FIFO threshold tuning
meson:
- fix return type of cvbs mode_valid
mgag200:
- fix PLL setup on some revisions
sun4i:
- A100 and D1 support
udl:
- modesetting improvements
- hot unplug support
vc4:
- support PAL-M
- fix regression preventing 4K @ 60Hz
- fix NULL ptr deref
v3d:
- switch to drm managed resources
renesas:
- RZ/G2L DSI support
- DU Kconfig cleanup
mediatek:
- fixup dpi and hdmi
- MT8188 dpi support
- MT8195 AFBC support
tegra:
- NVDEC hardware on Tegra234 SoC
hdlcd:
- switch to drm managed resources
ingenic:
- fix registration error path
hisilicon:
- convert to drm_mode_init
maildp:
- use managed resources
mtk:
- use drm_mode_init
rockchip:
- use drm_mode_copy"
* tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm: (1397 commits)
drm/amdgpu: fix mmhub register base coding error
drm/amdgpu: add tmz support for GC IP v11.0.4
drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4
drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4
drm/amdgpu: enable GFX IP v11.0.4 CG support
drm/amdgpu: Make amdgpu_ring_mux functions as static
drm/amdgpu: generally allow over-commit during BO allocation
drm/amd/display: fix array index out of bound error in DCN32 DML
drm/amd/display: 3.2.215
drm/amd/display: set optimized required for comp buf changes
drm/amd/display: Add debug option to skip PSR CRTC disable
drm/amd/display: correct DML calc error of UrgentLatency
drm/amd/display: correct static_screen_event_mask
drm/amd/display: Ensure commit_streams returns the DC return code
drm/amd/display: read invalid ddc pin status cause engine busy
drm/amd/display: Bypass DET swath fill check for max clocks
drm/amd/display: Disable uclk pstate for subvp pipes
drm/amd/display: Fix DCN2.1 default DSC clocks
drm/amd/display: Enable dp_hdmi21_pcon support
drm/amd/display: prevent seamless boot on displays that don't have the preferred dig
...
Linus Torvalds [Tue, 13 Dec 2022 19:36:58 +0000 (11:36 -0800)]
Merge tag 'media/v6.2-1' of git://git./linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- DVB core changes to avoid refcount troubles and UAF
- DVB API/core has gained support for DVB-C2 and DVB-S2X
- New sensor drivers: ov08x40, ov4689.c, st-vgxy61 and tc358746.c
- Removal of an unused sensor driver: s5k4ecgx
- Move microchip_csi2dc to a new directory, named after the
manufacturer
- Add media controller support to Microship drivers
- Old Atmel/Microship drivers that don't use media controler got moved
to staging
- New drivers added for Renesas RZ/G2L CRU and MIPI CSI-2 support
- Allwinner A31 camera sensor driver code was now split into a bridge
and a separate processor driver
- Added a virtual stateless decoder driver in order to test core
support for stateless drivers and test userspace apps using it
- removed platform-based support for ov9650, as this is not used
anymore
- atomisp now uses videobuf2 and supports normal mmap mode
- the imx7-media-csi driver got promoted from staging
- rcar-vin driver has gained support for gen3 UDS (Up Down Scaler)
- most i2c drivers now use I2C .probe_new() kAPI
- lots of drivers fixes, cleanups and improvements
* tag 'media/v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (544 commits)
media: s5c73m3: Switch to GPIO descriptors
media: i2c: s5k5baf: switch to using gpiod API
media: i2c: s5k6a3: switch to using gpiod API
media: imx: remove code for non-existing config IMX_GPT_ICAP
media: si470x: Fix use-after-free in si470x_int_in_callback()
media: staging: stkwebcam: Restore MEDIA_{USB,CAMERA}_SUPPORT dependencies
media: coda: Add check for kmalloc
media: coda: Add check for dcoda_iram_alloc
dt-bindings: media: s5c73m3: Fix reset-gpio descriptor
media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property
media: s5k4ecgx: Delete driver
media: s5k4ecgx: Switch to GPIO descriptors
media: Switch to use dev_err_probe() helper
headers: Remove some left-over license text in include/uapi/linux/v4l2-*
headers: Remove some left-over license text in include/uapi/linux/dvb/
media: usb: pwc-uncompress: Use flex array destination for memcpy()
media: s5p-mfc: Fix to handle reference queue during finishing
media: s5p-mfc: Clear workbit to handle error condition
media: s5p-mfc: Fix in register read and write for H264
media: imx: Use get_mbus_config instead of parsing upstream DT endpoints
...
Linus Torvalds [Tue, 13 Dec 2022 19:27:26 +0000 (11:27 -0800)]
Merge tag 'sound-6.2-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"This looks like a relatively calm development cycle; there have been
only few changes in ALSA and ASoC core sides while we get lots of
device-specific fixes and updates as usual. Most of commits are about
ASoC, including Intel SOF/AVS and many device tree updates.
Below are some highlights:
Core:
- Improvement in memalloc helper for fallback allocations
- More cleanups of ASoC DAPM code
ASoC:
- Factoring out of mapping hw_params onto SoundWire configuration
- The ever ongoing overhauls of the Intel DSP code continue,
including support for loading libraries and probes with IPC4 on
SOF.
- Support for more sample formats on JZ4740
- Lots of device tree conversions and fixups
- Support for Allwinner D1, a range of AMD and Intel systems,
Mediatek systems with multiple DMICs, Nuvoton NAU8318, NXP
fsl_rpmsg and i.MX93, Qualcomm AudioReach Enable, MFC and SAL,
RealTek RT1318 and Rockchip RK3588
ALSA:
- Addition of PCM kselftest; still minimalistic but can be extended
in future
- Fixes for corner-case XRUNs with USB-audio implicit feedback mode
- Usual device-specific quirk updates for USB- and HD-audio
- FireWire DICE updates
This also contains a few cross-tree updates:
- Some OMAP board file updates for removal of relevant OMAP platforms
- A new I2C API update for I2C probe API adaption
- A DRM update for the further hdmi-codec updates"
* tag 'sound-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (417 commits)
ALSA: mts64: fix possible null-ptr-defer in snd_mts64_interrupt
ALSA: patch_realtek: Fix Dell Inspiron Plus 16
ALSA: hda/cirrus: Add extra 10 ms delay to allow PLL settle and lock.
ASoC: dt-bindings: Correct Alexandre Belloni email
ASoC: dt-bindings: maxim,max98504: Convert to DT schema
ASoC: dt-bindings: maxim,max98357a: Convert to DT schema
ASoC: dt-bindings: Reference common DAI properties
ASoC: dt-bindings: Extend name-prefix.yaml into common DAI properties
ASoC: rt715: Make read-only arrays capture_reg_H and capture_reg_L static const
ASoC: uniphier: aio-core: Make some read-only arrays static const
ASoC: wcd938x: Make read-only array minCode_param static const
ASoC: qcom: lpass-sc7280: Add maybe_unused tag for system PM ops
ASoC : SOF: amd: Add support for IPC and DSP dumps
ASoC: SOF: amd: Use poll function instead to read ACP_SHA_DSP_FW_QUALIFIER
ALSA: usb-audio: Workaround for XRUN at prepare
ALSA: pcm: Handle XRUN at trigger START
ALSA: pcm: Set missing stop_operating flag at undoing trigger start
drm: tda99x: Don't advertise non-existent capture support
ASoC: hdmi-codec: Allow playback and capture to be disabled
kselftest/alsa: Add more coverage of sample rates and channel counts
...
Linus Torvalds [Tue, 13 Dec 2022 18:58:09 +0000 (10:58 -0800)]
Merge tag 'for-6.2/dm-changes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Fix use-after-free races due to missing resource cleanup during DM
target destruction in DM targets: thin-pool, cache, integrity and
clone.
- Fix ABBA deadlocks in DM thin-pool and cache targets due to their use
of a bufio client (that has a shrinker whose locking can cause the
incorrect locking order).
- Fix DM cache target to set its needs_check flag after first aborting
the metadata (whereby using reset persistent-data objects to update
the superblock with, otherwise the superblock update could be dropped
due to aborting metadata). This was found with code-inspection when
comparing with the equivalent in DM thinp code.
- Fix DM thin-pool's presume to continue resuming the device even if
the pool in is fail mode -- otherwise bios may never be failed up the
IO stack (which will prevent resetting the thin-pool target via table
reload)
- Fix DM thin-pool's metadata to use proper btree root (from previous
transaction) if metadata commit failed.
- Add 'waitfor' module param to DM module (dm_mod) to allow dm-init to
wait for the specified device before continuing with its DM target
initialization.
* tag 'for-6.2/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm thin: Use last transaction's pmd->root when commit failed
dm init: add dm-mod.waitfor to wait for asynchronously probed block devices
dm ioctl: fix a couple ioctl codes
dm ioctl: a small code cleanup in list_version_get_info
dm thin: resume even if in FAIL mode
dm cache: set needs_check flag after aborting metadata
dm cache: Fix ABBA deadlock between shrink_slab and dm_cache_metadata_abort
dm thin: Fix ABBA deadlock between shrink_slab and dm_pool_abort_metadata
dm integrity: Fix UAF in dm_integrity_dtr()
dm cache: Fix UAF in destroy()
dm clone: Fix UAF in clone_dtr()
dm thin: Fix UAF in run_timer_softirq()
Linus Torvalds [Tue, 13 Dec 2022 18:54:19 +0000 (10:54 -0800)]
Merge tag 'ata-6.2-rc1' of git://git./linux/kernel/git/dlemoal/libata
Pull ata updates from Damien Le Moal:
"The ususal set of driver fixes and improvements as well as several
patches improving libata core in preparation of the introduction of
the support for the command duration limits feature. In more details:
- Define the missing COMPLETED sense key in scsi header (me)
- Several patches to improve libata handling of the status of
completed commands and the retry and sense data reported to the
scsi layer for failed commands. In particular, this widen the
support for NCQ autosense to all drives that support this feature
instead of restricting this feature use to ZAC drives only (Niklas)
- Cleanup of the pata_mpc52xx and sata_dwc_460ex drivers to remove
the use of the deprecated NO_IRQ macro (Christophe)
- Fix build dedependency on OF vs use of the of_match_ptr() macro to
avoid build errors with the sata_gemini and pata_ftide010 drivers
(me)
- Some libata cleanups using the new helper function
ata_port_is_frozen() (Niklas)
- Improve internal command handling by not retrying commands that
failed with a timeout (Niklas)
- Remove code for several unused libata helper functions (from
Niklas)
- Remove the palmchip pata_bk3710 driver. A couple of other driver
removal should come in through the arm tree pull request (from
Arnd)
- Remove unused variable and function in the sata_dwc_460ex driver
and libata-sff code (Colin and Sergey)
- Minor cleanup of the pata_ep93xx driver platform code (from
Minghao)
- Remove the unnecessary linux/msi.h include from the ahci driver
(Thomas)
- Changes to libata enum constants definitions to avoid warnings with
gcc-13 (Arnd)"
* tag 'ata-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (24 commits)
ata: ahci: fix enum constants for gcc-13
ata: libata: fix commands incorrectly not getting retried during NCQ error
ata: ahci: Remove linux/msi.h include
ata: sata_dwc_460ex: Check !irq instead of irq == NO_IRQ
ata: pata_ep93xx: use devm_platform_get_and_ioremap_resource()
ata: libata-sff: kill unused ata_sff_busy_sleep()
ata: sata_dwc_460ex: remove variable num_processed
ata: remove palmchip pata_bk3710 driver
ata: remove unused helper ata_id_flush_ext_enabled()
ata: remove unused helper ata_id_flush_enabled()
ata: remove unused helper ata_id_lba48_enabled()
ata: libata-core: do not retry reading the log on timeout
scsi: libsas: make use of ata_port_is_frozen() helper
ata: make use of ata_port_is_frozen() helper
ata: add ata_port_is_frozen() helper
ata: pata_ftide010: Remove build dependency on OF
ata: sata_gemini: Remove dependency on OF for compile tests
ata: pata_mpc52xx: Replace NO_IRQ with 0
ata: libahci: read correct status and error field for NCQ commands
ata: libata: fetch sense data for ATA devices supporting sense reporting
...
Linus Torvalds [Tue, 13 Dec 2022 18:43:59 +0000 (10:43 -0800)]
Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull requests via Christoph:
- Support some passthrough commands without CAP_SYS_ADMIN (Kanchan
Joshi)
- Refactor PCIe probing and reset (Christoph Hellwig)
- Various fabrics authentication fixes and improvements (Sagi
Grimberg)
- Avoid fallback to sequential scan due to transient issues (Uday
Shankar)
- Implement support for the DEAC bit in Write Zeroes (Christoph
Hellwig)
- Allow overriding the IEEE OUI and firmware revision in configfs
for nvmet (Aleksandr Miloserdov)
- Force reconnect when number of queue changes in nvmet (Daniel
Wagner)
- Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi
Grimberg, Christoph Hellwig, Christophe JAILLET)
- Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni)
- Use the common tagset helpers in nvme-pci driver (Christoph
Hellwig)
- Cleanup the nvme-pci removal path (Christoph Hellwig)
- Use kstrtobool() instead of strtobool (Christophe JAILLET)
- Allow unprivileged passthrough of Identify Controller (Joel
Granados)
- Support io stats on the mpath device (Sagi Grimberg)
- Minor nvmet cleanup (Sagi Grimberg)
- MD pull requests via Song:
- Code cleanups (Christoph)
- Various fixes
- Floppy pull request from Denis:
- Fix a memory leak in the init error path (Yuan)
- Series fixing some batch wakeup issues with sbitmap (Gabriel)
- Removal of the pktcdvd driver that was deprecated more than 5 years
ago, and subsequent removal of the devnode callback in struct
block_device_operations as no users are now left (Greg)
- Fix for partition read on an exclusively opened bdev (Jan)
- Series of elevator API cleanups (Jinlong, Christoph)
- Series of fixes and cleanups for blk-iocost (Kemeng)
- Series of fixes and cleanups for blk-throttle (Kemeng)
- Series adding concurrent support for sync queues in BFQ (Yu)
- Series bringing drbd a bit closer to the out-of-tree maintained
version (Christian, Joel, Lars, Philipp)
- Misc drbd fixes (Wang)
- blk-wbt fixes and tweaks for enable/disable (Yu)
- Fixes for mq-deadline for zoned devices (Damien)
- Add support for read-only and offline zones for null_blk
(Shin'ichiro)
- Series fixing the delayed holder tracking, as used by DM (Yu,
Christoph)
- Series enabling bio alloc caching for IRQ based IO (Pavel)
- Series enabling userspace peer-to-peer DMA (Logan)
- BFQ waker fixes (Khazhismel)
- Series fixing elevator refcount issues (Christoph, Jinlong)
- Series cleaning up references around queue destruction (Christoph)
- Series doing quiesce by tagset, enabling cleanups in drivers
(Christoph, Chao)
- Series untangling the queue kobject and queue references (Christoph)
- Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye,
Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph)
* tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits)
blktrace: Fix output non-blktrace event when blk_classic option enabled
block: sed-opal: Don't include <linux/kernel.h>
sed-opal: allow using IOC_OPAL_SAVE for locking too
blk-cgroup: Fix typo in comment
block: remove bio_set_op_attrs
nvmet: don't open-code NVME_NS_ATTR_RO enumeration
nvme-pci: use the tagset alloc/free helpers
nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
nvme: consolidate setting the tagset flags
nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
block: bio_copy_data_iter
nvme-pci: split out a nvme_pci_ctrl_is_dead helper
nvme-pci: return early on ctrl state mismatch in nvme_reset_work
nvme-pci: rename nvme_disable_io_queues
nvme-pci: cleanup nvme_suspend_queue
nvme-pci: remove nvme_pci_disable
nvme-pci: remove nvme_disable_admin_queue
nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl
nvme: use nvme_wait_ready in nvme_shutdown_ctrl
...
Linus Torvalds [Tue, 13 Dec 2022 18:40:31 +0000 (10:40 -0800)]
Merge tag 'for-6.2/io_uring-next-2022-12-08' of git://git.kernel.dk/linux
Pull io_uring updates part two from Jens Axboe:
- Misc fixes (me, Lin)
- Series from Pavel extending the single task exclusive ring mode,
yielding nice improvements for the common case of having a single
ring per thread (Pavel)
- Cleanup for MSG_RING, removing our IOPOLL hack (Pavel)
- Further poll cleanups and fixes (Pavel)
- Misc cleanups and fixes (Pavel)
* tag 'for-6.2/io_uring-next-2022-12-08' of git://git.kernel.dk/linux: (22 commits)
io_uring/msg_ring: flag target ring as having task_work, if needed
io_uring: skip spinlocking for ->task_complete
io_uring: do msg_ring in target task via tw
io_uring: extract a io_msg_install_complete helper
io_uring: get rid of double locking
io_uring: never run tw and fallback in parallel
io_uring: use tw for putting rsrc
io_uring: force multishot CQEs into task context
io_uring: complete all requests in task context
io_uring: don't check overflow flush failures
io_uring: skip overflow CQE posting for dying ring
io_uring: improve io_double_lock_ctx fail handling
io_uring: dont remove file from msg_ring reqs
io_uring: reshuffle issue_flags
io_uring: don't reinstall quiesce node for each tw
io_uring: improve rsrc quiesce refs checks
io_uring: don't raw spin unlock to match cq_lock
io_uring: combine poll tw handlers
io_uring: improve poll warning handling
io_uring: remove ctx variable in io_poll_check_events
...
Linus Torvalds [Tue, 13 Dec 2022 18:33:08 +0000 (10:33 -0800)]
Merge tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux
Pull io_uring updates from Jens Axboe:
- Always ensure proper ordering in case of CQ ring overflow, which then
means we can remove some work-arounds for that (Dylan)
- Support completion batching for multishot, greatly increasing the
efficiency for those (Dylan)
- Flag epoll/eventfd wakeups done from io_uring, so that we can easily
tell if we're recursing into io_uring again.
Previously, this would have resulted in repeated multishot
notifications if we had a dependency there. That could happen if an
eventfd was registered as the ring eventfd, and we multishot polled
for events on it. Or if an io_uring fd was added to epoll, and
io_uring had a multishot request for the epoll fd.
Test cases here:
https://git.kernel.dk/cgit/liburing/commit/?id=
919755a7d0096fda08fb6d65ac54ad8d0fe027cd
Previously these got terminated when the CQ ring eventually
overflowed, now it's handled gracefully (me).
- Tightening of the IOPOLL based completions (Pavel)
- Optimizations of the networking zero-copy paths (Pavel)
- Various tweaks and fixes (Dylan, Pavel)
* tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux: (41 commits)
io_uring: keep unlock_post inlined in hot path
io_uring: don't use complete_post in kbuf
io_uring: spelling fix
io_uring: remove io_req_complete_post_tw
io_uring: allow multishot polled reqs to defer completion
io_uring: remove overflow param from io_post_aux_cqe
io_uring: add lockdep assertion in io_fill_cqe_aux
io_uring: make io_fill_cqe_aux static
io_uring: add io_aux_cqe which allows deferred completion
io_uring: allow defer completion for aux posted cqes
io_uring: defer all io_req_complete_failed
io_uring: always lock in io_apoll_task_func
io_uring: remove iopoll spinlock
io_uring: iopoll protect complete_post
io_uring: inline __io_req_complete_put()
io_uring: remove io_req_tw_post_queue
io_uring: use io_req_task_complete() in timeout
io_uring: hold locks for io_req_complete_failed
io_uring: add completion locking for iopoll
io_uring: kill io_cqring_ev_posted() and __io_cq_unlock_post()
...
Linus Torvalds [Tue, 13 Dec 2022 18:29:22 +0000 (10:29 -0800)]
Merge tag 'iomap-6.2-merge-1' of git://git./fs/xfs/xfs-linux
Pull iomap update from Darrick Wong:
- Minor code cleanup to eliminate unnecessary bit shifting
* tag 'iomap-6.2-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: directly use logical block size
Linus Torvalds [Tue, 13 Dec 2022 18:26:38 +0000 (10:26 -0800)]
Merge tag 'vfs-6.2-merge-1' of git://git./fs/xfs/xfs-linux
Pull vfs remap_range update from Darrick Wong:
- Make some minor adjustments to the remap range preparation function
to skip file updates when the request length is adjusted downwards to
zero.
* tag 'vfs-6.2-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
fs/remap_range: avoid spurious writeback on zero length request
Linus Torvalds [Tue, 13 Dec 2022 18:08:36 +0000 (10:08 -0800)]
Merge tag 'fs.xattr.simple.rework.rbtree.rwlock.v6.2' of git://git./linux/kernel/git/vfs/idmapping
Pull simple-xattr updates from Christian Brauner:
"This ports the simple xattr infrastucture to rely on a simple rbtree
protected by a read-write lock instead of a linked list protected by a
spinlock.
A while ago we received reports about scaling issues for filesystems
using the simple xattr infrastructure that also support setting a
larger number of xattrs. Specifically, cgroups and tmpfs.
Both cgroupfs and tmpfs can be mounted by unprivileged users in
unprivileged containers and root in an unprivileged container can set
an unrestricted number of security.* xattrs and privileged users can
also set unlimited trusted.* xattrs. A few more words on further that
below. Other xattrs such as user.* are restricted for kernfs-based
instances to a fairly limited number.
As there are apparently users that have a fairly large number of
xattrs we should scale a bit better. Using a simple linked list
protected by a spinlock used for set, get, and list operations doesn't
scale well if users use a lot of xattrs even if it's not a crazy
number.
Let's switch to a simple rbtree protected by a rwlock. It scales way
better and gets rid of the perf issues some people reported. We
originally had fancier solutions even using an rcu+seqlock protected
rbtree but we had concerns about being to clever and also that
deletion from an rbtree with rcu+seqlock isn't entirely safe.
The rbtree plus rwlock is perfectly fine. By far the most common
operation is getting an xattr. While setting an xattr is not and
should be comparatively rare. And listxattr() often only happens when
copying xattrs between files or together with the contents to a new
file.
Holding a lock across listxattr() is unproblematic because it doesn't
list the values of xattrs. It can only be used to list the names of
all xattrs set on a file. And the number of xattr names that can be
listed with listxattr() is limited to XATTR_LIST_MAX aka 65536 bytes.
If a larger buffer is passed then vfs_listxattr() caps it to
XATTR_LIST_MAX and if more xattr names are found it will return
-E2BIG. In short, the maximum amount of memory that can be retrieved
via listxattr() is limited and thus listxattr() bounded.
Of course, the API is broken as documented on xattr(7) already. While
I have no idea how the xattr api ended up in this state we should
probably try to come up with something here at some point. An iterator
pattern similar to readdir() as an alternative to listxattr() or
something else.
Right now it is extremly strange that users can set millions of xattrs
but then can't use listxattr() to know which xattrs are actually set.
And it's really trivial to do:
for i in {1..1000000}; do setfattr -n security.$i -v $i ./file1; done
And around 5000 xattrs it's impossible to use listxattr() to figure
out which xattrs are actually set. So I have suggested that we try to
limit the number of xattrs for simple xattrs at least. But that's a
future patch and I don't consider it very urgent.
A bonus of this port to rbtree+rwlock is that we shrink the memory
consumption for users of the simple xattr infrastructure.
This also adds kernel documentation to all the functions"
* tag 'fs.xattr.simple.rework.rbtree.rwlock.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
xattr: use rbtree for simple_xattrs
Linus Torvalds [Tue, 13 Dec 2022 17:47:48 +0000 (09:47 -0800)]
Merge tag 'lsm-pr-
20221212' of git://git./linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Improve the error handling in the device cgroup such that memory
allocation failures when updating the access policy do not
potentially alter the policy.
- Some minor fixes to reiserfs to ensure that it properly releases
LSM-related xattr values.
- Update the security_socket_getpeersec_stream() LSM hook to take
sockptr_t values.
Previously the net/BPF folks updated the getsockopt code in the
network stack to leverage the sockptr_t type to make it easier to
pass both kernel and __user pointers, but unfortunately when they did
so they didn't convert the LSM hook.
While there was/is no immediate risk by not converting the LSM hook,
it seems like this is a mistake waiting to happen so this patch
proactively does the LSM hook conversion.
- Convert vfs_getxattr_alloc() to return an int instead of a ssize_t
and cleanup the callers. Internally the function was never going to
return anything larger than an int and the callers were doing some
very odd things casting the return value; this patch fixes all that
and helps bring a bit of sanity to vfs_getxattr_alloc() and its
callers.
- More verbose, and helpful, LSM debug output when the system is booted
with "lsm.debug" on the command line. There are examples in the
commit description, but the quick summary is that this patch provides
better information about which LSMs are enabled and the ordering in
which they are processed.
- General comment and kernel-doc fixes and cleanups.
* tag 'lsm-pr-
20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: Fix description of fs_context_parse_param
lsm: Add/fix return values in lsm_hooks.h and fix formatting
lsm: Clarify documentation of vm_enough_memory hook
reiserfs: Add missing calls to reiserfs_security_free()
lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths
device_cgroup: Roll back to original exceptions after copy failure
LSM: Better reporting of actual LSMs at boot
lsm: make security_socket_getpeersec_stream() sockptr_t safe
audit: Fix some kernel-doc warnings
lsm: remove obsoleted comments for security hooks
fs: edit a comment made in bad taste
Linus Torvalds [Tue, 13 Dec 2022 17:32:05 +0000 (09:32 -0800)]
Merge tag 'selinux-pr-
20221212' of git://git./linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"Two SELinux patches: one increases the sleep time on deprecated
functionality, and one removes the indirect calls in the sidtab
context conversion code"
* tag 'selinux-pr-
20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: remove the sidtab context conversion indirect calls
selinux: increase the deprecation sleep for checkreqprot and runtime disable
Linus Torvalds [Tue, 13 Dec 2022 17:20:06 +0000 (09:20 -0800)]
Merge tag 'audit-pr-
20221212' of git://git./linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"Two performance oriented patches for the audit subsystem: one
consolidates similar code to gain some caching advantages, while the
other stores a value in a stack variable to avoid repeated lookups in
a loop.
The commit descriptions have more information, including some
before/after performance measurements"
* tag 'audit-pr-
20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: unify audit_filter_{uring(), inode_name(), syscall()}
audit: cache ctx->major in audit_filter_syscall()
Linus Torvalds [Tue, 13 Dec 2022 17:14:50 +0000 (09:14 -0800)]
Merge tag 'landlock-6.2-rc1' of git://git./linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This adds file truncation support to Landlock, contributed by Günther
Noack. As described by Günther [1], the goal of these patches is to
work towards a more complete coverage of file system operations that
are restrictable with Landlock.
The known set of currently unsupported file system operations in
Landlock is described at [2]. Out of the operations listed there,
truncate is the only one that modifies file contents, so these patches
should make it possible to prevent the direct modification of file
contents with Landlock.
The new LANDLOCK_ACCESS_FS_TRUNCATE access right covers both the
truncate(2) and ftruncate(2) families of syscalls, as well as open(2)
with the O_TRUNC flag. This includes usages of creat() in the case
where existing regular files are overwritten.
Additionally, this introduces a new Landlock security blob associated
with opened files, to track the available Landlock access rights at
the time of opening the file. This is in line with Unix's general
approach of checking the read and write permissions during open(), and
associating this previously checked authorization with the opened
file. An ongoing patch documents this use case [3].
In order to treat truncate(2) and ftruncate(2) calls differently in an
LSM hook, we split apart the existing security_path_truncate hook into
security_path_truncate (for truncation by path) and
security_file_truncate (for truncation of previously opened files)"
Link: https://lore.kernel.org/r/20221018182216.301684-1-gnoack3000@gmail.com
Link: https://www.kernel.org/doc/html/v6.1/userspace-api/landlock.html#filesystem-flags
Link: https://lore.kernel.org/r/20221209193813.972012-1-mic@digikod.net
* tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
samples/landlock: Document best-effort approach for LANDLOCK_ACCESS_FS_REFER
landlock: Document Landlock's file truncation support
samples/landlock: Extend sample tool to support LANDLOCK_ACCESS_FS_TRUNCATE
selftests/landlock: Test ftruncate on FDs created by memfd_create(2)
selftests/landlock: Test FD passing from restricted to unrestricted processes
selftests/landlock: Locally define __maybe_unused
selftests/landlock: Test open() and ftruncate() in multiple scenarios
selftests/landlock: Test file truncation support
landlock: Support file truncation
landlock: Document init_layer_masks() helper
landlock: Refactor check_access_path_dual() into is_access_to_paths_allowed()
security: Create file_truncate hook from path_truncate hook
Linus Torvalds [Tue, 13 Dec 2022 17:05:19 +0000 (09:05 -0800)]
Merge tag 'dma-mapping-6.2-2022-12-13' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- reduce the swiotlb buffer size on allocation failure (Alexey
Kardashevskiy)
- clean up passing of bogus GFP flags to the dma-coherent allocator
(Christoph Hellwig)
* tag 'dma-mapping-6.2-2022-12-13' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: reject __GFP_COMP in dma_alloc_attrs
ALSA: memalloc: don't pass bogus GFP_ flags to dma_alloc_*
s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent
cnic: don't pass bogus GFP_ flags to dma_alloc_coherent
RDMA/qib: don't pass bogus GFP_ flags to dma_alloc_coherent
RDMA/hfi1: don't pass bogus GFP_ flags to dma_alloc_coherent
media: videobuf-dma-contig: use dma_mmap_coherent
swiotlb: reduce the swiotlb buffer size on allocation failure
Linus Torvalds [Tue, 13 Dec 2022 16:51:13 +0000 (08:51 -0800)]
Merge tag 'configfs-6.2-2022-12-13' of git://git.infradead.org/users/hch/configfs
Pull configfs updates from Christoph Hellwig:
- fix a memory leak in configfs_create_dir (Chen Zhongjin)
- remove mentions of committable items that were implemented (Bartosz
Golaszewski)
* tag 'configfs-6.2-2022-12-13' of git://git.infradead.org/users/hch/configfs:
configfs: remove mentions of committable items
configfs: fix possible memory leak in configfs_create_dir()
Linus Torvalds [Tue, 13 Dec 2022 16:44:41 +0000 (08:44 -0800)]
Merge tag 'nfs-for-6.2-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust
"Bugfixes:
- Fix NULL pointer dereference in the mount parser
- Fix memory stomp in decode_attr_security_label
- Fix credential leak in _nfs4_discover_trunking()
- Fix buffer leak in rpcrdma_req_create()
- Fix leaked socket in rpc_sockname()
- Fix deadlock between nfs4_open_recover_helper() and delegreturn
- Fix an Oops in nfs_d_automount()
- Fix potential race in nfs_call_unlink()
- Multiple fixes for the open context mode
- NFSv4.2 READ_PLUS fixes
- Fix a regression in which small rsize/wsize values are being
forbidden
- Fail client initialisation if the NFSv4.x state manager thread
can't run
- Avoid spurious warning of lost lock that is being unlocked.
- Ensure the initialisation of struct nfs4_label
Features and cleanups:
- Trigger the "ls -l" readdir heuristic sooner
- Clear the file access cache upon login to ensure supplementary
group info is in sync between the client and server
- pnfs: Fix up the logging of layout stateids
- NFSv4.2: Change the default KConfig value for READ_PLUS
- Use sysfs_emit() instead of scnprintf() where appropriate"
* tag 'nfs-for-6.2-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
NFSv4.2: Change the default KConfig value for READ_PLUS
NFSv4.x: Fail client initialisation if state manager thread can't run
fs: nfs: sysfs: use sysfs_emit() to instead of scnprintf()
NFS: use sysfs_emit() to instead of scnprintf()
NFS: Allow very small rsize & wsize again
NFSv4.2: Fix up READ_PLUS alignment
NFSv4.2: Set the correct size scratch buffer for decoding READ_PLUS
SUNRPC: Fix missing release socket in rpc_sockname()
xprtrdma: Fix regbuf data not freed in rpcrdma_req_create()
NFS: avoid spurious warning of lost lock that is being unlocked.
nfs: fix possible null-ptr-deref when parsing param
NFSv4: check FMODE_EXEC from open context mode in nfs4_opendata_access()
NFS: make sure open context mode have FMODE_EXEC when file open for exec
NFS4.x/pnfs: Fix up logging of layout stateids
NFS: Fix a race in nfs_call_unlink()
NFS: Fix an Oops in nfs_d_automount()
NFSv4: Fix a deadlock between nfs4_open_recover_helper() and delegreturn
NFSv4: Fix a credential leak in _nfs4_discover_trunking()
NFS: Trigger the "ls -l" readdir heuristic sooner
NFSv4.2: Fix initialisation of struct nfs4_label
...
Linus Torvalds [Tue, 13 Dec 2022 04:54:39 +0000 (20:54 -0800)]
Merge tag 'nfsd-6.2' of git://git./linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
"This release introduces support for the CB_RECALL_ANY operation. NFSD
can send this operation to request that clients return any delegations
they choose. The server uses this operation to handle low memory
scenarios or indicate to a client when that client has reached the
maximum number of delegations the server supports.
The NFSv4.2 READ_PLUS operation has been simplified temporarily whilst
support for sparse files in local filesystems and the VFS is improved.
Two major data structure fixes appear in this release:
- The nfs4_file hash table is replaced with a resizable hash table to
reduce the latency of NFSv4 OPEN operations.
- Reference counting in the NFSD filecache has been hardened against
races.
In furtherance of removing support for NFSv2 in a subsequent kernel
release, a new Kconfig option enables server-side support for NFSv2 to
be left out of a kernel build.
MAINTAINERS has been updated to indicate that changes to fs/exportfs
should go through the NFSD tree"
* tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (49 commits)
NFSD: Avoid clashing function prototypes
SUNRPC: Fix crasher in unwrap_integ_data()
SUNRPC: Make the svc_authenticate tracepoint conditional
NFSD: Use only RQ_DROPME to signal the need to drop a reply
SUNRPC: Clean up xdr_write_pages()
SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
NFSD: add CB_RECALL_ANY tracepoints
NFSD: add delegation reaper to react to low memory condition
NFSD: add support for sending CB_RECALL_ANY
NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
trace: Relocate event helper files
NFSD: pass range end to vfs_fsync_range() instead of count
lockd: fix file selection in nlmsvc_cancel_blocked
lockd: ensure we use the correct file descriptor when unlocking
lockd: set missing fl_flags field when retrieving args
NFSD: Use struct_size() helper in alloc_session()
nfsd: return error if nfs4_setacl fails
lockd: set other missing fields when unlocking files
NFSD: Add an nfsd_file_fsync tracepoint
sunrpc: svc: Remove an unused static function svc_ungetu32()
...
Linus Torvalds [Tue, 13 Dec 2022 04:47:51 +0000 (20:47 -0800)]
Merge tag 'for-6.2-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"This round there are a lot of cleanups and moved code so the diffstat
looks huge, otherwise there are some nice performance improvements and
an update to raid56 reliability.
User visible features:
- raid56 reliability vs performance trade off:
- fix destructive RMW for raid5 data (raid6 still needs work): do
full checksum verification for all data during RMW cycle, this
should prevent rewriting potentially corrupted data without
notice
- stripes are cached in memory which should reduce the performance
impact but still can hurt some workloads
- checksums are verified after repair again
- this is the last option without introducing additional features
(write intent bitmap, journal, another tree), the extra checksum
read/verification was supposed to be avoided by the original
implementation exactly for performance reasons but that caused
all the reliability problems
- discard=async by default for devices that support it
- implement emergency flush reserve to avoid almost all unnecessary
transaction aborts due to ENOSPC in cases where there are too many
delayed refs or delayed allocation
- skip block group synchronization if there's no change in used
bytes, can reduce transaction commit count for some workloads
Performance improvements:
- fiemap and lseek:
- overall speedup due to skipping unnecessary or duplicate
searches (-40% run time)
- cache some data structures and sharedness of extents (-30% run
time)
- send:
- faster backref resolution when finding clones
- cached leaf to root mapping for faster backref walking
- improved clone/sharing detection
- overall run time improvements (-70%)
Core:
- module initialization converted to a table of function pointers run
in a sequence
- preparation for fscrypt, extend passing file names across calls,
dir item can store encryption status
- raid56 updates:
- more accurate error tracking of sectors within stripe
- simplify recovery path and remove dedicated endio worker kthread
- simplify scrub call paths
- refactoring to support the extra data checksum verification
during RMW cycle
- tree block parentness checks consolidated and done at metadata read
time
- improved error handling
- cleanups:
- move a lot of code for better synchronization between kernel and
user space sources, split big files
- enum cleanups
- GFP flag cleanups
- header file cleanups, prototypes, dependencies
- redundant parameter cleanups
- inline extent handling simplifications
- inode parameter conversion
- data structure cleanups, reductions, renames, merges"
* tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (249 commits)
btrfs: print transaction aborted messages with an error level
btrfs: sync some cleanups from progs into uapi/btrfs.h
btrfs: do not BUG_ON() on ENOMEM when dropping extent items for a range
btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
btrfs: remove outdated logic from overwrite_item() and add assertion
btrfs: unify overwrite_item() and do_overwrite_item()
btrfs: replace strncpy() with strscpy()
btrfs: fix uninitialized variable in find_first_clear_extent_bit
btrfs: fix uninitialized parent in insert_state
btrfs: add might_sleep() annotations
btrfs: add stack helpers for a few btrfs items
btrfs: add nr_global_roots to the super block definition
btrfs: remove BTRFS_LEAF_DATA_OFFSET
btrfs: add helpers for manipulating leaf items and data
btrfs: add eb to btrfs_node_key_ptr_offset
btrfs: pass the extent buffer for the btrfs_item_nr helpers
btrfs: move the csum helpers into ctree.h
btrfs: move eb offset helpers into extent_io.h
btrfs: move file_extent_item helpers into file-item.h
btrfs: move leaf_data_end into ctree.c
...
Linus Torvalds [Tue, 13 Dec 2022 04:41:50 +0000 (20:41 -0800)]
Merge tag 'dlm-6.2' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"These patches include the usual cleanups and minor fixes, the removal
of code that is no longer needed due to recent improvements, and
improvements to processing large volumes of messages during heavy
locking activity.
Summary:
- Misc code cleanup
- Fix a couple of socket handling bugs: a double release on an error
path and a data-ready race in an accept loop
- Remove code for resending dir-remove messages. This code is no
longer needed since the midcomms layer now ensures the messages are
resent if needed
- Add tracepoints for dlm messages
- Improve callback queueing by replacing the fixed array with a list
- Simplify the handling of a remove message followed by a lookup
message by sending both without releasing a spinlock in between
- Improve the concurrency of sending and receiving messages by
holding locks for a shorter time, and changing how workqueues are
used
- Remove old code for shutting down sockets, which is no longer
needed with the reliable connection handling that was recently
added"
* tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (37 commits)
fs: dlm: fix building without lockdep
fs: dlm: parallelize lowcomms socket handling
fs: dlm: don't init error value
fs: dlm: use saved sk_error_report()
fs: dlm: use sock2con without checking null
fs: dlm: remove dlm_node_addrs lookup list
fs: dlm: don't put dlm_local_addrs on heap
fs: dlm: cleanup listen sock handling
fs: dlm: remove socket shutdown handling
fs: dlm: use listen sock as dlm running indicator
fs: dlm: use list_first_entry_or_null
fs: dlm: remove twice INIT_WORK
fs: dlm: add midcomms init/start functions
fs: dlm: add dst nodeid for msg tracing
fs: dlm: rename seq to h_seq for msg tracing
fs: dlm: rename DLM_IFL_NEED_SCHED to DLM_IFL_CB_PENDING
fs: dlm: ast do WARN_ON_ONCE() on hotpath
fs: dlm: drop lkb ref in bug case
fs: dlm: avoid false-positive checker warning
fs: dlm: use WARN_ON_ONCE() instead of WARN_ON()
...
Linus Torvalds [Tue, 13 Dec 2022 04:38:28 +0000 (20:38 -0800)]
Merge tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy
Pull jfs updates from David Kleikamp:
"Assorted JFS fixes for 6.2"
* tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy:
jfs: makes diUnmount/diMount in jfs_mount_rw atomic
jfs: Fix a typo in function jfs_umount
fs: jfs: fix shift-out-of-bounds in dbDiscardAG
jfs: Fix fortify moan in symlink
jfs: remove redundant assignments to ipaimap and ipaimap2
jfs: remove unused declarations for jfs
fs/jfs/jfs_xattr.h: Fix spelling typo in comment
MAINTAINERS: git://github -> https://github.com for kleikamp
fs/jfs: replace ternary operator with min_t()
fs: jfs: fix shift-out-of-bounds in dbAllocAG
Linus Torvalds [Tue, 13 Dec 2022 04:32:50 +0000 (20:32 -0800)]
Merge tag 'fixes_for_v6.2-rc1' of git://git./linux/kernel/git/jack/linux-fs
Pull udf and ext2 fixes from Jan Kara:
- a couple of smaller cleanups and fixes for ext2
- fixes of a data corruption issues in udf when handling holes and
preallocation extents
- fixes and cleanups of several smaller issues in udf
- add maintainer entry for isofs
* tag 'fixes_for_v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix extending file within last block
udf: Discard preallocation before extending file with a hole
udf: Do not bother looking for prealloc extents if i_lenExtents matches i_size
udf: Fix preallocation discarding at indirect extent boundary
udf: Increase UDF_MAX_READ_VERSION to 0x0260
fs/ext2: Fix code indentation
ext2: unbugger ext2_empty_dir()
udf: remove ->writepage
ext2: remove ->writepage
ext2: Don't flush page immediately for DIRSYNC directories
ext2: Fix some kernel-doc warnings
maintainers: Add ISOFS entry
udf: Avoid double brelse() in udf_rename()
fs: udf: Optimize udf_free_in_core_inode and udf_find_fileset function
Linus Torvalds [Tue, 13 Dec 2022 04:29:45 +0000 (20:29 -0800)]
Merge tag 'fs.xattr.simple.noaudit.v6.2' of git://git./linux/kernel/git/vfs/idmapping
Pull xattr audit fix from Seth Forshee:
"This is a single patch to remove auditing of the capability check in
simple_xattr_list().
This check is done to check whether trusted xattrs should be included
by listxattr(2). SELinux will normally log a denial when capable() is
called and the task's SELinux context doesn't have the corresponding
capability permission allowed, which can end up spamming the log.
Since a failed check here cannot be used to infer malicious intent,
auditing is of no real value, and it makes sense to stop auditing the
capability check"
* tag 'fs.xattr.simple.noaudit.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fs: don't audit the capability check in simple_xattr_list()
Linus Torvalds [Tue, 13 Dec 2022 04:24:51 +0000 (20:24 -0800)]
Merge tag 'fs.idmapped.squashfs.v6.2' of git://git./linux/kernel/git/vfs/idmapping
Pull squashfs update from Seth Forshee:
"This is a simple patch to enable idmapped mounts for squashfs.
All functionality squashfs needs to support idmapped mounts is already
implemented in generic VFS code, so all that is needed is to set
FS_ALLOW_IDMAP in fs_flags"
* tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
squashfs: enable idmapped mounts
Linus Torvalds [Tue, 13 Dec 2022 04:22:09 +0000 (20:22 -0800)]
Merge tag 'fuse-update-6.2' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse update from Miklos Szeredi:
- Allow some write requests to proceed in parallel
- Fix a performance problem with allow_sys_admin_access
- Add a special kind of invalidation that doesn't immediately purge
submounts
- On revalidation treat the target of rename(RENAME_NOREPLACE) the same
as open(O_EXCL)
- Use type safe helpers for some mnt_userns transformations
* tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: Rearrange fuse_allow_current_process checks
fuse: allow non-extending parallel direct writes on the same file
fuse: remove the unneeded result variable
fuse: port to vfs{g,u}id_t and associated helpers
fuse: Remove user_ns check for FUSE_DEV_IOC_CLONE
fuse: always revalidate rename target dentry
fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY
fs/fuse: Replace kmap() with kmap_local_page()
Linus Torvalds [Tue, 13 Dec 2022 04:18:26 +0000 (20:18 -0800)]
Merge tag 'ovl-update-6.2' of git://git./linux/kernel/git/mszeredi/vfs
Pull overlayfs update from Miklos Szeredi:
- Fix a couple of bugs found by syzbot
- Don't ingore some open flags set by fcntl(F_SETFL)
- Fix failure to create a hard link in certain cases
- Use type safe helpers for some mnt_userns transformations
- Improve performance of mount
- Misc cleanups
* tag 'ovl-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: Kconfig: Fix spelling mistake "undelying" -> "underlying"
ovl: use inode instead of dentry where possible
ovl: Add comment on upperredirect reassignment
ovl: use plain list filler in indexdir and workdir cleanup
ovl: do not reconnect upper index records in ovl_indexdir_cleanup()
ovl: fix comment typos
ovl: port to vfs{g,u}id_t and associated helpers
ovl: Use ovl mounter's fsuid and fsgid in ovl_link()
ovl: Use "buf" flexible array for memcpy() destination
ovl: update ->f_iocb_flags when ovl_change_flags() modifies ->f_flags
ovl: fix use inode directly in rcu-walk mode
Linus Torvalds [Tue, 13 Dec 2022 04:14:04 +0000 (20:14 -0800)]
Merge tag 'erofs-for-6.2-rc1' of git://git./linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"In this cycle, large folios are now enabled in the iomap/fscache mode
for uncompressed files first. In order to do that, we've also cleaned
up better interfaces between erofs and fscache, which are acked by
fscache/netfs folks and included in this pull request.
Other than that, there are random fixes around erofs over fscache and
crafted images by syzbot, minor cleanups and documentation updates.
Summary:
- Enable large folios for iomap/fscache mode
- Avoid sysfs warning due to mounting twice with the same fsid and
domain_id in fscache mode
- Refine fscache interface among erofs, fscache, and cachefiles
- Use kmap_local_page() only for metabuf
- Fixes around crafted images found by syzbot
- Minor cleanups and documentation updates"
* tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: validate the extent length for uncompressed pclusters
erofs: fix missing unmap if z_erofs_get_extent_compressedlen() fails
erofs: Fix pcluster memleak when its block address is zero
erofs: use kmap_local_page() only for erofs_bread()
erofs: enable large folios for fscache mode
erofs: support large folios for fscache mode
erofs: switch to prepare_ondemand_read() in fscache mode
fscache,cachefiles: add prepare_ondemand_read() callback
erofs: clean up cached I/O strategies
erofs: update documentation
erofs: check the uniqueness of fsid in shared domain in advance
erofs: enable large folios for iomap mode
Linus Torvalds [Tue, 13 Dec 2022 04:06:35 +0000 (20:06 -0800)]
Merge tag 'fsverity-for-linus' of git://git./fs/fscrypt/fscrypt
Pull fsverity updates from Eric Biggers:
"The main change this cycle is to stop using the PG_error flag to track
verity failures, and instead just track failures at the bio level.
This follows a similar fscrypt change that went into 6.1, and it is a
step towards freeing up PG_error for other uses.
There's also one other small cleanup"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fsverity: simplify fsverity_get_digest()
fsverity: stop using PG_error to track error status
Linus Torvalds [Tue, 13 Dec 2022 04:03:50 +0000 (20:03 -0800)]
Merge tag 'fscrypt-for-linus' of git://git./fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
"This release adds SM4 encryption support, contributed by Tianjia
Zhang. SM4 is a Chinese block cipher that is an alternative to AES.
I recommend against using SM4, but (according to Tianjia) some people
are being required to use it. Since SM4 has been turning up in many
other places (crypto API, wireless, TLS, OpenSSL, ARMv8 CPUs, etc.),
it hasn't been very controversial, and some people have to use it, I
don't think it would be fair for me to reject this optional feature.
Besides the above, there are a couple cleanups"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: add additional documentation for SM4 support
fscrypt: remove unused Speck definitions
fscrypt: Add SM4 XTS/CTS symmetric algorithm support
blk-crypto: Add support for SM4-XTS blk crypto mode
fscrypt: add comment for fscrypt_valid_enc_modes_v1()
fscrypt: pass super_block to fscrypt_put_master_key_activeref()
Linus Torvalds [Tue, 13 Dec 2022 03:56:37 +0000 (19:56 -0800)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"A large number of cleanups and bug fixes, with many of the bug fixes
found by Syzbot and fuzzing. (Many of the bug fixes involve less-used
ext4 features such as fast_commit, inline_data and bigalloc)
In addition, remove the writepage function for ext4, since the
medium-term plan is to remove ->writepage() entirely. (The VM doesn't
need or want writepage() for writeback, since it is fine with
->writepages() so long as ->migrate_folio() is implemented)"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
ext4: fix reserved cluster accounting in __es_remove_extent()
ext4: fix inode leak in ext4_xattr_inode_create() on an error path
ext4: allocate extended attribute value in vmalloc area
ext4: avoid unaccounted block allocation when expanding inode
ext4: initialize quota before expanding inode in setproject ioctl
ext4: stop providing .writepage hook
mm: export buffer_migrate_folio_norefs()
ext4: switch to using write_cache_pages() for data=journal writeout
jbd2: switch jbd2_submit_inode_data() to use fs-provided hook for data writeout
ext4: switch to using ext4_do_writepages() for ordered data writeout
ext4: move percpu_rwsem protection into ext4_writepages()
ext4: provide ext4_do_writepages()
ext4: add support for writepages calls that cannot map blocks
ext4: drop pointless IO submission from ext4_bio_write_page()
ext4: remove nr_submitted from ext4_bio_write_page()
ext4: move keep_towrite handling to ext4_bio_write_page()
ext4: handle redirtying in ext4_bio_write_page()
ext4: fix kernel BUG in 'ext4_write_inline_data_end()'
ext4: make ext4_mb_initialize_context return void
ext4: fix deadlock due to mbcache entry corruption
...
Linus Torvalds [Tue, 13 Dec 2022 03:30:18 +0000 (19:30 -0800)]
Merge tag 'fs.idmapped.mnt_idmap.v6.2' of git://git./linux/kernel/git/vfs/idmapping
Pull idmapping updates from Christian Brauner:
"Last cycle we've already made the interaction with idmapped mounts
more robust and type safe by introducing the vfs{g,u}id_t type. This
cycle we concluded the conversion and removed the legacy helpers.
Currently we still pass around the plain namespace that was attached
to a mount. This is in general pretty convenient but it makes it easy
to conflate namespaces that are relevant on the filesystem - with
namespaces that are relevent on the mount level. Especially for
filesystem developers without detailed knowledge in this area this can
be a potential source for bugs.
Instead of passing the plain namespace we introduce a dedicated type
struct mnt_idmap and replace the pointer with a pointer to a struct
mnt_idmap. There are no semantic or size changes for the mount struct
caused by this.
We then start converting all places aware of idmapped mounts to rely
on struct mnt_idmap. Once the conversion is done all helpers down to
the really low-level make_vfs{g,u}id() and from_vfs{g,u}id() will take
a struct mnt_idmap argument instead of two namespace arguments. This
way it becomes impossible to conflate the two removing and thus
eliminating the possibility of any bugs. Fwiw, I fixed some issues in
that area a while ago in ntfs3 and ksmbd in the past. Afterwards only
low-level code can ultimately use the associated namespace for any
permission checks. Even most of the vfs can be completely obivious
about this ultimately and filesystems will never interact with it in
any form in the future.
A struct mnt_idmap currently encompasses a simple refcount and pointer
to the relevant namespace the mount is idmapped to. If a mount isn't
idmapped then it will point to a static nop_mnt_idmap and if it
doesn't that it is idmapped. As usual there are no allocations or
anything happening for non-idmapped mounts. Everthing is carefully
written to be a nop for non-idmapped mounts as has always been the
case.
If an idmapped mount is created a struct mnt_idmap is allocated and a
reference taken on the relevant namespace. Each mount that gets
idmapped or inherits the idmap simply bumps the reference count on
struct mnt_idmap. Just a reminder that we only allow a mount to change
it's idmapping a single time and only if it hasn't already been
attached to the filesystems and has no active writers.
The actual changes are fairly straightforward but this will have huge
benefits for maintenance and security in the long run even if it
causes some churn.
Note that this also makes it possible to extend struct mount_idmap in
the future. For example, it would be possible to place the namespace
pointer in an anonymous union together with an idmapping struct. This
would allow us to expose an api to userspace that would let it specify
idmappings directly instead of having to go through the detour of
setting up namespaces at all"
* tag 'fs.idmapped.mnt_idmap.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
acl: conver higher-level helpers to rely on mnt_idmap
fs: introduce dedicated idmap type for mounts
Linus Torvalds [Tue, 13 Dec 2022 03:20:05 +0000 (19:20 -0800)]
Merge tag 'fs.vfsuid.conversion.v6.2' of git://git./linux/kernel/git/vfs/idmapping
Pull vfsuid updates from Christian Brauner:
"Last cycle we introduced the vfs{g,u}id_t types and associated helpers
to gain type safety when dealing with idmapped mounts. That initial
work already converted a lot of places over but there were still some
left,
This converts all remaining places that still make use of non-type
safe idmapping helpers to rely on the new type safe vfs{g,u}id based
helpers.
Afterwards it removes all the old non-type safe helpers"
* tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fs: remove unused idmapping helpers
ovl: port to vfs{g,u}id_t and associated helpers
fuse: port to vfs{g,u}id_t and associated helpers
ima: use type safe idmapping helpers
apparmor: use type safe idmapping helpers
caps: use type safe idmapping helpers
fs: use type safe idmapping helpers
mnt_idmapping: add missing helpers
Linus Torvalds [Tue, 13 Dec 2022 03:03:10 +0000 (19:03 -0800)]
Merge tag 'fs.ovl.setgid.v6.2' of git://git./linux/kernel/git/vfs/idmapping
Pull setgid inheritance updates from Christian Brauner:
"This contains the work to make setgid inheritance consistent between
modifying a file and when changing ownership or mode as this has been
a repeated source of very subtle bugs. The gist is that we perform the
same permission checks in the write path as we do in the ownership and
mode changing paths after this series where we're currently doing
different things.
We've already made setgid inheritance a lot more consistent and
reliable in the last releases by moving setgid stripping from the
individual filesystems up into the vfs. This aims to make the logic
even more consistent and easier to understand and also to fix
long-standing overlayfs setgid inheritance bugs. Miklos was nice
enough to just let me carry the trivial overlayfs patches from Amir
too.
Below is a more detailed explanation how the current difference in
setgid handling lead to very subtle bugs exemplified via overlayfs
which is a victim of the current rules. I hope this explains why I
think taking the regression risk here is worth it.
A long while ago I found a few setgid inheritance bugs in overlayfs in
the write path in certain conditions. Amir recently picked this back
up in [1] and I jumped on board to fix this more generally.
On the surface all that overlayfs would need to fix setgid inheritance
would be to call file_remove_privs() or file_modified() but actually
that isn't enough because the setgid inheritance api is wildly
inconsistent in that area.
Before this pr setgid stripping in file_remove_privs()'s old
should_remove_suid() helper was inconsistent with other parts of the
vfs. Specifically, it only raises ATTR_KILL_SGID if the inode is
S_ISGID and S_IXGRP but not if the inode isn't in the caller's groups
and the caller isn't privileged over the inode although we require
this already in setattr_prepare() and setattr_copy() and so all
filesystem implement this requirement implicitly because they have to
use setattr_{prepare,copy}() anyway.
But the inconsistency shows up in setgid stripping bugs for overlayfs
in xfstests (e.g., generic/673, generic/683, generic/685, generic/686,
generic/687). For example, we test whether suid and setgid stripping
works correctly when performing various write-like operations as an
unprivileged user (fallocate, reflink, write, etc.):
echo "Test 1 - qa_user, non-exec file $verb"
setup_testfile
chmod a+rws $junk_file
commit_and_check "$qa_user" "$verb" 64k 64k
The test basically creates a file with 6666 permissions. While the
file has the S_ISUID and S_ISGID bits set it does not have the S_IXGRP
set.
On a regular filesystem like xfs what will happen is:
sys_fallocate()
-> vfs_fallocate()
-> xfs_file_fallocate()
-> file_modified()
-> __file_remove_privs()
-> dentry_needs_remove_privs()
-> should_remove_suid()
-> __remove_privs()
newattrs.ia_valid = ATTR_FORCE | kill;
-> notify_change()
-> setattr_copy()
In should_remove_suid() we can see that ATTR_KILL_SUID is raised
unconditionally because the file in the test has S_ISUID set.
But we also see that ATTR_KILL_SGID won't be set because while the
file is S_ISGID it is not S_IXGRP (see above) which is a condition for
ATTR_KILL_SGID being raised.
So by the time we call notify_change() we have attr->ia_valid set to
ATTR_KILL_SUID | ATTR_FORCE.
Now notify_change() sees that ATTR_KILL_SUID is set and does:
ia_valid = attr->ia_valid |= ATTR_MODE
attr->ia_mode = (inode->i_mode & ~S_ISUID);
which means that when we call setattr_copy() later we will definitely
update inode->i_mode. Note that attr->ia_mode still contains S_ISGID.
Now we call into the filesystem's ->setattr() inode operation which
will end up calling setattr_copy(). Since ATTR_MODE is set we will
hit:
if (ia_valid & ATTR_MODE) {
umode_t mode = attr->ia_mode;
vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode);
if (!vfsgid_in_group_p(vfsgid) &&
!capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
mode &= ~S_ISGID;
inode->i_mode = mode;
}
and since the caller in the test is neither capable nor in the group
of the inode the S_ISGID bit is stripped.
But assume the file isn't suid then ATTR_KILL_SUID won't be raised
which has the consequence that neither the setgid nor the suid bits
are stripped even though it should be stripped because the inode isn't
in the caller's groups and the caller isn't privileged over the inode.
If overlayfs is in the mix things become a bit more complicated and
the bug shows up more clearly.
When e.g., ovl_setattr() is hit from ovl_fallocate()'s call to
file_remove_privs() then ATTR_KILL_SUID and ATTR_KILL_SGID might be
raised but because the check in notify_change() is questioning the
ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be stripped
the S_ISGID bit isn't removed even though it should be stripped:
sys_fallocate()
-> vfs_fallocate()
-> ovl_fallocate()
-> file_remove_privs()
-> dentry_needs_remove_privs()
-> should_remove_suid()
-> __remove_privs()
newattrs.ia_valid = ATTR_FORCE | kill;
-> notify_change()
-> ovl_setattr()
/* TAKE ON MOUNTER'S CREDS */
-> ovl_do_notify_change()
-> notify_change()
/* GIVE UP MOUNTER'S CREDS */
/* TAKE ON MOUNTER'S CREDS */
-> vfs_fallocate()
-> xfs_file_fallocate()
-> file_modified()
-> __file_remove_privs()
-> dentry_needs_remove_privs()
-> should_remove_suid()
-> __remove_privs()
newattrs.ia_valid = attr_force | kill;
-> notify_change()
The fix for all of this is to make file_remove_privs()'s
should_remove_suid() helper perform the same checks as we already
require in setattr_prepare() and setattr_copy() and have
notify_change() not pointlessly requiring S_IXGRP again. It doesn't
make any sense in the first place because the caller must calculate
the flags via should_remove_suid() anyway which would raise
ATTR_KILL_SGID
Note that some xfstests will now fail as these patches will cause the
setgid bit to be lost in certain conditions for unprivileged users
modifying a setgid file when they would've been kept otherwise. I
think this risk is worth taking and I explained and mentioned this
multiple times on the list [2].
Enforcing the rules consistently across write operations and
chmod/chown will lead to losing the setgid bit in cases were it
might've been retained before.
While I've mentioned this a few times but it's worth repeating just to
make sure that this is understood. For the sake of maintainability,
consistency, and security this is a risk worth taking.
If we really see regressions for workloads the fix is to have special
setgid handling in the write path again with different semantics from
chmod/chown and possibly additional duct tape for overlayfs. I'll
update the relevant xfstests with if you should decide to merge this
second setgid cleanup.
Before that people should be aware that there might be failures for
fstests where unprivileged users modify a setgid file"
Link: https://lore.kernel.org/linux-fsdevel/20221003123040.900827-1-amir73il@gmail.com
Link: https://lore.kernel.org/linux-fsdevel/20221122142010.zchf2jz2oymx55qi@wittgenstein
* tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fs: use consistent setgid checks in is_sxid()
ovl: remove privs in ovl_fallocate()
ovl: remove privs in ovl_copyfile()
attr: use consistent sgid stripping checks
attr: add setattr_should_drop_sgid()
fs: move should_remove_suid()
attr: add in_group_or_capable()
Linus Torvalds [Tue, 13 Dec 2022 02:46:39 +0000 (18:46 -0800)]
Merge tag 'fs.acl.rework.v6.2' of git://git./linux/kernel/git/vfs/idmapping
Pull VFS acl updates from Christian Brauner:
"This contains the work that builds a dedicated vfs posix acl api.
The origins of this work trace back to v5.19 but it took quite a while
to understand the various filesystem specific implementations in
sufficient detail and also come up with an acceptable solution.
As we discussed and seen multiple times the current state of how posix
acls are handled isn't nice and comes with a lot of problems: The
current way of handling posix acls via the generic xattr api is error
prone, hard to maintain, and type unsafe for the vfs until we call
into the filesystem's dedicated get and set inode operations.
It is already the case that posix acls are special-cased to death all
the way through the vfs. There are an uncounted number of hacks that
operate on the uapi posix acl struct instead of the dedicated vfs
struct posix_acl. And the vfs must be involved in order to interpret
and fixup posix acls before storing them to the backing store, caching
them, reporting them to userspace, or for permission checking.
Currently a range of hacks and duct tape exist to make this work. As
with most things this is really no ones fault it's just something that
happened over time. But the code is hard to understand and difficult
to maintain and one is constantly at risk of introducing bugs and
regressions when having to touch it.
Instead of continuing to hack posix acls through the xattr handlers
this series builds a dedicated posix acl api solely around the get and
set inode operations.
Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl()
helpers must be used in order to interact with posix acls. They
operate directly on the vfs internal struct posix_acl instead of
abusing the uapi posix acl struct as we currently do. In the end this
removes all of the hackiness, makes the codepaths easier to maintain,
and gets us type safety.
This series passes the LTP and xfstests suites without any
regressions. For xfstests the following combinations were tested:
- xfs
- ext4
- btrfs
- overlayfs
- overlayfs on top of idmapped mounts
- orangefs
- (limited) cifs
There's more simplifications for posix acls that we can make in the
future if the basic api has made it.
A few implementation details:
- The series makes sure to retain exactly the same security and
integrity module permission checks. Especially for the integrity
modules this api is a win because right now they convert the uapi
posix acl struct passed to them via a void pointer into the vfs
struct posix_acl format to perform permission checking on the mode.
There's a new dedicated security hook for setting posix acls which
passes the vfs struct posix_acl not a void pointer. Basing checking
on the posix acl stored in the uapi format is really unreliable.
The vfs currently hacks around directly in the uapi struct storing
values that frankly the security and integrity modules can't
correctly interpret as evidenced by bugs we reported and fixed in
this area. It's not necessarily even their fault it's just that the
format we provide to them is sub optimal.
- Some filesystems like 9p and cifs need access to the dentry in
order to get and set posix acls which is why they either only
partially or not even at all implement get and set inode
operations. For example, cifs allows setxattr() and getxattr()
operations but doesn't allow permission checking based on posix
acls because it can't implement a get acl inode operation.
Thus, this patch series updates the set acl inode operation to take
a dentry instead of an inode argument. However, for the get acl
inode operation we can't do this as the old get acl method is
called in e.g., generic_permission() and inode_permission(). These
helpers in turn are called in various filesystem's permission inode
operation. So passing a dentry argument to the old get acl inode
operation would amount to passing a dentry to the permission inode
operation which we shouldn't and probably can't do.
So instead of extending the existing inode operation Christoph
suggested to add a new one. He also requested to ensure that the
get and set acl inode operation taking a dentry are consistently
named. So for this version the old get acl operation is renamed to
->get_inode_acl() and a new ->get_acl() inode operation taking a
dentry is added. With this we can give both 9p and cifs get and set
acl inode operations and in turn remove their complex custom posix
xattr handlers.
In the future I hope to get rid of the inode method duplication but
it isn't like we have never had this situation. Readdir is just one
example. And frankly, the overall gain in type safety and the more
pleasant api wise are simply too big of a benefit to not accept
this duplication for a while.
- We've done a full audit of every codepaths using variant of the
current generic xattr api to get and set posix acls and
surprisingly it isn't that many places. There's of course always a
chance that we might have missed some and if so I'm sure we'll find
them soon enough.
The crucial codepaths to be converted are obviously stacking
filesystems such as ecryptfs and overlayfs.
For a list of all callers currently using generic xattr api helpers
see [2] including comments whether they support posix acls or not.
- The old vfs generic posix acl infrastructure doesn't obey the
create and replace semantics promised on the setxattr(2) manpage.
This patch series doesn't address this. It really is something we
should revisit later though.
The patches are roughly organized as follows:
(1) Change existing set acl inode operation to take a dentry
argument (Intended to be a non-functional change)
(2) Rename existing get acl method (Intended to be a non-functional
change)
(3) Implement get and set acl inode operations for filesystems that
couldn't implement one before because of the missing dentry.
That's mostly 9p and cifs (Intended to be a non-functional
change)
(4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(),
and vfs_set_acl() including security and integrity hooks
(Intended to be a non-functional change)
(5) Implement get and set acl inode operations for stacking
filesystems (Intended to be a non-functional change)
(6) Switch posix acl handling in stacking filesystems to new posix
acl api now that all filesystems it can stack upon support it.
(7) Switch vfs to new posix acl api (semantical change)
(8) Remove all now unused helpers
(9) Additional regression fixes reported after we merged this into
linux-next
Thanks to Seth for a lot of good discussion around this and
encouragement and input from Christoph"
* tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits)
posix_acl: Fix the type of sentinel in get_acl
orangefs: fix mode handling
ovl: call posix_acl_release() after error checking
evm: remove dead code in evm_inode_set_acl()
cifs: check whether acl is valid early
acl: make vfs_posix_acl_to_xattr() static
acl: remove a slew of now unused helpers
9p: use stub posix acl handlers
cifs: use stub posix acl handlers
ovl: use stub posix acl handlers
ecryptfs: use stub posix acl handlers
evm: remove evm_xattr_acl_change()
xattr: use posix acl api
ovl: use posix acl api
ovl: implement set acl method
ovl: implement get acl method
ecryptfs: implement set acl method
ecryptfs: implement get acl method
ksmbd: use vfs_remove_acl()
acl: add vfs_remove_acl()
...
Linus Torvalds [Tue, 13 Dec 2022 02:38:47 +0000 (18:38 -0800)]
Merge tag 'pull-misc' of git://git./linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"misc pile"
* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: sysv: Fix sysv_nblocks() returns wrong value
get rid of INT_LIMIT, use type_max() instead
btrfs: replace INT_LIMIT(loff_t) with OFFSET_MAX
fs: simplify vfs_get_super
fs: drop useless condition from inode_needs_update_time
Linus Torvalds [Tue, 13 Dec 2022 02:36:12 +0000 (18:36 -0800)]
Merge tag 'pull-namespace' of git://git./linux/kernel/git/viro/vfs
Pull namespace fix from Al Viro:
"Fix weird corner case in copy_mnt_ns()"
* tag 'pull-namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
copy_mnt_ns(): handle a corner case (overmounted mntns bindings) saner
Linus Torvalds [Tue, 13 Dec 2022 02:29:54 +0000 (18:29 -0800)]
Merge tag 'pull-iov_iter' of git://git./linux/kernel/git/viro/vfs
Pull iov_iter updates from Al Viro:
"iov_iter work; most of that is about getting rid of direction
misannotations and (hopefully) preventing more of the same for the
future"
* tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
use less confusing names for iov_iter direction initializers
iov_iter: saner checks for attempt to copy to/from iterator
[xen] fix "direction" argument of iov_iter_kvec()
[vhost] fix 'direction' argument of iov_iter_{init,bvec}()
[target] fix iov_iter_bvec() "direction" argument
[s390] memcpy_real(): WRITE is "data source", not destination...
[s390] zcore: WRITE is "data source", not destination...
[infiniband] READ is "data destination", not source...
[fsi] WRITE is "data source", not destination...
[s390] copy_oldmem_kernel() - WRITE is "data source", not destination
csum_and_copy_to_iter(): handle ITER_DISCARD
get rid of unlikely() on page_copy_sane() calls
Linus Torvalds [Tue, 13 Dec 2022 02:27:06 +0000 (18:27 -0800)]
Merge tag 'pull-alpha' of git://git./linux/kernel/git/viro/vfs
Pull alpha updates from Al Viro:
"Alpha architecture cleanups and fixes.
One thing *not* included is lazy FPU switching stuff - this pile is
just the straightforward stuff"
* tag 'pull-alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
alpha: ret_from_fork can go straight to ret_to_user
alpha: syscall exit cleanup
alpha: fix handling of a3 on straced syscalls
alpha: fix syscall entry in !AUDUT_SYSCALL case
alpha: _TIF_ALLWORK_MASK is unused
alpha: fix TIF_NOTIFY_SIGNAL handling
Linus Torvalds [Tue, 13 Dec 2022 02:18:34 +0000 (18:18 -0800)]
Merge tag 'pull-elfcore' of git://git./linux/kernel/git/viro/vfs
Pull elf coredumping updates from Al Viro:
"Unification of regset and non-regset sides of ELF coredump handling.
Collecting per-thread register values is the only thing that needs to
be ifdefed there..."
* tag 'pull-elfcore' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[elf] get rid of get_note_info_size()
[elf] unify regset and non-regset cases
[elf][non-regset] use elf_core_copy_task_regs() for dumper as well
[elf][non-regset] uninline elf_core_copy_task_fpregs() (and lose pt_regs argument)
elf_core_copy_task_regs(): task_pt_regs is defined everywhere
[elf][regset] simplify thread list handling in fill_note_info()
[elf][regset] clean fill_note_info() a bit
kill extern of vsyscall32_sysctl
kill coredump_params->regs
kill signal_pt_regs()
Linus Torvalds [Tue, 13 Dec 2022 01:28:58 +0000 (17:28 -0800)]
Merge tag 'mm-nonmm-stable-2022-12-12' of git://git./linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- A ptrace API cleanup series from Sergey Shtylyov
- Fixes and cleanups for kexec from ye xingchen
- nilfs2 updates from Ryusuke Konishi
- squashfs feature work from Xiaoming Ni: permit configuration of the
filesystem's compression concurrency from the mount command line
- A series from Akinobu Mita which addresses bound checking errors when
writing to debugfs files
- A series from Yang Yingliang to address rapidio memory leaks
- A series from Zheng Yejian to address possible overflow errors in
encode_comp_t()
- And a whole shower of singleton patches all over the place
* tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits)
ipc: fix memory leak in init_mqueue_fs()
hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
rapidio: devices: fix missing put_device in mport_cdev_open
kcov: fix spelling typos in comments
hfs: Fix OOB Write in hfs_asc2mac
hfs: fix OOB Read in __hfs_brec_find
relay: fix type mismatch when allocating memory in relay_create_buf()
ocfs2: always read both high and low parts of dinode link count
io-mapping: move some code within the include guarded section
kernel: kcsan: kcsan_test: build without structleak plugin
mailmap: update email for Iskren Chernev
eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
rapidio: fix possible UAF when kfifo_alloc() fails
relay: use strscpy() is more robust and safer
cpumask: limit visibility of FORCE_NR_CPUS
acct: fix potential integer overflow in encode_comp_t()
acct: fix accuracy loss for input value of encode_comp_t()
linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>
rapidio: rio: fix possible name leak in rio_register_mport()
rapidio: fix possible name leaks when rio_add_device() fails
...
Linus Torvalds [Tue, 13 Dec 2022 01:18:50 +0000 (17:18 -0800)]
Merge tag 'docs-6.2' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"This was a not-too-busy cycle for documentation; highlights include:
- The beginnings of a set of translations into Spanish, headed up by
Carlos Bilbao
- More Chinese translations
- A change to the Sphinx "alabaster" theme by default for HTML
generation.
Unlike the previous default (Read the Docs), alabaster is shipped
with Sphinx by default, reducing the number of other dependencies
that need to be installed. It also (IMO) produces a cleaner and
more readable result.
- The ability to render the documentation into the texinfo format
(something Sphinx could always do, we just never wired it up until
now)
Plus the usual collection of typo fixes, build-warning fixes, and
minor updates"
* tag 'docs-6.2' of git://git.lwn.net/linux: (67 commits)
Documentation/features: Use loongarch instead of loong
Documentation/features-refresh.sh: Only sed the beginning "arch" of ARCH_DIR
docs/zh_CN: Fix '.. only::' directive's expression
docs/sp_SP: Add memory-barriers.txt Spanish translation
docs/zh_CN/LoongArch: Update links of LoongArch ISA Vol1 and ELF psABI
docs/LoongArch: Update links of LoongArch ISA Vol1 and ELF psABI
Documentation/features: Update feature lists for 6.1
Documentation: Fixed a typo in bootconfig.rst
docs/sp_SP: Add process coding-style translation
docs/sp_SP: Add kernel-docs.rst Spanish translation
docs: Create translations/sp_SP/process/, move submitting-patches.rst
docs: Add book to process/kernel-docs.rst
docs: Retire old resources from kernel-docs.rst
docs: Update maintainer of kernel-docs.rst
Documentation: riscv: Document the sv57 VM layout
Documentation: USB: correct possessive "its" usage
math64: fix kernel-doc return value warnings
math64: add kernel-doc for DIV64_U64_ROUND_UP
math64: favor kernel-doc from header files
doc: add texinfodocs and infodocs targets
...
Linus Torvalds [Tue, 13 Dec 2022 00:59:00 +0000 (16:59 -0800)]
Merge tag 'rust-6.2' of https://github.com/Rust-for-Linux/linux
Pull rust updates from Miguel Ojeda:
"The first set of changes after the merge, the major ones being:
- String and formatting: new types 'CString', 'CStr', 'BStr' and
'Formatter'; new macros 'c_str!', 'b_str!' and 'fmt!'.
- Errors: the rest of the error codes from 'errno-base.h', as well as
some 'From' trait implementations for the 'Error' type.
- Printing: the rest of the 'pr_*!' levels and the continuation one
'pr_cont!', as well as a new sample.
- 'alloc' crate: new constructors 'try_with_capacity()' and
'try_with_capacity_in()' for 'RawVec' and 'Vec'.
- Procedural macros: new macros '#[vtable]' and 'concat_idents!', as
well as better ergonomics for 'module!' users.
- Asserting: new macros 'static_assert!', 'build_error!' and
'build_assert!', as well as a new crate 'build_error' to support
them.
- Vocabulary types: new types 'Opaque' and 'Either'.
- Debugging: new macro 'dbg!'"
* tag 'rust-6.2' of https://github.com/Rust-for-Linux/linux: (28 commits)
rust: types: add `Opaque` type
rust: types: add `Either` type
rust: build_assert: add `build_{error,assert}!` macros
rust: add `build_error` crate
rust: static_assert: add `static_assert!` macro
rust: std_vendor: add `dbg!` macro based on `std`'s one
rust: str: add `fmt!` macro
rust: str: add `CString` type
rust: str: add `Formatter` type
rust: str: add `c_str!` macro
rust: str: add `CStr` unit tests
rust: str: implement several traits for `CStr`
rust: str: add `CStr` type
rust: str: add `b_str!` macro
rust: str: add `BStr` type
rust: alloc: add `Vec::try_with_capacity{,_in}()` constructors
rust: alloc: add `RawVec::try_with_capacity_in()` constructor
rust: prelude: add `error::code::*` constant items
rust: error: add `From` implementations for `Error`
rust: error: add codes from `errno-base.h`
...
Linus Torvalds [Tue, 13 Dec 2022 00:48:48 +0000 (16:48 -0800)]
Merge tag 'trace-tools-6.2' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing tools updates from Steven Rostedt:
- New tool "rv" for starting and stopping runtime verification.
Example:
./rv mon wip -r printk -v
Enables the wake-in-preempt monitor and the printk reactor in verbose
mode
- Fix exit status of rtla usage() calls
* tag 'trace-tools-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
Documentation/rv: Add verification/rv man pages
tools/rv: Add in-kernel monitor interface
rv: Add rv tool
rtla: Fix exit status when returning from calls to usage()
Linus Torvalds [Tue, 13 Dec 2022 00:46:08 +0000 (16:46 -0800)]
Merge tag 'ktest-v6.2' of git://git./linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
- Fix minconfig test to unset the config and not relying on
olddefconfig to do it, as some configs are set to default y
- Fix reading grub2 menus for handling submenus
- Add new ${shell <cmd>} to execute shell commands that will be useful
for setting variables like: HOSTNAME := ${shell hostname}
* tag 'ktest-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest.pl: Add shell commands to variables
kest.pl: Fix grub2 menu handling for rebooting
ktest.pl minconfig: Unset configs instead of just removing them
Linus Torvalds [Tue, 13 Dec 2022 00:42:57 +0000 (16:42 -0800)]
Merge tag 'linux-kselftest-kunit-next-6.2-rc1' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull KUnit updates from Shuah Khan:
"Several enhancements, fixes, clean-ups, documentation updates,
improvements to logging and KTAP compliance of KUnit test output:
- log numbers in decimal and hex
- parse KTAP compliant test output
- allow conditionally exposing static symbols to tests when KUNIT is
enabled
- make static symbols visible during kunit testing
- clean-ups to remove unused structure definition"
* tag 'linux-kselftest-kunit-next-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits)
Documentation: dev-tools: Clarify requirements for result description
apparmor: test: make static symbols visible during kunit testing
kunit: add macro to allow conditionally exposing static symbols to tests
kunit: tool: make parser preserve whitespace when printing test log
Documentation: kunit: Fix "How Do I Use This" / "Next Steps" sections
kunit: tool: don't include KTAP headers and the like in the test log
kunit: improve KTAP compliance of KUnit test output
kunit: tool: parse KTAP compliant test output
mm: slub: test: Use the kunit_get_current_test() function
kunit: Use the static key when retrieving the current test
kunit: Provide a static key to check if KUnit is actively running tests
kunit: tool: make --json do nothing if --raw_ouput is set
kunit: tool: tweak error message when no KTAP found
kunit: remove KUNIT_INIT_MEM_ASSERTION macro
Documentation: kunit: Remove redundant 'tips.rst' page
Documentation: KUnit: reword description of assertions
Documentation: KUnit: make usage.rst a superset of tips.rst, remove duplication
kunit: eliminate KUNIT_INIT_*_ASSERT_STRUCT macros
kunit: tool: remove redundant file.close() call in unit test
kunit: tool: unit tests all check parser errors, standardize formatting a bit
...
Linus Torvalds [Tue, 13 Dec 2022 00:39:38 +0000 (16:39 -0800)]
Merge tag 'linux-kselftest-next-6.2-rc1' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan:
"Several fixes and enhancements to existing tests and a few new tests:
- add new amd-pstate tests and fix and enhance existing ones
- add new watchdog tests and enhance existing ones to improve
coverage
- fixes to ftrace, splice_read, rtc, and efivars tests
- fixes to handle egrep obsolescence in the latest grep release
- miscellaneous spelling and SPDX fixes"
* tag 'linux-kselftest-next-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (24 commits)
selftests/ftrace: Use long for synthetic event probe test
selftests/tpm2: Split async tests call to separate shell script runner
selftests: splice_read: Fix sysfs read cases
selftests: ftrace: Use "grep -E" instead of "egrep"
selftests: gpio: Use "grep -E" instead of "egrep"
selftests: kselftest_deps: Use "grep -E" instead of "egrep"
selftests/efivarfs: Add checking of the test return value
cpufreq: amd-pstate: fix spdxcheck warnings for amd-pstate-ut.c
selftests: rtc: skip when RTC is not present
selftests/ftrace: event_triggers: wait longer for test_event_enable
selftests/vDSO: Add riscv getcpu & gettimeofday test
Documentation: amd-pstate: Add tbench and gitsource test introduction
selftests: amd-pstate: Trigger gitsource benchmark and test cpus
selftests: amd-pstate: Trigger tbench benchmark and test cpus
selftests: amd-pstate: Split basic.sh into run.sh and basic.sh.
selftests: amd-pstate: Rename amd-pstate-ut.sh to basic.sh.
selftests/ftrace: Convert tracer tests to use 'requires' to specify program dependency
selftests/ftrace: Add check for ping command for trigger tests
selftests/watchdog: Fix spelling mistake "Temeprature" -> "Temperature"
selftests/watchdog: add test for WDIOC_GETTEMP
...
Linus Torvalds [Tue, 13 Dec 2022 00:22:22 +0000 (16:22 -0800)]
Merge tag 'random-6.2-rc1-for-linus' of git://git./linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
- Replace prandom_u32_max() and various open-coded variants of it,
there is now a new family of functions that uses fast rejection
sampling to choose properly uniformly random numbers within an
interval:
get_random_u32_below(ceil) - [0, ceil)
get_random_u32_above(floor) - (floor, U32_MAX]
get_random_u32_inclusive(floor, ceil) - [floor, ceil]
Coccinelle was used to convert all current users of
prandom_u32_max(), as well as many open-coded patterns, resulting in
improvements throughout the tree.
I'll have a "late" 6.1-rc1 pull for you that removes the now unused
prandom_u32_max() function, just in case any other trees add a new
use case of it that needs to converted. According to linux-next,
there may be two trivial cases of prandom_u32_max() reintroductions
that are fixable with a 's/.../.../'. So I'll have for you a final
conversion patch doing that alongside the removal patch during the
second week.
This is a treewide change that touches many files throughout.
- More consistent use of get_random_canary().
- Updates to comments, documentation, tests, headers, and
simplification in configuration.
- The arch_get_random*_early() abstraction was only used by arm64 and
wasn't entirely useful, so this has been replaced by code that works
in all relevant contexts.
- The kernel will use and manage random seeds in non-volatile EFI
variables, refreshing a variable with a fresh seed when the RNG is
initialized. The RNG GUID namespace is then hidden from efivarfs to
prevent accidental leakage.
These changes are split into random.c infrastructure code used in the
EFI subsystem, in this pull request, and related support inside of
EFISTUB, in Ard's EFI tree. These are co-dependent for full
functionality, but the order of merging doesn't matter.
- Part of the infrastructure added for the EFI support is also used for
an improvement to the way vsprintf initializes its siphash key,
replacing an sleep loop wart.
- The hardware RNG framework now always calls its correct random.c
input function, add_hwgenerator_randomness(), rather than sometimes
going through helpers better suited for other cases.
- The add_latent_entropy() function has long been called from the fork
handler, but is a no-op when the latent entropy gcc plugin isn't
used, which is fine for the purposes of latent entropy.
But it was missing out on the cycle counter that was also being mixed
in beside the latent entropy variable. So now, if the latent entropy
gcc plugin isn't enabled, add_latent_entropy() will expand to a call
to add_device_randomness(NULL, 0), which adds a cycle counter,
without the absent latent entropy variable.
- The RNG is now reseeded from a delayed worker, rather than on demand
when used. Always running from a worker allows it to make use of the
CPU RNG on platforms like S390x, whose instructions are too slow to
do so from interrupts. It also has the effect of adding in new inputs
more frequently with more regularity, amounting to a long term
transcript of random values. Plus, it helps a bit with the upcoming
vDSO implementation (which isn't yet ready for 6.2).
- The jitter entropy algorithm now tries to execute on many different
CPUs, round-robining, in hopes of hitting even more memory latencies
and other unpredictable effects. It also will mix in a cycle counter
when the entropy timer fires, in addition to being mixed in from the
main loop, to account more explicitly for fluctuations in that timer
firing. And the state it touches is now kept within the same cache
line, so that it's assured that the different execution contexts will
cause latencies.
* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
random: include <linux/once.h> in the right header
random: align entropy_timer_state to cache line
random: mix in cycle counter when jitter timer fires
random: spread out jitter callback to different CPUs
random: remove extraneous period and add a missing one in comments
efi: random: refresh non-volatile random seed when RNG is initialized
vsprintf: initialize siphash key using notifier
random: add back async readiness notifier
random: reseed in delayed work rather than on-demand
random: always mix cycle counter in add_latent_entropy()
hw_random: use add_hwgenerator_randomness() for early entropy
random: modernize documentation comment on get_random_bytes()
random: adjust comment to account for removed function
random: remove early archrandom abstraction
random: use random.trust_{bootloader,cpu} command line option only
stackprotector: actually use get_random_canary()
stackprotector: move get_random_canary() into stackprotector.h
treewide: use get_random_u32_inclusive() when possible
treewide: use get_random_u32_{above,below}() instead of manual loop
treewide: use get_random_u32_below() instead of deprecated function
...
Linus Torvalds [Tue, 13 Dec 2022 00:07:04 +0000 (16:07 -0800)]
Merge branch 'for-6.2' of git://git./linux/kernel/git/dennis/percpu
Pull percpu updates from Dennis Zhou:
"Baoquan was nice enough to run some clean ups for percpu"
* 'for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
mm/percpu: remove unused PERCPU_DYNAMIC_EARLY_SLOTS
mm/percpu.c: remove the lcm code since block size is fixed at page size
mm/percpu: replace the goto with break
mm/percpu: add comment to state the empty populated pages accounting
mm/percpu: Update the code comment when creating new chunk
mm/percpu: use list_first_entry_or_null in pcpu_reclaim_populated()
mm/percpu: remove unused pcpu_map_extend_chunks
Linus Torvalds [Tue, 13 Dec 2022 00:01:02 +0000 (16:01 -0800)]
Merge tag 'livepatching-for-6.2' of git://git./linux/kernel/git/livepatching/livepatching
Pull livepatching update from Petr Mladek:
- code cleanup
* tag 'livepatching-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
livepatch: Move the result-invariant calculation out of the loop
Linus Torvalds [Mon, 12 Dec 2022 23:48:36 +0000 (15:48 -0800)]
Merge tag 'cgroup-for-6.2' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
"Nothing too interesting:
- Add CONFIG_DEBUG_GROUP_REF which makes cgroup refcnt operations
kprobable
- A couple cpuset optimizations
- Other misc changes including doc and test updates"
* tag 'cgroup-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: remove rcu_read_lock()/rcu_read_unlock() in critical section of spin_lock_irq()
cgroup/cpuset: Improve cpuset_css_alloc() description
kselftest/cgroup: Add cleanup() to test_cpuset_prs.sh
cgroup/cpuset: Optimize cpuset_attach() on v2
cgroup/cpuset: Skip spread flags update on v2
kselftest/cgroup: Fix gathering number of CPUs
cgroup: cgroup refcnt functions should be exported when CONFIG_DEBUG_CGROUP_REF
cgroup: Implement DEBUG_CGROUP_REF
Linus Torvalds [Mon, 12 Dec 2022 23:33:42 +0000 (15:33 -0800)]
Merge tag 'sched-core-2022-12-12' of git://git./linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Implement persistent user-requested affinity: introduce
affinity_context::user_mask and unconditionally preserve the
user-requested CPU affinity masks, for long-lived tasks to better
interact with cpusets & CPU hotplug events over longer timespans,
without destroying the original affinity intent if the underlying
topology changes.
- Uclamp updates: fix relationship between uclamp and fits_capacity()
- PSI fixes
- Misc fixes & updates
* tag 'sched-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Clear ttwu_pending after enqueue_task()
sched/psi: Use task->psi_flags to clear in CPU migration
sched/psi: Stop relying on timer_pending() for poll_work rescheduling
sched/psi: Fix avgs_work re-arm in psi_avgs_work()
sched/psi: Fix possible missing or delayed pending event
sched: Always clear user_cpus_ptr in do_set_cpus_allowed()
sched: Enforce user requested affinity
sched: Always preserve the user requested cpumask
sched: Introduce affinity_context
sched: Add __releases annotations to affine_move_task()
sched/fair: Check if prev_cpu has highest spare cap in feec()
sched/fair: Consider capacity inversion in util_fits_cpu()
sched/fair: Detect capacity inversion
sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()'s early exit condition
sched/uclamp: Make cpu_overutilized() use util_fits_cpu()
sched/uclamp: Make asym_fits_capacity() use util_fits_cpu()
sched/uclamp: Make select_idle_capacity() use util_fits_cpu()
sched/uclamp: Fix fits_capacity() check in feec()
sched/uclamp: Make task_fits_capacity() use util_fits_cpu()
sched/uclamp: Fix relationship between uclamp and migration margin
Linus Torvalds [Mon, 12 Dec 2022 23:19:38 +0000 (15:19 -0800)]
Merge tag 'perf-core-2022-12-12' of git://git./linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
- Thoroughly rewrite the data structures that implement perf task
context handling, with the goal of fixing various quirks and
unfeatures both in already merged, and in upcoming proposed code.
The old data structure is the per task and per cpu
perf_event_contexts:
task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
^ | ^ | ^
`---------------------------------' | `--> pmu ---'
v ^
perf_event ------'
In this new design this is replaced with a single task context and a
single CPU context, plus intermediate data-structures:
task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
^ | ^ ^
`---------------------------' | |
| | perf_cpu_pmu_context <--.
| `----. ^ |
| | | |
| v v |
| ,--> perf_event_pmu_context |
| | |
| | |
v v |
perf_event ---> pmu ----------------'
[ See commit
bd2756811766 for more details. ]
This rewrite was developed by Peter Zijlstra and Ravi Bangoria.
- Optimize perf_tp_event()
- Update the Intel uncore PMU driver, extending it with UPI topology
discovery on various hardware models.
- Misc fixes & cleanups
* tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box()
perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map()
perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox()
perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology()
perf/x86/intel/uncore: Make set_mapping() procedure void
perf/x86/intel/uncore: Update sysfs-devices-mapping file
perf/x86/intel/uncore: Enable UPI topology discovery for Sapphire Rapids
perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server
perf/x86/intel/uncore: Get UPI NodeID and GroupID
perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server
perf/x86/intel/uncore: Generalize get_topology() for SKX PMUs
perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D
perf/x86/intel/uncore: Clear attr_update properly
perf/x86/intel/uncore: Introduce UPI topology type
perf/x86/intel/uncore: Generalize IIO topology support
perf/core: Don't allow grouping events from different hw pmus
perf/amd/ibs: Make IBS a core pmu
perf: Fix function pointer case
perf/x86/amd: Remove the repeated declaration
perf: Fix possible memleak in pmu_dev_alloc()
...
Linus Torvalds [Mon, 12 Dec 2022 23:14:53 +0000 (15:14 -0800)]
Merge tag 'locking-core-2022-12-12' of git://git./linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"Two changes in this cycle:
- a micro-optimization in static_key_slow_inc_cpuslocked()
- fix futex death-notification wakeup bug"
* tag 'locking-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Resend potentially swallowed owner death notification
jump_label: Use atomic_try_cmpxchg() in static_key_slow_inc_cpuslocked()
Linus Torvalds [Mon, 12 Dec 2022 22:54:24 +0000 (14:54 -0800)]
Merge tag 'x86_alternatives_for_v6.2' of git://git./linux/kernel/git/tip/tip
Pull x86 alternative update from Borislav Petkov:
"A single alternatives patching fix for modules:
- Have alternatives patch the same sections in modules as in vmlinux"
* tag 'x86_alternatives_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternative: Consistently patch SMP locks in vmlinux and modules
Linus Torvalds [Mon, 12 Dec 2022 22:51:56 +0000 (14:51 -0800)]
Merge tag 'ras_core_for_v6.2' of git://git./linux/kernel/git/tip/tip
Pull x86 RAS updates from Borislav Petkov:
- Fix confusing output from /sys/kernel/debug/ras/daemon_active
- Add another MCE severity error case to the Intel error severity table
to promote UC and AR errors to panic severity and remove the
corresponding code condition doing that.
- Make sure the thresholding and deferred error interrupts on AMD SMCA
systems clear the all registers reporting an error so that there are
no multiple errors logged for the same event
* tag 'ras_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
RAS: Fix return value from show_trace()
x86/mce: Use severity table to handle uncorrected errors in kernel
x86/MCE/AMD: Clear DFR errors found in THR handler
Linus Torvalds [Mon, 12 Dec 2022 22:47:31 +0000 (14:47 -0800)]
Merge tag 'edac_updates_for_6.2' of git://git./linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- Make ghes_edac a simple module like the rest of the EDAC drivers and
drop the forced built-in only configuration by disentangling it from
GHES (Jia He)
- The usual small cleanups and improvements all over EDAC land
* tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper()
EDAC/i5400: Fix typo in comment: vaious -> various
EDAC/mc_sysfs: Increase legacy channel support to 12
MAINTAINERS: Make Mauro EDAC reviewer
MAINTAINERS: Make Manivannan Sadhasivam the maintainer of qcom_edac
EDAC/igen6: Return the correct error type when not the MC owner
apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()
EDAC: Check for GHES preference in the chipset-specific EDAC drivers
EDAC/ghes: Make ghes_edac a proper module
EDAC/ghes: Prepare to make ghes_edac a proper module
EDAC/ghes: Add a notifier for reporting memory errors
efi/cper: Export several helpers for ghes_edac to use
EDAC/i5000: Mark as BROKEN
Linus Torvalds [Mon, 12 Dec 2022 22:41:57 +0000 (14:41 -0800)]
Merge tag 'x86_fpu_for_6.2' of git://git./linux/kernel/git/tip/tip
Pull x86 fpu updates from Dave Hansen:
"There are two little fixes in here, one to give better XSAVE warnings
and another to address some undefined behavior in offsetof().
There is also a collection of patches to fix some issues with ptrace
and the protection keys register (PKRU). PKRU is a real oddity because
it is exposed in the XSAVE-related ABIs, but it is generally managed
without using XSAVE in the kernel. This fix thankfully came with a
selftest to ward off future regressions.
Summary:
- Clarify XSAVE consistency warnings
- Fix up ptrace interface to protection keys register (PKRU)
- Avoid undefined compiler behavior with TYPE_ALIGN"
* tag 'x86_fpu_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Use _Alignof to avoid undefined behavior in TYPE_ALIGN
selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace
x86/fpu: Emulate XRSTOR's behavior if the xfeatures PKRU bit is not set
x86/fpu: Allow PKRU to be (once again) written by ptrace.
x86/fpu: Add a pkru argument to copy_uabi_to_xstate()
x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate().
x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate()
x86/fpu/xstate: Fix XSTATE_WARN_ON() to emit relevant diagnostics
Linus Torvalds [Mon, 12 Dec 2022 22:39:51 +0000 (14:39 -0800)]
Merge tag 'x86_splitlock_for_6.2' of git://git./linux/kernel/git/tip/tip
Pull x86 splitlock updates from Dave Hansen:
"Add a sysctl to control the split lock misery mode.
This enables users to reduce the penalty inflicted on split lock
users. There are some proprietary, binary-only games which became
entirely unplayable with the old penalty.
Anyone opting into the new mode is, of course, more exposed to the DoS
nasitness inherent with split locks, but they can play their games
again"
* tag 'x86_splitlock_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/split_lock: Add sysctl to control the misery mode
Linus Torvalds [Mon, 12 Dec 2022 22:30:54 +0000 (14:30 -0800)]
Merge tag 'x86_cache_for_6.2' of git://git./linux/kernel/git/tip/tip
Pull x86 cache resource control updates from Dave Hansen:
"These declare the resource control (rectrl) MSRs a bit more normally
and clean up an unnecessary structure member:
- Remove unnecessary arch_has_empty_bitmaps structure memory
- Move rescrtl MSR defines into msr-index.h, like normal MSRs"
* tag 'x86_cache_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Move MSR defines into msr-index.h
x86/resctrl: Remove arch_has_empty_bitmaps
Linus Torvalds [Mon, 12 Dec 2022 22:27:49 +0000 (14:27 -0800)]
Merge tag 'x86_tdx_for_6.2' of git://git./linux/kernel/git/tip/tip
Pull x86 tdx updates from Dave Hansen:
"This includes a single chunk of new functionality for TDX guests which
allows them to talk to the trusted TDX module software and obtain an
attestation report.
This report can then be used to prove the trustworthiness of the guest
to a third party and get access to things like storage encryption
keys"
* tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/tdx: Test TDX attestation GetReport support
virt: Add TDX guest driver
x86/tdx: Add a wrapper to get TDREPORT0 from the TDX Module
Linus Torvalds [Mon, 12 Dec 2022 22:18:44 +0000 (14:18 -0800)]
Merge tag 'x86_sgx_for_6.2' of git://git./linux/kernel/git/tip/tip
Pull x86 sgx updates from Dave Hansen:
"The biggest deal in this series is support for a new hardware feature
that allows enclaves to detect and mitigate single-stepping attacks.
There's also a minor performance tweak and a little piece of the
kmap_atomic() -> kmap_local() transition.
Summary:
- Introduce a new SGX feature (Asynchrounous Exit Notification) for
bare-metal enclaves and KVM guests to mitigate single-step attacks
- Increase batching to speed up enclave release
- Replace kmap/kunmap_atomic() calls"
* tag 'x86_sgx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Replace kmap/kunmap_atomic() calls
KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest
x86/sgx: Allow enclaves to use Asynchrounous Exit Notification
x86/sgx: Reduce delay and interference of enclave release
Linus Torvalds [Mon, 12 Dec 2022 21:55:31 +0000 (13:55 -0800)]
Merge tag 'cxl-for-6.2' of git://git./linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for 6.2.
While it may seem backwards, the CXL update this time around includes
some focus on CXL 1.x enabling where the work to date had been with
CXL 2.0 (VH topologies) in mind.
First generation CXL can mostly be supported via BIOS, similar to DDR,
however it became clear there are use cases for OS native CXL error
handling and some CXL 3.0 endpoint features can be deployed on CXL 1.x
hosts (Restricted CXL Host (RCH) topologies). So, this update brings
RCH topologies into the Linux CXL device model.
In support of the ongoing CXL 2.0+ enabling two new core kernel
facilities are added.
One is the ability for the kernel to flag collisions between userspace
access to PCI configuration registers and kernel accesses. This is
brought on by the PCIe Data-Object-Exchange (DOE) facility, a hardware
mailbox over config-cycles.
The other is a cpu_cache_invalidate_memregion() API that maps to
wbinvd_on_all_cpus() on x86. To prevent abuse it is disabled in guest
VMs and architectures that do not support it yet. The CXL paths that
need it, dynamic memory region creation and security commands (erase /
unlock), are disabled when it is not present.
As for the CXL 2.0+ this cycle the subsystem gains support Persistent
Memory Security commands, error handling in response to PCIe AER
notifications, and support for the "XOR" host bridge interleave
algorithm.
Summary:
- Add the cpu_cache_invalidate_memregion() API for cache flushing in
response to physical memory reconfiguration, or memory-side data
invalidation from operations like secure erase or memory-device
unlock.
- Add a facility for the kernel to warn about collisions between
kernel and userspace access to PCI configuration registers
- Add support for Restricted CXL Host (RCH) topologies (formerly CXL
1.1)
- Add handling and reporting of CXL errors reported via the PCIe AER
mechanism
- Add support for CXL Persistent Memory Security commands
- Add support for the "XOR" algorithm for CXL host bridge interleave
- Rework / simplify CXL to NVDIMM interactions
- Miscellaneous cleanups and fixes"
* tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (71 commits)
cxl/region: Fix memdev reuse check
cxl/pci: Remove endian confusion
cxl/pci: Add some type-safety to the AER trace points
cxl/security: Drop security command ioctl uapi
cxl/mbox: Add variable output size validation for internal commands
cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size
cxl/security: Fix Get Security State output payload endian handling
cxl: update names for interleave ways conversion macros
cxl: update names for interleave granularity conversion macros
cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry
tools/testing/cxl: Require cache invalidation bypass
cxl/acpi: Fail decoder add if CXIMS for HBIG is missing
cxl/region: Fix spelling mistake "memergion" -> "memregion"
cxl/regs: Fix sparse warning
cxl/acpi: Set ACPI's CXL _OSC to indicate RCD mode support
tools/testing/cxl: Add an RCH topology
cxl/port: Add RCD endpoint port enumeration
cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
tools/testing/cxl: Add XOR Math support to cxl_test
cxl/acpi: Support CXL XOR Interleave Math (CXIMS)
...
Linus Torvalds [Mon, 12 Dec 2022 21:45:21 +0000 (13:45 -0800)]
Merge tag 'thermal-6.2-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These include thermal core fixes to protect thermal device operations
against thermal device removal, other thermal core fixes and updates
of Intel thermal control drivers.
Specifics:
- Fix race conditions related to thermal device operations that are
not protected against thermal device removal (Guenter Roeck)
- Fix error code in __thermal_cooling_device_register() (Dan
Carpenter)
- Validate new cooling device state (coming from user space) in
cur_state_store() and reuse the max_state value from cooling device
structure in the sysfs interface (Viresh Kumar)
- Fix some possible name leaks in error paths in the thermal control
core code (Yang Yingliang)
- Detect TCC lock bit set in the intel_tcc_cooling driver and make it
refuse to update the TCC offset in that case (Zhang Rui)
- Add TCC cooling support for RaptorLake-S (Zhang Rui)
- Prevent accidental clearing of HFI status by one of the other
drivers using the same status register (Srinivas Pandruvada)
- Protect clearing of thermal status bits in Intel thermal control
drivers (Srinivas Pandruvada)
- Allow the HFI thermal control driver to ACK an HFI event for the
previously observed timestamp (Srinivas Pandruvada)
- Remove a pointless die_id check from the HFI thermal driver and
adjust the definition a data structure used by it (Ricardo Neri)"
* tag 'thermal-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel: hfi: Remove a pointless die_id check
thermal: core: fix some possible name leaks in error paths
thermal: intel: hfi: ACK HFI for the same timestamp
thermal: intel: Protect clearing of thermal status bits
thermal: intel: Prevent accidental clearing of HFI status
thermal/core: Protect thermal device operations against thermal device removal
thermal/core: Remove thermal_zone_set_trips()
thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
thermal/core: Introduce locked version of thermal_zone_device_update
thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp
thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp
thermal/core: Delete device under thermal device zone lock
thermal/core: Destroy thermal zone device mutex in release function
thermal: intel: intel_tcc_cooling: Add TCC cooling support for RaptorLake-S
thermal: intel: intel_tcc_cooling: Detect TCC lock bit
thermal: intel: hfi: Improve the type of hfi_features::nr_table_pages
thermal/core: fix error code in __thermal_cooling_device_register()
thermal: sysfs: Reuse cdev->max_state
thermal: Validate new state in cur_state_store()
Linus Torvalds [Mon, 12 Dec 2022 21:38:17 +0000 (13:38 -0800)]
Merge tag 'acpi-6.2-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and PNP updates from Rafael Wysocki:
"These include new code (for instance, support for the FFH address
space type and support for new firmware data structures in ACPICA),
some new quirks (mostly related to backlight handling and I2C
enumeration), a number of fixes and a fair amount of cleanups all
over.
Specifics:
- Update the ACPICA code in the kernel to the
20221020 upstream
version and fix a couple of issues in it:
- Make acpi_ex_load_op() match upstream implementation (Rafael
Wysocki)
- Add support for loong_arch-specific APICs in MADT (Huacai Chen)
- Add support for fixed PCIe wake event (Huacai Chen)
- Add EBDA pointer sanity checks (Vit Kabele)
- Avoid accessing VGA memory when EBDA < 1KiB (Vit Kabele)
- Add CCEL table support to both compiler/disassembler (Kuppuswamy
Sathyanarayanan)
- Add a couple of new UUIDs to the known UUID list (Bob Moore)
- Add support for FFH Opregion special context data (Sudeep
Holla)
- Improve warning message for "invalid ACPI name" (Bob Moore)
- Add support for CXL 3.0 structures (CXIMS & RDPAS) in the CEDT
table (Alison Schofield)
- Prepare IORT support for revision E.e (Robin Murphy)
- Finish support for the CDAT table (Bob Moore)
- Fix error code path in acpi_ds_call_control_method() (Rafael
Wysocki)
- Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() (Li
Zetao)
- Update the version of the ACPICA code in the kernel (Bob Moore)
- Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device
enumeration code (Giulio Benetti)
- Change the return type of the ACPI driver remove callback to void
and update its users accordingly (Dawei Li)
- Add general support for FFH address space type and implement the
low- level part of it for ARM64 (Sudeep Holla)
- Fix stale comments in the ACPI tables parsing code and make it
print more messages related to MADT (Hanjun Guo, Huacai Chen)
- Replace invocations of generic library functions with more kernel-
specific counterparts in the ACPI sysfs interface (Christophe
JAILLET, Xu Panda)
- Print full name paths of ACPI power resource objects during
enumeration (Kane Chen)
- Eliminate a compiler warning regarding a missing function prototype
in the ACPI power management code (Sudeep Holla)
- Fix and clean up the ACPI processor driver (Rafael Wysocki, Li
Zhong, Colin Ian King, Sudeep Holla)
- Add quirk for the HP Pavilion Gaming 15-cx0041ur to the ACPI EC
driver (Mia Kanashi)
- Add some mew ACPI backlight handling quirks and update some
existing ones (Hans de Goede)
- Make the ACPI backlight driver prefer the native backlight control
over vendor backlight control when possible (Hans de Goede)
- Drop unsetting ACPI APEI driver data on remove (Uwe Kleine-König)
- Use xchg_release() instead of cmpxchg() for updating new GHES cache
slots (Ard Biesheuvel)
- Clean up the ACPI APEI code (Sudeep Holla, Christophe JAILLET, Jay
Lu)
- Add new I2C device enumeration quirks for Medion Lifetab S10346 and
Lenovo Yoga Tab 3 Pro (YT3-X90F) (Hans de Goede)
- Make the ACPI battery driver notify user space about adding new
battery hooks and removing the existing ones (Armin Wolf)
- Modify the pfr_update and pfr_telemetry drivers to use ACPI_FREE()
for freeing acpi_object structures to help diagnostics (Wang
ShaoBo)
- Make the ACPI fan driver use sysfs_emit_at() in its sysfs interface
code (ye xingchen)
- Fix the _FIF package extraction failure handling in the ACPI fan
driver (Hanjun Guo)
- Fix the PCC mailbox handling error code path (Huisong Li)
- Avoid using PCC Opregions if there is no platform interrupt
allocated for this purpose (Huisong Li)
- Use sysfs_emit() instead of scnprintf() in the ACPI PAD driver and
CPPC library (ye xingchen)
- Fix some kernel-doc issues in the ACPI GSI processing code
(Xiongfeng Wang)
- Fix name memory leak in pnp_alloc_dev() (Yang Yingliang)
- Do not disable PNP devices on suspend when they cannot be
re-enabled on resume (Hans de Goede)
- Clean up the ACPI thermal driver a bit (Rafael Wysocki)"
* tag 'acpi-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (67 commits)
ACPI: x86: Add skip i2c clients quirk for Medion Lifetab S10346
ACPI: APEI: EINJ: Refactor available_error_type_show()
ACPI: APEI: EINJ: Fix formatting errors
ACPI: processor: perflib: Adjust acpi_processor_notify_smm() return value
ACPI: processor: perflib: Rearrange acpi_processor_notify_smm()
ACPI: processor: perflib: Rearrange unregistration routine
ACPI: processor: perflib: Drop redundant parentheses
ACPI: processor: perflib: Adjust white space
ACPI: processor: idle: Drop unnecessary statements and parens
ACPI: thermal: Adjust critical.flags.valid check
ACPI: fan: Convert to use sysfs_emit_at() API
ACPICA: Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage()
ACPI: battery: Call power_supply_changed() when adding hooks
ACPI: use sysfs_emit() instead of scnprintf()
ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Tab 3 Pro (YT3-X90F)
ACPI: APEI: Remove a useless include
PNP: Do not disable devices on suspend when they cannot be re-enabled on resume
ACPI: processor: Silence missing prototype warnings
ACPI: processor_idle: Silence missing prototype warnings
ACPI: PM: Silence missing prototype warning
...
Linus Torvalds [Mon, 12 Dec 2022 21:19:07 +0000 (13:19 -0800)]
Merge tag 'pm-6.2-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These include two new drivers (cpufreq driver for Apple SoC CPU
P-states and the SCMI Powercap based power capping driver), other new
hardware support and driver extensions (Qualcomm cpufreq driver and
its DT bindings, TI cpufreq driver, intel_pstate, intel-uncore-freq),
a bunch of fixes and cleanups all over and a cpupower utility update
including new features related to RAPL support.
Specifics:
- Fix nasty and hard to debug race condition introduced by mistake in
the runtime PM core code and clean up that code somewhat on top of
the fix (Rafael Wysocki)
- Generalize of_perf_domain_get_sharing_cpumask phandle format
(Hector Martin)
- Add new cpufreq driver for Apple SoC CPU P-states (Hector Martin)
- Update Qualcomm cpufreq driver (Manivannan Sadhasivam, Chen Hui):
- CPU clock provider support
- Generic cleanups or reorganization
- Potential memleak fix
- Fix of the return value of cpufreq_driver->get()
- Update Qualcomm cpufreq driver's DT bindings (Manivannan
Sadhasivam, Rob Herring, Melody Olvera):
- Support for CPU clock provider
- Missing cache-related properties fixes
- Support for QDU1000/QRU1000
- Add support for ti,am625 SoC and enable build of ti-cpufreq for
ARCH_K3 (Dave Gerlach, and Vibhore Vardhan)
- Use flexible array to simplify memory allocation in the tegra186
cpufreq driver (Christophe JAILLET)
- Convert cpufreq statistics code to use sysfs_emit_at() (ye
xingchen)
- Allow intel_pstate to use no-HWP mode on Sapphire Rapids (Giovanni
Gherdovich)
- Add missing pci_dev_put() to the amd_freq_sensitivity cpufreq
driver (Xiongfeng Wang)
- Initialize the kobj_unregister completion before calling
kobject_init_and_add() in the cpufreq core code (Yongqiang Liu)
- Defer setting boost MSRs in the ACPI cpufreq driver (Stuart Hayes,
Nathan Chancellor)
- Make intel_pstate accept initial EPP value of 0x80 (Srinivas
Pandruvada)
- Make read-only array sys_clk_src in the SPEAr cpufreq driver static
(Colin Ian King)
- Make array speeds in the longhaul cpufreq driver static (Colin Ian
King)
- Use str_enabled_disabled() helper in the ACPI cpufreq driver (Andy
Shevchenko)
- Drop a reference to CVS from cpufreq documentation (Conghui Wang)
- Improve kernel messages printed by the PSCI cpuidle driver (Ulf
Hansson)
- Make the DT cpuidle driver return the correct number of parsed idle
states, clean it up and clarify a comment in it (Ulf Hansson)
- Modify the tasks freezing code to avoid using pr_cont() and refine
an error message printed by it (Rafael Wysocki)
- Make the hibernation core code complain about memory map mismatches
during resume to help diagnostics (Xueqin Luo)
- Fix mistake in a kerneldoc comment in the hibernation code
(xiongxin)
- Reverse the order of performance and enabling operations in the
generic power domains code (Abel Vesa)
- Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in
the generic power domains code (Abel Vesa)
- Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
Guo)
- Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo)
- Drop generic power domain status manipulation during hibernate
restore (Shawn Guo)
- Fix compiler warnings with make W=1 in the idle_inject power
capping driver (Srinivas Pandruvada)
- Use kstrtobool() instead of strtobool() in the power capping sysfs
interface (Christophe JAILLET)
- Add SCMI Powercap based power capping driver (Cristian Marussi)
- Add Emerald Rapids support to the intel-uncore-freq driver (Artem
Bityutskiy)
- Repair slips in kernel-doc comments in the generic notifier code
(Lukas Bulwahn)
- Fix several DT issues in the OPP library reorganize code around
opp-microvolt-<named> DT property (Viresh Kumar)
- Allow any of opp-microvolt, opp-microamp, or opp-microwatt
properties to be present without the others present (James
Calligeros)
- Fix clock-latency-ns property in DT example (Serge Semin)
- Add a private governor_data for devfreq governors (Kant Fan)
- Reorganize devfreq code to use device_match_of_node() and
devm_platform_get_and_ioremap_resource() instead of open coding
them (ye xingchen, Minghao Chi)
- Make cpupower choose base_cpu to display default cpupower details
instead of picking CPU 0 (Saket Kumar Bhaskar)
- Add Georgian translation to cpupower documentation (Zurab
Kargareteli)
- Introduce powercap intel-rapl library, powercap-info command, and
RAPL monitor into cpupower (Thomas Renninger)"
* tag 'pm-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
PM: runtime: Adjust white space in the core code
cpufreq: Remove CVS version control contents from documentation
cpufreq: stats: Convert to use sysfs_emit_at() API
cpufreq: ACPI: Only set boost MSRs on supported CPUs
PM: sleep: Refine error message in try_to_freeze_tasks()
PM: sleep: Avoid using pr_cont() in the tasks freezing code
PM: runtime: Relocate rpm_callback() right after __rpm_callback()
PM: runtime: Do not call __rpm_callback() from rpm_idle()
PM / devfreq: event: use devm_platform_get_and_ioremap_resource()
PM / devfreq: event: Use device_match_of_node()
PM / devfreq: Use device_match_of_node()
powercap: idle_inject: Fix warnings with make W=1
PM: hibernate: Complain about memory map mismatches during resume
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QDU1000/QRU1000 cpufreq
cpufreq: tegra186: Use flexible array to simplify memory allocation
cpupower: rapl monitor - shows the used power consumption in uj for each rapl domain
cpupower: Introduce powercap intel-rapl library and powercap-info command
cpupower: Add Georgian translation
cpufreq: intel_pstate: Add Sapphire Rapids support in no-HWP mode
cpufreq: amd_freq_sensitivity: Add missing pci_dev_put()
...
Mark Brown [Wed, 7 Dec 2022 13:03:02 +0000 (13:03 +0000)]
Documentation: dev-tools: Clarify requirements for result description
Currently the KTAP specification says that a test result line is
<result> <number> [<description>][ # [<directive>] [<diagnostic data>]]
and the description of a test can be "any sequence of words
(can't include #)" which specifies that there may be more than
one word but does not specify anything other than those words
which might be used to separate the words which probably isn't
what we want. Given that practically we have tests using a range
of separators for words including combinations of spaces and
combinations of other symbols like underscores or punctuation
let's just clarify that the description can contain any character
other than # (marking the start of the directive/diagnostic) or
newline (marking the end of this test result).
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Rae Moar [Wed, 7 Dec 2022 01:40:24 +0000 (01:40 +0000)]
apparmor: test: make static symbols visible during kunit testing
Use macros, VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT, to allow
static symbols to be conditionally set to be visible during
apparmor_policy_unpack_test, which removes the need to include the testing
file in the implementation file.
Change the namespace of the symbols that are now conditionally visible (by
adding the prefix aa_) to avoid confusion with symbols of the same name.
Allow the test to be built as a module and namespace the module name from
policy_unpack_test to apparmor_policy_unpack_test to improve clarity of
the module name.
Provide an example of how static symbols can be dealt with in testing.
Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Rae Moar [Wed, 7 Dec 2022 01:40:23 +0000 (01:40 +0000)]
kunit: add macro to allow conditionally exposing static symbols to tests
Create two macros:
VISIBLE_IF_KUNIT - A macro that sets symbols to be static if CONFIG_KUNIT
is not enabled. Otherwise if CONFIG_KUNIT is enabled there is no change to
the symbol definition.
EXPORT_SYMBOL_IF_KUNIT(symbol) - Exports symbol into
EXPORTED_FOR_KUNIT_TESTING namespace only if CONFIG_KUNIT is enabled. Must
use MODULE_IMPORT_NS(EXPORTED_FOR_KUNIT_TESTING) in test file in order to
use symbols.
Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Wed, 30 Nov 2022 18:54:19 +0000 (10:54 -0800)]
kunit: tool: make parser preserve whitespace when printing test log
Currently, kunit_parser.py is stripping all leading whitespace to make
parsing easier. But this means we can't accurately show kernel output
for failing tests or when the kernel crashes.
Embarassingly, this affects even KUnit's own output, e.g.
[13:40:46] Expected 2 + 1 == 2, but
[13:40:46] 2 + 1 == 3 (0x3)
[13:40:46] not ok 1 example_simple_test
[13:40:46] [FAILED] example_simple_test
After this change, here's what the output in context would look like
[13:40:46] =================== example (4 subtests) ===================
[13:40:46] # example_simple_test: initializing
[13:40:46] # example_simple_test: EXPECTATION FAILED at lib/kunit/kunit-example-test.c:29
[13:40:46] Expected 2 + 1 == 2, but
[13:40:46] 2 + 1 == 3 (0x3)
[13:40:46] [FAILED] example_simple_test
[13:40:46] [SKIPPED] example_skip_test
[13:40:46] [SKIPPED] example_mark_skipped_test
[13:40:46] [PASSED] example_all_expect_macros_test
[13:40:46] # example: initializing suite
[13:40:46] # example: pass:1 fail:1 skip:2 total:4
[13:40:46] # Totals: pass:1 fail:1 skip:2 total:4
[13:40:46] ===================== [FAILED] example =====================
This example shows one minor cosmetic defect this approach has.
The test counts lines prevent us from dedenting the suite-level output.
But at the same time, any form of non-KUnit output would do the same
unless it happened to be indented as well.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
David Gow [Wed, 7 Dec 2022 04:33:19 +0000 (12:33 +0800)]
Documentation: kunit: Fix "How Do I Use This" / "Next Steps" sections
The "How Do I Use This" section of index.rst and "Next Steps" section of
start.rst were just copies of the table of contents, and therefore
weren't really useful either when looking a sphinx generated output
(which already had the TOC visible) or when reading the source (where
it's just a list of files that ls could give you).
Instead, provide a small number of concrete next steps, and a bit more
description about what the pages contain.
This also removes the broken reference to 'tips.rst', which was
previously removed.
Fixed git am whitespace complaints during commit:
Shuah Khan <skhan@linuxfoundation.org>
Fixes:
4399c737a97d ("Documentation: kunit: Remove redundant 'tips.rst' page")
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Sadiya Kazi <sadiyakazi@google.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Tue, 29 Nov 2022 00:12:34 +0000 (16:12 -0800)]
kunit: tool: don't include KTAP headers and the like in the test log
We print the "test log" on failure.
This is meant to be all the kernel output that happened during the test.
But we also include the special KTAP lines in it, which are often
redundant.
E.g. we include the "not ok" line in the log, right before we print
that the test case failed...
[13:51:48] Expected 2 + 1 == 2, but
[13:51:48] 2 + 1 == 3 (0x3)
[13:51:48] not ok 1 example_simple_test
[13:51:48] [FAILED] example_simple_test
More full example after this patch:
[13:51:48] =================== example (4 subtests) ===================
[13:51:48] # example_simple_test: initializing
[13:51:48] # example_simple_test: EXPECTATION FAILED at lib/kunit/kunit-example-test.c:29
[13:51:48] Expected 2 + 1 == 2, but
[13:51:48] 2 + 1 == 3 (0x3)
[13:51:48] [FAILED] example_simple_test
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Rae Moar [Wed, 23 Nov 2022 18:25:58 +0000 (18:25 +0000)]
kunit: improve KTAP compliance of KUnit test output
Change KUnit test output to better comply with KTAP v1 specifications
found here: https://kernel.org/doc/html/latest/dev-tools/ktap.html.
1) Use "KTAP version 1" instead of "TAP version 14" as test output header
2) Remove '-' between test number and test name on test result lines
2) Add KTAP version lines to each subtest header as well
Note that the new KUnit output still includes the “# Subtest” line now
located after the KTAP version line. This does not completely match the
KTAP v1 spec but since it is classified as a diagnostic line, it is not
expected to be disruptive or break any existing parsers. This
“# Subtest” line comes from the TAP 14 spec
(https://testanything.org/tap-version-14-specification.html) and it is
used to define the test name before the results.
Original output:
TAP version 14
1..1
# Subtest: kunit-test-suite
1..3
ok 1 - kunit_test_1
ok 2 - kunit_test_2
ok 3 - kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 - kunit-test-suite
New output:
KTAP version 1
1..1
KTAP version 1
# Subtest: kunit-test-suite
1..3
ok 1 kunit_test_1
ok 2 kunit_test_2
ok 3 kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 kunit-test-suite
Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Rae Moar [Wed, 23 Nov 2022 18:25:57 +0000 (18:25 +0000)]
kunit: tool: parse KTAP compliant test output
Change the KUnit parser to be able to parse test output that complies with
the KTAP version 1 specification format found here:
https://kernel.org/doc/html/latest/dev-tools/ktap.html. Ensure the parser
is able to parse tests with the original KUnit test output format as
well.
KUnit parser now accepts any of the following test output formats:
Original KUnit test output format:
TAP version 14
1..1
# Subtest: kunit-test-suite
1..3
ok 1 - kunit_test_1
ok 2 - kunit_test_2
ok 3 - kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 - kunit-test-suite
KTAP version 1 test output format:
KTAP version 1
1..1
KTAP version 1
1..3
ok 1 kunit_test_1
ok 2 kunit_test_2
ok 3 kunit_test_3
ok 1 kunit-test-suite
New KUnit test output format (changes made in the next patch of
this series):
KTAP version 1
1..1
KTAP version 1
# Subtest: kunit-test-suite
1..3
ok 1 kunit_test_1
ok 2 kunit_test_2
ok 3 kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 kunit-test-suite
Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
David Gow [Fri, 25 Nov 2022 08:43:06 +0000 (16:43 +0800)]
mm: slub: test: Use the kunit_get_current_test() function
Use the newly-added function kunit_get_current_test() instead of
accessing current->kunit_test directly. This function uses a static key
to return more quickly when KUnit is enabled, but no tests are actively
running. There should therefore be a negligible performance impact to
enabling the slub KUnit tests.
Other than the performance improvement, this should be a no-op.
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Gow <davidgow@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
David Gow [Fri, 25 Nov 2022 08:43:05 +0000 (16:43 +0800)]
kunit: Use the static key when retrieving the current test
In order to detect if a KUnit test is running, and to access its
context, the 'kunit_test' member of the current task_struct is used.
Usually, this is accessed directly or via the kunit_fail_current_task()
function.
In order to speed up the case where no test is running, add a wrapper,
kunit_get_current_test(), which uses the static key to fail early.
Equally, Speed up kunit_fail_current_test() by using the static key.
This should make it convenient for code to call this
unconditionally in fakes or error paths, without worrying that this will
slow the code down significantly.
If CONFIG_KUNIT=n (or m), this compiles away to nothing. If
CONFIG_KUNIT=y, it will compile down to a NOP (on most architectures) if
no KUnit test is currently running.
Note that kunit_get_current_test() does not work if KUnit is built as a
module. This mirrors the existing restriction on kunit_fail_current_test().
Note that the definition of kunit_fail_current_test() still wraps an
empty, inline function if KUnit is not built-in. This is to ensure that
the printf format string __attribute__ will still work.
Also update the documentation to suggest users use the new
kunit_get_current_test() function, update the example, and to describe
the behaviour when KUnit is disabled better.
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sadiya Kazi <sadiyakazi@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
David Gow [Fri, 25 Nov 2022 08:43:04 +0000 (16:43 +0800)]
kunit: Provide a static key to check if KUnit is actively running tests
KUnit does a few expensive things when enabled. This hasn't been a
problem because KUnit was only enabled on test kernels, but with a few
people enabling (but not _using_) KUnit on production systems, we need a
runtime way of handling this.
Provide a 'kunit_running' static key (defaulting to false), which allows
us to hide any KUnit code behind a static branch. This should reduce the
performance impact (on other code) of having KUnit enabled to a single
NOP when no tests are running.
Note that, while it looks unintuitive, tests always run entirely within
__kunit_test_suites_init(), so it's safe to decrement the static key at
the end of this function, rather than in __kunit_test_suites_exit(),
which is only there to clean up results in debugfs.
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Mon, 21 Nov 2022 19:55:26 +0000 (11:55 -0800)]
kunit: tool: make --json do nothing if --raw_ouput is set
When --raw_output is set (to any value), we don't actually parse the
test results. So asking to print the test results as json doesn't make
sense.
We internally create a fake test with one passing subtest, so --json
would actually print out something misleading.
This patch:
* Rewords the flag descriptions so hopefully this is more obvious.
* Also updates --raw_output's description to note the default behavior
is to print out only "KUnit" results (actually any KTAP results)
* also renames and refactors some related logic for clarity (e.g.
test_result => test, it's a kunit_parser.Test object).
Notably, this patch does not make it an error to specify --json and
--raw_output together. This is an edge case, but I know of at least one
wrapper around kunit.py that always sets --json. You'd never be able to
use --raw_output with that wrapper.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Fri, 11 Nov 2022 03:18:55 +0000 (19:18 -0800)]
kunit: tool: tweak error message when no KTAP found
We currently tell people we "couldn't find any KTAP output" with no
indication as to what this might mean.
After this patch, we get:
$ ./tools/testing/kunit/kunit.py parse /dev/null
============================================================
[ERROR] Test: <missing>: Could not find any KTAP output. Did any KUnit tests run?
============================================================
Testing complete. Ran 0 tests: errors: 1
Note: we could try and generate a more verbose message like
> Please check .kunit/test.log to see the raw kernel output.
or the like, but we'd need to know what the build dir was to know where
test.log actually lives.
This patch tries to make a more minimal improvement.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Wed, 9 Nov 2022 21:20:32 +0000 (13:20 -0800)]
kunit: remove KUNIT_INIT_MEM_ASSERTION macro
Commit
870f63b7cd78 ("kunit: eliminate KUNIT_INIT_*_ASSERT_STRUCT
macros") removed all the other macros of this type.
But it raced with commit
b8a926bea8b1 ("kunit: Introduce
KUNIT_EXPECT_MEMEQ and KUNIT_EXPECT_MEMNEQ macros"), which added another
instance.
Remove KUNIT_INIT_MEM_ASSERTION and just use the generic
KUNIT_INIT_ASSERT macro instead.
Rename the `size` arg to avoid conflicts by appending a "_" (like we did
in the previous commit).
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
David Gow [Fri, 11 Nov 2022 18:29:06 +0000 (10:29 -0800)]
Documentation: kunit: Remove redundant 'tips.rst' page
The contents of 'tips.rst' was mostly included in 'usage.rst' way back in
commit
953574390634 ("Documentation: KUnit: Rework writing page to focus on writing tests"),
but the tips page remained behind as well.
The parent patches in this series fill in the gaps, so now 'tips.rst' is
redundant.
Therefore, delete 'tips.rst'.
While I regret breaking any links to 'tips' which might exist
externally, it's confusing to have two subtly different versions of the
same content around.
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Sadiya Kazi <sadiyakazi@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Fri, 11 Nov 2022 18:29:05 +0000 (10:29 -0800)]
Documentation: KUnit: reword description of assertions
The existing wording implies that kunit_kmalloc_array() is "the method
under test". We're actually testing the sort() function in that example.
This is because the example was changed in commit
953574390634
("Documentation: KUnit: Rework writing page to focus on writing tests"),
but the wording was not.
Also add a `note` telling people they can use the KUNIT_ASSERT_EQ()
macros from any function. Some users might be coming from a framework
like gUnit where that'll compile but silently do the wrong thing.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Sadiya Kazi <sadiyakazi@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Fri, 11 Nov 2022 18:29:04 +0000 (10:29 -0800)]
Documentation: KUnit: make usage.rst a superset of tips.rst, remove duplication
usage.rst had most of the content of the tips.rst page copied over.
But it's missing https://www.kernel.org/doc/html/v6.0/dev-tools/kunit/tips.html#customizing-error-messages
Copy it over so we can retire tips.rst w/o losing content.
And in that process, it also gained a duplicate section about how
KUNIT_ASSERT_*() exit the test case early. Remove that.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Sadiya Kazi <sadiyakazi@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Sat, 1 Oct 2022 00:26:37 +0000 (17:26 -0700)]
kunit: eliminate KUNIT_INIT_*_ASSERT_STRUCT macros
These macros exist because passing an initializer list to other macros
is hard.
The goal of these macros is to generate a line like
struct $ASSERT_TYPE __assertion = $APPROPRIATE_INITIALIZER;
e.g.
struct kunit_unary_assertion __assertion = {
.condition = "foo()",
.expected_true = true
};
But the challenge is you can't pass `{.condition=..., .expect_true=...}`
as a macro argument, since the comma means you're actually passing two
arguments, `{.condition=...` and `.expect_true=....}`.
So we'd made custom macros for each different initializer-list shape.
But we can work around this with the following generic macro
#define KUNIT_INIT_ASSERT(initializers...) { initializers }
Note: this has the downside that we have to rename some macros arguments
to not conflict with the struct field names (e.g. `expected_true`).
It's a bit gross, but probably worth reducing the # of macros.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Thu, 3 Nov 2022 17:47:40 +0000 (10:47 -0700)]
kunit: tool: remove redundant file.close() call in unit test
We're using a `with` block above, so the file object is already closed.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Thu, 3 Nov 2022 17:47:39 +0000 (10:47 -0700)]
kunit: tool: unit tests all check parser errors, standardize formatting a bit
Let's verify that the parser isn't reporting any errors for valid
inputs.
This change also
* does result.status checking on one line
* makes sure we consistently do it outside of the `with` block
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Daniel Latypov [Thu, 3 Nov 2022 17:47:38 +0000 (10:47 -0700)]
kunit: tool: make TestCounts a dataclass
Since we're using Python 3.7+, we can use dataclasses to tersen the
code.
It also lets us create pre-populated TestCounts() objects and compare
them in our unit test. (Before, you could only create empty ones).
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Linus Torvalds [Mon, 12 Dec 2022 21:01:14 +0000 (13:01 -0800)]
Merge tag 'x86-misc-2022-12-10' of git://git./linux/kernel/git/tip/tip
Pull misc x86 updates from Thomas Gleixner:
"Updates for miscellaneous x86 areas:
- Reserve a new boot loader type for barebox which is usally used on
ARM and MIPS, but can also be utilized as EFI payload on x86 to
provide watchdog-supervised boot up.
- Consolidate the native and compat 32bit signal handling code and
split the 64bit version out into a separate source file
- Switch the ESPFIX random usage to get_random_long()"
* tag 'x86-misc-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/espfix: Use get_random_long() rather than archrandom
x86/signal/64: Move 64-bit signal code to its own file
x86/signal/32: Merge native and compat 32-bit signal code
x86/signal: Add ABI prefixes to frame setup functions
x86/signal: Merge get_sigframe()
x86: Remove __USER32_DS
signal/compat: Remove compat_sigset_t override
x86/signal: Remove sigset_t parameter from frame setup functions
x86/signal: Remove sig parameter from frame setup functions
Documentation/x86/boot: Reserve type_of_loader=13 for barebox
Linus Torvalds [Mon, 12 Dec 2022 20:52:02 +0000 (12:52 -0800)]
Merge tag 'timers-core-2022-12-10' of git://git./linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Updates for timers, timekeeping and drivers:
Core:
- The timer_shutdown[_sync]() infrastructure:
Tearing down timers can be tedious when there are circular
dependencies to other things which need to be torn down. A prime
example is timer and workqueue where the timer schedules work and
the work arms the timer.
What needs to prevented is that pending work which is drained via
destroy_workqueue() does not rearm the previously shutdown timer.
Nothing in that shutdown sequence relies on the timer being
functional.
The conclusion was that the semantics of timer_shutdown_sync()
should be:
- timer is not enqueued
- timer callback is not running
- timer cannot be rearmed
Preventing the rearming of shutdown timers is done by discarding
rearm attempts silently.
A warning for the case that a rearm attempt of a shutdown timer is
detected would not be really helpful because it's entirely unclear
how it should be acted upon. The only way to address such a case is
to add 'if (in_shutdown)' conditionals all over the place. This is
error prone and in most cases of teardown not required all.
- The real fix for the bluetooth HCI teardown based on
timer_shutdown_sync().
A larger scale conversion to timer_shutdown_sync() is work in
progress.
- Consolidation of VDSO time namespace helper functions
- Small fixes for timer and timerqueue
Drivers:
- Prevent integer overflow on the XGene-1 TVAL register which causes
an never ending interrupt storm.
- The usual set of new device tree bindings
- Small fixes and improvements all over the place"
* tag 'timers-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
dt-bindings: timer: renesas,cmt: Add r8a779g0 CMT support
dt-bindings: timer: renesas,tmu: Add r8a779g0 support
clocksource/drivers/arm_arch_timer: Use kstrtobool() instead of strtobool()
clocksource/drivers/timer-ti-dm: Fix missing clk_disable_unprepare in dmtimer_systimer_init_clock()
clocksource/drivers/timer-ti-dm: Clear settings on probe and free
clocksource/drivers/timer-ti-dm: Make timer_get_irq static
clocksource/drivers/timer-ti-dm: Fix warning for omap_timer_match
clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error
clocksource/drivers/timer-npcm7xx: Enable timer 1 clock before use
dt-bindings: timer: nuvoton,npcm7xx-timer: Allow specifying all clocks
dt-bindings: timer: rockchip: Add rockchip,rk3128-timer
clockevents: Repair kernel-doc for clockevent_delta2ns()
clocksource/drivers/ingenic-ost: Define pm functions properly in platform_driver struct
clocksource/drivers/sh_cmt: Access registers according to spec
vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy
Bluetooth: hci_qca: Fix the teardown problem for real
timers: Update the documentation to reflect on the new timer_shutdown() API
timers: Provide timer_shutdown[_sync]()
timers: Add shutdown mechanism to the internal functions
timers: Split [try_to_]del_timer[_sync]() to prepare for shutdown mode
...
Linus Torvalds [Mon, 12 Dec 2022 20:44:03 +0000 (12:44 -0800)]
Merge tag 'x86-cleanups-2022-12-10' of git://git./linux/kernel/git/tip/tip
Pull x86 cleanups from Thomas Gleixner:
"A set of x86 cleanups:
- Rework the handling of x86_regset for 32 and 64 bit.
The original implementation tried to minimize the allocation size
with quite some hard to understand and fragile tricks. Make it
robust and straight forward by separating the register enumerations
for 32 and 64 bit completely.
- Add a few missing static annotations
- Remove the stale unused setup_once() assembly function
- Address a few minor static analysis and kernel-doc warnings"
* tag 'x86-cleanups-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/32: Remove setup_once()
x86/kaslr: Fix process_mem_region()'s return value
x86: Fix misc small issues
x86/boot: Repair kernel-doc for boot_kstrtoul()
x86: Improve formatting of user_regset arrays
x86: Separate out x86_regset for 32 and 64 bit
x86/i8259: Make default_legacy_pic static
x86/tsc: Make art_related_clocksource static
Linus Torvalds [Mon, 12 Dec 2022 20:30:31 +0000 (12:30 -0800)]
Merge tag 'x86-apic-2022-12-10' of git://git./linux/kernel/git/tip/tip
Pull x86 apic update from Thomas Gleixner:
"A set of changes for the x86 APIC code:
- Handle the case where x2APIC is enabled and locked by the BIOS on a
kernel with CONFIG_X86_X2APIC=n gracefully.
Instead of a panic which does not make it to the graphical console
during very early boot, simply disable the local APIC completely
and boot with the PIC and very limited functionality, which allows
to diagnose the issue
- Convert x86 APIC device tree bindings to YAML
- Extend x86 APIC device tree bindings to configure interrupt
delivery mode and handle this in during init. This allows to boot
with device tree on platforms which lack a legacy PIC"
* tag 'x86-apic-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/of: Add support for boot time interrupt delivery mode configuration
x86/of: Replace printk(KERN_LVL) with pr_lvl()
dt-bindings: x86: apic: Introduce new optional bool property for lapic
dt-bindings: x86: apic: Convert Intel's APIC bindings to YAML schema
x86/of: Remove unused early_init_dt_add_memory_arch()
x86/apic: Handle no CONFIG_X86_X2APIC on systems with x2APIC enabled by BIOS
Linus Torvalds [Mon, 12 Dec 2022 19:56:59 +0000 (11:56 -0800)]
Merge tag 'smp-core-2022-12-10' of git://git./linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
"A small set of updates for CPU hotplug:
- Prevent stale CPU hotplug state in the cpu_down() path which was
detected by stress testing the sysfs interface
- Ensure that the target CPU hotplug state for the boot CPU is
CPUHP_ONLINE instead of the compile time init value CPUHP_OFFLINE.
- Switch back to the original behaviour of warning when a CPU hotplug
callback in the DYING/STARTING section returns an error code.
Otherwise a buggy callback can leave the CPUs in an non recoverable
state"
* tag 'smp-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Do not bail-out in DYING/STARTING sections
cpu/hotplug: Set cpuhp target for boot cpu
cpu/hotplug: Make target_store() a nop when target == state
Linus Torvalds [Mon, 12 Dec 2022 19:21:29 +0000 (11:21 -0800)]
Merge tag 'irq-core-2022-12-10' of git://git./linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt core and driver subsystem:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for
PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows
device manufactures to provide implementation defined storage for MSI
messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified
message store which is uniform accross all devices). The PCI/MSI[-X]
uniformity allowed us to get away with "global" PCI/MSI domains.
IMS not only allows to overcome the size limitations of the MSI-X
table, but also gives the device manufacturer the freedom to store the
message in arbitrary places, even in host memory which is shared with
the device.
There have been several attempts to glue this into the current MSI
code, but after lengthy discussions it turned out that there is a
fundamental design problem in the current PCI/MSI-X implementation.
This needs some historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management
was completely different from what we have today in the actively
developed architectures. Interrupt management was completely
architecture specific and while there were attempts to create common
infrastructure the commonalities were rudimentary and just providing
shared data structures and interfaces so that drivers could be written
in an architecture agnostic way.
The initial PCI/MSI[-X] support obviously plugged into this model
which resulted in some basic shared infrastructure in the PCI core
code for setting up MSI descriptors, which are a pure software
construct for holding data relevant for a particular MSI interrupt,
but the actual association to Linux interrupts was completely
architecture specific. This model is still supported today to keep
museum architectures and notorious stragglers alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the
kernel, which was creating yet another architecture specific mechanism
and resulted in an unholy mess on top of the existing horrors of x86
interrupt handling. The x86 interrupt management code was already an
incomprehensible maze of indirections between the CPU vector
management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X]
implementation.
At roughly the same time ARM struggled with the ever growing SoC
specific extensions which were glued on top of the architected GIC
interrupt controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86
vector domain and interrupt remapping and also allowed ARM to handle
the zoo of SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that
it is not available the PCI/MSI[-X] domains have the vector domain as
their parent. This reduced the required interaction between the
domains pretty much to the initialization phase where it is obviously
required to establish the proper parent relation ship in the
components of the hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the
hardware it's clear that the actual PCI/MSI[-X] interrupt controller
is not a global entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the
easy solution of providing "global" PCI/MSI domains which was possible
because the PCI/MSI[-X] handling is uniform across the devices. This
also allowed to keep the existing PCI/MSI[-X] infrastructure mostly
unchanged which in turn made it simple to keep the existing
architecture specific management alive.
A similar problem was created in the ARM world with support for IP
block specific message storage. Instead of going all the way to stack
a IP block specific domain on top of the generic MSI domain this ended
in a construct which provides a "global" platform MSI domain which
allows overriding the irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the
MSI infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into
the existing infrastructure. Some of this just works by chance on
particular platforms but will fail in hard to diagnose ways when the
driver is used on platforms where the underlying MSI interrupt
management code does not expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront
to avoid resource exhaustion. They are expanded at run-time when the
guest actually tries to use them. The way how this is implemented is
that the host disables MSI-X and then re-enables it with a larger
number of vectors again. That works by chance because most device
drivers set up all interrupts before the device actually will utilize
them. But that's not universally true because some drivers allocate a
large enough number of vectors but do not utilize them until it's
actually required, e.g. for acceleration support. But at that point
other interrupts of the device might be in active use and the MSI-X
disable/enable dance can just result in losing interrupts and
therefore hard to diagnose subtle problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact
that IMS is not longer providing a uniform storage and configuration
model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per
device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for
PCI/IMS. PCI/IMS has been verified with the work in progress IDXD
driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place"
* tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits)
irqchip/ti-sci-inta: Fix kernel doc
irqchip/gic-v2m: Mark a few functions __init
irqchip/gic-v2m: Include arm-gic-common.h
irqchip/irq-mvebu-icu: Fix works by chance pointer assignment
iommu/amd: Enable PCI/IMS
iommu/vt-d: Enable PCI/IMS
x86/apic/msi: Enable PCI/IMS
PCI/MSI: Provide pci_ims_alloc/free_irq()
PCI/MSI: Provide IMS (Interrupt Message Store) support
genirq/msi: Provide constants for PCI/IMS support
x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN
PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X
PCI/MSI: Provide prepare_desc() MSI domain op
PCI/MSI: Split MSI-X descriptor setup
genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN
genirq/msi: Provide msi_domain_alloc_irq_at()
genirq/msi: Provide msi_domain_ops:: Prepare_desc()
genirq/msi: Provide msi_desc:: Msi_data
genirq/msi: Provide struct msi_map
x86/apic/msi: Remove arch_create_remap_msi_irq_domain()
...
Linus Torvalds [Mon, 12 Dec 2022 19:11:57 +0000 (11:11 -0800)]
Merge tag 'core-debugobjects-2022-12-10' of git://git./linux/kernel/git/tip/tip
Pull debugobjects update from Thomas Gleixner:
"A single update for debugobjects:
Add the object pointer to the debug output for better correlation with
other debug facilities"
* tag 'core-debugobjects-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobjects: Print object pointer in debug_print_object()
Linus Torvalds [Mon, 12 Dec 2022 19:10:02 +0000 (11:10 -0800)]
Merge tag 'x86-urgent-2022-12-12' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Three small x86 fixes which did not make it into 6.1:
- Remove a superfluous noinline which prevents GCC-7.3 to optimize a
stub function away
- Allow uprobes on REP NOP and do not treat them like word-sized
branch instructions
- Make the VDSO symbol export of __vdso_sgx_enter_enclave() depend on
CONFIG_X86_SGX to prevent build failures with newer LLVM versions
which rightfully detect that there is no function behind the
symbol"
* tag 'x86-urgent-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Conditionally export __vdso_sgx_enter_enclave()
uprobes/x86: Allow to probe a NOP instruction with 0x66 prefix
x86/alternative: Remove noinline from __ibt_endbr_seal[_end]() stubs
Linus Torvalds [Mon, 12 Dec 2022 19:04:08 +0000 (11:04 -0800)]
Merge tag 's390-6.2-1' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Factor out handle_write() function and simplify 3215 console write
operation
- When 3170 terminal emulator is connected to the 3215 console driver
the boot time could be very long due to limited buffer space or
missing operator input. Add con3215_drop command line parameter and
con3215_drop sysfs attribute file to instruct the kernel drop console
data when such conditions are met
- Fix white space errors in 3215 console driver
- Move enum paiext_mode definition to a header file and rename it to
paievt_mode to indicate this is now used for several events. Rename
PAI_MODE_COUNTER to PAI_MODE_COUNTING to make consistent with
PAI_MODE_SAMPLING
- Simplify the logic of PMU pai_crypto mapped buffer reference counter
and make it consistent with PMU pai_ext
- Rename PMU pai_crypto mapped buffer structure member users to
active_events to make it consistent with PMU pai_ext
- Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP configuration option. This
results in saving of 12K per 1M hugetlb page (~1.2%) and 32764K per
2G hugetlb page (~1.6%)
- Use generic serial.h, bugs.h, shmparam.h and vga.h header files and
scrap s390-specific versions
- The generic percpu setup code does not expect the s390-like
implementation and emits a warning. To get rid of that warning and
provide sane CPU-to-node and CPU-to-CPU distance mappings implementat
a minimal version of setup_per_cpu_areas()
- Use kstrtobool() instead of strtobool() for re-IPL sysfs device
attributes
- Avoid unnecessary lookup of a pointer to MSI descriptor when setting
IRQ affinity for a PCI device
- Get rid of "an incompatible function type cast" warning by changing
debug_sprintf_format_fn() function prototype so it matches the
debug_format_proc_t function type
- Remove unused info_blk_hdr__pcpus() and get_page_state() functions
- Get rid of clang "unused unused insn cache ops function" warning by
moving s390_insn definition to a private header
- Get rid of clang "unused function" warning by making function
raw3270_state_final() only available if CONFIG_TN3270_CONSOLE is
enabled
- Use kstrobool() to parse sclp_con_drop parameter to make it identical
to the con3215_drop parameter and allow passing values like "yes" and
"true"
- Use sysfs_emit() for all SCLP sysfs show functions, which is the
current standard way to generate output strings
- Make SCLP con_drop sysfs attribute also writable and allow to change
its value during runtime. This makes SCLP console drop handling
consistent with the 3215 device driver
- Virtual and physical addresses are indentical on s390. However, there
is still a confusion when pointers are directly casted to physical
addresses or vice versa. Use correct address converters
virt_to_phys() and phys_to_virt() for s390 channel IO drivers
- Support for power managemant has been removed from s390 since quite
some time. Remove unused power managemant code from the appldata
device driver
- Allow memory tools like KASAN see memory accesses from the checksum
code. Switch to GENERIC_CSUM if KASAN is enabled, just like x86 does
- Add support of ECKD DASDs disks so it could be used as boot and dump
devices
- Follow checkpatch recommendations and use octal values instead of
S_IRUGO and S_IWUSR for dump device attributes in sysfs
- Changes to vx-insn.h do not cause a recompile of C files that use
asm(".include \"asm/vx-insn.h\"\n") magic to access vector
instruction macros from inline assemblies. Add wrapper include header
file to avoid this problem
- Use vector instruction macros instead of byte patterns to increase
register validation routine readability
- The current machine check register validation handling does not take
into account various scenarios and might lead to killing a wrong user
process or potentially ignore corrupted FPU registers. Simplify logic
of the machine check handler and stop the whole machine if the
previous context was kerenel mode. If the previous context was user
mode, kill the current task
- Introduce sclp_emergency_printk() function which can be used to emit
a message in emergency cases. It is supposed to be used in cases
where regular console device drivers may not work anymore, e.g.
unrecoverable machine checks
Keep the early Service-Call Control Block so it can also be used
after initdata has been freed to allow sclp_emergency_printk()
implementation
- In case a system will be stopped because of an unrecoverable machine
check error print the machine check interruption code to give a hint
of what went wrong
- Move storage error checking from the assembly entry code to C in
order to simplify machine check handling. Enter the handler with DAT
turned on, which simplifies the entry code even more
- The machine check extended save areas are allocated using a private
"nmi_save_areas" slab cache which guarantees a required power-of-two
alignment. Get rid of that cache in favour of kmalloc()
* tag 's390-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (38 commits)
s390/nmi: get rid of private slab cache
s390/nmi: move storage error checking back to C, enter with DAT on
s390/nmi: print machine check interruption code before stopping system
s390/sclp: introduce sclp_emergency_printk()
s390/sclp: keep sclp_early_sccb
s390/nmi: rework register validation handling
s390/nmi: use vector instruction macros instead of byte patterns
s390/vx: add vx-insn.h wrapper include file
s390/ipl: use octal values instead of S_* macros
s390/ipl: add eckd dump support
s390/ipl: add eckd support
vfio/ccw: identify CCW data addresses as physical
vfio/ccw: sort out physical vs virtual pointers usage
s390/checksum: support GENERIC_CSUM, enable it for KASAN
s390/appldata: remove power management callbacks
s390/cio: sort out physical vs virtual pointers usage
s390/sclp: allow to change sclp_console_drop during runtime
s390/sclp: convert to use sysfs_emit()
s390/sclp: use kstrobool() to parse sclp_con_drop parameter
s390/3270: make raw3270_state_final() depend on CONFIG_TN3270_CONSOLE
...
Linus Torvalds [Mon, 12 Dec 2022 19:02:19 +0000 (11:02 -0800)]
Merge tag 'm68k-for-v6.2-tag1' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
- remove an unused function involving a non-explictly signed char type
- reword a (correct) comment to stop the inflood of (incorrect) patches
trying to fix it
- defconfig updates
* tag 'm68k-for-v6.2-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: defconfig: Update defconfigs for v6.1-rc1
m68k: mac: Reword comment using double "in"
m68k: mac: Remove unused rbv_set_video_bpp()
Linus Torvalds [Mon, 12 Dec 2022 18:59:13 +0000 (10:59 -0800)]
Merge tag 'mips_6.2' of git://git./linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- DT cleanups
- fix for early use of kzalloc on mt7621 platform
- cleanups and fixes
* tag 'mips_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (21 commits)
MIPS: OCTEON: warn only once if deprecated link status is being used
MIPS: BCM63xx: Add check for NULL for clk in clk_enable
platform/mips: Adjust Kconfig to keep consistency
MIPS: OCTEON: cvmx-bootmem: use strscpy() to instead of strncpy()
MIPS: mscc: jaguar2: Fix pca9545 i2c-mux node names
mips/pci: use devm_platform_ioremap_resource()
mips: ralink: mt7621: do not use kzalloc too early
mips: ralink: mt7621: soc queries and tests as functions
mips: ralink: mt7621: define MT7621_SYSC_BASE with __iomem
MIPS: Restore symbol versions for copy_page_cpu and clear_page_cpu
mips: dts: remove label = "cpu" from DSA dt-binding
mips: ralink: mt7621: change DSA port labels to generic naming
mips: ralink: mt7621: fix phy-mode of external phy on GB-PC2
MIPS: vpe-cmp: fix possible memory leak while module exiting
MIPS: vpe-mt: fix possible memory leak while module exiting
dt-bindings: mips: brcm: add Broadcom SoCs bindings
dt-bindings: mips: add CPU bindings for MIPS architecture
mips: dts: brcm: bcm7435: add "interrupt-names" for NAND controller
mips: dts: bcm63268: add TWD block timer
MIPS: Use "grep -E" instead of "egrep"
...
Linus Torvalds [Mon, 12 Dec 2022 18:55:53 +0000 (10:55 -0800)]
Merge tag 'for-linus-6.2-rc1-tag' of git://git./linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- fix memory leaks in error paths
- add support for virtio PCI-devices in Xen guests on ARM
- two minor fixes
* tag 'for-linus-6.2-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/privcmd: Fix a possible warning in privcmd_ioctl_mmap_resource()
x86/xen: Fix memory leak in xen_init_lock_cpu()
x86/xen: Fix memory leak in xen_smp_intr_init{_pv}()
xen: fix xen.h build for CONFIG_XEN_PVH=y
xen/virtio: Handle PCI devices which Host controller is described in DT
xen/virtio: Optimize the setup of "xen-grant-dma" devices