From: Dylan Baker
Date: Mon, 28 Sep 2020 23:27:44 +0000 (-0700)
Subject: docs: add release notes for 20.2.0
X-Git-Tag: upstream/21.0.0~4349
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=ddad8d9c983e042671159ae5adb9eaa5d947ed17;p=platform%2Fupstream%2Fmesa.git

docs: add release notes for 20.2.0

Part-of:
---

diff --git a/docs/relnotes/20.2.0.rst b/docs/relnotes/20.2.0.rst
new file mode 100644
index 0000000..bb21ad1
--- /dev/null
+++ b/docs/relnotes/20.2.0.rst
@@ -0,0 +1,4748 @@
+Mesa 20.2.0 Release Notes / 2020-09-28
+======================================
+
+Mesa 20.2.0 is a new development release. People who are concerned
+with stability and reliability should stick with a previous release or
+wait for Mesa 20.2.1.
+
+Mesa 20.2.0 implements the OpenGL 4.6 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 4.6. OpenGL
+4.6 is **only** available if requested at context creation.
+Compatibility contexts may report a lower version depending on each driver.
+
+Mesa 20.2.0 implements the Vulkan 1.2 API, but the version reported by
+the apiVersion property of the VkPhysicalDeviceProperties struct
+depends on the particular driver being used.
+
+SHA256 checksum
+---------------
+
+::
+
+    TBD.
+
+
+New features
+------------
+
+- GL_ARB_compute_variable_group_size on Iris.
+
+- GL_ARB_gpu_shader5 on llvmpipe
+
+- GL_ARB_post_depth_coverage on llvmpipe
+
+- GLES 3.2 on llvmpipe
+
+- GL_EXT_shader_group_vote on GLES3.
+
+- GL_EXT_texture_shadow_lod on llvmpipe
+
+- VK_AMD_texture_gather_bias_lod on RADV.
+
+- VK_AMD_gpu_shader_half_float on RADV/ACO.
+
+- VK_AMD_gpu_shader_int16 on RADV/ACO.
+
+- VK_EXT_extended_dynamic_state on ANV and RADV.
+
+- VK_EXT_image_robustness on RADV.
+
+- VK_EXT_private_data on ANV and RADV.
+
+- VK_EXT_custom_border_color on ANV and RADV.
+
+- VK_EXT_pipeline_creation_cache_control on ANV and RADV.
+
+- VK_EXT_shader_demote_to_helper_invocation on RADV/LLVM.
+
+- VK_EXT_subgroup_size_control on RADV/ACO.
+
+- VK_GOOGLE_user_type on ANV and RADV.
+
+- VK_KHR_shader_subgroup_extended_types on RADV/ACO.
+
+- GL_ARB_gl_spirv on nvc0/nir.
+
+- GL_ARB_spirv_extensions on nvc0/nir.
+
+- RADV now uses ACO per default as backend
+
+- RADV_DEBUG=llvm option to enable LLVM backend for RADV
+
+- VK_EXT_image_robustness for ANV
+
+- VK_EXT_shader_atomic_float on ANV
+
+- VK_EXT_4444_formats on ANV and RADV.
+
+- VK_KHR_memory_model on RADV.
+
+- GL 4.5 on llvmpipe
+
+- EGL_KHR_swap_buffers_with_damage on X11 (DRI3)
+
+
+Bug fixes
+---------
+
+- [Regression][Bisected][20.2][radeonsi] American Truck Simulator continually allocates memory until OOM
+- anv: dEQP-VK.robustness.robustness2.* failures on gen12
+- [RADV] Problems reading primitive ID in fragment shader after tessellation
+- Massive memory leak (at least AMD, others unknown)
+- Substance Painter 6.1.3 black glitches on Radeon RX570
+- vkCmdCopyImage broadcasts subsample 0 of MSAA src into all subsamples of dst on RADV
+- Crash in ruvd_end_frame when calling vaBeginPicture/vaEndPicture without rendering anything
+- X-Plane 11 Installer crashes on startup since `glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins`
+- Horizon Zero Dawn graphics corruption with with radv
+- Amber test opt_peel_loop_initial_if: Assertion failed
+- Dirt Rally: Flickering glitches on certain foliage since Mesa 20.1.0 caused by MSAA
+- [BRW] WRC 5 asserts with gallium nine and iris.
+- radv: Corruption in "The Surge 2"
+- [RADV] Detroit: Become Human Demo game lock-ups with RADV
+- Road Redemption certain graphic effects rendered white color
+- vulkan/wsi/x11: deadlock with Xwayland when compositor holds multiple buffers
+- [RADV/ACO] Death Stranding cause a GPU hung (*ERROR* Waiting for fences timed out!)
+- lp_bld_init.c:172:7: error: implicit declaration of function ‘LLVMAddConstantPropagationPass’; did you mean ‘LLVMAddCorrelatedValuePropagationPass’? [-Werror=implicit-function-declaration]
+- Intel Vulkan driver crash with alpha-to-coverage
+- EGL_KHR_swap_buffers_with_damage support on X11
+- radv: blitting 3D images with linear filter
+- [ACO] Compiling pipelines from RPCS3's shader interpreter spins forever in ACO code
+- Intel Vulkan driver assertion with small xfb buffer
+- [spirv-fuzz] SPIR-V parsing failed "src->type->type == dest->type->type"
+- radeonsi: radeonsi crashes in Chrome on chromeos
+- [RADV] commit d19bc94e4eb94 broke gamescope with Navi
+- 4e3a7dcf6ee4946c46ae8b35e7883a49859ef6fb breaks Gamescope showing windows properly.
+- anv: crashes in CTS test dEQP-VK.subgroups.*.framebuffer.*_tess_eval
+- Intel Vuikan (anv) crash in copy_non_dynamic_state() when using validation layer
+- Mafia 3: Trees get rendered incorrectly
+- radv: dEQP-VK.synchronization.op.multi_queue.timeline_semaphore.write_clear_attachments_*_concurrent fail when forcing DCC.
+- Crash on GTA 5 through proton 5.0.9 and GE versions
+- Mesa 20.2.0-rc1 fails to build for AMD
+- Assertion failure compiling shader from Zigguart
+- Panfrost locks for waiting fence when running Source engine games
+- ci: `-Dtools=panfrost` should be build-tested
+- panfrost: Register allocation fails for Firefox WebRender shaders
+- VRAM leak with vuilkan external memory + opengl memory objects
+- [vulkan/build] Recent build system changes made VK_EXT_acquire_xlib_display unnecessarily depend on GBM
+- ci: Capture devcoredumps on chezas
+- Possible array out of bounds in brw_vec4_nir.cpp
+- freedreno/a6xx: incorrect rendering in asphalt 9
+- [tgl][bisected][regression][iris] failure on dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_default
+- Multiply defined symbols compiling with gcc@10.1.0
+- shrinking descriptor pool on intel+vulkan
+- dEQP-VK.renderpass2.dedicated_allocation.attachment.1.12 fails on NAVI14
+- turnip: binning and indirect dependency
+- Amber test leads to NIR validation failed after nir_opt_if (on spirv-fuzz shader)
+- Unable to compile mesa-git from b559d26c
+- Ambient light too bright with ACO in AC: Odyssey
+- Multiple issues with Detroit Become Human
+- ci: Capture artifacts in baremetal mode
+- turnip/ir3: fine derivatives
+- panfrost: regression: Major stuttering and low compositor FPS with glmark2
+- khr_debug-push-pop-group_gl: ../src/util/simple_mtx.h:86: simple_mtx_lock: Assertion `c != _SIMPLE_MTX_INVALID_VALUE' failed.
+- freedreno/a6xx: skai/skqp fails
+- SPIR-V parsing fails in src/compiler/spirv/spirv_to_nir.c
+- SPIR-V parsing fails in src/compiler/spirv/vtn_cfg.c
+- Weird GLSL bug
+- iris driver is broken in Freedesktop 19.08
+- LLVM not properly shutdown in `si_pipe.c`?
+- Panfrost: add current status to docs/features.txt
+- Opengl incorrect rendering on yuzu Amd
+- RADV: VK_ACCESS_MEMORY_READ/WRITE_BIT is not implemented
+- [bisected][regression][all platforms] multiple deqp-gles31/glescts/piglit failures
+- 7406ea37, "ac/surface: require that gfx8 doesn't have DCC in order to be displayable", breaks Gamescope being able to launch games on RX580, and possibly other gfx8 cards
+- vkGetSemaphoreCounterValue doesn't update without vkWaitSemaphores calls on Intel UHD 620
+- [RADV] System crash when playing XCOM Chimera Squad because of commit #7a5e6fd2
+- [RADV] Non-precise occlusion queries return non-zero when all fragments are discarded
+- [DXVK] Project Cars rendering problems
+- ADDRLIB ODR Violation
+- Build fails with current mesa from git "undefinierter Verweis auf »nir_lower_clip_disable«"
+- KDE Compositor stuttering after Check for window destruction in dri3_wait_for_event_locked
+- Add fallthrough to prevent errors caused by missing break
+- i965/20.1: gray rendering with torcs racing
+- glBindBufferRange call seems to be ignored by one of two shader-programs on radeon cards
+- [bisected][g33] piglit.spec.ext_framebuffer_object.fbo-cubemap failure
+- Increase GL_MAX_COMPUTE_SHADER_STORAGE_BLOCKS to greater value.
+- nir: st_nir_lower_builtin fails for gl_LightSource[i]
+- Sometimes VLC player process gets stuck in memory after closure if video output used is Auto or OpenGL
+- Double unlock in rbug_context.c
+- Double copy for TexSubImage
+- [v3d] corruption when GS omits some vertices
+- Iris crashes when reading from multisampled front buffer on platforms without front buffer
+- freedreno: subway surfers crash when repeatedly toggling fullscreen
+- [RADV/GFX8] Performance drop in DOOM Eternal when "Present from compute" is enabled
+- freedreno: multiple applications crash on a5xx
+- Use-after-free crash innv50_ir::GCRA::RIG_Node::init()
+- intel: Sample mask writes need to be honored in Vulkan
+- [RADV] - Path of Exile (238960) - Map outline, landscape and markers are missing with the Vulkan renderer.
+- ASTC texture decompression fails when using software fallback
+- [i965][iris][regression][bisected] multiple piglit and glcts failures on all platforms
+- please publish GPG keyring used to sign new releases
+- [BISECTED] compiling shader causes crash
+- Missing render Information on Stellaris
+- freedreno/ir3: allow copy-propagate from array
+- Zink + GALLIUM_HUD SIGSEGV
+- piglit spec@egl_ext_device_base@conformance fails LLVM 11 Git assertion since "llvmpipe/fs: add caching support"
+- llvmpipe: 1x1 framebuffer with a 2x2 viewport
+- [regression] nir build failure
+- ci: need to end baremetal tests after kernel panic/instaboot
+- If-statement body is executed for false condition
+- freedreno/a6xx: broken rendering in playcanvas "after the flood"
+- [regression] performance drop on Dota 2, CS:GO, and gfxbench GL benchmarks on ICL/Iris
+- [amd] C++ ODR violatation for union GB_ADDR_CONFIG
+- Zink reports incorrect amount of video memory
+- [RADV/LLVM]: void llvm::ICmpInst::AssertOK(): Assertion `getOperand(0)->getType() == getOperand(1)->getType() && "Both operands to ICmp instruction are not of the same type!"' failed.
+- glsl-1.50-gs-max-output hangs on Navi10 + NGG
+- anv: Runs out of binding tables with PPSSPP during long runs
+- Segfault in Panfrost with waypipe
+- ci: Use rsync instead of rm -rf ; cp for baremetal rootfs
+- i965: Rendering problems replaying a trace of "Refunct" after mesa-20.1.0-rc1 release [bisected]
+- Panfrost (rk3399 NanoPi M4) hang/crash on playing video on Kodi/X11
+- gallium/winsys/radeon/drm fails assertion on 32bit
+- NIR validation failed after glsl to nir, before function inline, wrong {src,dst}->type ?
+- nir/spirv asin() function not precise enough
+- Mesa 20.0.7 / 20.1.0-rc4 regression, extremally long shader compilation time in NIR
+- Android build error after 689acc73
+- freedreno/a6xx: gpu hangs in google earth
+- Mesa-git build fails on Fedora Rawhide
+- Doom Eternal 1.1 performs very poorly on RADV
+- iris/i965: possible regression in 20.0.5 due to changes in buffer manager sharing across screens (firefox/mozilla#1634213)
+- iris/i965: possible regression in 20.0.5 due to changes in buffer manager sharing across screens (firefox/mozilla#1634213)
+- Incorrect _NetBSD__ macro inside execmem.c
+- Possible invalid sizeof in device.c
+- YUV FP16 lowering validation failing
+- GLSL compiler assertion is_float() failed in glsl/ir_validate.cpp, visit_leave on specific WebGL shader
+- [RADV] - Doom Eternal (782330) & Metro Exodus (412020) - Title requires 'RADV_DEBUG=zerovram' to eliminate colorful graphical aberrations.
+- [RADV] - Doom Eternal (782330) & Metro Exodus (412020) - Title requires 'RADV_DEBUG=zerovram' to eliminate colorful graphical aberrations.
+- mesa trunk master vulkan overlay-layer meson.build warning empty configuration_data() object
+- [meson] increase minimum required version
+- Kicad fails to render 3D PCB models.
+- freedreno: minetest: alpha channel issue on a6xx
+- Reproduceable i915 gpu hang Intel Iris Plus Graphics (Ice Lake 8x8 GT2)
+- 7 Days to Die - "Reflection Quality" setting broken, results in environment rendered black
+- glsl: regression affecting shader compilation time
+- freedreno: glamor issue with x11 desktops
+- finish converting from fnv1a to xxhash
+- Hang in iris_dri in kitty
+- Setting twice value to output_stream in radv_nir_to_llvm.c
+- Overwriting value of `jit_tex->sample_stride` in lp_setup.c
+- [AMDGPU][OpenGL] apitrace of kernel/firmware crash that requires a reboot
+- Flickering in Superposition benchmark
+- Double lock in fbobject.c
+- Possible typo in aco_insert_waitcnt.cpp
+- [bisected] Steam crashes when newest Iris built with LTO
+- Freeing null pointer inside radv_amdgpu_cs.c
+- Duplicated sub expression in radv_nir_to_llvm.c
+- i965/vec4: opt_cse_local cause the out of bound array access
+- NIR: Regression on shader using 8/16-bit integers
+- ACO: Compiler segfault on 8/16-bit integers.
+- lp_bld_intr.c:70:16: error: use of undeclared identifier 'LLVMFixedVectorTypeKind'; did you mean 'LLVMVectorTypeKind'?
+- recent seqno changes causing surfaceflinger crash
+- [radeonsi] [glthread] Crash with glthread enabled
+- Deadlock in anv_timelines_wait()
+- [gles3] supertuxkart: some textures are incorrect
+- post_version.py does not work with release candidates
+- post_version.py does not work with release candidates
+- radv regression on android
+- ogl: Set mesa_glthread=true as default on the RPCS3 emulator
+- [iris] android deqp dEQP-EGL.functional.robustness.negative_context#invalid_notification_strategy_enum fails
+- zink: conditional rendering
+- [RadeonSI] Glitches on VEGA8 + RX 560X after MR 4863
+- RadeonSI OpenGL broken for GFX8 after unify code for overriding offset
+- freedreno/turnip: Don't request fragcoord components we don't use
+- Make check fails in ANV
+- src\util\meson.build:294:4: ERROR: Program or command 'winepath' not found or not executable
+- Please add Zink to features.txt
+- llvmpipe: assert triggers in LLVM
+- debug builds are massively broken on Windows
+- ci: Report flakes on IRC from baremetal tests
+- heavy glitches on amd ryzen 5 since version 20.x
+- zink asserts with 32-bit boolean
+- OpenGL: Surviving Mars black screen late-game (possible shader problem)
+- Kerbal Space Program (KSP) hangs entire Navi system
+- Dirt: Showdown bad performance and broken rendering with enabled advanced lightning
+- gravit & Firefox WebGL broken since 3dc2ccc14c0e035368fea6ae3cce8c481f3c4ad2 "ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE"
+- mesa 20.0.5 causing kitty to crash
+- radeonsi: "Torchlight II" trace showing regression on mesa-20.0.6 [bisected]
+- [RADV/LLVM/ACO/Regression] After mesa commit a3dc7fffbb7be0f1b2ac478b16d3acc5662dff66 all games stucks at start
+- Android building error after commit 2ab45f41
+- freedreno/a6xx: pubg rendering glitches
+- iris: Crash when trying to capture window in OBS Studio
+- lp_test_format failure with llvm-11
+
+
+Changes
+-------
+
+Abhishek Kumar (1):
+
+- egl: Limit the EGL ver for android
+
+Adam Jackson (1):
+
+- glx: Fix build and warnings with -Dglx=dri -Dglx-direct=false
+
+Alejandro Piñeiro (9):
+
+- v3d/tex: only look up the 2nd texture gather offset for 1d non-arrays
+- v3d/tex: set up default values for Configuration Parameter 1 if possible
+- v3d/tex: use TMUSLOD register if possible
+- v3d: moving v3d simulator to src/broadcom
+- v3d/tex: handle correctly coordinates for cube/cubearrays images
+- vulkan/util: add struct vk_pipeline_cache_header
+- nir/lower_tex: handle query lod with nir_lower_tex_packing_16 at lower_tex_packing
+- v3d/packet: fix typo on Set InstanceID/PrimitiveID packet
+- v3d: set instance id to 0 at start of tile
+
+Alyssa Rosenzweig (475):
+
+- pan/mdg: Track more types
+- pan/mdg: Be a bit more pedantic in invert passes
+- panfrost: Enumify bifrost blend types
+- pan/bi: Add texture indices to IR
+- pan/bi: Pipe multiple textures through
+- pan/bi: Pack round opcodes (FMA, either 16 or 32)
+- pan/bit: Add framework forinterpreting double vs float
+- pan/bit: Interpret ROUND
+- pan/bit: Add round tests
+- panfrost: Fix texture field size
+- panfrost: Fix size of bifrost sampler descriptor
+- panfrost: Fix sampler wrap/filter field orders
+- panfrost: Fix norm coords on bifrost sampler
+- panfrost: Fix tiled texture "stride"s on Bifrost
+- pan/decode: Don't crash on missing payload
+- pan/bi: Enable lower_mediump_outputs NIR pass
+- panfrost: Update Bifrost fields in mali_shader_meta
+- pan/bi: Lower for now sincos
+- pan/mdg: Ingest actual isub ops
+- pan/mdg: Rename .one to .sat_signed
+- pan/mdg: Move constant switch opts to algebraic pass
+- pan/mdg: Drop forever todo
+- pan/mdg: Drop `opt` in name of midgard_opt_cull_dead_branch
+- pan/mdg: Enable nir_opt_algebraic_distribute_src_mods
+- panfrost: Update dEQP expectation list
+- panfrost: Setup gl_FragCoord as sysval on Bifrost
+- pan/bi: Add clause type for gl_FragCoord.zw load
+- pan/bi: Abort on unknown op packing
+- pan/bi: Abort on unhandled intrinsics
+- pan/bi: Futureproof COMBINE lowering against non-u32
+- pan/bi: Print bad instruction on src packing fail
+- pan/bi: Passthrough direct ld_var addresses
+- pan/bi: Lower gl_FragCoord
+- pan/bi: Set clause type for gl_FragCoord.z
+- pan/bi: Fix double-abs flipping
+- pan/bi: Fix missing swizzle
+- pan/bi: Fix incorrectly flipped swizzle
+- pan/bi: Disable CSEL4 emit for now
+- pan/bi: Fix DISCARD ops in disasm
+- pan/bi: Structify DISCARD
+- pan/bi: Remove BI_GENERIC
+- pan/bi: Unwrap BRANCH into CONDITIONAL class
+- pan/bi: Handle discard_if in NIR->BIR naively
+- pan/bi: Emit discard (not if)
+- pan/bi: Add float-only mode to condition fusing
+- pan/bi: Fuse conditions into discard_if
+- pan/bi: Handle discard/branch in get_component_count
+- pan/bi: Pack ADD.DISCARD
+- pan/bi: Structify ADD ICMP 16
+- pan/bi: Pack ADD ICMP 32
+- pan/bi: Pack ADD ICMP 16
+- pan/bi: Don't pack ICMP on FMA
+- pan/bit: Add swizzles to round tests
+- pan/bit: Add more 16-bit fmod tests
+- pan/bit: Add ICMP tests
+- pan/bi: Rename BI_ISUB to BI_IMATH
+- pan/bi: Use IMATH for nir_op_iadd
+- pan/bi: Pack FMA IADD/ISUB 32
+- pan/bi: Pack ADD IADD/ISUB for 8/16/32
+- pan/bi: Add SUB.v2i16/SUB.v4i8 opcodes to disasm
+- pan/bi: Don't schedule <32-bit IMATH to FMA
+- pan/bit: Interpret IMATH
+- pan/bit: Interpret v4i8 ops
+- pan/bit: Remove test names
+- pan/bit: Use swizzle helper for round
+- pan/bit: Factor out identity swizzle helper
+- pan/bit: Add IMATH packing tests
+- pan/decode: Fix flags_hi printing
+- pan/mdg: Explain helper invocations dataflow theory
+- pan/mdg: Analyze helper invocation termination
+- pan/mdg: Analyze helper execution requirements
+- pan/mdg: Use the helper invo analyze passes
+- pan/mdg: Use analysis to set .cont/.last flags
+- pan/mdg: Remove texture_op_count
+- pan/mdg: Set types for derivatives
+- pan/mdg: Fix derivative swizzle
+- panfrost: Run dEQP-GLES3.functional.shaders.derivate.* on CI
+- pan/decode: Use a page table for tracking mmaps
+- pan/decode: Fix min/max_tile_coord mixup
+- pan/mfbd: Add format codes for PIPE_FORMAT_B5G5R5A1_UNORM
+- panfrost: Switch formats to table
+- panfrost: Fix Z24 vs Z32 mixup
+- panfrost: Enable AFBC for Z24X8
+- nir: Add fsat_signed opcode
+- nir: Add fclamp_pos opcode
+- panfrost: Add modifier detection helpers
+- pan/mdg: Remove .pos propagation pass
+- pan/mdg: Drop nir_lower_to_source_mods
+- pan/mdg: Prepare for modifier helpers
+- pan/mdg: Ingest fsat_signed/fclamp_pos
+- pan/mdg: Apply abs/neg modifiers
+- pan/mdg: Treat inot as a modifier
+- pan/mdg: Remove invert optimizations
+- pan/mdg: Use helpers for branch/discard inversion
+- pan/mdg: Apply outmods
+- pan/mdg: Emit fcsel when beneficial
+- pan/mdg: Optimize pipelining logic
+- pan/mdg: Precompute mir_special_index
+- pan/mdg: Optimize liveness computation in DCE
+- pan/mdg: Handle comparisons in fp16 path
+- pan/mdg: Fix constant combining crash
+- pan/mdg: Remove mir_*size routines
+- pan/mdg: Remove mir_get_alu_src
+- pan/mdg: Include more types
+- pan/mdg: Handle dest up/lower correctly with swizzles
+- pan/mdg: Respect !32-bit sizes in RA
+- pan/mdg: Explain ld/st sign/zero extension
+- pan/mdg: Add abs/neg/shift modifiers to IR
+- pan/mdg: Use src_types to determine size in scheduling
+- pan/mdg: Use type to determine triviality of a move
+- pan/mdg: Identify scalar integer mods
+- pan/mdg: Promote imov to fmov on a NIR level
+- pan/mdg: Remove promote_float pass
+- pan/mdg: Defer modifier packing until emit time
+- pan/mdg: Remove redundant redundancy
+- pan/mdg: Streamline dest_override handling
+- pan/mdg: Implement b2f16
+- pan/mdg: Don't generate conversions for fp16 LUTs
+- pan/mdg: Ignore dest.type when offseting load swizzle
+- pan/lcra: Remove unused alignment parameters
+- pan/lcra: Allow per-variable bounds to be set
+- pan/mdg: Use type size to determine alignment
+- pan/mdg: Eliminate load_64
+- pan/mdg: Set RA bounds for fp16
+- pan/mdg: Print mask when dest=0
+- pan/mdg: Round up bytemasks when spilling
+- pan/mdg: Print constant vectors less wrong
+- pan/mdg: Factor out mir_adjust_constant
+- pan/mdg: Only combine 16-bit constants to lower half
+- pan/mdg: Separately pack constants to the upper half
+- pan/mdg: Fix type checking issues with compute
+- pan/mdg: Pack barriers correctly
+- pan/mdg: Use shifts instead of division for RA sizes
+- pan/mdg: Implement vector constant printing for 8-bit
+- pan/mdg: Implement condense_writemask for 8-bit
+- pan/mdg: Pack 8-bit swizzles in 16-bit ops
+- panfrost: Guard experimental fp16 behind debug flag
+- panfrost: Keep cached BOs mmap'd
+- panfrost: Remove deadcode
+- panfrost: Fill in SCALED formats to format table
+- panfrost: Don't set PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY
+- panfrost: Don't zero staging buffer for tiling
+- panfrost: Allow bpp24 tiling
+- panfrost: Allow tiling on RECT textures
+- panfrost: Limit blend shader work count
+- panfrost: Remove dated comment about leaks
+- panfrost: Disable tib read/write when colourmask = 0x0
+- panfrost: Avoid redundant shader executions with mask=0x0
+- panfrost: Don't set CAN_DISCARD for MFBD
+- panfrost: Fix transform feedback types
+- pan/mdg: Cleanup comments that look like division
+- pan/mdg: Eliminate expand_writemask division
+- pan/mdg: Eliminate 64-bit swizzle packing division
+- pan/mdg: Avoid division in printing helpers
+- pan/mdg: Eliminate remaining divisions from compiler
+- panfrost: Fix dated comment
+- panfrost: Use _mesa_roundevenf when packing clear colours
+- panfrost: Handle !independent_blend for blend shaders
+- pan/mdg: Add pack_colour_32 opcode
+- pan/mdg: Lower shifts to 32-bit
+- pan/mdg: Ensure we don't DCE into impossible masks
+- pan/mdg: Allow DCE on ld_color_buffer masks
+- panfrost: Add debug print before query flushes
+- panfrost: Only run batch debug when specifically asked
+- nir: Add un/pack_32_4x8 opcodes
+- util: Add SATURATE macro
+- util/format: Use SATURATE
+- mesa: Use SATURATE
+- mesa/swrast: Use SATURATE
+- gallium/draw: Use SATURATE
+- glsl: Use SATURATE
+- panfrost: Use SATURATE
+- softpipe: Use SATURATE
+- intel: Use SATURATE
+- i965: Use SATURATE
+- iris: Use SATURATE
+- etnaviv: Use SATURATE
+- nouveau: Use SATURATE
+- pan/decode: Fix unused variable warning
+- pan/decode: Fix tiler warning
+- pan/decode: Dump missing field on Bifrost
+- pan/decode: Dump unknown2
+- panfrost: Fix Bifrost blending with depth-only FBO
+- panfrost: Adjust null_rt for Bifrost
+- panfrost: Tweak zsbuf magic numbers for Bifrost
+- panfrost: Tweak Bifrost colour buffer magic
+- panfrost: Force Z/S tiling on Bifrost
+- panfrost: Share MRT blend flag calculation with Bifrost
+- panfrost: Set unk2 to accomodate blending
+- panfrost: Identify Bifrost texture format swizzle
+- panfrost: Ensure nonlinear strides are 16-aligned
+- panfrost: Document Midgard Inf/NaN suppress bit
+- panfrost: Add defines for bifrost unk1 flags
+- panfrost: Identify MALI_BIFROST_EARLY_Z flag
+- panfrost: Set MALI_BIFROST_EARLY_Z as necessary
+- pan/decode: Decode Bifrost shader flags
+- pan/bi: Add TEX.vtx opcode for vertex texturing
+- pan/bi: Also add compact vertex texturing
+- pan/bi: Document compute_lod bit for compact tex
+- pan/bi: Allow vertex txl with lod=0 as compact
+- pan/bi: Add f16 TEXC.vtx op
+- pan/bi: Pack compact vertex texturing
+- pan/bi: Add CSEL.16 packing tests
+- pan/bi: Suppress inf/nan for now
+- panfrost: Don't generate gl_FragCoord varying on Bifrost
+- panfrost: Set reads_frag_coord as a sysval
+- panfrost: Preload gl_FragCoord on Bifrost
+- pan/bi: Remove FMA? parameter from get_src
+- pan/bi: Remove comment about old scheduler design
+- pan/bi: Move bi_registers to common IR structures
+- pan/bi: Move bi_registers to bi_bundle
+- pan/bi: Drop `struct` from bi_registers
+- pan/bi: Add FILE* argument to bi_print_registers
+- pan/bi: Move bi_flip_ports out of port assignment
+- pan/bi: Document constant count invariant
+- pan/bi: Disassemble pos=0xe
+- pan/bi: Add MUL.i32 to disasm
+- pan/bi: Remove more artefacts of 2-pass scheduling
+- pan/bi: Add bi_layout.c for clause layout helpers
+- pan/bi: Add helper to measure clause size
+- pan/bi: Remove schedule_barrier
+- pan/bi: Allow printing branches without targets
+- pan/bi: Fix emit_if successor assignment
+- pan/bi: Only rewrite COMBINE dest if not SSA
+- pan/bi: Fix CONVERT component counting
+- pan/bi: Fix branch condition typesize
+- pan/bi: Passthrough ZERO in branch packing
+- pan/bi: Add branch constant field to IR
+- pan/bi: Pack branch offset constants
+- pan/bi: Set branch_constant if there is a branch
+- pan/bi: Assign constant port for branch offsets
+- pan/bi: Preliminary branch packing
+- pan/bi: Link clauses back to their blocks
+- pan/bi: Add bi_foreach_clause_in_block_from{_rev} helpers
+- pan/bi: Measure distance between blocks
+- pan/bi: Pack proper clause offsets
+- pan/bi: Set branch_conditional if b2b is set
+- pan/bi: Set back-to-back bit more accurately
+- pan/bi: Set branch conditional bit
+- pan/bi: Pack unconditional branch
+- pan/bi: Defer block naming until after emit
+- pan/bi: Add bi_foreach_block_from_rev helper
+- pan/bi: Measure backwards branches as well
+- pan/bi: Allow two successors in header packing
+- pan/bi: Passthrough deps of the branch target
+- panfrost: Disable QUAD_STRIP/POLYGON on Bifrost
+- panfrost: Add GPU IDs for G31/G52
+- panfrost: Probe G31/G52 if PAN_MESA_DEBUG=bifrost
+- pan/mdg: Handle un/pack opcodes as moves
+- pan/mdg: Add pack_unorm_4x8 via 8-bit
+- pan/mdg: Treat packs "specially"
+- pan/mdg: Handle bitsize for packs
+- pan/mdg: Print 8-bit constants
+- pan/mdg: Drop the u8 from the colorbuf op names
+- pan/mdg: Implement raw colourbuf loads on T720
+- panfrost: Add theory for new framebuffer lowering
+- panfrost: Determine unpacked type for formats
+- panfrost: Add quirks for blend shader types
+- panfrost: Determine load classes for formats
+- panfrost: Determine classes for stores
+- panfrost: Stub out lowering boilerplate
+- panfrost: Un/pack pure 32-bit
+- panfrost: Un/pack pure 16-bit
+- panfrost: Un/pack pure 8-bit
+- panfrost: Un/pack 8-bit UNORM
+- panfrost: Flesh out dispatch
+- panfrost: Un/pack UNORM 4
+- panfrost: Un/pack RGB565 and RGB5A1
+- panfrost: Un/pack RGB10_A2_UNORM
+- panfrost: Un/pack RGB10_A2_UINT
+- panfrost: Un/pack R11G11B10
+- panfrost: Un/pack sRGB via NIR
+- panfrost: Switch to pan_lower_framebuffer
+- panfrost: Conditionally allow fp16 blending
+- panfrost: Account for differing types in blend lower
+- panfrost: Let Gallium pack colours
+- panfrost: Check for large tilebuffer requirements
+- panfrost: Add separate_stencil BO to batch
+- panfrost: Use internal_format throughout
+- panfrost: Update fails list
+- pan/mdg: Handle 16-bit ld_vary
+- pan/mdg: Fuse f2f16 into load_interpolated_input
+- panfrost: Fix PRESENT flag mix-up
+- panfrost: Permit AFBC of RGB8
+- panfrost: Use VTX tag for vertex texturing
+- panfrost: Don't flush explicitly when mipmapping
+- panfrost: Remove unused nir_lower_framebuffer pass
+- pan/mdg: Disassemble out-of-order bits
+- pan/mdg: Add quirk for missing out-of-order support
+- pan/mdg: Enable out-of-order execution after texture ops
+- nir: Fold f2f16(b2f32(x)) to b2f16(x)
+- pan/mdg: Don't double-replicate blend on T720
+- pan/mdg: Distinguish blend shaders in internal shader-db
+- pan/mdg: Add roundmode enum
+- pan/mdg: Add opcode roundmode property
+- pan/mdg: Lower roundmodes
+- pan/mdg: Implement *_rtz conversions with roundmode
+- pan/mdg: Fold roundmode into applicable instructions
+- pan/mdg: Handle f2u8
+- pan/mdg: Allow f2u8 and friends thru
+- pan/mdg: Handle regular nir_intrinsic_load_output
+- panfrost: Passthrough NATIVE loads/stores
+- pan/bi: Handle SEL with vec3 16-bit
+- pan/bi: Fix SEL.16 swizzle
+- pan/bi: Pack second argument of F32_TO_F16
+- pan/bi: Passthrough second argument of F32_TO_F16
+- pan/bi: Handle vectorized load_const
+- panfrost: Update MALI_EARLY_Z description
+- panfrost: Document MALI_WRITES_GLOBAL bit
+- panfrost: Handle writes_memory correctly
+- panfrost: Readd MIDGARD_SHADERLESS quirk to t760
+- panfrost: Explicitly convert to 32-bit for logic-ops
+- pan/bi: Disassemble gl_PointCoord reads.
+- panfrost: Prefer sysval for gl_PointCoord on Bifrost
+- panfrost: Fix gl_PointSize out of GL_POINTS
+- panfrost: Mark point sprites as todo on Bifrost
+- pan/mdg: Legalize inverts with constants
+- pan/mdg: Ensure ld_vary_16 is aligned
+- panfrost: Ensure we have ro before using it
+- nir: Remove nir_intrinsic_output_u8_as_fp16_pan
+- pan/mdg: Avoid fusing ld_vary_16 with non-zero component
+- panfrost: Calculate varying size by format
+- panfrost: Add panfrost_streamout_offset helper
+- panfrost: Introduce bitfields for tracking varyings
+- panfrost: Determine varying buffer presence
+- panfrost: Emit unlinked varyings
+- panfrost: Emit special varyings
+- panfrost: Emit xfb records
+- panfrost: Add helper to determine if we are capturing
+- panfrost: Add high-level varying emit
+- panfrost: Use new varying linking
+- panfrost: Remove unused routines
+- panfrost: Allow R/RG/RGB varyings
+- panfrost: Only store varying formats
+- panfrost: Use shader_info harder
+- panfrost: Override varying format to minimal precision
+- panfrost: Demote mediump varyings to fp16
+- pan/mdg: Explicitly type 64-bit uniform moves
+- pan/mdg: Analyze types for 64-bitness in RA
+- pan/mdg: Prefer type over regmode for schedule constraints
+- pan/mdg: Precolour blend inputs
+- panfrost: Merge bifrost_bo/midgard_bo
+- panfrost: Update sampler view in Bifrost path
+- panfrost: Fix level_2
+- panfrost: Correctly calculate tiled stride
+- panfrost: Enable AFBC for RGB565
+- panfrost: Simplify AFBC format check
+- pan/mdg: Factor out unit check
+- pan/mdg: Allow scheduling "x + x" to multipliers
+- pan/mdg: Canonicalize (x * 2.0) to (x + x)
+- pan/mdg: Reassociate adds for multiply-by-two
+- nir: Propagate *2*16 conversions into vectors
+- panfrost: Specify stack_shift on SFBD
+- pan/mdg: Defer nir_fuse_io_16 until after opts
+- pan/mdg: Don't assign destination in writeout block to r1
+- pan/mdg: Remove bundle interference code
+- pan/mdg: Schedule writeout to VLUT
+- pan/mdg: Defer smul, vlut until after writeout moves
+- pan/mdg: Allow Z/S writes to use any 2nd stage unit
+- pan/mdg: Prioritize non-moves on VADD/VLUT
+- pan/mdg: Skip r1.w write where possible
+- pan/mdg: Schedule based on liveness
+- pan/mdg: Respect type/mask in mir_lower_special_reads
+- pan/mdg: Fix indirect UBO swizzles
+- pan/decode: Fix MSAA texture decoding
+- pan/decode: Identify layered MSAA flag
+- pan/mdg: Allow ignoring move mode
+- pan/mdg: Handle GLSL_SAMPLER_DIM_MS
+- pan/mdg: Handle nir_tex_src_ms_index
+- pan/mdg: Handle nir_texop_txf_ms
+- pan/mdg: Use _VTX tag for texelFetch in frag shaders
+- panfrost: Set depth to sample_count for MSAA 2D
+- panfrost: Identify layer_stride
+- panfrost: Allocate space for multisampling
+- panfrost: Index texture by sample
+- panfrost: Include pointer for each sample
+- panfrost: Set layer_stride for multisampled rendering
+- panfrost: Don't advertise MSAA 2x
+- panfrost: Identify coverage_mask
+- panfrost: Pass sample_mask to the hardware
+- panfrost: Implement alpha-to-coverage
+- panfrost: Identify depth/stencil layer strides
+- panfrost: Set depth/stencil_layer_stride accordingly
+- panfrost: Enable MSAA if we render to such a surface
+- panfrost: Save sample_mask before blitting
+- panfrost: Expose MSAA 4x
+- glsl: Handle 16-bit types in loop analysis
+- docs/features: Track Panfrost
+- panfrost: Introduce pan_pool struct
+- panfrost: Allocate pool BOs against the pool
+- panfrost: Track the device through the pool
+- panfrost: Expose pool-based allocation API
+- panfrost: Move debug flags into the device
+- panfrost: Drop Gallium-local pan_bo_create wrapper
+- panfrost: Move pool routines to common code
+- panfrost: Factor out scoreboarding state
+- panfrost: Pass polygon_list to tiler init function
+- panfrost: Drop batch from scoreboard routines
+- panfrost: Move scoreboarding routines to common
+- panfrost: Handle PIPE_FORMAT_X24S8_UINT
+- panfrost: Handle PIPE_FORMAT_S8_UINT
+- panfrost: Move panfrost_translate_texture_type
+- panfrost: Report blend shader work count
+- panfrost: Clamp pure int pixels
+- panfrost: Generate shader variants on framebuffer bind
+- panfrost: Always use SOFTWARE for pure formats
+- panfrost: Extend fetched framebuffer results
+- panfrost: Fix fence leak
+- panfrost: Fix write to free'd memory
+- panfrost: Add a sparse array to map GEM handles to BOs
+- panfrost: Index BOs from the BO map sparse array
+- panfrost: Merge PAN_BO_IMPORTED/PAN_BO_EXPORTED
+- panfrost: Remove PAN_BO_COHERENT_LOCAL
+- panfrost: Remove PAN_BO_DONT_REUSE
+- panfrost: Remove panfrost_bo_access type
+- panfrost: Compact unused BO flag bits
+- panfrost: Add format codes for new compressed textures
+- panfrost: Pipe in compressed texture feature mask
+- panfrost: Filter compressed texture formats
+- panfrost: Map PIPE_{DXT, RGTC, BPTC} to MALI_BCn
+- docs/features: Update ASTC entries for Panfrost
+- pan/mdg: Bump compiler RT maximum
+- pan/mdg: Identify per-sample interpolation mode
+- pan/mdg: Implement gl_SampleID
+- panfrost: Force Z/S writeback
+- panfrost: Expose panfrost_get_blend_shader
+- panfrost: Add MALI_PER_SAMPLE bit
+- panfrost: Include sample count in payload estimates
+- panfrost: Identify zs_samples field
+- panfrost: Add rectangle subtraction algorithm
+- panfrost: Handle per-sample shading
+- panfrost: Set zs_samples as necessary
+- panfrost: Track surfaces drawn per-batch
+- panfrost: Extract panfrost_batch_reserve_framebuffer
+- panfrost: Use Midgard-specific reloads
+- panfrost: Call util_blitter_save_fragment_constant_buffer_slot
+- panfrost: Overhaul tilebuffer allocations
+- panfrost: Set PIPE_CAP_MIXED_COLORBUFFER_FORMATS
+- panfrost: Fix sRGB clear colour packing
+- panfrost: Implement Z32F_S8 blits
+- panfrost: Abort on unsupported blit
+- panfrost: Avoid integer underflow in rt_count_1
+- panfrost: Honour cso->compare_mode
+- panfrost: Fix faults with RASTERIZER_DISCARD
+- panfrost: Report CAPs more honestly
+- panfrost: Enable Chromium
+- panfrost: Revert "Disable frame throttling"
+- docs/features: Mark trivial missed feature
+- panfrost: Enable FP16 by default
+- panfrost: Avoid wait=true flushing all batches
+- panfrost: Remove wait parameter to flush_all_batches
+- panfrost: Skip specifying in_syncs
+- panfrost: Allocate syncobjs in panfrost_flush
+- panfrost: Remove unused batch_fence->signaled
+- panfrost: Remove unused batch_fence->ctx
+- pan/bit: Update f32->f16 convert test
+- pan/bit: Remove BI_SHIFT stub
+- pan/mdg: Mask spills from texture write
+- pan/mdg: Test for SSA before chasing addresses
+- docs/features: Add GL_EXT_multisampled_render_to_texture
+- panfrost: Add MSAA mode selection field
+- panfrost: Implement EXT_multisampled_render_to_texture
+- panfrost: Set STRIDE_4BYTE_ALIGNED_ONLY
+- panfrost: Fix WRITES_GLOBAL bit
+- pan/mdg: Ensure barrier op is set on texture
+- panfrost: Fix blend leak for render targets 5-8
+- panfrost: Free cloned NIR shader
+- panfrost: Free NIR of blit shaders
+- panfrost: Free hash_to_temp map
+- pan/mdg: Free previous liveness
+- panfrost: Use memctx for sysvals
+- panfrost: Free batch->dependencies
+- pan/mdg: Fix discard encoding
+- pan/mdg: Fix perspective combination
+- pan/bit: Set d3d=true for CMP tests
+
+Andreas Baierl (1):
+
+- nir/ lower_int_to_float: Handle umax and umin
+
+Andres Gomez (10):
+
+- .mailmap:
add an alias for Iago Toral Quiroga +- .mailmap: add an alias for Andres Gomez +- gitlab-ci: update tracie README after changes in main script +- scripts: remove unittest.mock dependency when not used +- gitlab-ci: create always the "results" directory with tracie +- gitlab-ci: correct tracie behavior with replay errors +- gitlab-ci: build gfxreconstruct from the "dev" branch +- gitlab-ci: get the last frame from a gfxr trace using gfxrecon-info +- gitlab-ci/traces: updated paths and checksums for POLARIS10 traces +- gitlab-ci: Test AMD's Raven with traces + +Andrey Vostrikov (1): + +- egl/x11: Free memory allocated for reply structures on error + +Andrii Simiklit (3): + +- glsl_type: don't serialize padding bytes from glsl_struct_field +- i965/vec4: Ignore swizzle of VGRF for use by var_range_end() +- glsl: fix crash on glsl macro redefinition + +Ani (1): + +- drirc: Enable glthread for rpcs3 + +Anuj Phogat (6): + +- intel/devinfo: Add is_dg1 to device info +- intel/l3: Add DG1 L3 configuration +- intel/ehl: Use GEN11_URB_MIN_MAX_ENTRIES in device info +- intel/ehl: Use macro GEN11_LP_FEATURES in device info +- intel/ehl: Rename gen_device_info struct +- intel/ehl: Add new PCI-IDs + +Arcady Goldmints-Orlov (4): + +- anv: increase minUniformBufferOffsetAlignment to 64 +- intel/compiler: fix alignment assert in nir_emit_intrinsic +- nir/spirv/glsl450: increase asin(x) precision +- intel/compiler: Always apply sample mask on Vulkan. 
+ +Axel Davy (19): + +- st/nine: Set correctly blend max_rt +- gallium/util: Fix leak in the live shader cache +- ttn: Add new allow_disk_cache parameter +- ttn: Implement disk cache +- st/nine: Enable ttn cache +- radeonsi: Enable tgsi to nir disk cache +- st/nine: Add checks for pure device +- st/nine: Return error when setting invalid depth buffer +- st/nine: Do not return invalidcall on getrenderstate +- st/nine: Pass more adapter formats for CheckDepthStencilMatch +- st/nine: Improve return error code in CheckDeviceFormat +- st/nine: Fix uninitialized variable in BEM() +- st/nine: Fix a crash if the state is not initialized +- st/nine: Add missing NULL checks +- st/nine: Increase available GPU memory +- st/nine: Retry allocations after freeing some space +- st/nine: Improve pDestRect handling +- st/nine: Ignore pDirtyRegion +- st/nine: Handle full pSourceRect better + +Bas Nieuwenhuizen (80): + +- radv: Fix implicit sync with recent allocation changes. +- radv: Extend tiling flags to 64-bit. +- radv: Provide a better error for permission issues with priorities. +- radv: Support VK_PIPELINE_COMPILE_REQUIRED_EXT. +- radv: Support VK_PIPELINE_CREATE_EARLY_RETURN_ON_FAILURE_BIT_EXT. +- radv: Support VK_PIPELINE_CACHE_CREATE_EXTERNALLY_SYNCHRONIZED_BIT_EXT. +- radv: Expose VK_EXT_pipeline_creation_cache_control. +- radv/winsys: Finish mapping for sparse residency. +- radv/winsys: Remove extra sizeof multiply. +- radv: Handle failing to create .cache dir. +- radv: Remove dead code. +- radv: Do not close fd -1 when NULL-winsys creation fails. +- radv: Implement vkGetSwapchainGrallocUsage2ANDROID. +- frontend/dri: Implement mapping individual planes. +- util/format: Add VK_FORMAT_D16_UNORM_S8_UINT. +- util/format: Use correct pipe format for VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM. +- util/format: Add more multi-planar formats. +- gallium/dri: Remove lowered_yuv tracking for plane mapping. +- radeonsi: Explicitly map Z16_UNORM_S8_UINT to None for GFX10. 
+- amd/common,radeonsi: Move gfx10_format_table to common. +- radeonsi: Define gfx10_format in the common header. +- radv: Include gfx10_format_table.h only from a single source file. +- radv: Use common gfx10_format_table.h +- radv: Use ac_surface to determine fmask enable. +- radv: Pass no_metadata_planes info in to ac_surface. +- radv: Enforce the contiguous memory for DCC layers in ac_surface. +- radv: Rely on ac_surface for avoiding cmask for linear images. +- radv: Use offsets in surface struct. +- radv: Disable DCC in ac_surface. +- radv: Disable HTILE in ac_surface. +- radv: Allocate values/predicates at the end of the image. +- amd/common: Add total alignment calculation. +- radv: Use ac_surface to allocate aux surfaces. +- vulkan/wsi/x11: Ensure we create at least minImageCount images. +- radv/winsys: Deal with realloc failures in BO lists. +- radv: Handle mmap failures. +- radv/winsys: Distinguish device/host memory errors. +- radv: Make radv_alloc_shader_memory static. +- turnip: semaphore support. +- meson: Do not require shader cache for radv. +- amd/addrlib: fix another C++ one definition rule violation +- radv: Set handle types in Android semaphore/fence import. +- radv: Always enable PERFECT_ZPASS_COUNTS. +- Revert "radv: add support for MRTs compaction to avoid holes" +- radv: Use correct semaphore handle type for Android import. +- amd/llvm: Mark pointer function arguments as 32-byte aligned. +- amd/common: Cache intra-tile addresses for retile map. +- amd/addrlib: Clean up unused colorFlags argument +- amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10 +- radeonsi: Inhibit clock-gating for perf counters. +- meson: Add mising git_sha1.h dependency. +- amd: Add detection of timeline semaphore support. +- radv/winsys: Add binary syncobj ABI changes for timeline semaphores. +- radv: Add thread for timeline syncobj submission. +- radv: Add winsys support for submitting timeline syncobj. +- radv: Add winsys functions for timeline syncobj. 
+- radv: Add timeline syncobj for timeline semaphores. +- radv: Fix uninitialized variable in renderpass. +- vulkan/wsi/x11: report device-group present rectangles with prime. +- vulkan/wsi: Convert usage of -1 to UINT32_MAX. +- radv: Fix host->host signalling with legacy timeline semaphores. +- mesa/st: Actually free the driver part of memory objects on destruction. +- radv: Don't use both DCC and CMASK for single sample images. +- radv: Fix assert that is too strict. +- radv: Do not consider layouts fast-clearable on compute queue. +- radv: When importing an image, redo the layout based on the metadata. +- radv: Use getter instead of setter to extract value. +- driconf: Support selection by Vulkan applicationName. +- radv: Override the uniform buffer offset alignment for World War Z. +- radv: Fix handling of attribs 16-31. +- radv: Remove conformance warnings with ACO. +- radv: Update CTS version. +- radv: Fix 3d blits. +- radv: Fix threading issue with submission refcounts. +- radv: Avoid deadlock on bo_list. +- spirv: Deal with glslang not setting NonUniform on constructors. +- radeonsi: Work around Wasteland 2 bug. +- spirv: Deal with glslang bug not setting the decoration for stores. +- ac/surface: Fix depth import on GFX6-GFX8. +- st/mesa: Deal with empty textures/buffers in semaphore wait/signal. 
+ +Ben Skeggs (38): + +- nir: use bitfield_insert instead of bfi in nir_lower_double_ops +- nvir: bump max encoding size of instructions +- nvir: introduce OP_LOP3_LUT +- nvir: introduce OP_WARPSYNC +- nvir: introduce OP_BREV with lowering to EXTBF_REV for current GPUs +- nvir: introduce OP_SHF +- nvir: introduce OP_BMSK +- nvir: introduce OP_SGXT +- nvir: introduce OP_FINAL +- nvir: add constant folding for OP_PERMT +- nvir: run replaceZero() before replaceCvt() +- nvir/nir: fix fragment program output when using MRT +- nvir/nir: move nir options to codegen +- nvir/nir: flesh out options +- nvir/nir: turn on lower_rotate +- nvir/nir: implement nir_op_extract_u8 +- nvir/nir: implement nir_op_extract_i8 +- nvir/nir: implement nir_op_extract_u16 +- nvir/nir: implement nir_op_extract_i16 +- nvir/nir: implement nir_op_urol +- nvir/nir: implement nir_op_uror +- nvir/nir: nir expects the shift amount to wrap, rather than clamp +- nvir/nir: use nir_lower_idiv +- nvir/gm107: implement OP_PERMT +- nvir/gm107: replace SHR+AND+AND with PRMT+PRMT in PFETCH lowering +- nvir/gm107: separate out header for sched data calculator +- nvir/nir/gm107: split nir shader compiler options from gf100 +- nvir/nir/gm107: turn on nir_lower_extract64 +- nvir/nir/gm107: switch off lower_extract_byte +- nvir/nir/gm107: switch off lower_extract_word +- nvir/gv100: initial support +- nvir/gv100: enable support for tu1xx +- nvc0: use NVIDIA headers for GK104->GM2xx compute QMD +- nvc0: use NVIDIA headers for GP100- compute QMD +- nvc0: move setting of entrypoint for a shader stage to a function +- nvc0: remove hardcoded blitter vertprog +- nvc0: initial support for gv100 +- nvc0: initial support for tu1xx + +Benjamin Cheng (1): + +- drirc: Add picom to adaptive_sync exclusion list + +Benjamin Tissoires (3): + +- CI: reduce bandwidth for git pull +- gitlab-ci: update ci-fairy minio to latest upstream +- gitlab-ci: do not run full CI on scheduled pipelines + +Blaž Tomažič (1): + +- radeonsi: Fix 
omitted flush when moving suballocated texture + +Boris Brezillon (14): + +- spirv: Split the vtn_emit_scoped_memory_barrier() logic +- nir: Replace the scoped_memory barrier by a scoped_barrier +- intel/compiler: Extract control barriers from scoped barriers +- spirv: Use scoped barriers for SpvOpControlBarrier +- nir: Add new rules to optimize NOOP pack/unpack pairs +- nir: Use a switch in build_deref_offset()/deref_instr_get_const_offset() +- nir: Allow casts in nir_deref_instr_get[_const]_offset() +- freedreno: Initialize lower_int64_options to a proper value +- nir: Stop passing an options arg to nir_lower_int64() +- nir: Extend nir_lower_int64() to support i2f/f2i lowering +- intel: Set int64_options to ~0 when lowering 64b ops +- nir: Get rid of __[u]int64_to_fp32() and __fp32_to_[u]int64() +- nir: Fix i64tof32 lowering +- spirv: Add a vtn_get_mem_operands() helper + +Boyuan Zhang (2): + +- radeon/vcn/enc: Re-write PPS encoding for HEVC +- radeon/vcn: bump vcn3.0 encode major version to 1 + +Brian Ho (14): + +- turnip: Execute ir3_nir_lower_gs pass again +- turnip: Fill out VkPhysicalDeviceSubgroupProperties +- nir: Support sysval tess levels in SPIR-V to NIR +- nir: Add an option for lowering TessLevelInner/Outer to vecs +- turnip: Lower shaders for tessellation +- turnip: Offset by component when lowering gl_TessLevel* +- turnip: Parse tess state and support PATCH primtype +- turnip: Allocate tess BOs as a function of draw size +- turnip: Update VFD_CONTROL with tess system values +- turnip: Emit HS/DS user consts as draw states +- turnip: Support tess for draws +- turnip: Force sysmem for tessellation +- ir3: Unconditionally enable MERGEDREGS on a6xx +- turnip: Enable tessellationShader physical device feature + +Caio Marcelo de Oliveira Filho (32): + +- intel/dev: Bail when INTEL_DEVID_OVERRIDE is not valid +- intel/fs: Clean up variable group size handling in backend +- intel/fs: Add an option to lower variable group size in backend +- intel/fs: Add and 
use a new load_simd_width_intel intrinsic +- intel: Let drivers call brw_nir_lower_cs_intrinsics() +- iris: Implement ARB_compute_variable_group_size +- util/list: Add list_foreach_entry_from_safe +- nir: Use deref intrinsics to set writes_memory when gathering info +- intel/fs: Use writes_memory from shader_info +- nir: Consider atomic counter intrinsics when setting writes_memory +- intel/fs: Remove unused emission of load_simd_with_intel +- intel/fs: Remove unused state from brw_nir_lower_cs_intrinsics +- intel/fs: Early return when can't satisfy explicit group size +- intel/fs: Remove redundant assert() +- intel/fs: Remove min_dispatch_width spilling decision from RA +- intel/fs: Support INTEL_DEBUG=no8,no32 in compute shaders +- intel/fs: Add helper to get prog_offset and simd_size +- i965: Use new helper functions to pick SIMD variant for CS +- iris: Set CS KernelStatePointer at dispatch +- iris: Use new helper functions to pick SIMD variant for CS +- anv: Use new helper functions to pick SIMD variant for CS +- intel/fs: Generate multiple CS SIMD variants for variable group size +- iris, i965: Drop max_variable_local_size +- iris, i965: Update limits for ARB_compute_variable_group_size +- intel: Add helper to calculate GPGPU_WALKER::RightExecutionMask +- nir: Fix printing execution scope of a scoped barrier +- spirv: Memory semantics is optional for OpControlBarrier +- intel/fs: Add Fall-through comment +- nir: Fix logic that ends combine barrier sequence +- spirv: Handle most execution modes earlier +- nir: Filter modes of scoped memory barrier in nir_opt_load_store_vectorize +- spirv: Propagate explicit layout only in types that need it + +Charmaine Lee (1): + +- llvmpipe: do not enable tessellation shader without llvm coroutines support + +Chris Forbes (12): + +- bifrost: Set RTZ rounding mode for f2i conversion +- bifrost: Lower x->bool conversions to != 0 +- bifrost: Emit "d3d" variant of comparison instructions +- bifrost: Document d3d/gl comparison 
control bit +- bifrost: Add lowering for b2i32 +- bifrost: Add support for nir_op_inot +- bifrost: Add support for nir_op_ishl +- bifrost: Add support for nir_op_uge +- bifrost: Add support for nir_op_imul +- bifrost: Add support for nir_op_iabs +- bifrost: Honor src swizzle in special math ops +- bifrost: Fix packing of ADD_FEXP2_FAST + +Chris Wilson (6): + +- iris: Place a seqno at the end of every batch +- iris: Convert fences to using lightweight seqno +- iris: Store a seqno for each batch in the fence +- iris: Initialise stub iris_seqno to 0 +- iris: Rename iris_seqno to iris_fine_fence +- iris: Fixup copy'n'paste mistake in Makefile.sources + +Christian Gmeiner (31): + +- etnaviv: fix SAMP_ANISOTROPY register value +- etnaviv: do not use int filter when anisotropic filtering is used +- ci: bare-metal: make it possible to use a script for serial +- ci: extend expect-output.sh +- ci: add U-Boot specific fetch strings +- etnaviv: drop translate_blend(..) +- ci: add arm_test-base docker image +- ci: use separate docker images for baremetal builds +- ci: fix possible spuriously run of jobs +- etnaviv: delete not used struct +- etnaviv: convert enums +- etnaviv: move etna_lower_io(..) to etnaviv_nir.c +- etnaviv: get rid of etna_compile dependency +- etnaviv: move etna_lower_alu(..) to etnaviv_nir.c +- etnaviv: drop OPT_V define +- etnaviv: make more use of compile_error(..) 
+- etnaviv: move liveness related stuff into own file +- etnaviv: merge struct etna_compile and etna_state +- etnaviv: drop emit macro +- etnaviv: move functions that generate asm to own file +- etnaviv: move nir compiler related stuff into .c file +- etnaviv: move ra into own file +- etnaviv: replace prims-emitted query +- ci: bare-metal: use nginx to get results from DUT +- etnaviv: explicitly set nir_variable_mode +- etnaviv: introduce struct etna_compiler +- etnaviv: move shader_count to etna_compiler +- etnaviv: do register setup only once +- etnaviv: fix nir validation problem +- etnaviv: call nir_lower_bool_to_bitsize +- etnaviv: completely turn off MSAA + +Christopher Egert (2): + +- radv: use util_float_to_half_rtz +- r600: Use TRUNC_COORD on samplers + +Clément Guérin (1): + +- radv: Always expose non-visible local memory type on dedicated GPUs + +Con Kolivas (1): + +- Linux: Change minimum priority threads from SCHED_IDLE to nice 19 SCHED_BATCH. + +Connor Abbott (88): + +- tu: Support pipelines without a fragment shader +- tu: Add a "scratch bo" allocation mechanism +- tu: Add noubwc debug flag to disable UBWC +- tu: Implement fallback linear staging blit for CopyImage +- freedreno/a6xx: Document dual-src blending enable bits +- ir3: Fixup dual-source blending slot +- tu: Move RENDER_COMPONENTS setting to pipeline state +- tu: Implement dual-src blending +- tu: Advertise COLOR_ATTACHMENT_BLEND_BIT for blendable formats +- tu: Always initialize image_view fields for blit sources +- tu: Fall back to 3d blit path for BC1_RGB_* formats +- tu: Fix buffer compressed pitch calculation with unaligned sizes +- tu: Support VK_FORMAT_FEATURE_BLIT_SRC_BIT for texture-only formats +- tu: Fix IBO descriptor for cubes +- tu: Respect VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT +- tu: Add missing storage image/texel buffer bits +- tu: Remove useless post-binning flushes +- tu: Don't actually track seqno's for events +- tu: Remove useless event_write helpers +- tu: Rewrite 
flushing to use barriers +- tu: Fix context faults loading unused descriptor sets +- ir3: Pass reserved_user_consts to ir3_shader_from_nir() +- tu: Remove num_samp hack +- tu: Use the ir3 shader API +- tu: Remove tu_shader_compile_options +- tu: Set num_components to 0 when building bindless intrinsics +- ir3: Don't calculate num_samp ourselves +- tu: Actually remove dead variables after io lowering +- ir3: Split out variant-specific lowering and optimizations +- ir3, freedreno: Round up constlen earlier +- ir3: Include ir3_compiler from ir3_shader +- ir3: Support variants with different constlen's +- ir3: Add ir3_trim_constlen() +- tu: Share constlen between different stages properly +- freedreno: Refactor ir3_cache shader compilation +- freedreno: Share constlen between different stages properly +- freedreno: On a5xx+ INDX_SIZE is MAX_INDICES +- freedreno/registers: Label firstIndex field in CP_DRAW_INDX_OFFSET +- tu: Pass firstIndex directly to CP_DRAW_INDX_OFFSET +- freedreno/a6xx: use firstIndex field +- nir: Refactor load/store intrinsic helper +- nir: add vec2_index_32bit_offset address format +- tu: Rewrite variable lowering +- tu: Enable KHR_variable_pointers +- ir3: Add layer_zero variant bit +- tu: Force gl_Layer to 0 when necessary +- freedreno/a6xx: Force gl_Layer to 0 when necessary +- freedreno: Include adreno_pm4.xml.h before adreno_a6xx.xml.h +- freedreno: Sync registers with envytools +- freedreno/a6xx: Rename and document HLSQ_UPDATE_CNTL +- freedreno/a6xx: Add some documentation for shared consts +- tu: Don't invalidate irrelevant state when changing pipeline +- freedreno/a6xx: Add stencilref register info +- ir3: Handle gl_FragStencilRefARB +- tu: Enable VK_EXT_shader_stencil_export +- freedreno: Add a helper for computing guardband sizes +- tu: Use common guardband helper +- freedreno: Use common guardband helper +- freedreno/ir3: Fix SSBO size for bindless SSBO's +- tu: Enable VK_EXT_depth_clip_enable +- freedreno: Clean up 
CP_DRAW_MULTI_INDIRECT definition +- freedreno: Add INDIRECT_COUNT CP_DRAW_INDIRECT_MULTI variants +- tu: Integrate WFI/WAIT_FOR_ME/WAIT_MEM_WRITES with cache tracking +- tu: Add missing wfi to tu6_emit_hw() +- tu: Implement VK_KHR_draw_indirect_count +- tu: Fix empty blit scissor case +- tu: Fix hangs for DS with no output +- tu: Detect invalid-for-binning renderpass dependencies +- tu: Enable vertex & fragment stores & atomics +- tu: Fix descriptor update templates with input attachments +- ir3: Validate bindless samp_tex correctly +- ir3: Remove redundant samp_tex validation +- ir3: Fix incorrect src flags for samp_tex +- tu: Enable resource dynamic indexing +- freedreno/rnn: Return success when parsing addvariant +- tu: Dump CP_DRAW_INDIRECT_MULTI draw BO's +- freedreno/rnn: Support stripes in rnndec_decodereg +- freedreno/cffdec: Handle CP_DRAW_INDIRECT_MULTI like other draws +- freedreno: Add trace for CP_DRAW_INDIRECT_MULTI +- freedreno/a6xx: Fix CP_BIN_SIZE_ADDRESS name +- freedreno/rnn: Make rnn_decode_enum() respect variants +- freedreno/cffdec: Stop open-coding enum parsing +- freedreno/afuc: Add missing rnn_prepdb() +- freedreno/afuc: Fix PM4 enum parsing +- tu: Fix DST_INCOHERENT_FLUSH copy/paste error +- freedreno: Document draw predication packets +- tu: Reset has_tess after renderpass +- tu: Implement VK_EXT_conditional_rendering + +D Scott Phillips (4): + +- intel/fs: Update location of Render Target Array Index for gen12 +- anv,iris: Fix input vertex max for tcs on gen12 +- intel/dump_gpu: Fix name of LD_PRELOAD in env append logic +- anv/gen11+: Disable object level preemption + +Daniel Schürmann (54): + +- aco: either copy-propagate or inline create_vector operands +- aco: coalesce parallelcopies during register allocation +- nir: add nir_intrinsic_elect to divergence analysis +- nir: refactor divergence analysis state +- nir: rework phi handling in divergence analysis +- nir: simplify phi handling in divergence analysis +- nir: reset ssa-defs 
as non-divergent during divergence analysis instead of upfront +- aco: fix WQM coalescing +- aco: restrict copying of create_vector operands to GFX9+ +- aco: don't move create_vector subdword operands to unsupported register offsets +- aco: fix corner case in register allocation +- aco: don't allow unaligned subdword accesses on GFX6/7 +- aco: fix register assignment for p_create_vector on GFX6/7 +- aco: simplify statistics collection for copies +- aco: use full-register instructions to implement subdword packing on GFX6/7 +- aco: Workarounds subdword lowering on GFX6/7 +- aco: adjust GFX6 subdword lowering workarounds for 8bit +- aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7 +- aco: coalesce copies more aggressively when lowering to hw +- aco: skip partial copies on first iteration when lowering to hw +- aco: optimize packing of 16bit subdword registers on GFX6/7 +- aco: remove unnecessary split- and create_vector instructions for subdword loads +- aco: fix shared subdword loads +- aco: reorder calls to aco_validate() and cleanup aco_compile_shader() +- aco: don't allow SGPRs on logical phis +- aco: fix WQM handling in nested loops +- radv/aco: implement logic64 instead of lowering +- aco: align swap operations to 4 bytes on GFX6/7 +- aco: don't allow partial copies on GFX6/7 +- radv: introduce RADV_DEBUG=llvm option +- radv: change use_aco -> use_llvm +- radv: enable ACO by default +- aco: fix partial copies on GFX6/7 +- aco: remove superflous (bool & exec) if the result comes from VOPC +- nir: also move vecN in case of nir_move_copies +- nir: refactor nir_can_move_instr +- nir/algebraic: optimize bcsel(a, 0, 1) to b2i +- nir: also move b2i in case of nir_move_copies +- nir/algebraic: optimize iand/ior of (n)eq zero +- nir/algebraic: add optimizations for fsign/isign +- nir/algebraic: add some more unop + bcsel optimizations +- nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x) +- nir/algebraic: optimize (a < 0.0) 
? -a : a -> fabs(a) +- nir/algebraic: add distributive rules for ior/iand +- nir/algebraic: propagate b2i out of ior/iand +- nir/algebraic: fold some nested bcsel +- aco: fix scratch loads which cross element_size boundaries +- aco: ensure to not extract more components than have been fetched +- aco: don't split store data if it was already split into more elements +- aco: prevent infinite recursion in RA for subdword variables +- aco: ensure readfirstlane subdword operands are always dword aligned +- radv: call radv_nir_lower_ycbcr_textures after first optimizations +- aco: add GFX6/7 subdword lowering tests +- aco: execute branch instructions in WQM if necessary + +Daniel Stone (13): + +- CI: Disable Panfrost T7x0 jobs +- CI: Re-enable Panfrost T7x0 jobs +- llvmpipe: Expect increased exp precision on Windows +- CI: Windows: Build LLVM and llvmpipe +- CI: Disable Panfrost T720/T760 +- Revert "CI: Disable Panfrost T720/T760" +- CI: Enable assertions on Windows +- CI: Try shared libraries on Windows +- CI: Correct build-directory path on Windows, and keep it +- CI: Re-enable the Windows VS2019 build job +- CI: Temporarily disable Panfrost T860 jobs +- CI: Re-enable Panfrost T860 jobs +- CI: Disable Windows build due to unstable infrastructure + +Danylo Piliaiev (25): + +- glsl: rename has_implicit_uint_to_int_conversion to *_int_to_uint_* +- i965: Fix out-of-bounds access to brw_stage_state::surf_offset +- anv: Translate relative timeout to absolute when calling anv_timelines_wait +- anv: Fix deadlock in anv_timelines_wait +- meson: Disable GCC's dead store elimination for memory zeroing custom new +- mesa: Fix double-lock of Shared->FrameBuffers and usage of wrong mutex +- st/mesa: Clear texture's views when texture is removed from Shared->TexObjects +- intel/fs: Work around dual-source blending hangs in combination with SIMD16 +- glsl: Don't replace lrp pattern with lrp if arguments are not floats +- glsl: inline functions with unsupported return type before 
converting to nir +- i965: Work around incorrect usage of glDrawRangeElements in UE4 +- st/mesa: account for "loose", per-mipmap level textures in CopyImageSubData +- iris: Honor scanout requirement from DRI +- iris: Fix fast-clearing of depth via glClearTex(Sub)Image +- nir/opt_if: Fix opt_if_simplification when else branch has jump +- nir/tests: Add tests for opt_if_simplification +- st/mesa: Treat vertex outputs absent in outputMapping as zero in mesa_to_tgsi +- anv/nir: Unify inputs_read/outputs_written between geometry stages +- spirv: Only require bare types to match when copying variables +- glsl: Eliminate out-of-bounds triop_vector_insert +- intel/compiler: Fix pointer arithmetic when reading shader assembly +- glsl: Eliminate assigments to out-of-bounds elements of vector +- nir/lower_io: Eliminate oob writes and return zero for oob reads +- nir/large_constants: Eliminate out-of-bounds writes to large constants +- nir/lower_samplers: Clamp out-of-bounds access to array of samplers + +Daryl W. Grunau (1): + +- prevent multiply defined symbols + +Dave Airlie (199): + +- i965: add support for gen 5 pipelined pointers to dump +- i965: disable shadow batches when batch debugging. +- draw/tess: free tessellation control shader i/o memory. +- llvmpipo/nir: free compute shader NIR +- llvmpipe: simple texture barrier implementation. +- gallivm/sample: add multisample support for texel fetch +- gallivm/sample: add multisample image operation support +- gallivm/nir/tgsi: add multisample texture sampling. +- gallivm/nir: add multisample support to image size +- gallivm/nir: add multisample image operations +- draw: introduce sampler num samples + stride members +- draw: add support for num_samples + sample_stride to the image paths +- llvmpipe: add num_samples/sample_stride support to jit textures +- llvmpipe: add samples support to image jit +- util: add a resource wrapper to get resource samples +- llvmpipe: add multisample support to texture allocator. 
+- llvmpipe: add a max samples define set to 4. +- gallium/util: split out zstencil clearing code. +- llvmpipe: fix race between draw and setting fragment shader. +- llvmpipe: add get_sample_position support (v2) +- llvmpipe/jit: pass fragment sample mask via jit context. +- llvmpipe: pass incoming sample_mask into fragment shader context. +- llvmpipe: add internal multisample texture mapping path. +- llvmpipe: add multisample resource copy region support. +- llvmpipe: add clear texture support for multisample textures. +- llvmpipe: handle multisample render target clears +- draw: disable point/line smoothing for multisample (v2) +- llvmpipe: pass color and depth sample strides into fragment shader. +- llvmpipe: record sample info for color/depth buffers in scene +- llvmpipe/rast: fix tile clearing for multisample color and depth tiles +- llvmpipe: plumb multisample state bit into setup code. +- llvmpipe: add multisample bit to fragment shader key. +- llvmpipe: change mask input to fragment shader to 64-bit. +- llvmpipe: add cbuf/zsbuf + coverage samples to the fragment shader key. +- gallivm: add sample id/pos intrinsic support +- gallivm: add mask api to force mask +- nir/tgsi: translate the interp location +- llvmpipe: pass interp location into interpolation code. +- llvmpipe: add centroid interpolation support. +- llvmpipe: add per-sample interpolation. +- llvmpipe: move getting mask value out of depth code. (v2) +- llvmpipe: add per-sample depth/stencil test +- llvmpipe: move some fs code around +- llvmpipe: multisample sample mask + early/late depth pass +- llvmpipe: handle multisample early depth test/late depth write +- llvmpipe: interpolate Z at sample points for early depth test. +- llvmpipe: handle multisample color stores. +- llvmpipe: hook up sample position system value +- llvmpipe: add multisample alpha to coverage support. +- llvmpipe: add multisample alpha to one support +- llvmpipe: handle gl_SampleMask writing. 
+- llvmpipe: don't allow branch to end for early Z with multisample +- llvmpipe: pass mask store into interp for centroid interpolation +- llvmpipe: move color storing earlier in frag shader +- llvmpipe: fix multisample occlusion queries. +- llvmpipe: disable opaque variant for multisample +- llvmpipe: add new rast api to pass full 64-bit mask. +- llvmpipe: add fixed point sample positions to scene. +- llvmpipe: build 64-bit coverage mask in rasterizer +- llvmpipe: fixup multisample coverage masks for covered tiles +- llvmpipe: generate multisample triangle rasterizer functions (v2) +- llvmpipe: choose multisample rasterizer functions per triangle (v2) +- llvmpipe: choose correct position for multisample +- llvmpipe: don't choose pixel centers for multisample +- drisw: add multisample support to sw dri layer. +- llvmpipe: enable 4x sample MSAA + texture multisample +- gallivm/sample: add num samples query for txqs (v2) +- gallivm/nir: hooks up texture samples queries +- llvmpipe: enable GL_ARB_shader_texture_image_samples +- llvmpipe: add min samples support to the fragment shader. +- llvmpipe: enable ARB_sample_shading +- llvmpipe: make sample position a global array. +- zink: enable conditional rendering if available +- r600: enable TEXCOORD semantic for TGSI. +- r600/sfn: plumb the chip class into the instruction emission +- r600/sfn: fix cayman float instruction emission. +- r600/sfn: cayman fix int trans op2 +- r600/sfn: add callstack non-evergreen support +- r600/sfn: add emit if start cayman support +- llvmpipe: don't use sample mask with 0 samples +- llvmpipe: use per-sample position not sample id for interp +- llvmpipe/interp: fix interpolating frag pos for sample shading +- llvmpipe: remove non-simple interpolation paths. +- gallivm/nir: add an interpolation interface. 
+- llvmpipe/interp: refactor out use of pixel center offset +- llvmpipe/interp: refactor out centroid calculations +- llvmpipe: add interp instruction support +- llvmpipe/fs: hook up the interpolation APIs. +- gallivm/nir: add sample_mask_in support +- llvmpipe: add gl_SampleMaskIn support. +- r600/sfn: fix nop channel assignment. +- llvmpipe: compute shaders work better with all the threads. +- llvmpipe: move coroutines out of noopt case +- ci: bump virglrenderer to latest version +- util/disk_cache: add fallback for disk_cache_get_function_identifier +- llvmpipe/cs: overhaul cs variant key state. +- llvmpipe/draw: drop variant number from function names. +- gallivm: rework coroutine malloc/free callouts. +- gallivm: rework debug printf hook to use global mapping. +- gallivm: add support for a cache object +- gallivm: skip operations if we have a cached object. +- gallivm: add cache interface to mcjit +- llvmpipe: add infrastructure for disk cache support +- gallivm: don't cache shaders that use fetch functions. +- llvmpipe/fs: add caching support +- llvmpipe/cs: add shader caching +- draw: add disk cache callbacks for draw shaders +- llvmpipe: hook draw disk cache up +- draw: add disk caching for draw shaders +- draw/gs: fix emitting inactive primitives crash +- draw/gs: add more info to debugging. +- gallivm/nir: add group barrier support +- llvmpipe: fix subpixel bits reporting. +- gallivm/format: convert unsigned values to float properly. +- gallivm/conv: enable conversion min code. (v2) +- gallivm/sample: fix texel type for stencil 8-bit +- llvmpipe/setup: add planes for draw regions if no scissor. +- gallivm/cache: don't require a null terminator for cache data. +- mesa/gles3: add support for GL_EXT_shader_group_vote +- virgl: change vendor id to reflect reality more. +- llvmpipe: change vendor to be more generic. +- softpipe: change vendor name to something more generic. 
+- gallivm/nir: fix const loading on big endian systems +- glsl: fix constant packing for 64-bit big endian. +- gallivm/nir: fix big-endian 64-bit splitting/merging. +- llvmpipe: fix occlusion queries on big-endian. +- mesa/get: fix enum16 big-endian getting. +- draw/llvm: fix big-endian mask adjusting +- draw: pass nr_samplers into llvm sample state creation. +- llvmpipe: pass number of samplers into llvm sampler code. +- gallivm/sample: change texture function generator api +- gallivm: add indirect texture switch statement builder. +- draw: add support for indirect texture access +- llvmpipe: add support for indirect texture access. +- gallivm/nir: add texture unit indexing +- gallivm/nir: handle non-uniform texture offsets +- gallivm/sample: pass indirect offset into texture/image units +- llvmpipe/draw: wire up indirect offset +- gallivm/sample: handle size unit offset +- llvmpipe: enable ARB_gpu_shader5 +- draw: pass number of images to image soa create +- llvmpipe: pass number of images into image soa create +- gallivm/nir: support passing image index into image code. +- gallivm/nir: refactor image operations for indirect support. +- gallivm/img: refactor out the texel return type (v2) +- gallivm/nir: add support for indirect image loading +- draw/sample: add support for indirect images +- llvmpipe: handle indirect images properly +- ci: fixup tests after all indirect images fixes. +- docs: update llvmpipe GL 4.0 status +- draw/clip: cleanup viewport index handling code. +- draw/clip: fix viewport index for geometry shaders +- mesa/version: only enable GL4.1 with correct limits. +- llvmpipe: bump texture/scene limits to enable GL 4.1 +- llvmpipe: bump to GL support to GL 4.1 +- llvmpipe: enable GL 4.2 +- gallivm/nir: call end prim at end on all GS streams. +- draw: emit so primitives before ending empty pipeline. +- draw/gs: fix up current verts in output fetching. 
+- gallivm/draw/gs: pass vertex stream count into shader build +- draw/gs: only allocate memory for streams needed. +- gallivm/gs_iface: pass stream into end primitive interface. +- gallivm/nir: don't access stream var outside bounds +- gallivm/nir: end primitive for all streams. +- draw: account primitive lengths for all streams. +- draw/gs: reverse the polarity of the invocation/prims execution +- draw: use common exit path in pipeline finish. +- draw: free vertex info from geometry streams. +- draw/gs: use mask to limit vertex emission. +- ci/virgl: update results after streams fixes. +- llvmpipe: add ARB_post_depth_coverage support. +- llvmpipe: denote NEW fs when images change. +- llvmpipe: flush resources on sampler view binding +- llvmpipe/cs: fix image/sampler binding for compute +- nouveau: avoid LTO ODR warning (v2) +- gallivm/sample: always square rho before fast log2 +- llvmpipe/format: fix snorm conversion +- mesa: change dsa texture error codes for GL 4.6 +- ci: bump piglit checkout for dsa tests +- llvmpipe: fix stencil only formats. +- llvmpipe: fix position offset interpolation +- llvmpipe/cs: respect render condition +- llvmpipe: add framebuffer fetching support (v1.1) +- ci/llvmpipe: reenable gpu shader5 tests +- llvmpipe: enable EXT_texture_shadow_lod +- llvmpipe/draw: handle constant buffer limits and robustness (v1.1) +- drisw: add robustness extension support. +- glx/drisw: add robustness support +- llvmpipe: add device reset query context hook. +- llvmpipe: enable robust buffer access + GL 4.3, GLES 3.2 and robust buffer access behaviour +- llvmpipe/ms: fix sign extension bug in rasterizer. +- Revert "llvmpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS." +- radv: cleanup locking around timeline waiting. +- llvmpipe: only read 0 for channels being read +- llvmpipe/blit: for 32-bit unorm depth blits just copy 32-bit +- llvmpipe: enable GL 4.5 +- llvmpipe/cs: update compute counters not fragment shader. 
+- llvmpipe: include gallivm perf flags in shader cache. +- gallivm: disable brilinear for lod bias and explicit lod. + +David McFarland (1): + +- radv: link with ld_args_build_id + +David Stevens (2): + +- nir: Add colorspace support to YUV lowering pass +- i965/i915: Add colorspace support to YUV sampling + +Denys (1): + +- gitlab: Ask about reproduction rate in the issue template + +Dmitriy Nester (8): + +- mesa: check draw buffer completeness on glClearBufferfv/glClearBufferuiv +- nir: replace fnv1a hash function with xxhash +- freedreno: replace fnv1a hash function with xxhash +- i965: replace fnv1a hash function with xxhash +- util/hash_table: replace fnv1a hash function with xxhash +- r600: replace fnv1a hash function with xxhash +- zink: replace fnv1a hash function with xxhash +- util: delete fnv1a hash function + +Duncan Hopkins (1): + +- zink. Changed sampler default name. + +Dylan Baker (41): + +- docs: Add release notes for 20.0.6 +- docs: Add SHA256 sums for 20.0.6 +- docs: update calendar, add news item, and link releases notes for 20.0.6 +- docs: Add release notes for 20.0.7 +- docs/relnotes Add sha256 sums to 20.0.7 +- docs: update calendar, add news item, and link releases notes for 20.0.7 +- tests: Make tests aware of meson test wrapper +- meson: Bump required version to 0.52.0 +- meson: Use the check_header function +- meson: Use build_always_stale instead of build_always +- meson: Use builtins for checking gnu __attributes__ +- drm-shim/meson: The name of the target is a string not a list +- drm-shim/meson: Use portable override_options for setting C standard +- meson: use gnu_symbol_visibility argument +- meson: use 2 space not 3 space indent +- meson: deprecated 'true' and 'false' in combo options for 'enabled' and 'disabled' +- vulkan-overlay/meson: use install_data instead of configure_file +- docs: Add release notes for 20.0.8 +- docs: Add sha256sums for 20.0.8 +- docs: update calendar, add news item, and link releases notes for 20.0.8 +- 
mesa/swrast: use logf2 instead of util_fast_log2 +- VERSION: bump for 20.2.0-rc1 +- .pick_status.json: Update to 9333a8570d2174b73da63c3ee6f1a740ae487ab8 +- .pick_status.json: Update to 1e28745bc0d3528c1dfc25459456849feb58d407 +- meson/freedreno: Fix lua requirement +- .pick_status.json: Update to fdb97d3d2914c8f887a7968432db4fdbd35d8376 +- bump version for 20.2.0-rc2 +- .pick_status.json: Update to 61042b1bdb199f98dd34085ed29a8c492ed9b2a3 +- .pick_status.json: Update to 6d28270968e0728bf8bdf48a6abd261c50d9ef07 +- .pick_status.json: Update to ca7d66e847d08914cec0a5e003b400da9c0a2695 +- VERSION: bump for 20.2.0-rc3 +- .pick_status.json: Update to 7fbded8b5821a47c26245b181446f972f920a96e +- .pick_status.json: Mark e93979ba599355c42df01a89073362b970489a3a as denominated +- .pick_status.json: Update to b9927c8c8d0c105699306a68773c015930ff9509 +- VERSION: bump for 20.2.0-rc4 +- .pick_status.json: Update to ef980ac0c1cd65993ba0c1d20e1c09b45bfef99d +- fix: gallivm: disable brilinear for lod bias and explicit lod.
+- .pick_status.json: Update to a1f46d7b6943699e5efb60fbcfdd1450db85adb1 +- amd/ac_surface: convert tabs to 3 spaces +- .pick_status.json: Update to 90b98c06493f8a9759e5496d5ec91fb60edf7b92 +- .pick_status.json: Update to 472a20c5fc0feda0f074b4ff95fd7c7a6305c8cd + +Eduardo Lima Mitev (2): + +- freedreno: Centralize UUID generation into new files freedreno_uuid.c/h +- freedreno/uuid: Generate meaningful device and driver UUID + +Elie Tournier (12): + +- virgl: implement ARB_clear_texture +- virgl: Enable CAP_CLEAR_TEXTURE if host supports it +- docs/features: Add ARB_clear_texture to virgl +- gallium: add TGSI_PROPERTY_FS_BLEND_EQUATION_ADVANCED +- glsl_to_tgsi: Set TGSI_PROPERTY_FS_BLEND_EQUATION_ADVANCED +- virgl: Reserved last caps of capability_bits +- gallium: Add PIPE_CAP_BLEND_EQUATION_ADVANCED +- st: expose KHR_blend_equation_advanced if PIPE_CAP_BLEND_EQUATION_ADVANCED +- glsl_to_ir: do lower_blend_equation if PIPE_CAP_FBFETCH +- virgl: Use alpha_src_factor to store blend_equation_advanced value +- virgl: Encode barrier for blend_equation_advanced +- virgl: set PIPE_CAP_BLEND_EQUATION_ADVANCED + +Emmanuel (3): + +- meson: Do not enable USE_ELF_TLS for FreeBSD +- iris: Explicitly cast value to uint64_t +- i965: Explicitly cast value to uint64_t + +Emmanuel Gil Peyrot (2): + +- util/rand_xor: use getrandom() when available +- Expose EGL_KHR_platform_* when EXT is supported + +Emmanuel Vadot (1): + +- meson: Add versioning for xvmc tracker + +Eric Anholt (228): + +- freedreno/ir3: Initialize the unused dwords of the immediates consts. +- freedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops. +- freedreno/ir3: Leave bools as 1-bit, storing them in full regs. +- freedreno/ir3: Set up the block predecessors for a3xx TF +- freedreno/ir3: Fix the a3xx TF outputs stores. +- freedreno/ir3: Fix register allocation assertion failures. +- freedreno: Stop doing binning shaders other than the VS in shader-db.
+- freedreno/ir3: Skip tess epilogue if the program is missing stores. +- freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled. +- freedreno/ir3: Remove unused half precision shader key flag. +- freedreno: Emit debug messages when doing draw-time recompiles of shaders. +- freedreno/ir3: Improve shader key normalization. +- freedreno/ir3: Stop initializing regid of so->outputs during setup. +- freedreno/ir3: Set up outputs for multi-slot varyings. +- freedreno: Immediately compile a default variant of shaders. +- freedreno/ir3: Set the FS .msaa flag to true during precompiles. +- freedreno/ir3: Add some more tests of cat6 disasm. +- freedreno/ir3: Sync some new changes from envytools. +- freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx. +- freedreno/ir3: Disable sin/cos range reduction for mediump. +- ci: Clean up setup of the job-specific env vars in baremetal testing. +- ci: Enable IRC flake reporting on freedreno baremetal boards. +- ci: Improve the flakes reports on IRC. +- ci: Fix the nick used in IRC reporting. +- freedreno: Deduplicate ringbuffer macros with computerator/fdperf +- freedreno: Clean up tests around ORing in the reloc flags. +- freedreno: Rename append_bo() in case it doesn't get inlined. +- freedreno: Initialize the bo's iova at creation time. +- freedreno: Start moving relocs flags into the BOs. +- freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it. +- freedreno: Mark all ringbuffer BOs as to be dumped on crash. +- freedreno: Tell the kernel that all BOs are for writing. +- freedreno: Replace OUT_RELOCW with OUT_RELOC. +- freedreno: Drop the "write" arg to emit_const_bo now relocs don't care. +- nir: Fix count when we didn't lower load_uniforms but did shift load_ubos. +- freedreno: Fix non-constbuf-upload UBO block indices and count. +- freedreno: Add a nohw flag to skip submitting to the kernel. +- freedreno: Split the fd_batch_resource_used by read vs write. 
+- freedreno: Add an early out for preparing to read a resource. +- freedreno: Move the resource_read early out to an inline. +- freedreno: Skip taking the lock for resource usage if it's already flagged. +- freedreno/a4xx+: Increase max texture size to 16384. +- freedreno/a6xx: Improve layout testcase logging for UBWC fails. +- freedreno/a6xx: Add a testcase for UBWC buffer sharing. +- freedreno: Pull the tile_alignment lookup for a layout to a helper. +- freedreno/a6xx: Fix UBWC blockheight for RG8. +- freedreno/a6xx: Fix UBWC mipmap sizing. +- freedreno/a6xx: Fix UBWC mipmapping height alignment. +- nir: Include num_ubos in the printed shader (if nonzero). +- freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa). +- freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift. +- freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges. +- freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf. +- freedreno/a6xx: Use LDC for UBO loads. +- freedreno: Drop the noubo fails list for CI, since there aren't any now. +- freedreno: Fix attempts to push UBO contents past the constlen on pre-a6xx. +- freedreno: Fix resource layout dump loop. +- freedreno: Avoid duplicate BO relocs in FD_RINGBUFFER_OBJECTs. +- ci: Move cross file generation to a shared script. +- ci: Autodetect whether we need cross setup in lava_arm builds. +- ci: Make cmake toolchain file for deqp cross build setup. +- ci: Make the create-rootfs more resilient. +- ci: Update versions of packages to remove from rootfses. +- ci: Switch the baremetal runner to be an x86 docker image. +- ci: Disable SMP on the a5xx boards. +- ci: Make a530's GLES3/31 fractional runs much more complete. +- freedreno/a5xx: Move resource layout to fdl. +- freedreno/fdl: Separate the list of a6xx testcases from the test code. +- freedreno/a5xx: Add the outline of a unit test for a5xx layout. +- freedreno/a5xx: Set MIN_LAYERSZ on 3D textures like we do on a6xx.
+- freedreno/a5xx: Define the 2D blit UBWC pitch fields +- ci: Fix DEQP_CASELIST_FILTER (used by a630 noubo run) +- ci: Do an explicit NIR validation-enabled pass on freedreno a630. +- ci: Don't forget to set NIR_VALIDATE in baremetal runs. +- ci: Enable a fractional run with UBO-to-constbuf disabled on a3xx. +- ci: Improve baremetal's logging of the job env var passthrough. +- freedreno/a6xx: Fix the size of buffer image views. +- freedreno: Fix printing of unused src in disasm of cat6 RESINFO. +- freedreno: Add more resinfo/ldgb testcases. +- freedreno: Fix resinfo asm, which doesn't have srcs besides IBO number. +- freedreno: Set the immediate flag in a4/a5xx resinfos. +- freedreno/ir3: Refactor out IBO source references. +- freedreno/ir3: Move handle_bindless_cat6 to compiler_nir and reuse. +- freedreno/ir3: Use RESINFO for a6xx image size queries. +- ci: Drop double ".txt" suffix on the unexpected results file. +- ci: Drop old comment about enabling --deqp-watchdog. +- ci: Auto-detect the architecture for VK ICD filenames. +- ci: Add DEQP_EXPECTED_RENDERER support for VK tests. +- ci: Move baremetal DEQP_NO_SAVE_RESULTS setup to the yml. +- ci: Quick exit qpa extraction for non-matching qpas. +- ci: Disable the firmware loader user helper option in arm64 kernels. +- ci: Build a cheza kernel. +- ci: Add scripts for controlling bare-metal chezas. +- ci: Switch cheza (freedreno a630) testing to baremetal. +- ci: Don't build an arm_test container now that the last user is gone. +- ci: Rename x86_cross_arm_test to just arm_test. +- turnip: Move vertex buffer bindings to SET_DRAW_STATE. +- turnip: Don't bother clamping VB size. +- turnip: Simplify vertex buffer bindings. +- turnip: Use tu_cs_emit_regs() for BLEND_CONTROL. +- turnip: Add support for alphaToOne. +- freedreno/a6xx: Add support for ALPHA_TO_ONE. +- freedreno: Upload gallium constbufs as needed when referenced as a UBO. +- freedreno/ir3: Refactor ir3_cp's lower_immed(). 
+- freedreno/ir3: Stop pushing immediates once we've filled the constbuf. +- freedreno/ir3: Drop unnecessary alignment of pushed UBO size. +- freedreno/ir3: Stop shifting UBO 1 down to be UBO 0. +- freedreno/ir3: Account for driver params in UBO max const upload. +- freedreno/ir3: Drop the max_const on a6xx to 512. +- freedreno/ir3: Handle cases where we decide not to lower UBO 0 loads. +- turnip: Fix crashes in compute with no descriptors to load. +- ci: Bump up to the current version of the VK CTS. +- ci: Disable shader cache on vulkan CI runs. +- ci: Build the full VK CTS for baremetal testing. +- ci: Enable pre-merge fractional vulkan CTS runs on the turnip driver. +- ci: Use rsync for initial nfsroot population on cheza. +- turnip: Expose robustBufferAccess. +- freedreno/a6xx: Fix clip_halfz support. +- ci: Leave a note as to what might be going on with a test. +- ci: Fix weird filesystem globs appearing in failed test .qpa files. +- ci: Disable some flaky tests on turnip. +- ci/bare-metal: Reword the final output of the init script on the board. +- ci/bare-metal: Make which test to run configurable. +- ci/bare-metal: Use the deqp-runner bits straight out of the artifacts. +- ci/bare-metal: Stop fetching the git tree. +- ci/bare-metal: Terminate the job with an error on kernel panic. +- docs: Replace ancient swrast conformance docs with more current information. +- docs: Add dri-devel to the mailing lists and drop the DRI wiki link. +- ci: disable the windows tests until the runner can be stabilized again +- ci: Bump vulkan CTS to 1.2.3.0. +- ci: Enable NIR validation on a630 GLES2 and VK tests. +- ci/bare-metal: Skip setting of unset variables at startup. +- ci/bare-metal: Don't include dev packages in arm*test. +- ci/tracie: Print the path if the trace isn't found. +- ci/tracie: Fix apitrace dump using "less" which isn't in the ARM rootfs. +- ci: Add a freedreno a630 tracie run. +- freedreno/a6xx: Define the register fields for polygon fill mode. 
+- turnip: Add support for polygon fill modes. +- freedreno/a6xx: Add support for polygon fill mode (as long as front==back). +- ci: Remove a stray "always" on the freedreno traces job. +- ci/bare-metal: Fail early when we get stuck powering on a cheza. +- ci/baremetal: Bump the kernel to a recent drm-msm-fixes for msm semaphores. +- turnip: Do better TU_DEBUG=startup logging of drmGetDevices2() failure. +- turnip: Fix error handling of DRM_MSM_GEM_INFO ioctls. +- turnip: Properly return VK_DEVICE_LOST on queuesubmit failures. +- gallium/util: Add a helper function for point sprite handling. +- vc4: Enable PIPE_CAP_TGSI_TEXCOORD. +- v3d: Enable PIPE_CAP_TGSI_TEXCOORD. +- v3d: Fix -Wmaybe-uninitialized compiler warning in the v33 code. +- ci: Disable pixmark-piano trace on a630 due to GPU hangs. +- util: Avoid strict aliasing bugs in xxhash. +- util: Mark util_format_description() as a const function. +- softpipe: Clean up softpipe's SSBO load/store interpreting instructions. +- util: Remove unused util_format_planar_is_supported(). +- etnaviv: Use the util_pack_color_union() helper. +- gallium/util: Fix location of the comment about S8_UINT handling. +- gallium/util: Clean up the Z/S tile write path. +- gallium/util: Move the Z/S handling to the outside of get_tile(). +- svga: Reuse util_format_unpack_rgba(). +- util: Merge util_format_write_4* functions. +- util: Merge util_format_read_4* functions. +- util: Use designated initializers to clean up the format tables' pack/unpack. +- llvmpipe: Generalize "could llvmpipe fetch this format" check in unit testing. +- util: Remove the stub pack/unpack functions for YUV formats. +- util: Share a single function pointer for the 4-byte rgba unpack function. +- docs: Move the current CI .rst doc to docs/ci/ and link to it from .gitlab-ci. +- docs: Move the conformance and the CI docs to a top level Testing section. +- docs: Move the gitlab-ci docs to RST. +- docs: Relax the expectations of HW CI farms. 
+- docs: Document how to interact with docker containers. +- freedreno/ir3_cmdline: Fix an uninit var warning. +- freedreno/ir3: Fix uninit var warning. +- intel: Fix release-build warnings about sf_entry_size. +- intel/perf: Fix unused var warning in release builds. +- intel/perf: Move perf query register programming to static tables. +- freedreno/a2xx: Fix compiler warning in disasm. +- meson: Enable GCing of functions and data from compilation units by default. +- freedreno/ir3: Fix duplicated fine derivatives instructions. +- freedreno/ir3: Add unit tests for derivatives disasm. +- ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env. +- freedreno/ir3: Add a note about the instructions in the disasm test. +- freedreno/ir3: Add a bunch more tests for cat6 opcodes. +- freedreno/ir3: Refactor cat6 general dst printing. +- freedreno/ir3: Fix disasm of register offsets in ldp/stp. +- freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test. +- ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS. +- ci: Update checksums for freedreno traces. +- llvmpipe: Remove a bunch of default handling of pipe caps. +- llvmpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS. +- softpipe: Remove a bunch of default handling of pipe caps. +- softpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS. +- virgl: Remove a bunch of default handling of pipe caps. +- swr: Remove a bunch of default handling of pipe caps. +- swr: Use the default behavior of ALLOW_MAPPED_BUFFERS. +- svga: Remove a bunch of default handling of pipe caps. +- i915: Remove a bunch of default handling of pipe caps. +- softpipe: Refactor pipe_shader_state setup. +- softpipe: Convert to comma-separated SOFTPIPE_DEBUG for debug options. +- softpipe: Add support for reporting shader-db output. +- softpipe: Enable PIPE_CAP_TGSI_TEXCOORD. +- softpipe: Enable PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS; +- ci/bare-metal: Capture the first devcoredump a job produces. 
+- drm-shim: Return -EINVAL instead of abort()ing on unknown ioctls. +- docs: Explain how to set up a personal gitlab runner. +- nir: Add a pass to cut the trailing ends of vectors. +- i965: Enable vector shrinking in the vec4 backend. +- amd: Swap from nir_opt_shrink_load() to nir_opt_shrink_vectors(). +- nir: Remove the old nir_opt_shrink_load. +- freedreno: Fix "Offset of packed bitfield changed" warnings: +- nir/lower_amul: Use num_ubos/ssbos instead of recomputing it. +- nir: Add a little more docs about NIR's constant_data. +- nir: Print the constant data size associated with a shader. +- freedreno/ir3: Fix the type of half-float indirect uniform loads. +- freedreno/a6xx: Document the bit for the magic 32bit-uniforms-as-16b mode. +- freedreno/computerator: Set SP_MODE_CONTROL to the same value as vulkan/GL +- freedreno/ir3: Merge the redundant immediate_idx/immediates_count fields +- freedreno/ir3: Simplify the immediates from an array of vec4 to array of dwords. +- freedreno: Rename emit_const_bo() to emit_const_ptrs(). +- freedreno: Split ir3_const's user buffer and indirect upload APIs. +- freedreno/ir3: Clean up instrlen setup. +- freedreno: Increase the NUM_UNIT on compute's consts in indirect dispatch. +- freedreno: Add more asserts for DST_OFF/NUM_UNIT in indirect const uploads. +- freedreno/ir3: Fix assertion failures dumping CS high full regs. +- turnip: Make sure we include the build id. +- gallium/tgsi_exec: Fix up NumOutputs counting +- freedreno: Make the pack struct have a .qword for wide addresses. +- turnip: Fix truncation of CS shader iovas to 32 bits. +- turnip: Fix truncation of iovas to 32 bits in queries.
+ +Eric Engestrom (146): + +- cut 20.1 branch +- docs: update calendar for 20.1.0-rc2 +- post_version.py: fix branch name construction for release candidates +- post_version.py: invert `is_point` into `is_first_release` to make its purpose clearer +- post_version.py: stop adding release candidates to the index and relnotes +- docs: update calendar for 20.1.0-rc3 +- gitlab-ci: exclude scripts that don't affect the build +- util/rand_xor: make it clear that {,s_}rand_xorshift128plus take *exactly 2* uint64_t +- util/rand_xor: drop unused header +- util/rand_xor: fallback Linux to time-based instead of fixed seed +- util/rand_xor: extend the urandom path to all non-Windows platforms +- docs: update calendar for 20.1.0-rc4 +- anv: pass the fd directly to anv_gem_reg_read() +- anv: replace magic `| 1` with already #define'd name +- anv: disable VK_EXT_calibrated_timestamps when the timestamp register is unreadable +- git_sha1_gen.py: fix out-of-date comment +- git_sha1_gen.py: fix code style +- git_sha1_gen.py: fix whitespace +- compiler: delete leftover autotools test wrapper +- no_extern_c.h: fix typo in comment +- tree-wide: fix deprecated GitLab URLs +- docs: drop no-longer-relevant comment about bugzilla +- docs: Add release notes for 20.1.0 +- docs: update calendar, add news item, and link releases notes for 20.1.0 +- meson: remove "empty array"/"array of an empty string" confusion +- glapi: remove deprecated .getchildren() that has been replace with an iterator +- intel/genxml: drop sort_xml.sh and move the loop directly in gen_sort_tags.py +- intel: fix gen_sort_tags.py +- docs: Add release notes for 20.1.1 +- docs: update calendar, add news item, and link releases notes for 20.1.1 +- v3d: add missing unlock() in error path +- intel/genxml: drop python 2 support for gen_sort_tags.py +- intel/genxml: replace gen_sort_tags.py MIT licence with SPDX equivalent +- docs: update the blocks of unused EGL enums assigned to us +- i965: drop dead #include "config.h" +- 
iris: drop dead #include "config.h" +- gen_release_notes.py: update script to the new rST way of things +- post_version.py: update script to the new rST way of things +- intel/tools: rewrite run-test.sh in python +- intel/tools: make test aware of the meson test wrapper +- khronos-update.py: add script to simplify update of Khronos headers & xml files +- docs: remove plain-text copy of versions.rst +- util/os_file: replace broken windows-detection code with detect_os.h +- util: introduce os_dupfd_cloexec() helper +- replace all F_DUPFD_CLOEXEC with os_dupfd_cloexec() +- vulkan/wsi: replace all dup() with os_dupfd_cloexec() +- radv: replace all dup() with os_dupfd_cloexec() +- anv: replace all dup() with os_dupfd_cloexec() +- iris: replace all dup() with os_dupfd_cloexec() +- i965: replace all dup() with os_dupfd_cloexec() +- egl: replace all dup() with os_dupfd_cloexec() +- etnaviv: replace all dup() with os_dupfd_cloexec() +- freedreno: replace all dup() with os_dupfd_cloexec() +- svga: replace all dup() with os_dupfd_cloexec() +- virgl: replace all dup() with os_dupfd_cloexec() +- docs: publish our release maintainers' keys +- docs: remind release maintainers to sign the tarballs and publish their key +- docs: suggest alternative installation methods for meson +- docs: stop considering `Cc: mesa-stable` as an email address +- docs: reword "sending a patch revision" to "updating a merge request" +- docs: drop `git sendemail` instructions +- docs: prefer `Fixes:` over `Cc: mesa-stable` +- docs: add some formatting to the "backport merge request" option +- docs: reword a sentence a bit +- docs: make it clear that the tags needs to be in the commit message +- docs: move `Fixes:` tag explanation to its own section +- docs: move "stable" tag explanation next to `Fixes:` +- driconf: drop 28% catalan translation +- driconf: drop 15% german translation +- driconf: drop 26% spanish translation +- driconf: drop 6% french translation +- driconf: drop 8% dutch translation +- 
driconf: drop 9% swedish translation +- driconf: drop now unused translation facility +- util: rename xmlpool.h to driconf.h +- gitlab-ci: drop gettext from the build images +- docs: drop deleted file from extra sphinx files +- docs: cat maintainer keys to a single file +- docs: add some padding to the release calendar +- docs: add planning for 20.2 +- bin/symbols-check: explain C++ symbols workaround +- docs: Add release notes for 20.1.2 +- docs: update calendar and link releases notes for 20.1.2 +- docs: fix 20.1.2 relnotes +- docs: add a page explaining the GitLab CI and the Intel CI +- mesa/glformats: make _mesa_gles_error_check_format_and_type() more consistent +- docs: add release notes for 20.1.3 +- docs: update calendar and link releases notes for 20.1.3 +- docs: fix a bunch of typos +- egl: always compile surfaceless +- vulkan: automatically compile the `display` platform when available +- meson: move xlib-lease block further down +- egl: automatically compile the `drm` platform when available +- introduce `commit_in_branch.py` script to help devs figure this out +- bin/gen_release_notes.py: drop new_features.txt when we release XX.Y.0 +- egl/wayland: add missing newline between functions +- glx: drop always-true #ifdef +- docs/submittingpatches: add more than one `Cc: mesa-stable` example to the examples list +- meson/intel: add missing dep on git_sha1.h +- meson: fix android vulkan build +- egl: inline fallback for create_pixmap_surface +- egl: inline fallback for create_pbuffer_surface +- egl: drop unused fallback function +- egl: inline fallback for swap_buffers_with_damage +- egl: inline fallback for swap_buffers_region +- egl: inline fallback for post_sub_buffer +- egl: inline fallback for copy_buffers +- egl: inline fallback for query_buffer_age +- egl: inline fallback for create_wayland_buffer_from_image +- egl: inline fallback for get_sync_values +- egl: drop now empty egl_dri2_fallbacks.h +- egl: mark the rest of the callbacks as mandatory or 
optional +- egl: inline _EGLAPI into _EGLDriver +- docs: add release notes for 20.1.4 +- docs: update calendar and link releases notes for 20.1.4 +- post_version.py: don't generate relnotes twice +- post_version.py: drop incorrect conf.py changes +- post_version.py: stop using non-existent functions and fix commit message +- post_version.py: update the files in the current worktree, not the one with the script that we run +- post_version.py: fix relnotes links +- bin/gen_release_notes: automatically commit release notes +- docs/releasing: improve wording +- bin/khronos-update: having a folder in include/ is not a requirement +- bin/khronos-update: add support for the SPIRV files +- bin/khronos-update: add workaround for python bug 9625 +- egl: replace _eglInitDriver() with a simple variable +- egl: drop unnecessary _eglGetDriver() +- egl: fix _eglMatchDriver() return type +- egl: inline _eglMatchAndInitialize() and refactor _eglMatchDriver() +- egl: rename _eglMatchDriver() to _eglInitializeDisplay() +- egl: drop left-over function prototype +- egl: const _eglDriver +- egl/haiku: drop overwritten preset of EGL version +- egl: consistently use dri2_egl_display() helper macro +- meson: fix `-D xlib-lease=auto` detection +- docs: add release notes for 20.1.5 +- docs: update calendar and link releases notes for 20.1.5 +- pick-ui: specify git commands in "resolve cherry pick" message +- egl/entrypoint-check: split sort-check into a function +- egl/entrypoint-check: add check that GLVND and plain EGL have the same entrypoints +- driconf: fix force_gl_vendor description +- meson: bump required glvnd version +- egl/x11_dri3: enable & require xfixes 2.0 +- egl/x11_dri3: implement EGL_KHR_swap_buffers_with_damage +- meson: don't advertise TLS support if glx wasn't build with it +- meson: drop leftover PTHREAD_SETAFFINITY_IN_NP_HEADER + +Erico Nunes (16): + +- lima/ppir: introduce liveness internal live set +- lima/ppir: fix lod bias register codegen +- lima/ppir: do not 
assume single src for pipeline outputs +- lima/ppir: combine varying loads in node_to_instr +- lima/ppir: duplicate intrinsics in nir +- lima/ppir: duplicate consts in nir +- lima/ppir: remove unused clone functions +- lima/ppir: rework emit nir to ppir +- lima/ppir: rework store output +- lima/ppir: add fallback mov option for const scheduler +- lima/ppir: rework select conditions +- lima/ppir: handle failures on all ppir_emit_cf_list paths +- lima/ppir: improve handling for successors in other blocks +- lima/ppir: rework tex lowering +- lima/ppir: optimize tex loads with single successor +- lima/ppir: use a ready list in node_to_instr + +Erik Faye-Lund (124): + +- compiler/nir: move tan-calculation to helper +- vtn/opencl: add native_tan-support +- vtn/opencl: native variants of sin/cos +- vtn/opencl: native divide support +- vtn/opencl: native powr support +- vtn/opencl: native recip support +- vtn/opencl: native rsqrt support +- vtn/opencl: native sqrt support +- compiler/glsl: explicitly store NumUniformBlocks +- mesa/st: consider NumUniformBlocks instead of num_ubos when binding +- zink: use nir_lower_uniforms_to_ubo +- zink: lower b2b to b2i +- util/os_memory: never use os_memory_debug.h +- st/wgl: pass st_context_iface into stw_st_framebuffer_present_locked +- st/wgl: allocate and resolve msaa-textures +- docs/features: add zink features +- zink: load vk_GetMemoryFdKHR while creating screen +- zink: add a GET_PROC_ADDR macro to simplify load_device_extensions +- docs/features: mark GL_NV_conditional_render as done for zink +- zink: disable vkCmdResolveImage when respecting render-condition +- zink: do not expose real value for PIPE_CAP_MAX_VIEWPORTS +- zink: correct PIPE_SHADER_CAP_MAX_SHADER_IMAGES +- zink: mark depth-component cube-maps as done +- zink: implement i2b1 +- docs: fix broken release-calendar +- zink: hammer in an explicit wait when retrieving buffer contents for reading +- zink: use samples from state +- zink: do not dig into resource for 
nr_samples +- zink: pass batch instead of context for queries +- zink: implement nir_texop_txf_ms +- zink: expose PIPE_CAP_TEXTURE_MULTISAMPLE +- docs/features: mark GL_ARB_texture_multisample as done for zink +- zink: use general-layout when blitting to/from same resource +- zink: Use store_dest_raw instead of storing an uint +- nir: reuse existing psiz-variable +- zink: emulate B8G8R8X8_SRGB with B8G8R8A8_SRGB +- zink: assert that image-view format isn't undefined +- zink: only report device-local memory as video-memory +- gallium/hud: do not specify potentially invalid depth-range +- TEMP: add rst-conversion scripts +- docs: convert articles to restructuredtext +- TEMP: remove rst-conversion scripts +- docs: delete no longer needed file +- docs: fixup botched table +- docs: escape double colons +- docs: escape asterisks +- docs: escape trailing underscores properly +- docs: fixup broken rst +- docs: fixup heading-levels +- docs: use sphinx +- docs: disable syntax-highlighting by default +- docs: use code-block with caption instead of table +- docs: format notes as rst-notes +- docs: use code-blocks +- docs: drop open-coded toc for articles +- docs: add xlibdriver to table-of-contents +- docs: do not copy source-files to site +- docs: use rst footnotes instead of manual ones +- docs: reformat license table as rst table +- docs: use rst-note for highlighted text +- docs: bundle extra files +- docs: include specs into the generated docs +- gitlab-ci: build and deploy docs +- docs: drop news in favour of the introduction as index-page +- README: update references to internal docs +- docs: update internal references +- docs/relnotes: update internal references +- radv: update internal reference +- bin/perf-annotate-jit.py: update internal reference +- docs/release-calendar: restore missing id +- nir: do not try to merge xfb-outputs +- Revert "gallium/hud: don't use user vertex buffers" +- gallium/hud: don't use user vertex buffers +- zink: enable cull-distance if
supported +- zink: expose GLSL 1.30 +- docs: update internal references +- docs/relnotes: update internal references +- docs: fixup relnotes after rst-conversion +- docs/features: mark GL3 as complete for zink +- docs/features: update ARB_texture_buffer_object line +- docs/features: remove driver-list for forward-compatible context +- mesa/main: fix inverted condition +- gallium/os: call "ANSI" version of GetCommandLine +- graw/gdi: do not depend on UNICODE macro +- gallium/util: limit STACK_LEN on Windows +- gallium/util: add missing include +- docs: update favicon +- docs: remove non-existent reference +- docs: restore accidentally dropped labels +- docs: fix internal references +- docs: use ref-links for internal references +- gallium/docs: update to recent sphinx +- gallium/docs: fixup formatting of numbered lists +- gallium/docs: remove reference to non-existent label +- gallium/docs: use none for highlight_language +- gallium/docs: prefix exts dir with underscore +- gallium/docs: remove non-existent static dir +- gallium/docs: remove unused imgmath extension +- ci: only build docs in the upstream-repo +- ci: only build docs if any docs changed +- ci: test docs for non-master builds +- ci: move deploy-stage later in the pipeline +- ci: move test-docs to container stage +- ci: add graphviz to the .docs-base template +- merge gallium docs into main docs +- docs: clean up gallium index-file +- docs: add an extension to generate redirects +- docs: move gallium specific docs into gallium folder +- docs: use svg for graphviz output +- docs: fixup envvar output +- zink: expose depth-clip if supported +- mesa/main: factor out one-time-init into a helper +- mesa/main: use call_once instead of open-coding +- gallium/util: do not use _MTX_INITIALIZER_NP on Windows +- mesa/main: use p_atomic_inc_return instead of locking +- mesa: do not use bitfields for advanced-blend state +- mesa: treat Color._AdvancedBlendMode as enum +- zink: use ralloc in nir-to-spirv +- zink: use 
ralloc for plain malloc-calls +- zink: pass mem_ctx to ralloc_size-call +- zink: use ralloc for spirv_builder as well +- mesa/program: fix shadow property for samplers +- docs: add some very basic documentation about zink +- mesa: handle GL_FRONT after translating to it + +Francisco Jerez (23): + +- intel/ir: Update performance analysis parameters for memory fence codegen changes. +- iris: Simplify iris_batch_prepare_noop(). +- iris: Extend iris_context dirty state flags to 128 bits. +- iris: Add batch-local synchronization book-keeping to iris_bo. +- iris: Add infrastructure to partition batch into sync boundaries. +- iris: Bracket batch operations which access memory within sync regions. +- iris: Annotate all BO uses with domain and sequence number information. +- iris: Drop redundant iris_address::write flag. +- iris: Report use of any in-flight buffers on first draw call after sync boundary. +- iris: Introduce cache coherency matrix for batch-local memory ordering. +- iris: Update cache coherency matrix on PIPE_CONTROL. +- iris: Implement buffer-local memory barrier based on cache coherency matrix. +- iris: Insert buffer barrier in existing cache flush helpers. +- iris: Remove batch argument of iris_resource_prepare_access() and friends. +- iris: Perform compute predraw flushes from compute batch. +- iris: Remove depth cache set tracking and synchronization. +- iris: Remove render cache hash table-based synchronization. +- iris: Open-code iris_cache_flush_for_read() and iris_cache_flush_for_depth(). +- iris: Emit single render target flush PIPE_CONTROL on format mismatch. +- iris: Remove iris_flush_depth_and_render_caches(). +- OPTIONAL: iris: Perform BLORP buffer barriers outside of iris_blorp_exec() hook. +- iris/icl+: Report same caching domain as main surface for clear color BO. +- intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence. 
+ +Frank Binns (2): + +- docs: change "Fixes:" tag example to match git fixes output +- egl/dri2: only take a dri2_dpy reference when binding a new context/surfaces + +Frédéric Bonnard (2): + +- clover: Fix types collision between c++ and altivec +- meson: Revert commit overriding C++ standard with gnu++11 on ppc64el + +Gert Wollny (66): + +- r600: Annotate some case fallthroughs +- r600: remove unused static functions +- r600/sb: replace memset by using member initialization/assignment +- r600: remove some unused variables to silence warnings +- r600: Fix warning regarding mixing enums and unsigned in ?: expression +- r600: Fix nir compiler options, i.e. don't lower IO to temps for TESS +- r600/sfn: Unify semantic name and index query and use TEXCOORD semantic +- r600/sfn: Fix printing vertex fetch instruction flags +- r600: Lower int64 ops from TGSI-to-NIR shaders too +- r600: Lower lerp after tgsi_to_nir +- r600: Add support for loading index register from other than chan X +- r600/sfn: Handle CF index loading from non-X channel +- r600/sfn: rework getting a vector and uniforms from the value pool +- r600/sfn: Skip move instructions if they are only ssa and without modifiers +- r600/sfn: re-use an allocated register in lookup +- r600/sfn: skip copying LOD if the target register is the same +- r600/sfn: Fix memring print output +- r600/sfn: Fix RING instruction assembly emission +- r600/sfn: Fix GDS assembly emission +- r600/sfn: Fix RAT instruction assembly emission +- r600/sfn: Make allocate_reserved_registers forward to a virtual function +- r600/sfn: Fix handling of output register index +- r600/sfn: Make 3vec loads skip possible moves +- r600/sfn: Add support for viewport index output +- r600/sfn: Take FOGC, and backcolors into account in GS outputs +- r600/sfn: Handle loading sample_pos +- r600/sfn: Add FS output sample_mask +- r600/sfn: Don't reject VARYING_SLOT_PCNT +- r600/sfn: remove pointless check +- r600/sfn: assert when alu dest is missing +-
r600/sfn: support indirect sampler buffer reads. +- r600/sfn: Add support for texture_samples +- r600/sfn: use the per shader atomic base +- r600/sfn: SSBO: Fix query of dest components +- r600/sfn: Fix clip vertex output as possible stream variable +- r600/sfn: Fix splitting constants that come from different kcache banks. +- r600/sfn: Don't reorder outputs by location +- r600/sfn: Fix printing ALU op without dest +- r600: Fix duplicated subexpression in r600_asm.c +- r600/sfn: Fix mapping for f32tof64 and f64tof32 +- r600/sfn: use modern c++ in printing LDS read instruction +- r600/sfn: Correctly update the number of literals when forcing a new group +- r600/sfn: remove debug output leftover +- nir: lower_tex: Don't normalize coordinates for TXF with RECT +- r600/sfn: lower image derefs +- r600/sfn: Add imageio support +- r600/sfn: Add support for image_size +- r600/sfn: Add support for reading cube image array dim. +- r600/sfn: Take SSBO buffer ID offset into account +- r600/sfn: Handle memory_barrier +- r600/sfn: Add lowering pass for shared IO +- r600/sfn: Add support for shared atomics +- r600/sfn: Don't set num_components on TESS sysvalue intrinsics +- r600/sfn: lower rotate ALU ops +- r600/sfn: Pipe through requesting a register at a given channel +- r600/sfn: emit texture instructions in one block +- r600/sfn: Add option to get a temp value for a specific channel +- r600/sfn: correct handling of loading vec4 with fetching constants +- r600/sfn: Add a forced output swizzle for depth write +- r600/sfn: Fix Ring output swizzle masks +- r600/sfn: Fix default z swizzle for GDS instructions +- r600: Add shader key item to identify when the sample mask should be used +- r600/sfn: Only use sample mask if the according shader key is set +- r600/sfn: Make the pin_to_channel generic +- r600/sfn: write stream outputs to correct mem ring +- gallivm/nir: Lower uniforms to UBOs in llvm draw if the driver didn't request this already + +Greg V (1): + +- gallium,util: undef
ALIGN on FreeBSD to prevent name clash + +Guido Günther (2): + +- etnaviv: drm: Use NSEC_PER_SEC +- etnaviv: drm: Normalize nano seconds + +Gurchetan Singh (1): + +- virgl: apply bgra dest swizzle and add Portal 2 + +Hanno Böck (1): + +- Properly check mmap return value + +Hyunjun Ko (6): + +- freedreno,tu: Don't request fragcoord components not being read. +- tu,radv: fix potentially wrong offset of flexible array. +- vulkan: Adds helpers for vk_object (de)allocation and (de)initialization. +- tu: Fix wrong copies of sampler descriptor. +- turnip: Use the common base object type and struct. +- turnip: implement VK_EXT_private_data + +Iago Toral Quiroga (7): + +- v3d/compiler: don't rewrite unused temporaries to point to NOP register +- v3d/compiler: fix spill offset +- v3d/compiler: fix image size for 1D arrays +- nir/lower_clip: make the pass compatible with Vulkan semantics +- v3d/compiler: handle compact varyings +- v3d/compiler: request fragment shader clip lowering to be vulkan compatible.
+- nir/lower_tex: skip lower_tex_packing for the texture samples query + +Ian Romanick (24): + +- nir/algebraic: Recognize open-coded byte or word extract from bfe +- nir/algebraic: Split ibfe and ubfe with two constant sources +- nir/algebraic: Optimize some bfe patterns +- nir/algebraic: Optimize ushr of pack_half, not ishr +- nir/algebraic: Add some half packing optimizations for pack_half_2x16_split +- nir/algebraic: Eliminate useless extract before unpack +- i965: Assert that blorp always handles color blits +- meta: Make _mesa_meta_texture_object_from_renderbuffer static +- meta: Make _mesa_meta_setup_sampler static +- meta: Remove support for clearing integer buffers +- mesa: Add matrix utility functions to load matrices +- mesa: Add function to calculate an orthographic projection +- meta: Stop frobbing MatrixMode +- meta: Use same vertex coordinates for GLSL and FF clears +- meta: Coalesce the GLSL and FF paths in meta_clear +- meta: Remove support for multisample blits +- anv/tests: Don't rely on assert or changing NDEBUG in tests +- anv/tests: Silence unused parameter warnings in main +- anv: Silence unused parameter warning in anv_image_get_clear_color_addr +- intel: Silence unused parameter warning in __intel_log_use_args +- intel/drm-shim: Add noop ioctl handler for set_tiling +- intel/drm-shim: Return correct values for I915_PARAM_HAS_ALIASING_PPGTT +- glsl: Remove integer matrix support from ir_dereference_array::constant_expression_value +- nir/algebraic: Don't distribute absolute-value into dot-products + +Icecream95 (78): + +- pan/midgard: Fix old style shadows +- panfrost: Fix background showing when using discard +- panfrost: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED +- panfrost: Decode AFBC flag bits +- panfrost: Only use AFBC YTR with RGB and RGBA +- pan/midgard: Use a signed value for checking inline constants +- Revert "panfrost: Keep cached BOs mmap'd" +- panfrost: Mark PIPE_BUFFER BOs as not renderable +- pan/mdg: Add a macro for printing
instruction source information +- pan/mdg: Move r1.w writeout to branch->dest +- pan/mdg: Remove old zs store lowering +- pan/mdg: Remove old depth writeout code +- pan/mdg: Remove writeout case from bytemask_of_read_components +- nir: Replace the zs_output_pan intrinsic with combined_output_pan +- pan/mdg: Replace writeout booleans with a single value +- pan/mdg: Add new depth writeout code +- pan/mdg: Move search_var to earlier in midgard_compile.c +- pan/mdg: Add depth/stencil support to emit_fragment_store +- pan/mdg: Add new depth store lowering +- pan/mdg: Print writeout sources in mir_print_instruction +- panfrost: Add writes_stencil to the EARLY_Z disable list +- panfrost: Move sampler view bo creation to a separate function +- panfrost: Create a new sampler view bo when the layout changes +- panfrost: Tiled to linear layout conversion +- panfrost: Clean up panfrost_frag_meta_rasterizer_update +- panfrost: Implement ARB_depth_clamp +- pan/decode: Fix helper invocations when tracing +- pan/decode: Add missing wrap modes +- pan/mdg: Fix max_comp calculation for constant printing +- panfrost: RGBA4 and RGB5_A1 framebuffer support +- panfrost: Update sampler views when the texture bo changes +- panfrost: Copy resources when mapping to avoid waiting for readers +- panfrost: Only copy resources when they are in a pending batch +- panfrost: Add PAN_MESA_DEBUG=gl3 flag +- panfrost: Do fine-grained flushing for occlusion query results +- pan/mdg: Vectorize vlut operations +- pan/decode: Make mapped memory read-only while decoding +- nir: Add a base value to load_raw_output_pan +- panfrost: Fix MALI_READS_TILEBUFFER +- pan/mdg: Handle tilebuffer wait loops +- pan/mdg: Use the writeout tag for tilebuffer wait loops +- panfrost: Add rt formats to shader state +- panfrost: Add a bitset of render targets read by shaders +- pan/mdg: Do the pan_lower_framebuffer pass later +- pan/mdg: Emit a tilebuffer wait loop when needed +- pan/mdg: Handle non-blend framebuffer lowering 
+- pan/mdg: Support MRT in output load lowering +- pan/mdg: Set the z/s store intrinsic base correctly +- pan/mdg: Use a 32-bit ld_color_buffer op when needed +- panfrost: Implement texture_barrier +- panfrost: Stop keying on rt format when using native loads +- panfrost: Use f2fmp for framebuffer lowering conversions +- panfrost: Enable framebuffer fetch +- pan/mdg: Fix non-debug compilation +- compiler: Add dual-source factors to blend_factor +- gallium: Dual source support in blend_factor_to_shader +- pan/mdg: Add a nir pass to reorder store_output intrinsics +- pan/mdg: Dual source blend input/writeout support +- pan/mdg: Skip z/s combining for dual-source writes +- panfrost: Dual source blend support +- pan/decode: Open the dump file later +- pan/mdg: Don't disassemble blit shaders +- panfrost: Rename lower_store to is_blend in pan_lower_framebuffer +- pan/mdg: Do per-sample framebuffer loads +- panfrost: Do per-sample shading when outputs are read +- nir: Add a face_sysval argument to nir_lower_two_sided_color +- nir: Fix lower_two_sided_color when the face is an input +- panfrost: Report TEXTURE_BUFFER_OBJECTS cap when gl3 flag set +- panfrost: Set depth_enabled when stencil is enabled +- nir: Set the alignment for SSBO lowering +- panfrost: Make panfrost_bo_wait take a wait_readers bool +- panfrost: Fix calls to panfrost_flush_batches_accessing_bo +- panfrost: Fake RGTC support +- panfrost: Use more tilebuffer sizes +- panfrost: 8x MRT support +- pan/mdg: Use the blend RT for blend shader framebuffer fetches +- panfrost: Allow PIPE_TEXTURE_1D_ARRAY textures +- pan/mdg: Fix spilling of non-32-bit types + +Icenowy Zheng (1): + +- panfrost: signal syncobj if nothing is going to be flushed + +Ilia Mirkin (14): + +- freedreno/a3xx: there's no r8i/ui rb format, only rg8i/rg8ui +- freedreno/a3xx: reinstate rgb10_a2ui texture format +- freedreno/ir3: avoid applying (sat) on bary.f +- freedreno/a3xx: fix const footprint +- freedreno: fix off-by-one in assertions
checking for const sizes +- freedreno/a3xx: parameterize ubo optimization +- freedreno/a3xx: fix rasterizer discard +- nouveau: allow invalidating coherent/persistent buffer backings +- st/mesa: allow R8 to not be exposed as renderable by driver +- a4xx: add noperspective interpolation support +- a4xx: add polygon offset clamp, fix units +- ir3: mark ucp_enables as allowed values on all keys +- a4xx: hook up centroid ij coords +- ir3: use empirical size for params as used by the shader + +Indrajit Kumar Das (2): + +- st/mesa: use fragment shader to copy stencil buffer +- st/mesa: optimize DEPTH_STENCIL copies using fragment shader + +Italo Nicola (17): + +- panfrost: Fix outmods on int to float conversions +- pan/mdg: fix src_type in instructions that need a implicit zero +- pan/mdg: prepare effective_writemask() +- pan/mdg: eliminate references to ins->alu.op +- pan/mdg: eliminate references to ins->alu.reg_mode +- pan/mdg: fix comment +- pan/mdg: eliminate references to ins->alu.outmod +- pan/mdg: apply float outmods to textures +- pan/mdg: eliminate references to ins->texture.op +- pan/mdg: eliminate references to ins->load_store.op +- pan/mdg: defer register packing +- pan/mdg: externalize mir_pack_mod +- pan/mdg: remove ins->alu +- pan/mdg: refactor emit_alu_bundle +- pan/mdg: defer branch packing +- pan/mdg: remove ins->br_compact and ins->branch_extended +- pan/mdg: emit REGISTER_UNUSED on unused ALU src2 + +Iván Briano (9): + +- anv: use the correct format on Android +- anv: Disable B5G6R5_UNORM_PACK16 +- anv: Add a way to reserve states from a pool +- anv: Implement VK_EXT_custom_border_color +- anv: support externally synchronized pipeline caches +- anv: implement VK_PIPELINE_CREATE_FAIL_ON_PIPELINE_COMPILE_REQUIRED_BIT_EXT +- anv: enable VK_EXT_pipeline_creation_cache_control +- anv: Add VK_EXT_custom_border_color to relnotes +- anv: fix allocation of custom border color pool + +James Park (1): + +- amd/llvm: Reorder LLVM headers + +James Zhu (1): + +- 
ac/gpu_info: Correct Arcturus cu bitmap + +Jan Beich (5): + +- drm-uapi: Add sync_file.h +- anv,iris: unbreak on BSDs after 812cf5f522ab,abf8aed68047 +- util: enable futex usage on BSDs after 7dc2f4788288 +- meson: unbreak sysctl.h detection on BSDs +- anv: disable i915_perf warning on non-Linux + +Jan Palus (1): + +- targets/opencl: fix build against LLVM>=10 with Polly support + +Jan Zielinski (1): + +- gallium/swr: Fix crashes in sampling code + +Jason Ekstrand (167): + +- intel/eu: Use non-coherent mode (BTI=253) for stateless A64 messages +- Revert "anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT)" +- vulkan: Allow destroying NULL debug report callbacks +- vulkan,anv: Add a common base object type for VkDevice +- anv: Stop clflushing events +- anv: Allocate CPU-side memory for events +- vulkan,anv: Add a base object struct type +- vulkan,anv: Move the DEFINE_HANDLE_CASTS macros to vk_object.h +- anv: Refactor setting descriptors with immutable sampler +- vulkan: Add run-time object type asserts in handle casts +- vulkan/wsi: Make wsi_swapchain inherit from vk_object_base +- anv/allocator: Add a start_offset to anv_state_pool +- vulkan/object: Always include the type +- anv,vulkan: Implement VK_EXT_private_data +- vulkan: Handle vkGet/SetPrivateDataEXT on Android swapchains +- nir: Make "divergent" a property of an SSA value +- util/list: Add a list pair iterator +- util/vma: Add an option to configure high/low preference +- util/vma: Add a debug print helper +- util/ra: Add [de]serialization support +- anv: Set 3DSTATE_VF_INSTANCING on the SVGS element +- anv: Set MOCS in 3DSTATE_CONSTANT_* on Gen9+ +- nir: Add some docs to the metadata types +- anv: Call vk_object_base_finish for image views +- anv: Fix descriptor set clean-up on BO allocation failure +- nir: Use 8-bit types for most info fields +- anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+ +- nir: Validate jump instructions as an instruction type +- nir: Use a switch statement in
nir_handle_add_jump +- nir: Add documentation for each jump instruction type +- nir/clone: Re-use clone_alu for nir_alu_instr_clone +- nir: Add a new helper for iterating phi sources leaving a block +- nir: Add a store_reg helper and use the builder in phis_to_regs +- nir: Add const to nir_intrinsic_src_components +- nir/lower_double_ops: Rework the if (progress) tree +- nir/opt_deref: Report progress if we remove a deref +- nir/copy_prop_vars: Record progress in more places +- nir: Fix sources for image atomic fadd +- intel/vec4: Stomp the return type of RESINFO to UINT32 +- intel/fs: Fix unused texture coordinate zeroing on Gen4-5 +- intel/fs: Emit HALT for discard on Gen4-5 +- anv/allocator: Compare to start_offset in state_pool_free_no_vg +- nir: Add a nir_metadata_all enum value +- nir: Add a nir_shader_preserve_all_metadata helper +- nir: Call nir_metadata_preserve on !progress +- nir: Properly preserve metadata in more cases +- intel/nir: Call nir_metadata_preserve on !progress +- iris: Better handle metadata in NIR passes +- anv: Add an anv_batch_set_storage helper +- anv: Add anv_pipeline_init/finish helpers +- nir/intrinsics: Put the _intel intrinsics together at the end +- anv: Use resolve_device_entrypoint for dispatch init +- vulkan: Update Vulkan XML and headers to 1.2.145 +- anv: Bump the advertised patch version to 145 +- intel/fs: Expose a couple of NIR lowering helpers +- intel/fs: Break wm_prog_data setup into a helper +- intel/fs: Move more prog_data setup into populate_wm_prog_data +- intel/compiler: Expose brw_texture_offset to C +- intel/eu: Add a brw_urb_dest_msg_type helper +- intel/eu: Set the right subnr for ALIGN16 destinations +- intel/eu: Add the RNDU opcode +- vulkan/wsi: Don't consider VK_SUBOPTIMAL_KHR to be an error condition +- wsi/x11: Log swapchain status changes +- freedreno: Only call nir_lower_io on shader_in/out +- lima: Only call nir_lower_io on shader_in/out +- nouveau: Only call nir_lower_io on shader_in/out +- vc4: Only 
call nir_lower_io on shader_in/out +- v3d: Only call nir_lower_io on shader_in/out +- panfrost: Only call nir_lower_io on shader_in/out +- nir: Assert that nir_lower_io is only called with allowed modes +- nir: Remove shared support from lower_io +- nir: Add docs to nir_lower[_explicit]_io +- anv: Handle clamping of inverted depth ranges +- nir/validate: Don't abort() until after the shader has printed +- spirv: Skip phis in unreachable blocks in the second phi pass +- spirv: Allow block-decorated struct types for constants +- vulkan: Update Vulkan XML and headers to 1.2.148 +- anv: Advertise VK_EXT_image_robustness +- spirv: Update headers and grammar json +- spirv: Add support for SPV_EXT_shader_atomic_float +- intel/fs: Use the correct logical op for global float atomics +- anv: Advertise support for VK_EXT_shader_atomic_float +- nir: Allow for system values with variable numbers of destination components +- nir/lower_io: Choose to set access based on intrinsic metadata +- nir/lower_io: Use b2b for shader and function temporaries +- nir/lower_io: Add support for global scratch addressing +- spirv: Simplify our handling of NonUniform +- spirv: Drop the void *ptr from vtn_value +- spirv: Fix indentation in vtn_handle_ptr +- spirv: Clean up OpSignBitSet +- spirv: Use nir_bany/ball for OpAny/All +- spirv: Add a helpers for getting types of values +- spirv: Rename push_value_pointer to push_pointer +- spirv: Add a vtn_push_nir_ssa helper +- spirv/amd: Use vtn_push_nir_ssa +- spirv: Add a vtn_get_nir_ssa helper +- spirv: Use the new helpers in OpConvertUToPtr/PtrToU +- spirv: Refactor vtn_push_ssa +- spirv/alu: Use vtn_push_ssa_value +- spirv/glsl450: Use vtn_push_ssa_value +- spirv/subgroups: Stop incrementing w +- spirv/subgroups: Refactor to use vtn_push_ssa +- spirv: Simplify vtn_ssa_value creation +- spirv: Hand-roll fewer vtn_ssa_value creations +- spirv: Add better checks for SSA value types +- spirv: Drop the sampled boolean from vtn_type +- spirv: Give atomic 
counters their own variable mode +- spirv: Add a helper for getting the NIR type of a vtn_type +- spirv: Remove a dead case in function parameter handling +- spirv: More heavily use vtn_ssa_value in function parameter handling +- anv,turnip,radv,clover,glspirv: Run nir_copy_prop before nir_opt_deref +- spirv: Rework our handling of images and samplers +- spirv: Also copy over binding information for atomic counters +- nir: Take a mode in remove_unused_io_vars +- nir/dead_variables: Respect the modes passed to remove_dead_vars +- nir: Add nir_foreach_shader_in/out_variable helpers +- nir: Add a nir_foreach_function_temp_variable helper +- nir: Add a nir_foreach_uniform_variable helper +- nir: Add a nir_foreach_gl_uniform_variable helper for GL linking +- nir: Add and use a nir_variable_list_for_mode helper +- nir: Take a nir_shader and variable mode in assign_var_locations +- nir: Take a shader and variable mode in nir_assign_io_var_locations +- nir/linking: Rework some internal helpers +- st/nir: Rework fixup_varying_slots +- nir/split_vars: Add mode checks to list walks +- nir: Split nir_index_vars into two functions +- nir/lower_amul: Add a variable mode check +- nir: Use a nir_shader and mode in lower_clip_cull_distance_arrays +- nir/lower_io_to_temporaries: Use a separate list for new inputs +- nir/io_to_vector: Use nir_foreach_variable_with_modes +- nir/lower_two_sided_color: Use nir_variable_create +- nir/lower_uniforms_to_ubo: Use nir_foreach_variable_with_modes +- nir/split_per_member_structs: Use nir_variable_with_modes_safe +- nir/lower_variable_initializers: Restrict the modes we lower +- nir/gl_nir_linker: Use nir_foreach_variable_with_modes +- freedreno/ir3_lower_tess: Rework var list helpers +- lima/standalone: Rework i/o variable fixup +- freedreno/ir3_cmdline: Rework i/o variable fixup +- r600/sfn/lower_tess_io: Rework get_tcs_varying_offset +- r600/sfn/lower_tex: Get rid of the lower_sampler vector +- r600/sfn: Use nir_foreach_variable_with_modes 
in IO vectorization +- panfrost/midgard: Make search_var take a nir_shader and mode +- panfrost: Use nir_foreach_variable_with_modes in pan_compile +- aco: Use nir_foreach_variable_with_modes to walk SSBOs +- mesa/ptn: Use nir_variable_create +- gallium/ttn: Use variable create/add helpers +- nir: Use a single list for all shader variables +- nir/split_per_member_structs: Inline split_variables_in_list +- nir/gl_nir_linker: Call add_vars_with_modes once for GL_PROGRAM_INPUT +- nir: Add a find_variable_with_[driver_]location helper +- vulkan: Update Vulkan XML and headers to 1.2.149 +- anv: Implement VK_EXT_4444_formats +- nir/deref: Don't try to compare derefs containing casts +- compiler/types: Add a struct_type_is_packed wrapper +- spirv: Do more complex unwrapping in get_nir_type +- anv: Advertise shaderIntegerFunctions2 +- spirv: Don't emit RMW for vector indexing in shared or global +- clover/spirv: Don't call llvm::regularizeLlvmForSpirv +- intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage +- intel/nir: Rewrite the guts of lower_alpha_to_coverage +- intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+ +- intel/fs: Don't copy-propagate stride=0 sources into ddx/ddy +- iris: Re-emit push constants if we have a varying workgroup size +- spirv: Run repair_ssa if there are discard instructions +- nir: More NIR_MAX_VEC_COMPONENTS fixes +- intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP +- radeonsi: Only call nir_lower_var_copies at the end of the opt loop + +Jesse Natalie (10): + +- nir_lower_io: Add addr_format_is_offset helper +- nir: When nir_lower_vars_to_explicit_types is run on temps, update scratch_size +- nir: Support load/store of temps as scratch in nir_lower_explicit_io +- nir: Support vec8/vec16 in nir_lower_bit_size +- nir: Support algebraic opts on vectors larger than 4 +- nir: Support 8 and 16 component vectors for reduceable intrinsics +- nir/vtn: Add support for 8 and 16 vector ball/bany +- u_debug_stack_test: Fix 
MSVC compiling by using ATTRIBUTE_NOINLINE +- nir: More NIR_MAX_VEC_COMPONENTS fixes +- glsl_type: Add packed to structure type comparison for hash map + +JibbityJobbity (1): + +- drirc: Enable glthread for PCSX2 + +Jon Turney (1): + +- glthread: Fix use of alloca() without #include "c99_alloca.h" + +Jonathan Gray (13): + +- util: unbreak endian detection on OpenBSD +- util/anon_file: add OpenBSD shm_mkstemp() path +- meson: build with _ISOC11_SOURCE on OpenBSD +- meson: don't build with USE_ELF_TLS on OpenBSD +- meson: conditionally include -ldl in gbm pkg-config file +- util: futex fixes for OpenBSD +- util/u_thread: include pthread_np.h if found +- anv: use os_get_total_physical_memory() +- util/os_misc: add os_get_available_system_memory() +- anv: use os_get_available_system_memory() +- util/os_misc: os_get_available_system_memory() for OpenBSD +- radv: remove seccomp includes +- vulkan: make VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT conditional + +Jonathan Marek (135): + +- turnip: update "fetchsize" value to match fdl6_layout changes +- turnip: enable tiling for compressed formats +- util/format: translate 422_UNORM and 420_UNORM vulkan formats +- freedreno/registers: document 422_UNORM and 420_UNORM formats +- turnip: implement VK_KHR_sampler_ycbcr_conversion +- turnip: enable 422_UNORM formats +- freedreno: move a4xx specific layout code to a4xx code +- freedreno/a5xx: remove unused reference to gmem_alignw in layout code +- freedreno/a6xx: don't use gmem_alignw for imported buffers +- freedreno/a6xx: split up gmem/tile alignment requirements +- freedreno: reduce extra height alignment in a6xx layout +- freedreno/a6xx: use RESOLVE_TS event +- freedreno: add adreno 650 +- freedreno/layout: add explicit offset/pitch argument to fdl6_layout +- turnip: support VkImageDrmFormatModifierExplicitCreateInfoEXT +- turnip: fix RENDER_COMPONENTS value +- turnip: move HLSQ_UPDATE_CNTL write to before xs config writes +- turnip: update some properties based on blob driver 
+- turnip: clamp sampler minLod/maxLod +- freedreno/a6xx: use nonbinning VS when GS is used +- turnip: correctly emit non-binning vs in transform feedback case +- turnip: fix HW binning with geometry shader +- turnip: use common emit_xs_cntl to fill a6xx_sp_xs_ctrl_reg0 +- turnip: fix VFD_CONTROL for binning pass +- turnip: pipeline program state refactor +- turnip: share code between 3D blit/clear path and tu_pipeline +- turnip: add layered 3D path clear for CmdClearAttachments +- turnip: add emit renderpass cache flushes for sysmem 3D CmdClearAttachments +- turnip: remove some dead/redundant code +- freedreno/ir3: fix ir3_nir_move_varying_inputs +- turnip: remove duplicated stage2opcode and stage2shaderdb +- turnip: simplify stage2 helpers +- turnip: set VFD_INDEX_OFFSET in 3D clear/blit path +- turnip: fix 3D path always being used for CmdBlitImage +- turnip: fix cubic filtering with CmdBlitImage +- turnip: compute and graphics have completely separate state +- turnip: move descriptor set BO tracking to CmdBindDescriptorSets +- turnip: improve dirty bit handling a bit +- turnip: delete dead dynamic state code +- turnip: refactor draw states and dynamic states +- turnip: input attachment descriptor set rework +- turnip: use draw states for input attachments +- turnip: use u_format for packing gmem clear values +- freedreno/a6xx: FETCHSIZE is PITCHALIGN +- freedreno/fdl6: rework layout code a bit (reduce linear align to 64 bytes) +- turnip: fix a crash when rasterizerDiscardEnable is set +- turnip: fix a sample shading case +- turnip: fix renderpass gmem configs when there are too many attachments +- turnip: set the API version +- turnip: move enum translation functions to a common header +- freedreno/a6xx: VSC "STRM_ARRAY_PITCH" is "STRM_LIMIT" +- freedreno/a6xx: remove unnecessary OVERFLOW_FLAG_REG check +- turnip: remove unnecessary OVERFLOW_FLAG_REG check +- freedreno/a4xx: restore pitch to bytes change to layout code +- freedreno/a4xx: simplify setup_slices 
+- turnip: rework streamout state and add missing counter buffer read/writes +- turnip: refactor CmdDraw* functions (and a few fixes) +- turnip: enable VK_EXT_index_type_uint8 +- turnip: implement CmdDrawIndirectByteCountEXT +- turnip: fix ts_cs_memory typo +- turnip: use pipeline cs for shader programs instead of separate bo +- freedreno/registers: a6xx depth bounds test registers +- turnip: implement depthBounds +- turnip: translate CreateRenderPass to CreateRenderPass2 +- turnip: replace a memset(0) with zalloc in CreateRenderPass +- turnip: use RenderPassCreateInfo for render_pass_add_implicit_deps +- turnip: move some logic out of create_render_pass_common +- turnip: implement VK_EXT_vertex_attribute_divisor +- turnip: fix empty scissor case +- turnip: fix update_stencil_mask +- turnip: disable early_z for VK_FORMAT_S8_UINT +- freedreno/registers: add CP_DRAW_INDIRECT_MULTI +- freedreno/ir3: add support for load_draw_id +- turnip: implement VK_KHR_shader_draw_parameters +- turnip: fix VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_1_FEATURES +- turnip: fix huge scissor min/max case +- freedreno/ir3: fix resinfo wrmask +- freedreno/regs: add extra bits for UBWC array pitch +- turnip: enable largePoints +- turnip: enable depthBiasClamp +- freedreno/registers: update varying-related registers +- freedreno/a3xx: support LINEAR_PIXEL/PERSP_CENTROID/LINEAR_CENTROID sysvals +- freedreno/a4xx: fake LINEAR_PIXEL varying support for u_blitter +- freedreno/ir3: add generic get_barycentric() +- freedreno/a5xx: set missing bary sysvals +- freedreno/a6xx: set missing bary sysvals +- turnip: set missing bary sysvals +- freedreno/ir3: add support for INTERP_MODE_NOPERSPECTIVE +- turnip: make tiling config part of framebuffer state +- turnip: rework render_tiles loop +- turnip: vsc improvements +- turnip: fix tess param bo size calculation +- turnip: clear_blit: pass aspect mask to setup function +- turnip: support multi-image layouts +- turnip: enable 420_UNORM formats +- 
freedreno/layout: fix explicit layout offset not added to slice offset +- freedreno/ir3: fix/rework tess levels +- Revert "nir: Add an option for lowering TessLevelInner/Outer to vecs" +- Revert "nir: Support sysval tess levels in SPIR-V to NIR" +- freedreno/regs: document SS6_UBO state src +- turnip: use global bo for clear blit shaders +- freedreno/ir3: add support for a650 tess shared storage +- freedreno/regs: document CS shared storage size bit +- freedreno/a2xx: fix compressed textures +- freedreno: add a fd_resource_pitch helper +- freedreno/layout: layout simplifications and pitch from level 0 pitch +- turnip: fix active_desc_sets not being set for compute pipeline +- freedreno/ir3: fix setup_input for sparse vertex inputs +- freedreno/ir3: run nir_opt_loop_unroll in optimization loop +- freedreno: fix layout pitchalign field not being set for imported buffers +- freedreno/regs: update primitive output related registers +- turnip: clean up primitive output state +- turnip: drop GS clear path +- turnip: use DIRTY SDS bit to avoid making copies of pipeline load state ib +- turnip: emit compute pipeline directly in CmdBindPipeline +- turnip: fix inconsistencies with tu6_load_state_size +- turnip: remove use of tu_cs_entry for draw states +- gitlab-ci: re-enable arm64_a630_vk +- freedreno/regs: update a6xx GRAS registers +- freedreno/regs: update a6xx RB regs +- freedreno/regs: update a6xx VPC regs +- freedreno/regs: update a6xx PC regs +- turnip: disable tiling for NV12/IYUV formats +- turnip: remove extra gmem alignment +- freedreno/ir3: fix wrong local_primitive_id_start type +- turnip: move WFI out of draw state to fix a650 hangs +- turnip: use patchControlPoints for HS_INPUT_SIZE value +- turnip: fix SP_HS_UNKNOWN_A831 value for A650 +- turnip: workaround for a630 d24_unorm_s8_uint fails +- turnip: fix sysmem CmdClearAttachments 3D fallback breaking GMEM path flush +- turnip: delete tu_clear_sysmem_attachments_2d +- turnip: add support for 
D32_SFLOAT_S8_UINT +- turnip: rework extended formats to allow more extended formats +- util/format: translate A4R4G4B4_UNORM and A4B4G4R4_UNORM vulkan formats +- turnip: implement VK_EXT_4444_formats + +Jordan Justen (17): + +- intel/dev: Split .num_subslices out of GEN12_FEATURES macro +- intel/dev: Add device info for RKL +- intel/l3: Don't rely on cfg entry URB size being 0 as a sentinal +- intel/l3: Allow platforms to have no l3 configurations +- iris/l3: Enable L3 full way allocation when L3 config is NULL +- anv: Set L3 full way allocation at context init if L3 cfg is NULL +- intel/dev: Add device info for DG1 +- iris: Make use of devinfo has_aux_map field +- anv: Make use of devinfo has_aux_map field +- anv/pipeline: Split VFE/INTERFACE_DESCRIPTOR out to emit_media_cs_state +- anv/cmd_buffer: Split GPGPU_WALKER out to emit_gpgpu_walker +- iris: Split walker and state update into iris_upload_gpgpu_walker +- iris/compute: Split out iris_load_indirect_location +- intel/compiler/cs: Allow simd32 in some more cases with no8 and/or no16 +- intel/compiler/fs: Still attempt simd32 when INTEL_DEBUG=no16 is used +- iris: Add missing break in switch in modifier_is_supported +- anv, iris: Set MediaSamplerDOPClockGateEnable for gen12+ + +Jose Maria Casanova Crespo (4): + +- v3d: Fix swizzle in DXT3 and DXT5 formats +- v3d: Include supported DXT formats to enable s3tc/dxt extensions +- vc4: don't relay on intr->num_components for non-vectorized intrinsics +- nir: only uniforms with dynamically_uniform offset are dynamically_uniform + +Joshua Ashton (7): + +- anv: Remove RANGE_SIZE usage +- radv: Remove RANGE_SIZE usage +- turnip: Remove RANGE_SIZE usage +- vulkan: Update Vulkan XML and headers to 1.2.140 +- radv: Implement VK_EXT_custom_border_color +- radeonsi: Use TRUNC_COORD on samplers +- radv: Implement VK_EXT_4444_formats + +José Fonseca (3): + +- glthread: Add GLAPIENTRY to _mesa_marshal_MultiDrawArrays. +- appveyor: Upgrade pip. +- appveyor: Use Python3. 
+ +Karol Herbst (50): + +- nir/deref: copy ptr_stride when rematerializing +- nir/validate: validate the stride for deref_ptr_as_array +- Revert "nir/validate: validate the stride for deref_ptr_as_array" +- nvir/nir: use component helpers instead of insn->num_components +- st/mesa: lower images when needed +- nir/lower_images: fix for array of arrays +- nir/lower_images: handle dec and inc +- nv50/ir/nir: move away from image_deref intrinsics +- nv50/ir/nir: handle image atomic inc and dec +- nv50/ir/nir: remove image uniform hack +- gv100/ir: fix atom cas +- gv100/ir: fix shift lowering +- gv100/ir: fix OP_TXG for shadow textures +- nv50/ir/nir: add workaround for double vertex attribs +- nv50/ir/print: add missing VIEWPORT_MASK handling +- nv50/ir/nir: fix ext_demote_to_helper_invocation +- nv50/ir/nir: fix nv_viewport_array2 +- nvc0: enable spirv caps with nir +- nv50/ir/nir: don't emit a restart with set a stream_id +- nv50/ir/nir: handle clip vertex for tess eval shaders +- nv50/ir/nir: rework input output handling +- nv50/ir/nir: rework CFG handling +- nv50/ir/ra: convert some for loops to Range-based for loops +- nv50/ir/ra: fix memory corruption when spilling +- nv50/ir/nir: fix interpolation on explicit operations +- gv100/ir: implement sample shading +- gv100/ir: fix coherent and volatile memory access +- nv50/ir/nir: fix cache mode conversion +- nv50/ir: fix memset on non trivial types warning +- nv50/ir/tgsi: move call to tgsi_scan_shader inside Source constructor +- nvc0: set local mem size for compute on gv100 +- nvc0: set sampler index mode to independently on gv100 compute +- gv100/ir: set ftz bit on floating point operations +- ci: bump libdrm to 2.4.102 +- nouveau: enable HMM +- gallium: add PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY +- nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY +- nouveau: expose HMM +- ci: need to install wget in order to download libdrm +- ci: bump libdrm to 2.4.102 +- nouveau: enable HMM +- gallium: add 
PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY +- nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY +- nouveau: expose HMM +- st/mesa: fix st_CopyPixels without support for stencil exports +- nv50/ir/tgsi: silence warning about unhandled GS_INPUT_PRIM property +- nv50/ir: initialize persampleInvocation to false +- nir/lower_io: assert that offsets are used for shader_in +- nv50/ir/nir: fix global_atomic_comp_swap +- spirv: extract switch parsing into its own function + +Kenneth Graunke (20): + +- iris: Include linux/sync_file.h instead of cut and pasting contents +- anv: Include linux/sync_file.h instead of cut and pasting contents +- iris: Rename iris_syncpt to iris_syncobj for clarity. +- iris: Give up on not passing ice to iris_init_batch +- iris: Destroy transfer slab after batches +- iris: Flush any current work in iris_fence_await before adding deps +- intel: Move anv_gem_supports_syncobj_wait to common code. +- iris: Detect DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT kernel support +- iris: Implement PIPE_FLUSH_DEFERRED support. +- intel: Delete hardcoded devinfo->urb.size values for Gen7+ (sans DG1). +- iris: Delete useless #define +- intel/eu: Add a brw_urb_desc helper +- CI: Disable Panfrost Mali-T820, Lima Mali-400 and Lima Mali-450 jobs +- intel: Disable loading drivers on DG1 devices for now +- nir: Fix divergence analysis for tessellation input/outputs +- iris: Implement pipe->texture_subdata directly +- iris: Fix CCS check in iris_texture_subdata(). +- iris: Delete shader variants when deleting the API-facing shader +- iris: Reorder the loops in iris_fence_await() for clarity. 
+- iris: Drop stale syncobj references in fence_server_sync + +Kristian Høgsberg (73): + +- freedreno/ir3: Pass stream output info to ir3_shader_from_nir +- freedreno/ir3: Rename ir3_nir_lower_to_explicit_io +- freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass +- freedreno/ir3: Lower GS builtins before lowering IO +- freedreno/ir3: Drop hack to clean up split vars +- freedreno/fdl: Align after dividing by block size +- freedreno/a6xx: Set tfetch correctly for compressed formats +- freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics +- freedreno/a6xx: Create shader dependent streamout state at compile time +- freedreno/a6xx: Map inputs to VFD entries up front +- freedreno/a6xx: Allocate ringbuffer based on VFD count +- freedreno/a6xx: Emit VFD setup as array writes +- freedreno/a6xx: Avoid stalling for occlusion queries +- freedreno: Use the right amount of &'s +- freedreno: Use explicit *_NONE enum for undefined formats +- turnip: Use hw enum when emitting A6XX_RB_STENCIL_CONTROL +- turnip: Use tu6_reduction_mode() to avoid warning +- turnip: Use {} initializer to silence warning +- freedreno/ir3: Avoid {0} initializer for struct reginfo +- src/util: Remove out-of-range comparison +- mapi: Fix a couple of warning in generated code +- mesa/st: Use memset to zero out struct +- egl/android: Move get_format under HAVE_DRM_GRALLOC guard where it's used +- egl/android: Drop unused variable +- freedreno/a6xx: Move per element offset to VFD_DECODE +- freedreno/a6xx: Decouple VFD_FETCH and VFD_DECODE +- freedreno/a6xx: Create stateobj for VFD_DECODE +- freedreno/a6xx: Program VFD_DEST_CNTL from program stateobj +- freedreno/a6xx: Turn on robustness extensions +- docs/features.txt: Update for freedreno +- freedreno/a6xx: Fix VFD_CONTROL emit +- freedreno/a6xx: Don't write REG_A6XX_RB_SRGB_CNTL in restore +- freedreno/a6xx: Set index buffer size to bo size +- freedreno: Handle DRM_FORMAT_MOD_INVALID in shared code +- turnip: Put 
VK_KHR_external_fence_fd stubs back +- freedreno/a6xx: Don't blit with R2D_RAW +- freedreno/a6xx: Move fd6_ifmt into fd6_blitter.c +- freedreno/a6xx: Split out src and dst setup helpers for blit +- freedreno/a6xx: Don't set unknown bit when tiling differs +- freedreno/a6xx: Set src and dst rects outside blit loop +- freedreno/a6xx: Program SP_2D_SRC_FORMAT outside blit loop +- freedreno/a6xx: Consolidate computing blit_cntl +- freedreno/a6xx: Don't emit src state when clearing +- freedreno/a6xx: Separate stencil sysmem clear fix +- freedreno/a6xx: Enable FMT6_10_10_10_2_UNORM blitting +- freedreno/a6xx: Make blit_control helper a little more helpful +- freedreno/a6xx: Program A6XX_SP_2D_SRC_FORMAT_COLOR_FORMAT based on dst format +- freedreno/a6xx: Move REG_A6XX_SP_2D_SRC_FORMAT programming to helper +- freedreno/a6xx: Move CP_SET_MARKER to setup helper +- freedreno/a6xx: Program RB_UNKNOWN_8C01 in setup helper +- freedreno/a6xx: Don't take pipe_blit_info in emit_blit_dst +- freedreno/a6xx: Split clear and blit texture into different functions +- freedreno/registers: Rename SP_2D_SRC_FORMAT +- turnip: Move device enumeration and feature discovery to tu_drm.c +- turnip: Move tu_bo functions to tu_drm.c +- turnip: Collapse some tu_drm wrappers +- turnip: Move remaining drm code to tu_drm.c +- turnip: Only include msm_drm in tu_drm.c +- egl/android: Remove unused variable +- mapi/test: Change type to unsigned for offset +- gallium: Switch u_debug_stack/symbol.c to util/hash_table.h +- util: Move stack debug functions to src/util +- util: Add unit test for stack backtrace caputure +- gallium/android: Rewrite backtrace helper for android +- ci: Include enough Android headers to let us compile test EGL +- mapi: Mark TLS symbols as optional in glapi-symbols.txt +- turnip: Make tu_android.c compile again +- meson: Define ANDROID and ANDROID_API_LEVEL when compiling for Android +- anv: Pass device to setup_gralloc0_usage for error reporting +- anv: Add stub for 
anv_gem_get_tiling() for Android +- vulkan: Allow global symbol HMI for Android +- radv/android: Remove unused variable +- ci: Add a build test for the Android platform + +Krzysztof Raszkowski (1): + +- gallium/swr: Fix building swr with MSVC + +Laura Ekstrand (3): + +- docs: include meson in the toctree +- docs: Remove version. +- docs: Add the favicon to the new page. + +Leo Liu (3): + +- radeon/vcn: reset the decode flags from message buffer +- radeon/vcn: add Sienna to use internal register offset +- radeon/vcn/dec: add db_aligned_height to message buffer + +Lepton Wu (3): + +- mapi: x86: Fix dynamic entries in x86 tsd stubs. +- mapi: Return NULL function pointers for GL_EXT_debug_marker +- egl: Allow software rendering for vgem/virtio_gpu in platform_device + +Lionel Landwerlin (60): + +- drm-shim: move handle lock to shim_fd +- drm-shim: don't create a memfd per BO +- drm-shim: silence warnings +- intel/dev: print out error when platform is not found by name +- intel: add stub_gpu tool +- ci: Add intel to shaderdb runs +- iris: don't assert on unfinished aux import in copy paths +- anv: don't expose VK_INTEL_performance_query without kernel support +- anv: fix alignments for uniform buffers +- genxml: run sorting script +- genxml: fix invalid end value for video fields +- genxml: factor out utility functions +- genxml: pack: deal with default field not being simple integers +- intel/genxml: fix bits generation for MI_LOAD_REGISTER_IMM +- intel/mi-builder: add framework for self modifying batches +- anv: don't reserve a particular register for draw count +- anv: add a new execution mode for secondary command buffers +- intel/genxml: add PIPE_CONTROL command cache invalidate bit +- intel/perf: make pipeline statistic query loading optional +- intel/perf: store the appropriate OA formats in queries +- intel/perf: update generated code to ralloc all data +- intel/perf: create a unique list of counters +- intel/perf: compute number of passes for a set of counters 
+- intel/perf: emit counter units in generated code +- intel/perf: add helper to compute metrics from counters +- intel/perf: add counter category to generated code +- intel/perf: report whether the platform supported +- anv: use a query filled by the perf code +- intel/perf: reuse offset specified in the query +- anv: Implement VK_KHR_performance_query +- intel/perf: repurpose INTEL_DEBUG=no-oaconfig +- anv: fixup unwinding of device create failure +- blorp: rename workaround address function +- anv: store the workaround address +- iris: store workaround address +- i965: store workaround_bo offset +- intel: add identifier for debug purposes +- iris: add identifier BO +- i965: add identifier BO +- anv: add identifier BO +- intel/aub_error_decoder: print driver identifier if found +- iris: fix BO destruction in error path +- i965: don't forget to set screen on duped image +- iris: fix export of GEM handles +- i965: fix export of GEM handles +- anv: add an option to disable secondary command buffer calls +- anv: garbage collect timeline semaphore when querying value +- iris: fix fallback to swrast driver +- anv: fix uninitialized variable access +- anv: properly handle fence import of sync_fd = -1 +- anv: fix descriptor set free +- anv: fix incorrect realloc failure handling +- anv: centralize vk to gen arrays +- anv: fix up dynamic clip emission +- anv: don't fail userspace relocation with perf queries +- anv: fix transform feedback surface size +- anv: VK_INTEL_performance_query interaction with VK_EXT_private_data +- intel/perf: store query symbol name +- intel/perf: fix raw query kernel metric selection +- intel/compiler: fixup Gen12 workaround for array sizes + +Liviu Prodea (1): + +- util: Make process_test path compatible with mingw native toolchains + +Louis-Francis Ratté-Boulianne (1): + +- nir: Always create UBO variable when lowering uniforms to ubo + +Lucas Stach (3): + +- etnaviv: generalize FE stall before loading shader and sampler states +- etnaviv: 
retarget transfer to render resource when necessary +- etnaviv: don't expose timer queries + +Luigi Santivetti (3): + +- dri2: dri2_make_current() fold multiple if blocks +- dri2: do not conflate unbind and bindContext() failure +- egl/dri2: try to bind old context if bindContext failed + +Marcin Ślusarz (24): + +- i965: remove unused variable +- glsl_to_tgsi: add fallthrough comments +- glsl: cleanup vertex shader input checks +- iris: remove unused iris_bo->swizzle_mode +- intel/compiler: fix Android build +- st/mesa: fix reporting of float perf counters max value +- iris: return max counter value for AMD_performance_monitor +- iris: remove iris_monitor_config +- intel/perf: move query_mask and location out of gen_perf_query_counter +- iris: propagate error from gen_perf_begin_query to glBeginPerfQueryINTEL +- i965: propagate error from gen_perf_begin_query to glBeginPerfQueryINTEL +- util: fix possible fd leaks in os_socket_listen_abstract +- glsl: catch out of bounds access in the debug version +- util: fix possible buffer overflow in util_get_process_exec_path +- util/format: initialize non-important components to 0 +- mesa: fix out of bounds access in glGetFramebufferParameterivEXT +- mesa: quiet down static analyzers +- iris: quiet down static analyzers +- intel/vec4: fix out of bounds read +- intel/perf: fix performance counters availability after glFinish +- anv: refresh cached current batch bo after emitting some commands +- anv: fix minor gen_ioctl(I915_PERF_IOCTL_CONFIG) error handling issue +- intel/perf: split load_oa_metrics +- intel/perf: export performance counters sorted by [group|set] and name + +Marek Olšák (226): + +- mesa: optimize glPush/PopClientAttrib by removing malloc overhead +- mesa: don't call _mesa_update_state for _mesa_get_clamp_fragment_color +- mesa: don't set unnecessary program flags in _mesa_update_state +- mesa: don't update shaders on fixed-func state changes if user shaders are bound +- mesa,st/mesa: add a fast path for 
non-static VAOs +- mesa: inline vbo_context inside gl_context to remove vbo_context dereferences +- mesa: add glInternalBufferSubDataCopyMESA for glthread +- mesa: add _mesa_InternalBind{ElementBuffer,VertexBuffers} for glthread +- glthread: do glBufferSubData as unsynchronized upload + GPU copy +- glthread: don't use atomics for refcounting to decrease overhead on AMD Zen +- glthread: track pointers and strides for Pointer & EXT_dsa attrib functions +- glthread: track instance divisor changes +- glthread: track primitive restart state +- glthread: initialize VAOs properly +- glthread: handle POS vs GENERIC0 aliasing +- glthread: handle gl{Push,Pop}ClientAttrib{DefaultEXT} for glthread states +- glthread: upload non-VBO vertices and indices for non-Indirect non-IBM draws +- tgsi_to_nir: handle TGSI_SEMANTIC_BLOCK_SIZE +- tgsi_to_nir: handle TGSI_OPCODE_BARRIER +- radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size +- radeonsi: clean up and deduplicate code around internal compute dispatches +- radeonsi: bind shader images after DCC is disabled for image stores +- radeonsi: add SI_IMAGE_ACCESS_DCC_OFF to ignore DCC for shader images +- radeonsi: implement and use compute-based DCC decompression on gfx9-10 +- radeonsi: add a workaround to fix KHR-GL45.texture_view.view_classes on gfx9 +- radeonsi: fix si_compute_clear_render_target with render condition enabled +- radeonsi: revert an accidental change in si_clear_buffer +- Revert "ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it's always set" +- Revert "ac: reassociate FP expressions for inexact instructions for radeonsi" +- ac/surface: fix MSAA crash with FORCE_SWIZZLE_MODE on gfx9 +- radeonsi: don't wait for idle at the end of gfx IBs +- ac/surface: unset RADEON_SURF_TC_COMPATIBLE_HTILE if HTILE hasn't been computed +- radeonsi/gfx9: always use IMG_DATA_FORMAT_S8_32 for 8-bit stencil +- radeonsi: allow tc_compatible_htile to be mutable +- radeonsi: enable TC-compatible HTILE on 
demand for best Z/S performance +- tgsi_to_nir: translate non-vec4 image stores correctly +- radeonsi: fix compilation of monolithic PS +- amd: update amdgpu_drm.h +- amd: remove duplicated definitions from amdgpu_drm.h +- amd: assume CMASK is always rb/pipe_aligned, remove ac_surface.u.gfx9.cmask +- amd: assume HTILE is always rb/pipe_aligned, remove ac_surface.u.gfx9.htile +- ac/surface,radeonsi: move the set/get_bo_metadata code to ac_surface.c +- ac/surface,radeonsi: move the set/get_umd_metadata code into ac_surface.c +- amd: unify code for overriding offset and stride for imported buffers +- ac/surface: override all offsets including metadata offsets +- ac/surface: fix broken pitch override on gfx8 +- gallium: rename 'state tracker' to 'frontend' +- gallium: change comments to remove 'state tracker' +- gallium: rename PIPE_RESOURCE_FLAG_ST_PRIV to FRONTEND_PRIV +- gallium: remove more "state tracker" occurences +- radeonsi: also enable tgsi_to_nir caching for compute shaders +- glthread: stop using GLenum16 to get correct GL errors for out-of-bounds enums +- radeonsi: don't expose 16xAA on chips with 1 RB due to an occlusion query issue +- ac/nir: honor ACCESS_STREAM_CACHE_POLICY for L1 and L0 caches too +- radeonsi: use correct clear value size for EQAA in expand_fmask +- radeonsi: optimize access pattern for compute blits with linear textures +- radeonsi: tweak clear/copy_buffer limits when to use compute +- radeonsi: simplify setting resource usage for si_init_temp_resource_from_box +- radeonsi: rename SI_RESOURCE_FLAG_TRANSFER to FORCE_LINEAR +- radeonsi: use vi_dcc_enabled instead of using tex->surface.dcc_offset directly +- radeonsi: use display_dcc_offset for setting displayable_dcc_cb_mask +- winsys/amdgpu: add RADEON_FLAG_UNCACHED for faster blits over PCIe +- radeonsi: disable the L2 cache for most CPU mappings of textures +- radeonsi: disable the L2 cache for CPU read mappings of buffers +- radeonsi: compute perf tests - don't test 1 wave/SA limit, 
test no limit first +- radeonsi: test uncached clear/copy buffer performance with compute shaders +- gallium/u_threaded: execute transfer_unmap with THREAD_SAFE directly +- ac/gpu_info: compute the best safe IB alignment +- ac/surface: don't compute single-sample CMASK if it's unaligned +- radeonsi: don't use INDIRECT_BUFFER within IBs +- radeonsi: decrease the max GS invocation count to 32 +- Revert "radeonsi: don't wait for idle at the end of gfx IBs" +- ac: update register and packet definitions for preemption +- radeonsi: move resetting tracked registers into a new function +- radeonsi: split si_all_descriptors_begin_new_cs and rename functions +- radeonsi: don't enable TC-compatible HTILE for stencil if stencil doesn't use it +- radeonsi/gfx8: enable TC-compatible HTILE from the beginning as before +- radeonsi: don't hardcode most perf counter block counts +- ac/gpu_info: replace num_good_cu_per_sh with min/max_good_cu_per_sa +- amd: replace SH -> SA (shader array) in comments +- radeonsi/gfx10: implement most performance counters +- glthread: don't upload for glDraw inside a display list and always sync +- nir: add i2imp and u2ump opcodes for conversions to mediump +- nir: add int16 and uint16 type helpers +- nir: lower int16 and uint16 in nir_lower_mediump_outputs +- nir: fix lower_wpos for 16-bit fddy +- nir: add options::vectorize_vec2_16bit to limit vectorization to vec2 16 +- glsl: treat lowp as mediump when lowering builtins +- glsl: handle int16 and uint16 types and add instructions for mediump +- glsl: lower mediump integer types to int16 and uint16 +- glsl: lower mediump partial derivatives +- glsl: lower the precision of imageLoad +- glsl: lower samplers with highp coordinates correctly +- gallium: add shader caps INT16 and FP16_DERIVATIVES +- ac: rename has_double_rate_fp16 -> has_packed_math_16bit +- ac/nir: use more types from ac_llvm_context +- ac/nir: support vector types in the type suffix of overloaded intrinsics +- ac/nir: remove type and 
num_channels args from ac_build_buffer_store_common +- ac/nir: support 16-bit data in buffer_load_format opcodes +- ac/nir: support 16-bit data in image opcodes +- ac/nir: handle nir_op_[fiu]2[fiu]mp opcodes +- ac/nir: select v_cvt_pkrtz for all conversions from f32 to f16 for radeonsi +- ac/nir: set the second v_cvt_pkrtz argument to undef if it's unused +- ac/nir: support v2f16 derivatives +- nir: don't count samplers and images in interface blocks +- nir: gather which images are buffers +- nir: gather which images are MSAA +- radeonsi: remove unused leftover code for INDIRECT_BUFFER inside IBs +- radeonsi: remove const_buffers_declared hacks +- radeonsi: pass at most 3 images and/or shader buffers via user SGPRs for compute +- radeonsi: add a hack to disable TRUNC_COORD for shadow samplers +- gallium/u_vbuf: get rid of some pointer dereferences +- gallium/u_vbuf: add a faster path for uploading non-interleaved attribs +- glthread: sync in glFlush for multiple contexts +- radeonsi: enable ARB_sparse_buffer +- ac,radeonsi: replace == GFX10 with >= GFX10 where it's needed +- ac,radeonsi: start adding support for gfx10.3 +- ac/surface: add displayable DCC code for gfx10.3 +- radeonsi: honor a user-specified pitch on gfx10.3 +- radeonsi: enable larger SDMA clears and copies on gfx10.3 +- radeonsi: implement R9G9B9E5 render target and image store support on gfx10.3 +- radeonsi: move L2_CACHE_CONTROL registers into si_emit_framebuffer_state +- radeonsi: set BIG_PAGE fields on gfx10.3 +- radeonsi: don't set any XNACK options on gfx10.3 +- ac: align num_vgprs for gfx10.3 +- radeonsi: add support for Sienna Cichlid +- radeonsi: require LLVM 11 for gfx10.3 +- ac/surface: don't recompute the DCC retile map for imported textures +- amd/addrlib: don't recompute DCC info for every ComputeDccAddrFromCoord call +- amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT +- ac/surface: add a wrapper structure to hold ADDR_HANDLE +- ac/surface: cache DCC retile 
maps (v2) +- amd/addrlib: fix the C++ one definition rule violation +- ac/surface: don't set is_displayable if displayable DCC is missing +- ac/surface: require that gfx8 doesn't have DCC in order to be displayable +- ac/surface: enable DCC for the first level in the mip tail on gfx10 +- ac/surface: don't free dcc_retile_map on failure +- radeonsi: compact MRTs to save PS export memory space +- ac/nir: fix 64-bit division for GL CTS +- glapi: fix incorrect param names in ARB_vertex_attrib_binding functions +- glthread: rename non_vbo_attrib_mask -> user_buffer_mask, attribs -> buffers +- glthread: handle ARB_vertex_attrib_binding +- radeonsi: don't wait for idle at the end of gfx IBs +- radeonsi: replace ctx->screen with sscreen in si_flush_gfx_cs +- glsl,driconf: add allow_glsl_120_subset_in_110 for SPECviewperf13 +- driconf: add workarounds for SPECviewperf13 +- amd: add proper definitions for NOP packets +- ac,winsys/amdgpu: align IBs the same as the kernel +- radeonsi: don't add the border color buffer into the init_config state +- radeonsi: rename init_config states to cs_preamble states +- radeonsi: don't add the tess ring buffers into the cs_preamble state +- radeonsi: make wait_mem_scratch unmappable +- radeonsi: disallow adding BOs into si_pm4_state except 1 shader BO per state +- radeonsi: make si_pm4_cmd_begin/end static and simplify all usages +- radeonsi: clear per-context buffers at the end of si_create_context +- radeonsi: remove tabs +- radeonsi: don't flush in fence_server_sync +- ac/gpu_info: fix num_physical_sgprs_per_simd for gfx10 +- radeonsi: fix NGG culling for Wave64 +- radeonsi: always use Wave32 for GS fast launch, because Wave64 hangs +- radeonsi: always use Wave64 for HS/GS/VS shader stages (except GS fast launch) +- radeonsi: don't try to enable NGG culling for GS +- radeonsi: add a debug option to enable NGG culling for tessellation +- glsl: make print_type non-static for debugging +- glsl: print precision qualifiers in IR dumps +- 
glsl: print constant initializers +- glsl: fix the type of ir_constant_data::u16 +- glsl: fix evaluating float16 constant expression matrices +- glsl: run validate_ir_tree if GLSL_VALIDATE=1 regardless of the build config +- glsl: validate more stuff +- glsl: convert reusable lower_precision util code into helper functions +- glsl: remove the return type from lower_precision +- glsl: cleanups in lower_precision +- glsl: flatten a tautological conditional in lower_precision +- glsl: don't lower precision of textureSize +- glsl: don't lower builtins to mediump that don't allow it +- glsl: lower builtins to mediump that ignore precision of certain parameters +- glsl: lower builtins to mediump that always return mediump or lowp +- glsl: add capability to lower mediump array types +- glsl: lower mediump temporaries to 16 bits except structures (v2) +- gallium: add PIPE_SHADER_CAP_GLSL_16BIT_TEMPS for LowerPrecisionTemporaries +- Revert "ac/surface: require that gfx8 doesn't have DCC in order to be displayable" +- glsl: don't validate array types in ir_dereference_variable +- radeonsi: prevent a gfx10_ngg_calculate_subgroup_info failure for TES+NGG GS +- radeonsi: add missing initialization of registers +- radeonsi/gfx10: set the correct value for OFFCHIP_BUFFERING +- radeonsi: sort registers in si_emit_initial_compute_regs according to GPU gen +- radeonsi: sort registers in si_init_cs_preamble_state according to GPU gen +- ac: add helper ac_get_register_name +- ac: add tables for CP register shadowing +- winsys/amdgpu: make amdgpu_bo_unmap non-static +- radeonsi: make cs_preamble_state optional +- radeonsi: reorder code in update_gs_ring_buffers and init_tess_factor_ring +- radeonsi: implement CP register shadowing +- radeonsi: add reg shadowing codepaths to GS and tess ring setup +- radeonsi: add debug code for register shadowing +- radeonsi: don't restore states at the beginning of IBs if they're shadowed +- radeonsi: set up IBs for preemption +- radeonsi: enable 
preemption if the kernel enabled it +- amd: rename SIENNA -> SIENNA_CICHLID +- amd: add support for Navy Flounder +- amd: enable displayable DCC for everything newer than Navi1x +- radeonsi: disable SDMA on gfx9 +- radeonsi: reorder NIR optimizations +- radeonsi: call nir_split_array_vars/shrink_vec_array_vars/opt_find_array_copies +- glsl: lower_precision - fix assertion failure with dereferences of constants +- glsl: fix constant expression evaluation for 16-bit types +- glsl: don't lower atomic functions to mediump +- glsl: don't create conversion opcodes for array types +- glsl: don't lower to mediump for desktop OpenGL +- glsl: improve precision determination for calls +- Revert "radeonsi: honor a user-specified pitch on gfx10.3" +- radeonsi: use correct wave size in gfx10_ngg_calculate_subgroup_info +- radeonsi: use the same units for esgs_ring_size and ngg_emit_size +- radeonsi: increase minimum NGG vertex count requirement per workgroup on gfx 10.3 +- radeonsi: fix applying the NGG minimum vertex count requirement +- radeonsi: don't count unusable vertices to the NGG LDS size +- radeonsi: add a common function for getting the size of gs_ngg_scratch +- radeonsi: remove the NGG hack decreasing LDS usage to deal with overflows +- radeonsi: various fixes for gfx10.3 +- radeonsi: disable NGG culling on gfx10.3 because of hangs +- st/mesa: don't generate NIR for ARB_vp/fp if NIR is not preferred +- radeonsi: fix tess levels coming as scalar arrays from SPIR-V +- gallivm: fix build on LLVM 12 due to LLVMAddConstantPropagationPass removal +- ac/llvm: fix unaligned VS input loads on gfx10.3 +- Revert "ac: generate FMA for inexact instructions for radeonsi" + +Marek Vasut (3): + +- etnaviv: Disable seamless cube map on GC880 +- etnaviv: Remove etna_resource_get_status() +- etnaviv: Add lock around pending_ctx + +Mario Kleiner (1): + +- vulkan/wsi: Really terminate DRM lease in wsi_release_display(). 
+ +Mathias Fröhlich (2): + +- st/mesa: Move _NEW_FRAG_CLAMP to NewFragClamp driver flag. +- mesa: set _NEW_FRAG_CLAMP only when needed + +Matt Turner (22): + +- intel/compiler: Drop opt_sampler_eot() +- intel/tools: Remove unnecessary reg number checking +- intel/tools: Drop srctype from ipreg +- intel/tools: Require explicit regions/types for special regs +- intel/tools: Disallow control subregisters > 3 +- intel/tools: Add assembler tests for the cr0 register +- intel/compiler: Add assert that set bits are within mask +- intel/compiler: Don't emit no-op cr0 changes +- intel/tools: Fix typos +- intel/tools: Remove stray newline +- intel/tools: Don't allow empty type specifier +- intel/tools: Simplify register type handling +- intel/tools: Make swizzle an integer +- intel/tools: Make writemask an integer +- intel/tools: Simplify immediate handling +- intel/tools: Simplify dstregion +- intel/compiler: Relax SENDS regioning assertions +- intel/tools: Pass integers, not enums, to stride() +- intel/tools: Manually set ARF register file/nr/subnr +- intel/tools: Don't hardcode notification register +- intel/tools: Simplify notification register handling +- intel/tools: Test notification subregisters + +Mauro Rossi (17): + +- android: iris: add iris_seqno.{c,h} to Makefile.sources +- freedreno/drm: android: add libfreedreno_registers static dependency +- freedreno: android: add adreno-pm4-pack.xml.h generation to android build +- android: util: fix build for GL4.1 support +- android: svga: fix build for GL4.1 support +- android: aco: add aco_ir.cpp to Makefile.sources +- android: nvir/gv100: update sources in Makefile.sources +- android: freedreno: add fd5_layout.c to Makefile.sources +- android: freedreno/ir3: add missing generated sources and rules +- android: freedreno/ir3: simplify generated sources rules +- android: panfrost/encoder: add libmesa_nir static dependency +- radv: fix build on Android 7 (v2) +- android: freedreno/registers: fix generated headers rules +- 
android: freedreno/ir3: fix include paths +- android: freedreno/common: add support for libfreedreno_common static +- android: freedreno: move a2xx disasm out of gallium +- android: freedreno/common: add libmesa_git_sha1 static dependency + +Michel Dänzer (38): + +- gitlab-ci: Use YAML anchor for llvmpipe paths in virgl rules +- gitlab-ci: Update to current templates +- gitlab-ci: Move down container_pre_build.sh invocation in x86_build.sh +- gitlab-ci: Add Debian testing repository for x86_build image +- gitlab-ci: Install WINE from Debian testing +- gitlab-ci: Move lib{drm,pciaccess}-dev cross packages out of loop +- gitlab-ci: Install g++-mingw-w64-x86-64-win32 instead of mingw-w64 +- Revert "ac,radeonsi: fix compilations issues with LLVM 11" +- Revert "gallium/gallivm: fix compilation issues with llvm 11" +- gitlab-ci: Enable -Werror in `meson-s390x` job +- gitlab-ci: Also list arm/x86_build in needs: of test jobs +- gitlab-ci: x86_test-base image as common base for x86_test-gl/vk +- gitlab-ci: Pull in GCC 9 from Debian testing in x86_test-gl/vk images +- gitlab-ci: Move LLVM/clang 6/7 packages to the x86_build_old image +- gitlab-ci: Use Debian 10 wine-development packages +- gitlab-ci: Stop using packages from Debian testing +- gitlab-ci: Move meson back to x86_test-gl/vk ephemeral packages lists +- gitlab-ci: Add x86_build-base docker image +- gitlab-ci: Use separate docker images for cross builds +- loader/dri3: Add dri3_wait_for_event_locked full_sequence out parameter +- loader/dri3: Use dri3_wait_for_event_locked in loader_dri3_wait_for_msc +- loader/dri3: Check for window destruction in dri3_wait_for_event_locked +- gitlab-ci: Automatically run pipelines for Marge Bot pre-merge only +- gitlab-ci: Use rules: instead of except:/only: for test-docs job +- gitlab-ci: Extend .ci-run-policy template for docs jobs +- gitlab-ci: Do not create the "success" job when the test-docs job exists +- ci: Use "when: always" for pages job +- ci: Move deploy stage between 
container & build stages +- Revert "loader/dri3: Check for window destruction in dri3_wait_for_event_locked" +- gitlab-ci: Remove indirect dependencies from needs: +- gitlab-ci: Drop dependencies: +- Revert https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4580 +- gitlab-ci: Fix "triggered by Marge for a merge request" rule +- gitlab-ci: Only trigger test-docs job automatically for MRs +- ci: Use FDO_CI_CONCURRENT in run-shader-db.sh as well +- ci: Do not mark container / pages jobs as interruptible +- ci: Use half as many parallel softpipe / virgl test jobs +- ci: Use ignore_scheduled_pipelines anchor in .radeonsi-rules + +Michel Zou (1): + +- swr: fix build with mingw + +Mike Blumenkrantz (73): + +- zink: explicitly zero some arrays in ntv +- zink: add SpvId returns to a couple ntv functions +- zink: flush active queries on destroy and free query object +- zink: fix vkCmdResetQueryPool usage +- zink: reset query on-demand when beginning a new query from resume +- zink: always use logical eq ops in ntv with 1bit inputs +- zink: track program usages for each shader +- zink: emit interpolation decorations for ntv outputs +- zink: handle more glsl->spirv builtin translation +- zink: rework input/output location emission +- zink: use '2' variants for device props/feats, check features for ext enabling +- zink: add spirv builder util functions for emitting xfb decorations +- zink: add spirv_builder methods for OpVectorExtractDynamic and OpVectorInsertDynamic +- zink: implement streamout and xfb handling in ntv +- zink: implement transform feedback support to finish off opengl 3.0 +- zink: set PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED and remove POS special casing +- zink: switch to passing VkPhysicalDeviceFeatures2 in VkDeviceCreateInfo +- zink: enable xfb extension in screen creation +- zink: use int assignment for vk int type +- zink: use correct define value for reserved slot count in ntv +- zink: clamp VkImageCreateInfo.arrayLayers to 1 for image resource 
creation +- zink: unify code for setting resource barriers +- zink: handle signed and unsigned min/max ops in ntv +- zink: add ult handling for ntv +- zink: add bitfield_reverse handling to ntv +- zink: lower byte/word extract ops in nir +- zink: handle ixor in ntv +- zink: handle isign alu in ntv +- zink: set lower_mul_high and lower_rotate in ntv compiler options +- zink: use OpFUnordNotEqual for nir_op_fne +- zink: set lower_uadd_carry in nir options +- zink: implement VK_EXT_index_type_uint8 +- nir: add lowering pass for clip plane enabling +- st/program: use nir_lower_clip_disable instead of nir_lower_clip_vs conditionally +- nir: add lowering pass for fragcolor -> fragdata +- zink: translate gl_FragColor to gl_FragData before ntv to fix multi-rt output +- u_prim_restart: handle user buffers in util_translate_prim_restart_ib() +- nir: allow nir_lower_point_size_mov to run in geometry shader +- nir: allow nir_lower_clip_halfz to run in geometry shaders +- zink: rework query handling +- zink: use #define for number of queries per-pool +- zink: only stall during query destroy for xfb queries +- zink: properly handle query pool overflows +- zink: only reset query pool on query end if current batch isn't in renderpass +- zink: use right vulkan type for GL_PRIMITIVES_GENERATED queries +- zink: handle ntv case of nested loop instructions more permissively +- zink: add lengthy comment and remove assert from discard_if ntv pass +- zink: use type of src[0] for ntv store and load ops +- zink: try copy_region hook for blits where we can't do a regular blit or resolve +- zink: block vkCmdBlitImage usage for multi sampled blits +- zink: block resolve blits for depth/stencil buffers +- zink: handle empty attachments +- zink: try to handle multisampled null buffers +- zink: enable tgsi texcoord pipe cap +- zink: destroy gfx program when a shader is freed +- zink: destroy descriptor pools on context destroy +- zink: free pipeline cache during program destroy +- zink: free all 
ntv allocations after creating shader module +- zink: use helper function to handle uvec/bvec types +- zink: handle texelFetchOffset with offsets +- zink: add some asserts for building access chains in ntv +- zink: omit Lod image operand in ntv when not using an image texture dim +- nir: allow lower_psiz_mov to run in tessellation stages +- nir: allow nir_lower_clip_halfz to run in tess eval shader +- u_prim_restart: handle indirect draws +- zink: add extension loading framework for spirv builder +- zink: implement VK_EXT_robustness2 +- zink: clamp PIPE_SHADER_CAP_MAX_SHADER_BUFFERS to PIPE_MAX_SHADER_BUFFERS +- zink: handle VK_EXT_vertex_attribute_divisor setup +- zink: store valid timestamp bits onto zink_screen +- zink: implement handling for VK_EXT_calibrated_timestamps +- u_prim_restart: add inline function for getting restart index based on index size +- zink: reorder create_stream_output_target to fix failure case leak + +Miklós Máté (1): + +- docs: add some missing stuff to sourcetree.rst + +Nanley Chery (18): + +- iris: Drop can_fast_clear_color's format parameter +- iris: Remove the CCS_D fallback +- iris: Avoid fast-clear with incompatible view +- iris: Disable sRGB fast-clears for non-0/1 values +- intel: Add ISL_AUX_USAGE_GEN12_CCS_E +- iris: Don't support sRGB + Y_TILED_CCS on gen9 +- iris: Use ISL_AUX_USAGE_GEN12_CCS_E on gen12 +- isl/drm: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS +- gallium/dri2: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS +- iris: Handle importing aux-enabled surfaces on TGL +- iris: Refactor modifier_is_supported for gen12 +- iris: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS +- iris: Zero the add-on clear color BO on import +- dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_B8G8R8X8_UNORM +- iris: Don't call SET_TILING for dmabuf imports +- gallium/dri2: Report correct YUYV and UYVY plane count +- iris: Fix aux assertion in resource_get_handle +- blorp: Fix alignment test for HIZ_CCS_WT fast-clears + +Nataraj 
Deshpande (3): + +- anv: Limit vulkan version to 1.1 for Android +- anv: Disable extensions based on Android versions +- dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_R8G8B8X8_UNORM + +Neha Bhende (6): + +- util: Initialize pipe_shader_state for passthrough and transform shaders +- util: Add util functionality for GL4.1 support +- winsys/drm: Add GL4.1 support in drm winsys +- svga/include: Headers for GL4.1 support +- svga: Add GL4.1(compatibility profile) support in svga driver +- svga: Performance fixes + +Neil Armstrong (2): + +- Revert "CI: Disable Lima jobs due to lab unhealthiness" +- Revert "CI: Disable Panfrost Mali-T820 jobs" + +Neil Roberts (26): + +- nir/scheduler: Handle nir_intrinsic_load_per_vertex_input +- v3d: Remove unused member of v3d_compile +- nir/schedule: Store a pointer to the scoreboard in nir_deps_state +- nir/scheduler: Add an option to specify what stages share memory for I/O +- v3d: Let scheduler know GS doesn’t have shared I/O memory +- gallium: Add pipe cap for primitive restart with fixed index +- mesa: Add PrimitiveRestartFixedIndex to gl_constants +- v3d: Disable PIPE_CAP_PRIMITIVE_RESTART +- v3d: Add missing macro for stvpmd instruction +- v3d: Use stvpmd for non-uniform offsets in GS +- compiler: Add a system value for the line coord +- v3d: Implement the line coord intrinsic +- nir: Add intrinsics for the line width +- v3d: Handle the line width intrinsics +- v3d: Add a lowering pass for line smoothing +- v3d: Enable perpendicular line caps when line smoothing +- broadcom/qpu: set VC5_QPU_RADDR_A out of the switch at _pack_branch +- v3d/compiler: Fix sorting the gs and fs inputs +- v3d/compiler: Lower geometry output store base into offset src +- nir/scheduler: Move nir_scheduler to its own header +- nir/schedule: Store a pointer to the options struct in scoreboard +- nir/schedule: Add a callback for backend-specific dependencies +- v3d: Mark scheduling dependency for prim id and first output +- nir/schedule: Add 
an option for a fallback scheduling algorithm +- v3d: Changed v3d_compile:failed to an enum +- v3d: Retry with the fallback scheduler when RA fails + +Oschowa (5): + +- radv: Don't take absolute value of unsigned type. +- aco: Don't declare 'Block' as class, but define as struct. +- aco: Don't std::move temporary object. +- aco: Use correct reference type in for-range-loop. +- radv: Explicitly cast TIMESTAMP_NOT_READY value to uint32_t where needed. + +Pablo Saavedra (5): + +- ci: TRACES_DB_PATH and RESULTS_PATH defined as relative paths +- ci: ArgumentParser receives the args from the main parameters +- ci: Migrate tracie tests done in shell script to pytest +- ci: Split test_tracie_skips_traces_without_checksum in separate cases +- ci: Fix TypeError when traces in traces.yml is an empty list + +Pavel Asyutchenko (1): + +- vulkan/overlay: fix crash on destroying NULL swapchain + +Peter Seiderer (3): + +- vc4_bufmgr: fix time_t printf +- pan_bo.h: add time.h include for time_t +- v3d_bufmgr: fix time_t printf + +Pierre Moreau (4): + +- clover/nir: Check the result of spirv_to_nir +- clover/api: Address missing braces for subobj init +- clover: Address unnecessary copy warnings +- clover/spirv: Remove unused tuple header + +Pierre-Eric Pelloux-Prayer (62): + +- radeonsi: fix export count +- mesa: add gl_context::ForceIntegerTexNearest +- driconf: add force_integer_tex_nearest option +- radeonsi: add workaround for issue 2647 +- radeonsi: don't print gs_copy_shader stats for shaderdb +- glsl: init gl_FragColor if zero_init=true +- glsl: rework zero initialization +- glsl: add a is_implicit_initializer flag +- mesa: extend GLSLZeroInit semantics +- gallium: add a new cap PIPE_CAP_GLSL_ZERO_INIT +- ac/nir: export some undef as zero +- ac/surface: remove shadowing declaration +- amdgpu/radeon: add secure api +- radeonsi: add AMD_DEBUG=tmz option +- radeon: add RADEON_CREATE_ENCRYPTED flag +- radeonsi: allocate framebuffer texture as secure when using tmz +- 
amdgpu: add encrypted slabs support +- radeonsi: force using staging texture when uploading to secure texture +- radeonsi/sdma: implement tmz support +- gallium: PIPE_RESOURCE_FLAG_ENCRYPTED +- radeonsi: add support for PIPE_RESOURCE_FLAG_ENCRYPTED +- amdgpu: use AMDGPU_IB_FLAGS_SECURE when requested +- radeonsi: determine secure flag must be set for gfx IB +- radeonsi: do not use cmask with encrypted texture +- amd/addrlib: fix forgotten char -> enum conversions +- radeonsi: fix inversed arguments in si_test_gds_memory_management +- amdgpu: fix uninitialized variable +- radeonsi/sdma: remove useless compare +- radeonsi/drirc: enable zerovram option for 7 Days to Die +- winsys/radeon: do not cast bo->va as void* +- radeonsi: add return value to gfx10_ngg_calculate_subgroup_info +- radeonsi/ngg: try GS multi-cycling mode if default mode failed +- ac/surface: set SCANOUT if surf->is_displayable +- ac/surface: fix epitch when modifying surf_pitch +- ac/llvm: load 1 byte at a time if unaligned on gfx10 +- st/mesa: make texture views inherit compressed_data storage +- radeonsi: bump SI_NUM_SHADER_BUFFERS to 32 +- st/mesa: do not clear NewDriverState for inactive states +- glsl: reject size1x8 for image variable with floating-point data types +- ac/llvm: remove the -1 hack from ac_atomic_inc_wrap +- glsl: don't expose imageAtomicIncWrap for signed image +- glsl: only allow 32 bits atomic operations on images +- glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins +- st/mesa: set compressed_data to NULL when freed +- bin/symbols-check.py: add --ignore-symbol argument +- ac/llvm: export ac_init_llvm_once in targets +- mesa: rename _mesa_free_errors_data +- mesa: add bool param to _mesa_free_context_data +- mesa/st: release debug_output after destroying the context +- ac/surface: adapt surf_size when modifying surf_pitch +- radeonsi: adjust epitch for PIPE_FORMAT_R8G8_R8B8_UNORM +- radeonsi: extend workaround for KHR-GL45.texture_view.view_classes on gfx9 +- 
ac/llvm: handle static/shared llvm init separately +- mesa/st: introduce PIPE_CAP_NO_CLIP_ON_COPY_TEX +- radeonsi: enable PIPE_CAP_NO_CLIP_ON_COPY_TEX +- ac/llvm: add option to clamp division by zero +- radeonsi,driconf: add clamp_div_by_zero option +- radeonsi: use radeonsi_clamp_div_by_zero for SPECviewperf13, Road Redemption +- glsl: fix per_vertex_accumulator::fields size +- r600/uvd: set dec->bs_ptr = NULL on unmap +- radeon/vcn: set dec->bs_ptr = NULL on unmap +- mesa: fix glUniform* when a struct contains a bindless sampler + +Pierre-Loup A. Griffais (2): + +- radv: fix null descriptor for dynamic buffers +- radv: fix vertex buffer null descriptors + +Qiang Yu (6): + +- radeonsi: remove emacs style config file +- panfrost: don't always build bifrost_compiler +- radeonsi: fix syncobj wait timeout +- radeonsi: fix user fence space when MCBP is enabled +- radeonsi: fix max syncobj wait timeout +- radeonsi: fix user fence GPU address + +Rafael Antognolli (8): + +- intel: Store the aperture size in devinfo. +- intel/isl: Update mocs for DG1 +- intel/l3: Return the URB size from devinfo for DG1 +- intel/devinfo: Add function to check for DRM_I915_GEM_GET_TILING. +- iris/bufmgr: Do not use map_gtt or use set/get_tiling on DG1 +- anv/dg1: Don't use SET_TILING kernel uapi. +- iris: Align last_seqnos to 64 bits. +- anv: Align "used" attribute to 64 bits. 
+ +Rhys Kidd (5): + +- nv50_2d: regenerate envytools-based rnndb headers +- nv50_2d,nvc0_2d: Document SET_PIXELS_FROM_MEMORY_SAFE_OVERLAP from rnndb +- nvc0_2d: Document SET_PIXELS_FROM_MEMORY_CORRAL_SIZE from rnndb +- nvc0: fix macro define for NVE4_COPY() +- nvc0: add documentation for nve4+ (Kepler) COPY class + +Rhys Perry (174): + +- aco: remove use of f-strings +- aco: add message to static_assert +- nir: add missing group_memory_barrier handling +- compiler/spirv: flag nclamp/nmin/nmax as exact +- nir: make fsat return 0.0 with NaN instead of passing it through +- docs: add src/amd/ to sourcetree.html +- docs/envvars: document ACO_DEBUG +- docs/envvars: update RADV_FORCE_FAMILY +- aco: simplify consecutive ordered vmem/lds writes optimization +- aco: fix consecutively written vgprs from vmem instructions +- aco: mark phi definitions as last-seen phi operands +- aco: consider affinities when creating v_mac_f32 +- aco: improve phi affinities with p_split_vector +- aco: split operations that use a swap's definition +- aco: fix disassembly with LLVM 11 +- nir/opt_if: run opt_peel_loop_initial_if after all other optimizations +- nir/opt_if: use nir_src_as_bool in opt_peel_loop_initial_if helper +- aco: fix typo in insert_waitcnt's kill() +- nir: fix lowering to scratch with boolean access +- aco: fix interaction with 3f branch workaround and p_constaddr +- aco: consider SDWA during value numbering +- aco: check instruction format before waiting for a previous SMEM store +- aco: preserve more fields when combining additions into SMEM +- aco: don't reorder barriers in the scheduler +- aco: fix 64-bit shared_atomic_exchange +- docs: add missing "shader_" in VK_KHR_shader_subgroup_extended_types +- radv: set keep_statistic_info with RADV_DEBUG=shaderstats +- ac/gpu_info, radv: set max_wave64_per_simd to 20 on GFX10 +- aco: use v_xor3_b32 +- aco: validate instructions reading/writing upper halves/bytes +- aco: p_extract_vector in 64-bit u2f16/i2f16 +- aco: allow 
reading/writing upper halves/bytes when possible +- aco: prefer 4-byte aligned definitions +- aco: add Info::{operand_size,definition_size} +- aco: use Info::definition_size instead of definition's regclass +- aco: fix moving sub-dword values out of a register for a fixed definition +- aco: use num_opcodes instead of last_opcode +- aco: improve code for f2{i,u}{8,16} +- aco: use p_as_uniform in emit_vop1_instruction +- aco: add and set precise flag +- aco: create mads when signed zeros should be preserved +- aco: try to use fma instead of mad when denormals are enabled +- aco: create 16-bit mad/fma +- aco: update comment about preserving fp16/fp64 denormals +- aco: create 16-bit input and output modifiers +- aco: improve sub-dword check for sgpr/constant propagation +- aco: fix half_pi constant for 16-bit fsin/fcos +- aco: use 32-bit inline constants for 16-bit integer instructions +- aco: improve 8/16-bit constants +- aco: copy-propagate constants through p_extract_vector/p_split_vector +- aco: optimize 16-bit and 64-bit float comparisons +- aco: validate sub-dword pseudo instructions +- aco: add more opcodes to can_swap_operands +- aco: allow GFX9 partial writes with instructions which use opsel +- aco: improve check for moving temporaries out of fixed definitions +- aco: fix encoding of certain s_setreg_imm32_b32 instructions +- aco: fix validation error from vgpr spill/restore code +- aco: fix sub-dword opsel/sdwa checks +- aco: fix validation of opsel when set for the definition +- aco: shrink ssa_info +- aco: make ssa_info::label 64-bit +- aco: shrink mad_info +- aco: fix edge check with sub-dword temporaries +- aco: use the same regclass as the definition for undef phi operands +- radv: add new drirc option radv_no_dynamic_bounds +- radv: enable radv_no_dynamic_bounds for Path of Exile +- radv: enable radv_no_dynamic_bounds for more Path of Exile executables +- nir: slight correction to cube_face_coord constant folding +- spirv: set variables to restrict by 
default +- radv: fix image variable types in meta shaders +- aco: only use SMEM if we can prove it's safe +- aco: allow SMEM for some sub-dword accesses +- radv/aco,aco: allow SMEM SSBO loads on GFX6/7 +- aco: fix copy+paste error in split_buffer_store +- aco: don't store byte-aligned short stores +- aco: add missing bld.scc() in byte_align_scalar() +- aco: don't create byte-aligned short loads +- aco: fix when sub-dword create_vector operand cannot be placed perfectly +- aco: improve vectorization of 8/16-bit loads/stores +- aco: ignore blocked registers when checking edges in get_reg_impl() +- aco: remove outdated assert in handle_operands() +- radv: enable zerovram for Quantic Dream games +- aco: use VOP2 version of v_mbcnt_hi_u32_b32 on GFX6/7 +- aco: rework boolean phi pass +- aco: create better code for boolean phis with constant operands +- aco: optimize boolean phis with uniform selections +- aco: don't create phis with undef operands in the boolean phi pass +- aco: read 0 from inactive lanes when using dpp +- aco: optimize some masked swizzles to DPP +- aco: implement <32-bit masked_swizzle_amd +- nir/lower_subgroups: pass options struct to lower_shuffle +- nir/lower_subgroups: add lower_shuffle_to_swizzle_amd +- radv: use lower_shuffle_to_swizzle_amd +- aco: add 32-bit integer addition to can_swap_operands +- aco: fix underestimated pressure in spiller when a phi has a killed def +- aco: rewrite graph coloring in spiller +- aco: use unordered_set for spill id interferences +- aco: add add_interference() helper +- aco: use s_round_mode/s_denorm_mode +- aco: flush denormals before fp16 fabs/fneg if needed +- aco: fix nir_op_f2f16_rtne with non-default rounding modes +- aco: set tcs_in_out_eq=false if float controls of VS and TCS stages differ +- radv: enable more float_controls features +- aco: properly recognize that s_waitcnt mitigates VMEMtoScalarWriteHazard +- aco: use s_waitcnt_depctr to mitigate VMEMtoScalarWriteHazard +- spirv: don't split memory 
barriers +- nir/lower_int64: lower 64-bit amul +- aco: always set FI on GFX10 +- radv: replace discard with demote for Quantic Dream games +- aco: implement b2i8/b2i16 +- aco: be more careful combining additions that could wrap into loads/stores +- aco: allow overflow for some SMEM instructions +- aco: add NUW flag +- nir: add nir_unsigned_upper_bound and nir_addition_might_overflow +- aco: use nir_addition_might_overflow to combine additions into SMEM +- aco: move some setup code into helpers +- aco: make validate() usable in tests +- aco: print ACO IR before scheduling instead of after +- radv: fix invalid conversion warnings in vk_format.h +- aco: fix copy of uninitialized boolean +- aco: fix includes in aco_ir.cpp +- aco: add missing add_to_hazard_query +- aco: rework barriers and replace can_reorder +- radv/aco,aco: use scoped barriers +- aco: consider intrinsic access in visit_{load,store}_image +- nir,radv/aco: add and use pass to lower make available/visible barriers +- aco: enable value numbering of s_buffer_load_* +- aco: use storage_scratch +- aco: improve sync_info for TCS output stores +- aco: improve workgroup-scope and lower vmem/smem barriers +- aco: create acq+rel barriers instead of acq/rel +- nir/load_store_vectorize: fix indentation +- ac/nir: implement scoped_barrier +- radv: use scoped barriers +- aco: remove isel for GLSL-style barriers +- aco: add framework for unit testing +- aco: add a few tests for the assembler and optimizer +- aco: add framework for testing isel and integration tests +- ci: enable ACO tests +- aco/tests: add tests for sub-dword swaps +- aco: optimize swizzled SALU 8/16-bit conversions +- aco: fix waitcnt insertion on GFX10.3 +- aco: don't create v_mad_f32 on GFX10.3 +- aco: update bug workarounds for GFX10_3 +- aco: fix max_waves_per_simd on Polaris, VegaM and GFX10.3 +- aco: update vgpr_alloc_granule for GFX10.3 +- aco: implement subgroup shader_clock on GFX10.3 +- aco: update aco_opcodes.py for GFX10.3 +- aco: disable 
SMEM stores on GFX10.3 +- aco: replace MADs in isel with FMA on GFX10.3 +- spirv: set ACCESS_COHERENT for ssbo/global/image atomic load/store +- radv/aco: enable VK_KHR_memory_model +- ac/nir: consider an image load/store intrinsic's access +- ac/nir: fix coherent global loads/stores +- radv/llvm: enable VK_KHR_memory_model +- aco: fix C++11/C++14 compilation +- aco: set constant_data_offset correctly in the case of merged shaders +- aco: don't move memory accesses to before control barriers +- aco: fix non-rtz pack_half_2x16 +- aco: consider branch definitions in spiller +- aco: don't consider the first partial spill if it's the wrong type +- aco: don't fix break condition for break+discard to exec +- aco: fix regclass checks when fixing to vcc/exec with Builder +- aco: fix spills_entry heuristic for branch blocks in init_live_in_vars() +- aco: keep loop live-through variables spilled +- aco: reserve 2 sgprs for each branch +- aco: create long jumps +- aco: fix byte_align_scalar for 3 dword vectors +- aco: fix one-off error in Operand(uint16_t) +- nir/opt_if: fix opt_if_merge when destination branch has a jump +- aco: fix v_writelane_b32 with two sgprs +- aco: don't apply constant to SDWA on GFX8 +- radv: initialize with expanded cmask if the destination layout needs it +- radv,aco: fix reading primitive ID in FS after TES + +Rob Clark (265): + +- util/simple_mtx: add assert_locked() +- freedreno: add screen lock wrappers +- freedreno: switch to simple_mtx +- freedreno: fix buffer import +- gallium: extract out logicop helper +- freedreno/drm: drop atomic refcnts +- freedreno/drm: inline the things +- freedreno/a6xx: small query cleanup +- freedreno/a6xx: avoid unnecessary clearing VS DP state +- freedreno/a6xx: move const state to single stateobj +- freedreno/a6xx: move scissor state to stateobj +- freedreno/a6xx: limit PROG_FB_RAST state emit +- freedreno/a6xx: limit LRZ state emit +- freedreno/a6xx: move blend-color to stateobj +- freedreno/a6xx: combine sample 
mask into blend state +- freedreno/a6xx: skip unnecessary MRT blend state +- freedreno/a6xx: add OUT_PKT() +- freedreno/a6xx: convert draw packet to OUT_PKT() +- freedreno/a6xx: split out const emit +- freedreno/ir3: inline const emit +- freedreno/a6xx: convert const emit to OUT_PKT() +- freedreno: scissor vs disabled scissor micro-opt +- freedreno/a6xx: more OUT_REG() +- freedreno: sync registers with envytools +- freedreno/a6xx: don't set SP_FS_CTRL_REG0.VARYING for fragcoord +- freedreno/a6xx: fix LRZ hang +- freedreno/a6xx: add some more formats +- freedreno: we don't need aligned vbo's +- freedreno/a6xx: compressed blit fixes +- freedreno/a6xx: enable tiled compressed textures +- freedreno/gmem: don't assume scissor opt when estimating # of bins +- freedreno: initialize max_scissor +- freedreno/gmem: add div_align() helper +- freedreno/gmem: add helper to dump GMEM layout +- freedreno: add gmemtool +- freedreno/gmem: relax alignment on a6xx +- freedreno/gmem: rework gmem layout algo +- freedreno/ir3: don't allow negative const_offset +- freedreno/ir3: fix indirect cb0 load_ubo lowering +- freedreno/ir3: limit # of tex prefetch by shader size +- freedreno/ir3/postsched: reset sfu_delay on sync +- freedreno/ir3/postsched: try to avoid (sy) syncs +- freedreno/ir3/sched: avoid scheduling outputs +- freedreno/ir3/sched: try to avoid syncs +- freedreno/a6xx: fix max-scissor opt +- freedreno/ir3: use const_index accessors +- nir: fix indices for ir3 ssbo_atomic intrinsics +- nir: add helper to copy const_index[] +- nir: add pass to lower disjoint wrmask's +- freedreno/ir3: use lower_wrmasks pass +- freedreno/fdperf: add dependency on generated headers +- freedreno/drm: don't pass thru 'DUMP' flag on older kernels +- freedreno/drm: handle ancient kernels +- freedreno/ir3: remove Sethi-Ullman numbering pass +- freedreno/ir3: juggle around ir3_debug_print() +- freedreno/ir3/dce: report progress +- freedreno/cf: report progress +- freedreno/ir3/cp: report progress +- 
freedreno/ir3/deps: report progress +- freedreno/ir3/group: report progress +- freedreno/ir3/legalize: report progress +- freedreno/ir3/postsched: report progress +- freedreno/ir3: add IR3_PASS() macro +- freedreno/ir3: move where we preserve binning pass inputs +- freedreno/ir3: be iterative +- freedreno/ir3: make foreach_src declare cursor ptr +- freedreno/ir3: make foreach_ssa_src declar cursor ptr +- freedreno/ir3: make input/output iterators declare cursor ptr +- freedreno/ir3/group: fix for half-regs +- freedreno/ir3: fix mismatched flags on split +- freedreno/ir3/cf: handle multiple cov's properly +- freedreno/ir3: fix immed type in create_addr0() +- freedreno/ir3/print: print cat2 condition +- freedreno/ir3/cp: fix cmps folding +- freedreno/ir3: fix mismatched wrmask for overlapping VS inputs +- freedreno/ir3: add simple validate pass +- freedreno/ir3: add helpers to deal with src/dst types +- freedreno/ir3/validate: add checking for types and opcodes +- freedreno/drm: disallow exported buffers in bo cache +- freedreno: add batch debugging +- freedreno: clear last_fence after resource tracking +- freedreno: handle PIPE_TRANSFER_MAP_DIRECTLY +- freedreno/gmem: make noscis debug actually do something on a6xx +- freedreno/gmemtool: make GMEM alignment per-gen +- freedreno/gmemtool: add a405 +- freedreno/gmemtool: add verbose mode +- freedreno/gmem: add some asserts +- freedreno/gmem: fix nbins_x/y mismatch +- freedreno/gmem: split out helper to calc # of bins +- freedreno/a6xx: LRZ fix for alpha-test +- freedreno/a6xx: document LRZ flag buffer +- freedreno/a6xx: fix vsc assert +- nir: get_base_type() should return enum type +- nir: extract out convert_to_bitsize() helper +- nir/builder: add bitsize conversion helpers +- nir/lower_tex: fixes for fp16 yuv lowering +- freedreno/ir3: split kill from no_earlyz +- freedreno/a6xx: sync registers from envytools +- freedreno/a6xx: update depth-plane control regs +- freedreno/a6xx: re-work LRZ state tracking +- 
freedreno/a6xx: add early-lrz-late-z mode +- freedreno/a6xx: also consider alpha-test for ztest-mode +- freedreno/a6xx: more early-z +- freedreno/computerator: fix missing dependency on generated header +- nir/print: print tex dest type +- freedreno/ir3: add debug code to print conflicting half-regs +- freedreno/ir3: respect tex prefetch limits +- freedreno/ir3: remove RA "q-values" optimization +- freedreno/ir3: limit pre-fetched tex dest +- freedreno/ir3: unify shader create/delete paths +- freedreno/ir3: move the libdrm dependency out of shared code +- turnip: drop linking libfreedreno_drm +- freedreno/ir3: don't rely on intr->num_components +- radv: don't set num_components for non-vectorized intrinsics +- nir/builder: don't set intr->num_components +- nir/lower-atomics-to-ssbo: don't set num_components +- spriv: don't set num_components for non-vectorised intrinsics +- v3d: don't use intr->num_components for non-vectorized intrinsics +- nir/validate: validate intr->num_components +- freedreno/log-parser: fix compute times +- freedreno/sched: reset delay counters at start of block +- freedreno/ir3/validate: also check instr->address +- freedreno/ir3/cp: properly handle already-folded RELATIV +- freedreno: splitup emit_string_marker +- freedreno/a6xx: emit shader names in debug builds +- freedreno/ir3/legalize: don't allow (nopN) if (rptN) +- freedreno/ir3/print: print (r) flag +- freedreno/ir3: add test for delay slot calculation +- freedreno/ir3/delay: calculate delay properly for (rptN)'d instructions +- freedreno/ir3: add helpers to move instructions +- freedreno/ir3: delay test support for vectorish instructions +- freedreno/ir3/cp: extract valid_flags +- freedreno/ir3: add post-scheduler cp pass +- freedreno/ir3: convert regmask_t to struct +- freedreno/ir3: move mergedreg state out of reg +- freedreno/ir3: decouple regset from gpu gen +- freedreno/ir3: pass variant to postsched +- freedreno/ir3: re-work assembler API +- freedreno/ir3: make mergedregs a 
property of the variant +- freedreno/a6xx: set .MERGEREGS based on variant +- turnip: set .MERGEDREGS based on variant +- freedreno/computerator: MERGEDREGS update +- freedreno/ir3: update obsolete comment +- spirv: atomic_counter_read_deref is not vectorized +- spirv: drop some dead code +- glsl_to_nir: fix is_helper_invocation +- glsl_to_nir: fix shader_clock +- glsl_to_nir: fix vote_any/vote_all +- freedreno/ir3: refactor out helper to compile shader from asm +- freedreno/ir3: add accessor for const_state +- freedreno/a6xx: defer userconst cmdstream size calculation +- freedreno/ir3: move ubo_state into const_state +- freedreno/ir3: drop shader->num_ubos +- freedreno/ir3: constify shader key +- freedreno/ir3: pass variant to ir3_create() +- freedreno/ir3: convert over to ralloc +- freedreno/ir3: move num_reserved_user_consts out of const_state +- freedreno/ir3: un-embed const_state +- freedreno/ir3: move const_state back to variant +- freedreno/ir3: move output_loc to variant +- freedreno/ir3: split out ubo info from range +- freedreno/ir3: splitup get_existing_range() +- freedreno/ir3: split ubo analysis/lowering passes +- ci: remove some freedreno a6xx skips +- freedreno/ir3: add helper to determine point-coord inputs +- freedreno/a6xx: de-duplicate vinterp/vpsrepl state building +- freedreno/a6xx: use point-coord helper +- freedreno/a5xx: use point-coord helper +- freedreno/a4xx: use point-coord helper +- freedreno/a3xx: use point-coord helper +- freedreno: convert builtin blit VS prog to ureg builder +- freedreno/ir3: switch PIPE_CAP_TGSI_TEXCOORD +- freedreno: make foreach_bit() declare it's cursor +- freedreno: split out batch draw tracking helper +- freedreno: split out batch clear tracking helper +- freedreno: handle batch flush in resource tracking +- freedreno/ir3/ra: fix pre-color edge case +- freedreno/ir3: add ir3_finalize_nir() +- freedreno/ir3: move finalize_nir to pscreen hook +- freedreno/ir3: add ir3_compiler_destroy() +- freedreno/ir3: shuffle 
some variant fields +- freedreno/a6xx+ir3: stop generating pointless binning shaders +- freedreno/ir3: build binning variant at same time as draw variant +- freedreno/ir3: disk-cache support +- freedreno/ir3: move nir finalization to after cache miss +- freedreno/fdperf: fix print of base address +- freedreno/fdperf: better compatible string matching +- freedreno/fdperf: prefer render node +- gitlab-ci: reduce a630 runner load +- freedreno/ir3: add missing VS driver params +- freedreno/ir3: make compile fails more visible +- freedreno/a6xx: bail instead of crash for compile fails +- freedreno/ir3/ra: be better at failing +- freedreno/a6xx: don't enable early-z/lrz if no z-test +- freedreno/ir3: DCE unused arrays +- driconf: allowlist/denylist +- gitlab-ci: re-enable all a630 jobs +- freedreno: small comment re-word +- freedreno: whitespace fix +- freedreno/ir3/parser: half-precision relative regs +- freedreno/ir3: set array precision on creation +- freedreno/ir3: fix half-reg array stores +- freedreno/ir3/ra: debug msgs tweak +- freedreno/ir3/ra: assign vreg names to all array elements +- freedreno/ir3/ra: fix array conflicts for split/merged +- freedreno: sync registers from envytools +- freedreno: make gen_header.py check parent directory +- freedreno: slurp in rnndb +- freedreno: slurp in rnn +- freedreno: slurp in decode tools +- freedreno: slurp in afuc +- freedreno/rnn: warnings cleanup +- freedreno/decode: warnings cleanup +- freedreno/afuc: warnings cleanup +- freedreno: add CI for envytools tools +- freedreno/ir3: split out regmask +- freedreno: drop shader_t +- freedreno: deduplicate a3xx+ disasm +- freedreno: move a2xx disasm out of gallium +- freedreno: deduplicate a2xx disasm +- freedreno/ci: add a2xx trace to CI job +- freedreno/tools: check rnn parse status +- freedreno/rnn: split out helper to find files +- freedreno/rnn: add error helper +- freedreno/rnn: rename schema file +- freedreno/rnn: update schema for 'pos' +- freedreno/rnn: add relaxed 
boolean type +- freedreno/rnn: add high/low/pos to registers +- freedreno/rnn: add radix/align +- freedreno/rnn: relax Hexadecimal to HexOrNumber +- freedreno/rnn: add variants/varset to domain +- freedreno/registers/a2xx: fix validation error +- freedreno/registers/a4xx: fix validation error +- freedreno/registers/adreno_pm4: fix validation errors +- freedreno/rnn: describe copyright element in schema +- freedreno/rnn: add "addvariant" to schema +- freedreno/rnn: allow name to be optional in arrays +- freedreno/rnn: fix use-group +- freedreno/registers/mdp5: fix validation error +- freedreno/rnn: schema updates for dynamic/irregular offsets +- freedreno/rnn: add schema validation +- freedreno/rnn: headergen2 warnings cleanup +- freedreno/decode: cffdec warnings cleanup +- freedreno/ir3: add missing track_ubo_use() +- freedreno/a6xx: don't emit a bogus size for empty cb slots +- freedreno/a6xx: fixup draw state earlier +- freedreno/rnn: also look for .xml.gz +- freedreno/rnn: rework RNN_DEF_PATH construction +- freedreno/registers: add .gitignore +- freedreno/registers: split header build into subdirs +- freedreno/registers: install gzip'd register database +- freedreno/decode: move dependencies up a level +- freedreno: allow fence_fd fences to be recycled +- freedreno/ir3: ir3_cmdline updates +- freedreno/ir3: lower local_index using local_id +- glsl/lower_precision: split out const lowering +- gallium: replace 16BIT_TEMPS cap with 16BIT_CONSTS +- glsl: remove LowerPrecisionTemporaries +- glsl: don't inline intrinsics for mediump +- glsl_to_nir: fix bitfield_extract with 16-bit operands +- freedreno/registers: add some missing regs to build +- freedreno/crashdec: handle section name typos +- freedreno/a6xx: fix occlusion query with more than one tile +- freedreno: handle case of shadowing current render target +- freedreno/gmemtool: add tile_alignw/h and a650 + +Rohan Garg (3): + +- iris: Fix documentation for _iris_batch_flush +- ci: Include trace replay support 
in ARM rootfses. +- gitlab-ci: Replay traces on lava devices + +Roland Scheidegger (1): + +- gallivm: fix half to float conversions with llvm 11 + +Roman Gilg (2): + +- vulkan/wsi/x11: add sent image counter +- vulkan/wsi/x11: wait for acquirable images in FIFO mode + +Roman Stratiienko (5): + +- egl: Build surfaceless platform on Android +- Android: Fixes for Q and R +- panfrost: Android build fixes 2020 week 31 +- lima: Fix lima_screen_query_dmabuf_modifiers() +- android: freedreno: Another build fix + +Sagar Ghuge (3): + +- iris: Use modfiy disables for 3DSTATE_WM_DEPTH_STENCIL command +- intel/compiler: Optimize integer add with 0 into mov +- intel/compiler: Remove unnecessary optimization for MUL + +Samuel Pitoiset (235): + +- ci: fix reporting the number of unexpected/flakes +- ci: add lists of expected failures & skipped tests for RAVEN with ACO +- aco: remove unecessary p_split_vector with v2b reg class +- radv: enable shaderInt16 unconditionally with LLVM and only GFX8+ with ACO +- radv: cleanup radv_CreateInstance() +- radv: rename radv_devices() to radv_enumerate_physical_devices() +- radv: fix a memleak if the physical device initialization failed +- radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed +- radv: don't report error with other vendor DRM devices +- radv: use a linked list for physical devices +- radv: display an error message if the winsys init failed +- radv/winsys: do not count visible VRAM buffers twice in the budget +- ci: remove unused .test-radv-fossilize rule +- ci: set ACO_DEBUG=validateir,validatera global for RADV testing +- ci: run radv-fossils with Pitcairn (GFX6) and Bonaire (GFX7) too +- radv: remove the LLVM version string when ACO is used +- radv: do not print the LLVM version string twice in hang reports +- radv: report correct backend IR in hang reports when ACO is used +- aco: fix 64-bit trunc with negative exponents on GFX6 +- nir: do not vectorize load/store if offset can overflow and robustness 
enabled +- aco: prevent invalid loads/stores vectorization if robustness is enabled +- radv: limit the Vulkan version to 1.1 for Android +- radv: handle different Vulkan API versions correctly +- radv: update the list of allowed Android extensions +- aco: optimize add/sub(a, cndmask(b, 0, 1, cond)) -> addc/subbrev_co(0, a, b) +- radv: use the common base object type for VkDevice +- radv: use the base object struct types +- radv: implement VK_EXT_private_data +- vulkan: import common code for generating extensions +- radv: use the common code for generating extensions and dispatch tables +- anv: use the common code for generating extensions and dispatch tables +- turnip: use the common code for generating extensions and dispatch tables +- radv: add a LLVM version string workaround for SotTR and ACO +- aco: remove useless check for nir_tex_src_bias +- aco: add support for texturing with clamped LOD +- ac/llvm: add support for texturing with clamped LOD +- radv: enable shaderResourceMinLod +- spirv: handle OpCopyObject correctly with any types +- radv: fix missing break in radv_GetPhysicalDeviceProperties2() +- aco: store 16-bit temporary outputs as v2b +- aco: convert 16-bit values before exporting MRTs +- aco: allow to load/store 16-bit values in VMEM for tess and geom +- aco: implement 8-bit/16-bit mov's with p_create_vector +- aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_* +- aco: validate v_interp_*_f16 as VOP3 instructions instead of VINTRP +- aco: emit v_interp_*_f16 instructions as VOP3 instead of VINTRP +- aco: implement 16-bit interp +- aco: fix off-by-one error with 16-bit MTBUF opcodes on GFX10 +- radv/aco: enable storageInputOutput16 on GFX9+ +- aco: fix missing break in label_instruction() +- radv: fix missing break in radv_GetPhysicalDeviceFeatures2() +- radv: fix duplicated expression in ac_setup_rings() +- radv/winsys: remove useless free in radv_amdgpu_create_bo_list() +- aco: declare 8-bit/16-bit reduce operations +- aco: 
implement 8-bit/16-bit reductions +- aco: validate 8-bit/16-bit VGPR operands for readfirstlane/readlane/writelane +- aco: implement 8-bit/16-bit nir_intrinsic_read_first_invocation +- aco: implement 8-bit/16-bit nir_intrinsic_{shuffle,_read_invocation} +- aco: implement 8-bit/16-bit nir_intrinsic_quad_* +- aco: use a temporary SGPR for 8-bit/16-bit literal reduction identities +- aco: sign-extend the input and identity for 8-bit subgroup operations +- radv: do not return from radv_GetPhysicalDeviceFeatures2() +- radv: cleanup physical device features +- radv: remove useless assignment in build_streamout_vertex() +- spirv: add ReadClockKHR support with device scope +- aco: implement nir_intrinsic_shader_clock with device scope +- ac/nir: fix shader clock with subgroup scope +- ac/nir: implement nir_intrinsic_shader_clock with device scope +- radv: advertise shaderDeviceClock on GFX8+ +- spirv: add SpvCapabilityImageGatherBiasLodAMD +- spirv: add support for bias/lod with OpImageGather +- ac/nir: add support for bias/lod with texture gather +- aco: add support for bias/lod with texture gather +- radv: add support for querying which formats support texture gather LOD +- radv: advertise VK_AMD_texture_gather_bias_lod +- spirv,radv,anv: implement no-op VK_GOOGLE_user_type +- radv/aco: enable VK_EXT_subgroup_size_control +- aco: fix register allocation for subdword instructions on GFX10 +- aco: implement 8-bit/16-bit reductions on GFX10 +- aco: allocate a temp VGPR for some 8-bit/16-bit reduction ops on GFX10 +- aco: allow gfx10_wave64_bpermute with 8-bit/16-bit input +- aco: sign-extend input/indentity for 32-bit reduce ops on GFX10 +- radv/aco: enable VK_KHR_subgroup_extended_types on GFX8+ +- radv: enable zero VRAM for Doom Eternal +- radv: enable zero VRAM for all VKD3D (DX12->VK) games +- aco: implement 16-bit reduce operations on GFX6-GFX7 +- aco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7 +- aco: fix subdword copies on GFX6-GFX7 +- aco: sign-extend 
input/identity for 16-bit subgroup ops on GFX6-GFX7 +- radv/aco: enable 64-bit atomic features if RADV is linked with LLVM 8 +- aco: use v_bfe_u32 for unsigned reductions sign-extension on GFX6-GFX7 +- aco: fix sign-extend 8-bit subgroup operations on GFX6-GFX7 +- aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7 +- radv/aco: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7 +- ac/nir: adjust an assertion for D16 on GFX6-GFX7 +- nir/lower_explicit_io: fix NON_UNIFORM access for UBO loads +- radv/llvm: expose VK_EXT_shader_demote_to_helper_invocation with LLVM 9+ +- aco: implement 8-bit/16-bit conversions on GFX6-GFX7 +- aco: fix alignment of vectors with 4 elements +- radv/aco: enable 8-bit/16-bit storage on GFX6-GFX7 +- radv/aco: enable shaderInt16 on GFX6-GFX7 +- radv/aco: enable shaderInt8 and VK_KHR_shader_float16_int8 on GFX6-GFX7 +- ac/nir: fix integer comparisons with pointers +- radv: set DB_SHADER_CONTROL.CONSERVATIVE_Z_EXPORT correctly +- radv: add new drirc option radv_enable_mrt_output_nan_fixup +- aco: implement radv_enable_mrt_output_nan_fixup workaround +- radv/llvm: implement radv_enable_mrt_output_nan_fixup workaround +- radv: enable radv_enable_mrt_output_nan_fixup for RAGE 2 +- ac: add ac_choose_spi_color_formats() to common code +- spirv: fix using OpSampledImage with OpUndef instead of OpType{Image,Sampler} +- aco: allow to swap operands for some 16-bit float instructions +- spirv: do not set num_components for non-vectorized mbcnt_amd intrinsic +- radv/aco: enable FP16 features/extensions on GFX9+ +- radv: lower discards to demote to workaround a RDR2 game bug +- radv: make sure to set CB_SHADER_MASK correctly for internal CB operations +- radv: compute CB_SHADER_MASK from the fragment shader outputs +- radv: only requires LLVM 9 for GFX10 if not using ACO +- radv: replace == GFX10 with >= GFX10 where it's needed +- aco: replace == GFX10 with >= GFX10 where it's needed +- radv: add support for Sienna Cichlid +- radv: require LLVM 
11+ for GFX 10.3 if not using ACO +- aco: fix printing ASM on GFX6-7 if clrxdisasm is not found +- aco: improve validation checks for readlane/writelane +- aco: fix printing ASM on GFX6-7 again +- gitlab-ci: stop testing RADV with LLVM +- gitlab-ci: update the list of expected CTS failures for RADV/ACO +- gitlab-ci: update the list of expected failures for Pitcairn +- radv: fix checking the return value of cs_finalize() +- gitlab-ci: add parallel-rdp fossils +- radv: lower 64-bit drcp/dsqrt/drsq for fixing precision issues +- radv: lower 64-bit dfloor on GFX6 for fixing precision issues +- gitlab-ci: add a list of expected failures for RADV/ACO on NAVI14 +- gitlab-ci: set the number of Fossilize threads to 4 +- gitlab-ci: append Fossilize stdout/stderr to a file to reduce spam +- gitlab-ci: attach the Fossilize log file as artifact on failure +- radv: remove the shader ballot workaround for Youngblood with LLVM +- radv: remove the load/store workaround for Monster Hunter World with LLVM +- radv: enable VK_AMD_shader_ballot on GFX6-7 with both compiler backends +- radv: adjust CB_SHADER_MASK for dual-source blending in the shader info pass +- radv: rework 8/16-bit color attachment formats detection +- radv: use SPI_SHADER_ZERO for non-written color attachments +- radv: add support for MRTs compaction to avoid holes +- radv: fix wide points and lines +- radv: fix wide lines with multisample enabled +- Revert "vulkan/wsi/x11: Ensure we create at least minImageCount images." 
+- radv,vulkan: add a new x11 wsi drirc workaround for DOOM Eternal +- radv: disable FMASK compression when drawing with GENERAL layout +- radv: set depth/stencil enable values correctly for the meta clear path +- radv: implement missing VK_ACCESS_MEMORY_{READ,WRITE}_BIT +- radv: store the primitive topology hardware value in the pipeline +- radv: adjust IA_MULTI_VGT_PARAM.WD_SWITCH_ON_EOP at draw time +- radv: adjust IA_MULTI_VGT_PARAM.PARTIAL_VS_WAVE at draw time +- radv: compute prim_vertex_count at draw time +- aco: fix more validation errors from vgpr spill/restore code +- radv: return VK_ERROR_DEVICE_LOST if wait-for-idle failed or expired +- radv: remove the secure compile support feature +- radv: rework dynamic viewports/scissors support +- radv: add VK_EXT_extended_dynamic_state but leave it disabled +- radv: declare new extended dynamic states +- radv: add support for dynamic cull mode and front face +- radv: add support for dynamic primitive topology +- radv: add support for dynamic and scissor count +- radv: add support for dynamic depth/stencil states +- radv: add support for dynamic vertex input binding stride +- radv: advertise VK_EXT_extended_dynamic_state +- radv: add the custom border color BO to the list of buffers +- radv: destroy the base object if VkCreateQueryPool() failed +- radv: destroy the base object if VkCreateRenderPass*() failed +- radv: destroy the base object if VkCreateImage() failed +- radv: destroy the base object if VkCreateBuffer() failed +- radv: destroy the base object if VkCreateEvent() failed +- radv: destroy the base object if VkCreateSemaphore() failed +- radv: destroy the base object if VkCreateFence() failed +- radv: destroy the base object if VkAllocateCommandBuffers() failed +- radv: destroy the base object if VkCreateInstance() failed +- radv/winsys: replace alloca() by malloc() everywhere +- radv/winsys: pass the buffer list via the CS ioctl for less CPU overhead +- radv: fix destroying the syncobj when exporting a 
fence FD +- radv: fix the error code when exporting a semaphore/fence fails +- radv: fix the error code when allocating a fresh imported syncobj fails +- radv: optimize creating signaled syncobj with amdgpu_cs_create_syncobj2() +- radv: split fence into two parts as enum+union. +- radv: remove one useless goto in radv_queue_submit_deferred() +- radv: improve the error messages when a CS submission failed +- radv: return better Vulkan error codes when VkQueueSubmit() fails +- radv: disable CPU caching for IBS to reduce fetch latency +- radv/winsys: always allow GTT placements on APUs +- radv: advertise VK_EXT_image_robustness +- radv: do not perform read-modify-write with the upload BO +- radv: disable CPU caching for the upload BO to reduce fetch latency +- aco: add support for nir_intrinsic_shared_atomic_fadd +- ac/nir: add support for nir_intrinsic_shared_atomic_fadd +- radv: advertise VK_EXT_shader_atomic_float +- radv: add missing return values check for some winsys calls +- radv/winsys: check more allocation failures +- radv/winsys: remove useless check when binding virtual buffers/images +- radv/winsys: return a Vulkan error code when binding virtual buffers/images +- radv/winsys: be more robust when a CS failed during recording +- radv: remove declared but unused radv_pipeline::is_dual_src +- radv: remove set but unused radv_pipeline::vertex_elements +- radv: remove outdated TODO related to PA_SU_VTX_CNTL.PIX_CENTER +- radv: emit more invariant registers as part of the initial gfx state +- radv: emit PA_SC_LINE_CNTL as part of the rasterization state +- radv: clean up VGT_SHADER_STAGES_EN emission +- radv: clean up PA_SC_CLIPRECT_RULE emission +- radv: reduce the number of allocated dwords for compute CS +- radv: clean up radv_compute_generate_pm4() +- radv: remove unnecessary radv_tessellation_state::num_patches +- radv: remove no-op si_multiwave_lds_size_workaround() +- radv: remove one unnecessary param to radv_generate_graphics_pipeline_key() +- radv: 
align the LDS size in calculate_tess_lds_size() +- radv: set LDS TCS size at shaders creation for GFX9+ +- radv: remove unnecessary radv_tessellation_state::lds_size +- radv: clean up tessellation state emission +- radv: add radv_pipeline_init_input_assembly_state() +- radv: add radv_pipeline_generate_vgt_gs_out() +- radv: clean up adjusting MSAA state if conservative rast is enabled +- radv: clean up binning state initialization +- radv: assign pipeline gfx fields before PM4 emission +- radv: constify all radv_pipeline_generate_*() helpers +- radv: add radv_pipeline_init_shader_stages_state() +- radv: remove useless return value to radv_pipeline_scratch_init() +- radv: clean up remaining pipeline init functions +- radv: print warnings for famous RADV_PERFTEST options that no longer exist +- radv: do not honor a user-specified pitch on GFX 10.3 +- radv: increase minimum NGG vertex count requirement per workgroup on GFX 10.3 +- radv: fix sample shading on GFX 10.3 +- radv: set BYPASS_VTX_RATE_COMBINER_GFX103 on GFX 10.3 +- radv/gfx10: add missing initialization of registers +- radv: limit LATE_ALLOC_GS to prevent a GPU hang on GFX10 +- radv: fix emitting the border color pointer on the compute queue +- nir/algebraic: mark some optimizations with fsat(NaN) as inexact +- aco: handle unaligned loads on GFX10.3 +- spirv: fix emitting switch cases that directly jump to the merge block +- radv: fix transform feedback crashes if pCounterBufferOffsets is NULL + +Satyajit Sahu (1): + +- frontends/va: Handle dynamic resolution/SVC for VP9 + +Satyeshwar Singh (1): + +- intel/dev: Don't consider all TGL SKUs as GT1 only + +Serge Martin (3): + +- amd/common: Fix incorrect use of asprintf instead of vasprintf +- clover: add more cl_mem_object_type to pipe_texture_target mapping +- clover: implements clEnqueueFillBuffer + +Shawn Guo (1): + +- freedreno/a4xx: fix *_NONE enum conversion + +Simon Ser (3): + +- EGL: sync headers with Khronos +- gbm: document that gbm_bo_map exposes a 
linear view +- radv: use bitshifts for debug enum values + +SureshGuttula (1): + +- radeon/vcn: Corrected vp9 ref associated data incase of target->codec is NULL + +Tapani Pälli (14): + +- st/mesa: destroy only own program variants when program is released +- anv: call base finish only if pass given in DestroyRenderPass +- anv: add VK_EXT_extended_dynamic_state but leave it disabled +- anv: add new dynamic states +- anv: consider dynamic state when creating pipeline +- anv: handle dynamic viewport count +- anv: add support for dynamic cull mode and winding order +- anv: add support for dynamic viewport and scissor with count +- anv: add support for dynamic primitive topology change +- anv: depth/stencil dynamic state support +- anv: dynamic vertex input binding stride and size support +- anv: toggle on VK_EXT_extended_dynamic_state +- anv: add a check for depthStencilState before using it +- anv: null check for buffer before reading size + +Thong Thai (8): + +- radeon: Fix whitespaces +- gallium/auxiliary/vl: Fix compute shader scaling for non-square pixels +- gallium/auxiliary/vl: Fix compute shader scale_y for interlaced videos +- frontends/va: Fix deinterlace bottom field first flag +- frontends/vdpau: Default destination rect to source rect +- radeon/vcn: add vcn 3.0 encode support +- radeonsi: use PIPE_FORMAT_P010 for 10-bit VP9 decoding +- radeon/vcn: increase render_pic_list size + +Timothy Arceri (69): + +- glsl: stop cascading errors if process_parameters() fails +- glsl: fix slow linking of uniforms in the nir linker +- radv: fix regression with builtin cache +- nir: add glsl_get_ifc_packing() helper +- nir: add callback to nir_remove_dead_variables() +- glsl: add can_remove_uniform() helper to the NIR linker +- glsl: remove dead uniforms in the nir linker +- glsl/spirv: remove dead uniforms in spirv nir linker +- gitlab-ci: bump piglit checkout commit +- i965: call brw_nir_lower_uniforms() after uniform linking is complete +- util: add BITSET_LAST_BIT() 
helper +- glsl: add struct to gather more info about uniform array access +- glsl: add update_array_sizes() helper to the NIR uniform linker +- glsl: gather uniform dereference info before main linking loop +- glsl: when NIR linker enable use it to resize uniform arrays +- glsl: fix potential slow compile times for GLSLOptimizeConservatively +- glsl: fix incorrect optimisation in opt_constant_variable() +- glsl: fix uniform array resizing in the nir linker +- glsl: small optimisation fix for uniform array resizing +- st_glsl_to_nir: fix potential use after free +- mesa: remove _mesa prefix from static function +- mesa: add _mesa_program_state_value_size() helper +- glsl: define gl_LightSource members in ARB_vertex_program order +- st/glsl_to_nir: disable st_nir_lower_builtin() when packing supported +- glsl: remove stale FIXME +- i965: add and fix fallthrough comments +- llvmpipe: add missing fallthrough comments +- gallivm: add missing break +- anv: update fallthrough comment so gcc sees it +- intel/compiler: add and fix up fallthrough comments for gcc warnings +- iris: add missing fallthrough comment +- egl: move fallthrough comment so gcc can see it +- nir: add missing break to nir_opt_access() +- mesa: fix fallthrough in glformats +- mesa: add fallthrough comments to glformats.c +- mesa: add fallthrough comments to get.c +- nir: fix implicit fallthrough warnings +- mesa: add fallthrough comments to COPY_SZ_4V() +- radeonsi: add missing fallthrough comment +- glx: add missing fallthrough comment +- glsl: move fallthrough comment to where gcc can see it +- radeon: add missing fallthrough comments +- spirv: add missing fallthrough comments +- mesa/vbo: add some missing fallthrough comments +- mesa: add missing fallthrough comment to teximage.c +- mesa: fix unintended fallthrough in glIsEnabled() +- r300: add and fix up fallthrough comments +- svga: add missing fallthrough comments +- mesa: update fallthrough comment so gcc can see it +- nv30: add missing 
fallthrough comment +- meson: turn on Wimplicit-fallthrough project wide +- nouveau: fix pointer-sign warning +- gitlab-ci: Enable -Werror in `meson-classic` job +- r600/radeonsi: silence zero-length-bounds gcc warnings +- radeonsi: fix SI_NUM_ATOMS +- iris: fix maybe-uninitialized warning for initial_state variable +- iris: silence maybe-uninitialized for stc_dst_aux_usage variable +- nouveau/nvc0: silence maybe-uninitialized warning +- panfrost: add some missing fallthrough comments +- panfrost: hide more unused code in bi_lower_combine.c +- panfrost: add some missing fallthrough comments to bi_pack.c +- freedreno: fix missing fallthrough comments +- v3d: remove redefine of VG(x) +- zink: fix missing fallthrough comment +- nine: remove unused var +- etnaviv: add missing fallthrough comments +- lima: add missing fallthrough comments +- lima: add missing break +- gitlab-ci: Enable -Werror in `meson-gallium` job + +Timur Kristóf (4): + +- aco/gfx10: Refactor of GFX10 wave64 bpermute. +- aco: Implement subgroup shuffle on GFX6-7. +- radv/aco: Always enable subgroup shuffle. +- aco: Fix emit_boolean_exclusive_scan in wave32 mode. + +Tomeu Vizoso (55): + +- panfrost: Emit blend descriptors on Bifrost +- panfrost: Don't leak temporary descriptors array +- pan/decode: Check for correct unknown field +- pan/decode: Use correct printf modifier for long int +- panfrost: Split bit out of format.unk3 +- panfrost: Create additional BO for the checksum of imported BOs (Bifrost) +- panfrost: Add a bit more info about some tiler fields +- pan/bi: Print shaders only if BIFROST_MESA_DEBUG=shaders +- pan/decode: Trace to stderr with PANDECODE_DUMP_FILE=stderr +- panfrost: GPUs newer than G-71 don't have swizzles... 
+- panfrost: mali_attr_meta.unknown1 is zero on Bifrost +- panfrost: Add Bifrost texture trampoline BO to batch +- pan/decode: Properly print tripped zeroes +- virgl: Properly check for encode_stride when encoding transfers +- panfrost: Add checksum BOs to batch +- panfrost: Don't trample on top of Bifrost-specific unions +- panfrost: Handle MALI_RGB8_UNORM in panfrost_format_to_bifrost_blend +- gitlab-ci: Run more dEQP tests for virgl +- gitlab-ci: Add manual tests for Virgl using GLES on the host +- gitlab-ci: Test virgl with Khronos' OpenGL CTS +- gitlab-ci: Update CTS runner +- ci: Don't call renderdoc's ReplayController.Shutdown() +- ci: Move ARM rootfses to stable +- gitlab-ci: Build kernel drivers for a few ethernet USB dongles +- gitlab-ci: More stable URL for kernel and ramdisk artifacts, for LAVA +- gitlab-ci: Remove left-behind rules: +- gitlab-ci: Don't rebuild kernels and rootfs if they have been already built in mainline +- gitlab-ci: Run all of GLES3 tests for Panfrost +- gitlab-ci: Re-add kernels for bare-metal +- gitlab-ci: Download traces from MinIO +- gitlab-ci: Upload tracie artifacts to MinIO +- gitlab-ci: Fix needs: of the arm64 LAVA test jobs +- ci: Upload images of failed replays to MinIO +- ci: Use smaller glxgears trace +- ci: Prefix tracie artifacts with the device name +- ci: Test with more traces +- ci: Disable trace testing on Mali T760 +- ci: Fix the overwriting of traces.yml for baremetal +- ci: Namespace trace artifacts to the job number +- ci: Always print status code of HTTP uploads in tracie +- ci: Print load stats after running dEQP +- ci: Fix URL for glslang +- ci: Don't ship vk-build-programs after building dEQP +- ci: Split building of libdrm to its own script +- ci: Build kernels and rootfs for x86 devices +- ci: Upload reference images for traces +- ci: Print URL to image diff when a trace replay fails +- ci: Generate MinIO credentials within LAVA jobs +- ci: Set date in LAVA DUTs from NTP servers +- ci: Build-test Panfrost 
tools +- ci: Upload traces' reference and actual images to MinIO +- ci: Download traces from MinIO in baremetal runs +- ci: Remove kernel module build that slipped in +- ci: Actually upload trace artifacts to MinIO for baremetal +- ci: Use a rootfs tarball for NFS root, instead of a ramdisk (for LAVA) + +Tony Wasserka (4): + +- nir/lower_idiv: Port recent LLVM fixes to emit_udiv +- radv: Fix various non-critical integer overflows +- aco: Fix integer overflows when emitting parallel copies during RA +- amd/common: Fix various non-critical integer overflows + +Vinson Lee (25): + +- freedreno: Add missing break statement. +- llvmpipe: Fix variable name. +- r600/sfn: Initialize VertexStageExportForGS m_num_clip_dist member variable. +- panfrost: Ensure final.no_colour is initialized. +- r600/sfn: Use correct setter method. +- freedreno: Add missing va_end. +- pan/bi: Initialize struct fma_op_info member extended. +- zink: Check fopen result. +- etnaviv: Fix memory leak on error path. +- panfrost: Fix printf format specifier. +- r300g: Remove extra printf format specifiers. +- vdpau: Fix wrong calloc sizeof argument. +- mesa: Fix NetBSD compiler macro. +- Switch from cElementTree to ElementTree. +- intel/genxml: Migrate from deprecated xml.etree.ElementTree getchildren. +- rbug: Fix rbug_delete_vs_state lock acquisition. +- nir: Add nir_lower_clip_disable.c to SCons build. +- util: Fix SCons build. +- util: Fix memory leaks in unit test. +- meson: Fix lmsensors warning message. +- vulkan: Fix memory leaks. +- freedreno: Fix file descriptor leak. +- svga: Fix unused printf argument. +- freedreno: Check file descriptor before write. +- panfrost: Delete debug allocated syncobj. 
+ +Yevhenii Kharchenko (1): + +- st/mesa: fix corrupted texture levels, when adding more levels than expected + +Yevhenii Kolesnikov (5): + +- glsl: subroutine signatures must match exactly +- nvir: don't use designated initialisers in C++ code +- intel/compiler: don't propagate cmp to add if add is saturated +- mesa: change error code of *TextureSubImage* for incorrect target +- nine: fix incorrect calculation of layer count for 3D textures + +jzielins (2): + +- gallium/swr: Fix compilation warnings +- swr: Bump maximum 2D texture size to 16kx16k + +mmenzyns (1): + +- nv50: Clear nv50_ir_prog_info of dead and codegen specific variables