path: root/src/gallium/drivers/iris/iris_context.h
Commit message | Author | Age | Files | Lines
* iris: Create, destroy and replace Xe engines | José Roberto de Souza | 2023-04-13 | 1 | -1/+1
  Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22172>
* iris: Store iris_context's priority | José Roberto de Souza | 2023-03-17 | 1 | -0/+7
  This way, when replacing a broken context, we don't need to ask the kernel for the priority of the context being replaced. This will also be necessary for the Xe KMD, as it doesn't have any uAPI to query engine priority.
  While doing that, also take the opportunity to move more code out of iris_bufmgr.c/h that has only one caller.
  Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21965>
* iris: Drop usage of i915 EXEC_OBJECT_WRITE | José Roberto de Souza | 2023-03-15 | 1 | -0/+2
  The only use of this flag is to call iris_use_pinned_bo() with the writable argument; for that we don't need any i915_drm.h-specific type. IRIS_BLORP_RELOC_FLAGS_EXEC_OBJECT_WRITE could have any other value, but it keeps the same value as in i915_drm.h.
  With this we can drop two i915_drm.h includes from generic Iris code.
  Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21887>
* iris: trace frames with u_trace | Lionel Landwerlin | 2023-03-10 | 1 | -0/+4
  Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21648>
* iris: Drop iris_cache_flush_for_render | Nanley Chery | 2023-02-15 | 1 | -3/+0
  Before dropping this function, handle the two callers of this function:
  * The call in iris_blorp.c is redundant. The required cache flushes are already handled by the callers of blorp functions. Delete this.
  * The call in iris_resolve.c is still providing a benefit because it calls iris_emit_buffer_barrier_for internally. Inline the needed barrier.
  Cc: 23.0 <mesa-stable>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21303>
* iris: add restrictions for 3DSTATE_RASTER::AntiAliasingEnable | Tapani Pälli | 2023-01-20 | 1 | -0/+3
  The field must be disabled if any render target has an integer format. Additionally, on Gfx12+ the field must be disabled when num multisamples > 1 or forced multisample count > 1.
  Cc: mesa-stable
  Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7892
  Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20671>
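The restriction stated in the commit above can be read as a small predicate. The following is only a sketch of that rule, not the driver's code: the struct, its field names, and the function name are hypothetical, and the real driver derives these inputs from its own framebuffer and rasterizer state.

```c
#include <stdbool.h>

/* Hypothetical inputs, named for illustration only. */
struct example_aa_inputs {
   bool aa_requested;            /* the API state asked for anti-aliasing */
   bool any_integer_rt;          /* some render target has an integer format */
   int gfx_ver;                  /* hardware generation, e.g. 11 or 12 */
   unsigned num_samples;         /* framebuffer sample count */
   unsigned forced_sample_count; /* forced multisample count */
};

/* Returns the value a driver could program into AntiAliasingEnable
 * under the two restrictions quoted in the commit message. */
static bool
example_antialiasing_enable(const struct example_aa_inputs *in)
{
   if (!in->aa_requested)
      return false;

   /* Rule 1: disabled if any render target has an integer format. */
   if (in->any_integer_rt)
      return false;

   /* Rule 2 (Gfx12+ only): disabled when multisampling is in effect. */
   if (in->gfx_ver >= 12 &&
       (in->num_samples > 1 || in->forced_sample_count > 1))
      return false;

   return true;
}
```

Note that the second rule is gated on the generation, so the same multisampled state still yields an enabled field on an older part in this sketch.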
* iris: Don't flush the render cache for a compute batch | Rohan Garg | 2023-01-20 | 1 | -0/+12
  Make sure we comply with BSpec and ensure that certain flush flags are not set for compute batches.
  Signed-off-by: Rohan Garg <rohan.garg@intel.com>
  Reviewed-by: Francisco Jerez <currojerez@riseup.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15664>
* iris: Update aux state tracking for image views after draws/dispatches | Kenneth Graunke | 2022-12-14 | 1 | -0/+5
  On Tigerlake and later, we enable compression for image views. However, we never actually added any code to update the aux state, which meant that if it ever changed, things would break, badly.
  We managed to avoid catastrophic effects in most cases because of two other issues which papered over the problem: if compression wasn't already enabled for an image, we'd leave it disabled. And, we avoided writing via the CPU to buffers with auxiliary. So in most cases, CCS remained disabled, or got enabled (say by glTexImage()) then stayed on permanently.
  There were still issues, but they managed to remain more hidden than one would expect given the severity of the bug.
  Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>
* iris: Drop batch parameter from iris_update_postdraw_resolve_tracking | Kenneth Graunke | 2022-12-14 | 1 | -2/+1
  Eventually the resolve code started making everything take ice instead of batch, and at some point this ceased to be used. It's always render.
  Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>
* iris: move bindless surface state heap inside the surface state heap | Lionel Landwerlin | 2022-11-19 | 1 | -1/+1
  We're about to make scratch surface states part of the surface state heap. Because those are required to be in the low 26 bits of the surface state heap (we're limited in the bits handed to the CFE_STATE, 3DSTATE_VS, etc. instructions), this change splits the 32-bit surface state heap as follows:
  - 8MB of surface states for scratch
  - 1GB - 8MB of binding tables
  - 3GB of surface states
  That way all of the surfaces are located within a 4GB region visible from STATE_BASE_ADDRESS::SurfaceStateBaseAddress.
  Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>
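The three sizes listed in the commit above add up to exactly 4 GiB. A minimal sketch of that arithmetic follows, assuming the regions are laid out back to back in the listed order starting at offset 0 of the heap; the macro names are hypothetical and do not come from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MIB (UINT64_C(1) << 20)
#define EXAMPLE_GIB (UINT64_C(1) << 30)

/* Region sizes as quoted in the commit message. */
#define EXAMPLE_SCRATCH_SURFACE_SIZE (8 * EXAMPLE_MIB)
#define EXAMPLE_BINDING_TABLE_SIZE   (1 * EXAMPLE_GIB - 8 * EXAMPLE_MIB)
#define EXAMPLE_SURFACE_STATE_SIZE   (3 * EXAMPLE_GIB)

/* Offsets relative to the start of the heap (assumed layout order). */
#define EXAMPLE_SCRATCH_SURFACE_START UINT64_C(0)
#define EXAMPLE_BINDING_TABLE_START \
   (EXAMPLE_SCRATCH_SURFACE_START + EXAMPLE_SCRATCH_SURFACE_SIZE)
#define EXAMPLE_SURFACE_STATE_START \
   (EXAMPLE_BINDING_TABLE_START + EXAMPLE_BINDING_TABLE_SIZE)
#define EXAMPLE_HEAP_END \
   (EXAMPLE_SURFACE_STATE_START + EXAMPLE_SURFACE_STATE_SIZE)

/* The whole heap must fit the 4 GiB window the commit message mentions. */
static_assert(EXAMPLE_HEAP_END == 4 * EXAMPLE_GIB,
              "regions must add up to exactly 4 GiB");
```

Under this assumed ordering the binding tables start at 8 MiB and the general surface states start at the 1 GiB boundary.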
* iris: Destroy batch contexts in a single place | José Roberto de Souza | 2022-11-16 | 1 | -0/+1
  While at it, also move has_engines_context to iris_context; there is no need to replicate this information in every iris_batch.
  Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19687>
* iris: invalidate sysvals if grid dimension changes | Karol Herbst | 2022-11-02 | 1 | -0/+2
  Cc: mesa-stable
  Signed-off-by: Karol Herbst <kherbst@redhat.com>
  Reviewed-by: Emma Anholt <emma@anholt.net>
  Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18581>
* iris: Emit protection & session ID on protected command buffers | Lionel Landwerlin | 2022-10-27 | 1 | -0/+3
  Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
  Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8092>
* iris: Set SamplerCount in shader packets | Jason Ekstrand | 2022-10-10 | 1 | -0/+2
  Signed-off-by: Karol Herbst <kherbst@redhat.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18670>
* iris: bump IRIS_MAX_GLOBAL_BINDINGS to 128 | Karol Herbst | 2022-10-10 | 1 | -1/+1
  Signed-off-by: Karol Herbst <kherbst@redhat.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18670>
* iris: Support up to 128 textures | Jason Ekstrand | 2022-09-22 | 1 | -2/+3
  This is required for OpenCL.
  I kind-of hate this patch. I really don't like GROUP_TEXTURE_LOW64 and GROUP_TEXTURE_HIGH64, but it was either that or I had to make all the used bitsets 128, which would have meant making them BITSET, and that would have been a lot more churn.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16442>
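The trade-off described in the commit above, two 64-bit groups instead of one 128-bit bitset, can be illustrated with a pair of plain 64-bit masks. This is only a sketch of the general idea: the struct and helper names are hypothetical, and the driver's actual GROUP_TEXTURE_LOW64 and GROUP_TEXTURE_HIGH64 handling is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MAX_TEXTURES 128

/* Two 64-bit masks stand in for a single 128-entry "used" set. */
struct example_textures_used {
   uint64_t low64;  /* texture indices 0..63 */
   uint64_t high64; /* texture indices 64..127 */
};

static void
example_mark_texture_used(struct example_textures_used *u, unsigned index)
{
   assert(index < EXAMPLE_MAX_TEXTURES);
   if (index < 64)
      u->low64 |= UINT64_C(1) << index;
   else
      u->high64 |= UINT64_C(1) << (index - 64);
}

static bool
example_texture_is_used(const struct example_textures_used *u, unsigned index)
{
   assert(index < EXAMPLE_MAX_TEXTURES);
   if (index < 64)
      return (u->low64 >> index) & 1;
   return (u->high64 >> (index - 64)) & 1;
}
```

Every access pays one extra branch on the index, which is the kind of awkwardness the commit message complains about, but each half stays an ordinary 64-bit integer.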
* iris: Support up to 64 images | Jason Ekstrand | 2022-09-22 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16442>
* iris: Split max #defines for textures/samplers/images | Jason Ekstrand | 2022-09-22 | 1 | -3/+2
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16442>
* iris: Handle new untyped dataport cache flush PIPE_CONTROL field | Sagar Ghuge | 2022-08-05 | 1 | -0/+2
  Also, while switching to the GPGPU pipeline, make sure to flush the untyped dataport cache.
  The HDC pipeline flush bit must be set if we are flushing the untyped dataport L1 data cache.
  v2: Add utrace support (Lionel)
  Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
* iris: reorder to minimize padding | Mark Janes | 2022-07-29 | 1 | -6/+4
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17749>
* iris: pad all structures used in a shader key | Mark Janes | 2022-07-29 | 1 | -4/+20
  When the compiler pads a data structure, the padded bytes will not be initialized. Shader keys are compared with memcmp, and uninitialized bytes within the structure break this mechanism.
  Explicitly pad the structures with members, so the compiler is forced to initialize them. Add a warning to indicate if a change to alignment in any of the data structures requires additional padding.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17749>
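The memcmp problem described in the commit above can be shown with a toy key. The struct below is hypothetical and much smaller than any real shader key; it only illustrates the technique of turning implicit padding into a named member and guarding the layout with a compile-time check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key. Without the explicit member, the compiler would
 * insert three bytes of unnamed padding after "flag", and those bytes
 * could hold garbage that memcmp would still compare. */
struct example_key {
   uint32_t program_id;
   uint8_t flag;
   uint8_t padding[3]; /* named, so initializers zero it like any member */
};

/* Compile-time guard: if a future field changes the layout so that
 * hidden padding appears, the size no longer matches the member sum. */
static_assert(sizeof(struct example_key) ==
              sizeof(uint32_t) + sizeof(uint8_t) + 3 * sizeof(uint8_t),
              "example_key has implicit padding; add explicit padding");

static bool
example_key_equal(const struct example_key *a, const struct example_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

With designated initializers, members that are not mentioned (here the padding array) are zero-initialized, so two keys built from the same field values compare equal byte for byte.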
* iris: Add FLUSH_HDC to PIPE_CONTROL_CACHE_FLUSH_BITS | Kenneth Graunke | 2022-05-17 | 1 | -0/+1
  This is considered a bottom-of-pipe flush bit.
  Fixes: a969ad1ddfd ("iris: Demote DC flush to HDC flush in cache tracker")
  Reviewed-by: Francisco Jerez <currojerez@riseup.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16565>
* driconf: Add a limit_trig_input_range option | Vadym Shovkoplias | 2022-05-13 | 1 | -0/+1
  With this option enabled, the range of input values for fsin and fcos is limited to [-2*pi : 2*pi] by calculating the remainder after modulo division by 2*pi. This helps to improve calculation precision for large input arguments on Intel.
  v2: Add limit_trig_input_range option to prog_key to update shader cache (Lionel)
  Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16388>
* iris: Extend the cache tracker to handle L3 flushes and invalidates | Kenneth Graunke | 2022-04-13 | 1 | -0/+4
  Most clients are L3-coherent these days. However, there are some notable exceptions, such as push constants, stream output, and command streamer memory reads and writes.
  With the advent of the tile cache, flushing the render or depth caches alone is no longer sufficient for memory to become globally observable. For those, we need to flush the tile cache as well. However, we'd like to avoid that for L3-coherent clients, as it shouldn't be necessary, and is expensive.
  Reviewed-by: Rohan Garg <rohan.garg@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
* iris: Add a separate PIPE_CONTROL_L3_READ_ONLY_CACHE_INVALIDATE bit | Kenneth Graunke | 2022-04-13 | 1 | -0/+1
  This will let us use it without performing a VF cache invalidation, should we want to do that.
  Reviewed-by: Francisco Jerez <currojerez@riseup.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
* util: Rename pipe_debug_callback to util_debug_callback | Yonggang Luo | 2022-04-01 | 1 | -1/+1
  Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15657>
* util: Rename pipe_debug_message to util_debug_message | Yonggang Luo | 2022-04-01 | 1 | -1/+1
  Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15657>
* iris: have a single border color pool per bufmgr | Paulo Zanoni | 2022-02-11 | 1 | -24/+0
  Have a single border color pool per bufmgr instead of per context. We want to have a single VM shared among every context and the border color pool is the last feature preventing us from having that.
  Previously we had 1024 colors per context but once the buffer was full we just waited for the buffer to be unused and restarted it. After this patch we have 4096 colors for every single context and we can't just flush buffers if they are full, so we simply return black.
  There are many strategies we could try to implement to help alleviate this new 4096 limit, none of which are implemented by this patch:
  - We could just expand the buffer to the full 16MB we can use, allowing 262144 colors.
  - We could use multiple buffers and make the contexts refcount them, so eventually older buffers would reach zero references and be recycled, moving us to a working set maximum from a lifetime maximum.
  - We could also make the border color pool be a standard memzone and then give smaller buffers to each context when they need, so the limit would be in the number of contexts that can use border color pools. This was my first implementation but Ken suggested I switch to the one provided by this patch, which is simpler.
  Keep it like this since border colors don't seem to be used very much and other Mesa drivers such as radeonsi also seem to employ the "return black once we reach the limit" strategy.
  As a last note, we could also move the contents of iris_border_color.c to iris_bufmgr.c in order to avoid breaking some abstractions we have in Iris, like we do with iris_bufmgr_get_border_color_pool(). I can do this in case we want it.
  v2: Switch from standard memzone to a per-screen thing (see above).
  v3: Actually make it per bufmgr. Just making it per screen is not enough, since screens can share the same VM, an example being the gputest benchmark suite.
  v4: Rebase.
  v5: Remove dead code, lock around hash table lookup (Ken).
  v6: Simple rebase.
  v7: Another rebase (for_each_batch).
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12028>
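The "return black once we reach the limit" strategy from the commit above can be sketched as a fixed-size pool with a reserved black entry. Everything here is an assumption made for illustration: the names are hypothetical, black is assumed to sit in slot 0, the 64-byte stride is inferred from the quoted figures (16MB / 262144 colors), and a linear search replaces the locked hash table the version notes mention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_POOL_MAX_COLORS 4096
#define EXAMPLE_COLOR_STRIDE 64 /* bytes per entry, inferred (see above) */
#define EXAMPLE_BLACK_OFFSET 0  /* assumed: slot 0 always holds black */

/* 16 MiB at this stride gives the 262144 colors quoted in the message. */
static_assert((UINT64_C(16) << 20) / EXAMPLE_COLOR_STRIDE == 262144,
              "stride does not match the quoted figures");

struct example_border_color {
   float rgba[4];
};

struct example_border_color_pool {
   struct example_border_color colors[EXAMPLE_POOL_MAX_COLORS];
   unsigned count;
};

static void
example_pool_init(struct example_border_color_pool *pool)
{
   memset(pool, 0, sizeof(*pool));
   pool->count = 1; /* slot 0 is all zeros, used here as the fallback */
}

/* Returns the byte offset of the color inside the pool, adding it if
 * needed. Once the pool is full, unknown colors fall back to black. */
static uint32_t
example_pool_get(struct example_border_color_pool *pool,
                 const struct example_border_color *color)
{
   for (unsigned i = 0; i < pool->count; i++) {
      if (memcmp(&pool->colors[i], color, sizeof(*color)) == 0)
         return i * EXAMPLE_COLOR_STRIDE;
   }

   if (pool->count == EXAMPLE_POOL_MAX_COLORS)
      return EXAMPLE_BLACK_OFFSET;

   pool->colors[pool->count] = *color;
   return pool->count++ * EXAMPLE_COLOR_STRIDE;
}
```

The limit in this sketch is a lifetime maximum, as in the commit message: entries are never recycled, so the fallback becomes permanent once 4096 distinct colors have been seen.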
* iris: Set BLORP_BATCH_USE_{COMPUTE,BLITTER} flags for the target batch | Kenneth Graunke | 2022-01-24 | 1 | -0/+11
  This makes blits, copies, and (non-fast) clears set the appropriate BLORP_BATCH_USE_{COMPUTE,BLITTER} flag if their batch is either IRIS_BATCH_COMPUTE or IRIS_BATCH_BLITTER. We ignore the other operations for now as those don't support compute or blit yet.
  Of course, there is no code to attempt to launch BLORP operations on either the compute or blitter batches yet, but that will come in time.
  Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14687>
* mesa/*: use an internal enum for tessellation primitive types. | Dave Airlie | 2022-01-19 | 1 | -1/+1
  To avoid dragging gl.h into places it has no business being, define the tessellation primitive mode as an enum.
  This has a lot of fallout all over the place.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
* iris: utrace/perfetto support | Lionel Landwerlin | 2022-01-14 | 1 | -0/+5
  v2: Fixup gpu_id computation, use minor of /dev/dri/* % 128 since we don't know whether we get card0 or renderD128 for instance. (Lionel)
  Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1)
  Acked-by: Antonio Caggiano <antonio.caggiano@collabora.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13996>
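The v2 note above boils down to one modulo operation. A tiny sketch follows, assuming the caller has already extracted the device minor number from the /dev/dri node; the function name is hypothetical.

```c
/* Maps a DRM device minor to a GPU id so that a primary node and the
 * matching render node of the same device give the same answer, e.g.
 * card0 (minor 0) and renderD128 (minor 128) both map to 0. */
static unsigned
example_gpu_id_from_minor(unsigned drm_minor)
{
   return drm_minor % 128;
}
```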
* anv,iris: PSS Stall Sync around color fast clears | Nanley Chery | 2022-01-12 | 1 | -0/+1
  Needed for XeHP (see Bspec 47704).
  Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
  Reviewed-by: Rohan Garg <rohan.garg@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14024>
* iris: Program pixel hashing tables on XeHP. | Francisco Jerez | 2022-01-10 | 1 | -0/+3
  Unlike the Gen11 code, this requires us to allocate a pipe_resource for the pixel pipe hashing tables and hold a reference to it from the context, since we need to add it to the validation list of every batch: the tables may be accessed by the hardware at any time after they're specified via 3DSTATE_SLICE_TABLE_STATE_POINTERS.
  Note that this has an effect even for unfused native die platforms, since the pixel pipe hashing tables we intend to program aren't equivalent to the hardware's defaults on such configs.
  Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
* intel: fix INTEL_DEBUG environment variable on 32-bit systems | Marcin Ślusarz | 2021-10-15 | 1 | -1/+1
  INTEL_DEBUG is defined (since 4015e1876a77162e3444eeaa29a0dfbc47efe90e) as:
      #define INTEL_DEBUG __builtin_expect(intel_debug, 0)
  which unfortunately chops off the upper 32 bits from intel_debug on platforms where sizeof(long) != sizeof(uint64_t), because __builtin_expect is defined only for the long type.
  Fix this by changing the definition of INTEL_DEBUG to be a function-like macro with a "flags" argument. The new definition returns 0 or 1 when any of the flags match.
  Most of the changes in this commit were generated using:
      for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
          perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
          perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
      done
  but it didn't handle all cases and required minor cleanups (like removal of round brackets which were not needed anymore).
  Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
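The truncation described in the commit above is easy to reproduce in isolation. The sketch below uses hypothetical names (example_debug, EXAMPLE_DBG, EXAMPLE_FLAG_HIGH) rather than Mesa's real definitions, relies on the GCC/Clang __builtin_expect builtin, and mimics a 32-bit long with an explicit cast so the effect shows up on any platform.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the driver's 64-bit debug mask (hypothetical name). */
static uint64_t example_debug;

/* A flag that lives in the upper 32 bits of the mask. */
#define EXAMPLE_FLAG_HIGH (UINT64_C(1) << 40)

/* Old pattern: the whole mask goes through __builtin_expect, whose
 * argument and result have type long. Where long is 32 bits wide, the
 * upper half is lost before the caller applies "& flag". This helper
 * performs that truncation explicitly to show the consequence. */
static bool
example_old_check_with_32bit_long(uint64_t flags)
{
   uint32_t truncated = (uint32_t)example_debug;
   return (truncated & flags) != 0;
}

/* New pattern: test the flags at full width first, then hand only the
 * resulting 0 or 1 to __builtin_expect as a branch hint. */
#define EXAMPLE_DBG(flags) \
   __builtin_expect((example_debug & (flags)) != 0, 0)
```

Because the comparison happens before the builtin sees the value, the macro yields 0 or 1 regardless of the width of long.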
* iris: Enable geometry distribution | Anuj Phogat | 2021-10-13 | 1 | -0/+1
  Using recommended values based on performance studies across a range of workloads.
  Rework:
  * Always enable geometry distribution
  * Set ListCutIndexEnable if primitive restart is enabled
  * Set distribution mode based on TEEnable
  v2:
  - Flag missing IRIS_DIRTY_VFG bit (Ken)
  Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
  Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
* iris: Move suballocated resources to a dedicated allocation on export | Kenneth Graunke | 2021-10-01 | 1 | -0/+1
  We don't want to export suballocated resources to external consumers, for a variety of reasons. First of all, it would be exporting random other pieces of memory which we may not want those external consumers to have access to. Secondly, external clients wouldn't be aware of what buffers are packed together and busy-tracking implications there. Nor should they be. And those are just the obvious reasons.
  When we allocate a resource with the PIPE_BIND_SHARED flag, indicating that it's going to be used externally, we avoid suballocation.
  However, there are times when the client may suddenly decide to export a texture or buffer, without any prior warning. Since we had no idea this buffer would be exported, we suballocated it. Unfortunately, this means we need to transition it to a dedicated allocation on the fly, by allocating a new buffer and copying the contents over.
  Making things worse, this often happens in DRI hooks that don't have an associated context (which we need to, say, run BLORP commands). We have to create a temporary context for this purpose, perform our blit, then destroy it. The radeonsi driver uses a permanent auxiliary context stored in the screen for this purpose, but we can't do that because it causes circular reference counting. radeonsi doesn't do the reference counting that we do, but also doesn't use u_transfer_helper, so they get lucky in avoiding stale resource->screen pointers. Other drivers don't create an auxiliary context, so they avoid this problem for now.
  For auxiliary data, rather than copying it over bit-for-bit, we simply copy over the underlying data using iris_copy_region (GPU memcpy), and take whatever the resulting aux state is from that operation. Assuming the copy operation compresses, the result will be compressed.
  v2: Stop using a screen->aux_context and just invent one on the fly to avoid circular reference counting issues.
  Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [v1]
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
* iris: Move iris_set_max_shader_compiler_threads and iris_is_parallel_shader_compilation_finished | Ian Romanick | 2021-09-17 | 1 | -0/+1
  There's going to be at least one more shader function set in pipe_screen, so it makes more sense to do it in iris_program.c.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
* iris: Eliminate iris_uncompiled_shader::needs_edge_flag | Ian Romanick | 2021-09-17 | 1 | -2/+0
  Use the flag that was set by nir_lower_passthrough_edgeflags. The lowering passes will soon be moved to a finalize_nir hook, so there won't be any choice.
  Ideally we'd like to eliminate iris_fix_edge_flags completely, and this is a first step.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
* iris: crocus: Use shader_info::is_arb_asm flag | Ian Romanick | 2021-09-17 | 1 | -3/+0
  ...instead of looking for "ARB" in the name of the shader. This matches the behavior of i965. Using "ARB" was added in a1ebac3750e ("iris: Implement ALT mode for ARB_{vertex,fragment}_shader"), but there's no explanation of why that method was used.
  v2: Just use shader_info::is_arb_asm everywhere instead of iris_uncompiled_shader::use_alt_mode. Suggested by Ken.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
* iris: Make sure a bound resource is flushed after iris_dirty_for_history. | Francisco Jerez | 2021-09-02 | 1 | -0/+1
  This is the last step before we can start removing the history flush mechanism: In cases where a dirtied buffer has the potential to be concurrently bound to the pipeline (as indicated by the bind_history mask), flag the "flush" dirty bits corresponding to its binding point. This ensures that the buffer-local memory barriers introduced earlier in this series are executed before the next draw call, which in turn will emit any necessary PIPE_CONTROLs in cases where the buffer is bound through a cache incoherent with the cache that performed the write.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12691>
* iris: Track dirty UBOs per-stage for more targeted flushing. | Francisco Jerez | 2021-09-02 | 1 | -0/+1
  This allows us to skip over individual constant buffer bindings which haven't been changed since the last flush, or which are set to a user buffer, which means they don't require flushing.
  Omitting this commit would lead to the following statistically significant Piglit Draw Overhead regressions:
    107/DrawArrays (16 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.31% x22 -> XXX ±2.55% x21 d=-3.49% ±2.38% p=0.00%
    79/DrawArrays ( 1 VBO| 8 UBO| 8 Tex) w/ 8 UBOs change: XXX ±1.90% x22 -> XXX ±2.25% x21 d=-3.20% ±2.04% p=0.00%
    78/DrawArrays ( 1 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.64% x22 -> XXX ±2.58% x21 d=-2.74% ±2.58% p=0.12%
    45/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.53% x22 -> XXX ±2.29% x21 d=-2.41% ±2.39% p=0.20%
    108/DrawArrays (16 VBO| 8 UBO| 8 Tex) w/ 8 UBOs change: XXX ±2.10% x22 -> XXX ±1.41% x21 d=-2.36% ±1.78% p=0.01%
    16/DrawElements ( 1 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.44% x22 -> XXX ±1.19% x21 d=-2.12% ±1.93% p=0.09%
    46/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ 8 UBOs change: XXX ±2.93% x22 -> XXX ±2.44% x21 d=-1.99% ±2.68% p=1.93%
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12691>
* iris: Use separate dirty bits for UBO and SSBO flushes. | Francisco Jerez | 2021-09-02 | 1 | -1/+7
  This moves UBO+SSBO flushing into a dirty bit separate from the one used for image and sampler views, which saves some CPU overhead in the frequent case where buffers from only one or the other set are updated.
  Omitting this commit would lead to the following statistically significant Piglit Draw Overhead regressions:
    107/DrawArrays (16 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.31% x22 -> XXX ±1.80% x21 d=-24.31% ±1.91% p=0.00%
    78/DrawArrays ( 1 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.64% x22 -> XXX ±2.21% x21 d=-24.13% ±2.22% p=0.00%
    45/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.53% x22 -> XXX ±1.90% x21 d=-23.63% ±2.07% p=0.00%
    16/DrawElements ( 1 VBO| 8 UBO| 8 Tex) w/ 1 UBO change: XXX ±2.44% x22 -> XXX ±1.97% x21 d=-23.23% ±2.04% p=0.00%
    108/DrawArrays (16 VBO| 8 UBO| 8 Tex) w/ 8 UBOs change: XXX ±2.10% x22 -> XXX ±1.50% x21 d=-22.15% ±1.71% p=0.00%
    79/DrawArrays ( 1 VBO| 8 UBO| 8 Tex) w/ 8 UBOs change: XXX ±1.90% x22 -> XXX ±1.70% x21 d=-22.12% ±1.64% p=0.00%
    17/DrawElements ( 1 VBO| 8 UBO| 8 Tex) w/ 8 UBOs change: XXX ±2.85% x22 -> XXX ±1.59% x21 d=-21.03% ±2.22% p=0.00%
    46/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ 8 UBOs change: XXX ±2.93% x22 -> XXX ±1.09% x21 d=-20.62% ±2.18% p=0.00%
    7/DrawElements ( 1 VBO| 8 UBO| 8 Tex) w/ vertex attrib change: XXX ±9.30% x22 -> XXX ±7.02% x21 d=-6.49% ±8.08% p=1.19%
    68/DrawArrays ( 1 VBO| 8 UBO| 8 Tex) w/ shader program change: XXX ±1.60% x22 -> XXX ±1.93% x21 d=-2.23% ±1.75% p=0.01%
    6/DrawElements ( 1 VBO| 8 UBO| 8 Tex) w/ shader program change: XXX ±2.90% x22 -> XXX ±2.71% x21 d=-2.04% ±2.78% p=2.08%
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12691>
* iris: Add separate dirty bit for VBO flushes. | Francisco Jerez | 2021-09-02 | 1 | -0/+1
  Instead of emitting barriers every time IRIS_DIRTY_VERTEX_BUFFERS is flagged, use a separate dirty bit and optimize out the barriers in cases where the same buffer object is re-bound as vertex buffer.
  Omitting this commit would lead to the following statistically significant Piglit Draw Overhead regressions:
    36/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ vertex attrib change: XXX ±7.22% x22 -> XXX±11.09% x21 d=-20.10% ±8.06% p=0.00%
    98/DrawArrays (16 VBO| 8 UBO| 8 Tex) w/ vertex attrib change: XXX ±7.27% x22 -> XXX ±7.70% x21 d=-17.76% ±6.83% p=0.00%
    69/DrawArrays ( 1 VBO| 8 UBO| 8 Tex) w/ vertex attrib change: XXX ±9.94% x22 -> XXX ±8.72% x21 d=-7.46% ±9.08% p=1.02%
    53/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ depth enable change: XXX ±8.34% x22 -> XXX ±6.88% x21 d=-7.30% ±7.45% p=0.26%
    61/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ cull face enable change: XXX±10.22% x22 -> XXX ±8.63% x21 d=-6.75% ±9.23% p=2.11%
    55/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ stencil enable change: XXX ±9.30% x22 -> XXX ±7.25% x21 d=-6.60% ±8.16% p=1.14%
    50/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ viewport change: XXX ±6.48% x22 -> XXX ±5.93% x21 d=-6.58% ±6.04% p=0.09%
    54/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ depth clamp enable change: XXX ±9.95% x22 -> XXX ±7.95% x21 d=-6.50% ±8.81% p=2.02%
    35/DrawElements (16 VBO| 8 UBO| 8 Tex) w/ shader program change: XXX ±7.27% x22 -> XXX ±7.25% x21 d=-5.77% ±7.06% p=1.06%
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12691>
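The optimization in the commit above, a second dirty bit that is skipped when the same buffer is re-bound, might look roughly like this. It is a sketch under assumptions: the context struct, the bit names, and the bind helper are hypothetical, and buffer identity is reduced to pointer comparison.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_VBOS 16

/* Hypothetical dirty bits: one for "vertex buffer state changed", one
 * for "a barrier/flush is needed before the next draw". */
#define EXAMPLE_DIRTY_VERTEX_BUFFERS        (UINT32_C(1) << 0)
#define EXAMPLE_DIRTY_VERTEX_BUFFER_FLUSHES (UINT32_C(1) << 1)

struct example_context {
   const void *vbo[EXAMPLE_MAX_VBOS]; /* currently bound buffer per slot */
   uint32_t dirty;
};

static void
example_bind_vbo(struct example_context *ctx, unsigned slot, const void *bo)
{
   assert(slot < EXAMPLE_MAX_VBOS);

   /* The binding state is always re-emitted... */
   ctx->dirty |= EXAMPLE_DIRTY_VERTEX_BUFFERS;

   /* ...but the flush bit is only raised when the buffer object in this
    * slot actually changes, so re-binding the same buffer is cheap. */
   if (ctx->vbo[slot] != bo) {
      ctx->vbo[slot] = bo;
      ctx->dirty |= EXAMPLE_DIRTY_VERTEX_BUFFER_FLUSHES;
   }
}
```

A draw path would then test the flush bit separately from the state bit and emit barriers only when it is set.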
* gallium: remove vertices_per_patch, add pipe_context::set_patch_vertices | Marek Olšák | 2021-08-21 | 1 | -0/+1
  We would like draw-only display lists to have immutable draw info and this is the only GL non-draw state in pipe_draw_info (not counting view_mask).
  It also allows removing some code from draw_vbo for tessellation.
  Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>
* iris: declare padding for iris_vue_prog_key | Mao, Marc | 2021-08-18 | 1 | -0/+5
  Otherwise with some compilers/environments (Android) padding may contain garbage and memcmp of the key will fail.
  Cc: mesa-stable
  Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12438>
* iris: Split iris_upload_shader in two | Ian Romanick | 2021-07-28 | 1 | -8/+10
  Now the part that uploads the shader and the part that finishes the creation of the shader are separated. Each now has a more reasonable number of parameters.
  Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
* iris: Enable threaded shader compilation | Ian Romanick | 2021-07-28 | 1 | -0/+3
  There are a couple minor things that can be improved:
  1. Eliminate (or reduce) the dynamic allocation of the threaded_compile_job.
  2. For apps like shader-db, improve the case where nr_threads=0. Right now this adds thread switching and mutex overhead.
  3. Other performance improvements? iris_uncompiled_shader::variants has some special properties that make it ripe for replacement with a lockless list. Without gathering some data, it's hard to guess what impact that could have.
  v2: Fix whitespace and formatting issues. Noticed by Ken. s/threaded_compile_job/iris_threaded_compile_job/g. Suggested by Ken.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
* iris: Add the variant to the list as early as possible | Ian Romanick | 2021-07-28 | 1 | -17/+30
  I tried to find a way to break this into some smaller commits, but everything is very intertwined. :(
  When searching the variants list in the iris_uncompiled_shader, add the new variant if it is not found. This will be necessary for threaded shader compilation.
  This conceptually simple change had a bunch of fallout. Much of this was at least conceptually borrowed from radeonsi.
  - Other threads might find a variant in the list before the variant has been compiled. To accommodate this, add a fence. Each thread will wait on the fence in the variant when searching the list.
  - A variant in the list may fail compilation. To accommodate this, add a flag. All paths will examine iris_compiled_shader::compilation_failed before trying to use the variant.
  - The race condition between multiple threads trying to create the same variant at the same time is handled *before* both threads spend the effort to compile the shader. This means that iris_upload_shader cannot change shaders on the caller, so it does not need to return anything.
  v2: Change "found" parameter of find_or_add_variant to "added." This inverts the values returned, and it probably makes uses of the returned value more easily understood. Always set the value in the called function. Suggested by Ken.
  v3: Move shader->compilation_failed check to avoid shader != NULL test. Rearrange some logic and add a comment in iris_update_compiled_tcs. Suggested by Ken. Don't call find_or_add_variant in iris_create_shader_state. See https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229#note_1000843 for more details. Noticed by Ken.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
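The "find, or add if missing" behaviour with an "added" out-parameter, as described in the commit above, can be sketched on a plain linked list. This is deliberately simplified and single-threaded: the types and function name are hypothetical, an integer stands in for the shader key, and the locking and fence the commit message describes are left out and only mentioned in comments.

```c
#include <stdbool.h>
#include <stdlib.h>

struct example_variant {
   struct example_variant *next;
   unsigned key;            /* stand-in for a real shader key */
   bool compilation_failed; /* callers must check this before use */
   /* A real implementation would also carry a fence here that other
    * threads wait on until compilation has finished. */
};

struct example_shader {
   struct example_variant *variants;
   /* A real implementation would protect the list with a lock. */
};

/* Looks up the variant for "key"; if none exists, a new, not yet
 * compiled entry is added to the list. "*added" is always written and
 * tells the caller whether it is now responsible for compiling. */
static struct example_variant *
example_find_or_add_variant(struct example_shader *shader, unsigned key,
                            bool *added)
{
   *added = false;

   for (struct example_variant *v = shader->variants; v; v = v->next) {
      if (v->key == key)
         return v;
   }

   struct example_variant *v = calloc(1, sizeof(*v));
   if (!v)
      return NULL;

   v->key = key;
   v->next = shader->variants;
   shader->variants = v;
   *added = true;
   return v;
}
```

Only the caller that sees added set to true goes on to compile, which is how duplicate work on the same variant is avoided in this sketch.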
* iris: Allocate shader variant in caller of iris_upload_shader | Ian Romanick | 2021-07-28 | 1 | -0/+1
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
* iris: Extract allocation bits from iris_upload_shader to iris_create_shader_variant | Ian Romanick | 2021-07-28 | 1 | -0/+7
  The added assertion in iris_create_shader_variant helped catch a bug in the next commit.
  v2: Drop (unnecessary) initialization of shader->assembly.res when moving to iris_create_shader_variant. Suggested by Ken.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>