| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vl functions moved from radeonsi to gallium/auxiliary/vl have left
android build of radeonsi in broken state.
libmesa_galliumvl static is need to build readeonsi,
gallium_dri building rules are reworked to avoid multiple symbols
and libmesa_galliumvl static dependency is needed in radeonsi.
Here is the changelog:
- android: gallium/auxiliary: add libmesa_galliumvl static
- android: gallium_dri: move libmesa_gallium to static to prevent multiple symbols
- android: radeonsi: fix build after vl refactoring
Fixes the following building error:
external/mesa/src/gallium/drivers/radeonsi/si_uvd.c:47:
error: undefined reference to 'vl_video_buffer_create_as_resource'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)
Fixes: 86e60bc ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This is for state commands like CmdSetViewport that can be used outside of
a renderpass. Accumulating those into draw_cs outside of the renderpass
should have the desired effect.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
| |
Somehow adjusting maxloc based on existing outputs got lost, resulting
in the clipdist varying clobbering the position varying. Causing a
shader that had no position output in freedreno/ir3, which triggers GPU
hangs in neverball.
Fixes: d0f746b6458 ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic to ensure VS and BS inputs are aligned wasn't accounting for
unused inputs in VS. This *usually* doesn't happen, but it seems it
can in the case of ARB programs?
Fixes assert:
```
fd6_program_create: Assertion `bs->inputs[i].regid == vs->inputs[i].regid' failed.
```
Fixes: 882d53d8e36 ("freedreno/ir3+a6xx: same VBO state for draw/binning")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes crashes that were unnoticed in CI because debug_assert() was not
enabled (but become real crashes after the next patch):
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_highp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_lowp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_mediump_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_highp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_lowp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_mediump_geometry
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following programming note shows up in all 3DSTATE_CONSTANT_*
packets:
"The sum of all four read length fields must be less than or equal to
the size of 64."
The backend compiler should guarantee this for us, so let's just add a
check here.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use this new instruction introduced in Gen12. The instruction itself is
smaller, and it also allows us to emit a single instruction to all
stages that have the same push constant buffers (e.g. when they don't
have constant buffers).
There's one restriction to use this instruction, though: the length
field is only 5 bits long, so we need to check whether we can use it,
and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32.
v2:
- Rebased on top of the lasted changes from Jason.
- Added review suggestions by Caio.
- Removed struct push_bos and merged some code into
anv_nir_compute_push_layout().
v3:
- Remove code churn due to gen8+ workaround in
anv_nir_compute_push_layout(). This code has been removed in an earlier
commit, and implemented in cmd_buffer_emit_push_constant().
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Add a helper function to get the push range address. Once we have a
separate function for emitting gen12 push constants, we can use this
helper and avoid duplicating code.
v3: Do not add range->start to the address in gen7 (Caio).
v4: Do not drop range->start from gen7 (Caio, Jason).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Store push_ranges in ascending order, and only "shift" them to the end
of the array during state packet emission.
We don't need this workaround with the new 3DSTATE_CONSTANT_ALL packet.
So instead of applying the workaround here just for GEN < 12 (which
requires and extra loop through all the ranges to figure out if we
should shift them or not), we simply move the whole logic to the state
emission code. At that point, in a later commit, we are already looping
through all of the ranges anyway to check which packet we will be using,
so we might as well implement the workaround there, where it is going to
be used.
v3: Move gen8+ workaround to the state emission code (Caio).
v4: Add explanation of why we moved the workaroudn (Caio).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use this new instruction introduced in Gen12. The instruction itself is
smaller, and it also allows us to emit a single instruction to all
stages that have the same push constant buffers (e.g. when they don't
have constant buffers).
There's one restriction to use this instruction, though: the length
field is only 5 bits long, so we need to check whether we can use it,
and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32.
v2 (Suggestions from Caio):
- use max_length instead of large_buffers.
- remove UNUSED and use #if GEN_GEN >= 12 instead.
- inline "buffers" and drop BITSET_RANGE() usage.
- add assert(n <= max_pointers)
- move emit to outside of the loop.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split into a function the logic to gather the push constant buffers,
which now stores them in struct push_bos. Another function is added to
emit the packet, using data from the push_bos struct.
This will be useful when adding a new function for emitting push
constants for newer platforms.
v2 (Suggestions from Caio):
- rename 'n' -> 'buffer_count'
- remove large_buffers (for now)
- initialize push_bos
- remove assert
- change for() condition (i <= 3 -> i < 4)
v3:
- Add comment about size limit.
- Rework "shift" logic and 'for' loop.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
In blorp, all the push constants are disabled, so we only need to emit a
single 3DSTATE_CONSTANT_ALL with the bitmask for stage update
appropriately set.
v2: Update comment (Caio).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
| |
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
|
|
| |
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
| |
These bits are ignored when clearing so don't bother setting them.
Note: MSAA samples when clearing comes from other registers (tu6_emit_msaa)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
| |
Passes these deqp tests: dEQP-VK.api.image_clearing.core.*attach*single*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
| |
This makes it easier to find the gmem_offset associated with an attachment.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have tiled format modifier merged into linux we can enable tiling.
That should improve overall performance and also workaround broken mipmapping
for linear textures since now we prefer tiled textures.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Patch adds additional linker check for SSO programs to make sure they
are redeclaring built-in blocks as required by the desktop spec.
This fixes following Piglit tests:
arb_separate_shader_objects/linker/pervertex-*
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
|
|
|
|
|
|
|
|
| |
Commit also updates the Piglit quick_gl.txt, list modifications happened
due to following Piglit commits: c248bf201,c acff58ca, 5603e2e60.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
|
|
|
|
|
|
|
|
|
| |
v2: add test
Fixes: ce9205c03bd ('nir: add a load/store vectorization pass')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v2)
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
|
|
|
| |
Previously, it would only work when the ballot size was set to the
lane mask. This patch makes is possible to set the ballot size
to either 32-bit or 64-bit for wave32 mode.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
|
|
|
| |
Currently all usages of exec and vcc are hardcoded to use s2 regclass.
This commit makes it possible to use s1 in wave32 mode and
s2 in wave64 mode.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several places in ACO we use SOP1 or SOP2 instructions to operate over the
exec mask or VCC, and these need to be adapted to the new size in wave32
mode.
This commit adds a way to deal with this problem in aco_builder: the caller
can specify a wave size specific opcode and the builder will translate that
to the correct opcode based on the current wave size.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
|
|
| |
This is relevant because in wave32 mode the v_mbcnt_hi_u32_b32
instruction is superfluous.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
|
|
|
|
|
|
|
|
|
|
|
| |
And only use --process-isolation false for the quick_gl tests.
This will hopefully avoid variance in the test results that we've been
seeing lately. But even if it doesn't, it should at least help narrow
down the cause of the variance.
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
This expression was unused by the macro, probably why it didn't
register in the compilation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a more accurate description of what happens in processing the
OA reports.
Previously we only had a somewhat difficult to parse state machine
tracking the context ID.
What we really only need to do to decide if the delta between 2
reports (r0 & r1) should be accumulated in the query result is :
* whether the r0 is tagged with the context ID relevant to us
* if r0 is not tagged with our context ID and r1 is: does r0 have a
invalid context id? If not then we're in a case where i915 has
resubmitted the same context for execution through the execlist
submission port
v2: Update comment (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minute). So
consider that if the difference is greater than half that wraparound
period, we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.
We were using an 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accumulation happens between 2 reports, it can be between a start/end
report from another context. So only consider updating the hw_id of
the results when it's not already valid and that we have a valid value
to put in there.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
| |
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
| |
My fix wasn't totally correct as pointed out by Marek.
Ported from RadeonSI.
Fixes: deafe4cc587 ("radv/gfx10: fix primitive indices orientation for NGG GS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
|
|
|
|
|
| |
The number of loaded channels should always be > 0 now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|