path: root/src/amd/compiler/aco_ir.h
author Timur Kristóf <timur.kristof@gmail.com> 2023-01-22 19:50:46 +0100
committer Timur Kristóf <timur.kristof@gmail.com> 2023-02-18 21:16:58 +0100
commit 2c40215ab9d20890efb88e7b3e26ca909d7fd74b
tree d9145f2ece46d97ca33810b58efa39f698613581 /src/amd/compiler/aco_ir.h
parent 616d595d18d54c8e39e64386a5a2ac2be8e5fef9
aco/optimizer: Change v_cmp with subgroup invocation to constant.

When a shader has a comparison with the subgroup invocation id,
we can use a constant instead, saving a VALU instruction.

When the constant can't be represented as a 64-bit literal,
use the s_bfm_b64 instruction to generate it instead,
which is still a win.

Fossil DB stats on GFX11:

Totals from 300 (0.22% of 134913) affected shaders:
CodeSize: 2223052 -> 2214336 (-0.39%); split: -0.43%, +0.04%
Instrs: 430216 -> 429882 (-0.08%); split: -0.14%, +0.06%
Latency: 5881180 -> 5878181 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 731846 -> 729293 (-0.35%)
Copies: 31662 -> 31847 (+0.58%); split: -0.03%, +0.61%
Branches: 8241 -> 8100 (-1.71%)
PreVGPRs: 15788 -> 15786 (-0.01%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20843>
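
The idea behind the commit message can be illustrated with a small standalone model. Comparing the subgroup invocation id against a constant gives a lane mask that is known at compile time, so it can be emitted as a constant instead of a per-lane VALU compare. The sketch below is not ACO code. The helper names (`bfm64`, `lane_lt_mask`, `lane_eq_mask`, `reference_lt_mask`) are hypothetical, and `bfm64` models s_bfm_b64 as `((1 << size) - 1) << offset` with 6-bit operands, which is an assumption based on the AMD ISA documentation.

```cpp
#include <cstdint>

// Assumed model of s_bfm_b64: build a bitfield mask of `size` ones
// starting at bit `offset`. Both operands are truncated to 6 bits.
inline uint64_t bfm64(unsigned size, unsigned offset)
{
   size &= 63;
   offset &= 63;
   return ((uint64_t{1} << size) - 1) << offset;
}

// Hypothetical helper: the mask produced by `lane_id < n`, for n in [0, 63].
inline uint64_t lane_lt_mask(unsigned n)
{
   return bfm64(n, 0);
}

// Hypothetical helper: the mask produced by `lane_id == n`, for n in [0, 63].
inline uint64_t lane_eq_mask(unsigned n)
{
   return bfm64(1, n);
}

// Reference: evaluate `lane_id < n` lane by lane, the way a per-lane
// compare would, and collect the results into a mask.
inline uint64_t reference_lt_mask(unsigned n, unsigned wave_size)
{
   uint64_t mask = 0;
   for (unsigned lane = 0; lane < wave_size; ++lane) {
      if (lane < n)
         mask |= uint64_t{1} << lane;
   }
   return mask;
}
```

In this model, `lane_lt_mask(n)` matches the lane-by-lane reference for every `n` from 0 to 63. The all-lanes-set case of a 64-wide wave cannot be expressed through a 6-bit size operand and would need separate handling.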
Diffstat (limited to 'src/amd/compiler/aco_ir.h')
-rw-r--r-- src/amd/compiler/aco_ir.h | 1
1 files changed, 1 insertions, 0 deletions
diff --git a/src/amd/compiler/aco_ir.h b/src/amd/compiler/aco_ir.h
index d31e53ad300..f57ba4a9268 100644
--- a/src/amd/compiler/aco_ir.h
+++ b/src/amd/compiler/aco_ir.h
@@ -1883,6 +1883,7 @@ bool needs_exec_mask(const Instruction* instr);
 aco_opcode get_ordered(aco_opcode op);
 aco_opcode get_unordered(aco_opcode op);
 aco_opcode get_inverse(aco_opcode op);
+aco_opcode get_swapped(aco_opcode op);
 aco_opcode get_f32_cmp(aco_opcode op);
 aco_opcode get_vcmpx(aco_opcode op);
 unsigned get_cmp_bitsize(aco_opcode op);
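
This hunk only adds the declaration of `get_swapped`; its definition is not part of this view. Judging by the name and its neighbor `get_inverse`, it presumably maps a comparison opcode to the one that gives the same result with its two operands exchanged, so that `N < lane_id` can be treated like `lane_id > N`. The following is a minimal sketch of that idea under that assumption. It uses a hypothetical `CmpOp` enum and hypothetical `swapped` and `eval` helpers, not ACO's `aco_opcode`.

```cpp
#include <cstdint>

// Hypothetical stand-in for comparison opcodes; not ACO's aco_opcode.
enum class CmpOp { lt, le, gt, ge, eq, ne };

// Return the comparison that gives the same result once the two
// operands are exchanged: (a < b) is the same as (b > a).
// Note that this differs from the logical inverse, where !(a < b) is (a >= b).
inline CmpOp swapped(CmpOp op)
{
   switch (op) {
   case CmpOp::lt: return CmpOp::gt;
   case CmpOp::le: return CmpOp::ge;
   case CmpOp::gt: return CmpOp::lt;
   case CmpOp::ge: return CmpOp::le;
   default: return op; // eq and ne are symmetric in their operands
   }
}

// Evaluate a comparison on two unsigned values.
inline bool eval(CmpOp op, uint32_t a, uint32_t b)
{
   switch (op) {
   case CmpOp::lt: return a < b;
   case CmpOp::le: return a <= b;
   case CmpOp::gt: return a > b;
   case CmpOp::ge: return a >= b;
   case CmpOp::eq: return a == b;
   case CmpOp::ne: return a != b;
   }
   return false;
}
```

With a helper like this, an optimization only has to handle the form where the subgroup invocation id is the first operand: when it appears as the second operand, the operands are exchanged and the opcode is replaced by its swapped counterpart.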