path: root/src/compiler/nir/nir_range_analysis.c
Commit message | Author | Age | Files | Lines
* nir: Fix use of alloca() without #include c99_alloca.h | Sil Vilerino | 2023-03-29 | 1 | -0/+1
  Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22150>
* nir/range_analysis: use perform_analysis() in nir_analyze_range() | Rhys Perry | 2023-03-22 | 1 | -104/+170
  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
* nir/range_analysis: use perform_analysis() in nir_unsigned_upper_bound() | Rhys Perry | 2023-03-22 | 1 | -385/+426
  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
* nir/range_analysis: add helpers for limiting stack usage | Rhys Perry | 2023-03-22 | 1 | -0/+82
  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
* nir/range_analysis: add missing masking of shift amounts | Rhys Perry | 2023-03-22 | 1 | -4/+8
  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
  Fixes: 72ac3f60261 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
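The shift-amount fix can be illustrated with a small sketch. NIR shift opcodes only use the low bits of the shift count (the count is taken modulo the bit size), so a bound derived from a constant shift needs the same masking. The helper name `ushr_upper_bound` is hypothetical and the code assumes 32-bit values; it is not the Mesa implementation.

```c
#include <stdint.h>

/* Hypothetical helper: upper bound of (x >> shift) for a 32-bit x whose
 * own upper bound is x_max, when the shift count is a known constant.
 * Only the low 5 bits of the count take effect, so mask it first. */
static uint32_t ushr_upper_bound(uint32_t x_max, uint32_t shift)
{
    return x_max >> (shift & 31u);
}
```

Without the mask, a count of 36 would be treated as "shifts everything out" (and would be undefined behaviour in C), whereas the masked count is 4.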
* nir: Add load_typed_buffer_amd intrinsic. | Timur Kristóf | 2023-03-15 | 1 | -0/+19
  This new intrinsic maps to the MTBUF instruction format on AMD GPUs
  and represents a typed buffer load in NIR.

  Also add an unsigned upper bound for the new intrinsic. Code for that
  ported from aco_instruction_selection_setup.

  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16805>
* nir/range_analysis: fix vectorized phis and intrinsics | Rhys Perry | 2023-03-04 | 1 | -6/+6
  Found by inspection.

  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21288>
* nir: Apply a maximum stack depth to avoid stack overflows. | Bas Nieuwenhuizen | 2023-02-11 | 1 | -15/+37
  A stackless (or at least using allocated memory for stack) version
  might be nice but for now this works around some games compiling large
  shaders and hitting stack overflows.

  CC: mesa-stable
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21231>
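A minimal sketch of the depth-limiting idea described above: a recursive analysis carries its current depth and falls back to the most conservative answer once a limit is reached. The `struct expr` type, the `MAX_DEPTH` value, and the function name are all hypothetical and only illustrate the technique; the actual Mesa change may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 4 /* illustrative limit only */

/* Hypothetical expression node: either a leaf with a known maximum, or
 * an unsigned addition of two sub-expressions. */
struct expr {
    const struct expr *src[2]; /* both NULL for a leaf */
    uint32_t leaf_max;         /* upper bound of a leaf value */
};

/* Recursive upper-bound query that gives up with UINT32_MAX (the most
 * conservative bound) once the recursion is MAX_DEPTH levels deep. */
static uint32_t upper_bound(const struct expr *e, unsigned depth)
{
    if (e->src[0] == NULL)
        return e->leaf_max;
    if (depth >= MAX_DEPTH)
        return UINT32_MAX;

    uint64_t sum = (uint64_t)upper_bound(e->src[0], depth + 1) +
                   (uint64_t)upper_bound(e->src[1], depth + 1);
    return sum > UINT32_MAX ? UINT32_MAX : (uint32_t)sum;
}
```

Giving up early loses precision for very deep expression trees but keeps native stack use bounded, which is the trade-off the commit message describes.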
* nir/range_analysis: unsigned upper bound analysis for b2i | Rhys Perry | 2022-12-06 | 1 | -0/+8
  fossil-db (navi21):
  Totals from 93 (0.07% of 135636) affected shaders:
  Instrs: 133949 -> 133899 (-0.04%); split: -0.05%, +0.01%
  CodeSize: 708124 -> 707528 (-0.08%); split: -0.09%, +0.01%
  Latency: 2451564 -> 2450158 (-0.06%); split: -0.06%, +0.00%
  InvThroughput: 398282 -> 397345 (-0.24%)
  SClause: 4441 -> 4437 (-0.09%); split: -0.18%, +0.09%
  Copies: 7578 -> 7546 (-0.42%); split: -0.55%, +0.13%

  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20117>
* nir/range_analysis: Set higher default maximum for max_workgroup_count | Ian Romanick | 2022-11-19 | 1 | -1/+12
  Fixes: c2a81ebe19f ("nir: Add default unsigned upper bound configuration.")
  Closes: #7676
  Reviewed-by: Karol Herbst <kherbst@redhat.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19835>
* nir/range_analysis: Teach range analysis about fdot opcodes | Ian Romanick | 2022-06-23 | 1 | -0/+31
  This really, really helps on platforms where fabs() isn't free. A
  great many shaders use a * frsq(fabs(fdot(a, a))) to normalize a
  vector. Since the result of the fdot must be non-negative, the fabs
  can be eliminated by an existing algebraic rule.

  shader-db results:

  r300 (run on R420 - X800XL)
  total instructions in shared programs: 1369807 -> 1368550 (-0.09%)
  instructions in affected programs: 59986 -> 58729 (-2.10%)
  helped: 609
  HURT: 0
  total vinst in shared programs: 512899 -> 512861 (<.01%)
  vinst in affected programs: 1522 -> 1484 (-2.50%)
  helped: 36
  HURT: 0
  total sinst in shared programs: 260690 -> 260570 (-0.05%)
  sinst in affected programs: 1419 -> 1299 (-8.46%)
  helped: 120
  HURT: 0
  total consts in shared programs: 957295 -> 957230 (<.01%)
  consts in affected programs: 849 -> 784 (-7.66%)
  helped: 65
  HURT: 0
  LOST: 0
  GAINED: 3

  The 3 gained shaders are all vertex shaders from XCom: Enemy Unknown.
  I'm guessing that game is never going to run on my X800XL. :)

  i915
  total instructions in shared programs: 791121 -> 780843 (-1.30%)
  instructions in affected programs: 220170 -> 209892 (-4.67%)
  helped: 2085
  HURT: 0
  total temps in shared programs: 47765 -> 47766 (<.01%)
  temps in affected programs: 9 -> 10 (11.11%)
  helped: 0
  HURT: 1
  total const in shared programs: 93048 -> 92983 (-0.07%)
  const in affected programs: 784 -> 719 (-8.29%)
  helped: 65
  HURT: 0
  LOST: 0
  GAINED: 36

  Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
  total instructions in shared programs: 16702250 -> 16697908 (-0.03%)
  instructions in affected programs: 119277 -> 114935 (-3.64%)
  helped: 1065
  HURT: 0
  helped stats (abs) min: 1 max: 20 x̄: 4.08 x̃: 4
  helped stats (rel) min: 0.48% max: 10.17% x̄: 3.66% x̃: 3.94%
  95% mean confidence interval for instructions value: -4.26 -3.89
  95% mean confidence interval for instructions %-change: -3.76% -3.56%
  Instructions are helped.

  total cycles in shared programs: 880772068 -> 880734134 (<.01%)
  cycles in affected programs: 2134456 -> 2096522 (-1.78%)
  helped: 941
  HURT: 324
  helped stats (abs) min: 2 max: 2180 x̄: 123.06 x̃: 44
  helped stats (rel) min: 0.04% max: 49.96% x̄: 7.08% x̃: 3.81%
  HURT stats (abs) min: 2 max: 2098 x̄: 240.33 x̃: 35
  HURT stats (rel) min: 0.04% max: 77.07% x̄: 12.34% x̃: 3.00%
  95% mean confidence interval for cycles value: -47.93 -12.04
  95% mean confidence interval for cycles %-change: -2.87% -1.34%
  Cycles are helped.

  No shader-db changes on any other Intel platform.

  Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
  Reviewed-by: Emma Anholt <emma@anholt.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17181>
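The reasoning behind the fdot change can be sketched in plain C: the dot product of a vector with itself is a sum of squares, so for any non-NaN input it is never negative and taking its absolute value changes nothing. The function names below are hypothetical, and the sketch models the arithmetic only, not the NIR pass.

```c
#include <math.h>

/* Dot product of a 3-component vector with itself: a sum of squares. */
static float dot3_self(const float a[3])
{
    return a[0] * a[0] + a[1] * a[1] + a[2] * a[2];
}

/* Returns 1 when fabsf() leaves dot(a, a) unchanged. This is the case
 * for every non-NaN result (a NaN compares unequal to itself). */
static int fabs_is_redundant(const float a[3])
{
    float d = dot3_self(a);
    return fabsf(d) == d;
}
```

This is the property that lets the range analysis report the fdot result as non-negative, which in turn lets the existing algebraic rule drop the fabs.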
* nir: Add upper bound for AMD shader arg intrinsics. | Timur Kristóf | 2022-05-10 | 1 | -0/+7
  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13155>
* nir: Add a helper for setting up a nir_ssa_scalar struct. | Emma Anholt | 2022-03-02 | 1 | -8/+8
  Trivial, but will help users avoid some struct constructions that can
  be awkward in C.

  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
* nir: Constify def parameter to nir_ssa_def_bits_used | Ian Romanick | 2022-02-10 | 1 | -2/+2
  Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
* nir/algebraic: optimize expressions using fmulz/ffmaz | Rhys Perry | 2022-01-20 | 1 | -9/+17
  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>
* nir/unsigned_upper_bound: don't follow 64-bit f2u32() | Rhys Perry | 2022-01-17 | 1 | -1/+1
  Fixes Doom Eternal crash.

  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
  Fixes: 72ac3f60261 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14555>
* nir: Fix local_invocation_index upper bound for non-compute-like stages. | Timur Kristóf | 2021-08-30 | 1 | -1/+9
  The lowered LS and NGG stages use local_invocation_index and they can
  benefit from the unsigned upper bound because they can emit a less
  expensive integer multiplication instruction.

  This was working in the past, but accidentally borked by a refactor.

  Fossil DB changes on Sienna Cichlid:
  Totals from 956 (0.74% of 128647) affected shaders:
  CodeSize: 2354172 -> 2344712 (-0.40%)
  Instrs: 434359 -> 434327 (-0.01%)
  Latency: 1883949 -> 1876814 (-0.38%)
  InvThroughput: 762638 -> 757405 (-0.69%)

  Fossil DB changes on Sienna Cichlid (with NGGC enabled):
  Totals from 57873 (44.99% of 128647) affected shaders:
  CodeSize: 155844192 -> 155607064 (-0.15%)
  Instrs: 29799184 -> 29799152 (-0.00%)
  Latency: 130959764 -> 130814224 (-0.11%); split: -0.11%, +0.00%
  InvThroughput: 21100300 -> 20928635 (-0.81%); split: -0.81%, +0.00%

  Fixes: 8af6766062044167fb3b61950ddbc7d67e4c3e48
  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
* nir: Add unsigned upper bound for extract opcodes. | Timur Kristóf | 2021-08-30 | 1 | -0/+16
  This helps with some cases of extract, such as:
  - Emitting more optimal integer multiplications
  - Better address calculation
  - Possibly others

  Fossil DB results on Sienna Cichlid:
  Totals from 4064 (3.16% of 128647) affected shaders:
  VGPRs: 262040 -> 262032 (-0.00%)
  CodeSize: 28856648 -> 28811892 (-0.16%); split: -0.18%, +0.02%
  Instrs: 5370279 -> 5367827 (-0.05%); split: -0.08%, +0.04%
  Latency: 74230112 -> 74016671 (-0.29%); split: -0.29%, +0.01%
  InvThroughput: 12082532 -> 12036365 (-0.38%); split: -0.39%, +0.01%
  VClause: 108506 -> 108721 (+0.20%); split: -0.03%, +0.22%
  SClause: 217731 -> 216602 (-0.52%); split: -0.67%, +0.15%
  Copies: 265689 -> 270811 (+1.93%); split: -0.26%, +2.19%
  PreSGPRs: 201982 -> 204907 (+1.45%); split: -0.01%, +1.46%
  PreVGPRs: 236099 -> 236079 (-0.01%)

  Fossil DB results on Sienna Cichlid with NGGC enabled:
  Totals from 60375 (46.93% of 128647) affected shaders:
  VGPRs: 2212576 -> 2212568 (-0.00%)
  CodeSize: 180870420 -> 179684816 (-0.66%); split: -0.66%, +0.00%
  Instrs: 34386715 -> 34213682 (-0.50%); split: -0.51%, +0.01%
  Latency: 199676290 -> 198987998 (-0.34%); split: -0.35%, +0.00%
  InvThroughput: 32288299 -> 31736433 (-1.71%); split: -1.71%, +0.00%
  VClause: 621521 -> 621743 (+0.04%); split: -0.00%, +0.04%
  SClause: 900447 -> 899392 (-0.12%); split: -0.16%, +0.04%
  Copies: 3439529 -> 3445305 (+0.17%); split: -0.02%, +0.19%
  PreSGPRs: 2216297 -> 2219220 (+0.13%); split: -0.00%, +0.13%
  PreVGPRs: 1842887 -> 1842867 (-0.00%)

  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
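Why an extract has a useful upper bound can be shown with a small model. An unsigned extract selects an 8-bit or 16-bit field of its source, so the result can exceed neither the field maximum nor the source's own bound. The functions below are hypothetical C models written for illustration; they are not the Mesa code.

```c
#include <stdint.h>

/* Model of an unsigned byte extract: select byte `index` (0..3) of x. */
static uint32_t extract_u8(uint32_t x, unsigned index)
{
    return (x >> (8u * (index & 3u))) & 0xffu;
}

/* Model of an unsigned 16-bit extract: select half `index` (0..1) of x. */
static uint32_t extract_u16(uint32_t x, unsigned index)
{
    return (x >> (16u * (index & 1u))) & 0xffffu;
}

/* Hypothetical bound helper: the smaller of the source's upper bound
 * and the largest value the extracted field can hold. */
static uint32_t extract_upper_bound(uint32_t src_max, uint32_t field_max)
{
    return src_max < field_max ? src_max : field_max;
}
```

A consumer that knows both operands of a multiplication are at most 0xff, for example, can pick a narrower multiply, which is the kind of gain the commit message lists.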
* amd: Add extra source to the mbcnt_amd NIR intrinsic. | Timur Kristóf | 2021-06-09 | 1 | -1/+10
  The v_mbcnt instructions can take an extra source that they add to the
  result. This is not exposed in SPIR-V but we now expose it in NIR.

  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
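A rough scalar model of a masked bit count with the extra addend, assuming the usual meaning of mbcnt (count the set mask bits that belong to lanes below the current one). The function name and the explicit `lane` parameter are assumptions made for this sketch; on hardware the lane index is implicit.

```c
#include <stdint.h>

/* Hypothetical model: count the bits of `mask` below position `lane`,
 * then add the extra source `add` to the count. */
static uint32_t mbcnt_model(uint64_t mask, unsigned lane, uint32_t add)
{
    uint64_t below = lane >= 64 ? ~0ull : ((1ull << lane) - 1ull);
    uint64_t m = mask & below;
    uint32_t count = 0;

    while (m) {
        m &= m - 1; /* clear the lowest set bit */
        count++;
    }
    return count + add;
}
```

Under this model the count alone is at most the lane index, so any upper bound for the intrinsic also has to account for the bound of the added source.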
* nir: Add nir_op_sad_u8x4 which corresponds to AMD's v_sad_u8. | Timur Kristóf | 2021-06-09 | 1 | -0/+3
  NIR currently doesn't have any intrinsics for a horizontal packed add,
  so this one is modeled after AMD's v_sad_u8.

  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
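For readers unfamiliar with the operation, here is a scalar sketch of a packed sum of absolute differences over four bytes. It assumes the commonly described v_sad_u8 behaviour (per-byte absolute differences summed and added to an accumulator operand); the function name and the exact operand order are assumptions, not taken from the commit.

```c
#include <stdint.h>

/* Hypothetical model: sum of |a.byte[i] - b.byte[i]| for i = 0..3,
 * added to the accumulator `acc`. */
static uint32_t sad_u8x4_model(uint32_t a, uint32_t b, uint32_t acc)
{
    uint32_t sum = acc;

    for (unsigned i = 0; i < 4; i++) {
        uint32_t x = (a >> (8u * i)) & 0xffu;
        uint32_t y = (b >> (8u * i)) & 0xffu;
        sum += x > y ? x - y : y - x;
    }
    return sum;
}
```

Under this model the four differences contribute at most 4 * 255 = 1020, which is the sort of fact an unsigned upper bound analysis can exploit.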
* nir: Move workgroup_size and workgroup_variable_size into common shader_info | Caio Marcelo de Oliveira Filho | 2021-06-08 | 1 | -13/+13
  Move it out of the "cs" sub-struct, since these will be used for other
  shader stages in the future.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>
* nir: Rename WORK_GROUP (and similar) to WORKGROUP | Caio Marcelo de Oliveira Filho | 2021-06-07 | 1 | -18/+18
  Be consistent with other usages in Vulkan and SPIR-V, and the recently
  added workgroup_size field.

  Acked-by: Emma Anholt <emma@anholt.net>
  Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  Acked-by: Timur Kristóf <timur.kristof@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
* compiler: Rename local_size to workgroup_size | Caio Marcelo de Oliveira Filho | 2021-06-07 | 1 | -12/+12
  Acked-by: Emma Anholt <emma@anholt.net>
  Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  Acked-by: Timur Kristóf <timur.kristof@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
* nir/unsigned_upper_bound: don't require dominance metadata | Rhys Perry | 2021-06-04 | 1 | -8/+2
  Instead, determine if it's a merge or loop exit phi.

  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9808>
* nir: Support upper bound of unsigned bit size conversions. | Timur Kristóf | 2021-05-12 | 1 | -0/+15
  These allow us to generate slightly better code in some cases, e.g.
  multiplications in ACO.

  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
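A sketch of how an upper bound can be carried through an unsigned bit-size conversion: widening leaves the bound unchanged, while narrowing clamps it to the largest value the destination can hold. The helper name `u2u_upper_bound` is hypothetical and the sketch is limited to destinations of at most 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: upper bound of an unsigned conversion to
 * `dst_bits` bits, given the source's upper bound `src_max`. */
static uint32_t u2u_upper_bound(uint32_t src_max, unsigned dst_bits)
{
    uint32_t dst_max = dst_bits >= 32 ? UINT32_MAX
                                      : (1u << dst_bits) - 1u;
    return src_max < dst_max ? src_max : dst_max;
}
```

For example, a value known to be at most 100 keeps that bound after conversion to 16 bits, while a value bounded by 70000 is only known to be at most 65535 afterwards.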
* nir: Support upper bound of subgroup_id/num_subgroups for non-compute. | Timur Kristóf | 2021-05-12 | 1 | -2/+2
  These intrinsics will be used when lowering NGG shaders, including
  currently supported stages like VS, TES, GS and also by mesh shaders
  in the future.

  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
* nir: Add tessellation related AMD-specific intrinsics. | Timur Kristóf | 2021-03-17 | 1 | -0/+5
  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
* nir: Add default unsigned upper bound configuration. | Timur Kristóf | 2021-03-17 | 1 | -0/+17
  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
* nir: Add unsigned upper bound for TCS load_invocation_id. | Timur Kristóf | 2021-03-17 | 1 | -0/+6
  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
* nir: Fix unsigned upper bound of local_invocation_index for non-CS stages. | Timur Kristóf | 2021-03-17 | 1 | -1/+2
  Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
* nir/range_analysis: Simplify analysis of bcsel | Ian Romanick | 2021-03-11 | 1 | -56/+1
  union_ranges was previously guarded by 'ifndef NDEBUG'. After removing
  that, I noticed that the two tables were identical.

  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
* nir/range_analysis: Fix analysis of fmin, fmax, or fsat with NaN source | Ian Romanick | 2021-03-11 | 1 | -5/+33
  Recall that when either value is NaN, fmax will pick the other value.
  This means the result range of the fmax will either be the "ideal"
  result range (calculated above) or the range of the non-NaN value.

  Previously, something like fmax({gt_zero}, {lt_zero, is_a_number})
  would return a range of gt_zero. However, if the "gt_zero" parameter
  is NaN, the actual result will be the "lt_zero" parameter.

  This analysis depends on the is_a_number analysis also added in this
  MR. Assuming this doesn't cause any unforeseen problems, I believe we
  should wait a bit, then nominate a subset of the series for the stable
  branches.

  This fixes the piglit tests

  tests/spec/glsl-1.30/execution/range_analysis_fmax_of_nan.shader_test
  tests/spec/glsl-1.30/execution/range_analysis_fmin_of_nan.shader_test

  from https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/463.

  Even with the added fsat fixes,
  range_analysis_fsat_of_nan.shader_test still fails. There are some
  other issues there that will be addressed in later commits (in another
  MR).

  v2: Add fsat fixes. Suggested by Rhys.

  Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

  Shader-db results:

  All Intel platforms had similar results. (Tiger Lake shown)
  total instructions in shared programs: 21049290 -> 21049314 (<.01%)
  instructions in affected programs: 3175 -> 3199 (0.76%)
  helped: 0
  HURT: 17
  HURT stats (abs) min: 1 max: 3 x̄: 1.41 x̃: 1
  HURT stats (rel) min: 0.20% max: 1.89% x̄: 0.97% x̃: 0.92%
  95% mean confidence interval for instructions value: 1.09 1.73
  95% mean confidence interval for instructions %-change: 0.75% 1.19%
  Instructions are HURT.

  total cycles in shared programs: 855136176 -> 855136406 (<.01%)
  cycles in affected programs: 37579 -> 37809 (0.61%)
  helped: 0
  HURT: 17
  HURT stats (abs) min: 12 max: 20 x̄: 13.53 x̃: 14
  HURT stats (rel) min: 0.17% max: 1.13% x̄: 0.79% x̃: 0.91%
  95% mean confidence interval for cycles value: 12.53 14.53
  95% mean confidence interval for cycles %-change: 0.63% 0.94%
  Cycles are HURT.

  Fossil-db results:

  Tiger Lake
  Instructions in all programs: 160901033 -> 160902591 (+0.0%)
  SENDs in all programs: 6812270 -> 6812270 (+0.0%)
  Loops in all programs: 38225 -> 38225 (+0.0%)
  Cycles in all programs: 7430016795 -> 7429003266 (-0.0%)
  Spills in all programs: 192582 -> 192582 (+0.0%)
  Fills in all programs: 304539 -> 304539 (+0.0%)

  Ice Lake
  Instructions in all programs: 145299102 -> 145301634 (+0.0%)
  SENDs in all programs: 6863890 -> 6863890 (+0.0%)
  Loops in all programs: 38219 -> 38219 (+0.0%)
  Cycles in all programs: 8798390846 -> 8798589772 (+0.0%)
  Spills in all programs: 216880 -> 216880 (+0.0%)
  Fills in all programs: 334250 -> 334250 (+0.0%)

  Skylake
  Instructions in all programs: 135889478 -> 135892010 (+0.0%)
  SENDs in all programs: 6802916 -> 6802916 (+0.0%)
  Loops in all programs: 38216 -> 38216 (+0.0%)
  Cycles in all programs: 8442624166 -> 8442597324 (-0.0%)
  Spills in all programs: 194839 -> 194839 (+0.0%)
  Fills in all programs: 301116 -> 301116 (+0.0%)

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
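The NaN behaviour described in this commit can be reproduced with a hand-written model of fmax. The sketch implements the "pick the other value when one operand is NaN" rule directly rather than calling a library function; `fmax_nan_aware` is a hypothetical name used only here.

```c
#include <math.h>

/* Model of fmax as described above: if either operand is NaN, the other
 * operand is returned; otherwise the larger one is. */
static float fmax_nan_aware(float a, float b)
{
    if (a != a) /* a is NaN */
        return b;
    if (b != b) /* b is NaN */
        return a;
    return a > b ? a : b;
}
```

With `a = NAN` (a value whose range claimed "greater than zero") and `b = -1.0f`, the result is -1.0f, so reporting the result as gt_zero would be wrong unless the first operand is also known to be a number.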
* nir/range_analysis: Add "is a number" range analysis tracking | Ian Romanick | 2021-03-11 | 1 | -13/+83
  This commit is necessary to support "nir/range_analysis: Fix analysis
  of fmin and fmax with NaN".

  No shader-db or fossil-db changes on any Intel platform.

  v2: Pack and unpack is_a_number.

  v3: Don't set is_a_number of integer constants. The bit pattern might
  be NaN.

  v4: Update handling of b2i32. intBitsToFloat(int(true)) is
  1.401298464324817e-45. Return a value consistent with that.

  Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
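The v3 and v4 notes both come down to reinterpreting integer bits as a float. A short sketch, assuming IEEE 754 single precision; `int_bits_to_float` is a hypothetical stand-in for GLSL's intBitsToFloat:

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit integer bit pattern as a float. */
static float int_bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The pattern 1 (the integer value of `true`) reads back as the smallest positive denormal, roughly 1.4e-45, so it is positive but not an integral float. A pattern such as 0x7fc00000 reads back as NaN, which is why an integer constant cannot simply be flagged as "a number".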
* nir/range_analysis: Add "is finite" range analysis tracking | Ian Romanick | 2021-03-11 | 1 | -13/+66
  The obvious changes to nir_search_helpers.h are in a separate commit
  to limit the scope of this change. These additions are really only
  needed to support the next commit "nir/range_analysis: Add "is a
  number" range analysis tracking". This reduction in scope is intended
  to increase the suitability for stable branches.

  No shader-db or fossil-db changes on any Intel platform.

  v2: Pack and unpack is_finite.

  v3: Split nir_search_helpers.h changes into a separate commit.

  v4: Remove assertion intended for the next commit. Update is_finite
  comment for fsign. Both noticed by Rhys. Fix is_finite handling for
  load_const vectors. If any element is not finite, set the flag to
  false. This is the same way is_integral is already handled.

  v5: Update handling of b2i32. intBitsToFloat(int(true)) is
  1.401298464324817e-45. Return a value consistent with that.

  Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
* nir/range_analysis: Refactor fsat handling | Ian Romanick | 2021-03-11 | 1 | -8/+11
  This will greatly simplify a later commit. The assert(r.is_integral)
  in the eq_zero case is dropped because I don't think it's useful
  anymore.

  Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
* nir/range_analysis: Handle vectors better in ssa_def_bits_used | Ian Romanick | 2021-02-22 | 1 | -4/+36
  If a query is made of a vector ssa_def (possibly from an intermediate
  result), return all_bits. If a constant source is a vector, swizzle
  the correct component.

  Unit tests were added for the constant vector cases. I don't see a
  great way to make unit tests for the other cases.

  v2: Add a FINISHME comment about u16vec2 hardware.

  Fixes: 96303a59eae ("nir: Add some range analysis for used bits")
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9123>
* nir: Add some range analysis for used bits | Jason Ekstrand | 2021-02-16 | 1 | -0/+166
  This isn't 100% accurate, of course, but it should be good enough for
  what we're about to do with it.

  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>
* nir: fix determining if an addition might overflow for phi sources | Samuel Pitoiset | 2020-12-31 | 1 | -29/+31
  nir_addition_might_overflow() expects the parent instruction to be an
  alu instr but it might be a phi instr. Fix it by assuming that the
  addition might overflow.

  This fixes compiler crashes with Horizon Zero Dawn.

  No fossils-db changes.

  Cc: mesa-stable
  Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8268>
* nir: update fallthrough comments | Pierre-Eric Pelloux-Prayer | 2020-12-01 | 1 | -0/+1
  clang doesn't support /* fallthrough */ so switch to fallthrough
  attribute.

  Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7747>
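A sketch of what switching from a comment to an attribute can look like. The macro name `FALLTHROUGH` and its fallback definition are assumptions made for this example; the attribute itself is the GCC/Clang `__attribute__((fallthrough))` statement attribute.

```c
/* Use the statement attribute where the compiler provides it, and fall
 * back to an empty statement elsewhere. */
#if defined(__has_attribute)
#  if __has_attribute(fallthrough)
#    define FALLTHROUGH __attribute__((fallthrough))
#  endif
#endif
#ifndef FALLTHROUGH
#  define FALLTHROUGH do { } while (0)
#endif

/* Counts how many case bodies run when entering the switch at `level`. */
static int count_steps(int level)
{
    int steps = 0;

    switch (level) {
    case 2:
        steps++;
        FALLTHROUGH;
    case 1:
        steps++;
        FALLTHROUGH;
    default:
        steps++;
    }
    return steps;
}
```

Unlike a comment, the attribute is part of the program text the compiler sees, so implicit-fallthrough warnings can be silenced consistently across compilers that support it.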
* nir/unsigned_upper_bound: decrement num_sources_left before recursing | Rhys Perry | 2020-11-25 | 1 | -1/+2
  Otherwise, search_phi_bcsel() will be called with a buf_size that is
  slightly lower than it has to be.

  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7748>
* nir/unsigned_upper_bound: fix buffer overflow in search_phi_bcsel | Rhys Perry | 2020-11-25 | 1 | -8/+11
  It should only recurse if there's enough space to add the phi sources.

  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Fixes: 72ac3f60261 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7748>
* nir,amd: remove trinary_minmax opcodes | Daniel Schürmann | 2020-08-24 | 1 | -14/+0
  These consist of the variations nir_op_{i|u|f}{min|max|med}3 which are
  either lowered in the backend (LLVM) anyway or can be recombined by
  the backend (ACO).

  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>
* nir: rename nir_op_fne to nir_op_fneu | Karol Herbst | 2020-08-21 | 1 | -1/+1
  It was always fneu, but naming it fne causes confusion from time to
  time. So let's rename it. Later we also want to add other unordered
  comparisons and fne; this is a small preparation for that.

  Signed-off-by: Karol Herbst <kherbst@redhat.com>
  Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
* nir: nir_range_analysis needs to be updated for vec16 | Jesse Natalie | 2020-08-11 | 1 | -1/+4
  Reviewed-by: Daniel Stone <daniels@collabora.com>
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6275>
* nir: Add a find_variable_with_[driver_]location helper | Jason Ekstrand | 2020-07-29 | 1 | -5/+2
  We've hand-rolled this loop in 10 places and those are just the ones I
  found easily.

  Reviewed-by: Eric Anholt <eric@anholt.net>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
* nir: Add nir_foreach_shader_in/out_variable helpers | Jason Ekstrand | 2020-07-29 | 1 | -1/+1
  Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
* nir: add nir_unsigned_upper_bound and nir_addition_might_overflow | Rhys Perry | 2020-07-21 | 1 | -0/+417
  This adds a nir_unsigned_upper_bound() helper which does something
  similar to nir_analyze_range() except it tries to obtain the largest
  possible value instead of its relation to zero.

  It also adds nir_addition_might_overflow(), which uses this helper to
  try to prove that an unsigned addition does not wrap around.

  Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
  Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
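The relationship between the two helpers can be sketched as follows: once an upper bound is known for each operand, an unsigned 32-bit addition is proven not to wrap when the bounds themselves sum to at most UINT32_MAX. The function below is a hypothetical model working on plain integers; the real helpers operate on NIR values and have different signatures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: given upper bounds for both operands, report
 * whether a 32-bit unsigned addition could wrap around. The test is
 * written so that it cannot overflow itself. */
static bool addition_might_overflow(uint32_t a_max, uint32_t b_max)
{
    return a_max > UINT32_MAX - b_max;
}
```

A `false` result is a proof that no wrap-around occurs; a `true` result only means the bounds were too loose to rule it out.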
* nir: replace GCC unroll with an option that works on GCC < 8.0 | Marek Olšák | 2020-02-27 | 1 | -8/+14
  Reviewed-by: Eric Anholt <eric@anholt.net>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3970>
* nir: Make unroll pragma work on clang | Kristian H. Kristensen | 2020-02-04 | 1 | -9/+18
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>
* nir: no-op C99 _Pragma() with MSVC | Brian Paul | 2019-11-23 | 1 | -0/+7
  This fixes a build failure on MSVC.

  BTW, it looks like clang supports _Pragma() but I don't know if it
  understands the "gcc unroll N" directive.

  Signed-off-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
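The three unroll-related commits at the end of this log deal with making a loop-unroll hint portable. A minimal sketch of one way to do that, assuming a hypothetical `UNROLL_HINT` macro: it expands to nothing on MSVC, to `#pragma unroll N` on clang, to `#pragma GCC unroll N` on GCC 8 or newer, and to nothing elsewhere. This is not the macro used in Mesa, and it does not attempt the GCC < 8.0 workaround mentioned above.

```c
#define DO_PRAGMA(x) _Pragma(#x)

#if defined(_MSC_VER)
#  define UNROLL_HINT(n)                        /* no-op on MSVC */
#elif defined(__clang__)
#  define UNROLL_HINT(n) DO_PRAGMA(unroll n)
#elif defined(__GNUC__) && __GNUC__ >= 8
#  define UNROLL_HINT(n) DO_PRAGMA(GCC unroll n)
#else
#  define UNROLL_HINT(n)                        /* no hint available */
#endif

/* The hint must directly precede the loop it applies to. */
static int sum4(const int v[4])
{
    int s = 0;

    UNROLL_HINT(4)
    for (int i = 0; i < 4; i++)
        s += v[i];
    return s;
}
```

The hint only influences code generation; the loop computes the same result whether or not the compiler honours it.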