Commit message (Author, Age, Files, Lines)
* Bump version for rc3 [tag: mesa-19.2.0-rc3] (Dylan Baker, 2019-09-11, 1 file, -1/+1)
|
* radeonsi/gfx10: fix wave occupancy computations (Marek Olšák, 2019-09-10, 4 files, -21/+49)
|
| Cc: 19.2 <mesa-stable@lists.freedesktop.org>
| Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
| (cherry picked from commit d95afd8b9e7f9b3880813203292257bf0ed7babf)
* radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts (Marek Olšák, 2019-09-10, 1 file, -1/+1)
|
| This fixes a crash.
|
| Cc: 19.2 <mesa-stable@lists.freedesktop.org>
| Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
| (cherry picked from commit 28adf0d00c6b5506ed2206b950336bdc568d2247)
* virgl: Fix pipe_resource leaks under multi-sample. (Lepton Wu, 2019-09-10, 1 file, -1/+3)
|
| Fixes: 900a80f9e4f ("virgl: virgl_transfer should own its virgl_resource")
| Signed-off-by: Lepton Wu <lepton@chromium.org>
| Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
| (cherry picked from commit 263136fb5d2646bea718579de272729b2474d31a)
* st/nine: Properly initialize GLSL types for NIR shaders. (Timur Kristóf, 2019-09-09, 1 file, -0/+5)
|
| NIR shaders use GLSL types (note: these live outside libglsl), and nine needs to properly initialize these just like the other state trackers. This fixes an assertion failure when TTN is used.
|
| Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
* Revert "ac/nir: Lower large indirect variables to scratch" (Bas Nieuwenhuizen, 2019-09-09, 1 file, -68/+0)
|
| This reverts commit 74470baebbdacc8fd31c9912eb8c00c0cd102903.
|
| This change introduces some significant performance regressions. We are fixing those on master, but the follow-up work is too large to backport to 19.2.
|
| Fixes: 74470baebbd "ac/nir: Lower large indirect variables to scratch"
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111576
| Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
| Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir/dead_cf: Repair SSA if the pass makes progress (Jason Ekstrand, 2019-09-09, 1 file, -2/+13)
|
| The dead_cf pass calls into the CF manipulation helpers which attempt to keep NIR's SSA form sane. However, when the only break is removed from a loop, dominance gets messed up anyway because the CF SSA clean-up code only looks at phis and doesn't consider the case of code becoming unreachable. One solution to this would be to put the loop into LCSSA form before we modify any of its contents. Another (and the approach taken by this pass) is to just run the repair_ssa pass afterwards because the CF manipulation helpers are smart enough to keep all the use/def stuff sane; they just don't always preserve dominance properties.
|
| While we're here, we clean up some bogus indentation.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
| Cc: mesa-stable@lists.freedesktop.org
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| (cherry picked from commit c832820ce959ae7c2b4971befceae634d800330f)
* nir/repair_ssa: Insert deref casts when needed (Jason Ekstrand, 2019-09-09, 1 file, -2/+29)
|
| Cc: mesa-stable@lists.freedesktop.org
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| (cherry picked from commit 1005272a2b7c744b6ac4d846566359a8ff1b6295)
* nir/repair_ssa: Repair dominance for unreachable blocks (Jason Ekstrand, 2019-09-09, 1 file, -4/+8)
|
| NIR currently assumes that unreachable blocks are trivially dominated by everything. However, when considering well-formed SSA, there is no path from any block to an unreachable block. Therefore, we can break any use-def chains where the use is in an unreachable block. This removes any dependencies on code created by uses in unreachable blocks and lets DCE do a better job of cleaning it up.
|
| Cc: mesa-stable@lists.freedesktop.org
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| (cherry picked from commit a3268599f3c9bb1d92571e15df95750a06114811)
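The reasoning in this commit message can be illustrated with a small, self-contained sketch. It is not NIR's implementation (NIR works from its dominance metadata); the dictionary-based CFG, the block names, and the helper names are all hypothetical and only show the idea that a use sitting in an unreachable block can be dropped.

```python
from collections import deque


def reachable_blocks(cfg, entry):
    """Return the set of blocks reachable from `entry` via successor edges."""
    seen = {entry}
    work = deque([entry])
    while work:
        block = work.popleft()
        for succ in cfg.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen


def block_is_unreachable(cfg, entry, block):
    """A block is unreachable when no path leads to it from the entry block."""
    return block not in reachable_blocks(cfg, entry)


def break_unreachable_uses(uses, cfg, entry):
    """Keep only (def_block, use_block) pairs whose use is in a reachable block."""
    live = reachable_blocks(cfg, entry)
    return [(d, u) for (d, u) in uses if u in live]


# Hypothetical CFG: nothing branches to "dead", so it is unreachable.
cfg = {"entry": ["loop"], "loop": ["loop", "exit"], "dead": ["exit"], "exit": []}
uses = [("entry", "loop"), ("loop", "dead"), ("loop", "exit")]
kept = break_unreachable_uses(uses, cfg, "entry")
```

With the use in `dead` removed, nothing in that block keeps the definition in `loop` alive, which is the situation the message says lets DCE clean up more code.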
* nir: Add a block_is_unreachable helper (Jason Ekstrand, 2019-09-09, 2 files, -0/+15)
|
| Cc: mesa-stable@lists.freedesktop.org
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| (cherry picked from commit f81a2623d82ccad6177fe1fe5b80a6398df29b6e)
* nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block (Jason Ekstrand, 2019-09-09, 1 file, -5/+15)
|
| Cc: mesa-stable@lists.freedesktop.org
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| (cherry picked from commit 517142252f0c63189293c7176efbf490b7ae95ea)
* nir: Handle complex derefs in nir_split_array_vars (Jason Ekstrand, 2019-09-09, 1 file, -2/+5)
|
| We already bail and don't split the vars, but we were passing a NULL to _mesa_hash_table_search, which is not allowed.
|
| Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..."
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| (cherry picked from commit 37cdb7fc4465cba67b220f940404338f6ff98ee1)
* drirc: override minImageCount=2 for gfxbench (Eric Engestrom, 2019-09-09, 1 file, -0/+4)
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110765
| Fixes: 4689e98fe884d9412b72 ("vulkan/wsi: Set X11 minImageCount to 3.")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
| Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit 27339fe9a74bf57b082b7ac657cdf76f3fd00f57)
* radv: add support for vk_x11_override_min_image_count (Eric Engestrom, 2019-09-09, 1 file, -0/+1)
|
| Cc: mesa-stable@lists.freedesktop.org
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit 5eb7d48b5840f33e759ba0da36134883f2a44d9f)
* amd: move adaptive sync to performance section, as it is defined in xmlpool (Eric Engestrom, 2019-09-09, 2 files, -5/+2)
|
| Fixes: 3844ed8d44677588bc29 ("radv: Add adaptive_sync driconfig option and enable it by default.")
| Fixes: e260493f2ab2483e5a55 ("radeonsi: Enable adaptive_sync by default for radeon")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit 4ad99ee9616f86eff96f4840354c64a60e341a6b)
* anv: add support for vk_x11_override_min_image_count (Eric Engestrom, 2019-09-09, 1 file, -0/+3)
|
| Cc: mesa-stable@lists.freedesktop.org
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit 037b5b567f75db2dd264b23a93abbc88305c7551)
* wsi: add minImageCount override (Eric Engestrom, 2019-09-09, 5 files, -3/+27)
|
| Cc: mesa-stable@lists.freedesktop.org
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit a72cdd00abd5f3c18df01acc60bf3b385facfdb6)
* anv: add support for driconf (Eric Engestrom, 2019-09-09, 4 files, -3/+19)
|
| No option is supported yet; this is just the boilerplate.
|
| Cc: mesa-stable@lists.freedesktop.org
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit 4dcb1fff19383ae451f3228e55d3fc987a7ab46d)
* anv: Bump maxComputeWorkgroupSize (Jason Ekstrand, 2019-09-09, 1 file, -4/+6)
|
| Fixes: 9a129510f56f "anv: Bump maxComputeWorkgroupInvocations"
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit 3b1a7e5333335900293935399ce49a67562eafc7)
* nir/lower_explicit_io: Handle 1 bit loads and stores (Caio Marcelo de Oliveira Filho, 2019-09-06, 1 file, -9/+24)
|
| For loads, load a 32-bit value and then convert it to 1-bit. For stores, convert the 1-bit value to 32-bit and then store it. These cases started to appear when we changed Anvil to use derefs for shared memory.
|
| v2: Use `bit_size` in a couple of places we were missing. (Jason)
|     Reassign `value` instead of `src[0]`. (Jason)
|
| Fixes: 024a46a4079 ("anv: use derefs for shared memory access")
| Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
| (cherry picked from commit c0c55bd84f744f9d4d498403f1eea93fafd6cb4b)
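The load and store conversions described above can be sketched as follows. This is only a model: memory is a hypothetical list of 32-bit words, the function names are invented for illustration, and the 0/1 encoding for a stored boolean is an assumption, since the commit message does not name the conversion opcodes it uses.

```python
def store_bool(memory, index, value):
    """Widen a 1-bit value to a 32-bit word (assumed encoding: 0 or 1), then store it."""
    memory[index] = 1 if value else 0


def load_bool(memory, index):
    """Load the full 32-bit word, then narrow it: any non-zero word reads as True."""
    word = memory[index] & 0xFFFFFFFF
    return word != 0


# Hypothetical shared-memory region of four 32-bit words.
shared = [0, 0, 0, 0]
store_bool(shared, 2, True)
```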
* freedreno/a2xx: ir2: fix lowering of instructions after float lowering (Jonathan Marek, 2019-09-06, 1 file, -3/+2)
|
| Some instructions generated by int/bool float lowering need to be lowered by opt_algebraic.
|
| Fixes: 43dbd7d6
| Signed-off-by: Jonathan Marek <jonathan@marek.ca>
| Reviewed-by: Rob Clark <robdclark@chromium.org>
| Reviewed-by: Eric Anholt <eric@anholt.net>
| (cherry picked from commit 3516a90ab4842a6610dc31514809d490bc4add87)
* intel/dri: finish proper glthread (Sergii Romantsov, 2019-09-06, 1 file, -1/+1)
|
| KWin was able to get a NULL context in the call to intelUnbindContext, but _mesa_glthread_finish is not resistant to such a case.
| The case can be reproduced with these steps:
| 1. Create both glx and egl contexts
| 2. Make glx current
| 3. Make egl current
| 4. Reset glx context
| 5. Make egl current
|
| The solution adds proper finishing of the glthread context (the context is taken from the dri-context requested for unbinding, not from the saved current context).
|
| Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87
| Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271
| Fixes: dca36d5516d0 (i965: Implement threaded GL support)
| Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
| (cherry picked from commit 1dce75c1839f08cfa78400367019f998c258eff5)
* broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride. (Dave Stevenson, 2019-09-05, 1 file, -8/+23)
|
| Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride." for v3d.
|
| Allows YUV buffers with a single buffer and plane offsets to be passed in.
|
| Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
| Reviewed-by: Eric Anholt <eric@anholt.net>
| (cherry picked from commit 873b092e9110a0605293db7bc1c5bcb749cf9a28)
* radv: Call nir_propagate_invariant() (Connor Abbott, 2019-09-05, 1 file, -0/+2)
|
| Without this, invariant qualifiers don't do anything. Together with a fix to the game, this fixes flickering in No Man's Sky.
|
| Cc: mesa-stable@lists.freedesktop.org
| Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
| (cherry picked from commit 3f5b541fc8b2aae6e71783b7a9bb8eb2ffa3a74d)
* nir: do not assume that the result of fexp2(a) is always an integral (Samuel Pitoiset, 2019-09-04, 1 file, -0/+1)
|
| It's only correct when 'a' is an integral value greater than or equal to 0.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
| Fixes: 5544b2cbbd2 ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
| Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
| Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
| (cherry picked from commit 966a455bb912cc9fd22580c6cf9b74e27faa4491)
| (conflicts resolved by Dylan Baker)
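A quick numeric check makes the condition concrete. The helper names below are illustrative only, and Python doubles stand in for shader floats; the point is just that 2^a is a whole number for integral a >= 0 and not otherwise.

```python
def exp2(a):
    """2 raised to the power a, computed in floating point."""
    return 2.0 ** a


def is_integral(x):
    """True when x has no fractional part."""
    return float(x).is_integer()


# Two integral, non-negative exponents, one negative integer, one fraction.
cases = [3.0, 0.0, -1.0, 0.5]
results = {a: is_integral(exp2(a)) for a in cases}
```

Here `exp2(-1.0)` is 0.5 and `exp2(0.5)` is about 1.414, so an optimization that treats every fexp2 result as integral would be wrong for both.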
* nir: Add is_not_negative helper function (Dylan Baker, 2019-09-04, 1 file, -0/+6)
|
| This was taken from 636da1243346e4e2a5aaf79bac65850884a9b859, and is needed by the next patch.
* glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX. (Hal Gentz, 2019-09-04, 2 files, -1/+9)
|
| When run in optirun, applications that linked to `libGLX.so` and then proceeded to query Mesa for extension strings caused a SEGV in Mesa. `glXQueryExtensionsString` was calling a chain of functions that eventually led to `__glXQueryServerString`. This function would call `xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`. The latter for some unknown reason returned `NULL`. Passing this `NULL` to `xcb_glx_query_server_string_string_length` would cause a SEGV as the function tried to dereference it.
|
| The reason behind the function returning `NULL` is yet to be determined; however, simply checking that the ptr is not `NULL` resolves this. A similar check has been added to `__glXGetString` for completeness' sake, although it is not immediately necessary.
|
| In addition to that, we stumbled into a similar problem in `AllocAndFetchScreenConfigs`, which tries to access the configs to free them if `__glXQueryServerString` fails. This, of course, SEGVs, because the configs have not yet been allocated. Simply continuing past the configs if their config ptrs are `NULL` resolves this. We also switch to `calloc` to make sure that the config ptrs are `NULL` by default, and not some uninitialized value.
|
| Cc: mesa-stable@lists.freedesktop.org
| Fixes: 24b8a8cfe821 "glx: implement __glXGetString, hide __glXGetStringFromServer"
| Fixes: cb3610e37c4c "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it "
| Reviewed-by: Adam Jackson <ajax@redhat.com>
| Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
| (cherry picked from commit 1591d1fee5016a21477edec0d2eb6b2d24221952)
* iris: Report correct number of planes for planar images (Kenneth Graunke, 2019-09-04, 1 file, -1/+8)
|
| We were only handling the modifiers case and not counting the number of planes in actual planar images.
|
| Fixes Piglit's ext_image_dma_buf_import-export.
|
| Fixes: fc12fd05f56 ("iris: Implement pipe_screen::resource_get_param")
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
| Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
| (cherry picked from commit f8887909c6683986990474b61afd6d4335a69e41)
* nir: fix memleak in error path (Eric Engestrom, 2019-09-04, 1 file, -1/+3)
|
| Fixes: 2cf59861a8128a91bfdd ("nir: Add partial redundancy elimination for compares")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
| Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
| (cherry picked from commit 7659c6197f08587f57f101a88a7e477337ce363c)
* freedreno/drm-shim: fix mem leak (Eric Engestrom, 2019-09-04, 1 file, -3/+4)
|
| Fixes: 494ecef6b42198ab6c3e ("freedreno: Add support for drm-shim.")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Eric Anholt <eric@anholt.net>
| (cherry picked from commit c4969b0a25982505dd784257d7c12f1abe8b2180)
* anv: fix format string in error message (Eric Engestrom, 2019-09-04, 1 file, -1/+1)
|
| Fixes: 9775894f102535a79186 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| (cherry picked from commit 7abf65aedc679069b794fddfa6feafa68d36d06a)
* util/os_file: fix double-close() (Eric Engestrom, 2019-09-04, 1 file, -1/+0)
|
| Fixes: 955c63d3643f30d7db0c ("util/os_file: resize buffer to what was actually needed")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
| (cherry picked from commit 1667360f7d920a35191dc67b7ee120eef95e8788)
* egl: fix deadlock in malloc error path (Eric Engestrom, 2019-09-04, 1 file, -1/+3)
|
| Fixes: cb0980e69aa921af7086 ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
| (cherry picked from commit 43d470404c47d86d1fab93d1345e09375bcf4fb6)
* ttn: fix 64-bit shift on 32-bit `1` (Eric Engestrom, 2019-09-04, 1 file, -1/+1)
|
| Fixes: 4d0b2c7aaac3cf3de5af ("ttn: Update shader->info as we generate code.")
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
| Reviewed-by: Eric Anholt <eric@anholt.net>
| Reviewed-by: Rob Clark <robdclark@gmail.com>
| (cherry picked from commit 3afe9d798aacc0abc6d898dda3360f06517caf8e)
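The class of bug named in this commit title can be modelled in a few lines. In C, shifting a 32-bit `1` by 32 or more bits is undefined behaviour; the sketch below assumes one common hardware outcome (the shift count masked to 5 bits) purely for illustration, and the function names are hypothetical. It does not reproduce the actual line changed in the commit.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def shl32_masked_count(value, count):
    """ASSUMPTION: model of a 32-bit shift on hardware that masks the count to 5 bits.

    C itself leaves a shift by >= 32 on a 32-bit operand undefined.
    """
    return (value << (count & 31)) & MASK32


def shl64(value, count):
    """A 64-bit shift, as obtained in C by widening the operand first."""
    return (value << (count & 63)) & MASK64


bit = 40
wrong = shl32_masked_count(1, bit)  # behaves like 1 << 8
right = shl64(1, bit)               # the intended single bit at position 40
```

Under this model the 32-bit shift sets bit 8 instead of bit 40, so a 64-bit mask built from it would silently describe the wrong slot.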
* vulkan/overlay: bounce image back to present layout (Lionel Landwerlin, 2019-09-04, 1 file, -2/+30)
|
| Once we write the overlay to an image to be presented, we must not forget to put it back into present layout.
|
| Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401
| Cc: <mesa-stable@lists.freedesktop.org>
| Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
| (cherry picked from commit 320b0f66c27407008784da3606e23cb44c70ddf0)
* egl: fix platform selection (Lionel Landwerlin, 2019-09-04, 1 file, -2/+7)
|
| Add missing "device" platform
|
| v2: Add the missing platform (Eric)
|
| Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Reported-by: Jean Hertel <jean.hertel@hotmail.com>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529
| Fixes: d6edccee8d ("egl: add EGL_platform_device support")
| Reviewed-by: Eric Engestrom <eric@engestrom.ch>
| (cherry picked from commit 6775a524004ba15ab281c1391fb24cbf621fe859)
* gallium/auxiliary/indices: consistently apply start only to input (Erik Faye-Lund, 2019-09-04, 1 file, -10/+10)
|
| The majority of these only apply the start argument to the input, but a few of them also apply it to the output array. util_primconvert, the only user of this argument that passes a non-zero start argument, does not expect it to be applied to the output; if it is, it will write outside of allocated memory, leading to VRAM corruption.
|
| The reason this doesn't seem to have been noticed before is that no driver currently uses util_primconvert to convert a primitive type to itself, which is the case where this was broken. But for Zink, this will no longer be true, because we need to eliminate the use of 8-bit index buffers.
|
| Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
| Fixes: 28f3f8d413f ("gallium/auxiliary/indices: add start param")
| Reviewed-by: Rob Clark <robdclark@chromium.org>
| (cherry picked from commit 52af1427c6d248829130ce25b86cd27a31d34633)
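A minimal sketch of the two behaviours the message contrasts, assuming the output buffer holds exactly `count` entries. The function names and list-based buffers are hypothetical and are not the generated translate functions in gallium; Python raises `IndexError` where the C code would write past the allocation.

```python
def translate_start_on_input(in_indices, start, count):
    """Apply `start` only when reading the input; the output is filled from 0."""
    out = [0] * count
    for i in range(count):
        out[i] = in_indices[start + i]
    return out


def translate_start_on_output(in_indices, start, count):
    """Broken variant: `start` also offsets the write, overrunning the output."""
    out = [0] * count
    for i in range(count):
        out[start + i] = in_indices[start + i]
    return out


indices = [10, 11, 12, 13, 14]
converted = translate_start_on_input(indices, start=2, count=3)

try:
    translate_start_on_output(indices, start=2, count=3)
    overran = False
except IndexError:
    overran = True
```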
* travis: Fail build if any command in if statement fails. (Vinson Lee, 2019-09-04, 1 file, -4/+4)
|
| Travis is checking the exit code of the entire if statement.
|
| Fixes: 64ffc289be89 ("travis: add MacOS Scons build")
| Signed-off-by: Vinson Lee <vlee@freedesktop.org>
| Acked-by: Eric Engestrom <eric@engestrom.ch>
| Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
| (cherry picked from commit 029b07b2adbde4466e4776e10cf37ec30aee4f1f)
* swr: Fix build with llvm-9.0 again. (Vinson Lee, 2019-09-04, 3 files, -0/+28)
|
| Commit 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr") unintentionally removed changes for llvm-9.0.
|
| Fixes: 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr")
| Fixes: 5dd9ad157005 ("swr/rasterizer: Better implementation of scatter")
| Signed-off-by: Vinson Lee <vlee@freedesktop.org>
| Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
| (cherry picked from commit 3664a6600eb0efbb4606f0b59730df3088b3b490)
* bump version to 19.2-rc2 [tag: mesa-19.2.0-rc2] (Dylan Baker, 2019-09-04, 1 file, -1/+1)
|
* glsl: replace 'x + (-x)' with constant 0 (Pierre-Eric Pelloux-Prayer, 2019-09-04, 1 file, -0/+12)
|
| This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:
|
|    value -= value
|
| with value being undefined. In this case LLVM replaces the operation with an assignment to NaN.
|
| Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
| Reviewed-by: Marek Olšák <marek.olsak@amd.com>
| (cherry picked from commit 47cc660d9c19572e5ef2dce7c8ae1766a2ac9885)
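For reference, the following sketch shows how `x + (-x)` behaves under ordinary IEEE arithmetic, using Python doubles as a stand-in for shader floats. It only illustrates the arithmetic; it says nothing about how LLVM treats an undefined operand, and the helper name is made up for this example.

```python
import math


def add_neg(x):
    """Evaluate x + (-x) without any algebraic folding."""
    return x + (-x)


finite = add_neg(2.5)
from_nan = add_neg(float("nan"))
from_inf = add_neg(float("inf"))
```

For every finite input the unfolded expression is already 0.0, so folding to the constant changes nothing there; the fold only differs for NaN and infinity, where the unfolded form yields NaN.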
* Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu" (Thong Thai, 2019-09-04, 1 file, -7/+4)
|
| This reverts commit 5a2e65be89d97ed5d7263f0296ea69ae8517187b.
|
| Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL still needs to be emitted by the UMD, or else the driver will hang.
|
| Cc: 19.2 <mesa-stable@lists.freedesktop.org>
| Signed-off-by: Thong Thai <thong.thai@amd.com>
| Reviewed-by: Marek Olšák <marek.olsak@amd.com>
| (cherry picked from commit 2a3a5604076e94445079a0b25aa108ee99b5fcba)
* nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel (Ian Romanick, 2019-09-04, 1 file, -2/+9)
|
| I discovered this while looking at a shader that was hurt by some other work I'm doing. When I examined the changes, I was confused that one instance of a comparison that was used in a discard_if was (incorrectly) eliminated, while another instance used by a bcsel was (correctly) not eliminated. I had to use NIR_PRINT=true to see exactly where things went wrong.
|
| A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and Strike Suit Zero were impacted.
|
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
|
| All Intel platforms had similar results. (Ice Lake shown)
| total instructions in shared programs: 16280659 -> 16281075 (<.01%)
| instructions in affected programs: 21042 -> 21458 (1.98%)
| helped: 0
| HURT: 136
| HURT stats (abs) min: 1 max: 9 x̄: 3.06 x̃: 3
| HURT stats (rel) min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
| 95% mean confidence interval for instructions value: 2.93 3.19
| 95% mean confidence interval for instructions %-change: 2.08% 2.37%
| Instructions are HURT.
|
| total cycles in shared programs: 367168270 -> 367170313 (<.01%)
| cycles in affected programs: 172020 -> 174063 (1.19%)
| helped: 14
| HURT: 111
| helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
| helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
| HURT stats (abs) min: 2 max: 584 x̄: 21.08 x̃: 5
| HURT stats (rel) min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
| 95% mean confidence interval for cycles value: 5.41 27.28
| 95% mean confidence interval for cycles %-change: 0.64% 1.81%
| Cycles are HURT.
|
| (cherry picked from commit 7dba7df5e577b94e009848a2ca3e0b0a41629fe9)
* nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero) (Ian Romanick, 2019-09-04, 1 file, -3/+8)
|
| Found by inspection. I tried really, really hard to make a test case that would trigger this problem, but I was unsuccessful. It's very hard to get an instruction to produce a ne_zero result without ne_zero sources. The most plausible way is using bcsel. That proves problematic because bcsel interprets its sources as integers, so it cannot currently be used to "clean" values for floating point instructions.
|
| No shader-db changes on any Intel platform.
|
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
| (cherry picked from commit 0b4782fccd22b0a01ded1e4cbfe06821bdf19d05)
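Why a (ne_zero, ne_zero) pair cannot yield a ne_zero sum is easy to confirm numerically. The sample values below are arbitrary, and the commit message does not say which range the fix reports instead; this only shows that two non-zero addends can cancel.

```python
# Every sample is non-zero, so each one would be classified as ne_zero.
samples = [-2.0, -1.5, 1.5, 2.0]

# Collect the pairs of non-zero addends whose sum is exactly zero.
zero_sums = [(x, y) for x in samples for y in samples if x + y == 0.0]
```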
* nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero (Ian Romanick, 2019-09-04, 1 file, -31/+22)
|
| Fixes piglit tests (new in piglit!110):
|
| - fs-underflow-fma-compare-zero.shader_test
| - fs-underflow-mul-compare-zero.shader_test
|
| v2: Add back part of comment accidentally deleted. Noticed by Caio. Remove is_not_zero function as it is no longer used.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
| Fixes: fa116ce357b ("nir/range-analysis: Range tracking for ffma and flrp")
| Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
| All Gen7+ platforms** had similar results. (Ice Lake shown)
| total instructions in shared programs: 16278465 -> 16279492 (<.01%)
| instructions in affected programs: 16765 -> 17792 (6.13%)
| helped: 0
| HURT: 23
| HURT stats (abs) min: 7 max: 275 x̄: 44.65 x̃: 8
| HURT stats (rel) min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62%
| 95% mean confidence interval for instructions value: 9.57 79.74
| 95% mean confidence interval for instructions %-change: 1.85% 6.61%
| Instructions are HURT.
|
| total cycles in shared programs: 367135159 -> 367154270 (<.01%)
| cycles in affected programs: 279306 -> 298417 (6.84%)
| helped: 0
| HURT: 23
| HURT stats (abs) min: 13 max: 6029 x̄: 830.91 x̃: 54
| HURT stats (rel) min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49%
| 95% mean confidence interval for cycles value: 100.89 1560.94
| 95% mean confidence interval for cycles %-change: 0.94% 13.71%
| Cycles are HURT.
|
| total spills in shared programs: 8870 -> 8869 (-0.01%)
| spills in affected programs: 19 -> 18 (-5.26%)
| helped: 1
| HURT: 0
|
| total fills in shared programs: 21904 -> 21901 (-0.01%)
| fills in affected programs: 81 -> 78 (-3.70%)
| helped: 1
| HURT: 0
|
| LOST: 0
| GAINED: 1
|
| ** On Broadwell, a shader was hurt for spills / fills instead of helped.
|
| No changes on any earlier platforms.
|
| (cherry picked from commit ef2e235252ea3dbadad79bb48c760bb6c376b97c)
* nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero (Ian Romanick, 2019-09-04, 1 file, -2/+14)
|
| Fixes piglit tests (new in piglit!110):
|
| - fs-underflow-exp2-compare-zero.shader_test
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
| Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
| Most of the shaders affected are, unsurprisingly, in Unigine Heaven.
|
| All Gen6+ platforms had similar results. (Ice Lake shown)
| total instructions in shared programs: 16278207 -> 16278465 (<.01%)
| instructions in affected programs: 11374 -> 11632 (2.27%)
| helped: 0
| HURT: 58
| HURT stats (abs) min: 2 max: 13 x̄: 4.45 x̃: 4
| HURT stats (rel) min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82%
| 95% mean confidence interval for instructions value: 3.77 5.13
| 95% mean confidence interval for instructions %-change: 2.19% 2.64%
| Instructions are HURT.
|
| total cycles in shared programs: 367134284 -> 367135159 (<.01%)
| cycles in affected programs: 81207 -> 82082 (1.08%)
| helped: 17
| HURT: 36
| helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6
| helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78%
| HURT stats (abs) min: 4 max: 235 x̄: 66.97 x̃: 16
| HURT stats (rel) min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09%
| 95% mean confidence interval for cycles value: -20.36 53.38
| 95% mean confidence interval for cycles %-change: -1.08% 4.67%
| Inconclusive result (value mean confidence interval includes 0).
|
| No changes on any earlier platforms.
|
| (cherry picked from commit 33ad2bab4bcb52c0f6be56e2f9cce5f52601a4ea)
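The flush-to-zero effect behind the multiplication and exp2 range adjustments above can be modelled with a short sketch. It assumes 32-bit floats whose denormal results are flushed to zero; the helper names are hypothetical and the model is not taken from the Mesa sources.

```python
import struct

FLT_MIN = 2.0 ** -126  # smallest normal float32 magnitude


def to_f32(x):
    """Round a Python double to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def flush_to_zero(x):
    """Model flush-to-zero: float32 results below FLT_MIN in magnitude become 0."""
    y = to_f32(x)
    return 0.0 if abs(y) < FLT_MIN else y


a = to_f32(1e-20)                          # a normal, non-zero float32
product = a * a                            # about 1e-40, below FLT_MIN
denormal = to_f32(product)                 # representable only as a denormal
flushed_mul = flush_to_zero(product)       # multiplication result after flushing
flushed_exp2 = flush_to_zero(2.0 ** -130)  # exp2 of a very negative exponent
```

Both flushed results are exactly zero even though neither multiplicand is zero and 2^-130 is mathematically positive, which is why a "not zero" or "greater than zero" result range cannot be assumed for these operations.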
* nir/algebraic: Mark some value range analysis-based optimizations imprecise (Ian Romanick, 2019-09-04, 1 file, -9/+13)
|
| This didn't fix bug #111308, but it was found while trying to find the actual cause of that bug.
|
| Fixes piglit tests (new in piglit!110):
|
| - fs-fract-of-NaN.shader_test
| - fs-lt-nan-tautology.shader_test
| - fs-ge-nan-tautology.shader_test
|
| No shader-db changes on any Intel platform.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
| Fixes: b77070e293c ("nir/algebraic: Use value range analysis to eliminate tautological compares")
| Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
| (cherry picked from commit ccb236d1bc6375bdf9bc47550bdfa348ea7369b9)
* iris: Fix partial fast clear checks to account for miplevel. (Kenneth Graunke, 2019-09-04, 1 file, -2/+2)
|
| We enabled fast clears at level > 0, but didn't minify the dimensions when comparing the box size, so we always thought it was a partial clear and as a result never actually enabled any.
|
| This eliminates some slow clears in Civilization VI, but they are mostly during initialization and not the main rendering.
|
| Thanks to Dan Walsh for noticing we had too many slow clears.
|
| Fixes: 393f659ed83 ("iris: Enable fast clears on other miplevels and layers than 0.")
| Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
| (cherry picked from commit 30b9ed92ea423a4857023ca5e2222ae409672fa5)
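The comparison described above can be sketched as follows. The minification rule (halve per level, never below 1) is the usual mipmap definition; the two check functions are simplified, hypothetical stand-ins and ignore the box origin and the other conditions a real driver checks.

```python
def u_minify(value, level):
    """Size of a dimension at the given miplevel: halve per level, minimum 1."""
    return max(1, value >> level)


def covers_level(box_w, box_h, tex_w, tex_h, level):
    """True when the box spans the whole miplevel (the corrected comparison)."""
    return box_w == u_minify(tex_w, level) and box_h == u_minify(tex_h, level)


def covers_level_unminified(box_w, box_h, tex_w, tex_h):
    """Comparison against the level-0 size, as in the behaviour being fixed."""
    return box_w == tex_w and box_h == tex_h


# A 256x256 texture is 64x64 at level 2, so a 64x64 box is a full clear there.
full = covers_level(64, 64, 256, 256, level=2)
misjudged = covers_level_unminified(64, 64, 256, 256)
```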
* intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware (Ian Romanick, 2019-09-04, 1 file, -0/+1)
|
| See the previous commit for the explanation of the Fixes tag.
|
| Hurts 21 shaders in shader-db. All of the hurt shaders are in Unreal Engine 4 tech demos.
|
| Reviewed-by: Matt Turner <mattst88@gmail.com>
| Fixes: 7afa26d4e39 ("nir: Add lowering for nir_op_bitfield_reverse.")
| (cherry picked from commit b418269d7dd576a7c9afd728bf8a883b4da98b30)
* nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled (Ian Romanick, 2019-09-04, 1 file, -1/+1)
|
| This caused a problem on Sandybridge where an open-coded bitfieldReverse() function could be optimized to a nir_op_bitfield_reverse that would generate an unsupported BFREV instruction in the backend. This was encountered in some Unreal4 tech demos in shader-db. The bug was not previously noticed because we don't actually try to run those demos on Sandybridge.
|
| The fixes tag is a bit of a lie. The actual bug was introduced about 26,000 commits earlier in 371c4b3c48f ("nir: Recognize open-coded bitfield_reverse."). Without the NIR lowering pass, the flag needed to avoid the optimization does not exist. Hopefully nobody will care to fix this on an earlier Mesa release.
|
| Reviewed-by: Matt Turner <mattst88@gmail.com>
| Fixes: 7afa26d4e39 ("nir: Add lowering for nir_op_bitfield_reverse.")
| (cherry picked from commit d3fd1c761aab01e06665180ab86c9528c0b285b2)
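For readers unfamiliar with the term, an "open-coded" bitfield reverse is the reversal written out with shifts and masks rather than a single instruction. The sketch below shows the classic five-step form for 32-bit values; whether this is the exact pattern the NIR optimization matches is an assumption, as the commit message does not spell the pattern out.

```python
def bitfield_reverse32(x):
    """Reverse the bit order of a 32-bit value using mask-and-shift swaps."""
    x &= 0xFFFFFFFF
    x = ((x >> 1) & 0x55555555) | ((x & 0x55555555) << 1)   # swap adjacent bits
    x = ((x >> 2) & 0x33333333) | ((x & 0x33333333) << 2)   # swap bit pairs
    x = ((x >> 4) & 0x0F0F0F0F) | ((x & 0x0F0F0F0F) << 4)   # swap nibbles
    x = ((x >> 8) & 0x00FF00FF) | ((x & 0x00FF00FF) << 8)   # swap bytes
    x = (x >> 16) | ((x << 16) & 0xFFFFFFFF)                # swap half-words
    return x


lowest_to_highest = bitfield_reverse32(0x00000001)
round_trip = bitfield_reverse32(bitfield_reverse32(0x12345678))
```

On hardware without a reverse instruction, this expanded form is what must reach the backend, which is why collapsing it into a single opcode is only safe when lowering is not requested.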