Commit message | Author | Age | Files | Lines
* radeonsi: fix discard-only fragment shaders (11.1 version) [intel-ci] | Nicolai Hähnle | 2016-02-04 | 1 | -0/+3
    When a fragment shader is used that has no outputs but does conditional
    discard (KILL_IF), all fragments are killed without this patch.

    By comparing various register settings, my conclusion is that the exec
    mask is either not properly forwarded to the DB by NULL exports or ends
    up being unused, at least when there is _only_ a NULL export (the ISA
    documentation claims that NULL exports can be used to override a
    previously exported exec mask).

    Of the various approaches I have tried to work around the problem, this
    one seems to be the least invasive one.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* st/mesa: use the correct address generation functions in st_TexSubImage blit | Nicolai Hähnle | 2016-02-04 | 1 | -5/+5
    We need to tell the address generation functions about the dimensionality
    of the texture to correctly implement the part of Section 3.8.1 (Texture
    Image Specification) of the OpenGL 2.1 specification which says:

        "For the purposes of decoding the texture image, TexImage2D is
        equivalent to calling TexImage3D with corresponding arguments and
        depth of 1, except that
          ...
          * UNPACK SKIP IMAGES is ignored."

    Fixes a low impact bug that was found by chance while browsing the spec
    and extending piglit tests.

    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
    (cherry picked from commit 4a448a63adbbece1d9bddacd9428aad7cc68a628)
* st/omx/dec/h264: fix corruption when scaling matrix present flag set | Leo Liu | 2016-02-04 | 1 | -2/+5
    The scaling list should be filled out with zig zag scan

    v2: integrate zig zag scan for list 4x4 to vl (Christian)
    v3: move list determination out from the loop (Ilia)

    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Leo Liu <leo.liu@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    (cherry picked from commit 6ad2e55a1405ac3757439dae55ed86425bb65806)
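A minimal sketch of the zig zag fill described above, assuming the coded 4x4 scaling list arrives in scan order and that entry k belongs at raster position zigzag[k]. The function names are hypothetical and this is not the vl/omx code itself; it only illustrates the reordering.

```python
def zigzag_order(n=4):
    """Return raster indices of an n x n block in zig zag scan order."""
    order = []
    for d in range(2 * n - 1):            # walk the anti-diagonals
        rows = [r for r in range(n) if 0 <= d - r < n]
        if d % 2 == 0:                    # even diagonals run bottom-left to top-right
            rows.reverse()
        order.extend(r * n + (d - r) for r in rows)
    return order


def fill_scaling_list(coded, n=4):
    """Place coded[k] (scan order) at raster position zigzag_order(n)[k]."""
    out = [0] * (n * n)
    for k, pos in enumerate(zigzag_order(n)):
        out[pos] = coded[k]
    return out


scan = zigzag_order(4)
raster = fill_scaling_list(list(range(16)))
```

Copying the coded list straight into raster order, without this remapping, is the kind of mistake that would show up as corruption only when a stream actually carries a scaling matrix.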
* vl: add zig zag scan for list 4x4 | Leo Liu | 2016-02-04 | 2 | -0/+8
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Leo Liu <leo.liu@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    (cherry picked from commit 4f598f2173c6555a52aad942ce6ea75c65afe21a)
* st/mesa: treat a write as a read for range purposes | Ilia Mirkin | 2016-02-04 | 1 | -1/+4
    We use this logic to detect live ranges and then do plain renaming
    across the whole codebase. As such, to prevent WaW hazards, we have to
    treat a write as if it were also a read.

    For example, the following sequence was observed before this patch:

        13: UIF TEMP[6].xxxx :0
        14:   ADD TEMP[6].x, CONST[6].xxxx, -IN[3].yyyy
        15:   RCP TEMP[7].x, TEMP[3].xxxx
        16:   MUL TEMP[3].x, TEMP[6].xxxx, TEMP[7].xxxx
        17:   ADD TEMP[6].x, CONST[7].xxxx, -IN[3].yyyy
        18:   RCP TEMP[7].x, TEMP[3].xxxx
        19:   MUL TEMP[4].x, TEMP[6].xxxx, TEMP[7].xxxx

    While after this patch it becomes:

        13: UIF TEMP[7].xxxx :0
        14:   ADD TEMP[7].x, CONST[6].xxxx, -IN[3].yyyy
        15:   RCP TEMP[8].x, TEMP[3].xxxx
        16:   MUL TEMP[4].x, TEMP[7].xxxx, TEMP[8].xxxx
        17:   ADD TEMP[7].x, CONST[7].xxxx, -IN[3].yyyy
        18:   RCP TEMP[8].x, TEMP[3].xxxx
        19:   MUL TEMP[5].x, TEMP[7].xxxx, TEMP[8].xxxx

    Most importantly note that in the first example, the second RCP is done
    on the result of the MUL while in the second, the second RCP should have
    the same value as the first. Looking at the GLSL source, it is apparent
    that both of the RCP's should have had the same source.

    Looking at what's going on, the GLSL looks something like

        float tmin_8;
        float tmin_10;
        tmin_10 = tmin_8;
        ... lots of code ...
        tmin_8 = tmpvar_17;
        ... more code that never looks at tmin_8 ...

    And so we end up with a last_read somewhere at the beginning, and a
    first_write somewhere at the bottom. For some reason DCE doesn't remove
    it, but even if that were fixed, DCE doesn't handle 100% of cases, esp
    including loops.

    With the last_read somewhere high up, we overwrite the previously
    correct (and large) last_read with a low one, and then proceed to decide
    to merge all kinds of junk onto this temp. Even if that weren't the
    case, and there were just some writes after the last read, then we might
    still overwrite a merged value with one of those.

    As a result, we should treat a write as a last_read for the purpose of
    determining the live range.

    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Reviewed-by: Dave Airlie <airlied@redhat.com>
    Cc: mesa-stable@lists.freedesktop.org
    (cherry picked from commit 047b91771845453826dcdd0019adc7333348b158)
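The live-range rule above can be sketched roughly as follows. This is an illustrative model only: the instruction encoding, the `live_ranges` name and the `write_is_read` switch are assumptions made for the example, not the state tracker's data structures.

```python
def live_ranges(instructions, write_is_read=True):
    """Compute (first_write, last_read) per temporary.

    instructions: list of (dst, [srcs]) tuples, in program order.
    write_is_read: when True, a write also extends last_read, which is
    the behaviour the commit message argues for.
    """
    first_write, last_read = {}, {}
    for i, (dst, srcs) in enumerate(instructions):
        for s in srcs:
            last_read[s] = i
        if dst is not None:
            first_write.setdefault(dst, i)
            if write_is_read:
                last_read[dst] = max(last_read.get(dst, -1), i)
    temps = set(first_write) | set(last_read)
    return {t: (first_write.get(t, -1), last_read.get(t, -1)) for t in temps}


# Modelled on the GLSL excerpt: tmin_8 is read early and written late.
prog = [
    ("tmin_10", ["tmin_8"]),     # tmin_10 = tmin_8
    ("other", ["tmin_10"]),      # ... lots of code ...
    ("tmin_8", ["tmpvar_17"]),   # tmin_8 = tmpvar_17
]
before = live_ranges(prog, write_is_read=False)["tmin_8"]
after = live_ranges(prog)["tmin_8"]
```

With `write_is_read=False` the range for `tmin_8` ends before it starts, which is the inverted interval that lets unrelated values be merged onto the same temporary.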
* gallium: Add DragonFly support | François Tigeot | 2016-02-04 | 1 | -1/+1
    Cc: mesa-stable@lists.freedesktop.org
    Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
    (cherry picked from commit a48afb92ffda6e149c553ec82a05fee9a17441f8)
* nv50/ir: fix false global CSE on instructions with multiple defs | Ilia Mirkin | 2016-02-04 | 1 | -0/+2
    If an instruction has multiple defs, we have to do a lot more checks to
    make sure that we can move it forward. Among other things, various code
    likes to do

        a, b = tex()
        if ()
          c = a
        else
          c = b

    which means that a single phi node will have results pointing at the
    same instruction. We obviously can't propagate the tex in this case, but
    properly accounting for this situation is tricky. Just don't try for
    instructions with multiple defs.

    This fixes about 20 shaders in shader-db, including the dolphin efb2ram
    shader.

    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: mesa-stable@lists.freedesktop.org
    (cherry picked from commit 3ca941d60ed38800038cd545842e0ed3a69946da)
* nv50,nvc0: fix buffer clearing to respect engine alignment requirements | Ilia Mirkin | 2016-02-04 | 2 | -52/+247
    It appears that the nvidia render engine is quite picky when it comes to
    linear surfaces. It doesn't like non-256-byte aligned offsets, and
    apparently doesn't even do non-256-byte strides.

    This makes arb_clear_buffer_object-unaligned pass on both nv50 and nvc0.

    As a side-effect this also allows RGB32 clears to work via GPU data
    upload instead of synchronizing the buffer to the CPU (nvc0 only).

    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> # tested on GF108, GT215
    Tested-by: Nick Sarnie <commendsarnex@gmail.com> # GK208
    Cc: mesa-stable@lists.freedesktop.org
    (cherry picked from commit 3ca2001b537a2709e7ef60410e7dfad5d38663f4)
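One way to respect a 256-byte alignment requirement is to split a clear into an unaligned head, an aligned middle and an unaligned tail, and to send only the middle through the render engine. The sketch below shows that arithmetic; it is an assumption about the general approach, and `split_clear` is a hypothetical helper, not a function from the nouveau drivers.

```python
ALIGN = 256  # alignment the commit message reports for linear surfaces


def split_clear(offset, size, align=ALIGN):
    """Split [offset, offset + size) into (head, middle, tail).

    Each part is an (offset, size) pair. The middle starts and ends on
    an `align` boundary; head and tail hold the unaligned remainders
    and would need a different clear path.
    """
    end = offset + size
    mid_start = min(end, -(-offset // align) * align)   # round offset up
    mid_end = max(mid_start, end // align * align)      # round end down
    return (
        (offset, mid_start - offset),
        (mid_start, mid_end - mid_start),
        (mid_end, end - mid_end),
    )


head, middle, tail = split_clear(100, 1000)
```

For a range that fits entirely between two boundaries the middle comes back empty, so the caller can skip the render-engine path altogether.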
* nvc0: avoid crashing when there are holes in vertex array bindings | Ilia Mirkin | 2016-02-04 | 1 | -3/+13
    When using the "shared" vertex array configuration strategy, we bind
    each of the buffers as a separate array. However there can be holes in
    such vertex buffer lists, so just emit a disable for those.

    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: mesa-stable@lists.freedesktop.org
    (cherry picked from commit 438d421f8bb3f65402701628c3504c0ad04184c0)
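As a rough illustration of the hole handling, assuming a binding list where an unused slot is represented by None (the command tuples and the function name are invented for the example):

```python
def emit_vertex_arrays(buffers):
    """Emit one command per slot: bind real buffers, disable holes."""
    cmds = []
    for slot, buf in enumerate(buffers):
        if buf is None:
            # A hole: do not dereference it, just turn the array off.
            cmds.append(("disable", slot))
        else:
            cmds.append(("bind", slot, buf))
    return cmds


cmds = emit_vertex_arrays(["positions", None, "colors"])
```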
* vc4: Throttle outstanding rendering after submission. | Eric Anholt | 2016-02-04 | 1 | -0/+9
    Just make sure that after we've submitted, we get to at least 5 (global)
    submits ago before we go on to do more. Prevents up to seconds of lag
    with window movement in X with xcompmgr -c.

    There may be useful tuning to do in the future, but for now this gets us
    usability.

    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>
    (cherry picked from commit 3fba517bdd03551f7c7ff21dfe1896c677cbccda)
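The throttling idea can be sketched as below, assuming sequence numbers start at 1 and that a blocking "wait for seqno" primitive is available. The `Throttle` class and its callback are hypothetical stand-ins for illustration, not the vc4 driver's API.

```python
class Throttle:
    """After each submit, wait until the job `depth` submits ago is done."""

    def __init__(self, wait_seqno, depth=5):
        self.wait_seqno = wait_seqno  # callable that blocks until a seqno completes
        self.depth = depth            # 5 is the value named in the commit message

    def after_submit(self, seqno):
        if seqno > self.depth:
            self.wait_seqno(seqno - self.depth)


waited = []
throttle = Throttle(waited.append)   # record waits instead of blocking
for seqno in range(1, 8):
    throttle.after_submit(seqno)
```

This bounds how far the CPU can run ahead of the GPU: at most `depth` submissions are ever outstanding when new work is queued.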
* vc4: Don't record the seqno of a failed job submit. | Eric Anholt | 2016-02-04 | 1 | -2/+2
    On an error return, the returned seqno will probably be unset, so we'd
    lose track of what we've submitted so far for waiting on in the future.

    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>
    (cherry picked from commit 2a449ce7c961f3269f9a37ddf4fe340fc170c609)
* nv50/ir: fix memory corruption when spilling and redoing RA | Karol Herbst | 2016-02-04 | 1 | -0/+3
    When RA fails, and we spill, we have to clean everything up before doing
    RA again. We were forgetting to reset the hi/lo linked lists - at least
    the hi list is guaranteed to still have pointers to now-deleted RIG
    nodes.

    Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
    Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: mesa-stable@lists.freedesktop.org
    (cherry picked from commit 19ae5de981e014e1b366b4652e14eb1ea0421574)
* i965/bxt: Fix conservative wm thread counts. | Ben Widawsky | 2016-02-04 | 1 | -1/+1
    When setting the conservative thread counts, I halved everything. That
    isn't correct for the wm, which has nothing to do with actual thread
    counts. I suck.

    BXT only has 1 slice, and there is some ambiguity about subslices, so
    just reserve the max possible for now.

    It looks like this might fix:
    piglit.spec.glsl-1_50.execution.variable-indexing.gs-output-array-vec4-index-wr.bxtm64.
    I kind of question why that is, but it is what Jenkins says. Mark is
    currently running some of the other blacklisted tests on this patch.
    (It affects anything requiring scratch space.)

    Cc: mesa-stable <mesa-stable@lists.freedesktop.org>
    Cc: Neil Roberts <neil@linux.intel.com>
    Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
    Acked-by: Kenneth Graunke <kenneth@whitecape.org>
    Tested-by: Mark Janes <mark.a.janes@intel.com>
    (cherry picked from commit a443b5b7320ef0d8a63faf8a7fc6136f97460cea)
* meta: Use internal functions to set texture parameters | Ian Romanick | 2016-02-04 | 4 | -24/+49
    _mesa_texture_parameteriv is used because (the more obvious)
    _mesa_texture_parameteri just stuffs the parameter in an array and calls
    _mesa_texture_parameteriv. This just cuts out the middleman.

    As a side bonus we no longer need to check that ARB_stencil_texturing is
    supported. The test doesn't allow non-supporting implementations to
    avoid any work, and it's redundant with the value-changed test.

    Fixes bug #93717 because the state restore commands at the bottom of
    _mesa_meta_GenerateMipmap no longer depend on the bound state.

    Fixes piglit arb_direct_state_access-generatetexturemipmap with the
    changes recently sent to the piglit mailing list. See the bugzilla entry
    for more info.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93717
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
    (cherry picked from commit 2542871387393e855f6afe6c94d44611eefaf6eb)
* meta/blit: Restore GL_DEPTH_STENCIL_TEXTURE_MODE state for GL_TEXTURE_RECTANGLE | Ian Romanick | 2016-02-04 | 1 | -8/+8
    Commit c246828c added the code to save and restore the stencil texturing
    mode. The restore, however, was erroneously inside the
    'target != GL_TEXTURE_RECTANGLE' block.

    Fixes piglit test
    'arb_stencil_texturing-blit_corrupts_state GL_TEXTURE_RECTANGLE'.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
    (cherry picked from commit 18b0ba340b9229e7afd5e38b5d825fde3a435b63)
* radeonsi: add DCC buffer for sampler views on new CS | Nicolai Hähnle | 2016-02-04 | 1 | -15/+18
    This fixes a VM fault and possible lockup in high memory pressure
    situations.

    Cc: "11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
    (cherry picked from commit 1067e6eb552ac850258e33e4eba689a964581c8a)
* radeonsi: ensure that VGT_GS_MODE is sent when necessary | Nicolai Hähnle | 2016-02-04 | 1 | -8/+21
    Specifically, when the API switches from using a GS to not using a GS
    and then back to using the same GS again, we do not have to re-send all
    the GS state, but we do have to send VGT_GS_MODE. So make VGT_GS_MODE
    consistently be a part of the VS state.

    This fixes a rendering bug in Dolphin, but surely other applications are
    affected as well.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93648
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    (cherry picked from commit 004fcd423011d45f746d571be47062feeea75455)
* radeonsi: extract the VGT_GS_MODE calculation into its own function | Nicolai Hähnle | 2016-02-04 | 1 | -19/+28
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    (cherry picked from commit 9f89bd69df4a32eb1a7ebaf8bbd2c463f1a7882f)
* glsl: always compute proper varying type, irrespective of varying packing | Ilia Mirkin | 2016-02-04 | 1 | -8/+5
    Normally there's a producer and consumer, and the producer var gets
    picked. In both the vertex->gs and tes->gs cases, that's the un-arrayed
    version. In the SSO case, however, there is no producer. So we picked
    the arrayed GS variable, and as a result, used more slots than we
    should. More critically, these slots would also no longer line up with
    the producer's calculation.

    To fix this, we need to fix up the type of the variable based on stage
    no matter what.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93650
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    (cherry picked from commit dac2964f3ebd96d5ac227984ab0cd79c2c3b2a1a)
* glsl: create helper to remove outer vertex index array used by some stages | Timothy Arceri | 2016-02-04 | 1 | -10/+26
    This will be used in the following patch for calculating array sizes
    correctly when reserving explicit varying locations.

    Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
    Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
    (cherry picked from commit 5907a02ab6fbe20b4ba58eb00bf93261129798d5)
* egl/dri2: expose srgb configs when KHR_gl_colorspace is available | Emil Velikov | 2016-02-03 | 1 | -0/+2
    Otherwise the user has no way of using it, and we'll try to access the
    linear one.

    v2:
     - Bail out when KHR_gl_colorspace is missing and srgb is set (Marek)

    Cc: Chih-Wei Huang <cwhuang@android-x86.org>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Fixes: c2c2e9ab604(egl: implement EGL_KHR_gl_colorspace (v2))
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596
    Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
    Tested-by: Mauro Rossi <issor.oruam@gmail.com>
    (cherry picked from commit 54702c2fa1a146f45a1f8c35abe2b529e24b2acf)
* targets/dri: android: use WHOLE static libraries | Emil Velikov | 2016-02-03 | 1 | -1/+3
    By using whole static libraries, the android buildsystem provides a
    whole-archive-like solution. This means that we don't need to worry
    about the order of the static libraries and any reverse, recursive or
    circular dependencies that they have between one another.

    Without this the linker will discard any unused hunks of one library and
    we'll end up with unresolved symbols as those are required by another
    static library. This issue has become more prominent with the
    introduction of pipe-loader.

    Whole static libraries have been used in i915/i965 for a very long time,
    so we might do the same.

    v2:
     - Better commit message (Ilia)
     - Keep external dependencies as [normal] static libs (Mauro)

    Cc: mesa-stable@lists.freedesktop.org
    Cc: Mauro Rossi <issor.oruam@gmail.com>
    Reported-by: Mauro Rossi <issor.oruam@gmail.com>
    Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
    (cherry picked from commit f29a772a7e0ae5113822bcf14eb3bc87477c5fb1)
* i915: correctly parse/set the context flags | Emil Velikov | 2016-02-03 | 1 | -0/+2
    With an earlier commit we've split the flags parsing to a separate
    function, but forgot to update all the dri modules to use it.

    Noticed when we enabled KHR_debug for every dri module - fdo#93048

    Fixes: 38366c0c6e7 "dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context"
    Cc: Mark Janes <mark.a.janes@intel.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Cc: Kristian Høgsberg <krh@bitplanet.net>
    Cc: Ian Romanick <ian.d.romanick@intel.com>
    Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
    Reviewed-by: Mark Janes <mark.a.janes@intel.com>
    Tested-by: Mark Janes <mark.a.janes@intel.com>
    (cherry picked from commit 72fda2b710d864d23aec1e8f959147d05c5ff3f3)
* pipe-loader: Fix PATH_MAX define on MSVC. | Jose Fonseca | 2016-01-22 | 1 | -0/+5
    (cherry picked from commit 4befd82a649e926e64bc2c17cf362a84d5be42e6)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93628
    Nominated-by: Emil Velikov <emil.l.velikov@gmail.com>
* scons: Conditionally use DRM module on pipe-loader. | Jose Fonseca | 2016-01-22 | 1 | -5/+4
    Fixes non Linux builds.

    Trivial.

    (cherry picked from commit 02afbd247620bd51a5b1661ced9b01a865136484)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93628
    Nominated-by: Emil Velikov <emil.l.velikov@gmail.com>
* r600g: don't leak driver const buffers | Grazvydas Ignotas | 2016-01-22 | 1 | -0/+6
    The buffers are referenced from r600_update_driver_const_buffers() ->
    r600_set_constant_buffer() -> u_upload_data(), but nothing ever releases
    the reference. Similar case with driver_consts.

    Found using valgrind.

    Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
    Cc: <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
    [Emil Velikov: resolve trivial conflicts]
    (cherry picked from commit 0153ff8379be789262ad9cd636080d92b77becad)
    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
* util/u_pstipple.c: copy immediates during transformation | Nicolai Hähnle | 2016-01-22 | 1 | -0/+1
    Apparently, nobody has combined stippling with a fragment shader
    containing immediates in almost five years...

    Fixes a bug in Kodi with radeonsi reported by Christian König.

    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Tested-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    (cherry picked from commit e6281a285012d76cf60fb8639838c369cf4d438f)
* glsl: fix interface block error message | Timothy Arceri | 2016-01-22 | 1 | -1/+1
    Print the stream value not the pointer to the expression, also use the
    unsigned format specifier.

    Cc: 11.1 <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
    (cherry picked from commit d018619d7f52b9a72d7c010d4afb70653d674f12)
* glsl: fix subroutine lowering reusing actual parmaters | Dave Airlie | 2016-01-22 | 1 | -5/+19
    One of the oglconform tests was crashing here, and it was due to not
    cloning the actual parameters before creating the new call.

    This makes a call clone function that does the right things to make sure
    we clone all the needed info, and points the callee at it. (It differs
    from ->clone due to this.)

    This may fix https://bugs.freedesktop.org/show_bug.cgi?id=93722; I had
    this patch in my cts fixes tree, but hadn't had time to make sure I
    liked it.

    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
    (cherry picked from commit 119bef954379ebb35faf182b0b665becacddab76)
* mesa: fix segfault in glUniformSubroutinesuiv() | Timothy Arceri | 2016-01-22 | 1 | -0/+10
    From Section 7.9 (SUBROUTINE UNIFORM VARIABLES) of the OpenGL 4.5 Core
    spec:

        "The command

            void UniformSubroutinesuiv(enum shadertype, sizei count,
                                       const uint *indices);

        will load all active subroutine uniforms for shader stage shadertype
        with subroutine indices from indices, storing indices[i] into the
        uniform at location i. The indices for any locations between zero
        and the value of ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS minus one which
        are not used will be ignored."

    V2: simplify NULL check suggested by Jason.

    Acked-by: Jason Ekstrand <jason@jlekstrand.net>
    Reviewed-by: Dave Airlie <airlied@redhat.com>
    Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org
    https://bugs.freedesktop.org/show_bug.cgi?id=93731
    (cherry picked from commit 86677f101641c75d52577e3cd9e76441b1228b21)
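The quoted spec text implies that a location may have no uniform behind it, and that the corresponding index must simply be skipped. A minimal sketch of that rule, assuming a per-location table where unused locations hold None (the table layout and the function name are assumptions for illustration, not Mesa's internals):

```python
def uniform_subroutines(location_table, indices):
    """Store indices[i] into the uniform at location i, skipping unused locations."""
    for loc, uniform in enumerate(location_table):
        if uniform is None:
            # Unused location: the spec says its index is ignored.
            continue
        uniform["index"] = indices[loc]


table = [{"index": None}, None, {"index": None}]
uniform_subroutines(table, [7, 8, 9])
```

Dereferencing the entry for location 1 without the None check is the sort of access that would crash.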
* glsl: fix segfault linking subroutine uniform with explicit location | Timothy Arceri | 2016-01-22 | 1 | -1/+1
    Reviewed-by: Dave Airlie <airlied@redhat.com>
    Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org
    (cherry picked from commit 50376e0c0e81f80cac6ec77da491241e4ccf57f0)
* glsl: Allow implicit int -> uint conversions for bitwise operators (&, ^, |). | Kenneth Graunke | 2016-01-22 | 1 | -8/+38
    The ARB has decided that implicit conversions should be performed for
    bitwise operators in future language revisions. Implementations of
    current language revisions may or may not perform them.

    This patch makes Mesa apply implicit conversions even on current
    language versions. Applications appear to expect this behavior, and
    there's really no downside to doing so.

    Fixes shader compilation in Shadow of Mordor.

    Bugzilla: https://www.khronos.org/bugzilla/show_bug.cgi?id=1405
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
    Cc: mesa-stable@lists.freedesktop.org
    (cherry picked from commit d54a70aa18a0139a5eefbc5193fabaf739317533)
* i965/fs: Always set channel 2 of texture headers in some stages | Jason Ekstrand | 2016-01-22 | 1 | -0/+8
    In the vertex and fragment stages, the hardware is nice to us and leaves
    g0.2 zeroed out for us so we can use it for headers. However, in
    compute, geometry, and tessellation stages, the hardware is not so nice.
    In particular, for compute shaders on BDW, the hardware places some
    debug bits in 23:15. As it happens, bit 15 is interpreted by the sampler
    as the alpha channel mask. This means that if you use a texturing
    instruction with a header in a compute shader, you may randomly get the
    alpha channel disabled. Since channel masks affect the return length of
    the sampler message, this can lead the GPU to expect a different mlen to
    the one you specified in the shader and this, in turn, hangs your GPU.

    Cc: "11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
    (cherry picked from commit 61b0cfd84ee6fb1273928ee2c8751301ae805eaa)
* i965/fs/generator: Take an actual shader stage rather than a string | Jason Ekstrand | 2016-01-22 | 6 | -10/+14
    Cc: "11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    [Emil Velikov: drop not applicable TES changes]
    (cherry picked from commit 9870f798beab701a9edda81ff7ccc39f1875d610)
    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

    Conflicts:
        src/mesa/drivers/dri/i965/brw_fs_generator.cpp
        src/mesa/drivers/dri/i965/brw_shader.cpp
* i965/vec4: Use UW type for multiply into accumulator on GEN8+ | Jason Ekstrand | 2016-01-22 | 1 | -1/+5
    BDW adds the following restriction: "When multiplying DW x DW, the dst
    cannot be accumulator."

    Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    (cherry picked from commit 0a6811207fbe18d49c7ab95f93ed01f75ffcdda0)
* st/mesa: use surface format to generate mipmaps when available | Ilia Mirkin | 2016-01-22 | 1 | -1/+7
    This fixes the recently posted mipmap + texture views piglit test.

    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
    [Emil Velikov: resolve trivial conflicts]
    (cherry picked from commit e94ef885bb71b46aba4517523ebb63c0d4b36c4b)
    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

    Conflicts:
        src/mesa/state_tracker/st_gen_mipmap.c
* radeonsi: don't miss changes to SPI_TMPRING_SIZE | Marek Olšák | 2016-01-22 | 1 | -2/+7
    I'm not sure about the consequences of this bug, but it's definitely
    dangerous. This applies to SI, CIK, VI.

    Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
    (cherry picked from commit dc96a18d24409102e36cdfd7de0552f66c3925bf)
* cherry-ignore: drop the i965/kbl .num_slices patch | Emil Velikov | 2016-01-22 | 1 | -0/+4
    The variable was introduced after the branch point.

    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
* glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable. | Kenneth Graunke | 2016-01-22 | 1 | -1/+6
    Currently, opt_vectorize() tries to combine

        result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
        result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
        result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
        result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);

    into a single ir_quadop_bitfield_insert opcode, which operates on
    ivec4s. However, GLSL IR's opcodes currently require the bits and offset
    parameters to be scalar integers. So, this breaks.

    We want to be able to vectorize this eventually, but for now, just
    chicken out and make opt_vectorize() bail by marking all the bitfield
    insert/extract related opcodes as horizontal. This is a relatively
    uncommon case today, so we'll do the simple fix for stable branches, and
    fix it properly on master.

    Fixes assertion failures when compiling Shadow of Mordor vertex shaders
    on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()).

    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Cc: mesa-stable@lists.freedesktop.org
    [Emil Velikov: resolve trivial conflicts]
    (cherry picked from commit 5e3edd4b2891d839d440f58053f7207fc71554f4)
    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

    Conflicts:
        src/glsl/ir.h
* i965: use _mesa_delete_buffer_object | Nicolai Hähnle | 2016-01-21 | 1 | -1/+1
    This is more future-proof, plugs the memory leak of Label and properly
    destroys the buffer mutex.

    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    (cherry picked from commit 051603efd546efea9975a5109910171a2e7853a4)
* i915: use _mesa_delete_buffer_object | Nicolai Hähnle | 2016-01-21 | 1 | -1/+1
    This is more future-proof, plugs the memory leak of Label and properly
    destroys the buffer mutex.

    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    (cherry picked from commit 1b74c02e83c59a51f155b64de0444ea3df183af6)
* radeon: use _mesa_delete_buffer_object | Nicolai Hähnle | 2016-01-21 | 1 | -1/+1
    This is more future-proof, plugs the memory leak of Label and properly
    destroys the buffer mutex.

    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    (cherry picked from commit 8882b46226152733960ae006e3856baf00aa71f3)
* st/mesa: use _mesa_delete_buffer_object | Nicolai Hähnle | 2016-01-21 | 1 | -3/+1
    This is more future-proof than the current code.

    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    (cherry picked from commit 1c2187b1c225b2f7e1891544d184bde60390977e)
* mesa/bufferobj: make _mesa_delete_buffer_object externally accessible | Nicolai Hähnle | 2016-01-21 | 2 | -1/+5
    gl_buffer_object has grown more complicated and requires cleanup. Using
    this function from drivers will be more future-proof.

    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    (cherry picked from commit 6aed083b9304cd718ee5bc7839a6222b982d3e3b)
* docs: add sha256 checksums for 11.1.1 | Emil Velikov | 2016-01-13 | 1 | -1/+2
    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
* docs: add release notes for 11.1.1 [mesa-11.1.1] | Emil Velikov | 2016-01-13 | 1 | -0/+196
    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
* Update version to 11.1.1 | Emil Velikov | 2016-01-13 | 1 | -1/+1
    Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
* mesa: Add KBL PCI IDs and platform information. | Sarah Sharp | 2016-01-08 | 2 | -0/+82
    Add PCI IDs for the Intel Kabylake platforms. The IDs are taken directly
    from the Linux kernel patches, which are under review:

    http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html
    http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2

    The Kabylake PCI IDs taken from the kernel are rearranged to be in order
    of GT type, then PCI ID.

    Please note that if this patch is backported, the following fixes will
    need to be added before this patch:

    commit 28ed1e08e8ba98e "i965/skl: Remove early platform support"
    commit c1e38ad37042b0e "i965/skl: Use larger URB size where available."

    Thanks to Ben for fixing a bug around setting urb.size, and being
    patient with my questions about what the various fields mean.

    Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
    Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com>
    Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (KBL-GT2)
    Cc: "11.1" <mesa-stable@lists.freedesktop.org>
    (cherry picked from commit 39c41be50d9474dde4c0dcf23a546d14b212e80a)
* st/mesa: check state->mesa in early return check in st_validate_state() | Brian Paul | 2016-01-08 | 1 | -1/+1
    We were checking the dirty->st flags but not the dirty->mesa flags. When
    we took the early return, we didn't clear the dirty->mesa flags so the
    next time we called st_validate_state() we'd often flush the glBitmap
    cache. And since st_validate_state() is called from st_Bitmap(), it
    meant we flushed the bitmap cache for every glBitmap() call.

    This change seems to recover most of the performance loss observed with
    the ipers demo on llvmpipe since commit 36c93a6fae27561.

    Cc: mesa-stable@lists.freedesktop.org
    Reviewed-by: José Fonseca <jfonseca@vmware.com>
    (cherry picked from commit c28d72a3473ad0127c82c1244b6688dcc184e85e)
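A toy model of the early-return problem, assuming two dirty bitfields that are both cleared only when validation actually runs. The class, the field names and the `check_mesa` switch are hypothetical; the sketch only shows how flags go stale when the early return ignores one of them, not how the bitmap cache flush is triggered.

```python
class DirtyState:
    def __init__(self):
        self.dirty_st = 0
        self.dirty_mesa = 0
        self.validations = 0

    def validate(self, check_mesa=True):
        """Return early only if nothing is dirty.

        check_mesa=False models the old behaviour, where only the st
        flags were consulted before returning.
        """
        if self.dirty_st == 0 and (not check_mesa or self.dirty_mesa == 0):
            return
        self.validations += 1
        self.dirty_st = 0
        self.dirty_mesa = 0


state = DirtyState()
state.dirty_mesa = 0x1
state.validate(check_mesa=False)    # old check: returns early
stale = state.dirty_mesa            # flag is still set
state.validate(check_mesa=True)     # new check: validation runs
cleared = state.dirty_mesa
```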
* nir: Add a lower_fdiv option, turn fdiv into fmul/frcp. | Kenneth Graunke | 2016-01-08 | 3 | -0/+3
    The nir_opt_algebraic rule

        (('fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2', ('fdiv', a, b))),

    can produce new fdiv operations, which need to be lowered on i965, as we
    don't actually implement fdiv. (Normally, we handle this in GLSL IR's
    lower_instructions pass, but in the above case we introduce an fdiv
    after that point. So, make NIR do it for us.)

    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Cc: mesa-stable@lists.freedesktop.org
    (cherry picked from commit 7295f4fcc2b2dd1bc6a8d1d834774b8152a029cf)
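The interaction between the two rewrites can be sketched on nested tuples shaped like the rule quoted above. This is a toy rewriter written for illustration, assuming expressions are (opcode, operands...) tuples with string leaves; `opt_log_sub` and `lower_fdiv` are hypothetical names, not NIR passes.

```python
def opt_log_sub(expr):
    """log2(a) + -(log2(b))  ->  log2(a / b), matching at the root only."""
    if (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "fadd"
            and isinstance(expr[1], tuple) and expr[1][0] == "flog2"
            and isinstance(expr[2], tuple) and expr[2][0] == "fneg"
            and isinstance(expr[2][1], tuple) and expr[2][1][0] == "flog2"):
        return ("flog2", ("fdiv", expr[1][1], expr[2][1][1]))
    return expr


def lower_fdiv(expr):
    """Recursively replace a / b with a * rcp(b)."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [lower_fdiv(a) for a in args]
    if op == "fdiv":
        a, b = args
        return ("fmul", a, ("frcp", b))
    return (op, *args)


expr = ("fadd", ("flog2", "a"), ("fneg", ("flog2", "b")))
combined = opt_log_sub(expr)
lowered = lower_fdiv(combined)
```

Because the algebraic rewrite creates the fdiv late, a lowering pass that ran earlier would never see it, which is why the lowering has to be available at this level as well.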