author | Vasily Khoruzhick <anarsoul@gmail.com> | 2021-11-16 22:43:52 -0800 |
---|---|---|
committer | Marge Bot <emma+marge@anholt.net> | 2021-11-24 02:26:08 +0000 |
commit | 3b15fb35753763a0611d1209f7f55742228a2bca (patch) | |
tree | 0f980370b98a1581cf4398a28e1f7280813c8dbd /src/gallium/drivers/lima/ir/pp/node.c | |
parent | 98a7c4c6f8e0dd8aca665ff1ae475ab3cdd53b12 (diff) | |
lima/ppir: implement gl_FragDepth support
Mali4x0 supports writing depth and stencil from the fragment shader,
and we've been using it for quite a while for depth/stencil buffer reload.
The missing part was specifying the output register for depth/stencil.
To figure it out, I changed the reload shader to use register $4 as output
and poked RSW bits (or rather consecutive 4-bit groups) until the tests
that rely on reload started to pass again.
It turns out that the register number for gl_FragDepth/gl_FragStencil is in
rsw->depth_test, and the register number for gl_FragColor is in
rsw->multi_sample, where it's repeated 4 times for some reason (likely for
MSAA?).
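
As an illustration of the "repeated 4 times" layout, here is a minimal sketch that replicates a 4-bit register number into four consecutive 4-bit groups of a 32-bit word. The helper name `pack_reg_x4` and the `shift` parameter are hypothetical; the commit message does not say at which bit offset of rsw->multi_sample the groups start, so that is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the driver: write `reg` into four
 * consecutive 4-bit groups of `word`, starting at bit `shift`.
 * The real offset inside rsw->multi_sample is an unknown here. */
static uint32_t pack_reg_x4(uint32_t word, unsigned reg, unsigned shift)
{
   assert(reg < 16);     /* must fit in one 4-bit group */
   assert(shift <= 16);  /* four groups must fit in 32 bits */

   /* Build the 16-bit pattern reg|reg|reg|reg. */
   uint32_t group = 0;
   for (unsigned i = 0; i < 4; i++)
      group |= (uint32_t)reg << (4 * i);

   /* Clear the target field, then insert the replicated value. */
   uint32_t mask = (uint32_t)0xffff << shift;
   return (word & ~mask) | (group << shift);
}
```

With register $4 and an assumed offset of 16, `pack_reg_x4(0, 4, 16)` yields `0x44440000`.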
With this knowledge we can now modify the ppir compiler to support multiple
store_output intrinsics.
To do that, just add the destination SSA of each store_output to the register
list for regalloc and mark it explicitly as an output. Since it's never
read in the shader, we have to take care of it in liveness analysis:
basically just mark it alive from the time it's written to the end
of the block. If it's live only in the last instruction, mark it as
live_internal, so regalloc doesn't clobber it.
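
The liveness rule above can be sketched with a simplified model: one liveness state per instruction of a block, for a single output register. The enum, the function name, and the per-instruction array are assumptions made for illustration; the actual ppir liveness pass tracks live sets differently.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of one output register's liveness. */
enum out_liveness { OUT_DEAD, OUT_LIVE, OUT_LIVE_INTERNAL };

/* state[i] describes the register at instruction i of the block;
 * write_idx is the instruction that writes the output register. */
static void mark_output_liveness(enum out_liveness *state,
                                 size_t n_instr, size_t write_idx)
{
   assert(n_instr > 0 && write_idx < n_instr);

   /* Before the write the register holds nothing of interest. */
   for (size_t i = 0; i < write_idx; i++)
      state[i] = OUT_DEAD;

   /* Written in the last instruction: live only inside it. */
   if (write_idx == n_instr - 1) {
      state[write_idx] = OUT_LIVE_INTERNAL;
      return;
   }

   /* Otherwise keep it alive from the write to the end of the block,
    * even though no shader instruction ever reads it. */
   for (size_t i = write_idx; i < n_instr; i++)
      state[i] = OUT_LIVE;
}
```

Keeping the register live to the end of the block is what stops the allocator from reusing it for a later temporary.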
Then just let regalloc do its job, copy the register number to the
shader state, and program it in the RSW.
The tricky part is gl_FragStencil, since it resides in the same register
as gl_FragDepth, and with the current design of the compiler it's hard to
merge them. However, gl_FragStencil doesn't seem to be part of GL2
or GLES2, so we can just leave it unimplemented.
We also need to take care of the stop bit for instructions: now we can't
just set it in every instruction that stores an output, since there may be
several outputs. So if there are any store_output instructions in the
block, just mark the block as having a stop, and set the stop bit in the last
instruction of the block. The only exception is discard: we always need
to set the stop bit in a discard instruction.
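
The stop-bit rule can be sketched as follows. The struct and function names are hypothetical and the instruction model is reduced to three flags; it only illustrates the policy described above, not the driver's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal instruction model for illustration only. */
struct sketch_instr {
   bool stores_output;  /* contains a store_output */
   bool is_discard;     /* contains a discard */
   bool stop;           /* stop bit, computed below */
};

/* Returns true if the block has a stop. */
static bool assign_stop_bits(struct sketch_instr *instrs, size_t n)
{
   assert(n > 0);
   bool block_has_stop = false;

   for (size_t i = 0; i < n; i++) {
      /* Discard always carries the stop bit. */
      instrs[i].stop = instrs[i].is_discard;
      if (instrs[i].stores_output)
         block_has_stop = true;
   }

   /* With any output store in the block, only the last
    * instruction gets the stop bit, not every store. */
   if (block_has_stop)
      instrs[n - 1].stop = true;

   return block_has_stop;
}
```

A block with two output stores therefore ends up with a single stop bit on its final instruction, plus one on any discard it contains.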
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13830>
Diffstat (limited to 'src/gallium/drivers/lima/ir/pp/node.c')
-rw-r--r-- | src/gallium/drivers/lima/ir/pp/node.c | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/src/gallium/drivers/lima/ir/pp/node.c b/src/gallium/drivers/lima/ir/pp/node.c
index 99d025e2c05..cc51448bfb3 100644
--- a/src/gallium/drivers/lima/ir/pp/node.c
+++ b/src/gallium/drivers/lima/ir/pp/node.c
@@ -618,9 +618,9 @@ static ppir_node *ppir_node_insert_mov_local(ppir_node *node)
    ppir_node_add_dep(move, node, ppir_dep_src);
    list_addtail(&move->list, &node->list);
 
-   if (node->is_end) {
-      node->is_end = false;
-      move->is_end = true;
+   if (node->is_out) {
+      node->is_out = false;
+      move->is_out = true;
    }
 
    return move;