path: root/src/compiler/nir/nir_opcodes.py
author: Connor Abbott <cwabbott0@gmail.com> 2019-10-09 09:36:56 -0400
committer: Marge Bot <eric+marge@anholt.net> 2021-09-01 08:30:57 +0000
commit: 77a852c1baff8b25078b8efbcd7674e140beea2c (patch)
tree: a6ffa38424dd8574afbff7140bfff3950ee83311 /src/compiler/nir/nir_opcodes.py
parent: 3d957b40cc167a97bb9dfc1b16f0d4fff149c15b (diff)
download: mesa-77a852c1baff8b25078b8efbcd7674e140beea2c.tar.gz
lima/gpir: Rewrite register allocation for value registers
The usual linear-scan register allocation algorithm can't handle preallocated registers, since we might be forced to choose a color for a non-preallocated variable that overlaps with a pre-allocated variable. But in such cases we can simply split the live range of the offending variable when we reach the beginning of the pre-allocated variable's live range. This is still optimal in the sense that it always finds a coloring whenever one is possible, but we may not insert the smallest possible number of moves. However, since it's actually the scheduler which splits live ranges afterwards, we can simply fold in the move while keeping its fake dependencies, and then everything still works! In other words, inserting a live range split for a value register during register allocation is pretty much free.

This means that we can split register allocation into two parts. First, globally allocate the cross-block registers accessed through load_reg and store_reg instructions, which is still done via graph coloring, and then run a linear-scan algorithm over each block, treating the load_reg and store_reg nodes as referring to pre-allocated registers. This makes the existing RA more complicated, but it has two benefits: first, using round-robin with the linear-scan allocator results in many fewer fake dependencies, saving around 15 instructions in the glmark2 jellyfish shader and fixing a regression in instruction count since branching support went in. Second, it will simplify the handling of spilling. With just graph coloring for everything, every time we spill a node we have to create new value registers, which become new nodes in the graph, and re-run RA. This is worsened by the fact that when writing a value to a temporary, we need an extra register available to load the write address with a load_const node.

With the new scheme, we can ignore this entirely in the first part, and then in the second part we can simply reserve an extra register in the sections where we know we have to spill. So there is no re-running RA many times, and we get a good result quickly.

The current implementation does linear scan backwards, so that we can insert the fake dependencies while allocating and avoid creating any move nodes at all when we have to split a live range. However, it turns out that this makes handling schedule_first nodes a bit more complicated, so it's not clear whether that was worth it.

Note: The commit was originally authored by Connor Abbott <cwabbott@gmail.com> and was cherry-picked from <mesa/mesa!2315>. Rebasing was necessary due to changes to BITSET_FOREACH_SET; see 4413537c. Because some deqp tests pass now, deqp-lima-fails.txt was also changed.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7682>
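The split-on-preallocation idea above can be sketched as a toy forward linear scan. This is a hypothetical, simplified model written for illustration (the real lima/gpir allocator is in C, runs backwards, and folds the move into the scheduler's fake dependencies); names like `linear_scan`, the interval tuples, and the returned move list are all invented for this sketch:

```python
def linear_scan(intervals, num_regs):
    """Toy linear scan with live-range splitting for preallocated registers.

    intervals: list of (name, start, end, fixed_reg) tuples, where
    fixed_reg is an int for a pre-allocated variable and None otherwise.
    Returns (assignment, moves): assignment maps each name to a list of
    (start, end, reg) segments; moves records each implied live-range
    split as (point, name, from_reg, to_reg).
    """
    assignment = {}
    moves = []
    active = []  # list of (end, name, reg) for currently live variables

    for name, start, end, fixed in sorted(intervals, key=lambda iv: iv[1]):
        # Expire intervals that ended before this one starts.
        active = [a for a in active if a[0] > start]
        used = {a[2] for a in active}

        if fixed is not None:
            reg = fixed
            # A pre-allocated range begins: if an ordinary variable currently
            # holds this register, split its live range here and continue it
            # in a free register, recording the move.
            for i, (a_end, a_name, a_reg) in enumerate(active):
                if a_reg == reg:
                    free = next(r for r in range(num_regs)
                                if r not in used and r != reg)
                    old_start, _, _ = assignment[a_name][-1]
                    assignment[a_name][-1] = (old_start, start, a_reg)
                    assignment[a_name].append((start, a_end, free))
                    moves.append((start, a_name, a_reg, free))
                    active[i] = (a_end, a_name, free)
                    used.add(free)
        else:
            # Ordinary variable: pick any register not in use.
            reg = next(r for r in range(num_regs) if r not in used)

        assignment.setdefault(name, []).append((start, end, reg))
        active.append((end, name, reg))

    return assignment, moves
```

For example, with two registers, a variable `a` live over [0, 10) and a variable `p` pre-allocated to register 0 over [4, 8), `a` first takes register 0, then gets split at point 4 and continues in register 1, yielding exactly one move. As the commit message notes, this always finds a coloring when one exists, though not necessarily with the minimum number of moves.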
Diffstat (limited to 'src/compiler/nir/nir_opcodes.py')
0 files changed, 0 insertions, 0 deletions