<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/go-git.git/src, branch dev.ssa</title>
<subtitle>github.com: golang/go
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/'/>
<entry>
<title>[dev.ssa] cmd/compile: PPC64, FP to/from int conversions.</title>
<updated>2016-08-15T14:47:49+00:00</updated>
<author>
<name>David Chase</name>
<email>drchase@google.com</email>
</author>
<published>2016-08-10T18:44:57+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=d08010f94e06e5d5b6d01c248ba2a976b75d95a2'/>
<id>d08010f94e06e5d5b6d01c248ba2a976b75d95a2</id>
<content type='text'>
Passes ssa_test.

Requires a few new instructions and some scratchpad
memory to move data between G and F registers.

Also fixed comparisons to be correct in case of NaN.
Added missing instructions for run.bash.
Removed some FP registers that are apparently "reserved"
(but that are also apparently also unused except for a
gratuitous multiplication by two when y = x+x would work
just as well).

Currently failing stack splits.

Updates #16010.

Change-Id: I73b161bfff54445d72bd7b813b1479f89fc72602
Reviewed-on: https://go-review.googlesource.com/26813
Run-TryBot: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Cherry Zhang &lt;cherryyz@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Passes ssa_test.

Requires a few new instructions and some scratchpad
memory to move data between G and F registers.

Also fixed comparisons to be correct in case of NaN.
Added missing instructions for run.bash.
Removed some FP registers that are apparently "reserved"
(but that are also apparently also unused except for a
gratuitous multiplication by two when y = x+x would work
just as well).

Currently failing stack splits.

Updates #16010.

Change-Id: I73b161bfff54445d72bd7b813b1479f89fc72602
Reviewed-on: https://go-review.googlesource.com/26813
Run-TryBot: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Cherry Zhang &lt;cherryyz@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile, etc.: more ARM64 optimizations, and enable SSA by default</title>
<updated>2016-08-15T03:37:34+00:00</updated>
<author>
<name>Cherry Zhang</name>
<email>cherryyz@google.com</email>
</author>
<published>2016-08-10T17:24:03+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=d99cee79b98dfb6c1cd8e64c96845ee29aa28b4c'/>
<id>d99cee79b98dfb6c1cd8e64c96845ee29aa28b4c</id>
<content type='text'>
Add more ARM64 optimizations:
- use hardware zero register when it is possible.
- use shifted ops.
  The assembler supports shifted ops but not documented, nor knows
  how to print it. This CL adds them.
- enable fast division.
  This was disabled because it makes the old backend generate slower
  code. But with SSA it generates faster code.

Turn on SSA by default, also adjust tests.

Change-Id: I7794479954c83bb65008dcb457bc1e21d7496da6
Reviewed-on: https://go-review.googlesource.com/26950
Run-TryBot: Cherry Zhang &lt;cherryyz@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add more ARM64 optimizations:
- use hardware zero register when it is possible.
- use shifted ops.
  The assembler supports shifted ops but not documented, nor knows
  how to print it. This CL adds them.
- enable fast division.
  This was disabled because it makes the old backend generate slower
  code. But with SSA it generates faster code.

Turn on SSA by default, also adjust tests.

Change-Id: I7794479954c83bb65008dcb457bc1e21d7496da6
Reviewed-on: https://go-review.googlesource.com/26950
Run-TryBot: Cherry Zhang &lt;cherryyz@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile: simplify 386+PIC+globals a bit</title>
<updated>2016-08-11T20:34:47+00:00</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2016-08-11T19:56:36+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=94c8e59ae11d374cd8dd46afec4710ad10500ad9'/>
<id>94c8e59ae11d374cd8dd46afec4710ad10500ad9</id>
<content type='text'>
We shouldn't issue instructions like MOVL foo(SB), AX directly from the
SSA backend.  Instead we should do LEAL foo(SB), AX; MOVL (AX), AX.

This simplifies obj logic because now only LEAL needs to be treated
specially.  The register allocator uses the LEAL to in effect allocate
the temporary register required for the shared library thunk calls.

Also, the LEALs can now be CSEd.  So code like
    var g int
    func f() { g += 5 }
Requires only one thunk call instead of 2.

Change-Id: Ib87d465f617f73af437445871d0ea91a630b2355
Reviewed-on: https://go-review.googlesource.com/26814
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We shouldn't issue instructions like MOVL foo(SB), AX directly from the
SSA backend.  Instead we should do LEAL foo(SB), AX; MOVL (AX), AX.

This simplifies obj logic because now only LEAL needs to be treated
specially.  The register allocator uses the LEAL to in effect allocate
the temporary register required for the shared library thunk calls.

Also, the LEALs can now be CSEd.  So code like
    var g int
    func f() { g += 5 }
Requires only one thunk call instead of 2.

Change-Id: Ib87d465f617f73af437445871d0ea91a630b2355
Reviewed-on: https://go-review.googlesource.com/26814
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile: fix fp constant loads for 386+PIC</title>
<updated>2016-08-11T19:52:45+00:00</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2016-08-11T16:49:48+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=8f955d3664813c831b35cb02c6e7b48dd0341ece'/>
<id>8f955d3664813c831b35cb02c6e7b48dd0341ece</id>
<content type='text'>
In position-independent 386 code, loading floating-point constants from
the constant pool requires two steps: materializing the address of
the constant pool entry (requires calling a thunk) and then loading
from that address.

Before this CL, the materializing happened implicitly in CX, which
clobbered that register.

Change-Id: Id094e0fb2d3be211089f299e8f7c89c315de0a87
Reviewed-on: https://go-review.googlesource.com/26811
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: David Crawshaw &lt;crawshaw@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In position-independent 386 code, loading floating-point constants from
the constant pool requires two steps: materializing the address of
the constant pool entry (requires calling a thunk) and then loading
from that address.

Before this CL, the materializing happened implicitly in CX, which
clobbered that register.

Change-Id: Id094e0fb2d3be211089f299e8f7c89c315de0a87
Reviewed-on: https://go-review.googlesource.com/26811
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: David Crawshaw &lt;crawshaw@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile: add some ARM64 optimizations</title>
<updated>2016-08-11T18:08:47+00:00</updated>
<author>
<name>Cherry Zhang</name>
<email>cherryyz@google.com</email>
</author>
<published>2016-08-03T13:56:36+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=ed1ad8f56cc51cc55a8c12514e1c2b3098c1218b'/>
<id>ed1ad8f56cc51cc55a8c12514e1c2b3098c1218b</id>
<content type='text'>
Mostly mirrors ARM, includes:
- constant folding
- simplification of load, store, extension, and arithmetics
- nilcheck removal

Change-Id: Iffaa5fcdce100fe327429ecab316cb395e543469
Reviewed-on: https://go-review.googlesource.com/26710
Run-TryBot: Cherry Zhang &lt;cherryyz@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mostly mirrors ARM, includes:
- constant folding
- simplification of load, store, extension, and arithmetics
- nilcheck removal

Change-Id: Iffaa5fcdce100fe327429ecab316cb395e543469
Reviewed-on: https://go-review.googlesource.com/26710
Run-TryBot: Cherry Zhang &lt;cherryyz@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/internal/obj/arm64: fix encoding constant into some instructions</title>
<updated>2016-08-10T20:33:11+00:00</updated>
<author>
<name>Cherry Zhang</name>
<email>cherryyz@google.com</email>
</author>
<published>2016-08-03T15:47:40+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=748aa84424418fb71c2528e7340df0ad6075b265'/>
<id>748aa84424418fb71c2528e7340df0ad6075b265</id>
<content type='text'>
When a constant can be encoded in a logical instruction (BITCON), do
it this way instead of using the constant pool. The BITCON testing
code runs faster than table lookup (using map):

(on AMD64 machine, with pseudo random input)
BenchmarkIsBitcon-4   	300000000	         4.04 ns/op
BenchmarkTable-4      	50000000	        27.3 ns/op

The equivalent C code of BITCON testing is formally verified with
model checker CBMC against linear search of the lookup table.

Also handle cases when a constant can be encoded in a MOV instruction.
In this case, materializa the constant into REGTMP without using the
constant pool.

When constants need to be added to the constant pool, make sure to
check whether it fits in 32-bit. If not, store 64-bit.

Both legacy and SSA compiler backends are happy with this.

Fixes #16226.

Change-Id: I883e3069dee093a1cdc40853c42221a198a152b0
Reviewed-on: https://go-review.googlesource.com/26631
Run-TryBot: Cherry Zhang &lt;cherryyz@google.com&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a constant can be encoded in a logical instruction (BITCON), do
it this way instead of using the constant pool. The BITCON testing
code runs faster than table lookup (using map):

(on AMD64 machine, with pseudo random input)
BenchmarkIsBitcon-4   	300000000	         4.04 ns/op
BenchmarkTable-4      	50000000	        27.3 ns/op

The equivalent C code of BITCON testing is formally verified with
model checker CBMC against linear search of the lookup table.

Also handle cases when a constant can be encoded in a MOV instruction.
In this case, materializa the constant into REGTMP without using the
constant pool.

When constants need to be added to the constant pool, make sure to
check whether it fits in 32-bit. If not, store 64-bit.

Both legacy and SSA compiler backends are happy with this.

Fixes #16226.

Change-Id: I883e3069dee093a1cdc40853c42221a198a152b0
Reviewed-on: https://go-review.googlesource.com/26631
Run-TryBot: Cherry Zhang &lt;cherryyz@google.com&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile: implement GO386=387</title>
<updated>2016-08-10T17:41:01+00:00</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2016-07-26T18:51:33+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=c069bc49963bd5c2c152fe600fdcb9a2b7b58f76'/>
<id>c069bc49963bd5c2c152fe600fdcb9a2b7b58f76</id>
<content type='text'>
Last part of the 386 SSA port.

Modify the x86 backend to simulate SSE registers and
instructions with 387 registers and instructions.
The simulation isn't terribly performant, but it works,
and the old implementation wasn't very performant either.
Leaving to people who care about 387 to optimize if they want.

Turn on SSA backend for 386 by default.

Fixes #16358

Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0
Reviewed-on: https://go-review.googlesource.com/25271
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Last part of the 386 SSA port.

Modify the x86 backend to simulate SSE registers and
instructions with 387 registers and instructions.
The simulation isn't terribly performant, but it works,
and the old implementation wasn't very performant either.
Leaving to people who care about 387 to optimize if they want.

Turn on SSA backend for 386 by default.

Fixes #16358

Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0
Reviewed-on: https://go-review.googlesource.com/25271
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile: more fixes for 386 shared libraries</title>
<updated>2016-08-10T17:09:38+00:00</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2016-08-09T20:58:06+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=77ef597f38e11e03522d1ccac34cfd39a1ca8d8e'/>
<id>77ef597f38e11e03522d1ccac34cfd39a1ca8d8e</id>
<content type='text'>
Use the destination register for materializing the pc
for GOT references also. See https://go-review.googlesource.com/c/25442/
The SSA backend assumes CX does not get clobbered for these instructions.

Mark duffzero as clobbering CX. The linker needs to clobber CX
to materialize the address to call. (This affects the non-shared-library
duffzero also, but hopefully forbidding one register across duffzero
won't be a big deal.)

Hopefully this is all the cases where the linker is clobbering CX
under the hood and SSA assumes it isn't.

Change-Id: I080c938170193df57cd5ce1f2a956b68a34cc886
Reviewed-on: https://go-review.googlesource.com/26611
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Michael Hudson-Doyle &lt;michael.hudson@canonical.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the destination register for materializing the pc
for GOT references also. See https://go-review.googlesource.com/c/25442/
The SSA backend assumes CX does not get clobbered for these instructions.

Mark duffzero as clobbering CX. The linker needs to clobber CX
to materialize the address to call. (This affects the non-shared-library
duffzero also, but hopefully forbidding one register across duffzero
won't be a big deal.)

Hopefully this is all the cases where the linker is clobbering CX
under the hood and SSA assumes it isn't.

Change-Id: I080c938170193df57cd5ce1f2a956b68a34cc886
Reviewed-on: https://go-review.googlesource.com/26611
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Michael Hudson-Doyle &lt;michael.hudson@canonical.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile: PPC: FP load/store/const/cmp/neg; div/mod</title>
<updated>2016-08-09T17:13:43+00:00</updated>
<author>
<name>David Chase</name>
<email>drchase@google.com</email>
</author>
<published>2016-08-08T21:19:56+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=ff37d0e681a8b35c0ebff34bc1e73105f8845928'/>
<id>ff37d0e681a8b35c0ebff34bc1e73105f8845928</id>
<content type='text'>
FP&lt;-&gt;int conversions remain.

Updates #16010.

Change-Id: I38d7a4923e34d0a489935fffc4c96c020cafdba2
Reviewed-on: https://go-review.googlesource.com/25589
Run-TryBot: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
FP&lt;-&gt;int conversions remain.

Updates #16010.

Change-Id: I38d7a4923e34d0a489935fffc4c96c020cafdba2
Reviewed-on: https://go-review.googlesource.com/25589
Run-TryBot: David Chase &lt;drchase@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[dev.ssa] cmd/compile: fix PIC for SSA-generated code</title>
<updated>2016-08-09T15:50:07+00:00</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2016-08-03T20:00:49+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=2cbdd55d640314e37e43d7dd8e60c457846a2876'/>
<id>2cbdd55d640314e37e43d7dd8e60c457846a2876</id>
<content type='text'>
Access to globals requires a 2-instruction sequence on PIC 386.

    MOVL foo(SB), AX

is translated by the obj package into:

    CALL getPCofNextInstructionInTempRegister(SB)
    MOVL (&amp;foo-&amp;thisInstruction)(tmpReg), AX

The call returns the PC of the next instruction in a register.
The next instruction then offsets from that register to get the
address required.  The tricky part is the allocation of the
temp register.  The legacy compiler always used CX, and forbid
the register allocator from allocating CX when in PIC mode.
We can't easily do that in SSA because CX is actually a required
register for shift instructions. (I think the old backend got away
with this because the register allocator never uses CX, only
codegen knows that shifts must use CX.)

Instead, we allow the temp register to be anything.  When the
destination of the MOV (or LEA) is an integer register, we can
use that register.  Otherwise, we make sure to compile the
operation using an LEA to reference the global.  So

    MOVL AX, foo(SB)

is never generated directly.  Instead, SSA generates:

    LEAL foo(SB), DX
    MOVL AX, (DX)

which is then rewritten by the obj package to:

    CALL getPcInDX(SB)
    LEAL (&amp;foo-&amp;thisInstruction)(DX), AX
    MOVL AX, (DX)

So this CL modifies the obj package to use different thunks
to materialize the pc into different registers.  We use the
registers that regalloc chose so that SSA can still allocate
the full set of registers.

Change-Id: Ie095644f7164a026c62e95baf9d18a8bcaed0bba
Reviewed-on: https://go-review.googlesource.com/25442
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Access to globals requires a 2-instruction sequence on PIC 386.

    MOVL foo(SB), AX

is translated by the obj package into:

    CALL getPCofNextInstructionInTempRegister(SB)
    MOVL (&amp;foo-&amp;thisInstruction)(tmpReg), AX

The call returns the PC of the next instruction in a register.
The next instruction then offsets from that register to get the
address required.  The tricky part is the allocation of the
temp register.  The legacy compiler always used CX, and forbid
the register allocator from allocating CX when in PIC mode.
We can't easily do that in SSA because CX is actually a required
register for shift instructions. (I think the old backend got away
with this because the register allocator never uses CX, only
codegen knows that shifts must use CX.)

Instead, we allow the temp register to be anything.  When the
destination of the MOV (or LEA) is an integer register, we can
use that register.  Otherwise, we make sure to compile the
operation using an LEA to reference the global.  So

    MOVL AX, foo(SB)

is never generated directly.  Instead, SSA generates:

    LEAL foo(SB), DX
    MOVL AX, (DX)

which is then rewritten by the obj package to:

    CALL getPcInDX(SB)
    LEAL (&amp;foo-&amp;thisInstruction)(DX), AX
    MOVL AX, (DX)

So this CL modifies the obj package to use different thunks
to materialize the pc into different registers.  We use the
registers that regalloc chose so that SSA can still allocate
the full set of registers.

Change-Id: Ie095644f7164a026c62e95baf9d18a8bcaed0bba
Reviewed-on: https://go-review.googlesource.com/25442
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
