| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
The pretty printers for these make it hard to understand
what's actually in the fields of these structures. These
"ugly printers" show exactly what's in each field, which can
be useful for understanding and debugging code.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://codereview.appspot.com/175780043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NaCl requires the addition of a 32-byte "halt sled" at the end
of the text segment. This means that segtext.len is actually
32 bytes shorter than reality. The computation of the file offset
of the end of the data segment did not take this 32 bytes into
account, so if len and len+32 rounded up (by 64k) to different
values, the symbol table overwrote the last page of the data
segment.
The last page of the data segment is usually the C .string
symbols, which contain the strings used in error prints
by the runtime. So when this happens, your program
probably crashes, and then when it does, you get binary
garbage instead of all the usual prints.
The chance of hitting this with a randomly sized text segment
is 32 in 65536, or 1 in 2048.
If you add or remove ANY code while trying to debug this
problem, you're overwhelmingly likely to bump the text
segment one way or the other and make the bug disappear.
Correct all the computations to use segdata.fileoff+segdata.filelen
instead of trying to rederive segdata.fileoff.
This fixes the failure during the nacl/amd64p32 build.
TBR=iant
CC=golang-codereviews
https://codereview.appspot.com/135050043
|
|
|
|
|
|
|
|
|
| |
The helps certain diagnostics and also removed duplicated enums as a side effect.
LGTM=dave, rsc
R=rsc, dave
CC=golang-codereviews
https://codereview.appspot.com/115060044
|
|
|
|
|
|
|
| |
LGTM=rsc
R=rsc, iant
CC=golang-codereviews
https://codereview.appspot.com/120220043
|
|
|
|
|
|
|
| |
LGTM=dave
R=rsc, dave
CC=golang-codereviews
https://codereview.appspot.com/116330043
|
|
|
|
|
|
|
|
|
| |
Unused. cmd/dist will generate enams as liblink/anames[568].c.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://codereview.appspot.com/119940043
|
|
|
|
|
|
|
| |
LGTM=dave, rsc
R=rsc, iant, dave
CC=golang-codereviews
https://codereview.appspot.com/108360043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main changes fall into a few patterns:
1. Replace #define with enum.
2. Add /*c2go */ comment giving effect of #define.
This is necessary for function-like #defines and
non-enum-able #defined constants.
(Not all compilers handle negative or large enums.)
3. Add extra braces in struct initializer.
(c2go does not implement the full rules.)
This is enough to let c2go typecheck the source tree.
There may be more changes once it is doing
other semantic analyses.
LGTM=minux, iant
R=minux, dave, iant
CC=golang-codereviews
https://codereview.appspot.com/106860045
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
benchmark old ns/op new ns/op delta
BenchmarkCopyFat512 1307 329 -74.83%
BenchmarkCopyFat256 666 169 -74.62%
BenchmarkCopyFat1024 2617 671 -74.36%
BenchmarkCopyFat128 343 89.0 -74.05%
BenchmarkCopyFat64 182 48.9 -73.13%
BenchmarkCopyFat32 103 28.8 -72.04%
BenchmarkClearFat128 102 46.6 -54.31%
BenchmarkClearFat512 344 167 -51.45%
BenchmarkClearFat64 50.5 26.5 -47.52%
BenchmarkClearFat256 147 87.2 -40.68%
BenchmarkClearFat32 22.7 16.4 -27.75%
BenchmarkClearFat1024 511 662 +29.55%
Fixes issue 7624
LGTM=rsc
R=golang-codereviews, khr, bradfitz, josharian, dave, rsc
CC=golang-codereviews
https://codereview.appspot.com/92760044
|
|
|
|
|
|
|
|
|
| |
Update issue 7331
LGTM=dave, iant
R=golang-codereviews, dave, gobot, iant
CC=golang-codereviews
https://codereview.appspot.com/89520043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new code is adapted from the Go 1.2 nosplit code,
but it does not have the bug reported in issue 7623:
g% go run nosplit.go
g% go1.2 run nosplit.go
BUG
rejected incorrectly:
main 0 call f; f 120
linker output:
# _/tmp/go-test-nosplit021064539
main.main: nosplit stack overflow
120 guaranteed after split check in main.main
112 on entry to main.f
-8 after main.f uses 120
g%
Fixes issue 6931.
Fixes issue 7623.
LGTM=iant
R=golang-codereviews, iant, ality
CC=golang-codereviews, r
https://codereview.appspot.com/88190043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The relocation and automatic variable types were using
arch-specific numbers. Introduce portable enumerations
instead.
To the best of my knowledge, these are the only arch-specific
bits left in the new object file format.
Remove now, before Go 1.3, because file formats are forever.
LGTM=iant
R=iant
CC=golang-codereviews
https://codereview.appspot.com/87670044
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new channel and map runtime routines take pointers
to values, typically temporaries. Without help, the compiler
cannot tell when those temporaries stop being needed,
because it isn't sure what happened to the pointer.
Arrange to insert explicit VARKILL instructions for these
temporaries so that the liveness analysis can avoid seeing
them as "ambiguously live".
The change is made in order.c, which was already in charge of
introducing temporaries to preserve the order-of-evaluation
guarantees. Now its job has expanded to include introducing
temporaries as needed by runtime routines, and then also
inserting the VARKILL annotations for all these temporaries,
so that their lifetimes can be shortened.
In order to do its job for the map runtime routines, order.c arranges
that all map lookups or map assignments have the form:
x = m[k]
x, y = m[k]
m[k] = x
where x, y, and k are simple variables (often temporaries).
Likewise, receiving from a channel is now always:
x = <-c
In order to provide the map guarantee, order.c is responsible for
rewriting x op= y into x = x op y, so that m[k] += z becomes
t = m[k]
t2 = t + z
m[k] = t2
While here, fix a few bugs in order.c's traversal: it was failing to
walk into select and switch case bodies, so order of evaluation
guarantees were not preserved in those situations.
Added tests to test/reorder2.go.
Fixes issue 7671.
In gc/popt's temporary-merging optimization, allow merging
of temporaries with their address taken as long as the liveness
ranges do not intersect. (There is a good chance of that now
that we have VARKILL annotations to limit the liveness range.)
Explicitly killing temporaries cuts the number of ambiguously
live temporaries that must be zeroed in the godoc binary from
860 to 711, or -17%. There is more work to be done, but this
is a good checkpoint.
Update issue 7345
LGTM=khr
R=khr
CC=golang-codereviews
https://codereview.appspot.com/81940043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This CL replays the following one CL from the rsc-go13nacl repo.
This is the last replay CL: after this CL the main repo will have
everything the rsc-go13nacl repo did. Changes made to the main
repo after the rsc-go13nacl repo branched off probably mean that
NaCl doesn't actually work after this CL, but all the code is now moved
over and just needs to be redebugged.
---
cmd/6l, cmd/8l, cmd/ld: support for Native Client
See golang.org/s/go13nacl for design overview.
This CL is publicly visible but not CC'ed to golang-dev,
to avoid distracting from the preparation of the Go 1.2
release.
This CL and the others will be checked into my rsc-go13nacl
clone repo for now, and I will send CLs against the main
repo early in the Go 1.3 development.
R?khr
https://codereview.appspot.com/15750044
---
LGTM=bradfitz, dave, iant
R=dave, bradfitz, iant
CC=golang-codereviews
https://codereview.appspot.com/69040044
|
|
|
|
|
|
|
|
|
|
|
| |
The "fat" referred to being used for multiword values only.
We're going to use it for non-fat values sometimes too.
No change other than the renaming.
TBR=iant
CC=golang-codereviews
https://codereview.appspot.com/63650043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now use the %A, %D, %P, and %R routines from liblink
across the board.
Fixes issue 7178.
Fixes issue 7055.
LGTM=iant
R=golang-codereviews, gobot, rsc, dave, iant, remyoudompheng
CC=golang-codereviews
https://codereview.appspot.com/49170043
Committer: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
|
|
| |
rsc suggested that we split the whole linker changes into three parts.
This is the first one, mostly dealing with adding Hsolaris.
LGTM=iant
R=golang-codereviews, iant, dave
CC=golang-codereviews
https://codereview.appspot.com/54210050
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CL 56120043 fixed and cleaned up TLS on ARM after introducing liblink, but
left flag_shared broken. This CL restores the (unsupported) flag_shared
behaviour by simply rewriting access to $runtime.tlsgm(SB) with
runtime.tlsgm(SB), to compensate for the extra indirection when going from
the R_ARM_TLS_LE32 relocation to the R_ARM_TLS_IE32 relocation.
Also, remove unnecessary symbol lookup left after 56120043.
LGTM=iant
R=iant, rsc
CC=golang-codereviews
https://codereview.appspot.com/57000043
Committer: Ian Lance Taylor <iant@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CL 56120043 fixed TLS handling on ARM after the introduction of
liblink but left older ARM processors broken.
Before liblink, the MRC instruction was replaced with a fallback
on older ARMs. CL 56120043 removed that, because the rewrite matched
bit patterns on the AWORD pseudo-instruction and could therefore change
unrelated AWORDs that happened to match.
This CL adds an AMRC instruction to encode both MRC and MCR previously
encoded as AWORDs. Then, in liblink, the AMRC instructions are either
rewritten to AWORD, or, on goarm < 7, replaced with a branch to the
fallback.
./all.bash completes successfully on an ARMv7 with either GOARM=7 or
GOARM=5. I have verified that the fallback is indeed present in both
runtime.save_gm and runtime.load_gm when GOARM=5 but not when GOARM=7.
If all goes well, this should fix the armv5 builders.
LGTM=iant
R=iant, rsc
CC=golang-codereviews
https://codereview.appspot.com/55540044
Committer: Ian Lance Taylor <iant@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- new object file reader/writer (liblink/objfile.c)
- remove old object file writing routines
- add pcdata iterator
- remove all trace of "line number stack" and "path fragments" from
object files, linker (!!!)
- dwarf now writes a single "compilation unit" instead of one per package
This CL disables the check for chains of no-split functions that
could overflow the stack red zone. A future CL will attack the problem
of reenabling that check (issue 6931).
This CL is just the liblink and cmd/ld changes.
There are minor associated adjustments in CL 37030045.
Each depends on the other.
R=golang-dev, dave, iant
CC=golang-dev
https://codereview.appspot.com/39680043
|
|
|
|
|
|
|
|
| |
The linker is in charge of providing the one true declaration.
R=golang-dev, dave, r
CC=golang-dev
https://codereview.appspot.com/39560043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an enormous amount of code moving around in this CL,
but the code is the same, and it is invoked in the same ways.
This CL is preparation for the new linker structure, not the new
structure itself.
The new library's definition is in include/link.h.
The main change is the use of a Link structure to hold all the
linker-relevant state, replacing the smattering of global variables.
The Link structure should both make it clearer which state must
be carried around and make it possible to parallelize more easily
later.
The main body of the linker has moved into the architecture-independent
cmd/ld directory. That includes the list of known header types, so the
distinction between Hplan9x32 and Hplan9x64 is removed (no other
header type distinguished 32- and 64-bit formats), and code for unused
formats such as ipaq kernels has been deleted.
The code being deleted from 5l, 6l, and 8l reappears in liblink or in ld.
Because multiple files are being merged in the liblink directory,
it is not possible to show the diffs nicely in hg.
The Prog and Addr structures have been unified into an
architecture-independent form and moved to link.h, where they will
be shared by all tools: the assemblers, the compilers, and the linkers.
The unification makes it possible to write architecture-independent
traversal of Prog lists, among other benefits.
The Sym structures cannot be unified: they are too fundamentally
different between the linker and the compilers. Instead, liblink defines
an LSym - a linker Sym - to be used in the Prog and Addr structures,
and the linker now refers exclusively to LSyms. The compilers will
keep using their own syms but will fill out the corresponding LSyms in
the Prog and Addr structures.
Although code from 5l, 6l, and 8l is now in a single library, the
code has been arranged so that only one architecture needs to
be linked into a particular program: 5l will not contain the code
needed for x86 instruction layout, for example.
The object file writing code in liblink/obj.c is from cmd/gc/obj.c.
Preparation for golang.org/s/go13linker work.
This CL does not build by itself. It depends on 35740044
and will be submitted at the same time.
R=iant
CC=golang-dev
https://codereview.appspot.com/35790044
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pointer maps by liveness analysis
This change allows the garbage collector to examine stack
slots that are determined as live and containing a pointer
value by the garbage collector. This results in a mean
reduction of 65% in the number of stack slots scanned during
an invocation of "GOGC=1 all.bash".
Unfortunately, this does not yet allow garbage collection to
be precise for the stack slots computed as live. Pointers
confound the determination of what definitions reach a given
instruction. In general, this problem is not solvable without
runtime cost but some advanced cooperation from the compiler
might mitigate common cases.
R=golang-dev, rsc, cshapiro
CC=golang-dev
https://codereview.appspot.com/14430048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two bugs:
1. The first iteration of the traceback always uses LR when provided,
which it is (only) during a profiling signal, but in fact LR is correct
only if the stack frame has not been allocated yet. Otherwise an
intervening call may have changed LR, and the saved copy in the stack
frame should be used. Fix in traceback_arm.c.
2. The division runtime call adds 8 bytes to the stack. In order to
keep the traceback routines happy, it must copy the saved LR into
the new 0(SP). Change
SUB $8, SP
into
MOVW 0(SP), R11 // r11 is temporary, for use by linker
MOVW.W R11, -8(SP)
to update SP and 0(SP) atomically, so that the traceback always
sees a saved LR at 0(SP).
Fixes issue 6681.
R=golang-dev, r
CC=golang-dev
https://codereview.appspot.com/19910044
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The CL causes misc/cgo/test to fail randomly.
I suspect that the problem is the use of a division instruction
in usleep, which can be called while trying to acquire an m
and therefore cannot store the denominator in m.
The solution to that would be to rewrite the code to use a
magic multiply instead of a divide, but now we're getting
pretty far off the original code.
Go back to the original in preparation for a different,
less efficient but simpler fix.
??? original CL description
cmd/5l, runtime: make ARM integer division profiler-friendly
The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.
CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.
Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.
Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.
Combined, these make the torture test from issue 6681 pass.
Fixes issue 6681.
R=golang-dev, r, josharian
CC=golang-dev
https://codereview.appspot.com/19810043
???
TBR=r
CC=golang-dev
https://codereview.appspot.com/20350043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.
CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.
Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.
Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.
Combined, these make the torture test from issue 6681 pass.
Fixes issue 6681.
R=golang-dev, r, josharian
CC=golang-dev
https://codereview.appspot.com/19810043
|
|
|
|
|
|
|
|
|
|
| |
Add the -installsuffix flag to gc and {5,6,8}l, which overrides -race
for the suffix if both are supplied.
Pass this flag from the go tool for build and install.
R=rsc
CC=golang-dev
https://codereview.appspot.com/14246044
|
|
|
|
|
|
|
|
| |
Keith is too clever for me.
R=ken2
CC=golang-dev, khr
https://codereview.appspot.com/13272050
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #1:
Issue 5406 identified an interesting case:
defer iface.M()
may end up calling a wrapper that copies an indirect receiver
from the iface value and then calls the real M method. That's
two calls down, not just one, and so recover() == nil always
in the real M method, even during a panic.
[For the purposes of this entire discussion, a wrapper's
implementation is a function containing an ordinary call, not
the optimized tail call form that is somtimes possible. The
tail call does not create a second frame, so it is already
handled correctly.]
Fix this bug by introducing g->panicwrap, which counts the
number of bytes on current stack segment that are due to
wrapper calls that should not count against the recover
check. All wrapper functions must now adjust g->panicwrap up
on entry and back down on exit. This adds slightly to their
expense; on the x86 it is a single instruction at entry and
exit; on the ARM it is three. However, the alternative is to
make a call to recover depend on being able to walk the stack,
which I very much want to avoid. We have enough problems
walking the stack for garbage collection and profiling.
Also, if performance is critical in a specific case, it is already
faster to use a pointer receiver and avoid this kind of wrapper
entirely.
Bug #2:
The old code, which did not consider the possibility of two
calls, already contained a check to see if the call had split
its stack and so the panic-created segment was one behind the
current segment. In the wrapper case, both of the two calls
might split their stacks, so the panic-created segment can be
two behind the current segment.
Fix this by propagating the Stktop.panic flag forward during
stack splits instead of looking backward during recover.
Fixes issue 5406.
R=golang-dev, iant
CC=golang-dev
https://codereview.appspot.com/13367052
|
|
|
|
|
|
|
|
|
|
| |
Pull the stack split generation into its own function.
This will make an upcoming change to fix recover
easier to digest.
R=ken2
CC=golang-dev
https://codereview.appspot.com/13611044
|
|
|
|
|
|
| |
R=golang-dev, bradfitz, dave
CC=golang-dev
https://codereview.appspot.com/13225043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also introduce BGET2/4, BPUT2/4 as they are widely used.
Slightly improve BGETC/BPUTC implementation.
This gives ~5% CPU time improvement on go install -a -p1 std.
Before:
real user sys
0m23.561s 0m16.625s 0m5.848s
0m23.766s 0m16.624s 0m5.846s
0m23.742s 0m16.621s 0m5.868s
after:
0m22.999s 0m15.841s 0m5.889s
0m22.845s 0m15.808s 0m5.850s
0m22.889s 0m15.832s 0m5.848s
R=golang-dev, r
CC=golang-dev
https://codereview.appspot.com/12745047
|
|
|
|
|
|
|
|
| |
Add dragonflydynld to 5l and 8l so that they compile again.
R=golang-dev, bradfitz
CC=golang-dev
https://codereview.appspot.com/12739048
|
|
|
|
|
|
|
|
|
|
|
|
| |
See golang.org/s/go12nil.
This CL is about getting all the right checks inserted.
A followup CL will add an optimization pass to
remove redundant checks.
R=ken2
CC=golang-dev
https://codereview.appspot.com/12970043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
external linking
This CL is an aggregate of 10271047, 10499043, 9733044. Descriptions of each follow:
10499043
runtime,cmd/ld: Merge TLS symbols and teach 5l about ARM TLS
This CL prepares for external linking support to ARM.
The pseudo-symbols runtime.g and runtime.m are merged into a single
runtime.tlsgm symbol. When external linking, the offset of a thread local
variable is stored at a memory location instead of being embedded into a offset
of a ldr instruction. With a single runtime.tlsgm symbol for both g and m, only
one such offset is needed.
The larger part of this CL moves TLS code from gcc compiled to internally
compiled. The TLS code now uses the modern MRC instruction, and 5l is taught
about TLS fallbacks in case the instruction is not available or appropriate.
10271047
This CL adds support for -linkmode external to 5l.
For 5l itself, use addrel to allow for D_CALL relocations to be handled by the
host linker. Of the cases listed in rsc's comment in issue 4069, only case 5 and
63 needed an update. One of the TODO: addrel cases was since replaced, and the
rest of the cases are either covered by indirection through addpool (cases with
LTO or LFROM flags) or stubs (case 74). The addpool cases are covered because
addpool emits AWORD instructions, which in turn are handled by case 11.
In the runtime, change the argv argument in the rt0* functions slightly to be a
pointer to the argv list, instead of relying on a particular location of argv.
9733044
The -shared flag to 6l outputs a shared library, implemented in Go
and callable from non-Go programs such as C.
The main part of this CL change the thread local storage model.
Go uses the fastest and least general mode, local exec. TLS data in shared
libraries normally requires at least the local dynamic mode, however, this CL
instead opts for using the initial exec mode. Initial exec mode is faster than
local dynamic mode and can be used in linux since the linker has reserved a
limited amount of TLS space for performance sensitive TLS code.
Initial exec mode requires an extra load from the GOT table to determine the
TLS offset. This penalty will not be paid if ld is not in -shared mode, since
TLS accesses will be reduced to local exec.
The elf sections .init_array and .rela.init_array are added to register the Go
runtime entry with cgo at library load time.
The "hidden" attribute is added to Cgo functions called from Go, since Go
does not generate call through the GOT table, and adding non-GOT relocations for
a global function is not supported by gcc. Cgo symbols don't need to be global
and avoiding the GOT table is also faster.
The changes to 8l are only removes code relevant to the old -shared mode where
internal linking was used.
This CL only address the low level linker work. It can be submitted by itself,
but to be useful, the runtime changes in CL 9738047 is also needed.
Design discussion at
https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/zmjXkGrEx6Q
Fixes issue 5590.
R=rsc
CC=golang-dev
https://codereview.appspot.com/12871044
Committer: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
|
| |
They are just like MOVW and should be setting only
two register fields, not three.
R=ken2
CC=golang-dev, remyoudompheng
https://codereview.appspot.com/12781043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
arithmetic.
Pseudo-instructions MOVBS and MOVHS are used to clarify
the semantics of short integers vs. registers:
* 8-bit and 16-bit values in registers are assumed to always
be zero-extended or sign-extended depending on their type.
* MOVB is truncation or move of an already extended value
between registers.
* MOVBU enforces zero-extension at the destination (register).
* MOVBS enforces sign-extension at the destination (register).
And similarly for MOVH/MOVS/MOVHU.
The linker is adapted to assemble MOVB and MOVH to an ordinary
mov. Also a peephole pass in 5g that aims at eliminating
redundant zero/sign extensions is improved.
encoding/binary:
benchmark old ns/op new ns/op delta
BenchmarkReadSlice1000Int32s 220387 217185 -1.45%
BenchmarkReadStruct 12839 12910 +0.55%
BenchmarkReadInts 5692 5534 -2.78%
BenchmarkWriteInts 6137 6016 -1.97%
BenchmarkPutUvarint32 257 241 -6.23%
BenchmarkPutUvarint64 812 754 -7.14%
benchmark old MB/s new MB/s speedup
BenchmarkReadSlice1000Int32s 18.15 18.42 1.01x
BenchmarkReadStruct 5.45 5.42 0.99x
BenchmarkReadInts 5.27 5.42 1.03x
BenchmarkWriteInts 4.89 4.99 1.02x
BenchmarkPutUvarint32 15.56 16.57 1.06x
BenchmarkPutUvarint64 9.85 10.60 1.08x
crypto/des:
benchmark old ns/op new ns/op delta
BenchmarkEncrypt 7002 5169 -26.18%
BenchmarkDecrypt 7015 5195 -25.94%
benchmark old MB/s new MB/s speedup
BenchmarkEncrypt 1.14 1.55 1.36x
BenchmarkDecrypt 1.14 1.54 1.35x
strconv:
benchmark old ns/op new ns/op delta
BenchmarkAtof64Decimal 457 385 -15.75%
BenchmarkAtof64Float 574 479 -16.55%
BenchmarkAtof64FloatExp 1035 906 -12.46%
BenchmarkAtof64Big 1793 1457 -18.74%
BenchmarkAtof64RandomBits 2267 2066 -8.87%
BenchmarkAtof64RandomFloats 1416 1194 -15.68%
BenchmarkAtof32Decimal 451 379 -15.96%
BenchmarkAtof32Float 547 435 -20.48%
BenchmarkAtof32FloatExp 1095 986 -9.95%
BenchmarkAtof32Random 1154 1006 -12.82%
BenchmarkAtoi 1415 1380 -2.47%
BenchmarkAtoiNeg 1414 1401 -0.92%
BenchmarkAtoi64 1744 1671 -4.19%
BenchmarkAtoi64Neg 1737 1662 -4.32%
Fixes issue 1837.
R=rsc, dave, bradfitz
CC=golang-dev
https://codereview.appspot.com/12424043
|
|
|
|
|
|
|
|
|
|
|
|
| |
MOVBS and MOVHS are defined as duplicates of MOVB and MOVH,
and perform sign-extension moving.
No change is made to code generation.
Update issue 1837
R=rsc, bradfitz
CC=golang-dev
https://codereview.appspot.com/12682043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can then include this file in assembly to replace
cryptic constants like "7" with meaningful constants
like "(NOPROF|DUPOK|NOSPLIT)".
Converting just pkg/runtime/asm*.s for now. Dropping NOPROF
and DUPOK from lots of places where they aren't needed.
More .s files to come in a subsequent changelist.
A nonzero number in the textflag field now means
"has not been converted yet".
R=golang-dev, daniel.morsing, rsc, khr
CC=golang-dev
https://codereview.appspot.com/12568043
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.
The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.
R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://codereview.appspot.com/11406044
|
|
|
|
|
|
| |
R=golang-dev, r, dave
CC=golang-dev
https://codereview.appspot.com/11494043
|
|
|
|
|
|
|
|
|
| |
The portable code in cmd/ld already knows how to process it,
we just have to ignore it during code generation.
R=ken2
CC=golang-dev
https://codereview.appspot.com/11363043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Design at http://golang.org/s/go12symtab.
This enables some cleanup of the garbage collector metadata
that will be done in future CLs.
This CL does not move the old symtab and pclntab back into
an unmapped section of the file. That's a bit tricky and will be
done separately.
Fixes issue 4020.
R=golang-dev, dave, cshapiro, iant, r
CC=golang-dev, nigeltao
https://codereview.appspot.com/11085043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the stack frame size is larger than the known-unmapped region at the
bottom of the address space, then the stack split prologue cannot use the usual
condition:
SP - size >= stackguard
because SP - size may wrap around to a very large number.
Instead, if the stack frame is large, the prologue tests:
SP - stackguard >= size
(This ends up being a few instructions more expensive, so we don't do it always.)
Preemption requests register by setting stackguard to a very large value, so
that the first test (SP - size >= stackguard) cannot possibly succeed.
Unfortunately, that same very large value causes a wraparound in the
second test (SP - stackguard >= size), making it succeed incorrectly.
To avoid *that* wraparound, we have to amend the test:
stackguard != StackPreempt && SP - stackguard >= size
This test is only used for functions with large frames, which essentially
always split the stack, so the cost of the few instructions is noise.
This CL and CL 11085043 together fix the known issues with preemption,
at the beginning of a function, so we will be able to try turning it on again.
R=ken2
CC=golang-dev
https://codereview.appspot.com/11205043
|
|
|
|
|
|
| |
R=golang-dev, dave, r
CC=golang-dev, nigeltao
https://codereview.appspot.com/10713043
|
|
|
|
|
|
|
|
|
|
|
| |
STRINGSZ (200) is fine for lines generated by things like
instruction dumps, but an error containing a couple file
names can easily exceed that, especially on Macs with
the ridiculous default $TMPDIR.
R=ken2
CC=golang-dev
https://codereview.appspot.com/11199043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.
For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:
TEXT myfunc
check for split
jmpcond nosplit
call morestack
nosplit:
sub $xxx, sp
Now an extra instruction is inserted after the call:
TEXT myfunc
start:
check for split
jmpcond nosplit
call morestack
jmp start
nosplit:
sub $xxx, sp
The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.
The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.
The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.
Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)
runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow
goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
/Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
/Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
/Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
/Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
/Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
/Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
/Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
/Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
/Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
/Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
/Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
/Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8
And here is a seg fault during oldstack:
SIGSEGV: segmentation violation
PC=0x1b2a6
runtime.oldstack()
/Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
/Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22
goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
/Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
/Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
/Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
/Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8
goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
/Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
/Users/rsc/g/go/src/pkg/runtime/proc.c:166
rax 0x23ccc0
rbx 0x23ccc0
rcx 0x0
rdx 0x38
rdi 0x2102c0170
rsi 0x221032cfe0
rbp 0x221032cfa0
rsp 0x7fff5fbff5b0
r8 0x2102c0120
r9 0x221032cfa0
r10 0x221032c000
r11 0x104ce8
r12 0xe5c80
r13 0x1be82baac718
r14 0x13091135f7d69200
r15 0x0
rip 0x1b2a6
rflags 0x10246
cs 0x2b
fs 0x0
gs 0x0
Fixes issue 5723.
R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://codereview.appspot.com/10360048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Requires adding new linker instruction
RET f(SB)
meaning return but then immediately call f.
This is what you'd use to implement a tail call after
fiddling with the arguments, but the compiler only
uses it in genwrapper.
This CL eliminates the copy-and-paste genembedtramp
functions from 5g/8g/6g and makes the code run on ARM
for the first time. It removes a small special case for function
generation, which should help Carl a bit, but at the same time
it does not bother to implement general tail call optimization,
which we do not want anyway.
Fixes issue 5627.
R=ken2
CC=golang-dev
https://codereview.appspot.com/10057044
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes issue 5111.
Update issue 4718
This CL makes BL (Rx) to use BLX Rx instead of:
MOV LR, PC
MOV PC, Rx
R=cshapiro, rsc
CC=dave, gobot, golang-dev
https://codereview.appspot.com/9669045
|
|
|
|
|
|
| |
R=golang-dev, dave, rsc
CC=golang-dev
https://codereview.appspot.com/10085050
|