| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Summary:
In preparation for indirecting all references to closures,
we rename _closure to _static_closure to ensure any old code
will get an undefined symbol error. In order to reference
a closure foobar_closure (which is now undefined), you should instead
use STATIC_CLOSURE(foobar). For convenience, a number of these
old identifiers are macro'd.
Across C-- and C (Windows and otherwise), there were differing
conventions on whether or not foobar_closure or &foobar_closure
was the address of the closure. Now, all foobar_closure references
are addresses, and no & is necessary.
CHARLIKE/INTLIKE were not changed, simply alpha-renamed.
Part of the HEAP_ALLOCED removal patch set (#8199)
Depends on D265
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
Test Plan: validate
Reviewers: simonmar, austin
Subscribers: simonmar, ezyang, carter, thomie
Differential Revision: https://phabricator.haskell.org/D267
GHC Trac Issues: #8199
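The renaming scheme described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Closure` type is a hypothetical stand-in, and the macro body shown here is a guess at the general shape (token pasting plus address-of), not the definition from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the RTS closure type; not GHC's StgClosure. */
typedef struct { const void *info; } Closure;

/* Assumed shape of the macro: paste the new suffix and take the address,
   so every use site gets a pointer without writing '&' itself. */
#define STATIC_CLOSURE(name) (&name##_static_closure)

/* The renamed symbol. Code that still links against a symbol called
   foobar_closure would now get an undefined symbol error. */
Closure foobar_static_closure = { NULL };

/* Compatibility alias, in the spirit of "old identifiers are macro'd":
   foobar_closure is now an address, so no '&' is needed at use sites. */
#define foobar_closure STATIC_CLOSURE(foobar)
```

With the alias in place, `foobar_closure` and `STATIC_CLOSURE(foobar)` both evaluate to the same pointer.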
| |
This reverts commit 39b5c1cbd8950755de400933cecca7b8deb4ffcd.
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This will hopefully help ensure some basic consistency going forward by
overriding buffer variables. In particular, it sets the wrap length, sets
the offset to 4, and turns off tabs.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Summary: Avoid unnecessary clock_gettime() syscalls in GC stats.
Test Plan: Use strace.
Reviewers: simonmar, austin
Reviewed By: simonmar, austin
Subscribers: simonmar, relrod, carter
Differential Revision: https://phabricator.haskell.org/D39
| |
| |
Summary: Check for integer overflow in allocate() (#9172)
Test Plan: validate
Reviewers: austin
Reviewed By: austin
Subscribers: simonmar, relrod, carter
Differential Revision: https://phabricator.haskell.org/D36
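A minimal sketch of the kind of overflow check this summary refers to, assuming the request arrives as a word count that must be converted to bytes. The function name and the use of `size_t` are assumptions for illustration; the actual check in allocate() may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true and stores n_words * sizeof(void *) in *out_bytes when the
   product fits in size_t; returns false when it would wrap around. */
static bool words_to_bytes_checked(size_t n_words, size_t *out_bytes)
{
    /* Divide instead of multiplying so the test itself cannot overflow. */
    if (n_words > SIZE_MAX / sizeof(void *)) {
        return false;
    }
    *out_bytes = n_words * sizeof(void *);
    return true;
}
```

A caller would treat a `false` result as an out-of-memory condition rather than proceeding with a wrapped, too-small size.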
| |
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
| |
Problems were found on 32-bit platforms; I'll commit again when I have a fix.
This reverts the following commits:
54b31f744848da872c7c6366dea840748e01b5cf
b0534f78a73f972e279eed4447a5687bd6a8308e
| |
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
| |
This tracks the amount of memory allocation by each thread in a
counter stored in the TSO. Optionally, when the counter drops below
zero (it counts down), the thread can be sent an asynchronous
exception: AllocationLimitExceeded. When this happens, the thread is given
a small additional limit so that it can handle the exception. See
documentation in GHC.Conc for more details.
Allocation limits are similar to timeouts, but
 - timeouts use real time, not CPU time. Allocation limits do not
   count anything while the thread is blocked or in foreign code.
 - timeouts don't re-trigger if the thread catches the exception,
   allocation limits do.
 - timeouts can catch non-allocating loops, if you use
   -fno-omit-yields. This doesn't work for allocation limits.
I couldn't measure any impact on benchmarks with these changes, even
for nofib/smp.
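The count-down behaviour described above can be sketched as follows. All names here (`Thread`, `charge_allocation`, the `grace` parameter) are hypothetical, and resetting the counter to the grace amount is an assumption about how the "small additional limit" might work, not a description of the RTS code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread state; the real counter lives in the TSO. */
typedef struct {
    int64_t alloc_limit;   /* remaining budget in bytes; counts down */
    bool    limit_enabled; /* whether going below zero raises an exception */
} Thread;

/* Charge `bytes` to the thread. Returns true when the caller should
   deliver the asynchronous exception to the thread. */
static bool charge_allocation(Thread *t, int64_t bytes, int64_t grace)
{
    t->alloc_limit -= bytes;
    if (t->limit_enabled && t->alloc_limit < 0) {
        /* Assumed: give the handler a small extra budget to run in. */
        t->alloc_limit = grace;
        return true;
    }
    return false;
}
```

Because the counter keeps counting after the grace budget is installed, a thread that catches the exception and continues allocating triggers it again, which matches the re-trigger behaviour listed above.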
| |
The function was already inlined in two places, and it carries the
STATIC_INLINE annotation, so the assembly output should be the same.
To convince myself, I diffed the object files before and after the
patch, and they matched on my 64-bit Ubuntu 13.10 machine running
gcc 4.8.1-10ubuntu9.
Also, I had to move scavenge_small_bitmap up a bit, since it is not
declared in any .h file.
While I was at it, I also applied the analogous patch to Compact.c,
though there I had to write `thread_small_bitmap` rather than just move
an existing function.
| |
A long debate is in issue #8742, but the main motivation is that this
allows for applying a patch to reuse the function scavenge_small_bitmap
without changing the .o-file output.
Similarly, I changed the types in rts/sm/Compact.c, so I can create
a STATIC_INLINE function for the redundant code block:
    while (size > 0) {
        if ((bitmap & 1) == 0) {
            thread((StgClosure **)p);
        }
        p++;
        bitmap = bitmap >> 1;
        size--;
    }
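As a rough illustration of the refactoring, the loop above could be wrapped in an inline helper along these lines. This is a sketch, not the code from the patch: `Word` stands in for StgWord, `thread_stub` replaces the compactor's real thread() so the example is self-contained, and the parameter order and return value are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Word;              /* stand-in for StgWord */

static size_t threaded_count = 0;    /* stub bookkeeping, illustration only */

/* Stub for the compactor's thread(); the real one rewrites the pointer. */
static void thread_stub(void **p) { (void)p; threaded_count++; }

/* Walk `size` words starting at p. A clear bit in the bitmap marks a
   pointer field that must be threaded; a set bit marks a non-pointer. */
static inline Word *thread_small_bitmap(Word *p, Word size, Word bitmap)
{
    while (size > 0) {
        if ((bitmap & 1) == 0) {
            thread_stub((void **)p);
        }
        p++;
        bitmap = bitmap >> 1;
        size--;
    }
    return p;   /* first word past the region that was walked */
}
```

Each former copy of the loop then becomes a single call, and because the helper is inline the generated code should not need to change.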
| |
n_capabilities is declared as unsigned int (32-bit), so the multiplication
is performed in 32 bits before the result is stored in a 64-bit integer
(StgWord). Instead, cast n_capabilities to StgWord before multiplying.
Discovered by Coverity. CID 43164.
Signed-off-by: Austin Seipp <austin@well-typed.com>
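The difference between the two forms can be shown in a short sketch. It assumes a platform where unsigned int is 32 bits wide; `Word64` is a stand-in for a 64-bit StgWord and both function names are made up for illustration.

```c
#include <stdint.h>

typedef uint64_t Word64;   /* stand-in for a 64-bit StgWord */

/* The product is computed in unsigned int and only then widened, so any
   high bits have already been discarded by the time it is stored. */
static Word64 total_narrow(unsigned int n_caps, unsigned int per_cap)
{
    return n_caps * per_cap;
}

/* Widening one operand first makes the whole multiplication 64-bit. */
static Word64 total_wide(unsigned int n_caps, unsigned int per_cap)
{
    return (Word64)n_caps * per_cap;
}
```

With both arguments equal to 65536, the narrow version wraps to 0 while the wide version yields 4294967296.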
| |
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
| |
These array types are smaller than Array# and MutableArray# and are
faster when the array size is small, as they don't have the overhead
of a card table. Having no card table reduces the closure size by 2
words in the typical small array case and leads to less work when
updating or GCing the array.
Reduces both the runtime and memory allocation by 8.8% on my insert
benchmark for the HashMap type in the unordered-containers package,
which makes use of lots of small arrays. With tuned GC settings
(i.e. `+RTS -A6M`) the runtime reduction is 15%.
Fixes #8923.
| |
| |
This should have manifested earlier, but for some reason it only seemed
to trigger on Mavericks.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Following 298a25bdf and #8722 as Peter mentioned, this probably isn't
needed anymore.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
As Luke Iannini reported, the Clang iOS cross compiler apparently
doesn't support __thread for some bizarre reason, so unfortunately it
too must fall back to pthread_{get,set}specific.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This basically cleans a lot of GCTDecl up - I found it quite hard to
read and a bit confusing. The changes are mostly cosmetic: better
delineation between the alternative cases, light touchups, and an
attempt to make every branch as consistent as possible.
However, this patch does have one significant effect: it will ensure
that any LLVM-based compilers will use __thread if they support it.
Before, they would simply always use pthread_getspecific and
pthread_setspecific, which are almost surely even *more* inefficient.
The details are a bit too long and boring to go into here; see #7602.
After talking with Simon, we decided to play it safe - __thread can at
least be optimized by future clang releases even further on OS X if they
choose, and it's safer until we can investigate the pthread
implementation further on Mavericks.
For Linux, the story isn't so bleak if you use Clang (for whatever
reason) - Linux directly writes to `%fs` for __thread slots (while OS X
will perform a load followed by an indirect call.) So it should still be
fairly competitive, speed-wise.
Signed-off-by: Austin Seipp <austin@well-typed.com>
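The two strategies being compared can be sketched side by side. The `USE_PTHREAD_TLS` switch and the `set_gct`/`get_gct` helpers are hypothetical names chosen for this example; the real selection logic in GCTDecl depends on compiler and platform checks that are not reproduced here.

```c
#include <stddef.h>

#if defined(USE_PTHREAD_TLS)   /* hypothetical switch, not GHC's macro */
#include <pthread.h>

/* Fallback: one key shared by all threads, looked up on every access. */
static pthread_key_t  gct_key;
static pthread_once_t gct_once = PTHREAD_ONCE_INIT;

static void make_gct_key(void) { pthread_key_create(&gct_key, NULL); }

static void set_gct(void *p)
{
    pthread_once(&gct_once, make_gct_key);
    pthread_setspecific(gct_key, p);
}

static void *get_gct(void)
{
    pthread_once(&gct_once, make_gct_key);
    return pthread_getspecific(gct_key);
}
#else
/* Preferred where supported: the compiler allocates a per-thread slot. */
static __thread void *gct;

static void  set_gct(void *p) { gct = p; }
static void *get_gct(void)    { return gct; }
#endif
```

Both variants expose the same interface, so the rest of the code does not need to know which mechanism is in use.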
| |
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
| |
On win64 sizeof(long) != sizeof(void*), so debugTrace was casting a
value to a type of the wrong size, causing a validate failure.
Signed-off-by: Austin Seipp <austin@well-typed.com>
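One portable way to avoid this class of problem is to go through uintptr_t and the matching format macro instead of long. The sketch below is a generic illustration; `format_pointer` is a hypothetical helper and is not how debugTrace itself was changed.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a pointer without assuming that long is pointer-sized.
   uintptr_t can hold a converted pointer, and PRIxPTR is the matching
   printf conversion for it. */
static int format_pointer(char *buf, size_t len, const void *p)
{
    return snprintf(buf, len, "0x%" PRIxPTR, (uintptr_t)p);
}
```

On win64, where long is 32 bits and pointers are 64 bits, this keeps the argument size and the format specifier in agreement.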
| |
An earlier patch fixed a bug in flushExec on Linux only. This
patch uses the fixed code on all operating systems.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
| |
We now do the allocation of the blackhole indirection closure inside the
RTS procedure 'newCAF' instead of generating the allocation code inline
in the closure body of each CAF. This slightly decreases code size in
modules with a lot of CAFs.
As a result of this change, for example, the size of DynFlags.o drops by
~60KB and HsExpr.o by ~100KB.
| |
| |
Instead, just don't do anything on x86/amd64, and on !x86, use either A)
__clear_cache from libgcc, or B) sys_icache_invalidate for OS X (and
iOS.)
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This adds code for jumping to given addresses for ARM, written by Ben
Gamari.
However, when allocating new infotables for bytecode (which is where
this jump code occurs), we need to be sure to flush the cache on the
execute pointer returned from allocateExec() - on systems like ARM, the
processor won't reliably read back code or automatically cache flush,
where x86 will.
So we add a new flushExec primitive to call out to GCC's
__builtin___clear_cache primitive, which will properly generate the
correct code (nothing on x86, and a call to libgcc's __clear_cache on
ARM) and make sure we use it after writing the code out.
Authored-by: Ben Gamari <bgamari.foss@gmail.com>
Authored-by: Austin Seipp <austin@well-typed.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
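The write-then-flush pattern described above might look roughly like this. `write_code` is a hypothetical helper for illustration; only `__builtin___clear_cache` (a GCC and Clang builtin taking begin and end pointers) is taken from the message, and the real flushExec may take different arguments.

```c
#include <stddef.h>
#include <string.h>

/* Copy freshly generated code into an executable buffer, then flush the
   instruction cache over exactly the bytes that were written. On x86 the
   builtin compiles to nothing; on ARM it calls into libgcc. */
static void write_code(void *exec_addr, const void *code, size_t len)
{
    memcpy(exec_addr, code, len);
    __builtin___clear_cache((char *)exec_addr, (char *)exec_addr + len);
}
```

The flush has to come after the write and before the first jump into the buffer; otherwise the processor may execute stale instructions.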
| |
This creates a new C API:
    initLinker_ (int retain_cafs)
The old initLinker() was left as-is for backwards compatibility. See
documentation in Linker.h.
| |
This resurrects some old code and makes it work again. The idea is
that we want to get an error message if we ever enter a CAF that has
been GC'd, rather than following its indirection which will likely
cause a segfault. Without this patch, these bugs are hard to track
down in gdb, because the IND_STATIC code overwrites R1 (the pointer to
the CAF) with its indirectee before jumping into bad memory, so we've
lost the address of the CAF that got GC'd.
Some associated refactoring while I was here.
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
| |
| |
| |
This is to avoid:
    rts/sm/Storage.c: In function ‘allocate’:
    rts/sm/Storage.c:725:13:
    error: multi-line comment [-Werror=comment]
    cc1: all warnings being treated as errors
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
cf. commit 0b0fec536e35769b64b8bc5397c84138fa512155
| |
| |
| |
| |
volatile StgWord8 is not guaranteed to be atomic.
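For readers unfamiliar with the distinction: volatile only stops the compiler from caching or eliding accesses, and gives no atomicity or ordering guarantee between threads. A generic sketch of the alternative using C11 atomics follows. The flag and function names are hypothetical, and the RTS uses its own atomic primitives rather than <stdatomic.h>, so this shows the idea rather than the actual fix.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A one-byte flag shared between threads. Declaring it atomic makes
   loads and stores indivisible and lets us state the ordering we need. */
static atomic_uchar sync_flag;

/* Writer side: publish the request with release ordering. */
static void request_sync(void)
{
    atomic_store_explicit(&sync_flag, 1, memory_order_release);
}

/* Reader side: observe the request with acquire ordering. */
static bool sync_requested(void)
{
    return atomic_load_explicit(&sync_flag, memory_order_acquire) != 0;
}
```

The release/acquire pair also orders surrounding memory accesses, which a plain volatile byte does not.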
| |
whitehole_spin is only defined when PROF_SPIN is set.
| |
This reverts commit d85044f6b201eae0a9e453b89c0433608e0778f0.