| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
|
| |
|
|
|
|
|
|
|
| |
Previously the libffi Adjustor implementation would use allocateExec to
create executable mappings. However, allocateExec is also used elsewhere
in GHC to allocate things other than ffi_closure, which is a use-case
which libffi does not support.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Pthread based timer was initialized started while some other parts of
the RTS assume it is initialized stopped, e.g. in hs_init_ghc:
/* Start the "ticker" and profiling timer but don't start until the
* scheduler is up. However, the ticker itself needs to be initialized
* before the scheduler to ensure that the ticker mutex is initialized as
* moreCapabilities will attempt to acquire it.
*/
* after a fork, don't start the timer before the IOManager is
initialized: the timer handler (handle_tick) might call wakeUpRts to
perform an idle GC, which calls wakeupIOManager/ioManagerWakeup
Found while debugging #18033/#20132 but I couldn't confirm if it fixes
them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In `checkBlockingQueues` we must always untag the `bh` field of an `StgBlockingQueue`.
While at first glance it might seem a sensible assumption that `bh` will
always be a blackhole and therefore never be tagged, the GC could
shortcut the indirection and put a tagged pointer into the indirection.
This blew up on aarch64-darwin with a misaligned access. `bh` pointed
to an address that always ended in 0xa. On architectures that
are a little less strict about alignment, this would have read
a garbage info table pointer, which very, very unlikely would have been equal to
`stg_BLACKHOLE_info` and therefore things accidentally worked. However,
on AArch64, the read of the info table pointer resulted in a SIGBUS due
to misaligned read.
Fixes #20093.
|
|
|
|
| |
fixes #19628
|
|
|
|
|
|
|
|
|
| |
Update submodule libffi-tarballs to upstream commit 4f9e20a.
Remove C compiler flags that suppress warnings in the RTS. Those
warnings have been fixed by libffi upstream.
Fixes #19885
|
|
|
|
|
|
| |
Previously we branched unnecessarily on
IF_NONMOVING_WRITE_BARRIER_ENABLED on every trip through the array
barrier push loop.
|
|
|
|
|
|
| |
We need to be careful about the sign bit for BR26 relocation
otherwise we end up encoding a large positive number and reading
back a large negative number.
|
| |
|
|
|
|
| |
Fixes #19995
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note [fd_set_overflow]
~~~~~~~~~~~~~~~~~~~~~~
In this note is the very sad tale of __darwin_fd_set_overflow.
The 8.10.5 release was broken because it was built in an environment
where the libraries were provided by XCode 12.*, these libraries introduced
a reference to __darwin_fd_set_overflow via the FD_SET macro which is used in
Select.c. Unfortunately, this symbol is not available with XCode 11.* which
led to a linker error when trying to link anything. This is almost certainly
a bug in XCode but we still have to work around it.
Undefined symbols for architecture x86_64:
"___darwin_check_fd_set_overflow", referenced from:
_awaitEvent in libHSrts.a(Select.o)
ld: symbol(s) not found for architecture x86_64
One way to fix this is to upgrade your version of xcode, but this would
force the upgrade on users prematurely. Fortunately it also seems safe to pass
the linker option "-Wl,-U,___darwin_check_fd_set_overflow" because the usage of
the symbol is guarded by a guard to check if it's defined.
__header_always_inline int
__darwin_check_fd_set(int _a, const void *_b)
{
if ((uintptr_t)&__darwin_check_fd_set_overflow != (uintptr_t) 0) {
return __darwin_check_fd_set_overflow(_a, _b, 1);
return __darwin_check_fd_set_overflow(_a, _b, 0);
} else {
return 1;
}
Across the internet there are many other reports of this issue
See: https://github.com/mono/mono/issues/19393
, https://github.com/sitsofe/fio/commit/b6a1e63a1ff607692a3caf3c2db2c3d575ba2320
The issue was originally reported in #19950
Fixes #19950
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now have two darwin flavours. AArch64-Darwin, and
x86_64-darwin, the latter one which has proper custom
adjustor support, the former though relies on libffi.
Mixing both leads to odd crashes, as the closures might
not fit the size of the libffi closures. Hence this
needs to be guarded by the USE_LBFFI_FOR_ADJUSTORS guard.
Original patch by Hamish Mackenzie
|
|
|
|
| |
Fixes #20006
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In which we add a new code generator to the Glasgow Haskell
Compiler. This codegen supports ELF and Mach-O targets, thus covering
Linux, macOS, and BSDs in principle. It was tested only on macOS and
Linux. The NCG follows a similar structure as the other native code
generators we already have, and should therfore be realtively easy to
follow.
It supports most of the features required for a proper native code
generator, but does not claim to be perfect or fully optimised. There
are still opportunities for optimisations.
Metric Decrease:
ManyAlternatives
ManyConstructors
MultiLayerModules
PmSeriesG
PmSeriesS
PmSeriesT
PmSeriesV
T10421
T10421a
T10858
T11195
T11276
T11303b
T11374
T11822
T12227
T12545
T12707
T13035
T13253
T13253-spj
T13379
T13701
T13719
T14683
T14697
T15164
T15630
T16577
T17096
T17516
T17836
T17836b
T17977
T17977b
T18140
T18282
T18304
T18478
T18698a
T18698b
T18923
T1969
T3064
T5030
T5321FD
T5321Fun
T5631
T5642
T5837
T783
T9198
T9233
T9630
T9872d
T9961
WWRec
Metric Increase:
T4801
|
| |
|
|
|
|
|
|
|
|
| |
The stg_ctoi_t and stg_ret_t procedures which convert unboxed
tuples between the bytecode an native calling convention were
causing a panic when using the LLVM backend.
Fixes #19591
|
| |
|
|
|
|
|
|
| |
NetBSD supports pthread_setname_np() but it expects a printf-style format string and a string argument.
Also use pthread for itimer on this platform.
|
| |
|
|
|
|
|
|
| |
derivatives
The constant CLOCK_THREAD_CPUTIME_ID is defined in a system header but it isn't acutally usable. clock_gettime(2) always returns EINVAL.
|
|
|
|
|
|
|
| |
The __BSD_VISIBLE and _DARWIN_C_SOURCE macros expose non-POSIX prototypes in
system header files. We should scope these to just the ".c" modules that
actually need them, and avoid defining them in header files used in other C
modules.
|
| |
|
|
|
|
|
|
|
|
| |
Previously we would check only that the *start* of the mapping was in
the bottom 32-bits of address space. However, we need the *entire*
mapping to be in low memory. Fix this.
Noticed by @Phyx.
|
|
|
|
|
|
| |
This avoids surprises in the non-threaded runtime with
blocked signals killing the process because they're only
blocked in the main thread and not in the ticker thread.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also some code cleanup, and a fix for an (extant unrelated) missing
<pthread_np.h> include that should hopefully resolve a failure in the
FreeBSD CI build, since it is best to make sure that this MR actually
builds on FreeBSD systems other than mine.
Some unexpected metric changes on FreeBSD (perhaps because CI had been
failing for a while???):
Metric Decrease:
T3064
T5321Fun
T5642
T9020
T12227
T13253-spj
T15164
T18282
WWRec
Metric Increase:
haddock.compiler
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The FreeBSD C <ctype.h> header supports per-thread locales by exporting a
static inline function that references the `_ThreadRuneLocale` thread-local
variable. This means that object files that use e.g. isdigit(3) end up with
TLSGD(19) relocations, and would not load into ghci or the language server.
Here we add support for this type of relocation, for now just on FreeBSD, and
only for external references to thread-specifics defined in already loaded
dynamic modules (primarily libc.so). This is sufficient to resolve the
<ctype.h> issues.
Runtime linking of ".o" files which *define* new thread-specific variables
would be noticeably more difficult, as this would likely require new rtld APIs.
|
|
|
|
|
|
|
| |
Previously we used this non-portable function unconditionally, breaking
FreeBSD.
Fixes #19637.
|
| |
|
| |
|
|
|
|
|
|
|
| |
So we did *not* have the stgCallocBytes prototype, and subsequently
the C compiler defaulted to `int` as a return value. Thus generating
sxtw instructions for the return value of stgCalloBytes to produce
the expected void *.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This drops allocateExec for darwin, and replaces it with
a alloc, write, mark executable strategy instead. This prevents
us from trying to allocate an executable range and then write to
it, which X^W will prohibit on darwin.
This will *only* work if we can use mmap.
|
|
|
|
| |
This is a pre-requisite for making aarch64-darwin work.
|
|
|
|
|
| |
Previously we were treating the thread ID as a HANDLE, but it is not. We
must first OpenThread.
|
|
|
|
|
|
|
|
| |
Previously `pathstat` relied on msvcrt's `stat` implementation, which was
not long-path-aware. It should rather be defined in terms of the `stat`
implementation provided by `utils/fs`.
Fixes #19541.
|
| |
|
| |
|
|
|
|
|
|
| |
tuples and sums.
fixes #1257
|
|
|
|
| |
This means it will be reposted everytime the eventlog is started.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement new debugger command `:ignore` to set an `ignore count`
for a specified breakpoint.
* Allow new optional parameter on `:continue` command to set an
`ignore count` for the current breakpoint.
* In the Interpreter replace the current `Word8` BreakArray with
an `Int` array.
* Change semantics of values in `BreakArray` to:
n < 0 : Breakpoint is disabled.
n == 0 : Breakpoint is enabled.
n > 0 : Breakpoint is enabled, but ignore next `n` iterations.
* Rewrite `:enable`/`:disable` processing as a special case of `:ignore`.
* Remove references to `BreakArray` from `ghc/UI.hs`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Related to #19381 #19359 #14702
After a spike in memory usage we have been conservative about returning
allocated blocks to the OS in case we are still allocating a lot and would
end up just reallocating them. The result of this was that up to 4 * live_bytes
of blocks would be retained once they were allocated even if memory usage ended up
a lot lower.
For a heap of size ~1.5G, this would result in OS memory reporting 6G which is
both misleading and worrying for users.
In long-lived server applications this results in consistent high memory
usage when the live data size is much more reasonable (for example ghcide)
Therefore we have a new (2021) strategy which starts by retaining up to 4 * live_bytes
of blocks before gradually returning uneeded memory back to the OS on subsequent
major GCs which are NOT caused by a heap overflow.
Each major GC which is NOT caused by heap overflow increases the consec_idle_gcs
counter and the amount of memory which is retained is inversely proportional to this number.
By default the excess memory retained is
oldGenFactor (controlled by -F) / 2 ^ (consec_idle_gcs * returnDecayFactor)
On a major GC caused by a heap overflow, the `consec_idle_gcs` variable is reset to 0
(as we could continue to allocate more, so retaining all the memory might make sense).
Therefore setting bigger values for `-Fd` makes the rate at which memory is returned slower.
Smaller values make it get returned faster. Setting `-Fd0` disables the
memory return completely, which is the behaviour of older GHC versions.
The default is `-Fd4` which results in the following scaling:
> mapM print [(x, 1/ (2**(x / 4))) | x <- [1 :: Double ..20]]
(1.0,0.8408964152537146)
(2.0,0.7071067811865475)
(3.0,0.5946035575013605)
(4.0,0.5)
(5.0,0.4204482076268573)
(6.0,0.35355339059327373)
(7.0,0.29730177875068026)
(8.0,0.25)
(9.0,0.21022410381342865)
(10.0,0.17677669529663687)
(11.0,0.14865088937534013)
(12.0,0.125)
(13.0,0.10511205190671433)
(14.0,8.838834764831843e-2)
(15.0,7.432544468767006e-2)
(16.0,6.25e-2)
(17.0,5.255602595335716e-2)
(18.0,4.4194173824159216e-2)
(19.0,3.716272234383503e-2)
(20.0,3.125e-2)
So after 13 consecutive GCs only 0.1 of the maximum memory used will be retained.
Further to this decay factor, the amount of memory we attempt to retain is
also influenced by the GC strategy for the oldest generation. If we are using
a copying strategy then we will need at least 2 * live_bytes for copying to take
place, so we always keep that much. If using compacting or nonmoving then we need a lower number,
so we just retain at least `1.2 * live_bytes` for some protection.
In future we might want to make this behaviour more aggressive, some
relevant literature is
> Ulan Degenbaev, Jochen Eisinger, Manfred Ernst, Ross McIlroy, and Hannes Payer. 2016. Idle time garbage collection scheduling. SIGPLAN Not. 51, 6 (June 2016), 570–583. DOI:https://doi.org/10.1145/2980983.2908106
which describes the "memory reducer" in the V8 javascript engine which
on an idle collection immediately returns as much memory as possible.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If startEventlog is called after the program has already started running
then quite a few useful events are missing from the eventlog because
they are only posted when the program starts. This patch adds a
mechanism to declare that an event should be reposted everytime the
startEventlog function is called.
Now in EventLog.c there is a global list of functions called
`eventlog_header_funcs` which stores a list of functions which should be
called everytime the eventlog starts.
When calling `postInitEvent`, the event will not only be immediately
posted to the eventlog but also added to the global list.
When startEventLog is called, the list is traversed and the events
reposted.
|