path: root/rts/win32/OSMem.c
Commit message | Author | Age | Files | Lines
* Add error information to osCommitMemory on failure. | Moritz Angermann | 2021-03-20 | 1 | -1/+1
* rts: Allocate MBlocks with MAP_TOP_DOWN on Windows | Ben Gamari | 2020-11-27 | 1 | -1/+4
  As noted in #18991, we would previously allocate heap in low memory. Due to this the linker, which typically *needs* low memory, would end up competing with the heap. In longer builds we end up running out of low memory entirely, leading to linking failures.
* rts/win32: Exit with EXIT_HEAPOVERFLOW if memory commit fails | Ben Gamari | 2020-07-26 | 1 | -1/+1
  Since switching to the two-step allocator, the `outofmem` test fails via `osCommitMemory` failing to commit. However, this was previously exiting with `EXIT_FAILURE`, rather than `EXIT_HEAPOVERFLOW`. I think the latter is a more reasonable exit code for this case and matches the behavior on POSIX platforms.
* winio: Use SlimReaderLocks and ConditionalVariables provided by the OS instead of emulated ones | Tamar Christina | 2020-07-15 | 1 | -18/+1
* Enable large address space optimization on windows. | Andreas Klebinger | 2020-06-25 | 1 | -1/+1
  Starting with Win 8.1/Server 2012, Windows no longer eagerly preallocates page tables for reserved memory, which prevented us from using this approach in the past.

  We also try to allocate the heap high in the memory space. Hopefully this makes it easier to allocate things in the low 4GB of memory that need to be there, like jump islands for the linker.
* Windows: Update tarballs to GCC 9.2 and remove MAX_PATH limit. | Tamar Christina | 2019-10-20 | 1 | -4/+4
* rts: fix Windows megablock allocator | Tamar Christina | 2018-11-22 | 1 | -5/+14
  The megablock allocator does not currently check whether, after aligning the free region, it still has enough space to actually do the allocation.

  This causes it to return a memory region which it didn't fully allocate itself. Even worse, it can cause it to return a block with a region that will be present in two allocation pools.

  If you're lucky, this causes an error from the OS that you're committing memory that has never been reserved; otherwise it causes random heap corruption.

  This change makes it consider the alignment as well.

  Test Plan: ./validate, testcase testmblockalloc
  Reviewers: bgamari, erikd, simonmar
  Reviewed By: simonmar
  Subscribers: rwbarton, carter
  Differential Revision: https://phabricator.haskell.org/D5363
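The missing check described in this commit message can be illustrated with a small, portable sketch. This is not the code from `OSMem.c`; the `free_region` type and the `carve_aligned` function are hypothetical names introduced only for illustration, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free-list entry: a reserved but unused address range. */
typedef struct {
    uintptr_t base;  /* start address of the free region */
    size_t    size;  /* length of the free region in bytes */
} free_region;

/* Try to carve `want` bytes, aligned to `align` (a power of two), out of
 * region `r`. Returns true and stores the aligned start in *out only if
 * the region still has enough room after its start has been rounded up. */
bool carve_aligned(const free_region *r, size_t want, size_t align,
                   uintptr_t *out)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Round the base up to the next multiple of `align`. */
    uintptr_t aligned = (r->base + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t    skipped = (size_t)(aligned - r->base);

    /* Alignment may consume part (or all) of the region, so the space
     * check has to be made against what is left, not the original size. */
    if (skipped > r->size || r->size - skipped < want) {
        return false;
    }
    *out = aligned;
    return true;
}
```

A region that starts halfway through a megabyte and is exactly one megabyte long looks large enough before alignment, but only half of it remains afterwards, so the sketch rejects it.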
* rts: Throw better error if --numa is used without libnuma support | Ben Gamari | 2018-05-03 | 1 | -0/+5
  Test Plan: Validate, run program with `+RTS --numa` without libnuma support compiled in
  Reviewers: erikd, simonmar
  Subscribers: thomie, carter
  GHC Trac Issues: #14956
  Differential Revision: https://phabricator.haskell.org/D4556
* Fix NUMA support on Windows (#15049) | David Kraeutmann | 2018-05-03 | 1 | -7/+16
  * osNumaNodes now returns the right number of nodes
  * thread affinity is now correctly set

  TODO: no noticeable performance improvement. does windows already distribute threads in a NUMA-aware fashion?

  Test Plan:
  * validate
  * local tests on a NUMA machine
  Reviewers: bgamari, erikd, simonmar
  Reviewed By: bgamari, simonmar
  Subscribers: thomie, carter
  Differential Revision: https://phabricator.haskell.org/D4607
* Various Windows / Cross Compile to Windows fixes | Moritz Angermann | 2018-03-02 | 1 | -1/+1
  - Adds quick-cross-ncg flavour.
  - Fix windows wchar with `_s` for mingw
  - Lookup windres, dllwrap and objdump
  - Fix type.

  Reviewers: bgamari, hvr, Phyx, erikd, simonmar
  Reviewed By: bgamari
  Subscribers: rwbarton, thomie, erikd, carter
  Differential Revision: https://phabricator.haskell.org/D4430
* Prefer #if defined to #ifdef | Ben Gamari | 2017-04-28 | 1 | -1/+1
  Our new CPP linter enforces this.
* rts: Make out-of-memory errors more consistent | Ben Gamari | 2017-04-02 | 1 | -1/+1
  This will make it a bit easier to maintain consistent output in the testsuite.
* Fix x86 Windows build and testsuite | Tamar Christina | 2016-12-06 | 1 | -3/+3
  Summary: Fix issues preventing x86 GHC from building on Windows and fix a segfault in the testsuite.

  Test Plan: ./validate
  Reviewers: austin, erikd, simonmar, bgamari
  Reviewed By: bgamari
  Subscribers: #ghc_windows_task_force, thomie
  Differential Revision: https://phabricator.haskell.org/D2789
* Use C99's bool | Ben Gamari | 2016-11-29 | 1 | -2/+2
  Test Plan: Validate on lots of platforms
  Reviewers: erikd, simonmar, austin
  Reviewed By: erikd, simonmar
  Subscribers: michalt, thomie
  Differential Revision: https://phabricator.haskell.org/D2699
* Add NUMA support for Windows | Tamar Christina | 2016-10-01 | 1 | -8/+73
  Summary: NOTE: I have been able to do simple testing on emulated NUMA nodes. Real hardware would be needed for a proper test.

  D2199 added NUMA support for Linux; I have just filled in the missing pieces following the description of the Linux APIs.

  Test Plan: Use `bcdedit.exe /set groupsize 2` to modify the kernel again (similar to D2533).

  This generates some NUMA nodes:

  ```
  Logical Processor to NUMA Node Map:
  NUMA Node 0: ** --
  NUMA Node 1: -- **

  Approximate Cross-NUMA Node Access Cost (relative to fastest):
       00  01
  00: 1.1 1.1
  01: 1.0 1.0
  ```

  Run `../test-numa.exe +RTS --numa -RTS` and check PerfMon for NUMA allocations.

  Reviewers: simonmar, erikd, bgamari, austin
  Reviewed By: simonmar
  Subscribers: thomie, #ghc_windows_task_force
  Differential Revision: https://phabricator.haskell.org/D2534
  GHC Trac Issues: #12602
* Make start address of `osReserveHeapMemory` tunable via command line -xb | Francesco Mazzoli | 2016-09-09 | 1 | -4/+5
  Summary: We stumbled upon a case where an external library (OpenCL) does not work if a specific address (0x200000000) is taken.

  It so happens that `osReserveHeapMemory` starts trying to mmap at 0x200000000:

  ```
  void *hint = (void*)((W_)8 * (1 << 30) + attempt * BLOCK_SIZE);
  at = osTryReserveHeapMemory(*len, hint);
  ```

  This makes it impossible to use Haskell programs compiled with GHC 8 with C functions that use OpenCL.

  See this example https://github.com/chpatrick/oclwtf for a repro.

  This patch allows the user to work around this kind of behavior outside our control by letting the user override the starting address through an RTS command line flag.

  Reviewers: bgamari, Phyx, simonmar, erikd, austin
  Reviewed By: Phyx, simonmar
  Subscribers: rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D2513
* NUMA support | Simon Marlow | 2016-06-10 | 1 | -0/+22
  Summary: The aim here is to reduce the number of remote memory accesses on systems with a NUMA memory architecture, typically multi-socket servers.

  Linux provides a NUMA API for doing two things:
  * Allocating memory local to a particular node
  * Binding a thread to a particular node

  When given the +RTS --numa flag, the runtime will
  * Determine the number of NUMA nodes (N) by querying the OS
  * Assign capabilities to nodes, so cap C is on node C%N
  * Bind worker threads on a capability to the correct node
  * Keep separate free lists in the block layer for each node
  * Allocate the nursery for a capability from node-local memory
  * Allocate blocks in the GC from node-local memory

  For example, using nofib/parallel/queens on a 24-core 2-socket machine:

  ```
  $ ./Main 15 +RTS -N24 -s -A64m
  Total time 173.960s ( 7.467s elapsed)
  $ ./Main 15 +RTS -N24 -s -A64m --numa
  Total time 150.836s ( 6.423s elapsed)
  ```

  The biggest win here is expected to be allocating from node-local memory, so that means programs using a large -A value (as here).

  According to perf, on this program the number of remote memory accesses was reduced by more than 50% by using `--numa`.

  Test Plan:
  * validate
  * There's a new flag --debug-numa=<n> that pretends to do NUMA without actually making the OS calls, which is useful for testing the code on non-NUMA systems.
  * TODO: I need to add some unit tests

  Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2199
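The rule "cap C is on node C%N" from the list above amounts to a round-robin assignment. A minimal sketch, assuming a hypothetical helper name `cap_to_numa_node` that does not come from the RTS sources:

```c
#include <assert.h>
#include <stdint.h>

/* Map capability number `cap` to a NUMA node, given `n_nodes` nodes.
 * Capabilities are spread round-robin: 0, 1, ..., N-1, 0, 1, ... */
uint32_t cap_to_numa_node(uint32_t cap, uint32_t n_nodes)
{
    assert(n_nodes > 0);
    return cap % n_nodes;
}
```

With 24 capabilities on a 2-node machine, as in the benchmark quoted above, even-numbered capabilities would land on node 0 and odd-numbered ones on node 1.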
* Runtime linker: Break m32 allocator out into its own file | Erik de Castro Lopo | 2016-05-25 | 1 | -6/+6
  This makes the code a little more modular and allows the removal of some CPP hackery. By providing dummy implementations of the `m32_*` functions (which simply call `errorBelch`) it means that the call sites for these functions are syntax checked even when `RTS_LINKER_USE_MMAP` is `0`.

  Also changes some size parameter types from `unsigned int` to `size_t`.

  Test Plan: Validate on Linux, OS X and Windows
  Reviewers: Phyx, hsyl20, bgamari, simonmar, austin
  Reviewed By: simonmar, austin
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2237
* Get types in osFreeMBlocks in sync with osGetMBlocks | Tomas Carnecky | 2016-05-19 | 1 | -1/+1
  The first argument of 'osFreeMBlocks' ought to have the same type as the return value from 'osGetMBlocks'. Make it so.

  Reviewers: austin, simonmar, bgamari
  Reviewed By: bgamari
  Subscribers: erikd, rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D2235
* rts: Replace `nat` with `uint32_t` | Erik de Castro Lopo | 2016-05-05 | 1 | -4/+4
  The `nat` type was an alias for `unsigned int` with a comment saying it was at least 32 bits. We keep the typedef in case client code is using it but mark it as deprecated.

  Test Plan: Validated on Linux, OS X and Windows
  Reviewers: simonmar, austin, thomie, hvr, bgamari, hsyl20
  Differential Revision: https://phabricator.haskell.org/D2166
* T11300: Fix test on windows | Tamar Christina | 2016-01-14 | 1 | -2/+3
  Summary: Fix the exit code on Windows to match the one expected by the out-of-memory test

  Test Plan: ./validate
  Reviewers: simonmar, austin, thomie, bgamari
  Reviewed By: thomie, bgamari
  Differential Revision: https://phabricator.haskell.org/D1753
  GHC Trac Issues: #11422
* rts/posix: Reduce heap allocation amount on mmap failure | Ben Gamari | 2015-11-01 | 1 | -2/+2
  Since the introduction of the two-step allocator, the RTS asks the kernel for a large upfront mmap'd region of memory (on the order of terabytes). While we have no expectation that this entire region will be backed by physical memory, this scheme nevertheless fails on some systems with resource limits. Here we use a back-off scheme to reduce our allocation request until we find a size agreeable to the kernel. Fixes #10877.

  This also fixes a latent bug wherein the heap reservation retry logic would fail to free the previously reserved address space, which would likely result in a heap allocation failure.

  Test Plan: set address space limit with `ulimit -v 67108864` and try running a compiled program
  Reviewers: simonmar, austin
  Reviewed By: simonmar
  Subscribers: thomie, RyanGlScott
  Differential Revision: https://phabricator.haskell.org/D1405
  GHC Trac Issues: #10877
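A back-off scheme of the kind described in this commit message could look roughly like the sketch below. The commit message does not say by how much the request is reduced on each attempt, so halving is an assumption made here; `reserve_fn`, `reserve_with_backoff` and `demo_reserve` are hypothetical names, and `demo_reserve` merely stands in for an OS that refuses large requests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reservation hook: returns NULL when the OS refuses `len`. */
typedef void *(*reserve_fn)(size_t len);

/* Ask for *len bytes of address space. On refusal, halve the request
 * (an assumed step size) and retry until it drops below min_len.
 * On success *len holds the size that was actually granted. */
void *reserve_with_backoff(reserve_fn reserve, size_t *len, size_t min_len)
{
    assert(reserve != NULL && len != NULL && min_len > 0);

    size_t want = *len;
    while (want >= min_len) {
        void *p = reserve(want);
        if (p != NULL) {
            *len = want;
            return p;
        }
        want /= 2;  /* back off and try a smaller reservation */
    }
    return NULL;    /* nothing of at least min_len could be reserved */
}

/* Demonstration stand-in for the OS: refuses anything above 1 GiB. */
static char demo_arena;
void *demo_reserve(size_t len)
{
    return len <= ((size_t)1 << 30) ? (void *)&demo_arena : NULL;
}
```

Starting from a 2 GiB request against `demo_reserve`, the first attempt fails and the second, at 1 GiB, succeeds; the caller can then read the granted size back from `*len`.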
* rts: Make MBLOCK_SPACE_SIZE dynamic | Ben Gamari | 2015-10-30 | 1 | -3/+3
  Previously this was introduced in D524 as a compile-time constant. Sadly, this isn't flexible enough to allow for environments where ulimits restrict the maximum address space size (see, for instance, Consequently, we are forced to make this dynamic.

  In principle this shouldn't be so terrible as we can place both the beginning and end addresses within the same cache line, likely incurring only one or so additional instruction in HEAP_ALLOCED.

  Test Plan: validate
  Reviewers: austin, simonmar
  Reviewed By: simonmar
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1353
  GHC Trac Issues: #10877
* Two step allocator for 64-bit systems | Giovanni Campagna | 2015-07-22 | 1 | -5/+72
  Summary: The current OS memory allocator conflates the concepts of allocating address space and allocating memory, which makes the HEAP_ALLOCED() implementation excessively complicated (as the only thing it cares about is address space layout) and slow. Instead, what we want is to allocate a single insanely large contiguous block of address space (to make HEAP_ALLOCED() checks fast), and then commit subportions of that in 1MB blocks as we did before.

  This is currently behind a flag, USE_LARGE_ADDRESS_SPACE, that is only enabled for certain OSes.

  Test Plan: validate
  Reviewers: simonmar, ezyang, austin
  Subscribers: thomie, carter
  Differential Revision: https://phabricator.haskell.org/D524
  GHC Trac Issues: #9706
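With a single contiguous reservation, a heap-membership test reduces to a range check. The sketch below shows the idea only; it is not the `HEAP_ALLOCED()` macro from the RTS, and the `heap_space` struct and `heap_alloced` function are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the one reserved address range.
 * Keeping both bounds in one small struct keeps them close in memory. */
typedef struct {
    uintptr_t begin;  /* first address of the reserved range */
    uintptr_t end;    /* one past the last address of the range */
} heap_space;

/* Return true if `p` lies inside [begin, end). */
bool heap_alloced(const heap_space *s, const void *p)
{
    assert(s->begin <= s->end);
    uintptr_t a = (uintptr_t)p;
    /* One unsigned comparison covers both bounds: if a < begin, the
     * subtraction wraps around to a value larger than the range size. */
    return (a - s->begin) < (s->end - s->begin);
}
```

When the bounds are run-time values rather than compile-time constants, the check needs them loaded from memory first, which is the small extra cost the "Make MBLOCK_SPACE_SIZE dynamic" entry alludes to.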
* Replaced SEH handlers with VEH handlers which should work uniformly across x86 and x64 | Tamar Christina | 2015-03-03 | 1 | -2/+0
  Summary: On Windows, the default action for things like division by zero and segfaults is to pop up a Dr. Watson error reporting dialog if the exception is unhandled by the user code.

  This is a pain when we are SSHed into a Windows machine, or when we want to debug a problem with gdb (gdb will get a first and second chance to handle the exception, but if it doesn't the pop-up will show).

  veh_excn provides two macros, `BEGIN_CATCH` and `END_CATCH`, which will catch such exceptions in the entire process and die by printing a message and calling `stg_exit(1)`.

  Previously this code was handled using SEH (Structured Exception Handlers); however, each compiler and platform has a different way of dealing with SEH.

  `MSVC` compilers have the keywords `__try`, `__catch` and `__except` to have the compiler generate the appropriate SEH handler code for you.

  `MinGW` compilers have no such keywords and require you to manually set the SEH handlers; however, because SEH is implemented differently on x86 and x64, the methods for using them in GCC differ.

  `x86`: SEH is based on the stack, and the SEH handlers are available at `FS[0]`. On startup one would only need to add a new handler there. This has a number of issues: handlers are hard to share, and the mechanism can be exploited.

  `x64`: In order to fix the issues with the way SEH worked on x86, on x64 SEH handlers are statically compiled and added to the .pdata section by the compiler. Instead of being thread global they can now be image global, since you have to specify the `RVA` of the region of code that the handlers govern.

  On x64 you can allocate SEH handlers dynamically, but it seems (based on experimentation; this is very under-documented) that the dynamic calls cannot override static SEH handlers in the .pdata section.

  Because of this, and because GHC no longer needs to support Windows versions older than XP, the better alternative for handling errors is VEH, which was introduced in XP.

  The bonus is that, because VEH (Vectored Exception Handlers) is a runtime construct, the API is the same for both x86 and x64 (note that the Context object does contain CPU-specific structures) and the calls are the same across compilers, which means this file can be simplified quite a bit. Using VEH also means we don't have to worry about the dynamic code generated by GHCi.

  Test Plan: Prior to this diff the tests for `derefnull` and `divbyzero` seem to have been disabled for Windows.

  To reproduce the issue on x64:
  1) open ghci
  2) import GHC.Base
  3) run: 1 `divInt` 0

  which should lead to ghci crashing and a Watson error box being displayed.

  After applying the patch, run:

  make TEST="derefnull divbyzero"

  on both x64 and x86 builds of ghc to verify the fix.

  Reviewers: simonmar, austin
  Reviewed By: austin
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D691
  GHC Trac Issues: #6079
* Revert "rts: add Emacs 'Local Variables' to every .c file" | Simon Marlow | 2014-09-29 | 1 | -8/+0
  This reverts commit 39b5c1cbd8950755de400933cecca7b8deb4ffcd.
* rts: add Emacs 'Local Variables' to every .c file | Austin Seipp | 2014-07-28 | 1 | -0/+8
  This will hopefully help ensure some basic consistency going forward by overriding buffer variables. In particular, it sets the wrap length, the offset to 4, and turns off tabs.

  Signed-off-by: Austin Seipp <austin@well-typed.com>
* rts: delint/detab/dewhitespace win32/OSMem.c | Austin Seipp | 2014-07-28 | 1 | -15/+21
  Signed-off-by: Austin Seipp <austin@well-typed.com>
* Fix Windows build. | Austin Seipp | 2013-10-26 | 1 | -0/+2
  GlobalMemoryStatusEx actually requires _WIN32_WINNT to be defined as 0x0501 (Windows XP) for availability. For completeness, I bumped WIN32_WINNT in Ticker and OSThreads as well.

  Signed-off-by: Austin Seipp <austin@well-typed.com>
* rts: Add getPhysicalMemorySize | Ben Gamari | 2013-10-25 | 1 | -0/+18
* Deprecate lnat, and use StgWord instead | Simon Marlow | 2012-09-07 | 1 | -15/+15
  lnat was originally "long unsigned int" but we were using it when we wanted a 64-bit type on a 64-bit machine. This broke on Windows x64, where long == int == 32 bits. Using types of unspecified size is bad, but what we really wanted was a type with N bits on an N-bit machine. StgWord is exactly that.

  lnat was mentioned in some APIs that clients might be using (e.g. StackOverflowHook()), so we leave it defined but with a comment to say that it's deprecated.
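The requirement "a type with N bits on an N-bit machine" can be illustrated in portable C. This sketch does not reproduce the definition of `StgWord`; it uses `uintptr_t` as an assumed stand-in under the hypothetical name `word_t`, which matches the pointer width on mainstream platforms, including 64-bit Windows where `unsigned long` is only 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical word-sized type standing in for StgWord. */
typedef uintptr_t word_t;

/* Fails to compile on a platform where the assumption does not hold. */
static_assert(sizeof(word_t) == sizeof(void *),
              "word_t is expected to be pointer-sized");

/* Width of word_t in bits. */
unsigned word_bits(void)
{
    return (unsigned)(sizeof(word_t) * CHAR_BIT);
}

/* Width of a data pointer in bits, for comparison. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```

Unlike `unsigned long`, whose width is chosen by the platform's data model, a type defined this way follows the pointer width, so sizes and addresses computed with it do not get truncated on Win64.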
* Use lnats to avoid overflowing when allocating large amounts | Ian Lynagh | 2012-05-05 | 1 | -8/+8
  Stops outofmem segfaulting on Win64
* Fix warnings on Win64 | Ian Lynagh | 2012-04-26 | 1 | -2/+2
  Mostly this meant getting pointer<->int conversions to use the right sizes. lnat is now size_t, rather than unsigned long, as that seems a better match for how it's used.
* Fix Windows memory freeing: add a check for fb == NULL; fixes trac #4506 | Ian Lynagh | 2010-12-08 | 1 | -40/+51
  Also added a few comments, and a load of code got indented 1 level deeper.
* On Windows, when returning memory to the OS, we try to release it | Ian Lynagh | 2010-11-01 | 1 | -3/+87
  as well as decommitting it.
* Whitespace only, in rts/win32/OSMem.c | Ian Lynagh | 2010-10-29 | 1 | -20/+20
* Return memory to the OS; trac #698 | Ian Lynagh | 2010-08-13 | 1 | -0/+36
* Windows build fixes | Simon Marlow | 2009-08-03 | 1 | -2/+1
* wibble in setExecutable | Austin Seipp | 2009-03-20 | 1 | -1/+1
* Refactoring: extract platform-specific code from sm/MBlock.c | Simon Marlow | 2007-10-17 | 1 | -2/+227
  Also common-up some duplicate bits in the platform-specific code
* fix an error message (barf -> sysErrorBelch) | Simon Marlow | 2007-10-17 | 1 | -2/+3
* fix Win32 build | simonmar@microsoft.com | 2006-05-30 | 1 | -2/+4
* replace stgMallocBytesRWX() with our own allocator | Simon Marlow | 2006-05-30 | 1 | -0/+34
  See bug #738

  Allocating executable memory is getting more difficult these days. In particular, the default SELinux policy on Fedora Core 5 disallows making the heap (i.e. malloc()'d memory) executable, although it does apparently allow mmap()'ing anonymous executable memory by default.

  Previously, stgMallocBytesRWX() used malloc() underneath, and then tried to make the page holding the memory executable. This was rather hacky and fails with Fedora Core 5.

  This patch adds a mini-allocator for executable memory, based on the block allocator. We grab page-sized blocks and make them executable, then allocate small objects from the page. There's a simple free function, that will free whole pages back to the system when they are empty.
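The page-based scheme described in this commit message can be sketched as a bump allocator with a live-object count. This is an illustration only, not the RTS code: `exec_page` and its functions are hypothetical names, `malloc` stands in for the block allocator, the 4096-byte page size is an assumption, and the step that would make the page executable is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_BYTES 4096  /* assumed page size for this sketch */

/* One page handed out in small pieces and released as a whole. */
typedef struct {
    unsigned char *mem;   /* page-sized buffer (would be made executable) */
    size_t         used;  /* bytes handed out so far (bump pointer) */
    size_t         live;  /* number of objects not yet freed */
} exec_page;

/* Grab a fresh page. Returns 1 on success, 0 on allocation failure. */
int exec_page_init(exec_page *pg)
{
    pg->mem  = malloc(PAGE_BYTES);
    pg->used = 0;
    pg->live = 0;
    return pg->mem != NULL;
}

/* Bump-allocate n bytes from the page, or return NULL if it does not
 * fit. No alignment padding is applied in this sketch. */
void *exec_page_alloc(exec_page *pg, size_t n)
{
    assert(pg->mem != NULL);
    if (n == 0 || n > PAGE_BYTES - pg->used) {
        return NULL;
    }
    void *p = pg->mem + pg->used;
    pg->used += n;
    pg->live++;
    return p;
}

/* Drop one object. Once nothing on the page is live, the whole page is
 * given back. Returns 1 if the page was released, 0 otherwise. */
int exec_page_free(exec_page *pg)
{
    assert(pg->live > 0);
    if (--pg->live == 0) {
        free(pg->mem);
        pg->mem  = NULL;
        pg->used = 0;
        return 1;
    }
    return 0;
}
```

Individual objects are never reused within a page here; space only comes back when the last live object on the page is freed, which keeps the free function as simple as the commit message suggests.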