delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Reorganisation of the source tree	Simon Marlow	2006-04-07	1	-258/+0
	Most of the other users of the fptools build system have migrated to Cabal, and with the move to darcs we can now flatten the source tree without losing history, so here goes. The main change is that the ghc/ subdir is gone, and most of what it contained is now at the top level. The build system now makes no pretense at being multi-project, it is just the GHC build system. No doubt this will break many things, and there will be a period of instability while we fix the dependencies. A straightforward build should work, but I haven't yet fixed binary/source distributions. Changes to the Building Guide will follow, too.
*	make the smp way RTS-only, normal libraries now work with -smp	Simon Marlow	2006-02-08	1	-21/+4
	We had to bite the bullet here and add an extra word to every thunk, to enable running ordinary libraries on SMP. Otherwise, we would have needed to ship an extra set of libraries with GHC 6.6 in addition to the two sets we already ship (normal + profiled), and all Cabal packages would have to be compiled for SMP too. We decided it best just to take the hit now, making SMP easily accessible to everyone in GHC 6.6. Incidentally, although this increases allocation by around 12% on average, the performance hit is around 5%, and much less if your inner loop doesn't use any laziness.
*	[project @ 2005-10-21 14:02:17 by simonmar]	simonmar	2005-10-21	1	-1/+1
	Big re-hash of the threaded/SMP runtime
	This is a significant reworking of the threaded and SMP parts of the runtime. There are two overall goals here:
	- To push down the scheduler lock, reducing contention and allowing more parts of the system to run without locks. In particular, the scheduler does not require a lock any more in the common case.
	- To improve affinity, so that running Haskell threads stick to the same OS threads as much as possible.
	At this point we have the basic structure working, but there are some pieces missing. I believe it's reasonably stable - the important parts of the testsuite pass in all the (normal,threaded,SMP) ways.
	In more detail:
	- Each capability now has a run queue, instead of one global run queue. The Capability and Task APIs have been completely rewritten; see Capability.h and Task.h for the details.
	- Each capability has its own pool of worker Tasks. Hence, Haskell threads on a Capability's run queue will run on the same worker Task(s). As long as the OS is doing something reasonable, this should mean they usually stick to the same CPU. Another way to look at this is that we're assuming each Capability is associated with a fixed CPU.
	- What used to be StgMainThread is now part of the Task structure. Every OS thread in the runtime has an associated Task, and it can ask for its current Task at any time with myTask().
	- removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead (it is now defined for SMP too).
	- The RtsAPI has had to change; we must explicitly pass a Capability around now. The previous interface assumed some global state. SchedAPI has also changed a lot.
	- The OSThreads API now supports thread-local storage, used to implement myTask(), although it could be done more efficiently using gcc's __thread extension when available.
	- I've moved some POSIX-specific stuff into the posix subdirectory, moving in the direction of separating out platform-specific implementations.
	- lots of lock-debugging and assertions in the runtime. In particular, when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is also an ASSERT_LOCK_HELD() call.
	What's missing so far:
	- I have almost certainly broken the Win32 build, will fix soon.
	- any kind of thread migration or load balancing. This is high up the agenda, though.
	- various performance tweaks to do
	- throwTo and forkProcess still do not work in SMP mode
*	[project @ 2005-04-22 12:28:00 by simonmar]	simonmar	2005-04-22	1	-0/+6
	- Now that labels are always prefixed with '&' in .hc code, we have to fix some sloppiness in the RTS .cmm code. Fortunately it's not too painful.
	- SMP: acquire/release the storage manager lock around atomicModifyMutVar#. This is a hack: atomicModifyMutVar# isn't atomic under SMP otherwise, but the SM lock is a large sledgehammer. I think I'll apply the sledgehammer to the MVar primitives too, for the time being.
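As a rough illustration of the SMP hack described in this commit, the sketch below serialises a read-modify-write on a mutable variable behind one global lock. Everything in it is hypothetical: `sm_lock`, `locked_modify` and `add_one` are invented names, a C11 `atomic_flag` spin lock stands in for the RTS's storage manager lock, and returning the old value is a simplification rather than the primop's real result.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for the storage manager lock (hypothetical; not the RTS's lock type). */
static atomic_flag sm_lock = ATOMIC_FLAG_INIT;

static void sm_lock_acquire(void) {
    /* Spin until the flag was clear when we set it. */
    while (atomic_flag_test_and_set_explicit(&sm_lock, memory_order_acquire)) {
    }
}

static void sm_lock_release(void) {
    atomic_flag_clear_explicit(&sm_lock, memory_order_release);
}

typedef long (*ModifyFn)(long);

/* Apply f to *var while holding the global lock; return the previous value. */
static long locked_modify(long *var, ModifyFn f) {
    assert(var != NULL && f != NULL);
    sm_lock_acquire();
    long old = *var;
    *var = f(old);
    sm_lock_release();
    return old;
}

/* Example update function used to exercise locked_modify. */
static long add_one(long x) { return x + 1; }
```

Calling `locked_modify(&v, add_one)` on a variable holding 41 returns 41 and leaves 42 behind. Because every caller takes the same lock, unrelated variables contend with each other, which is the "sledgehammer" cost the commit message points out.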
*	[project @ 2005-03-27 13:41:13 by panne]	panne	2005-03-27	1	-2/+1
	* Some preprocessors don't like the C99/C++ '//' comments after a directive, so use '/* */' instead. For consistency, a lot of '//' in the include files were converted, too. UnDOSified libraries/base/cbits/runProcess.c.
	* My favourite sport: Killed $Id$s.
*	[project @ 2005-02-10 13:01:52 by simonmar]	simonmar	2005-02-10	1	-5/+5
	GC changes: instead of threading old-generation mutable lists through objects in the heap, keep it in a separate flat array. This has some advantages:
	- the IND_OLDGEN object is now only 2 words, so the minimum size of a THUNK is now 2 words instead of 3. This saves some amount of allocation (about 2% on average according to my measurements), and is more friendly to the cache by squashing objects together more.
	- keeping the mutable list separate from the IND object will be necessary for our multiprocessor implementation.
	- removing the mut_link field makes the layout of some objects more uniform, leading to less complexity and special cases.
	- I also unified the two mutable lists (mut_once_list and mut_list) into a single mutable list, which led to more simplifications in the GC.
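The idea of keeping the mutable list in a flat array, rather than linking it through the objects themselves, can be sketched as follows. This is a minimal illustration under assumed names (`MutList`, `mutlist_init`, `mutlist_record`, `mutlist_free`); it does not reproduce the RTS's actual data structures or growth policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flat mutable list: an external array of object pointers. */
typedef struct {
    void  **entries;   /* old-generation objects recorded as mutated */
    size_t  used;      /* number of valid entries */
    size_t  capacity;  /* number of allocated slots */
} MutList;

/* Initialise an empty list; returns 0 on success, -1 on allocation failure. */
static int mutlist_init(MutList *ml, size_t capacity) {
    assert(ml != NULL && capacity > 0);
    ml->entries  = malloc(capacity * sizeof *ml->entries);
    ml->used     = 0;
    ml->capacity = ml->entries ? capacity : 0;
    return ml->entries ? 0 : -1;
}

/* Record an object. The object carries no link field of its own. */
static int mutlist_record(MutList *ml, void *obj) {
    assert(ml != NULL && ml->capacity > 0);
    if (ml->used == ml->capacity) {
        size_t  ncap  = ml->capacity * 2;   /* doubling is an assumed policy */
        void  **grown = realloc(ml->entries, ncap * sizeof *grown);
        if (grown == NULL) return -1;
        ml->entries  = grown;
        ml->capacity = ncap;
    }
    ml->entries[ml->used++] = obj;
    return 0;
}

/* Release the array; the recorded objects are untouched. */
static void mutlist_free(MutList *ml) {
    free(ml->entries);
    ml->entries  = NULL;
    ml->used     = 0;
    ml->capacity = 0;
}
```

Since the list lives outside the heap objects, an object needs no per-object link word, which is the space saving the commit describes for IND_OLDGEN and THUNK.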
*	[project @ 2004-11-18 09:56:07 by tharris]	tharris	2004-11-18	1	-6/+7
	Support for atomic memory transactions and associated regression tests conc041-048
*	[project @ 2004-08-13 13:04:50 by simonmar]	simonmar	2004-08-13	1	-16/+112
	Merge backend-hacking-branch onto HEAD. Yay!
*	[project @ 2003-04-28 09:55:20 by simonmar]	simonmar	2003-04-28	1	-3/+9
	Following the recent change to the layout of the StgRetDyn frame, we now need to bump RESERVED_STACK_WORDS because this governs the amount of room which is guaranteed to be available on the stack in the event of a stack check failure. This accounts for at least one cause of recent crashes in the HEAD.
*	[project @ 2003-03-25 16:19:55 by sof]	sof	2003-03-25	1	-7/+1
	wibble - move LARGE_OBJECT_THRESHOLD from Constants.h to Block.h, as it's defined in terms of Block.h defines
*	[project @ 2003-03-25 16:06:39 by sof]	sof	2003-03-25	1	-7/+6
	upd MAX_SPEC_SELECTEE_SIZE comment
*	[project @ 2002-12-11 15:36:20 by simonmar]	simonmar	2002-12-11	1	-3/+22
	Merge the eval-apply-branch on to the HEAD
	This is a change to GHC's evaluation model in order to ultimately make GHC more portable and to reduce complexity in some areas. At some point we'll update the commentary to describe the new state of the RTS. Pending that, the highlights of this change are:
	- No more Su. The Su register is gone, update frames are one word smaller.
	- Slow-entry points and arg checks are gone. Unknown function calls are handled by automatically-generated RTS entry points (AutoApply.hc, generated by the program in utils/genapply).
	- The stack layout is stricter: there are no "pending arguments" on the stack any more, the stack is always strictly a sequence of stack frames. This means that there's no need for LOOKS_LIKE_GHC_INFO() or LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know how to find the boundary between the text and data segments (BIG WIN!).
	- A couple of nasty hacks in the mangler caused by the need to identify closure ptrs vs. info tables have gone away.
	- Info tables are a bit more complicated. See InfoTables.h for the details.
	- As a side effect, GHCi can now deal with polymorphic seq. Some bugs in GHCi which affected primitives and unboxed tuples are now fixed.
	- Binary sizes are reduced by about 7% on x86. Performance is roughly similar, some programs get faster while some get slower. I've seen GHCi perform worse on some examples, but haven't investigated further yet (GHCi performance should be about the same or better in theory).
	- Internally the code generator is rather better organised. I've moved info-table generation from the NCG into the main codeGen where it is shared with the C back-end; info tables are now emitted as arrays of words in both back-ends. The NCG is one step closer to being able to support profiling.
	This has all been fairly thoroughly tested, but no doubt I've messed up the commit in some way.
*	[project @ 2002-09-23 14:33:50 by simonmar]	simonmar	2002-09-23	1	-5/+1
	remove HEAP_HWM_WORDS; it probably hasn't been used for about 5 years
*	[project @ 2001-11-28 14:31:27 by simonmar]	simonmar	2001-11-28	1	-2/+2
	Revert previous commit: I accidentally committed my local version of this file which has BLOCK_SIZE set to 2k rather than 4k (for testing).
*	[project @ 2001-11-26 16:54:21 by simonmar]	simonmar	2001-11-26	1	-2/+2
	Profiling cleanup.
	This commit eliminates some duplication in the various heap profiling subsystems, and generally centralises much of the machinery. The key concept is the separation of a heap census (which is now done in one place only instead of three) from the calculation of retainer sets. Previously the retainer profiling code also did a heap census on the fly, and lag-drag-void profiling had its own census machinery.
	Value-adds:
	- you can now restrict a heap profile to certain retainer sets, but still display by cost centre (or type, or closure or whatever).
	- I've added an option to restrict the maximum retainer set size (+RTS -R<size>, defaulting to 8).
	- I've cleaned up the heap profiling options at the request of Simon PJ. See the help text for details. The new scheme is backwards compatible with the old.
	- I've removed some odd bits of LDV or retainer profiling-specific code from various parts of the system.
	- the time taken doing heap censuses (and retainer set calculation) is now accurately reported by the RTS when you say +RTS -Sstderr.
	Still to come:
	- restricting a profile to a particular biography (lag/drag/void/use). This requires keeping old heap censuses around, but the infrastructure is now in place to do this.
*	[project @ 2001-10-03 13:57:42 by simonmar]	simonmar	2001-10-03	1	-89/+10
	Tidy up ghc/includes/Constants and related things. Now all the constants that the compiler needs to know, such as header size, update frame size, info table size and so on are generated automatically into a header file, DerivedConstants.h, by a small C program in the same way as NativeDefs.h. The C code in the RTS is expected to use sizeof() directly (it already does). Also tidied up the constants in MachDeps.h - all the constants representing the sizes of various types are named SIZEOF_<foo>, to match the constants defined in config.h. PrelStorable.lhs now doesn't contain any special knowledge about GHC's conventions as regards the size of certain types, this is all in MachDeps.h.
*	[project @ 2001-08-01 08:20:33 by simonmar]	simonmar	2001-08-01	1	-2/+1
	now UF_CCS isn't used anywhere. (and it was wrong, too, which is why I wanted to get rid of it)
*	[project @ 2001-07-31 18:30:22 by qrczak]	qrczak	2001-07-31	1	-1/+2
	UF_CCS was used in compiler/main/Constants.lhs
*	[project @ 2001-07-31 13:44:37 by simonmar]	simonmar	2001-07-31	1	-2/+1
	UF_CCS is unused
*	[project @ 2000-08-07 23:37:19 by qrczak]	qrczak	2000-08-07	1	-3/+9
	Now Char, Char#, StgChar have 31 bits (physically 32). "foo"# is still an array of bytes. CharRep represents 32 bits (on a 64-bit arch too). There is also Int8Rep, used in those places where bytes were originally meant. readCharArray, indexCharOffAddr etc. still use bytes. Storable and {I,M}Array use wide Chars.
	In future perhaps all sized integers should be primitive types. Then some usages of indexing primops scattered through the code could be changed to then-available Int8 ones, and then Char variants of primops could be made wide (other usages that handle text should use conversion that will be provided later).
	I/O and _ccall_ arguments assume ISO-8859-1. UTF-8 is internally used for string literals (only). Z-encoding is ready for Unicode identifiers. Ranges of intlike and charlike closures are more easily configurable.
	I've probably broken nativeGen/MachCode.lhs:chrCode for Alpha but I don't know the Alpha assembler to fix it (what is zapnot?). Generally I'm not sure if I've done the NCG changes right. This commit breaks the binary compatibility (of course).
	TODO:
	* is* and to{Lower,Upper} in Char (in progress).
	* Libraries for text conversion (in design / experiments), to be plugged to I/O and a higher level foreign library.
	* PackedString.
	* StringBuffer and accepting source in encodings other than ISO-8859-1.
*	[project @ 2000-08-03 11:28:35 by simonmar]	simonmar	2000-08-03	1	-8/+1
	Implement +RTS -C<n>, the context switch interval flag. This was previously advertised in the usage message, but there was a note in the Users' Guide stating that it didn't work. Anyway, I'm going to consider it a bug and backport to 4.08.1.
*	[project @ 2000-07-26 13:27:54 by simonmar]	simonmar	2000-07-26	1	-1/+5
	Add a constant definition for WORD_SIZE, the size of an StgWord in bytes.
*	[project @ 2000-02-28 12:02:31 by sewardj]	sewardj	2000-02-28	1	-2/+2
	Many changes to improve the quality and correctness of generated code, both for x86 and all-platforms. The intent is that the x86 NCG will now be good enough for general use.
	-- Add an almost-trivial Stix (generic) peephole optimiser, whose sole purpose is to elide assignments to temporaries used only once, in the very next tree. This generates substantially better code for conditionals on all platforms. Enhance Stix constant folding to take advantage of the inlining. The inlining presents subsequent insn selection phases with more complex trees than they would previously have been used to. This has shown up several bugs in the x86 insn selectors, now fixed. (assumptions that data size is Word, when it could be Byte, assumptions that an operand will always be in a temp reg, etc)
	-- x86: Use the FLDZ and FLD1 insns.
	-- x86: spill FP registers with 80-bit loads/stores so that Intel's extra 16 bits of accuracy are not lost. If this isn't done, FP spills are not suitably transparent. Increase the number of spill words available to 2048.
	-- x86: give the register allocator more flexibility in choosing spill temporaries.
	-- x86, RegAllocInfo.regUsage: fix error for GST, and rewrite to make it clearer.
	-- Correctly track movements in the C stack pointer, and generate correct spill code for archs which spill against the stack pointer even when the stack pointer moves. Redo the x86 ccall mechanism to push args on the C stack in the normal way. Rather than have the spiller have to analyse code sequences to determine the current stack offset, the insn selectors communicate the current offset whenever it changes by inserting a DELTA pseudo-insn. Then the spiller only has to spot DELTAs. This means having a new native-code-generator monad (Stix.NatM) which carries both a UniqSupply and the current stack offset.
	-- Remove the asmPar/asmSeq ways of grouping insns together. In the presence of fixed registers, it is hard to demonstrate that insn selectors using asmPar always give correct code, and the extra complication doesn't help any. Also, directly construct code sequences using tree-based ordered lists (utils/OrdList.lhs) for linear-time appends, rather than the bizarrely complex method using fns and fn composition.
	-- Inline some hcats in printing of x86 address modes.
	-- Document more of the hidden assumptions which insn selection relies on, particularly wrt addressing modes.
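The DELTA mechanism described in this commit can be illustrated with a small sketch: the spiller walks the instruction stream once, remembers the most recent DELTA, and uses it to address spill slots relative to the current stack pointer. The types and names here (`Insn`, `InsnKind`, `resolve_spills`) are invented for illustration, and the sign convention (DELTA holds the stack pointer's offset from its value at function entry) is an assumption, not something taken from the NCG sources.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { INSN_OTHER, INSN_DELTA, INSN_SPILL } InsnKind;

/* Hypothetical instruction record. */
typedef struct {
    InsnKind kind;
    int      value;      /* DELTA: current SP offset from entry SP;
                            SPILL: slot offset from entry SP */
    int      sp_offset;  /* SPILL only: slot offset from the current SP (output) */
} Insn;

/* Single pass: track the latest DELTA and rebase each spill slot against it. */
static void resolve_spills(Insn *insns, size_t n) {
    assert(insns != NULL || n == 0);
    int delta = 0;  /* SP has not moved at function entry */
    for (size_t i = 0; i < n; i++) {
        if (insns[i].kind == INSN_DELTA) {
            delta = insns[i].value;
        } else if (insns[i].kind == INSN_SPILL) {
            insns[i].sp_offset = insns[i].value - delta;
        }
    }
}
```

For a slot at offset 8, a spill before any DELTA resolves to 8; after a DELTA of -4 (say, one 4-byte argument pushed on a downward-growing stack) the same slot resolves to 12. The spiller never has to interpret pushes or pops itself, which is the simplification the commit message argues for.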
*	[project @ 2000-02-01 14:08:22 by sewardj]	sewardj	2000-02-01	1	-2/+2
	Double the number of RESERVED_C_STACK_BYTES so as to give the native code generator up to 508 spill slots.
*	[project @ 2000-01-24 18:22:07 by sewardj]	sewardj	2000-01-24	1	-2/+3
	ARR_HDR_SIZE --> ARR_WORDS_HDR_SIZE, and derived quantities in Constants.h, Constants.lhs et al are similarly renamed. new constant ARR_PTRS_HDR_SIZE, with corresponding derivatives.
*	[project @ 2000-01-13 14:33:57 by hwloidl]	hwloidl	2000-01-13	1	-2/+12
	Merged GUM-4-04 branch into the main trunk. In particular merged GUM and SMP code. Most of the GranSim code in GUM-4-04 still has to be carried over.
*	[project @ 1999-10-27 09:58:36 by simonmar]	simonmar	1999-10-27	1	-3/+3
	reduce block size to 4k
*	[project @ 1999-03-26 10:29:02 by simonm]	simonm	1999-03-26	1	-12/+5
	More profiling fixes.
*	[project @ 1999-02-05 16:02:18 by simonm]	simonm	1999-02-05	1	-1/+3
	Copyright police.
*	[project @ 1999-01-26 16:16:19 by simonm]	simonm	1999-01-26	1	-1/+7
	- Add specialised closure types (CONSTR_p_n, THUNK_p_n, FUN_p_n)
	- Add -T<n> RTS flag to specify the number of steps in younger generations.
*	[project @ 1999-01-21 10:31:41 by simonm]	simonm	1999-01-21	1	-4/+4
	Resurrect ticky-ticky profiling. Not quite polished yet, but it compiles and produces some reasonable-looking stats.
*	[project @ 1998-12-02 13:17:09 by simonm]	simonm	1998-12-02	1	-0/+224
	Move 4.01 onto the main trunk.