After commit 55c703b8fdb0, this code is no longer used anywhere.
|
We now allocate the blackhole indirection closure inside the RTS
procedure 'newCAF' instead of generating the allocation code inline
in the closure body of each CAF. This slightly decreases code size in
modules with a lot of CAFs: for example, DynFlags.o shrinks by ~60KB
and HsExpr.o by ~100KB.
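For illustration, a minimal sketch of the kind of binding this
affects (a hypothetical example; the commit changes the generated
entry code, not Haskell source). A CAF is a top-level, argument-free
closure whose entry code previously allocated its own blackhole
indirection:

  primes :: [Int]
  primes = sieve [2 ..]
    where
      sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
      sieve []       = []  -- unreachable: the input list is infinite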
|
This is just a modest refactoring.
|
This reverts commit 2f5db98e90cf0cff1a11971c85f108a7480528ed.
|
This reverts commit 9026c77a07533bda3773c3c3f3df1c6592bc80c7.
|
When compiling a function we can determine how much stack space it
will use, so we need to perform only a single stack check at the
beginning of the function to see whether we have enough. Instead of
referring directly to Sp, as the code generator previously did, it
now uses (old + 0) in the stack check; the stack layout phase turns
(old + 0) into Sp. The idea is that, while we need only one stack
check per function, we could in theory place more stack checks later
in the function. They would be redundant, but not incorrect (in the
sense that they should not change program behaviour). We must make
sure, however, that a stack check inserted after incrementing the
stack pointer checks for a correspondingly smaller amount of stack
space. This would not be the case if the code generator produced
direct references to Sp. By referencing (old + 0) we always check for
the correct amount of stack: when converting (old + 0) to Sp, the
stack layout phase takes into account the changes already made to the
stack pointer. The idea for this change came from observations made
while debugging #8275.
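A toy model (made-up types, not GHC's Cmm representation) of why
resolving (old + 0) late keeps a mid-function check honest; here the
stack grows downward and 'delta' records how far Sp has already been
bumped down when layout runs:

  data Exp
    = Sp                 -- the machine stack pointer
    | OldPlus Int        -- virtual reference (old + n), resolved at layout
    | OffsetBy Exp Int   -- e + n (n may be negative)
    deriving Show

  -- The layout phase rewrites virtual references into Sp-relative
  -- form, folding in the Sp changes already made within the function.
  resolve :: Int -> Exp -> Exp
  resolve _     Sp             = Sp
  resolve delta (OldPlus n)    = OffsetBy Sp (delta + n)
  resolve delta (OffsetBy e n) = OffsetBy (resolve delta e) n

  -- A check against (old + 0) - 64, emitted after Sp was bumped down
  -- by 16 bytes, resolves to (Sp + 16) - 64: it now asks for only 48
  -- bytes below the current Sp, the correspondingly smaller amount.
  example :: Exp
  example = resolve 16 (OffsetBy (OldPlus 0) (-64))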
|
This patch adds support for several new primitive operations which
use processor-specific instructions to help guide data and cache
locality decisions. Locality levels range over [0..3].
For LLVM, we generate llvm.prefetch intrinsics at the proper locality
level (similar to GCC).
For x86 we generate prefetch{NTA, t2, t1, t0} instructions. On SPARC
and PowerPC, the locality levels are ignored.
This closes #8256.
Authored-by: Carter Tazio Schonwald <carter.schonwald@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
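A minimal sketch of using one of these primops from IO. The name and
State#-threaded signature follow current GHC.Prim and may differ from
the exact form in this commit, so treat this as illustrative only:

  {-# LANGUAGE MagicHash, UnboxedTuples #-}
  import GHC.Exts (ByteArray#, Int#, prefetchByteArray0#)
  import GHC.IO (IO (..))

  -- Hint that a byte-array offset will be read soon, at locality
  -- level 0 (the weakest level, i.e. prefetchNTA on x86).
  prefetchBA0 :: ByteArray# -> Int# -> IO ()
  prefetchBA0 ba i = IO (\s -> (# prefetchByteArray0# ba i s, () #))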
|
dynamic flags.
SIMD vector instructions currently require the LLVM back-end. The set of
available instructions also depends on the set of architecture flags specified
on the command line.
|
width and element type.
SIMD primops are now polymorphic in vector size and element type, but
only internally to the compiler. More specifically, utils/genprimopcode
has been extended so that it "knows" about SIMD vectors. This allows us
to, for example, write a single definition for the "add two vectors"
primop in primops.txt.pp and have it instantiated at many vector types.
This generates a primop in GHC.Prim for each vector type at which "add
two vectors" is instantiated, but only one data constructor for the
PrimOp data type, so the code generator is much, much simpler.
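As an illustration of what genprimopcode instantiates, here is the
"add two vectors" family used at the FloatX4# type (names as in
GHC.Prim; requires -fllvm and a suitable -m architecture flag):

  {-# LANGUAGE MagicHash, UnboxedTuples #-}
  import GHC.Exts (Float#, FloatX4#, packFloatX4#, plusFloatX4#,
                   unpackFloatX4#)

  -- One instantiation of the single primops.txt.pp entry; the same
  -- entry also yields plusDoubleX2#, plusInt32X4#, and so on.
  addX4 :: FloatX4# -> FloatX4# -> FloatX4#
  addX4 = plusFloatX4#

  demo :: () -> (# Float#, Float#, Float#, Float# #)
  demo () = unpackFloatX4#
              (addX4 (packFloatX4# (# 1.0#, 2.0#, 3.0#, 4.0# #))
                     (packFloatX4# (# 5.0#, 6.0#, 7.0#, 8.0# #)))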
|
It is off by default, which is meant to be a workaround for #8275.
Once #8275 is fixed we will enable this option by default.
|
We have primops for copying ranges of bytes between ByteArray#s:
* ByteArray# -> MutableByteArray#
* MutableByteArray# -> MutableByteArray#
This extends the set with three further cases:
* Addr# -> MutableByteArray#
* ByteArray# -> Addr#
* MutableByteArray# -> Addr#
One use case for these is copying between ForeignPtr-based
representations and in-heap arrays (like Text, UArray etc).
The implementation is essentially the same as for the existing
primops, and shares the memcpy stuff in the code generators.
Deficiencies / future directions: none of these primops (existing
or new) lets one take advantage of knowing that ByteArray#s are
word-aligned in memory, though it is unclear whether any of the code
generators would make use of this information unless the size to copy
is also known at compile time.
Signed-off-by: Austin Seipp <austin@well-typed.com>
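A minimal IO wrapper over one of the new cases
(Addr# -> MutableByteArray#), with the signature as it appears in
GHC.Prim; offsets and length are in bytes and unchecked here:

  {-# LANGUAGE MagicHash, UnboxedTuples #-}
  import GHC.Exts (Addr#, Int#, MutableByteArray#, RealWorld,
                   copyAddrToByteArray#)
  import GHC.IO (IO (..))

  -- Copy 'len' bytes from a raw pointer (e.g. one unpacked from a
  -- ForeignPtr) into a mutable byte array at byte offset 'off'.
  copyFromAddr :: Addr# -> MutableByteArray# RealWorld
               -> Int# -> Int# -> IO ()
  copyFromAddr src dst off len =
    IO (\s -> (# copyAddrToByteArray# src dst off len s, () #))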
|
Authored-by: David Luposchainsky <dluposchainsky@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
This patch implements loopification optimization. It was described
in "Low-level code optimisations in the Glasgow Haskell Compiler" by
Krzysztof Woś, but we use a different approach here. Krzysztof's
approach was to perform optimization as a Cmm-to-Cmm pass. Our
approach is to generate properly optimized tail calls in the code
generator, which saves us the trouble of processing Cmm. This idea
was proposed by Simon Marlow. Implementation details are explained
in Note [Self-recursive tail calls].
Performance of most nofib benchmarks is not affected. Some
benchmarks show a 5-7% improvement, with an average improvement of
2.6%. Further investigation is needed to determine whether this is
benchmarking noise or whether the optimization really makes some
class of programs faster.
As a minor cleanup, this patch renames forkProc to forkLneBody.
It also moves some data declarations from StgCmmMonad to
StgCmmClosure, because they are needed there and StgCmmClosure seems
to sit at the top of the whole StgCmm* hierarchy.
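A hypothetical example (not from the patch) of the shape Note
[Self-recursive tail calls] targets; the saturated self-call in 'go'
can be compiled as a jump back to a local loop header instead of a
full function call:

  sumTo :: Int -> Int
  sumTo n0 = go 0 n0
    where
      go :: Int -> Int -> Int
      go acc 0 = acc
      go acc n = go (acc + n) (n - 1)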
|
388e14e2 unfortunately broke a subtle invariant in the code
generator: when generating code for an application, names may need to
be externalised, in case you're building against something external
which was built with -split-objs.
We were never externalising the ids of the applied functions. This
means that if the libraries are split and we call into them, the
compiler may not generate correct ids when making references to
functions in the library (causing linker failure).
I'm not entirely sure how this didn't break everything, but it
certainly caused several failures for a bunch of people. I had to
fiddle with my tree a little to make this occur.
This should fix #8166.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
This comment is no longer true.
|
I missed that file yesterday when I was cleaning up the codeGen/
directory.
|
Previously the logic of these functions was something like this:

  cgIdApp x = case x of
    A -> cgLneJump x
    _ -> cgTailCall x

  cgTailCall x = case x of
    B -> ...
    C -> ...
    _ -> ...

After merging there is no nesting of cases:

  cgIdApp x = case x of
    A -> -- body of cgLneJump
    B -> ...
    C -> ...
    _ -> ...
|
This commit removes the module StgCmmGran, which contains only no-op
functions. According to comments in the module, it was used by GpH,
but the GpH project seems to have been dead for a couple of years
now.
|
This cleanup includes:
* removing dead code. This includes the forkStatics function, which
  was in fact one big no-op, and the global bindings in
  CgInfoDownwards,
* converting functions that used the FCode monad only to access
  DynFlags into plain functions that take DynFlags as a parameter
  (see the sketch below),
* making the addBindC function smarter: it now extracts the Id from
  the CgIdInfo passed to it, the same way addBindsC does. Previously
  this was done at every call site, which was redundant.
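A sketch of the second bullet using stand-in types (this FCode is a
minimal reader, and the function and field names are made up, not
GHC's real ones):

  newtype DynFlags = DynFlags { wordSizeBytes :: Int }
  newtype FCode a  = FCode { runFCode :: DynFlags -> a }

  -- Before: lives in FCode solely to reach DynFlags.
  spillSlotSizeOld :: FCode Int
  spillSlotSizeOld = FCode (\dflags -> 8 * wordSizeBytes dflags)

  -- After: takes DynFlags directly and needs no monad.
  spillSlotSize :: DynFlags -> Int
  spillSlotSize dflags = 8 * wordSizeBytes dflags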
|
A major cleanup of trailing whitespace and tabs in the codeGen/
directory. I also adjusted code formatting in some places.
|
This patch modifies all comparison primops for Char#, Int#, Word#, Double#,
Float# and Addr# to return Int# instead of Bool. A value of 1# represents True
and 0# represents False. For a more detailed description of motivation for this
change, discussion of implementation details and benchmarking results please
visit the wiki page: http://hackage.haskell.org/trac/ghc/wiki/PrimBool
There's also some cleanup: whitespace fixes in files that were
extensively edited in this patch, and constant folding rules for the
Integer div and mod operators (which for some reason had been left
out until now).
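A small illustration of the new convention (isTrue# is the GHC.Exts
helper for crossing back to Bool):

  {-# LANGUAGE MagicHash #-}
  import GHC.Exts (Int#, isTrue#, (>#))

  -- (>#) now returns Int# (1# for True, 0# for False), so comparisons
  -- can stay unboxed; isTrue# recovers a Bool only at the branch.
  maxInt# :: Int# -> Int# -> Int#
  maxInt# x y = if isTrue# (x ># y) then x else y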
|
We weren't properly tracking the number of stack arguments in the
continuation of a foreign call. It happened to work when the
continuation was not a join point, but when it was a join point we
were using the wrong amount of stack fixup.
|
* Expose bSwap{,16,32,64}# primops
* Add a new machop: MO_BSwap
* Fall back to library implementations (hs_bswap{16,32,64}) on
  architectures where the NCG does not handle MO_BSwap directly
* In the X86 NCG, generate bswap for 32 and 64 bits, and bswap+shr
  rather than xchg for 16 bits
* Generate llvm.bswap intrinsics in the LLVM codegen
Authored-by: Vincent Hanquez <tab@snarc.org>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
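In released GHC these primops are spelled byteSwap{16,32,64}#, and
the byteSwap functions in Data.Word (base >= 4.7) wrap them; a quick
check:

  import Data.Word (byteSwap32)

  main :: IO ()
  main = print (byteSwap32 0x11223344)  -- prints 1144201745 (0x44332211)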