delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	compiler: Use compact representation/FastStrings for `SourceNote`s	Zubin Duggal	2023-05-16	1	-2/+2
\| \| \| \| \| \| \| \|	`SourceNote`s should not be stored as [Char] as this is highly wasteful and in certain scenarios can be highly duplicated. Metric Decrease: hard_hole_fits
*	Add fused multiply-add instructions	sheaf	2023-05-11	1	-13/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds eight new primops that fuse a multiplication and an addition or subtraction: - `{fmadd,fmsub,fnmadd,fnmsub}{Float,Double}#` fmadd x y z is x * y + z, computed with a single rounding step. This patch implements code generation for these primops in the following backends: - X86, AArch64 and PowerPC NCG, - LLVM - C WASM uses the C implementation. The primops are unsupported in the JavaScript backend. The following constant folding rules are also provided: - compute a * b + c when a, b, c are all literals, - x * y + 0 ==> x * y, - ±1 * y + z ==> z ± y and x * ±1 + z ==> z ± x. NB: the constant folding rules incorrectly handle signed zero. This is a known limitation with GHC's floating-point constant folding rules (#21227), which we hope to resolve in the future.
*	Misc cleanup	Krzysztof Gogolewski	2023-04-17	1	-1/+1
\| \| \| \| \| \|	- Use dedicated list functions - Make cloneBndrs and cloneRecIdBndrs monadic - Fix invalid haddock comments in libraries/base
*	Cmm: track the type of global registers	sheaf	2023-01-31	1	-13/+14
\| \| \| \| \| \| \| \| \| \| \| \|	This patch tracks the type of Cmm global registers. This is needed in order to lint uses of polymorphic registers, such as SIMD vector registers that can be used both for floating-point and integer values. This changes allows us to refactor VanillaReg to not store VGcPtr, as that information is instead stored in the type of the usage of the register. Fixes #22297
*	nativeGen/X86: MFENCE is unnecessary for release semantics	Ben Gamari	2023-01-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	In #22764 a user noticed that a program implementing a simple atomic counter via an STRef regressed significantly due to the introduction of necessary atomic operations in the MutVar# primops (#22468). This regression was caused by a bug in the NCG, which emitted an unnecessary MFENCE instruction for a release-ordered atomic write. MFENCE is rather only needed to achieve sequentially consistent ordering. Fixes #22764.
*	Revert "NCG(x86): Compile add+shift as lea if possible."	Matthew Pickering	2023-01-09	1	-36/+0
\| \| \| \| \| \|	This reverts commit 20457d775885d6c3df020d204da9a7acfb3c2e5a. See #22666 and #21777
*	Codegen/x86: Eliminate barrier for relaxed accesses	Ben Gamari	2022-12-15	1	-7/+12
\|
*	cmm: Introduce MemoryOrderings	Ben Gamari	2022-12-15	1	-2/+2
\|
*	Use a more efficient printer for code generation (#21853)	Krzysztof Gogolewski	2022-11-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The changes in `GHC.Utils.Outputable` are the bulk of the patch and drive the rest. The types `HLine` and `HDoc` in Outputable can be used instead of `SDoc` and support printing directly to a handle with `bPutHDoc`. See Note [SDoc versus HDoc] and Note [HLine versus HDoc]. The classes `IsLine` and `IsDoc` are used to make the existing code polymorphic over `HLine`/`HDoc` and `SDoc`. This is done for X86, PPC, AArch64, DWARF and dependencies (printing module names, labels etc.). Co-authored-by: Alexis King <lexi.lambda@gmail.com> Metric Decrease: CoOpt_Read ManyAlternatives ManyConstructors T10421 T12425 T12707 T13035 T13056 T13253 T13379 T18140 T18282 T18698a T18698b T1969 T20049 T21839c T21839r T3064 T3294 T4801 T5321FD T5321Fun T5631 T6048 T783 T9198 T9233
*	Drop a kludge for binutils<2.17, which is now over 10 years old.	M Farkas-Dyck	2022-11-01	1	-16/+1
\|
*	Scrub various partiality involving lists (again).	M Farkas-Dyck	2022-10-19	1	-2/+2
\| \| \| \|	Lets us avoid some use of `head` and `tail`, and some panics.
*	Fix typos	Eric Lindblad	2022-09-14	1	-1/+1
\| \| \| \| \| \| \|	This fixes various typos and spelling mistakes in the compiler. Fixes #21891
*	Remove Outputable Char instance	Krzysztof Gogolewski	2022-09-07	1	-4/+4
\| \| \| \| \|	Use 'text' instead of 'ppr'. Using 'ppr' on the list "hello" rendered as "h,e,l,l,o".
*	NCG(x86): Compile add+shift as lea if possible.wip/andreask/add_mul_lea	Andreas Klebinger	2022-08-08	1	-0/+36
\|
*	codeGen/X86: Don't clobber switch variable in switch generation	Ben Gamari	2022-08-05	1	-2/+3
\| \| \| \| \| \| \| \| \|	Previously ce8745952f99174ad9d3bdc7697fd086b47cdfb5 assumed that it was safe to clobber the switch variable when generating code for a jump table since we were at the end of a block. However, this assumption is wrong; the register could be live in the jump target. Fixes #21968.
*	Don't assume that labels are 32-bit on Windows	Ben Gamari	2022-04-06	1	-10/+17
\|
*	Refactor is32BitLit to take Platform rather than Bool	Ben Gamari	2022-04-06	1	-37/+33
\|
*	Generate LEA for label expressions	Ben Gamari	2022-04-06	1	-0/+16
\|
*	nativeGen/x86: Use %rip-relative addressing	Ben Gamari	2022-04-06	1	-8/+49
\| \| \| \| \| \| \|	On Windows with high-entropy ASLR we must use %rip-relative addressing to avoid overflowing the signed 32-bit immediate size of x86-64. Since %rip-relative addressing comes essentially for free and can make linking significantly easier, we use it on all platforms.
*	codeGen: Fix signedness of jump table indexing	Ben Gamari	2022-03-18	1	-6/+40
\| \| \| \| \| \| \| \| \| \|	Previously while constructing the jump table index we would zero-extend the discriminant before subtracting the start of the jump-table. This goes subtly wrong in the case of a sub-word, signed discriminant, as described in the included Note. Fix this in both the PPC and X86 NCGs. Fixes #21186.
*	NCG: inline some 64-bit primops on x86/32-bit (#5444)	Sylvain Henry	2022-02-23	1	-21/+256
\| \| \| \| \| \| \| \|	Several 64-bit operation were implemented with FFI calls on 32-bit architectures but we can easily implement them with inline assembly code. Also remove unused hs_int64ToWord64 and hs_word64ToInt64 C functions.
*	NCG: refactor the way registers are handled	Sylvain Henry	2022-02-23	1	-137/+91
\| \| \| \| \| \| \| \| \| \| \| \|	* add getLocalRegReg to avoid allocating a CmmLocal just to call getRegisterReg * 64-bit registers: in the general case we must always use the virtual higher part of the register, so we might as well always return it with the lower part. The only exception is to implement 64-bit to 32-bit conversions. We now have to explicitly discard the higher part when matching on Reg64/RegCode64 datatypes instead of explicitly fetching the higher part from the lower part: much safer default.
*	NCG: refactor X86 codegen	Sylvain Henry	2022-02-23	1	-932/+1054
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Preliminary work done to make working on #5444 easier. Mostly make make control-flow easier to follow: * renamed genCCall into genForeignCall * split genForeignCall into the part dispatching on PrimTarget (genPrim) and the one really generating code for a C call (cf ForeignTarget and genCCall) * made genPrim/genSimplePrim only dispatch on MachOp: each MachOp now has its own code generation function. * out-of-line primops are not handled in a partial `outOfLineCmmOp` anymore but in the code generation functions directly. Helper functions have been introduced (e.g. genLibCCall) for code sharing. * the latter two bullets make code generated for primops that are only sometimes out-of-line (e.g. Pdep or Memcpy) and the logic to select between inline/out-of-line much more localized * avoided passing is32bit as an argument as we can easily get it from NatM state when we really need it * changed genCCall type to avoid it being partial (it can't handle PrimTarget) * globally removed 12 calls to `panic` thanks to better control flow and types ("parse, don't validate" ftw!).
*	StgToCmm: Get rid of GHC.Driver.Session imports	John Ericson	2022-02-08	1	-1/+0
\| \| \| \| \|	`DynFlags` is gone, but let's move a few trivial things around to get rid of its module too.
*	Fix some notes	Matthew Pickering	2022-02-08	1	-1/+1
\|
*	Introduce alignment to CmmStore	Ben Gamari	2022-02-04	1	-1/+1
\|
*	Introduce alignment in CmmLoad	Ben Gamari	2022-02-04	1	-24/+24
\|
*	Fix a few Note inconsistencies	Ben Gamari	2022-02-01	1	-8/+4
\|
*	nativeGen/x86: Don't encode large shift offsets	Ben Gamari	2021-12-02	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \|	Handle the case of a shift larger than the width of the shifted value. This is necessary since x86 applies a mask of 0x1f to the shift amount, meaning that, e.g., `shr 47, $eax` will actually shift by 47 & 0x1f == 15. See #20626. (cherry picked from commit 31370f1afe1e2f071b3569fb5ed4a115096127ca)
*	i386: fix codegen of 64-bit comparisons	Sylvain Henry	2021-11-06	1	-14/+21
\|
*	Rectifying COMMENT and `mkComment` across platforms to work with SDoc	Benjamin Maurer	2021-09-29	1	-1/+1
\| \| \| \|	and exhibit similar behaviors. Issue 20400
*	ncg: Kill incorrect unreachable code	Ben Gamari	2021-09-11	1	-3/+3
\| \| \| \| \| \|	As noted in #18183, these cases were previously incorrect and unused. Closes #18183.
*	PrimOps: Add CAS op for all int sizes	Peter Trommler	2021-08-02	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	PPC NCG: Implement CAS inline for 32 and 64 bit testsuite: Add tests for smaller atomic CAS X86 NCG: Catch calls to CAS C fallback Primops: Add atomicCasWord[8\|16\|32\|64]Addr# Add tests for atomicCasWord[8\|16\|32\|64]Addr# Add changelog entry for new primops X86 NCG: Fix MO-Cmpxchg W64 on 32-bit arch ghc-prim: 64-bit CAS C fallback on all archs
*	Fix #19931	John Ericson	2021-07-21	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The issue was the renderer for x86 addressing modes assumes native size registers, but we were passing in a possibly-smaller index in conjunction with a native-sized base pointer. The easist thing to do is just extend the register first. I also changed the other NGC backends implementing jump tables accordingly. On one hand, I think PowerPC and Sparc don't have the small sub-registers anyways so there is less to worry about. On the other hand, to the extent that's true the zero extension can become a no-op. I should give credit where it's due: @hsyl20 really did all the work for me in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4717#note_355874, but I was daft and missed the "Oops" and so ended up spending a silly amount of time putting it all back together myself. The unregisterised backend change is a bit different, because here we are translating the actual case not a jump table, and the fix is to handle right-sized literals not addressing modes. But it makes sense to include here too because it's the same change in the subsequent commit that exposes both bugs.
*	Add Word64#/Int64# primops	Sylvain Henry	2021-07-15	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Word64#/Int64# are only used on 32-bit architectures. Before this patch, operations on these types were directly using the FFI. Now we use real primops that are then lowered into ccalls. The advantage of doing this is that we can now perform constant folding on Word64#/Int64# (#19024). Most of this work was done by John Ericson in !3658. However this patch doesn't go as far as e.g. changing Word64 to always be using Word64#. Noticeable performance improvements T9203(normal) run/alloc 89870808.0 66662456.0 -25.8% GOOD haddock.Cabal(normal) run/alloc 14215777340.8 12780374172.0 -10.1% GOOD haddock.base(normal) run/alloc 15420020877.6 13643834480.0 -11.5% GOOD Metric Decrease: T9203 haddock.Cabal haddock.base
*	Fix #19889 - Invalid BMI2 instructions generated.wip/andreask/bim-fix	Andreas Klebinger	2021-07-06	1	-23/+21
\| \| \| \| \|	When arguments are 8 or 16 bits wide, then truncate before/after and use the 32bit operation.
*	Cmm: fix sinking after suspendThread	Sylvain Henry	2021-05-19	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Suppose a safe call: myCall(x,y,z) It is lowered into three unsafe calls in Cmm: r = suspendThread(...); myCall(x,y,z); resumeThread(r); Consider the following situation for myCall arguments: x = Sp[..] -- stack y = Hp[..] -- heap z = R1 -- global register r = suspendThread(...); myCall(x,y,z); resumeThread(r); The sink pass assumes that unsafe calls clobber memory (heap and stack), hence x and y assignments are not sunk after `suspendThread`. The sink pass also correctly handles global register clobbering for all unsafe calls, except `suspendThread`! `suspendThread` is special because it releases the capability the thread is running on. Hence the sink pass must also take into account global registers that are mapped into memory (in the capability). In the example above, we could get: r = suspendThread(...); z = R1 myCall(x,y,z); resumeThread(r); But this transformation isn't valid if R1 is (BaseReg->rR1) as BaseReg is invalid between suspendThread and resumeThread. This caused argument corruption at least with the C backend ("unregisterised") in #19237. Fix #19237
*	Remove useless {-# LANGUAGE CPP #-} pragmas	Sylvain Henry	2021-05-12	1	-1/+0
\|
*	Fully remove HsVersions.h	Sylvain Henry	2021-05-12	1	-2/+0
\| \| \| \| \| \| \| \| \| \|	Replace uses of WARN macro with calls to: warnPprTrace :: Bool -> SDoc -> a -> a Remove the now unused HsVersions.h Bump haddock submodule
*	Replace CPP assertions with Haskell functions	Sylvain Henry	2021-05-12	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no reason to use CPP. __LINE__ and __FILE__ macros are now better replaced with GHC's CallStack. As a bonus, assert error messages now contain more information (function name, column). Here is the mapping table (HasCallStack omitted): * ASSERT: assert :: Bool -> a -> a * MASSERT: massert :: Bool -> m () * ASSERTM: assertM :: m Bool -> m () * ASSERT2: assertPpr :: Bool -> SDoc -> a -> a * MASSERT2: massertPpr :: Bool -> SDoc -> m () * ASSERTM2: assertPprM :: m Bool -> SDoc -> m ()
*	Replace (ptext .. sLit) with `text`	Sylvain Henry	2021-04-29	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1. `text` is as efficient as `ptext . sLit` thanks to the rewrite rules 2. `text` is visually nicer than `ptext . sLit` 3. `ptext . sLit` encourages using one `ptext` for several `sLit` as in: ptext $ case xy of ... -> sLit ... ... -> sLit ... which may allocate SDoc's TextBeside constructors at runtime instead of sharing them into CAFs.
*	Re-export GHC.Bits from GHC.Prelude with custom shift implementation.	Andreas Klebinger	2021-04-09	1	-1/+0
\| \| \| \| \| \| \|	This allows us to use the unsafe shifts in non-debug builds for performance. For older versions of base we instead export Data.Bits See also #19618
*	Transfer tickish things to GHC.Types.Tickish	Luite Stegeman	2021-03-20	1	-1/+1
\| \| \| \| \|	Metric Increase: MultiLayerModules
*	Save the type of breakpoints in the Breakpoint tick in STG	Luite Stegeman	2021-03-20	1	-1/+1
\| \| \| \| \| \| \| \|	GHCi needs to know the types of all breakpoints, but it's not possible to get the exprType of any expression in STG. This is preparation for the upcoming change to make GHCi bytecode from STG instead of Core.
*	Require GHC 8.10 as the minimum compiler for bootstrapping	Ryan Scott	2021-03-09	1	-7/+0
\| \| \| \| \| \| \|	Now that GHC 9.0.1 is released, it is time to drop support for bootstrapping with GHC 8.8, as we only support building with the previous two major GHC releases. As an added bonus, this allows us to remove several bits of CPP that are either always true or no longer reachable.
*	Fix typos	Brian Wignall	2021-02-06	1	-1/+1
\|
*	Add Addr# atomic primops (#17751)	Sylvain Henry	2020-11-16	1	-5/+5
\| \| \| \|	This reuses the codegen used for ByteArray#'s atomic primops.
*	Don't use LEA with 8-bit registers (#18614)	Sylvain Henry	2020-11-04	1	-2/+6
\|
*	NCG: Fix 64bit int comparisons on 32bit x86	Andreas Klebinger	2020-11-04	1	-14/+84
\| \| \| \| \| \| \| \| \| \| \|	We no compare these by doing 64bit subtraction and checking the resulting flags. We used to do this differently but the old approach was broken when the high bits compared equal and the comparison was one of >= or <=. The new approach should be both correct and faster.
*	Add the proper HLint rules and remove redundant keywords from compiler	Hécate	2020-11-01	1	-20/+21
\|