summaryrefslogtreecommitdiff
path: root/compiler/codeGen
Commit message (Collapse)AuthorAgeFilesLines
* minor: use unless instead of (when . not)Ömer Sinan Ağacan2015-11-081-3/+3
| | | | | | | | | | Reviewers: bgamari, austin Reviewed By: austin Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1438
* Make GHCi & TH work when the compiler is built with -profSimon Marlow2015-11-071-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Amazingly, there were zero changes to the byte code generator and very few changes to the interpreter - mainly because we've used good abstractions that hide the differences between profiling and non-profiling. So that bit was pleasantly straightforward, but there were a pile of other wibbles to get the whole test suite through. Note that a compiler built with -prof is now like one built with -dynamic, in that to use TH you have to build the code the same way. For dynamic, we automatically enable -dynamic-too when TH is required, but we don't have anything equivalent for profiling, so you have to explicitly use -prof when building code that uses TH with a profiled compiler. For this reason Cabal won't work with TH. We don't expect to ship a profiled compiler, so I think that's OK. Test Plan: validate with GhcProfiled=YES in validate.mk Reviewers: goldfire, bgamari, rwbarton, austin, hvr, erikd, ezyang Reviewed By: ezyang Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1407 GHC Trac Issues: #4837, #545
* cmm: Expose machine's stack and return address registerBen Gamari2015-11-011-0/+2
| | | | | | | | | | We will need to use these to setup proper unwinding information for the stg_stop_thread closure. This pokes a hole in the STG abstraction, exposing the machine's stack pointer register so that we can accomplish this. We also expose a dummy return address register, which corresponds to the register used to hold the DWARF return address. Differential Revision: https://phabricator.haskell.org/D1225
* Add subWordC# on x86ishNikita Karetnikov2015-10-311-0/+17
| | | | | | | | | | | | | | | This adds a subWordC# primop which implements subtraction with overflow reporting. Reviewers: tibbe, goldfire, rwbarton, bgamari, austin, hvr Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1334 GHC Trac Issues: #10962
* Make Monad/Applicative instances MRP-friendlyHerbert Valerio Riedel2015-10-172-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors pure/(*>) and return/(>>) in MRP-friendly way, i.e. such that the explicit definitions for `return` and `(>>)` match the MRP-style default-implementation, i.e. return = pure and (>>) = (*>) This way, e.g. all `return = pure` definitions can easily be grepped and removed in GHC 8.1; Test Plan: Harbormaster Reviewers: goldfire, alanz, bgamari, quchen, austin Reviewed By: quchen, austin Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1312
* Rename package key to unit ID, and installed package ID to component ID.Edward Z. Yang2015-10-148-20/+20
| | | | | | Comes with Haddock submodule update. Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
* Annotate CmmBranch with an optional likely targetSimon Marlow2015-09-234-7/+9
| | | | | | | | | | | | | | | | | Summary: This allows the code generator to give hints to later code generation steps about which branch is most likely to be taken. Right now it is only taken into account in one place: a special case in CmmContFlowOpt that swapped branches over to maximise the chance of fallthrough, which is now disabled when there is a likelihood setting. Test Plan: validate Reviewers: austin, simonpj, bgamari, ezyang, tibbe Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1273
* s/StgArrWords/StgArrBytes/Siddhanathan Shanmugam2015-09-111-4/+4
| | | | | | | | | | Rename StgArrWords to StgArrBytes (see Trac #8552) Reviewed By: austin Differential Revision: https://phabricator.haskell.org/D1233 GHC Trac Issues: #8552
* Fix trac #10413Ben Gamari2015-09-021-2/+5
| | | | | | | | | | | | | | Test Plan: Validate. Reviewers: austin, tibbe, bgamari Reviewed By: tibbe, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1194 GHC Trac Issues: #10413
* StgCmmHeap: Re-add check for large static allocationsBen Gamari2015-08-291-0/+9
| | | | | | | This should at least help alleviate the annoyance of #4505. This reintroduces a compile-time check originally added in a278f3f02d09bc32b0a75d4a04d710090cde250f but dropped with the new code generator.
* Delete FastBoolThomas Miedema2015-08-211-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverses some of the work done in Trac #1405, and assumes GHC is smart enough to do its own unboxing of booleans now. I would like to do some more performance measurements, but the code changes can be reviewed already. Test Plan: With a perf build: ./inplace/bin/ghc-stage2 nofib/spectral/simple/Main.hs -fforce-recomp +RTS -t --machine-readable before: ``` [("bytes allocated", "1300744864") ,("num_GCs", "302") ,("average_bytes_used", "8811118") ,("max_bytes_used", "24477464") ,("num_byte_usage_samples", "9") ,("peak_megabytes_allocated", "64") ,("init_cpu_seconds", "0.001") ,("init_wall_seconds", "0.001") ,("mutator_cpu_seconds", "2.833") ,("mutator_wall_seconds", "4.283") ,("GC_cpu_seconds", "0.960") ,("GC_wall_seconds", "0.961") ] ``` after: ``` [("bytes allocated", "1301088064") ,("num_GCs", "310") ,("average_bytes_used", "8820253") ,("max_bytes_used", "24539904") ,("num_byte_usage_samples", "9") ,("peak_megabytes_allocated", "64") ,("init_cpu_seconds", "0.001") ,("init_wall_seconds", "0.001") ,("mutator_cpu_seconds", "2.876") ,("mutator_wall_seconds", "4.474") ,("GC_cpu_seconds", "0.965") ,("GC_wall_seconds", "0.979") ] ``` CPU time seems to be up a bit, but I'm not sure. Unfortunately CPU time measurements are rather noisy. Reviewers: austin, bgamari, rwbarton Subscribers: nomeata Differential Revision: https://phabricator.haskell.org/D1143 GHC Trac Issues: #1405
* Implement getSizeofMutableByteArrayOp primopBen Gamari2015-08-211-0/+5
| | | | | | | | | | | | | | | Now since ByteArrays are mutable we need to be more explicit about when the size is queried. Test Plan: Add testcase and validate Reviewers: goldfire, hvr, austin Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1139 GHC Trac Issues: #9447
* Support MO_U_QuotRem2 in LLVM backendMichal Terepeta2015-08-031-1/+2
| | | | | | | | | | | | | | | | | | | This adds support for MO_U_QuotRem2 in LLVM backend. Similarly to MO_U_Mul2 we use the standard LLVM instructions (in this case 'udiv' and 'urem') but do the computation on double the word width (e.g., for 64-bit we will do them on 128 registers). Test Plan: validate Reviewers: rwbarton, austin, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1100 GHC Trac Issues: #9430
* Eliminate zero_static_objects_list()Simon Marlow2015-07-281-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: [Revised version of D1076 that was committed and then backed out] In a workload with a large amount of code, zero_static_objects_list() takes a significant amount of time, and furthermore it is in the single-threaded part of the GC. This patch uses a slightly fiddly scheme for marking objects on the static object lists, using a flag in the low 2 bits that flips between two states to indicate whether an object has been visited during this GC or not. We also have to take into account objects that have not been visited yet, which might appear at any time due to runtime linking. Test Plan: validate Reviewers: austin, ezyang, rwbarton, bgamari, thomie Reviewed By: bgamari, thomie Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1106
* Revert "Eliminate zero_static_objects_list()"Simon Marlow2015-07-271-3/+2
| | | | This reverts commit b949c96b4960168a3b399fe14485b24a2167b982.
* Eliminate zero_static_objects_list()Simon Marlow2015-07-221-2/+3
| | | | | | | | | | | | | | | | | | | | | Summary: In a workload with a large amount of code, zero_static_objects_list() takes a significant amount of time, and furthermore it is in the single-threaded part of the GC. This patch uses a slightly fiddly scheme for marking objects on the static object lists, using a flag in the low 2 bits that flips between two states to indicate whether an object has been visited during this GC or not. We also have to take into account objects that have not been visited yet, which might appear at any time due to runtime linking. Test Plan: validate Reviewers: austin, bgamari, ezyang, rwbarton Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1076
* LlvmCodeGen: add support for MO_U_Mul2 CallishMachOpMichal Terepeta2015-07-201-1/+2
| | | | | | | | | | | | | | | | | | | This adds support MO_U_Mul2 to the LLVM backend by simply using 'mul' instruction but operating at twice the bit width (e.g., for 64 bit words we will generate mul that operates on 128 bits and then extract the two 64 bit values for the result of the CallishMachOp). Test Plan: validate Reviewers: rwbarton, austin, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1068 GHC Trac Issues: #9430
* Delete the WayPar wayThomas Miedema2015-07-101-7/+0
| | | | | | Also remove 't' and 's' from ALL_WAYS; they don't exist. Differential Revision: https://phabricator.haskell.org/D1055
* Comments onlySimon Peyton Jones2015-07-081-26/+32
|
* Fix "CPP directive" in commentBen Gamari2015-07-071-6/+6
|
* Add more discussion of black-holing logic for #10414Ben Gamari2015-07-071-9/+59
| | | | Signed-off-by: Ben Gamari <ben@smart-cactus.org>
* Don't eagerly blackhole single-entry thunks (#10414)Reid Barton2015-07-071-1/+11
| | | | | | | | | | | | | In a parallel program they can actually be entered more than once, leading to deadlock. Reviewers: austin, simonmar Subscribers: michaelt, thomie, bgamari Differential Revision: https://phabricator.haskell.org/D1040 GHC Trac Issues: #10414
* Support MO_{Add,Sub}IntC and MO_Add2 in the LLVM backendMichal Terepeta2015-07-041-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This includes: - Adding new LlvmType called LMStructP that represents an unpacked struct (this is necessary since LLVM's instructions the llvm.sadd.with.overflow.* return an unpacked struct). - Modifications to LlvmCodeGen.CodeGen to generate the LLVM instructions for the primops. - Modifications to StgCmmPrim to actually use those three instructions if we use the LLVM backend (so far they were only used for NCG). Test Plan: validate Reviewers: austin, rwbarton, bgamari Reviewed By: bgamari Subscribers: thomie, bgamari Differential Revision: https://phabricator.haskell.org/D991 GHC Trac Issues: #9430
* Implement PowerPC 64-bit native code backend for LinuxPeter Trommler2015-07-031-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the PowerPC 32-bit native code generator for "64-bit PowerPC ELF Application Binary Interface Supplement 1.9" by Ian Lance Taylor and "Power Architecture 64-Bit ELF V2 ABI Specification -- OpenPOWER ABI for Linux Supplement" by IBM. The latter ABI is mainly used on POWER7/7+ and POWER8 Linux systems running in little-endian mode. The code generator supports both static and dynamic linking. PowerPC 64-bit code for ELF ABI 1.9 and 2 is mostly position independent anyway, and thus so is all the code emitted by the code generator. In other words, -fPIC does not make a difference. rts/stg/SMP.h support is implemented. Following the spirit of the introductory comment in PPC/CodeGen.hs, the rest of the code is a straightforward extension of the 32-bit implementation. Limitations: * Code is generated only in the medium code model, which is also gcc's default * Local symbols are not accessed directly, which seems to also be the case for 32-bit * LLVM does not work, but this does not work on 32-bit either * Must use the system runtime linker in GHCi, because the GHC linker for "static" object files (rts/Linker.c) for PPC 64-bit is not implemented. The system runtime (dynamic) linker works. * The handling of the system stack (register 1) is not ELF- compliant so stack traces break. Instead of allocating a new stack frame, spill code should use the "official" spill area in the current stack frame and deallocation code should restore the back chain * DWARF support is missing Fixes #9863 Test Plan: validate (on powerpc, too) Reviewers: simonmar, trofi, erikd, austin Reviewed By: trofi Subscribers: bgamari, arnons1, kgardas, thomie Differential Revision: https://phabricator.haskell.org/D629 GHC Trac Issues: #9863
* Be aware of overlapping global STG registers in CmmSink (#10521)Reid Barton2015-06-251-7/+9
| | | | | | | | | | | | | | | | | | | Summary: On x86_64, commit e2f6bbd3a27685bc667655fdb093734cb565b4cf assigned the STG registers F1 and D1 the same hardware register (xmm1), and the same for the registers F2 and D2, etc. When mixing calls to functions involving Float#s and Double#s, this can cause wrong Cmm optimizations that assume the F1 and D1 registers are independent. Reviewers: simonpj, austin Reviewed By: austin Subscribers: simonpj, thomie, bgamari Differential Revision: https://phabricator.haskell.org/D993 GHC Trac Issues: #10521
* Encode alignment in MO_Memcpy and friendsBen Gamari2015-06-161-25/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Alignment needs to be a compile-time constant. Previously the code generators had to jump through hoops to ensure this was the case as the alignment was passed as a CmmExpr in the arguments list. Now we take care of this up front. This fixes #8131. Authored-by: Reid Barton <rwbarton@gmail.com> Dusted-off-by: Ben Gamari <ben@smart-cactus.org> Tests for T8131 Test Plan: Validate Reviewers: rwbarton, austin Reviewed By: rwbarton, austin Subscribers: bgamari, carter, thomie Differential Revision: https://phabricator.haskell.org/D624 GHC Trac Issues: #8131
* ApiAnnotations : strings in warnings do not return SourceTextAlan Zimmerman2015-06-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The strings used in a WARNING pragma are captured via strings :: { Located ([AddAnn],[Located FastString]) } : STRING { sL1 $1 ([],[L (gl $1) (getSTRING $1)]) } .. The STRING token has a method getSTRINGs that returns the original source text for a string. A warning of the form {-# WARNING Logic , mkSolver , mkSimpleSolver , mkSolverForLogic , solverSetParams , solverPush , solverPop , solverReset , solverGetNumScopes , solverAssertCnstr , solverAssertAndTrack , solverCheck , solverCheckAndGetModel , solverGetReasonUnknown "New Z3 API support is still incomplete and fragile: \ \you may experience segmentation faults!" #-} returns the concatenated warning string rather than the original source. This patch now deals with all remaining instances of getSTRING to bring in a SourceText for each. This updates the haddock submodule as well, for the AST change. Test Plan: ./validate Reviewers: hvr, austin, goldfire Reviewed By: austin Subscribers: bgamari, thomie, mpickering Differential Revision: https://phabricator.haskell.org/D907 GHC Trac Issues: #10313
* Refactor the story around switches (#10137)Joachim Breitner2015-03-301-181/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This re-implements the code generation for case expressions at the Stg → Cmm level, both for data type cases as well as for integral literal cases. (Cases on float are still treated as before). The goal is to allow for fancier strategies in implementing them, for a cleaner separation of the strategy from the gritty details of Cmm, and to run this later than the Common Block Optimization, allowing for one way to attack #10124. The new module CmmSwitch contains a number of notes explaining this changes. For example, it creates larger consecutive jump tables than the previous code, if possible. nofib shows little significant overall improvement of runtime. The rather large wobbling comes from changes in the code block order (see #8082, not much we can do about it). But the decrease in code size alone makes this worthwhile. ``` Program Size Allocs Runtime Elapsed TotalMem Min -1.8% 0.0% -6.1% -6.1% -2.9% Max -0.7% +0.0% +5.6% +5.7% +7.8% Geometric Mean -1.4% -0.0% -0.3% -0.3% +0.0% ``` Compilation time increases slightly: ``` -1 s.d. ----- -2.0% +1 s.d. ----- +2.5% Average ----- +0.3% ``` The test case T783 regresses a lot, but it is the only one exhibiting any regression. The cause is the changed order of branches in an if-then-else tree, which makes the hoople data flow analysis traverse the blocks in a suboptimal order. Reverting that gets rid of this regression, but has a consistent, if only very small (+0.2%), negative effect on runtime. So I conclude that this test is an extreme outlier and no reason to change the code. Differential Revision: https://phabricator.haskell.org/D720
* Remove comments and flag for GranSimThomas Miedema2015-03-191-4/+1
| | | | | | | | | The GranSim code was removed in dd56e9ab and 297b05a9 in 2009, and perhaps other commits I couldn't find. Reviewed By: austin Differential Revision: https://phabricator.haskell.org/D737
* Small emitCmmSwitch/emitCmmLitSwitch refactoringJoachim Breitner2015-03-021-12/+11
| | | | | both use the same logic to divide, so put it in divideBranches :: Ord a => [(a,b)] -> ([(a,b)], a, [(a,b)])
* Improve if-then-else tree for cases on literal valuesJoachim Breitner2015-03-021-6/+23
| | | | | | | | | Previously, in the branch of the if-then-else tree, it would emit a final check if the scrut matches the alternative, even if earlier comparisons alread imply this equality. By keeping track of the bounds we can skip this check. Of course this is only sound for integer types. This closes #10129. Differential Revision: https://phabricator.haskell.org/D693
* Fix comments, and a little reformattingSimon Marlow2015-02-241-26/+26
|
* Add a bizarre corner-case to cgExpr (Trac #9964)Simon Peyton Jones2015-02-201-23/+55
| | | | | | | | | | | David Feuer managed to tickle a corner case in the code generator. See Note [Scrutinising VoidRep] in StgCmmExpr. I rejigged the comments in that area of the code generator Note [Dodgy unsafeCoerce 1] Note [Dodgy unsafeCoerce 2] but I can't say I fully understand them, alas.
* Replace .lhs with .hs in compiler commentsYuri de Wit2015-02-092-2/+2
| | | | | | | | | | | | | | Summary: It looks like during .lhs -> .hs switch the comments were not updated. So doing exactly that. Reviewers: austin, jstolarek, hvr, goldfire Reviewed By: austin, jstolarek Subscribers: thomie, goldfire Differential Revision: https://phabricator.haskell.org/D621 GHC Trac Issues: #9986
* Improve an ASSERTSimon Peyton Jones2014-12-171-1/+1
|
* Typos in commentsGabor Greif2014-12-171-1/+1
|
* Add unwind information to CmmPeter Wortmann2014-12-161-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Unwind information allows the debugger to discover more information about a program state, by allowing it to "reconstruct" other states of the program. In practice, this means that we explain to the debugger how to unravel stack frames, which comes down mostly to explaining how to find their Sp and Ip register values. * We declare yet another new constructor for CmmNode - and this time there's actually little choice, as unwind information can and will change mid-block. We don't actually make use of these capabilities, and back-end support would be tricky (generate new labels?), but it feels like the right way to do it. * Even though we only use it for Sp so far, we allow CmmUnwind to specify unwind information for any register. This is pretty cheap and could come in useful in future. * We allow full CmmExpr expressions for specifying unwind values. The advantage here is that we don't have to make up new syntax, and can e.g. use the WDS macro directly. On the other hand, the back-end will now have to simplify the expression until it can sensibly be converted into DWARF byte code - a process which might fail, yielding NCG panics. On the other hand, when you're writing Cmm by hand you really ought to know what you're doing. (From Phabricator D169)
* Tick scopesPeter Wortmann2014-12-168-69/+132
| | | | | | | | | | | | | | | | | | | | | | This patch solves the scoping problem of CmmTick nodes: If we just put CmmTicks into blocks we have no idea what exactly they are meant to cover. Here we introduce tick scopes, which allow us to create sub-scopes and merged scopes easily. Notes: * Given that the code often passes Cmm around "head-less", we have to make sure that its intended scope does not get lost. To keep the amount of passing-around to a minimum we define a CmmAGraphScoped type synonym here that just bundles the scope with a portion of Cmm to be assembled later. * We introduce new scopes at somewhat random places, aligning with getCode calls. This works surprisingly well, but we might have to add new scopes into the mix later on if we find things too be too coarse-grained. (From Phabricator D169)
* Source notes (Cmm support)Peter Wortmann2014-12-163-15/+30
| | | | | | | | | | | | | | | | | | | | This patch adds CmmTick nodes to Cmm code. This is relatively straight-forward, but also not very useful, as many blocks will simply end up with no annotations whatosever. Notes: * We use this design over, say, putting ticks into the entry node of all blocks, as it seems to work better alongside existing optimisations. Now granted, the reason for this is that currently GHC's main Cmm optimisations seem to mainly reorganize and merge code, so this might change in the future. * We have the Cmm parser generate a few source notes as well. This is relatively easy to do - worst part is that it complicates the CmmParse implementation a bit. (From Phabricator D169)
* Source notes (CorePrep and Stg support)Peter Wortmann2014-12-162-24/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is basically just about continuing maintaining source notes after the Core stage. Unfortunately, this is more involved as it might seem, as there are more restrictions on where ticks are allowed to show up. Notes: * We replace the StgTick / StgSCC constructors with a unified StgTick that can carry any tickish. * For handling constructor or lambda applications, we generally float ticks out. * Note that thanks to the NonLam placement, we know that source notes can never appear on lambdas. This means that as long as we are careful to always use mkTick, we will never violate CorePrep invariants. * This is however not automatically true for eta expansion, which needs to somewhat awkwardly strip, then re-tick the expression in question. * Where CorePrep floats out lets, we make sure to wrap them in the same spirit as FloatOut. * Detecting selector thunks becomes a bit more involved, as we can run into ticks at multiple points. (From Phabricator D169)
* Changing prefetch primops to have a `seq`-like interfaceCarter Tazio Schonwald2014-12-151-29/+51
| | | | | | | | | | | | | | | | | | | | | | Summary: The current primops for prefetching do not properly work in pure code; namely, the primops are not 'hoisted' into the correct call sites based on when arguments are evaluated. Instead, they should use a `seq`-like interface, which will cause it to be evaluated when the needed term is. See #9353 for the full discussion. Test Plan: updated tests for pure prefetch in T8256 to reflect the design changes in #9353 Reviewers: simonmar, hvr, ekmett, austin Reviewed By: ekmett, austin Subscribers: merijn, thomie, carter, simonmar Differential Revision: https://phabricator.haskell.org/D350 GHC Trac Issues: #9353
* arm64: 64bit iOS and SMP support (#7942)Luke Iannini2014-11-192-0/+14
| | | | Signed-off-by: Austin Seipp <austin@well-typed.com>
* Per-thread allocation counters and limitsSimon Marlow2014-11-121-72/+202
| | | | | | | | This reverts commit f0fcc41d755876a1b02d1c7c79f57515059f6417. New changes: now works on 32-bit platforms too. I added some basic support for 64-bit subtraction and comparison operations to the x86 NCG.
* Revert "Place static closures in their own section."Edward Z. Yang2014-10-203-9/+3
| | | | | | | | | | This reverts commit b23ba2a7d612c6b466521399b33fe9aacf5c4f75. Conflicts: compiler/cmm/PprCmmDecl.hs compiler/nativeGen/PPC/Ppr.hs compiler/nativeGen/SPARC/Ppr.hs compiler/nativeGen/X86/Ppr.hs
* Place static closures in their own section.Edward Z. Yang2014-10-013-3/+9
| | | | | | | | | | | | | | | | | | | | | | | Summary: The primary reason for doing this is assisting debuggability: if static closures are all in the same section, they are guaranteed to be adjacent to one another. This will help later when we add some code that takes section start/end and uses this to sanity-check the sections. Part of remove HEAP_ALLOCED patch set (#8199) Signed-off-by: Edward Z. Yang <ezyang@mit.edu> Test Plan: validate Reviewers: simonmar, austin Subscribers: simonmar, ezyang, carter, thomie Differential Revision: https://phabricator.haskell.org/D263 GHC Trac Issues: #8199
* Make Applicative a superclass of MonadAustin Seipp2014-09-099-12/+40
| | | | | | | | | | | | | | | | | | | | | Summary: This includes pretty much all the changes needed to make `Applicative` a superclass of `Monad` finally. There's mostly reshuffling in the interests of avoid orphans and boot files, but luckily we can resolve all of them, pretty much. The only catch was that Alternative/MonadPlus also had to go into Prelude to avoid this. As a result, we must update the hsc2hs and haddock submodules. Signed-off-by: Austin Seipp <austin@well-typed.com> Test Plan: Build things, they might not explode horribly. Reviewers: hvr, simonmar Subscribers: simonmar Differential Revision: https://phabricator.haskell.org/D13
* Add MO_AddIntC, MO_SubIntC MachOps and implement in X86 backendReid Barton2014-08-231-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | Summary: These MachOps are used by addIntC# and subIntC#, which in turn are used in integer-gmp when adding or subtracting small Integers. The following benchmark shows a ~6% speedup after this commit on x86_64 (building GHC with BuildFlavour=perf). {-# LANGUAGE MagicHash #-} import GHC.Exts import Criterion.Main count :: Int -> Integer count (I# n#) = go n# 0 where go :: Int# -> Integer -> Integer go 0# acc = acc go n# acc = go (n# -# 1#) $! acc + 1 main = defaultMain [bgroup "count" [bench "100" $ whnf count 100]] Differential Revision: https://phabricator.haskell.org/D140
* Implement new CLZ and CTZ primops (re #9340)Herbert Valerio Riedel2014-08-141-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | This implements the new primops clz#, clz32#, clz64#, ctz#, ctz32#, ctz64# which provide efficient implementations of the popular count-leading-zero and count-trailing-zero respectively (see testcase for a pure Haskell reference implementation). On x86, NCG as well as LLVM generates code based on the BSF/BSR instructions (which need extra logic to make the 0-case well-defined). Test Plan: validate and succesful tests on i686 and amd64 Reviewers: rwbarton, simonmar, ezyang, austin Subscribers: simonmar, relrod, ezyang, carter Differential Revision: https://phabricator.haskell.org/D144 GHC Trac Issues: #9340
* StgCmmPrim: add note to stop using fixed size signed types for sizesJohan Tibell2014-08-121-0/+5
| | | | | We use fixed size signed types to e.g. represent array sizes. This means that the size can overflow.
* shouldInlinePrimOp: Fix Int overflowJohan Tibell2014-08-121-22/+38
| | | | | | | | | | | | | | | | There were two overflow issues in shouldInlinePrimOp. The first one is due to a negative CmmInt literal being created if the array size was given as larger than 2^63-1 (on a 64-bit platform.) This meant that large array sizes could compare as being smaller than maxInlineAllocSize. The second issue is that we casted the Integer to an Int in the comparison, which again meant that large array sizes could compare as being smaller than maxInlineAllocSize. The attempt to allocate a large array inline then caused a segfault. Fixes #9416.