path: root/compiler/llvmGen/LlvmCodeGen.hs
Commit message | Author | Age | Files | Lines
* Do CafInfo/SRT analysis in Cmm | Ömer Sinan Ağacan | 2020-01-31 | 1 | -2/+2
  This patch removes all CafInfo predictions and various hacks to preserve
  predicted CafInfos from the compiler and assigns final CafInfos to
  interface Ids after code generation. SRT analysis is extended to support
  static data, and Cmm generator is modified to allow generating static_link
  fields after SRT analysis.

  This also fixes `-fcatch-bottoms`, which introduces error calls in case
  expressions in CorePrep, which runs *after* CoreTidy (which is where we
  decide on CafInfos) and turns previously non-CAFFY things into CAFFY.
  Fixes #17648
  Fixes #9718

  Evaluation
  ==========

  NoFib
  -----

  Boot with: `make boot mode=fast`
  Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1`

  --------------------------------------------------------------------------------
          Program           Size    Allocs    Instrs     Reads    Writes
  --------------------------------------------------------------------------------
               CS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              CSD          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               FS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                S          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               VS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              VSD          -0.0%      0.0%     -0.0%     -0.0%     -0.5%
              VSM          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             anna          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             ansi          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             atom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           awards          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           banner          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       bernouilli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     binary-trees          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            boyer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           boyer2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             bspt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        cacheprof          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         calendar          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         cichelli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          circsim          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         clausify          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
    comp_lab_zift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         compress          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        compress2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     cryptarithm1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     cryptarithm2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              cse          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     digits-of-e1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     digits-of-e2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           dom-lt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            eliza          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            event          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      exact-reals          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           exp3_8          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           expert          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   fannkuch-redux          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            fasta          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              fem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              fft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             fft2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             fish          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            fluid          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
           fulsom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           gamteb          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              gcd          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      gen_regexps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           genfft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               gg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             grep          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           hidden          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              hpg          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              ida          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            infer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          integer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        integrate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     k-nucleotide          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            kahan          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          knights          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           lambda          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       last-piece          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             life          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             lift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           linear          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
        listcompr          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         listcopy          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         maillist          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           mandel          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          mandel2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             mate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          minimax          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          mkhprog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       multiplier          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           n-body          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         nucleic2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             para          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        paraffins          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           parser          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
          parstof          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              pic          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         pidigits          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           pretty          -0.0%      0.0%     -0.3%     -0.4%     -0.4%
           primes          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        primetest          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           prolog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           puzzle          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           queens          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          reptile          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
  reverse-complem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          rewrite          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             rfib          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              rsa          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              scc          -0.0%      0.0%     -0.3%     -0.5%     -0.4%
            sched          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              scs          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           simple          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            solid          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          sorting          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
    spectral-norm          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           sphere          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           symalg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              tak          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        transform          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         treejoin          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        typecheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          veritas          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             wang          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        wave4main          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     wheel-sieve1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     wheel-sieve2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             x2n1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
  --------------------------------------------------------------------------------
              Min          -0.1%      0.0%     -0.3%     -0.5%     -0.5%
              Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   Geometric Mean          -0.0%     -0.0%     -0.0%     -0.0%     -0.0%

  --------------------------------------------------------------------------------
          Program           Size    Allocs    Instrs     Reads    Writes
  --------------------------------------------------------------------------------
          circsim          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
      constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         gc_bench          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             hash          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       spellcheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
  --------------------------------------------------------------------------------
              Min          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   Geometric Mean          -0.0%     +0.0%     -0.0%     -0.0%     -0.0%

  Manual inspection of programs in testsuite/tests/programs
  ---------------------------------------------------------

  I built these programs with a bunch of dump flags and `-O` and compared
  STG, Cmm, and Asm dumps and file sizes.
  (Below, the numbers in parentheses show the number of modules in each
  program.)

  These programs have identical compiler outputs (same .hi and .o sizes, and
  same STG, Cmm, and Asm dumps):

  - Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3),
    andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2),
    jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4),
    jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1),
    bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1),
    strict_anns (1), thurston-module-arith (2), okeefe_neural (1),
    joao-circular (6), 10queens (1)

  Programs with different compiler outputs:

  - jl_defaults (1): GHC HEAD marks a lot of top-level `[Int]` closures as
    CAFFY for no apparent reason. With this patch we no longer make them
    CAFFY and generate fewer SRT entries. For some reason Main.o is slightly
    larger with this patch (1.3%) and the executable sizes are the same.
    (I'd expect both to be smaller.)

  - launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked
    as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the
    executable sizes are the same.

  - galois_raytrace (13): Differences are in the Parse module. There are a
    lot, but some of the changes are caused by the fact that for some reason
    (I think a bug) GHC HEAD marks the dictionary for `Functor Identity` as
    CAFFY. Parse.o is 0.4% larger, the executable size is the same.

  - north_array: We now generate fewer SRT entries because some of the array
    primops used in this program, like `NewArrayOp`, get eliminated during
    Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets about
    2.5% larger (9224 bytes, up from 9000 bytes), executable sizes are the
    same.
  - seward-space-leak: The difference in this program is better shown by
    this smaller example:

        module Lib where

        data CDS
          = Case [CDS] [(Int, CDS)]
          | Call CDS CDS

        instance Eq CDS where
          Case sels1 rets1 == Case sels2 rets2 =
            sels1 == sels2 && rets1 == rets2
          Call a1 b1 == Call a2 b2 =
            a1 == a2 && b1 == b2
          _ == _ =
            False

    In this program GHC HEAD builds a new SRT for the recursive group of
    `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==` in
    its SRT field, and `==` uses the SRT object as its SRT. With this patch
    we use the closure for `/=` as the SRT and add `==` there. Then `/=`
    gets an empty SRT field and `==` points to `/=` in its SRT field.

    This change looks fine to me.

    Main.o gets 0.07% larger, executable sizes are identical.

  head.hackage
  ------------

  head.hackage's CI script builds 428 packages from Hackage using this patch
  with no failures.

  Compiler performance
  --------------------

  The compiler perf tests report that the compiler allocates slightly more
  (worst case observed so far is 4%). However, most programs in the test
  suite are small, single-file programs. To benchmark compiler performance
  on something more realistic I built Cabal (the library, 236 modules) with
  different optimisation levels. For the "max residency" row I ran GHC with
  `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows are
  generated with just `-s`.
  (This is because `-i0` causes GC to run much more frequently and as a
  result "bytes copied" gets inflated by more than 25x in some cases.)

  * -O0

    |                 | GHC HEAD       | This MR        | Diff   |
    | --------------- | -------------- | -------------- | ------ |
    | Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% |
    | Bytes copied    |  4,926,037,184 |  4,990,638,760 | +1.31% |
    | Max residency   |    421,225,624 |    424,324,264 | +0.73% |

  * -O1

    |                 | GHC HEAD        | This MR         | Diff   |
    | --------------- | --------------- | --------------- | ------ |
    | Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% |
    | Bytes copied    |  26,943,452,560 |  27,089,972,296 | +0.54% |
    | Max residency   |     982,643,440 |     991,663,432 | +0.91% |

  * -O2

    |                 | GHC HEAD        | This MR         | Diff   |
    | --------------- | --------------- | --------------- | ------ |
    | Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% |
    | Bytes copied    |  37,044,237,616 |  36,121,690,472 | -2.49% |
    | Max residency   |   1,071,600,328 |   1,086,396,256 | +1.38% |

  Extra compiler allocations
  --------------------------

  Runtime allocations of programs are as reported above (NoFib section).

  The compiler now allocates more than before. The main source of extra
  allocation in this patch compared to the base commit is the new SRT
  algorithm (GHC.Cmm.Info.Build). Below is some of the extra work we do with
  this patch; the numbers were generated by a profiled stage 2 compiler when
  building a pathological case (the test 'ManyConstructors') with '-O2':

  - We now sort the final STG for a module, which means traversing the
    entire program, generating the free variable set for each top-level
    binding, doing SCC analysis, and re-ordering the program. In
    ManyConstructors this step allocates 97,889,952 bytes.

  - We now do SRT analysis on static data, which in a program like
    ManyConstructors means analysing 10,000 bindings that we would
    previously just skip. This step allocates 70,898,352 bytes.
  - We now maintain an SRT map for the entire module as we compile Cmm
    groups:

        data ModuleSRTInfo = ModuleSRTInfo
          { ...
          , moduleSRTMap :: SRTMap
          }

    (SRTMap is just a strict Map from the 'containers' library)

    This map gets an entry for most bindings in a module (exceptions are
    THUNKs and CAFFY static functions). For ManyConstructors this map gets
    50015 entries.

  - Once we're done with code generation we generate a NameSet from SRTMap
    for the non-CAFFY names in the current module. This set gets the same
    number of entries as the SRTMap.

  - Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using
    the NameSet generated in the previous step. This usually does the least
    amount of allocation among the work listed here.

  The only place where this patch does less work is the CAF analysis in the
  tidying pass (CoreTidy). However, that doesn't save us much, as the pass
  still needs to traverse the whole program and update IdInfos for other
  reasons. The only thing we no longer do here is the `hasCafRefs` pass over
  the RHS of bindings, which is a stateless pass that returns a boolean
  value, so it doesn't allocate much.

  (Metric changes below are all increased allocations)

  Metric changes
  --------------

  Metric Increase:
      ManyAlternatives
      ManyConstructors
      T13035
      T14683
      T1969
      T9961
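The last two steps of the list above (deriving the set of non-CAFFY names from the SRT map, then assigning final CafInfos from that set) can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: `Name` is a plain `String` here, `SRTEntry` is a placeholder, and the convention that `Nothing` marks a non-CAFFY binding is an assumption made for the sketch. Only the `CafInfo` constructor names mirror GHC's.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Maybe (isNothing)

-- Hypothetical stand-ins for GHC's Name and SRT entry types.
type Name = String
newtype SRTEntry = SRTEntry String deriving (Eq, Show)

-- Assumption for this sketch: a binding mapped to Nothing needs no SRT,
-- i.e. it is non-CAFFY.
type SRTMap = Map.Map Name (Maybe SRTEntry)

data CafInfo = MayHaveCafRefs | NoCafRefs deriving (Eq, Show)

-- Collect the names that turned out to be non-CAFFY after code generation.
nonCaffyNames :: SRTMap -> Set.Set Name
nonCaffyNames = Map.keysSet . Map.filter isNothing

-- Assign the final CafInfo for a name using the set computed above.
finalCafInfo :: Set.Set Name -> Name -> CafInfo
finalCafInfo nonCaffy name
  | name `Set.member` nonCaffy = NoCafRefs
  | otherwise                  = MayHaveCafRefs
```

Both steps are single passes over the map and the set, which is consistent with the commit message's remark that the final update is the cheapest part of the new work.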
* Module hierarchy: Cmm (cf #13009) | Sylvain Henry | 2020-01-25 | 1 | -3/+3
* llvmGen: Drop old fix for #11649 | Ben Gamari | 2019-12-30 | 1 | -36/+1
  This was a hack which is no longer necessary now that we introduce a
  dedicated entry block for each procedure.
* Add GHC-API logging hooks | Sylvain Henry | 2019-12-18 | 1 | -1/+2
  * Add 'dumpAction' hook to DynFlags.

    It allows GHC API users to catch dumped intermediate codes and
    information. The format of the dump (Core, Stg, raw text, etc.) is now
    reported, allowing easier automatic handling.

  * Add 'traceAction' hook to DynFlags.

    Some dumps go through the trace mechanism (for instance unfoldings that
    have been considered for inlining). This is problematic because:

    1) dumps aren't written into files even with -ddump-to-file on
    2) dumps are written on stdout even with GHC API
    3) in this specific case, dumping depends on unsafe globally stored
       DynFlags which is bad for GHC API users

    We introduce the 'traceAction' hook which allows GHC API users to catch
    those traces and to avoid using globally stored DynFlags.

  * Avoid dumping empty logs via dumpAction/traceAction (but still write
    empty files to keep the existing behavior)
* Optimize MonadUnique instances based on IO (#16843) | nineonine | 2019-11-19 | 1 | -3/+3
  Metric Decrease:
      T14683
* For s390x issue a warning if LLVM 9 or older is used | Stefan Schulze Frielinghaus | 2019-11-07 | 1 | -0/+6
  For s390x the GHC calling convention is only supported since LLVM
  version 10. Issue a warning in case an older version of LLVM is used.
* Make dynflag argument for withTiming pure. | Andreas Klebinger | 2019-10-23 | 1 | -1/+1
  19 times out of 20 we already have dynflags in scope.

  We could just always use `return dflags`. But this is in fact not free.
  When looking at some STG code I noticed that we always allocate a closure
  for this expression in the heap. Clearly a waste in these cases.

  For the other cases we can either just modify the call site to get
  dynflags, or use the _D variants of withTiming that I added, which use
  getDynFlags under the hood.
* Refactor, document, and optimize LLVM configuration loading | Ben Gamari | 2019-10-07 | 1 | -4/+10
  As described in the new Note [LLVM Configuration] in SysTools, we now
  load llvm-targets and llvm-passes lazily to avoid the overhead of doing so
  when -fllvm isn't used (also known as "the common case").

  Noticed in #17003.

  Metric Decrease:
      T12234
      T12150
* Module hierarchy: StgToCmm (#13009) | Sylvain Henry | 2019-09-10 | 1 | -1/+1
  Add StgToCmm module hierarchy. Platform modules that are used in several
  other places (NCG, LLVM codegen, Cmm transformations) are put into
  GHC.Platform.
* Fix LLVM version check yet again | Ömer Sinan Ağacan | 2019-08-29 | 1 | -14/+14
  There were two problems with LLVM version checking:

  - The parser would only parse x and x.y formatted versions. E.g. 1.2.3
    would be rejected.

  - The version check was too strict and would reject x.y formatted
    versions. E.g. when we support version 7 it'd reject 7.0 ("LLVM version
    7.0") and only accept 7 ("LLVM version 7").

  We now parse versions with arbitrarily deep minor numbering (x.y.z.t...)
  and accept versions as long as the major version matches the supported
  version (e.g. 7.1, 7.1.2, 7.1.2.3 ...).
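The behaviour described in this commit (parse arbitrarily deep version numbers, then compare only the major component) can be sketched as below. This is an illustrative reimplementation, not the code in LlvmCodeGen.hs: the names `parseVersion`, `isSupported` and `supportedMajor` are hypothetical, and the supported major version is hard-coded to 7 only to match the example in the message.

```haskell
import Data.Char (isDigit)
import Data.List.NonEmpty (NonEmpty(..))

-- Hypothetical stand-in for the single supported major version.
supportedMajor :: Int
supportedMajor = 7

-- Parse "7", "7.1", "7.1.2.3", ... into a non-empty list of components.
-- Any empty or non-numeric component makes the whole parse fail.
parseVersion :: String -> Maybe (NonEmpty Int)
parseVersion s = case splitOnDots s of
  []       -> Nothing
  (c : cs) -> (:|) <$> readComponent c <*> traverse readComponent cs
  where
    readComponent ds
      | not (null ds) && all isDigit ds = Just (read ds)
      | otherwise                       = Nothing

-- Split a string at every '.', keeping empty chunks so that inputs such
-- as "7." or "" are rejected by readComponent.
splitOnDots :: String -> [String]
splitOnDots s = case break (== '.') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnDots rest

-- Accept any version whose major component matches; minor parts are ignored.
isSupported :: NonEmpty Int -> Bool
isSupported (major :| _) = major == supportedMajor
```

Representing the version as a non-empty list makes "there is always a major component" a property of the type, so the check cannot be tripped up by how many minor components follow.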
* Return results of Cmm streams in backends | Ömer Sinan Ağacan | 2019-08-28 | 1 | -5/+9
  This generalizes code generators (outputAsm, outputLlvm, outputC, and the
  call site codeOutput) so that they'll return the return values of the
  passed Cmm streams.

  This allows accumulating data during Cmm generation and returning it to
  the call site in HscMain.

  Previously the Cmm streams were assumed to return (), so the code
  generators returned () as well.

  This change is required by !1304 and !1530.

  Skipping CI as this was tested before and I only updated the commit
  message.

  [skip ci]
* Make non-streaming LLVM and C backends streaming | Ömer Sinan Ağacan | 2019-08-23 | 1 | -2/+1
  This adds a Stream.consume function, uses it in LLVM and C code
  generators, and removes the use of Stream.collect function which was used
  to collect streaming Cmm generation results into a list.

  LLVM and C backends now properly use streamed Cmm generation, instead of
  collecting Cmm groups into a list before generating LLVM/C code.
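The difference between collecting a stream and consuming it, as described in this commit, can be illustrated with a minimal sketch. The `Stream` type below is a deliberately simplified stand-in defined for this example only; it is not GHC's `Stream` module, and the signatures of `collect` and `consume` here are assumptions.

```haskell
-- A simplified effectful stream: each step either ends the stream or
-- yields one element together with the rest of the stream.
newtype Stream m a = Stream { runStream :: m (Maybe (a, Stream m a)) }

-- Build a stream from a list (for demonstration only).
fromList :: Monad m => [a] -> Stream m a
fromList []       = Stream (return Nothing)
fromList (x : xs) = Stream (return (Just (x, fromList xs)))

-- Gather every element into a list before any of them is processed.
collect :: Monad m => Stream m a -> m [a]
collect str = do
  step <- runStream str
  case step of
    Nothing        -> return []
    Just (x, rest) -> (x :) <$> collect rest

-- Run an action on each element as soon as it is produced; no list of
-- elements is built up.
consume :: Monad m => Stream m a -> (a -> m ()) -> m ()
consume str f = do
  step <- runStream str
  case step of
    Nothing        -> return ()
    Just (x, rest) -> f x >> consume rest f
```

With `collect`, all Cmm groups would have to exist in memory before the backend starts; with `consume`, each group can be turned into LLVM or C output and dropped before the next one is generated.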
* Remove LLVM_TARGET platform macros | John Ericson | 2019-07-14 | 1 | -1/+1
  Instead, following @angerman's suggestion, put them in the config file.
  Maybe we could re-key llvm-targets someday, but this is good for now.
* Fixes for LLVM 7 | Erik de Castro Lopo | 2019-06-24 | 1 | -1/+1
  LLVM version numbering changed recently. Previously, releases were
  numbered 4.0, 5.0 and 6.0 but with version 7, they dropped the redundant
  ".0".

  Fix requires for Llvm detection and some code.
* Update Trac ticket URLs to point to GitLab | Ryan Scott | 2019-03-15 | 1 | -1/+1
  This moves all URL references to Trac tickets to their corresponding
  GitLab counterparts.
* Minor performance optimisation | Gabor Greif | 2018-11-22 | 1 | -5/+5
  only concat once
* compiler: introduce custom "GhcPrelude" Prelude | Herbert Valerio Riedel | 2017-09-19 | 1 | -0/+2
  This switches the compiler/ component to get compiled with
  -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules.

  This is motivated by the upcoming "Prelude" re-export of
  `Semigroup((<>))` which would cause lots of name clashes in every module
  which also imports `Outputable`.

  Reviewers: austin, goldfire, bgamari, alanz, simonmar
  Reviewed By: bgamari
  Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
  Differential Revision: https://phabricator.haskell.org/D3989
* Clean up opt and llc | Moritz Angermann | 2017-09-06 | 1 | -1/+10
  The LLVM backend shells out to LLVM's `opt` and `llc` tools. This clean up
  introduces a shared data structure to carry the arguments we pass to each
  tool so that corresponding flags are next to each other. It drops the
  hard coded data layouts in favor of using `-mtriple` and having LLVM infer
  them. Furthermore we add `clang` as a proper tool, so we don't rely on
  assuming that `clang` is called `clang` on the `PATH` when using `clang`
  as the assembler. Finally this diff also changes the type of `optLevel`
  from `Int` to `Word`, as we do not have negative optimization levels.

  Reviewers: erikd, hvr, austin, rwbarton, bgamari, kavon
  Reviewed By: kavon
  Subscribers: michalt, Ericson2314, ryantrinkle, dfeuer, carter, simonpj,
  kavon, simonmar, thomie, erikd, snowleopard
  Differential Revision: https://phabricator.haskell.org/D3352
* Hoopl: remove dependency on Hoopl package | Michal Terepeta | 2017-06-23 | 1 | -1/+2
  This copies the subset of Hoopl's functionality needed by GHC to
  `cmm/Hoopl` and removes the dependency on the Hoopl package.

  The main motivation for this change is the confusing/noisy interface
  between GHC and Hoopl:

  - Hoopl has `Label` which is GHC's `BlockId` but different than GHC's
    `CLabel`
  - Hoopl has `Unique` which is different than GHC's `Unique`
  - Hoopl has `Unique{Map,Set}` which are different than GHC's
    `Uniq{FM,Set}`
  - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
    needed just to filter the exposed functions (filter out some of the
    Hoopl's and add the GHC ones)

  With this change, we'll be able to simplify this significantly. It'll
  also be much easier to do invasive changes (Hoopl is a public package on
  Hackage with users that depend on the current behavior).

  This should introduce no changes in functionality - it merely copies the
  relevant code.

  Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>

  Test Plan: ./validate
  Reviewers: austin, bgamari, simonmar
  Reviewed By: bgamari, simonmar
  Subscribers: simonpj, kavon, rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D3616
* LLVM: Tweak TBAA metadata codegen | Erik de Castro Lopo | 2017-01-16 | 1 | -6/+7
  This change is required for LLVM 4.0. GHC doesn't use that version yet,
  but this change is just as valid for versions earlier than 4.0.

  Two changes needed:

  * Previously, GHC defined a `topN` node in the TBAA hierarchy and some IR
    instructions referenced that node. With LLVM 4.0 the root node can no
    longer be referenced by IR instructions, so we introduce a new element
    `rootN` and make `topN` a child of that.

  * Previously the root TBAA node was rendered as "!0 = !{!"root", null}".
    With LLVM 4.0 that needs to be "!0 = !{!"root"}" which is also accepted
    by earlier versions.

  Test Plan: Build with quick-llvm BuildFlavor and run tests
  Reviewers: bgamari, drbo, austin, angerman, michalt, DemiMarie
  Reviewed By: DemiMarie
  Subscribers: mpickering, DemiMarie, thomie
  Differential Revision: https://phabricator.haskell.org/D2975
* llvmGen: Make metadata ids a newtype | Ben Gamari | 2016-06-18 | 1 | -1/+1
  These were previously just represented as Ints which was needlessly
  vague.
* ErrUtils: Add timings to compiler phases | Ben Gamari | 2016-03-24 | 1 | -1/+2
  This adds timings and allocation figures to the compiler's output when
  run with `-v2` in an effort to ease performance analysis.

  Todo:
  * Documentation
  * Where else should we add these?
  * Perhaps we should remove some of the now-arguably-redundant `showPass`
    occurrences where they are
  * Must we force more?
  * Perhaps we should place this behind a `-ftimings` instead of `-v2`

  Test Plan: `ghc -v2 Test.hs`, look at the output
  Reviewers: hvr, goldfire, simonmar, austin
  Reviewed By: simonmar
  Subscribers: angerman, michalt, niteria, ezyang, thomie
  Differential Revision: https://phabricator.haskell.org/D1959
* LlvmCodeGen: Fix generation of malformed LLVM blocks | Erik de Castro Lopo | 2016-03-12 | 1 | -1/+33
  Commit 673efccb3b uncovered a bug in LLVM code generation that produced
  LLVM code that the LLVM compiler refused to compile:

      {
      clpH:
        br label %clpH
      }

  This may well be a bug in LLVM itself. The solution is to keep the
  existing entry label and rewrite the function as:

      {
      clpH:
        br label %nPV
      nPV:
        br label %nPV
      }

  Thanks to Ben Gamari for pointing me in the right direction on this one.

  Test Plan: Build GHC with BuildFlavour=quick-llvm
  Reviewers: hvr, austin, bgamari
  Reviewed By: bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1996
  GHC Trac Issues: #11649
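The rewrite described in this commit can be sketched on a toy control-flow representation. This is only an illustration of the idea, not the code from LlvmCodeGen.hs: `Block`, `fixEntry` and the representation of a block as a label plus its branch targets are all hypothetical, and the fresh label is passed in by the caller rather than generated.

```haskell
type Label = String

-- A toy basic block: its label and the labels it may branch to.
data Block = Block
  { blockLabel    :: Label
  , branchTargets :: [Label]
  } deriving (Eq, Show)

-- If some block branches back to the entry block, keep the entry label on
-- a new block that only jumps to a fresh label, and move the original
-- entry body under that fresh label, redirecting all branches to it.
fixEntry :: Label -> [Block] -> [Block]
fixEntry _ [] = []
fixEntry fresh blocks@(Block entry _ : _)
  | any (elem entry . branchTargets) blocks =
      Block entry [fresh] : map rename blocks
  | otherwise = blocks
  where
    rename (Block l ts) = Block (subst l) (map subst ts)
    subst l
      | l == entry = fresh
      | otherwise  = l
```

Applied to the single self-looping block from the message, with `"nPV"` as the fresh label, this yields an entry block `clpH` that branches to `nPV` and a block `nPV` that branches to itself, matching the rewritten function shown above.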
* LLVM backend: Show expected LLVM version in warnings/errors | Ömer Sinan Ağacan | 2015-12-18 | 1 | -0/+3
  Summary:
  Before:

      [1 of 1] Compiling Main ( Main.hs, Main.o )
      You are using a new version of LLVM that hasn't been tested yet!
      We will try though...

  After:

      [1 of 1] Compiling Main ( Main.hs, Main.o )
      You are using an unsupported version of LLVM!
      Currently only 3.7 is supported.
      We will try though...

  Before:

      [1 of 1] Compiling Main ( Main.hs, Main.o )

      <no location info>: Warning:
          Couldn't figure out LLVM version!
          Make sure you have installed LLVM
      ghc: could not execute: opt

  After:

      [1 of 1] Compiling Main ( Main.hs, Main.o )

      <no location info>: error:
          Warning: Couldn't figure out LLVM version!
                   Make sure you have installed LLVM 3.7
      ghc-stage1: could not execute: opt

  Reviewers: austin, rwbarton, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1658
* Switch to LLVM version 3.7 | Erik de Castro Lopo | 2015-10-14 | 1 | -10/+3
  Before this commit, GHC only supported LLVM 3.6. Now it only supports
  LLVM 3.7 which was released in August 2015. LLVM version 3.6 and earlier
  do not work on AArch64/Arm64, but 3.7 does.

  Also:
  * Add CC_Ghc constructor to LlvmCallConvention.
  * Replace `maxSupportLlvmVersion`/`minSupportLlvmVersion` with a single
    `supportedLlvmVersion` variable.
  * Get `supportedLlvmVersion` from version specified in configure.ac.
  * Drop llvmVersion field from DynFlags (no longer needed because only one
    version is supported).

  Test Plan: Validate on x86_64 and arm
  Reviewers: bgamari, austin
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1320
  GHC Trac Issues: #10953
* Revert "Switch to LLVM version 3.7" | Erik de Castro Lopo | 2015-10-10 | 1 | -3/+8
  Pushed by mistake before it was ready.

  This reverts commit 5dc3db743ec477978b9727a313951be44dbd170f.
* Switch to LLVM version 3.7 | Erik de Castro Lopo | 2015-10-10 | 1 | -8/+3
* llvmGen: move to LLVM 3.6 exclusively | Ben Gamari | 2015-02-09 | 1 | -6/+1
  Summary:
  Rework llvmGen to use LLVM 3.6 exclusively. The plans for the 7.12 release
  are to ship LLVM alongside GHC in the interests of user (and developer)
  sanity.

  Along the way, refactor TNTC support to take advantage of the new
  `prefix` data support in LLVM 3.6. This allows us to drop the
  section-reordering component of the LLVM mangler.

  Test Plan: Validate, look at emitted code
  Reviewers: dterei, austin, scpmw
  Reviewed By: austin
  Subscribers: erikd, awson, spacekitteh, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D530
  GHC Trac Issues: #10074
* llvmGen: Compatibility with LLVM 3.5 (re #9142) | Ben Gamari | 2014-11-21 | 1 | -2/+3
  Due to changes in LLVM 3.5, aliases now may only refer to definitions.
  Previously, to handle symbols defined outside of the current compilation
  unit, GHC would emit both an `external` declaration and an alias pointing
  to it, e.g.,

      @stg_BCO_info = external global i8
      @stg_BCO_info$alias = alias private i8* @stg_BCO_info

  where references to `stg_BCO_info` would use the alias
  `stg_BCO_info$alias`. This is not permitted under the new alias behavior,
  resulting in errors resembling,

      Alias must point to a definition
      i8* @"stg_BCO_info$alias"

  To fix this, we invert the naming relationship between aliases and
  definitions. That is, now the symbol definition takes the name
  `@stg_BCO_info$def` and references use the actual name, `@stg_BCO_info`.
  This means the external symbols can be handled by simply emitting an
  `external` declaration,

      @stg_BCO_info = external global i8

  Whereas in the case of a forward declaration we emit,

      @stg_BCO_info = alias private i8* @stg_BCO_info$def

  Reviewed By: austin
  Differential Revision: https://phabricator.haskell.org/D155
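The inverted naming scheme from this commit can be summarised with a small sketch that produces the two declaration forms quoted in the message. The `Symbol` type and the function names are hypothetical and exist only for this illustration; the emitted strings are exactly the forms shown above and nothing more of the LLVM output is modelled.

```haskell
-- Hypothetical classification of a symbol referenced by the current
-- compilation unit.
data Symbol
  = External String  -- defined in another compilation unit
  | Forward  String  -- defined later in this compilation unit

-- Under the new scheme the definition itself carries the "$def" suffix.
defName :: String -> String
defName n = n ++ "$def"

-- Declaration emitted so that references can use the plain symbol name.
declaration :: Symbol -> String
declaration (External n) = "@" ++ n ++ " = external global i8"
declaration (Forward n)  =
  "@" ++ n ++ " = alias private i8* @" ++ defName n
```

Because the alias now points at a definition inside the same module (the `$def` name), it satisfies the LLVM 3.5 rule, while external symbols need no alias at all.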
* Add LANGUAGE pragmas to compiler/ source files | Herbert Valerio Riedel | 2014-05-15 | 1 | -2/+2
  In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
  reorganized, while following the convention, to

  - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
    any `{-# OPTIONS_GHC #-}`-lines.

  - Moreover, if the list of language extensions fit into a single
    `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
    line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
    individual language extension. In both cases, try to keep the
    enumeration alphabetically ordered. (The latter layout is preferable as
    it's more diff-friendly)

  While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
  occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
* Validate inferred theta. Fixes #8883 | Jan Stolarek | 2014-04-19 | 1 | -0/+1
  This checks that all the required extensions are enabled for the inferred
  type signature. Updates binary and vector submodules.
* LLVM refactor cleanups | Peter Wortmann | 2013-06-27 | 1 | -3/+1
  Slightly more documentation, removed unused label map (huh), removed
  MonadIO instance on LlvmM to improve encapsulation.
* Major Llvm refactoring | Peter Wortmann | 2013-06-27 | 1 | -110/+143
  This combined patch reworks the LLVM backend in a number of ways:

  1. Most prominently, we introduce a LlvmM monad carrying the contents of
     the old LlvmEnv around. This patch completely removes LlvmEnv and
     refactors towards standard library monad combinators wherever possible.

  2. Support for streaming - we can now generate chunks of Llvm for Cmm as
     it comes in. This might improve our speed.

  3. To allow streaming, we need a more flexible way to handle forward
     references. The solution (getGlobalPtr) unifies LlvmCodeGen.Data and
     getHsFunc as well.

  4. Skip alloca-allocation for registers that are actually never written.
     LLVM will automatically eliminate these, but output is smaller and
     friendlier to human eyes this way.

  5. We use LlvmM to collect references for llvm.used. This allows places
     other than cmmProcLlvmGens to generate entries.
* Extend globals to aliases | Peter Wortmann | 2013-06-27 | 1 | -2/+3
  Also give them a proper constructor - getGlobalVar and getGlobalValue map
  directly to the accessors.
* Avoid generating empty llvm.used definitions. | Geoffrey Mainland | 2013-06-12 | 1 | -12/+12
  LLVM 3.3rc3 complains when the llvm.used global is an empty array, so
  don't define llvm.used at all when it would be empty.
* Output LLVM version in use at -V2. | David Terei | 2013-01-17 | 1 | -0/+2
* Add -f[no-]warn-unsupported-llvm-version. Closes Trac #7579. | Austin Seipp | 2013-01-16 | 1 | -2/+3
  This controls whether or not the compiler warns if we're using an LLVM
  version that's too old or too new. It's mostly useful when building the
  compiler knowingly with an unsupported version, so you don't get a lot of
  warnings in the build process.

  There's no documentation for this since it's a flag only a few developers
  would care about anyway.

  Signed-off-by: Austin Seipp <mad.one@gmail.com>
* Fix warnings | Simon Marlow | 2012-11-12 | 1 | -1/+1
* Remove OldCmm, convert backends to consume new Cmm | Simon Marlow | 2012-11-12 | 1 | -9/+8
  This removes the OldCmm data type and the CmmCvt pass that converts new
  Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been converted to
  consume new Cmm.

  The main difference between the two data types is that conditional
  branches in new Cmm have both true/false successors, whereas in OldCmm
  the false case was a fallthrough. To generate slightly better code we
  occasionally need to invert a conditional to ensure that the
  branch-not-taken becomes a fallthrough; this was previously done in
  CmmCvt, and it is now done in CmmContFlowOpt.

  We could go further and use the Hoopl Block representation for native
  code, which would mean that we could use Hoopl's postorderDfs and
  analyses for native code, but for now I've left it as is, using the old
  ListGraph representation for native code.
* Generate correct LLVM for the new register allocation scheme. | Geoffrey Mainland | 2012-10-30 | 1 | -2/+2
  We now have accurate global register liveness information attached to all
  Cmm procedures and jumps. With this patch, the LLVM back end uses this
  information to pass only the live floating point (F and D) registers on
  tail calls. This makes the LLVM back end compatible with the new register
  allocation strategy.

  Ideally the GHC LLVM calling convention would put all registers that are
  always live first in the parameter sequence. Unfortunately the
  specification is written so that on x86-64 SpLim (always live) is passed
  after the R registers. Therefore we must always pass *something* in the R
  registers, so we pass the LLVM value undef.
* Attach global register liveness info to Cmm procedures. | Geoffrey Mainland | 2012-10-30 | 1 | -2/+2
  All Cmm procedures now include the set of global registers that are live
  on procedure entry, i.e., the global registers used to pass arguments to
  the procedure. Only global registers that are used to pass arguments are
  included in this list.
* Pass DynFlags down to bWord | Ian Lynagh | 2012-09-12 | 1 | -1/+1
  I've switched to passing DynFlags rather than Platform, as (a) it's
  simpler to not have to extract targetPlatform in so many places, and (b)
  it may be useful to have DynFlags around in future.
* Move activeStgRegs into CodeGen.Platform | Ian Lynagh | 2012-08-21 | 1 | -1/+1
* Add "Unregisterised" as a field in the settings file | Ian Lynagh | 2012-08-07 | 1 | -1/+1
  To explicitly choose whether you want an unregisterised build you now
  need to use the "--enable-unregisterised"/"--disable-unregisterised"
  configure flags.
* New codegen: do not split proc-points when using the NCG | Simon Marlow | 2012-07-30 | 1 | -2/+2
  Proc-point splitting is only required by backends that do not support
  having proc-points within a code block (that is, everything except the
  native backend, i.e. LLVM and C).

  Not doing proc-point splitting saves some compilation time, and might
  produce slightly better code in some cases.
* tweak llvm version warning message | David Terei | 2012-06-25 | 1 | -2/+2
* Warn if using unsupported version of LLVM. | David Terei | 2012-06-25 | 1 | -3/+18
* Remove some more redundant Platform arguments | Ian Lynagh | 2012-06-20 | 1 | -1/+1
* Add DynFlags to the SDoc state | Ian Lynagh | 2012-06-12 | 1 | -5/+5
* Use SDoc rather than Doc in LLVM | Ian Lynagh | 2012-06-12 | 1 | -7/+9
  In particular, this makes life simpler when we want to use a general GHC
  SDoc in the middle of some LLVM.