|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | This patch removes all CafInfo predictions and various hacks to preserve
predicted CafInfos from the compiler and assigns final CafInfos to
interface Ids after code generation. SRT analysis is extended to support
static data, and the Cmm generator is modified to allow generating
static_link fields after SRT analysis.
This also fixes `-fcatch-bottoms`, which introduces error calls in case
expressions in CorePrep, which runs *after* CoreTidy (which is where we
decide on CafInfos) and turns previously non-CAFFY things into CAFFY.
Fixes #17648
Fixes #9718
Evaluation
==========
NoFib
-----
Boot with: `make boot mode=fast`
Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1`
--------------------------------------------------------------------------------
        Program           Size    Allocs    Instrs     Reads    Writes
--------------------------------------------------------------------------------
             CS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            CSD          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             FS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              S          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             VS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            VSD          -0.0%      0.0%     -0.0%     -0.0%     -0.5%
            VSM          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           anna          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
           ansi          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           atom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         awards          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         banner          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     bernouilli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   binary-trees          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          boyer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         boyer2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           bspt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      cacheprof          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       calendar          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       cichelli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        circsim          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       clausify          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
  comp_lab_zift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       compress          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      compress2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
    constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   cryptarithm1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   cryptarithm2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            cse          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   digits-of-e1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   digits-of-e2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         dom-lt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          eliza          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          event          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
    exact-reals          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         exp3_8          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         expert          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
 fannkuch-redux          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          fasta          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            fem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            fft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           fft2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           fish          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          fluid          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         fulsom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         gamteb          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            gcd          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
    gen_regexps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         genfft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             gg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           grep          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         hidden          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            hpg          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            ida          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          infer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        integer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      integrate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   k-nucleotide          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          kahan          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        knights          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         lambda          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     last-piece          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           life          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           lift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         linear          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
      listcompr          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       listcopy          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       maillist          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         mandel          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        mandel2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           mate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        minimax          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        mkhprog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     multiplier          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         n-body          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       nucleic2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           para          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      paraffins          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         parser          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
        parstof          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            pic          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       pidigits          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         pretty          -0.0%      0.0%     -0.3%     -0.4%     -0.4%
         primes          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      primetest          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         prolog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         puzzle          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         queens          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        reptile          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
reverse-complem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        rewrite          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           rfib          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            rsa          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            scc          -0.0%      0.0%     -0.3%     -0.5%     -0.4%
          sched          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            scs          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         simple          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
          solid          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        sorting          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
  spectral-norm          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         sphere          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         symalg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            tak          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      transform          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       treejoin          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      typecheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        veritas          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           wang          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      wave4main          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   wheel-sieve1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
   wheel-sieve2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           x2n1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
--------------------------------------------------------------------------------
            Min          -0.1%      0.0%     -0.3%     -0.5%     -0.5%
            Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
 Geometric Mean          -0.0%     -0.0%     -0.0%     -0.0%     -0.0%
--------------------------------------------------------------------------------
        Program           Size    Allocs    Instrs     Reads    Writes
--------------------------------------------------------------------------------
        circsim          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
    constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       gc_bench          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           hash          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
     spellcheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
--------------------------------------------------------------------------------
            Min          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
 Geometric Mean          -0.0%     +0.0%     -0.0%     -0.0%     -0.0%
Manual inspection of programs in testsuite/tests/programs
---------------------------------------------------------
I built these programs with a bunch of dump flags and `-O` and compared
STG, Cmm, and Asm dumps and file sizes.
(Below, the numbers in parentheses show the number of modules in each program.)
These programs have identical compiler outputs (same .hi and .o sizes, STG,
Cmm and Asm dumps):
- Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3),
  andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2),
  jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4),
  jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1),
  bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1),
  strict_anns (1), thurston-module-arith (2), okeefe_neural (1),
  joao-circular (6), 10queens (1)
Programs with different compiler outputs:
- jl_defaults (1): For some reason GHC HEAD marks a lot of top-level
  `[Int]` closures as CAFFY for no reason. With this patch we no longer
  make them CAFFY and generate fewer SRT entries. For some reason Main.o
  is slightly larger with this patch (1.3%) and the executable sizes are
  the same. (I'd expect both to be smaller)
- launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked
  as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the
  executable sizes are the same.
- galois_raytrace (13): Differences are in the Parse module. There are a
  lot, but some of the changes are caused by the fact that for some
  reason (I think a bug) GHC HEAD marks the dictionary for `Functor
  Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the
  same.
- north_array: We now generate fewer SRT entries because some of the array
  primops used in this program like `NewArrayOp` get eliminated during
  Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets about
  2.5% larger (9224 bytes, up from 9000 bytes), executable sizes are the same.
- seward-space-leak: Difference in this program is better shown by this
  smaller example:
      module Lib where
      data CDS
        = Case [CDS] [(Int, CDS)]
        | Call CDS CDS
      instance Eq CDS where
        Case sels1 rets1 == Case sels2 rets2 =
            sels1 == sels2 && rets1 == rets2
        Call a1 b1 == Call a2 b2 =
            a1 == a2 && b1 == b2
        _ == _ =
            False
   In this program GHC HEAD builds a new SRT for the recursive group of
   `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==`
   in its SRT field, and `==` uses the SRT object as its SRT. With this
   patch we use the closure for `/=` as the SRT and add `==` there. Then
   `/=` gets an empty SRT field and `==` points to `/=` in its SRT
   field.
   This change looks fine to me.
   Main.o gets 0.07% larger, executable sizes are identical.
head.hackage
------------
head.hackage's CI script builds 428 packages from Hackage using this
patch with no failures.
Compiler performance
--------------------
The compiler perf tests report that the compiler allocates slightly more
(worst case observed so far is 4%). However most programs in the test
suite are small, single file programs. To benchmark compiler performance
on something more realistic I built Cabal (the library, 236 modules)
with different optimisation levels. For the "max residency" row I ran
GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows
are generated with just `-s`. (This is because `-i0` causes the GC to run
much more frequently and as a result "bytes copied" gets inflated by
more than 25x in some cases)
* -O0
|                 | GHC HEAD       | This MR        | Diff   |
| --------------- | -------------- | -------------- | ------ |
| Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% |
| Bytes copied    |  4,926,037,184 |  4,990,638,760 | +1.31% |
| Max residency   |    421,225,624 |    424,324,264 | +0.73% |
* -O1
|                 | GHC HEAD        | This MR         | Diff   |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% |
| Bytes copied    |  26,943,452,560 |  27,089,972,296 | +0.54% |
| Max residency   |     982,643,440 |     991,663,432 | +0.91% |
* -O2
|                 | GHC HEAD        | This MR         | Diff   |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% |
| Bytes copied    |  37,044,237,616 |  36,121,690,472 | -2.49% |
| Max residency   |   1,071,600,328 |   1,086,396,256 | +1.38% |
Extra compiler allocations
--------------------------
Runtime allocations of programs are as reported above (NoFib section).
The compiler now allocates more than before. The main source of extra
allocation in this patch compared to the base commit is the new SRT algorithm
(GHC.Cmm.Info.Build). Below is some of the extra work we do with this
patch; the numbers were generated by a profiled stage 2 compiler when building
a pathological case (the test 'ManyConstructors') with '-O2':
- We now sort the final STG for a module, which means traversing the
  entire program, generating free variable set for each top-level
  binding, doing SCC analysis, and re-ordering the program. In
  ManyConstructors this step allocates 97,889,952 bytes.
- We now do SRT analysis on static data, which in a program like
  ManyConstructors means analysing 10,000 bindings that we would
  previously just skip. This step allocates 70,898,352 bytes.
- We now maintain an SRT map for the entire module as we compile Cmm
  groups:
      data ModuleSRTInfo = ModuleSRTInfo
        { ...
        , moduleSRTMap :: SRTMap
        }
   (SRTMap is just a strict Map from the 'containers' library)
   This map gets an entry for most bindings in a module (exceptions are
   THUNKs and CAFFY static functions). For ManyConstructors this map
   gets 50015 entries.
- Once we're done with code generation we generate a NameSet from SRTMap
  for the non-CAFFY names in the current module. This set gets the same
  number of entries as the SRTMap.
- Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using
  the NameSet generated in the previous step. This usually does the
  least amount of allocation among the work listed here.
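The sorting step in the first bullet above (free variables per top-level binding, SCC analysis, re-ordering) can be pictured with `Data.Graph.stronglyConnComp` from the 'containers' library. This is only a toy sketch under assumptions: `Binding`, `sortBindings` and `groupNames` are hypothetical names, bindings are identified by plain strings, and the free-variable lists are given by hand rather than computed from STG. It is not the code in GHC.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)

-- | Hypothetical stand-in for a top-level binding: its name plus the
-- top-level names that occur free in its right-hand side.
data Binding = Binding
  { bndrName :: String
  , freeVars :: [String]
  } deriving (Eq, Show)

-- | Group bindings into strongly connected components.
-- 'stronglyConnComp' returns components in reverse topological order,
-- so a binding's dependencies come before the binding itself.
sortBindings :: [Binding] -> [SCC Binding]
sortBindings bs =
  stronglyConnComp [ (b, bndrName b, freeVars b) | b <- bs ]

-- | Names in one group; a cyclic group is a set of mutually
-- recursive bindings.
groupNames :: SCC Binding -> [String]
groupNames (AcyclicSCC b) = [bndrName b]
groupNames (CyclicSCC bs) = map bndrName bs
```

For example, with `f` and `g` mutually recursive and `h` referring to `f`, the `f`/`g` group is emitted before `h`.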
The only place where we do less work with this patch is the CAF analysis in
the tidying pass (CoreTidy). However that doesn't save us much, as the
pass still needs to traverse the whole program and update IdInfos for
other reasons. The only thing we no longer do here is the `hasCafRefs` pass
over the RHS of bindings, which is a stateless pass that returns a boolean
value, so it doesn't allocate much.
(Metric changes below are all increased allocations)
Metric changes
--------------
Metric Increase:
    ManyAlternatives
    ManyConstructors
    T13035
    T14683
    T1969
    T9961 | 
| | incomplete-uni-patterns and incomplete-record-updates will be in -Wall at a
future date, so prepare for that by disabling those warnings on files that
trigger them. | 
| | Fixes the calling convention for functions passing raw SSE-register
values by adding padding as needed to get the values in the right
registers. This problem cropped up when some args were unused and dropped
from the live list.
This folds together 2e23e1c7de01c92b038e55ce53d11bf9db993dd4 and
73273be476a8cc6c13368660b042b3b0614fd928 previously from @kavon.
Metric Increase:
    T12707
    ManyConstructors | 
| | * Add 'dumpAction' hook to DynFlags.
It allows GHC API users to catch dumped intermediate code and
information. The format of the dump (Core, Stg, raw text, etc.) is now
reported, allowing easier automatic handling.
* Add 'traceAction' hook to DynFlags.
Some dumps go through the trace mechanism (for instance unfoldings that
have been considered for inlining). This is problematic because:
1) dumps aren't written into files even with -ddump-to-file on
2) dumps are written on stdout even with GHC API
3) in this specific case, dumping depends on unsafe globally stored
DynFlags which is bad for GHC API users
We introduce 'traceAction' hook which allows GHC API to catch those
traces and to avoid using globally stored DynFlags.
* Avoid dumping empty logs via dumpAction/traceAction (but still write
empty files to keep the existing behavior) | 
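The idea of a dump hook that reports the dump format and skips empty dumps can be sketched as follows. This is a toy model, not GHC's API: `Settings`, `DumpFormat`, `dumpHook`, `emitDump` and `runCaptured` are all hypothetical names, and the real hook's type is not reproduced here.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- | Hypothetical tag describing what kind of dump is being emitted.
data DumpFormat = FormatCore | FormatSTG | FormatText
  deriving (Eq, Show)

-- | Hypothetical settings record with an overridable dump hook.
-- The hook receives the format, a title and the dump body.
newtype Settings = Settings
  { dumpHook :: DumpFormat -> String -> String -> IO () }

-- | Default behaviour in this sketch: print the dump to stdout.
defaultSettings :: Settings
defaultSettings = Settings
  { dumpHook = \_ title body ->
      putStrLn ("==== " ++ title ++ " ====\n" ++ body) }

-- | Route a dump through the hook, skipping empty dumps.
emitDump :: Settings -> DumpFormat -> String -> String -> IO ()
emitDump s fmt title body
  | null body = pure ()
  | otherwise = dumpHook s fmt title body

-- | A hook that records dumps in memory instead of printing them.
capturingSettings :: IORef [(DumpFormat, String, String)] -> Settings
capturingSettings ref = Settings
  { dumpHook = \fmt title body -> modifyIORef' ref ((fmt, title, body) :) }

-- | Run an action with a capturing hook and return the dumps in order.
runCaptured :: (Settings -> IO ()) -> IO [(DumpFormat, String, String)]
runCaptured act = do
  ref <- newIORef []
  act (capturingSettings ref)
  reverse <$> readIORef ref
```

An API client would pass its own hook instead of `defaultSettings`, so nothing reaches stdout unless the client chooses to print it.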
| | Metric Decrease:
    T14683 | 
| | Add StgToCmm module hierarchy. Platform modules that are used in several
other places (NCG, LLVM codegen, Cmm transformations) are put into
GHC.Platform. | 
| | There were two problems with LLVM version checking:
- The parser would only parse x and x.y formatted versions. E.g. 1.2.3
  would be rejected.
- The version check was too strict and would reject x.y formatted
  versions. E.g. when we support version 7 it'd reject 7.0 ("LLVM
  version 7.0") and only accept 7 ("LLVM version 7").
We now parse versions with arbitrarily deep minor numbering (x.y.z.t...)
and accept versions as long as the major version matches the supported
version (e.g. 7.1, 7.1.2, 7.1.2.3 ...). | 
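The two rules described above (parse arbitrarily deep version numbers, then compare only the major component) can be sketched like this. The function names `parseLlvmVersion`, `splitOnDot` and `isSupported` are hypothetical and this is not the parser used in GHC; it only illustrates the behaviour the message describes.

```haskell
import Data.Char (isDigit)

-- | Split a string on '.' characters, keeping empty components so that
-- malformed input such as "7." can be rejected by the caller.
splitOnDot :: String -> [String]
splitOnDot str = case break (== '.') str of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnDot rest

-- | Parse "x", "x.y", "x.y.z", ... into its numeric components.
-- Returns Nothing if any component is empty or not purely digits.
parseLlvmVersion :: String -> Maybe [Int]
parseLlvmVersion s
  | any bad parts = Nothing
  | otherwise     = Just (map read parts)
  where
    parts = splitOnDot s
    bad p = null p || not (all isDigit p)

-- | Accept any version whose major component equals the supported one.
isSupported :: Int -> [Int] -> Bool
isSupported major (m : _) = m == major
isSupported _     []      = False
```

With a supported major version of 7, the inputs "7", "7.0" and "7.1.2.3" are all accepted, while "6.0" is rejected.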
| | This generalizes code generators (outputAsm, outputLlvm, outputC, and
the call site codeOutput) so that they'll return the return values of
the passed Cmm streams.
This allows accumulating data during Cmm generation and returning it to
the call site in HscMain.
Previously the Cmm streams were assumed to return (), so the code
generators returned () as well.
This change is required by !1304 and !1530.
Skipping CI as this was tested before and I only updated the commit
message.
[skip ci] | 
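The shape of this generalisation can be shown on a toy stream type. The sketch below is pure and much simpler than the monadic streams used in the compiler; `Stream`, `fromList` and `outputWith` are hypothetical names introduced only for illustration.

```haskell
-- | Toy stream: a sequence of elements of type 'a' that ends with a
-- final result of type 'r'.
data Stream a r
  = Done r
  | Yield a (Stream a r)

-- | Build a stream from a list and a final result.
fromList :: [a] -> r -> Stream a r
fromList xs r = foldr Yield (Done r) xs

-- | Consume a stream, rendering every element, and hand the stream's
-- own result back to the caller instead of discarding it.
-- A consumer restricted to 'Stream a ()' could only ever return ().
outputWith :: (a -> String) -> Stream a r -> ([String], r)
outputWith _      (Done r)       = ([], r)
outputWith render (Yield x rest) =
  let (out, r) = outputWith render rest
  in  (render x : out, r)
```

Here the `r` component plays the role of data accumulated while the stream is produced and returned to the call site once output is finished.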
| | Unfortunately this will require more work; register allocation is
quite broken.
This reverts commit acd795583625401c5554f8e04ec7efca18814011. | 
| | This adds support for constructing vector types from Float#, Double# etc
and performing arithmetic operations on them
Cleaned-Up-By: Ben Gamari <ben@well-typed.com> | 
| | LLVM version numbering changed recently. Previously, releases were numbered
4.0, 5.0 and 6.0 but with version 7, they dropped the redundant ".0".
Fix requires for Llvm detection and some code. | 
| | ghc-pkg needs to be aware of platforms so it can figure out which
subdir within the user package db to use. This is admittedly
roundabout, but maybe Cabal could use the same notion of a platform as
GHC to good effect too. | 
| | 1. If GHC is to be multi-target, these cannot be baked in at compile
   time.
2. Compile-time flags have a higher maintenance cost than run-time flags.
3. The old way mixes build system implementation (various bootstrapping
   details) with the thing being built. E.g. GHC doesn't need to care
   about which integer library *will* be used---this is purely a crutch
   so the build system doesn't need to pass flags later when using that
   library.
4. Experience with cross compilation in Nixpkgs has shown things work
   nicer when compilers can *optionally* delegate the bootstrapping to the
   package manager. The package manager knows the entire end-goal build
   plan, and thus can make top-down decisions on bootstrapping. GHC can
   just worry about GHC, not even core libraries like base and ghc-prim! | 
| | When a new closure identifier is being established to a
local or exported closure already emitted into the same
module, refrain from adding an IND_STATIC closure, and
instead emit an assembly-language alias.
Inter-module IND_STATIC objects still remain, and need to be
addressed by other measures.
Binary-size savings on nofib are around 0.1%. | 
| | * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 constructors
* makes -msse2 assumed/default for x86 platforms, fixing a long-standing nondeterminism in rounding
behavior in 32-bit Haskell code
* removes the 80-bit floating point representation from the supported float sizes
* there's still one tiny bit of x87 support needed,
for handling float and double return values in FFI calls wrt the C ABI on x86_32,
but this one piece does not leak into the rest of the NCG.
* Lots of code that's not been touched in a long time got deleted as a
consequence of all of this
All in all, this change paves the way towards a lot of future
improvements in how GHC handles floating point computations, along with
making the native code gen more accessible to a larger pool of contributors. | 
| | The alias is of type i8, so its global variable name
should have type i8*. Anyway we should never deal
with pointers to (i8*)! | 
| | remove local | 
| | This reverts commit adcb5fb47c0942671d409b940d8884daa9359ca4. | 
| | This reverts commit d8495549ba9d194815c2d0eaee6797fc7c00756a. | 
| | We now calculate the SSE register padding needed to fix the calling
convention in LLVM in a robust way: grouping them by whether
registers in that class overlap (with the same class overlapping
itself).
My prior patch assumed that no matter the platform, physical
register Fx aliases with Dx, etc, for our calling convention.
This is unfortunately not the case for any platform except x86-64.
Test Plan:
Only know how to test on x86-64, but it should be tested on ARM with:
`make test WAYS=llvm && make test WAYS=optllvm`
Reviewers: bgamari, angerman
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #15780, #14251, #15747
Differential Revision: https://phabricator.haskell.org/D5254 | 
| | - Fix for #13904 -- stop "trashing" callee-saved registers, since it is
  not actually doing anything useful.
- Fix for #14251 -- fixes the calling convention for functions passing
  raw SSE-register values by adding padding as needed to get the values
  in the right registers. This problem cropped up when some args were
  unused and dropped from the live list.
- Fixed a typo in 'readnone' attribute
- Added 'lower-expect' pass to level 0 LLVM optimization passes to
  improve block layout in LLVM for stack checks, etc.
Test Plan: `make test WAYS=optllvm` and `make test WAYS=llvm`
Reviewers: bgamari, simonmar, angerman
Reviewed By: angerman
Subscribers: rwbarton, carter
GHC Trac Issues: #13904, #14251
Differential Revision: https://phabricator.haskell.org/D5190 | 
| | This switches the compiler/ component to get compiled with
-XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
modules.
This is motivated by the upcoming "Prelude" re-export of
`Semigroup((<>))` which would cause lots of name clashes in every
module which also imports `Outputable`.
Reviewers: austin, goldfire, bgamari, alanz, simonmar
Reviewed By: bgamari
Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
Differential Revision: https://phabricator.haskell.org/D3989 | 
| | The fundamental problem with `type UniqSet = UniqFM` is that `UniqSet`
has a key invariant `UniqFM` does not. For example, `fmap` over
`UniqSet` will generally produce nonsense.
* Upgrade `UniqSet` from a type synonym to a newtype.
* Remove unused and shady `extendVarSet_C` and `addOneToUniqSet_C`.
* Use cached unique in `tyConsOfType` by replacing
  `unitNameEnv (tyConName tc) tc` with `unitUniqSet tc`.
Reviewers: austin, hvr, goldfire, simonmar, niteria, bgamari
Reviewed By: niteria
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D3146 | 
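The invariant problem can be reproduced on a small model built on `Data.Map.Strict`. Everything here is hypothetical (`Name`, `KeyedSet`, `mapSet`, `unsafeFmap`); it only mimics the idea of a set stored as a map from each element's unique to the element, and is not the `UniqSet` implementation.

```haskell
import qualified Data.Map.Strict as M

-- | Hypothetical element type carrying its own unique key.
data Name = Name
  { nameUnique :: Int
  , nameText   :: String
  } deriving (Eq, Show)

-- | A set stored as a map from each element's unique to the element.
-- Invariant: every entry (k, x) satisfies k == nameUnique x.
newtype KeyedSet = KeyedSet (M.Map Int Name)

fromElems :: [Name] -> KeyedSet
fromElems xs = KeyedSet (M.fromList [ (nameUnique x, x) | x <- xs ])

-- | Membership relies on the invariant: look up by the element's unique.
memberSet :: Name -> KeyedSet -> Bool
memberSet x (KeyedSet m) = M.lookup (nameUnique x) m == Just x

-- | Safe mapping: rebuild the keys from the new elements.
mapSet :: (Name -> Name) -> KeyedSet -> KeyedSet
mapSet f (KeyedSet m) = fromElems (map f (M.elems m))

-- | What a plain 'fmap' on the underlying map does: the values change
-- but the keys stay stale, so the invariant may no longer hold.
unsafeFmap :: (Name -> Name) -> KeyedSet -> KeyedSet
unsafeFmap f (KeyedSet m) = KeyedSet (fmap f m)
```

If a function changes `nameUnique`, `memberSet` still finds the new element after `mapSet` but not after `unsafeFmap`. Because `KeyedSet` is a newtype without a `Functor` instance, the unsafe variant has to be written out explicitly rather than being available by accident.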
| | This patch converts the 4 lasting static flags (read from the command
line and unsafely stored in immutable global variables) into dynamic
flags. Most use cases have been converted into reading them from a DynFlags.
In cases for which we don't have easy access to a DynFlags, we read from
'unsafeGlobalDynFlags' that is set at the beginning of each 'runGhc'.
It's not perfect (not thread-safe) but it is still better as we can
set/unset these 4 flags before each run when using GHC API.
Updates haddock submodule.
Rebased and finished by: bgamari
Test Plan: validate
Reviewers: goldfire, erikd, hvr, austin, simonmar, bgamari
Reviewed By: simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2839
GHC Trac Issues: #8440 | 
| | - Fix #13076 by wrapping `printDoc_` so that the terminal color is
  reset even if an exception occurs.
- Add `printSDoc`, `printSDocLn`, and `bufLeftRenderSDoc` to keep `SDoc`
  values abstract (they are wrappers of `printDoc_`, `printDoc`, and
  `bufLeftRender` respectively).
- Remove unused function: `printForAsm`
Test Plan: manual
Reviewers: RyanGlScott, austin, dfeuer, bgamari
Reviewed By: dfeuer, bgamari
Subscribers: dfeuer, mpickering, thomie
Differential Revision: https://phabricator.haskell.org/D2932
GHC Trac Issues: #13076 | 
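The "reset even if an exception occurs" behaviour is the usual `finally` pattern from `Control.Exception`. The sketch below assumes ANSI escape sequences and uses hypothetical names (`withColourReset`, `printRed`); it does not reproduce the actual wrapper around `printDoc_`.

```haskell
import Control.Exception (finally)
import System.IO (Handle, hPutStr, hPutStrLn, stdout)

-- | ANSI escape sequence that resets terminal colours and attributes.
resetCode :: String
resetCode = "\ESC[0m"

-- | Run a printing action and always emit the reset sequence afterwards,
-- whether the action returns normally or throws.
withColourReset :: Handle -> IO a -> IO a
withColourReset h act = act `finally` hPutStr h resetCode

-- | Example use: print one line in red on stdout. If the write fails
-- halfway, the terminal is still switched back to its default colour.
printRed :: String -> IO ()
printRed msg = withColourReset stdout (hPutStrLn stdout ("\ESC[31m" ++ msg))
```

`finally` re-throws the original exception after running the cleanup, so callers still see the failure.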
| | This documents nondeterminism in code generation and removes
the nondeterministic ufmToList function. In the future someone
will have to use nonDetEltsUFM (with proper explanation)
or pprUFM. | 
| | These were previously just represented as Ints which was needlessly
vague. | 
| | Reviewers: erikd, austin
Reviewed By: erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1994 | 
| | Starting with GHC 7.10 and base-4.8, `Monad` implies `Applicative`,
which allows simplifying some definitions to exploit the superclass
relationship. This is a first refactoring to that end. | 
| | Since GHC 8.1/8.2 only needs to be bootstrap-able by GHC 7.10 and
GHC 8.0 (and GHC 8.2), we can now finally drop all that pre-AMP
compatibility CPP-mess for good!
Reviewers: austin, goldfire, bgamari
Subscribers: goldfire, thomie, erikd
Differential Revision: https://phabricator.haskell.org/D1724 | 
| | Summary:
Before:
    [1 of 1] Compiling Main             ( Main.hs, Main.o )
    You are using a new version of LLVM that hasn't been tested yet!
    We will try though...
After:
    [1 of 1] Compiling Main             ( Main.hs, Main.o )
    You are using an unsupported version of LLVM!
    Currently only 3.7 is supported.
    We will try though...
Before:
    [1 of 1] Compiling Main             ( Main.hs, Main.o )
    <no location info>:
        Warning: Couldn't figure out LLVM version!
                 Make sure you have installed LLVM
    ghc: could not execute: opt
After:
    [1 of 1] Compiling Main             ( Main.hs, Main.o )
    <no location info>: error:
        Warning: Couldn't figure out LLVM version!
                 Make sure you have installed LLVM 3.7
    ghc-stage1: could not execute: opt
Reviewers: austin, rwbarton, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1658 | 
| | This adds a flag -split-sections that does similar things to
-split-objs, but using sections in single object files instead of
relying on the Satanic Splitter and other abominations. This is very
similar to the GCC flags -ffunction-sections and -fdata-sections.
The --gc-sections linker flag, which allows unused sections to actually
be removed, is added to all link commands (if the linker supports it) so
that space savings from having base compiled with sections can be
realized.
Supported both in LLVM and the native code-gen, in theory for all
architectures, but really tested on x86 only.
In the GHC build, a new SplitSections variable enables -split-sections
for relevant parts of the build.
Test Plan: validate with both settings of SplitSections
Reviewers: dterei, Phyx, austin, simonmar, thomie, bgamari
Reviewed By: simonmar, thomie, bgamari
Subscribers: hsyl20, erikd, kgardas, thomie
Differential Revision: https://phabricator.haskell.org/D1242
GHC Trac Issues: #8405 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch refactors pure/(*>) and return/(>>) in an MRP-friendly way, so
that the explicit definitions of `return` and `(>>)` match the
MRP-style default implementations, i.e.
  return = pure
and
  (>>) = (*>)
This way, e.g. all `return = pure` definitions can easily be grepped and
removed in GHC 8.1.
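As a rough illustration of the pattern described above, an instance written in this style could look like the following sketch. The `Box` type is hypothetical and not taken from the GHC tree; only the `return = pure` and `(>>) = (*>)` lines reflect the convention in the commit message.

```haskell
-- Hypothetical example type, used only to show the instance layout.
newtype Box a = Box { runBox :: a }

instance Functor Box where
  fmap f (Box a) = Box (f a)

instance Applicative Box where
  pure = Box
  Box f <*> Box a = Box (f a)

instance Monad Box where
  -- These two definitions match the MRP-style defaults, so they can be
  -- found with a simple grep and deleted once they become redundant.
  return = pure
  (>>)   = (*>)
  Box a >>= f = f a

main :: IO ()
main = print (runBox (return 1 >> return (2 :: Int)))
```

Keeping the real logic in `pure` and `(<*>)` means the `Monad` instance carries no behaviour of its own beyond `(>>=)`.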
Test Plan: Harbormaster
Reviewers: goldfire, alanz, bgamari, quchen, austin
Reviewed By: quchen, austin
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1312 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Before this commit, GHC only supported LLVM 3.6. Now it only supports
LLVM 3.7, which was released in August 2015. LLVM version 3.6 and earlier
do not work on AArch64/Arm64, but 3.7 does.
Also:
* Add CC_Ghc constructor to LlvmCallConvention.
* Replace `maxSupportLlvmVersion`/`minSupportLlvmVersion` with
  a single `supportedLlvmVersion` variable.
* Get `supportedLlvmVersion` from version specified in configure.ac.
* Drop llvmVersion field from DynFlags (no longer needed because only
  one version is supported).
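A minimal sketch of what a single-version check might look like follows. Only the name `supportedLlvmVersion` comes from the commit message; the `LlvmVersion` type, `showVersion`, `versionWarning`, and the hard-coded value are assumptions made for illustration (the commit derives the value from configure.ac). The warning text is the one quoted in the related LLVM commit in this log.

```haskell
-- Illustrative sketch only; apart from supportedLlvmVersion, all names are assumed.
type LlvmVersion = (Int, Int)

-- | The single supported version. Hard-coded here for the sketch.
supportedLlvmVersion :: LlvmVersion
supportedLlvmVersion = (3, 7)

showVersion :: LlvmVersion -> String
showVersion (major, minor) = show major ++ "." ++ show minor

-- | Warning lines for a detected version; empty when it is the supported one.
versionWarning :: LlvmVersion -> [String]
versionWarning detected
  | detected == supportedLlvmVersion = []
  | otherwise =
      [ "You are using an unsupported version of LLVM!"
      , "Currently only " ++ showVersion supportedLlvmVersion ++ " is supported."
      , "We will try though..."
      ]

main :: IO ()
main = mapM_ putStrLn (versionWarning (3, 8))
```

With one supported version, the check reduces to an equality test instead of a min/max range comparison.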
Test Plan: Validate on x86_64 and arm
Reviewers: bgamari, austin
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1320
GHC Trac Issues: #10953 | 
| | 
| 
| 
| 
| 
| | Pushed by mistake before it was ready.
This reverts commit 5dc3db743ec477978b9727a313951be44dbd170f. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | It's pretty irritating having hasktags with multiple top-level
declarations with the same type; hasktags can't figure out which
declaration you actually wanted.
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
Reviewed By: dterei, austin
Differential Revision: https://phabricator.haskell.org/D819 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Rework llvmGen to use LLVM 3.6 exclusively. The plans for the 7.12 release are to ship LLVM alongside GHC in the interests of user (and developer) sanity.
Along the way, refactor TNTC support to take advantage of the new `prefix` data support in LLVM 3.6. This allows us to drop the section-reordering component of the LLVM mangler.
Test Plan: Validate, look at emitted code
Reviewers: dterei, austin, scpmw
Reviewed By: austin
Subscribers: erikd, awson, spacekitteh, thomie, carter
Differential Revision: https://phabricator.haskell.org/D530
GHC Trac Issues: #10074 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Due to changes in LLVM 3.5, aliases may now only refer to definitions.
Previously, to handle symbols defined outside of the current compilation
unit GHC would emit both an `external` declaration, as well as an alias
pointing to it, e.g.,
    @stg_BCO_info = external global i8
    @stg_BCO_info$alias = alias private i8* @stg_BCO_info
Where references to `stg_BCO_info` will use the alias
`stg_BCO_info$alias`. This is not permitted under the new alias
behavior, resulting in errors resembling,
    Alias must point to a definition
    i8* @"stg_BCO_info$alias"
To fix this, we invert the naming relationship between aliases and
definitions. That is, now the symbol definition takes the name
`@stg_BCO_info$def` and references use the actual name, `@stg_BCO_info`.
This means the external symbols can be handled by simply emitting an
`external` declaration,
    @stg_BCO_info = external global i8
Whereas in the case of a forward declaration we emit,
    @stg_BCO_info = alias private i8* @stg_BCO_info$def
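The inverted naming scheme can be sketched as a small function that renders the declaration for a symbol. This is not the code from the LLVM backend; `SymbolOrigin`, `defName`, and `declarationFor` are hypothetical names, and the emitted strings simply reproduce the two forms shown above.

```haskell
-- Hypothetical model of how a symbol is known to the current compilation unit.
data SymbolOrigin
  = External         -- defined in another compilation unit
  | ForwardDeclared  -- defined in this unit, under the "$def" name

-- | Name carried by the actual definition under the new scheme.
defName :: String -> String
defName name = name ++ "$def"

-- | Declaration emitted so that references can use the plain symbol name.
declarationFor :: SymbolOrigin -> String -> String
declarationFor External name =
  "@" ++ name ++ " = external global i8"
declarationFor ForwardDeclared name =
  "@" ++ name ++ " = alias private i8* @" ++ defName name

main :: IO ()
main = mapM_ putStrLn
  [ declarationFor External "stg_BCO_info"
  , declarationFor ForwardDeclared "stg_BCO_info"
  ]
```

In both cases references use `@stg_BCO_info`, and an alias is only emitted when it can point at a definition in the same unit.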
Reviewed By: austin
Differential Revision: https://phabricator.haskell.org/D155 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This includes pretty much all the changes needed to make `Applicative`
a superclass of `Monad` finally. There's mostly reshuffling in the
interest of avoiding orphans and boot files, but luckily we can resolve
all of them, pretty much. The only catch was that
Alternative/MonadPlus also had to go into Prelude to avoid this.
As a result, we must update the hsc2hs and haddock submodules.
Signed-off-by: Austin Seipp <austin@well-typed.com>
Test Plan: Build things, they might not explode horribly.
Reviewers: hvr, simonmar
Subscribers: simonmar
Differential Revision: https://phabricator.haskell.org/D13 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch set makes us no longer assume that a package key is a human
readable string, leaving Cabal free to "do whatever it wants" to allocate
keys; we'll look up the PackageId in the database to display to the user.
This also means we have a new level of qualifier decisions to make at the
package level, and must rewrite some Safe Haskell error reporting code to DTRT.
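The idea of treating the key as opaque and looking up a displayable identifier can be sketched as follows. This is a simplified model, not GHC's actual data structures: the `PackageKey` and `PackageId` definitions, the association-list database, `displayPackage`, and the sample key string are all made up for illustration.

```haskell
-- Simplified stand-ins; GHC's real types are richer than this.
newtype PackageKey = PackageKey String
  deriving (Eq, Show)

data PackageId = PackageId
  { pkgName    :: String
  , pkgVersion :: String
  } deriving (Eq, Show)

type PackageDb = [(PackageKey, PackageId)]

-- | Text shown to the user for a key: the human-readable PackageId if the
-- database knows the key, otherwise the raw key as a fallback.
displayPackage :: PackageDb -> PackageKey -> String
displayPackage db key@(PackageKey raw) =
  case lookup key db of
    Just pid -> pkgName pid ++ "-" ++ pkgVersion pid
    Nothing  -> raw

main :: IO ()
main = do
  -- The key below is an arbitrary placeholder, not a real package key.
  let db = [(PackageKey "d41d8cd98f00b204", PackageId "base" "4.7.0.0")]
  putStrLn (displayPackage db (PackageKey "d41d8cd98f00b204"))
  putStrLn (displayPackage db (PackageKey "unknown-key"))
```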
Additionally, we adjust the build system to use a new ghc-cabal output
Make variable PACKAGE_KEY to determine library names and other things,
rather than concatenating PACKAGE/VERSION as before.
Adds a new `-this-package-key` flag to subsume the old, erroneously named
`-package-name` flag, and `-package-key` to select packages by package key.
RFC: The md5 hashes are pretty tough on the eye, as far as the file
system is concerned :(
ToDo: safePkg01 test had its output updated, but the fix is not really right:
the rest of the dependencies are truncated because we're only
grepping a single line, but ghc-pkg is wrapping its output.
ToDo: In a later commit, update all submodules to stop using -package-name
and use -this-package-key.  For now, we don't do it to avoid submodule
explosion.
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
Test Plan: validate
Reviewers: simonpj, simonmar, hvr, austin
Subscribers: simonmar, relrod, carter
Differential Revision: https://phabricator.haskell.org/D80 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized, while following the convention, to
- place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
  any `{-# OPTIONS_GHC #-}`-lines.
- Moreover, if the list of language extensions fits into a single
  `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
  line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
  individual language extension. In both cases, try to keep the
  enumeration alphabetically ordered.
  (The latter layout is preferable as it's more diff-friendly)
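A module header following this convention might look like the sketch below. The chosen extensions, the `-Wall` option, and the `sumStrict` function are arbitrary examples rather than anything taken from the affected files; the point is only the ordering (LANGUAGE pragmas first, one extension per line, alphabetical, then OPTIONS_GHC).

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -Wall #-}

module Main (main) where

-- | Strict left-to-right sum; uses BangPatterns so the example
-- actually depends on one of the extensions listed above.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

main :: IO ()
main = print (sumStrict [1, 2, 3])
```

With one extension per line, adding or removing an extension touches exactly one line in a diff.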
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurrences by `{-# OPTIONS_GHC ... #-}` pragmas. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The basic idea here is simple, and described in Note [The interactive package]
in HscTypes, which starts thus:
    Note [The interactive package]
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Type and class declarations at the command prompt are treated as if
    they were defined in modules
       interactive:Ghci1
       interactive:Ghci2
       ...etc...
    with each bunch of declarations using a new module, all sharing a
    common package 'interactive' (see Module.interactivePackageId, and
    PrelNames.mkInteractiveModule).
    This scheme deals well with shadowing.  For example:
       ghci> data T = A
       ghci> data T = B
       ghci> :i A
       data Ghci1.T = A  -- Defined at <interactive>:2:10
    Here we must display info about constructor A, but its type T has been
    shadowed by the second declaration.  But it has a respectable
    qualified name (Ghci1.T), and its source location says where it was
    defined.
    So the main invariant continues to hold, that in any session an original
    name M.T only refers to one unique thing.  (In a previous iteration both
    the T's above were called :Interactive.T, albeit with different uniques,
    which gave rise to all sorts of trouble.)
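The naming scheme in the Note can be modelled with a toy sketch in which each batch of prompt declarations is qualified by a fresh module name. This uses plain strings rather than GHC's `Module` and `Name` types, and `interactiveModule` and `qualifyBatches` are hypothetical helpers, not the functions named in the Note.

```haskell
-- Toy model: qualified names are plain strings here.
type QualifiedName = String

-- | Module name for the n-th batch of declarations at the prompt.
interactiveModule :: Int -> String
interactiveModule n = "interactive:Ghci" ++ show n

-- | Qualify every declared name with the module of the batch it came from,
-- numbering batches from 1.
qualifyBatches :: [[String]] -> [QualifiedName]
qualifyBatches batches =
  [ interactiveModule n ++ "." ++ name
  | (n, names) <- zip [1 ..] batches
  , name <- names
  ]

main :: IO ()
main = mapM_ putStrLn (qualifyBatches [["T"], ["T"]])
```

Because the two `T` declarations land in different modules, their original names stay distinct even though the second shadows the first.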
This scheme deals nicely with the original problem.  It allows us to
eliminate a couple of grotesque hacks
  - Note [Outputable Orig RdrName] in HscTypes
  - Note [interactive name cache] in IfaceEnv
(both these comments have gone, because the hacks they describe are no
longer necessary). I was also able to simplify Outputable.QueryQualifyName,
so that it takes a Module/OccName as args rather than a Name.
However, matters are never simple, and this change took me an
unreasonably long time to get right.  There are some details in
Note [The interactive package] in HscTypes. | 
| | |  | 
| | |  | 
| | 
| 
| 
| 
| | Authored-by: David Luposchainsky <dluposchainsky@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com> |