| Commit message | Author | Age | Files | Lines |
|
Documentation of Integer constant folding
|
This patch modifies all comparison primops for Char#, Int#, Word#, Double#,
Float# and Addr# to return Int# instead of Bool. A value of 1# represents True
and 0# represents False. For a more detailed description of the motivation for
this change, a discussion of implementation details, and benchmarking results,
please visit the wiki page: http://hackage.haskell.org/trac/ghc/wiki/PrimBool
There is also some cleanup: whitespace fixes in files that were extensively
edited in this patch, and constant-folding rules for the Integer div and mod
operators (which had until now been left out).
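For illustration, a minimal sketch (not part of the patch) of consuming the new
Int#-valued comparisons, assuming the isTrue# helper that accompanies this change:
    {-# LANGUAGE MagicHash #-}
    import GHC.Exts (Int#, isTrue#, (>#))

    -- (>#) now yields 1# or 0#; isTrue# converts that Int# result to Bool.
    isPositive :: Int# -> Bool
    isPositive n = isTrue# (n ># 0#)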
|
This change helps the simplifier eliminate unreachable code, since
retry# technically doesn't return.
This closes ticket #8091.
Authored-by: Patrick Palka <patrick@parcs.ath.cx>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
Roles are a solution to the GeneralizedNewtypeDeriving type-safety
problem.
Roles were first described in the "Generative type abstraction" paper,
by Stephanie Weirich, Dimitrios Vytiniotis, Simon PJ, and Steve Zdancewic.
The implementation differs a little from that paper. For a quick
primer, check out Note [Roles] in Coercion. Also see
http://ghc.haskell.org/trac/ghc/wiki/Roles
and
http://ghc.haskell.org/trac/ghc/wiki/RolesImplementation
For a more formal treatment, check out docs/core-spec/core-spec.pdf.
This fixes Trac #1496, #4846, #7148.
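An illustrative sketch of the kind of unsoundness roles rule out (a made-up
example; the names are not from the patch): a parameter that appears under a
type family must be at nominal role, so GeneralizedNewtypeDeriving may not
coerce underneath it.
    {-# LANGUAGE TypeFamilies #-}

    type family Inspect a
    type instance Inspect Int = Bool
    type instance Inspect Age = Char

    newtype Age = MkAge Int

    data T a = MkT (Inspect a)
    -- A coercion T Int ~ T Age would let a Bool be used as a Char, so T's
    -- parameter is inferred nominal and GND cannot coerce under it.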
|
This patch adds rules for converting floating-point multiplications
of the form 2.0 * x and x * 2.0 into the addition x + x.
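A minimal sketch of code the new rule improves (the function name is made up;
the rewrite itself happens over Core during constant folding, not at the source
level):
    twice :: Double -> Double
    twice x = 2.0 * x    -- with this rule, compiled as x + x, saving a multiply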
|
* Expose bSwap{,16,32,64}# primops
* Add a new MachOp: MO_BSwap
* Use an out-of-line implementation (hs_bswap{16,32,64}) for native code
  generators that do not handle MO_BSwap directly.
* Generate bswap in the x86 NCG for 32 and 64 bits; for 16 bits, generate
  bswap+shr instead of using xchg.
* Generate llvm.bswap intrinsics in the LLVM code generator.
Authored-by: Vincent Hanquez <tab@snarc.org>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
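A hedged sketch of a user-level wrapper over one of the new primops (the name
bSwap32# is as exposed by this patch; its Word# -> Word# type and the W32#
representation are assumptions and may differ in later releases):
    {-# LANGUAGE MagicHash #-}
    import GHC.Prim (bSwap32#)
    import GHC.Word (Word32(..))

    -- Reverse the byte order of a 32-bit word using the new primop.
    byteSwap32 :: Word32 -> Word32
    byteSwap32 (W32# w) = W32# (bSwap32# w)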
|
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
We add the invariant to the MVar blocked threads queue that
threads blocked on an atomic read are always at the front of
the queue. This invariant is easy to maintain, since takers
are only ever added to the end of the queue.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
This commit changes the syntax and story around overlapping type
family instances. Before, we had "unbranched" instances and
"branched" instances. Now, we have closed type families and
open ones.
The behavior of open families is completely unchanged. In particular,
coincident overlap of open type family instances still works, despite
emails to the contrary.
A closed type family is declared like this:
> type family F a where
>   F Int = Bool
>   F a   = Char
The equations are tried in order, from top to bottom, subject to
certain constraints, as described in the user manual. Declaring an
instance of a closed family is not allowed.
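For reference, a small self-contained sketch (not from the commit) showing the
top-to-bottom matching of the equations above:
    {-# LANGUAGE TypeFamilies #-}

    type family F a where
      F Int = Bool
      F a   = Char

    atInt :: F Int        -- matches the first equation: F Int ~ Bool
    atInt = True

    elsewhere :: F Double -- falls through to the catch-all: F Double ~ Char
    elsewhere = 'x'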
|
Clang doesn't like whitespace between a macro and its arguments.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
This commit replaces mkWeakForeignEnv# with addCFinalizerToWeak#.
This new primop mutates an existing Weak# object and adds a new
C finalizer to it.
This change removes an invariant in MarkWeak.c, namely that the relative
order of Weak# objects in the list needs to be preserved across GC. This
makes it easier to split the list into per-generation structures.
The patch also removes a race condition between two threads calling
finalizeWeak# on the same WEAK object at the same time.
|
This reverts commit 1c5b0511a89488f5280523569d45ee61c0d09ffa.
|
* Expose bSwap{,16,32,64}# primops
* Add a new MachOp: MO_BSwap
* Use an out-of-line implementation (hs_bswap{16,32,64}) for native code
  generators that do not handle MO_BSwap directly.
* Generate bswap in the x86 NCG for 32 and 64 bits; for 16 bits, generate
  bswap+shr instead of using xchg.
* Generate llvm.bswap intrinsics in the LLVM code generator.
Patch from Vincent Hanquez.
|
The details of this are described in Note [magicSingIId magic] in basicTypes/MkId.lhs
|
Finally (I hope) fixes Trac #7903.
See Note [Zonking inside the knot] in TcHsSyn
|
This is a long-standing regression (Trac #7797), which meant in
particular that the Eq [Char] instance did not get specialised.
(The *methods* do, but the dictionary itself doesn't.) So when you
call a function
    f :: Eq a => blah
on a string type (i.e. a = [Char]), 7.6 passes a dictionary of
un-specialised methods.
This only matters when calling an overloaded function from a
specialised context, but that does matter in some programs. I
remember (though I cannot find the details) that Nick Frisby discovered
this to be the source of some pretty solid performance regressions.
Anyway, it works now. The key change is that a DFunUnfolding now takes
a form that is both simpler than before (the DFunArg type is eliminated)
and more general:
    data Unfolding
      = ...
      | DFunUnfolding {      -- The Unfolding of a DFunId
                             -- See Note [DFun unfoldings]
                             --   df = /\a1..am. \d1..dn. MkD t1 .. tk
                             --                               (op1 a1..am d1..dn)
                             --                               (op2 a1..am d1..dn)
          df_bndrs :: [Var],     -- The bound variables [a1..am],[d1..dn]
          df_con   :: DataCon,   -- The dictionary data constructor (never a newtype datacon)
          df_args  :: [CoreExpr] -- Args of the data con: types, superclasses and methods,
        }                        -- in positional order
That in turn allowed me to re-enable the DFunUnfolding specialisation in
DsBinds. Lots of details here in TcInstDcls:
    Note [SPECIALISE instance pragmas]
I also did some refactoring, in particular to pass the InScopeSet to
exprIsConApp_maybe (which in turn means it has to go to a RuleFun).
NB: The interface-file format has changed!
|
This fixes Trac #7888, where the user wanted to use 'undefined' in a
context that needed ((forall a. a->a) -> Int). We allow OpenKind
unification variables to be instantiated with polytypes (or unboxed
types), hence the change.
'error' has always been like this; this change simply extends
the special treatment to 'undefined'. It's still magical;
you can't define your own wrapper for 'error' and get the
same behaviour. Really just a convenience hack.
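A small example in the spirit of the ticket (the function names are made up),
showing 'undefined' used where a polytype is expected:
    {-# LANGUAGE RankNTypes #-}

    apply :: ((forall a. a -> a) -> Int) -> Int
    apply g = g id

    -- Previously rejected; with this change 'undefined' (like 'error') may be
    -- instantiated at the polytype ((forall a. a -> a) -> Int).
    stub :: Int
    stub = apply undefined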
|
This work was all done by
    Achim Krause <achim.t.krause@gmail.com>
    George Giorgidze <giorgidze@gmail.com>
    Weijers Jeroen <jeroen.weijers@uni-tuebingen.de>
It allows list syntax, such as [a,b], [a..b] and so on, to be
overloaded so that it works for a variety of types.
The design is described here:
    http://hackage.haskell.org/trac/ghc/wiki/OverloadedLists
E.g. you can use it for maps, so that
    [(1,"foo"), (4,"bar")] :: Map Int String
The main changes:
* The ExplicitList constructor of HsExpr gets a witness field
* Ditto the ArithSeq constructor
* Ditto the ListPat constructor of HsPat
Everything else flows from this.
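A self-contained sketch of the user-facing feature (the IsList instance for
Map shown here is hypothetical; a real one may be supplied by the containers
package):
    {-# LANGUAGE OverloadedLists, TypeFamilies #-}
    import GHC.Exts (IsList(..))
    import qualified Data.Map as Map

    -- Hypothetical instance so that list literals can build a Map.
    instance Ord k => IsList (Map.Map k v) where
      type Item (Map.Map k v) = (k, v)
      fromList = Map.fromList
      toList   = Map.toList

    m :: Map.Map Int String
    m = [(1, "foo"), (4, "bar")]   -- desugars to fromList [(1,"foo"), (4,"bar")]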
|
This patch makes the Data.Typeable.Typeable class work with arguments of any
kind. In particular, this removes the Typeable1..7 class hierarchy, greatly
simplifying the whole Typeable story. Also added is the AutoDeriveTypeable
language extension, which will automatically derive Typeable for all types and
classes declared in that module. Since there is now no good reason to give
hand-written instances of the Typeable class, those are ignored (for backwards
compatibility), and a warning is emitted.
The old, kind-* Typeable class is now called OldTypeable, and lives in the
Data.OldTypeable module. It is deprecated, and should be removed in some future
version of GHC.
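A sketch of what the kind-polymorphic class allows (assuming the accompanying
typeRep function and a locally defined proxy; previously this needed the
separate Typeable1 class):
    {-# LANGUAGE PolyKinds #-}
    import Data.Typeable (TypeRep, typeRep)

    -- A kind-polymorphic proxy, so we can talk about type constructors directly.
    data P (a :: k) = P

    -- 'Maybe' has kind * -> *, yet a single Typeable class now covers it.
    maybeRep :: TypeRep
    maybeRep = typeRep (P :: P Maybe)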
|
The llvm_only attribute indicates that a primop is only available when using the
LLVM back-end.
|
This patch lays the groundwork needed for primop support for SIMD vectors. In
addition to the groundwork, we add support for the FloatX4# primitive type and
its associated primops.
* Add the FloatX4# primitive type and associated primops.
* Add code-generation support for Float vectors.
* Compile vector operations to LLVM vector operations in the LLVM code
  generator.
* Make the x86 native back-end fail gracefully when encountering vector primops.
* Only generate primop wrappers for vector primops when using LLVM.
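A heavily hedged sketch of using the new vector type (the primop names
broadcastFloatX4# and plusFloatX4# are assumptions based on later GHC releases,
and the code requires the LLVM back-end, -fllvm):
    {-# LANGUAGE MagicHash #-}
    import GHC.Prim (Float#, FloatX4#, broadcastFloatX4#, plusFloatX4#)

    -- Add a scalar (broadcast across all four lanes) to a vector of four Floats.
    addScalar :: Float# -> FloatX4# -> FloatX4#
    addScalar x v = plusFloatX4# (broadcastFloatX4# x) v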
|
Conflicts:
    compiler/basicTypes/DataCon.lhs
|
It should be the case that either an entire mutually recursive
group of data type declarations can be promoted, or none of them.
It's really odd to promote some data constructors of a type but
not others. E.g.
    data T a = T1 a | T2 Int
Here T1 is sort-of-promotable but T2 isn't (because Int isn't
promotable).
This patch makes it all-or-nothing. At the same time I've made
the TyCon point to its promoted cousin (via the tcPromoted field
of an AlgTyCon), as well as vice versa (via the ty_con field of
PromotedTyCon).
The inference for the group is done in TcTyDecls, the same place
that infers which data types are recursive, another global question.
|
The main payload of this patch is to extend CPR so that it
detects when a function always returns a result constructed
with the *same* constructor, even if the constructor comes from
a sum type. This doesn't matter very often, but it does improve
some things (results below).
Binary sizes increase a little bit, I think because there are more
wrappers. This is with -split-objs. Without -split-objs binary sizes
increased by 6% even for HelloWorld.hs. It's hard to see exactly why,
but I think it was because System.Posix.Types.o got included in the
linked binary, whereas it didn't before.
  Program           Size    Allocs   Runtime   Elapsed  TotalMem
  --------------------------------------------------------------
  fluid            +1.8%    -0.3%      0.01      0.01     +0.0%
  tak              +2.2%    -0.2%      0.02      0.02     +0.0%
  ansi             +1.7%    -0.3%      0.00      0.00     +0.0%
  cacheprof        +1.6%    -0.3%     +0.6%     +0.5%     +1.4%
  parstof          +1.4%    -4.4%      0.00      0.00     +0.0%
  reptile          +2.0%    +0.3%      0.02      0.02     +0.0%
  --------------------------------------------------------------
  Min              +1.1%    -4.4%     -4.7%     -4.7%    -15.0%
  Max              +2.3%    +0.3%     +8.3%     +9.4%    +50.0%
  Geometric Mean   +1.9%    -0.1%     +0.6%     +0.7%     +0.3%
Other things in this commit
~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Got rid of the Lattice class in Demand
* Refactored the way that products and newtypes are
  decomposed (no change in functionality)
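An illustrative function of the shape the extended analysis now recognises
(the example is made up, not from the patch): every branch returns a result
built with the same constructor, Just, of a sum type, so a worker/wrapper
split can return the payload unboxed.
    clamp :: Int -> Maybe Int
    clamp x
      | x < 0     = Just 0
      | x > 255   = Just 255
      | otherwise = Just x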
|
This patch is the result of Ilya Sergey's internship at MSR. It
constitutes a thorough overhaul and simplification of the demand
analyser. It makes a solid foundation on which we can now build.
Main changes are:
* Instead of having one combined type for Demand, a Demand is
  now a pair (JointDmd) of
    - a StrDmd and
    - an AbsDmd.
  This allows strictness and absence to be thought about quite
  orthogonally, and greatly reduces brain melt-down.
* Similarly, the DmdResult type is a pair of
    - a PureResult (indicating only divergence/non-divergence), and
    - a CPRResult (which deals only with the CPR property).
* In IdInfo, the
    strictnessInfo field contains a StrictSig, not a Maybe StrictSig
    demandInfo     field contains a Demand,    not a Maybe Demand
  We don't need Nothing (to indicate no strictness/demand info)
  any more; topSig/topDmd will do.
* Remove "boxity" analysis entirely. This was an attempt to
  avoid "reboxing", but it added complexity, was extremely
  ad hoc, and made very little difference in practice.
* Remove the "unboxing strategy" computation. This was an
  attempt to ensure that a worker didn't get zillions of
  arguments by unboxing big tuples. But in fact removing it
  DRAMATICALLY reduces allocation in an inner loop of the
  I/O library (where the threshold argument-count had been
  set just too low). It's exceptional to have a zillion arguments,
  and I don't think it's worth the complexity, especially since
  it turned out to have a serious performance hit.
* Remove quite a bit of ad-hoc cruft.
* Move worthSplittingFun and worthSplittingThunk from WorkWrap to
  Demand. This allows JointDmd to be fully abstract, examined
  only inside Demand.
Everything else really follows from these changes.
All of this is really just refactoring, so we don't expect
big performance changes, but actually the numbers look quite
good. Here is a full nofib run with some highlights identified:
  Program           Size    Allocs   Runtime   Elapsed  TotalMem
  --------------------------------------------------------------
  expert           -2.6%   -15.5%      0.00      0.00     +0.0%
  fluid            -2.4%    -7.1%      0.01      0.01     +0.0%
  gg               -2.5%   -28.9%      0.02      0.02    -33.3%
  integrate        -2.6%    +3.2%     +2.6%     +2.6%     +0.0%
  mandel2          -2.6%    +4.2%      0.01      0.01     +0.0%
  nucleic2         -2.0%   -16.3%      0.11      0.11     +0.0%
  para             -2.6%   -20.0%    -11.8%    -11.7%     +0.0%
  parser           -2.5%   -17.9%      0.05      0.05     +0.0%
  prolog           -2.6%   -13.0%      0.00      0.00     +0.0%
  puzzle           -2.6%    +2.2%     +0.8%     +0.8%     +0.0%
  sorting          -2.6%   -35.9%      0.00      0.00     +0.0%
  treejoin         -2.6%   -52.2%     -9.8%     -9.9%     +0.0%
  --------------------------------------------------------------
  Min              -2.7%   -52.2%    -11.8%    -11.7%    -33.3%
  Max              -1.8%    +4.2%    +10.5%    +10.5%     +7.7%
  Geometric Mean   -2.5%    -2.8%     -0.4%     -0.5%     -0.4%
Things to note:
* Binary sizes are smaller. I don't know why, but it's good.
* Allocation is sometimes a *lot* smaller. I believe that all the big numbers
  (I checked treejoin, gg, sorting) arise from one place, namely a function
  GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers, both of
  which have several arguments. Not w/w'ing both arguments (which is what
  we did before) has a big effect. So the big win is actually somewhat
  accidental, gained by removing the "unboxing strategy" code.
* A couple of benchmarks allocate slightly more. This turns out
  to be due to reboxing (integrate). But the biggest increase is
  mandel2, and *that* turned out also to be a somewhat accidental
  loss of CSE, and pointed the way to doing better CSE: see Trac
  #7596.
* Runtimes are never very reliable, but seem to improve very slightly.
All in all, a good piece of work. Thank you Ilya!
|
e.g. narrow8Int# subsumes narrow16Int#, not the other way around.
|
Add foldMap to derived Foldable instances.
The derived instances will no longer eta-expand the function. I.e. instead of
    fmap f (Foo a) = Foo (fmap (\x -> f x) a)
we now derive
    fmap f (Foo a) = Foo (fmap f a)
Some superfluous lambdas are generated as a result. For example, for
    data X a = X (a,a)
we derive
    fmap f (X x) = (\y -> case y of (a,b) -> (f a, f b)) x
The optimizer should be able to simplify this code, as it is just beta reduction.
The derived Foldable instance now includes foldMap in addition to foldr.
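A small usage sketch (the data type is made up) of the behaviour described above:
    {-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
    import Data.Foldable (Foldable)

    data Foo a = Foo [a] deriving (Functor, Foldable)

    -- The derived code is now roughly:
    --   fmap f (Foo a)    = Foo (fmap f a)   -- no (\x -> f x) eta-expansion
    --   foldMap f (Foo a) = foldMap f a      -- foldMap is generated as well as foldr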
|
See Note [SingI and EvLit] in TcEvidence.
|