delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	compiler: Use compact representation for SourceText	Zubin Duggal	2023-05-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	SourceText is serialized along with INLINE pragmas into interface files. Many of these SourceTexts are identical, for example "{-# INLINE#". When deserialized, each such SourceText was previously expanded out into a [Char], which is highly wasteful of memory, and each such instance of the text would allocate an independent list with its contents as deserializing breaks any sharing that might have existed. Instead, we use a `FastString` to represent these, so that each instance unique text will be interned and stored in a memory efficient manner.
*	Major refactor in the handling of equality constraints	Simon Peyton Jones	2023-04-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This MR substantially refactors the way in which the constraint solver deals with equality constraints. The big thing is: * Intead of a pipeline in which we /first/ canonicalise and /then/ interact (the latter including performing unification) the two steps are more closely integreated into one. That avoids the current rather indirect communication between the two steps. The proximate cause for this refactoring is fixing #22194, which involve solving [W] alpha[2] ~ Maybe (F beta[4]) by doing this: alpha[2] := Maybe delta[2] [W] delta[2] ~ F beta[4] That is, we don't promote beta[4]! This is very like introducing a cycle breaker, and was very awkward to do before, but now it is all nice. See GHC.Tc.Utils.Unify Note [Promotion and level-checking] and Note [Family applications in canonical constraints]. The big change is this: * Several canonicalisation checks (occurs-check, cycle-breaking, checking for concreteness) are combined into one new function: GHC.Tc.Utils.Unify.checkTyEqRhs This function is controlled by `TyEqFlags`, which says what to do for foralls, type families etc. * `canEqCanLHSFinish` now sees if unification is possible, and if so, actually does it: see `canEqCanLHSFinish_try_unification`. There are loads of smaller changes: * The on-the-fly unifier `GHC.Tc.Utils.Unify.unifyType` has a cheap-and-cheerful version of `checkTyEqRhs`, called `simpleUnifyCheck`. If `simpleUnifyCheck` succeeds, it can unify, otherwise it defers by emitting a constraint. This is simpler than before. * I simplified the swapping code in `GHC.Tc.Solver.Equality.canEqCanLHS`. Especially the nasty stuff involving `swap_for_occurs` and `canEqTyVarFunEq`. Much nicer now. See Note [Orienting TyVarLHS/TyFamLHS] Note [Orienting TyFamLHS/TyFamLHS] * Added `cteSkolemOccurs`, `cteConcrete`, and `cteCoercionHole` to the problems that can be discovered by `checkTyEqRhs`. * I fixed #23199 `pickQuantifiablePreds`, which actually allows GHC to to accept both cases in #22194 rather than rejecting both. Yet smaller: * Added a `synIsConcrete` flag to `SynonymTyCon` (alongside `synIsFamFree`) to reduce the need for synonym expansion when checking concreteness. Use it in `isConcreteType`. * Renamed `isConcrete` to `isConcreteType` * Defined `GHC.Core.TyCo.FVs.isInjectiveInType` as a more efficient way to find if a particular type variable is used injectively than finding all the injective variables. It is called in `GHC.Tc.Utils.Unify.definitely_poly`, which in turn is used quite a lot. * Moved `rewriterView` to `GHC.Core.Type`, so we can use it from the constraint solver. Fixes #22194, #23199 Compile times decrease by an average of 0.1%; but there is a 7.4% drop in compiler allocation on T15703. Metric Decrease: T15703
*	Add quotRem rules (#22152)	Sylvain Henry	2023-04-13	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	case quotRemInt# x y of (# q, _ #) -> body ====> case quotInt# x y of q -> body case quotRemInt# x y of (# _, r #) -> body ====> case remInt# x y of r -> body
*	Simplifier: `countValArgs` should not count Type args (#23102)	Sebastian Graf	2023-03-13	1	-1/+1
\| \| \| \| \| \| \| \| \|	I observed miscompilations while working on !10088 caused by this. Fixes #23102. Metric Decrease: T10421
*	Simplify: Move `wantEtaExpansion` before expensive `do_eta_expand` check	Sebastian Graf	2023-03-10	1	-5/+6
\| \| \| \| \| \| \| \|	There is no need to run arity analysis and what not if we are not in a Simplifier phase that eta-expands or if we don't want to eta-expand the expression in the first place. Purely a refactoring with the goal of improving compiler perf.
*	Fix shadowing bug in prepareAlts	Simon Peyton Jones	2023-02-24	2	-13/+26
\| \| \| \| \| \| \| \|	As #23012 showed, GHC.Core.Opt.Simplify.Utils.prepareAlts was using an OutType to construct an InAlt. When shadowing is in play, this is outright wrong. See Note [Shadowing in prepareAlts].
*	Refactor the simplifier a bit to fix #22761	Simon Peyton Jones	2023-02-10	3	-148/+182
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The core change in this commit, which fixes #22761, is that * In a Core rule, ru_rhs is always occ-analysed. This means adding a couple of calls to occurAnalyseExpr when building a Rule, in * GHC.Core.Rules.mkRule * GHC.Core.Opt.Simplify.Iteration.simplRules But diagosing the bug made me stare carefully at the code of the Simplifier, and I ended up doing some only-loosely-related refactoring. * I think that RULES could be lost because not every code path did addBndrRules * The code around lambdas was very convoluted It's mainly moving deck chairs around, but I like it more now.
*	Revert "Don't keep exit join points so much"	Matthew Pickering	2023-02-08	2	-32/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit caced75765472a1a94453f2e5a439dba0d04a265. It seems the patch "Don't keep exit join points so much" is causing wide-spread regressions in the bytestring library benchmarks. If I revert it then the 9.6 numbers are better on average than 9.4. See https://gitlab.haskell.org/ghc/ghc/-/issues/22893#note_479525 ------------------------- Metric Decrease: MultiComponentModules MultiComponentModulesRecomp MultiLayerModules MultiLayerModulesRecomp MultiLayerModulesTH_Make T12150 T13386 T13719 T21839c T3294 parsing001 -------------------------
*	Improve exprOkForSpeculation for classops	Simon Peyton Jones	2023-01-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes #22745 and #15205, which are about GHC's failure to discard unnecessary superclass selections that yield coercions. See GHC.Core.Utils Note [exprOkForSpeculation and type classes] The main changes are: * Write new Note [NON-BOTTOM_DICTS invariant] in GHC.Core, and refer to it * Define new function isTerminatingType, to identify those guaranteed-terminating dictionary types. * exprOkForSpeculation has a new (very simple) case for ClassOpId * ClassOpId has a new field that says if the return type is an unlifted type, or a terminating type. This was surprisingly tricky to get right. In particular note that unlifted types are not terminating types; you can write an expression of unlifted type, that diverges. Not so for dictionaries (or, more precisely, for the dictionaries that GHC constructs). Metric Decrease: LargeRecord
*	Take account of loop breakers in specLookupRule	Simon Peyton Jones	2023-01-30	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The key change is that in GHC.Core.Opt.Specialise.specLookupRule we were using realIdUnfolding, which ignores the loop-breaker flag. When given a loop breaker, rule matching therefore looped infinitely -- #22802. In fixing this I refactored a bit. * Define GHC.Core.InScopeEnv as a data type, and use it. (Previously it was a pair: hard to grep for.) * Put several functions returning an IdUnfoldingFun into GHC.Types.Id, namely idUnfolding alwaysActiveUnfoldingFun, whenActiveUnfoldingFun, noUnfoldingFun and use them. (The are all loop-breaker aware.)
*	Assorted changes to avoid Data.List.{head,tail}	Bodigrim	2023-01-28	1	-1/+1
\|
*	Separate core inlining logic from `Unfolding` type.	Andreas Klebinger	2023-01-16	3	-0/+557
\| \| \| \| \|	This seems like a good idea either way, but is mostly motivated by a patch where this avoids a module loop.
*	Refactor mkRuntimeError	Simon Peyton Jones	2022-12-22	1	-1/+1
\| \| \| \| \| \| \| \| \|	This patch fixes #22634. Because we don't have TYPE/CONSTRAINT polymorphism, we need two error functions rather than one. I took the opportunity to rname runtimeError to impossibleError, to line up with mkImpossibleExpr, and avoid confusion with the genuine runtime-error-constructing functions.
*	Abstract over the right free vars	Simon Peyton Jones	2022-12-21	1	-51/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix #22459, in two ways: (1) Make the Specialiser not create a bogus specialisation if it is presented by strangely polymorphic dictionary. See Note [Weird special case in SpecDict] in GHC.Core.Opt.Specialise (2) Be more careful in abstractFloats See Note [Which type variables to abstract over] in GHC.Core.Opt.Simplify.Utils. So (2) stops creating the excessively polymorphic dictionary in abstractFloats, while (1) stops crashing if some other pass should nevertheless create a weirdly polymorphic dictionary.
*	Fix loop in the interface representation of some `Unfolding` fields	Gergő Érdi	2022-12-13	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discovered in #22272, dehydration of the unfolding info of a recursive definition used to involve a traversal of the definition itself, which in turn involves traversing the unfolding info. Hence, a loop. Instead, we now store enough data in the interface that we can produce the unfolding info without this traversal. See Note [Tying the 'CoreUnfolding' knot] for details. Fixes #22272 Co-authored-by: Simon Peyton Jones <simon.peytonjones@gmail.com>
*	Add a missing varToCoreExpr in etaBodyForJoinPoint	Simon Peyton Jones	2022-12-01	1	-33/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This subtle bug showed up when compiling a library with 9.4. See #22491. The bug is present in master, but it is hard to trigger; the new regression test T22491 fails in 9.4. The fix was easy: just add a missing varToCoreExpr in etaBodyForJoinPoint. The fix is definitely right though! I also did some other minor refatoring: * Moved the preInlineUnconditionally test in simplExprF1 to before the call to joinPointBinding_maybe, to avoid fruitless eta-expansion. * Added a boolean from_lam flag to simplNonRecE, to avoid two fruitless tests, and commented it a bit better. These refactorings seem to save 0.1% on compile-time allocation in perf/compiler; with a max saving of 1.4% in T9961 Metric Decrease: T9961
*	Scrub some no-warning pragmas.	M Farkas-Dyck	2022-11-23	1	-1/+1
\|
*	Simplifier: Consider `seq` as a `BoringCtxt` (#22317)	Sebastian Graf	2022-11-19	1	-6/+56
\| \| \| \| \| \|	See `Note [Seq is boring]` for the rationale. Fixes #22317.
*	Type vs Constraint: finally nailed	Simon Peyton Jones	2022-11-11	4	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This big patch addresses the rats-nest of issues that have plagued us for years, about the relationship between Type and Constraint. See #11715/#21623. The main payload of the patch is: * To introduce CONSTRAINT :: RuntimeRep -> Type * To make TYPE and CONSTRAINT distinct throughout the compiler Two overview Notes in GHC.Builtin.Types.Prim * Note [TYPE and CONSTRAINT] * Note [Type and Constraint are not apart] This is the main complication. The specifics * New primitive types (GHC.Builtin.Types.Prim) - CONSTRAINT - ctArrowTyCon (=>) - tcArrowTyCon (-=>) - ccArrowTyCon (==>) - funTyCon FUN -- Not new See Note [Function type constructors and FunTy] and Note [TYPE and CONSTRAINT] * GHC.Builtin.Types: - New type Constraint = CONSTRAINT LiftedRep - I also stopped nonEmptyTyCon being built-in; it only needs to be wired-in * Exploit the fact that Type and Constraint are distinct throughout GHC - Get rid of tcView in favour of coreView. - Many tcXX functions become XX functions. e.g. tcGetCastedTyVar --> getCastedTyVar * Kill off Note [ForAllTy and typechecker equality], in (old) GHC.Tc.Solver.Canonical. It said that typechecker-equality should ignore the specified/inferred distinction when comparein two ForAllTys. But that wsa only weakly supported and (worse) implies that we need a separate typechecker equality, different from core equality. No no no. * GHC.Core.TyCon: kill off FunTyCon in data TyCon. There was no need for it, and anyway now we have four of them! * GHC.Core.TyCo.Rep: add two FunTyFlags to FunCo See Note [FunCo] in that module. * GHC.Core.Type. Lots and lots of changes driven by adding CONSTRAINT. The key new function is sORTKind_maybe; most other changes are built on top of that. See also `funTyConAppTy_maybe` and `tyConAppFun_maybe`. * Fix a longstanding bug in GHC.Core.Type.typeKind, and Core Lint, in kinding ForAllTys. See new tules (FORALL1) and (FORALL2) in GHC.Core.Type. (The bug was that before (forall (cv::t1 ~# t2). blah), where blah::TYPE IntRep, would get kind (TYPE IntRep), but it should be (TYPE LiftedRep). See Note [Kinding rules for types] in GHC.Core.Type. * GHC.Core.TyCo.Compare is a new module in which we do eqType and cmpType. Of course, no tcEqType any more. * GHC.Core.TyCo.FVs. I moved some free-var-like function into this module: tyConsOfType, visVarsOfType, and occCheckExpand. Refactoring only. * GHC.Builtin.Types. Compiletely re-engineer boxingDataCon_maybe to have one for each /RuntimeRep/, rather than one for each /Type/. This dramatically widens the range of types we can auto-box. See Note [Boxing constructors] in GHC.Builtin.Types The boxing types themselves are declared in library ghc-prim:GHC.Types. GHC.Core.Make. Re-engineer the treatment of "big" tuples (mkBigCoreVarTup etc) GHC.Core.Make, so that it auto-boxes unboxed values and (crucially) types of kind Constraint. That allows the desugaring for arrows to work; it gathers up free variables (including dictionaries) into tuples. See Note [Big tuples] in GHC.Core.Make. There is still work to do here: #22336. But things are better than before. * GHC.Core.Make. We need two absent-error Ids, aBSENT_ERROR_ID for types of kind Type, and aBSENT_CONSTRAINT_ERROR_ID for vaues of kind Constraint. Ditto noInlineId vs noInlieConstraintId in GHC.Types.Id.Make; see Note [inlineId magic]. * GHC.Core.TyCo.Rep. Completely refactor the NthCo coercion. It is now called SelCo, and its fields are much more descriptive than the single Int we used to have. A great improvement. See Note [SelCo] in GHC.Core.TyCo.Rep. * GHC.Core.RoughMap.roughMatchTyConName. Collapse TYPE and CONSTRAINT to a single TyCon, so that the rough-map does not distinguish them. * GHC.Core.DataCon - Mainly just improve documentation * Some significant renamings: GHC.Core.Multiplicity: Many --> ManyTy (easier to grep for) One --> OneTy GHC.Core.TyCo.Rep TyCoBinder --> GHC.Core.Var.PiTyBinder GHC.Core.Var TyCoVarBinder --> ForAllTyBinder AnonArgFlag --> FunTyFlag ArgFlag --> ForAllTyFlag GHC.Core.TyCon TyConTyCoBinder --> TyConPiTyBinder Many functions are renamed in consequence e.g. isinvisibleArgFlag becomes isInvisibleForAllTyFlag, etc * I refactored FunTyFlag (was AnonArgFlag) into a simple, flat data type data FunTyFlag = FTF_T_T -- (->) Type -> Type \| FTF_T_C -- (-=>) Type -> Constraint \| FTF_C_T -- (=>) Constraint -> Type \| FTF_C_C -- (==>) Constraint -> Constraint * GHC.Tc.Errors.Ppr. Some significant refactoring in the TypeEqMisMatch case of pprMismatchMsg. * I made the tyConUnique field of TyCon strict, because I saw code with lots of silly eval's. That revealed that GHC.Settings.Constants.mAX_SUM_SIZE can only be 63, because we pack the sum tag into a 6-bit field. (Lurking bug squashed.) Fixes * #21530 Updates haddock submodule slightly. Performance changes ~~~~~~~~~~~~~~~~~~~ I was worried that compile times would get worse, but after some careful profiling we are down to a geometric mean 0.1% increase in allocation (in perf/compiler). That seems fine. There is a big runtime improvement in T10359 Metric Decrease: LargeRecord MultiLayerModulesTH_OneShot T13386 T13719 Metric Increase: T8095
*	Weaken wrinkle 1 of Note [Scrutinee Constant Folding]	Matthew Craven	2022-11-11	1	-16/+101
\| \| \| \| \| \|	Fixes #22375. Co-authored-by: Simon Peyton Jones <simon.peytonjones@gmail.com>
*	Use a more efficient printer for code generation (#21853)	Krzysztof Gogolewski	2022-11-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The changes in `GHC.Utils.Outputable` are the bulk of the patch and drive the rest. The types `HLine` and `HDoc` in Outputable can be used instead of `SDoc` and support printing directly to a handle with `bPutHDoc`. See Note [SDoc versus HDoc] and Note [HLine versus HDoc]. The classes `IsLine` and `IsDoc` are used to make the existing code polymorphic over `HLine`/`HDoc` and `SDoc`. This is done for X86, PPC, AArch64, DWARF and dependencies (printing module names, labels etc.). Co-authored-by: Alexis King <lexi.lambda@gmail.com> Metric Decrease: CoOpt_Read ManyAlternatives ManyConstructors T10421 T12425 T12707 T13035 T13056 T13253 T13379 T18140 T18282 T18698a T18698b T1969 T20049 T21839c T21839r T3064 T3294 T4801 T5321FD T5321Fun T5631 T6048 T783 T9198 T9233
*	Add a fast path for data constructor workers	Simon Peyton Jones	2022-11-11	2	-18/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	See Note [Fast path for data constructors] in GHC.Core.Opt.Simplify.Iteration This bypasses lots of expensive logic, in the special case of applications of data constructors. It is a surprisingly worthwhile improvement, as you can see in the figures below. Metrics: compile_time/bytes allocated ------------------------------------------------ CoOpt_Read(normal) -2.0% CoOpt_Singletons(normal) -2.0% ManyConstructors(normal) -1.3% T10421(normal) -1.9% GOOD T10421a(normal) -1.5% T10858(normal) -1.6% T11545(normal) -1.7% T12234(optasm) -1.3% T12425(optasm) -1.9% GOOD T13035(normal) -1.0% GOOD T13056(optasm) -1.8% T13253(normal) -3.3% GOOD T15164(normal) -1.7% T15304(normal) -3.4% T15630(normal) -2.8% T16577(normal) -4.3% GOOD T17096(normal) -1.1% T17516(normal) -3.1% T18282(normal) -1.9% T18304(normal) -1.2% T18698a(normal) -1.2% GOOD T18698b(normal) -1.5% GOOD T18923(normal) -1.3% T1969(normal) -1.3% GOOD T19695(normal) -4.4% GOOD T21839c(normal) -2.7% GOOD T21839r(normal) -2.7% GOOD T4801(normal) -3.8% GOOD T5642(normal) -3.1% GOOD T6048(optasm) -2.5% GOOD T9020(optasm) -2.7% GOOD T9630(normal) -2.1% GOOD T9961(normal) -11.7% GOOD WWRec(normal) -1.0% geo. mean -1.1% minimum -11.7% maximum +0.1% Metric Decrease: T10421 T12425 T13035 T13253 T16577 T18698a T18698b T1969 T19695 T21839c T21839r T4801 T5642 T6048 T9020 T9630 T9961
*	Fire RULES in the Specialiser	Simon Peyton Jones	2022-11-10	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Specialiser has, for some time, fires class-op RULES in the specialiser itself: see Note [Specialisation modulo dictionary selectors] This MR beefs it up a bit, so that it fires /all/ RULES in the specialiser, not just class-op rules. See Note [Fire rules in the specialiser] The result is a bit more specialisation; see test simplCore/should_compile/T21851_2 This pushed me into a bit of refactoring. I made a new data types GHC.Core.Rules.RuleEnv, which combines - the several source of rules (local, home-package, external) - the orphan-module dependencies in a single record for `getRules` to consult. That drove a bunch of follow-on refactoring, including allowing me to remove cr_visible_orphan_mods from the CoreReader data type. I moved some of the RuleBase/RuleEnv stuff into GHC.Core.Rule. The reorganisation in the Simplifier improve compile times a bit (geom mean -0.1%), but T9961 is an outlier Metric Decrease: T9961
*	Export pprTrace and friends from GHC.Prelude.	Andreas Klebinger	2022-11-03	2	-2/+0
\| \| \| \| \|	Introduces GHC.Prelude.Basic which can be used in modules which are a dependency of the ppr code.
*	Make the specialiser handle polymorphic specialisation	Simon Peyton Jones	2022-10-25	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ticket #13873 unexpectedly showed that a SPECIALISE pragma made a program run (a lot) slower, because less specialisation took place overall. It turned out that the specialiser was missing opportunities because of quantified type variables. It was quite easy to fix. The story is given in Note [Specialising polymorphic dictionaries] Two other minor fixes in the specialiser * There is no benefit in specialising data constructor /wrappers/. (They can appear overloaded because they are given a dictionary to store in the constructor.) Small guard in canSpecImport. * There was a buglet in the UnspecArg case of specHeader, in the case where there is a dead binder. We need a LitRubbish filler for the specUnfolding stuff. I expanded Note [Drop dead args from specialisations] to explain. There is a 4% increase in compile time for T15164, because we generate more specialised code. This seems OK. Metric Increase: T15164
*	Fix binder-swap bug	Simon Peyton Jones	2022-10-25	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes #21229 properly, by avoiding doing a binder-swap on dictionary Ids. This is pretty subtle, and explained in Note [Care with binder-swap on dictionaries]. Test is already in simplCore/should_run/T21229 This allows us to restore a feature to the specialiser that we had to revert: see Note [Specialising polymorphic dictionaries]. (This is done in a separate patch.) I also modularised things, using a new function scrutBinderSwap_maybe in all the places where we are (effectively) doing a binder-swap, notably * Simplify.Iteration.addAltUnfoldings * SpecConstr.extendCaseBndrs In Simplify.Iteration.addAltUnfoldings I also eliminated a guard Many <- idMult case_bndr because we concluded, in #22123, that it was doing no good.
*	Scrub some partiality in `GHC.Core.Opt.Simplify.Utils`.	M Farkas-Dyck	2022-10-21	1	-8/+9
\|
*	Avoid allocating intermediate lists for non recursive bindings.	Andreas Klebinger	2022-10-17	1	-4/+0
\| \| \| \| \| \| \|	We do so by having an explicit folding function that doesn't need to allocate intermediate lists first. Fixes #22196
*	Don't keep exit join points so much	Simon Peyton Jones	2022-10-11	2	-12/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were religiously keeping exit join points throughout, which had some bad effects (#21148, #22084). This MR does two things: * Arranges that exit join points are inhibited from inlining only in /one/ Simplifier pass (right after Exitification). See Note [Be selective about not-inlining exit join points] in GHC.Core.Opt.Exitify It's not a big deal, but it shaves 0.1% off compile times. * Inline used-once non-recursive join points very aggressively Given join j x = rhs in joinrec k y = ....j x.... where this is the only occurrence of `j`, we want to inline `j`. (Unless sm_keep_exits is on.) See Note [Inline used-once non-recursive join points] in GHC.Core.Opt.Simplify.Utils This is just a tidy-up really. It doesn't change allocation, but getting rid of a binding is always good. Very effect on nofib -- some up and down.
*	Make rewrite rules "win" over inliningwip/T21851-rule-win	Simon Peyton Jones	2022-10-10	2	-157/+216
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a rewrite rule and a rewrite rule compete in the simplifier, this patch makes sure that the rewrite rule "win". That is, in general a bit fragile, but it's a huge help when making specialisation work reliably, as #21851 and #22097 showed. The change is fairly straightforwad, and documented in Note [Rewrite rules and inlining] in GHC.Core.Opt.Simplify.Iteration. Compile-times change, up and down a bit -- in some cases because we get better specialisation. But the payoff (more reliable specialisation) is large. Metrics: compile_time/bytes allocated ----------------------------------------------- T10421(normal) +3.7% BAD T10421a(normal) +5.5% T13253(normal) +1.3% T14052(ghci) +1.8% T15304(normal) -1.4% T16577(normal) +3.1% BAD T17516(normal) +2.3% T17836(normal) -1.9% T18223(normal) -1.8% T8095(normal) -1.3% T9961(normal) +2.5% BAD geo. mean +0.0% minimum -1.9% maximum +5.5% Nofib results are (bytes allocated) +-------------------------------++----------+ \| \|\|tsv (rel) \| +===============================++==========+ \| imaginary/paraffins \|\| +0.27% \| \| imaginary/rfib \|\| -0.04% \| \| real/anna \|\| +0.02% \| \| real/fem \|\| -0.04% \| \| real/fluid \|\| +1.68% \| \| real/gamteb \|\| -0.34% \| \| real/gg \|\| +1.54% \| \| real/hidden \|\| -0.01% \| \| real/hpg \|\| -0.03% \| \| real/infer \|\| -0.03% \| \| real/prolog \|\| +0.02% \| \| real/veritas \|\| -0.47% \| \| shootout/fannkuch-redux \|\| -0.03% \| \| shootout/k-nucleotide \|\| -0.02% \| \| shootout/n-body \|\| -0.06% \| \| shootout/spectral-norm \|\| -0.01% \| \| spectral/cryptarithm2 \|\| +1.25% \| \| spectral/fibheaps \|\| +18.33% \| \| spectral/last-piece \|\| -0.34% \| +===============================++==========+ \| geom mean \|\| +0.17% \| There are extensive notes in !8897 about the regressions. Briefly * fibheaps: there was a very delicately balanced inlining that tipped over the wrong way after this change. * cryptarithm2 and paraffins are caused by #22274, which is a separate issue really. (I.e. the right fix is not to make inlining "win" over rules.) So I'm accepting these changes Metric Increase: T10421 T16577 T9961
*	Demand: Format Call SubDemands `Cn(sd)` as `C(n,sd)` (#22231)wip/T22231	Sebastian Graf	2022-09-29	1	-3/+3
\| \| \| \| \| \| \| \|	Justification in #22231. Short form: In a demand like `1C1(C1(L))` it was too easy to confuse which `1` belongs to which `C`. Now that should be more obvious. Fixes #22231
*	Refactor UnfoldingSource and IfaceUnfolding	Simon Peyton Jones	2022-09-28	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I finally got tired of the way that IfaceUnfolding reflected a previous structure of unfoldings, not the current one. This MR refactors UnfoldingSource and IfaceUnfolding to be simpler and more consistent. It's largely just a refactor, but in UnfoldingSource (which moves to GHC.Types.Basic, since it is now used in IfaceSyn too), I distinguish between /user-specified/ and /system-generated/ stable unfoldings. data UnfoldingSource = VanillaSrc \| StableUserSrc -- From a user-specified pragma \| StableSystemSrc -- From a system-generated unfolding \| CompulsorySrc This has a minor effect in CSE (see the use of isisStableUserUnfolding in GHC.Core.Opt.CSE), which I tripped over when working on specialisation, but it seems like a Good Thing to know anyway.
*	Improve aggressive specialisation	Simon Peyton Jones	2022-09-28	1	-20/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes #21286, by not unboxing dictionaries in worker/wrapper (ever). The main payload is tiny: * In `GHC.Core.Opt.DmdAnal.finaliseArgBoxities`, do not unbox dictionaries in `get_dmd`. See Note [Do not unbox class dictionaries] in that module * I also found that imported wrappers were being fruitlessly specialised, so I fixed that too, in canSpecImport. See Note [Specialising imported functions] point (2). In doing due diligence in the testsuite I fixed a number of other things: * Improve Note [Specialising unfoldings] in GHC.Core.Unfold.Make, and Note [Inline specialisations] in GHC.Core.Opt.Specialise, and remove duplication between the two. The new Note describes how we specialise functions with an INLINABLE pragma. And simplify the defn of `spec_unf` in `GHC.Core.Opt.Specialise.specCalls`. * Improve Note [Worker/wrapper for INLINABLE functions] in GHC.Core.Opt.WorkWrap. And (critially) make an actual change which is to propagate the user-written pragma from the original function to the wrapper; see `mkStrWrapperInlinePrag`. * Write new Note [Specialising imported functions] in GHC.Core.Opt.Specialise All this has a big effect on some compile times. This is compiler/perf, showing only changes over 1%: Metrics: compile_time/bytes allocated ------------------------------------- LargeRecord(normal) -50.2% GOOD ManyConstructors(normal) +1.0% MultiLayerModulesTH_OneShot(normal) +2.6% PmSeriesG(normal) -1.1% T10547(normal) -1.2% T11195(normal) -1.2% T11276(normal) -1.0% T11303b(normal) -1.6% T11545(normal) -1.4% T11822(normal) -1.3% T12150(optasm) -1.0% T12234(optasm) -1.2% T13056(optasm) -9.3% GOOD T13253(normal) -3.8% GOOD T15164(normal) -3.6% GOOD T16190(normal) -2.1% T16577(normal) -2.8% GOOD T16875(normal) -1.6% T17836(normal) +2.2% T17977b(normal) -1.0% T18223(normal) -33.3% GOOD T18282(normal) -3.4% GOOD T18304(normal) -1.4% T18698a(normal) -1.4% GOOD T18698b(normal) -1.3% GOOD T19695(normal) -2.5% GOOD T5837(normal) -2.3% T9630(normal) -33.0% GOOD WWRec(normal) -9.7% GOOD hard_hole_fits(normal) -2.1% GOOD hie002(normal) +1.6% geo. mean -2.2% minimum -50.2% maximum +2.6% I diligently investigated some of the big drops. * Caused by not doing w/w for dictionaries: T13056, T15164, WWRec, T18223 * Caused by not fruitlessly specialising wrappers LargeRecord, T9630 For runtimes, here is perf/should+_run: Metrics: runtime/bytes allocated -------------------------------- T12990(normal) -3.8% T5205(normal) -1.3% T9203(normal) -10.7% GOOD haddock.Cabal(normal) +0.1% haddock.base(normal) -1.1% haddock.compiler(normal) -0.3% lazy-bs-alloc(normal) -0.2% ------------------------------------------ geo. mean -0.3% minimum -10.7% maximum +0.1% I did not investigate exactly what happens in T9203. Nofib is a wash: +-------------------------------++--+-----------+-----------+ \| \|\| \| tsv (rel) \| std. err. \| +===============================++==+===========+===========+ \| real/anna \|\| \| -0.13% \| 0.0% \| \| real/fem \|\| \| +0.13% \| 0.0% \| \| real/fulsom \|\| \| -0.16% \| 0.0% \| \| real/lift \|\| \| -1.55% \| 0.0% \| \| real/reptile \|\| \| -0.11% \| 0.0% \| \| real/smallpt \|\| \| +0.51% \| 0.0% \| \| spectral/constraints \|\| \| +0.20% \| 0.0% \| \| spectral/dom-lt \|\| \| +1.80% \| 0.0% \| \| spectral/expert \|\| \| +0.33% \| 0.0% \| +===============================++==+===========+===========+ \| geom mean \|\| \| \| \| +-------------------------------++--+-----------+-----------+ I spent quite some time investigating dom-lt, but it's pretty complicated. See my note on !7847. Conclusion: it's just a delicate inlining interaction, and we have plenty of those. Metric Decrease: LargeRecord T13056 T13253 T15164 T16577 T18223 T18282 T18698a T18698b T19695 T9630 WWRec hard_hole_fits T9203
*	Demand: Clear distinction between Call SubDmd and eval Dmd (#21717)wip/T21717	Sebastian Graf	2022-09-27	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In #21717 we saw a reportedly unsound strictness signature due to an unsound definition of plusSubDmd on Calls. This patch contains a description and the fix to the unsoundness as outlined in `Note [Call SubDemand vs. evaluation Demand]`. This fix means we also get rid of the special handling of `-fpedantic-bottoms` in eta-reduction. Thanks to less strict and actually sound strictness results, we will no longer eta-reduce the problematic cases in the first place, even without `-fpedantic-bottoms`. So fixing the unsoundness also makes our eta-reduction code simpler with less hacks to explain. But there is another, more unfortunate side-effect: We unfix #21085, but fortunately we have a new fix ready: See `Note [mkCall and plusSubDmd]`. There's another change: I decided to make `Note [SubDemand denotes at least one evaluation]` a lot simpler by using `plusSubDmd` (instead of `lubPlusSubDmd`) even if both argument demands are lazy. That leads to less precise results, but in turn rids ourselves from the need for 4 different `OpMode`s and the complication of `Note [Manual specialisation of lubDmd/plusDmd]`. The result is simpler code that is in line with the paper draft on Demand Analysis. I left the abandoned idea in `Note [Unrealised opportunity in plusDmd]` for posterity. The fallout in terms of regressions is negligible, as the testsuite and NoFib shows. ``` Program Allocs Instrs -------------------------------------------------------------------------------- hidden +0.2% -0.2% linear -0.0% -0.7% -------------------------------------------------------------------------------- Min -0.0% -0.7% Max +0.2% +0.0% Geometric Mean +0.0% -0.0% ``` Fixes #21717.
*	Fix typos	Krzysztof Gogolewski	2022-09-14	1	-1/+1
\|
*	Fix typos	Eric Lindblad	2022-09-14	2	-6/+6
\| \| \| \| \| \| \|	This fixes various typos and spelling mistakes in the compiler. Fixes #21891
*	Add a missing trimArityType	Simon Peyton Jones	2022-08-31	1	-136/+20
\| \| \| \| \|	This buglet was exposed by #22114, a consequence of my earlier refactoring of arity for join points.
*	Improve SpecConstr for evals	Simon Peyton Jones	2022-08-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As #21763 showed, we were over-specialising in some cases, when the function involved was doing a simple 'eval', but not taking the value apart, or branching on it. This MR fixes the problem. See Note [Do not specialise evals]. Nofib barely budges, except that spectral/cichelli allocates about 3% less. Compiler bytes-allocated improves a bit geo. mean -0.1% minimum -0.5% maximum +0.0% The -0.5% is on T11303b, for what it's worth.
*	Fix arityType: -fpedantic-bottoms, join points, etcwip/T21694a	Simon Peyton Jones	2022-08-25	2	-48/+147
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This MR fixes #21694, #21755. It also makes sure that #21948 and fix to #21694. * For #21694 the underlying problem was that we were calling arityType on an expression that had free join points. This is a Bad Bad Idea. See Note [No free join points in arityType]. * To make "no free join points in arityType" work out I had to avoid trying to use eta-expansion for runRW#. This entailed a few changes in the Simplifier's treatment of runRW#. See GHC.Core.Opt.Simplify.Iteration Note [No eta-expansion in runRW#] * I also made andArityType work correctly with -fpedantic-bottoms; see Note [Combining case branches: andWithTail]. * Rewrote Note [Combining case branches: optimistic one-shot-ness] * arityType previously treated join points differently to other let-bindings. This patch makes them unform; arityType analyses the RHS of all bindings to get its ArityType, and extends am_sigs. I realised that, now we have am_sigs giving the ArityType for let-bound Ids, we don't need the (pre-dating) special code in arityType for join points. But instead we need to extend the env for Rec bindings, which weren't doing before. More uniform now. See Note [arityType for let-bindings]. This meant we could get rid of ae_joins, and in fact get rid of EtaExpandArity altogether. Simpler. * And finally, it was the strange treatment of join-point Ids in arityType (involving a fake ABot type) that led to a serious bug: #21755. Fixed by this refactoring, which treats them uniformly; but without breaking #18328. In fact, the arity for recursive join bindings is pretty tricky; see the long Note [Arity for recursive join bindings] in GHC.Core.Opt.Simplify.Utils. That led to more refactoring, including deciding that an Id could have an Arity that is bigger than its JoinArity; see Note [Invariants on join points], item 2(b) in GHC.Core * Make sure that the "demand threshold" for join points in DmdAnal is no bigger than the join-arity. In GHC.Core.Opt.DmdAnal see Note [Demand signatures are computed for a threshold arity based on idArity] * I moved GHC.Core.Utils.exprIsDeadEnd into GHC.Core.Opt.Arity, where it more properly belongs. * Remove an old, redundant hack in FloatOut. The old Note was Note [Bottoming floats: eta expansion] in GHC.Core.Opt.SetLevels. Compile time improves very slightly on average: Metrics: compile_time/bytes allocated --------------------------------------------------------------------------------------- T18223(normal) ghc/alloc 725,808,720 747,839,216 +3.0% BAD T6048(optasm) ghc/alloc 105,006,104 101,599,472 -3.2% GOOD geo. mean -0.2% minimum -3.2% maximum +3.0% For some reason Windows was better T10421(normal) ghc/alloc 125,888,360 124,129,168 -1.4% GOOD T18140(normal) ghc/alloc 85,974,520 83,884,224 -2.4% GOOD T18698b(normal) ghc/alloc 236,764,568 234,077,288 -1.1% GOOD T18923(normal) ghc/alloc 75,660,528 73,994,512 -2.2% GOOD T6048(optasm) ghc/alloc 112,232,512 108,182,520 -3.6% GOOD geo. mean -0.6% I had a quick look at T18223 but it is knee deep in coercions and the size of everything looks similar before and after. I decided to accept that 3% increase in exchange for goodness elsewhere. Metric Decrease: T10421 T18140 T18698b T18923 T6048 Metric Increase: T18223
*	Force `getOccFS bndr` to avoid retaining reference to Bndr.	Matthew Pickering	2022-08-19	1	-2/+5
\| \| \| \|	This is another symptom of #19619
*	Make ru_fn field strict to avoid retaining Ids	Matthew Pickering	2022-08-19	1	-1/+2
\| \| \| \| \| \|	It's better to perform this projection from Id to Name strictly so we don't retain an old Id (hence IdInfo, hence Unfolding, hence everything etc)
*	Remove TCvSubst and use Subst for both term and type-level subst	Yiyun Liu	2022-08-04	2	-15/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes the TCvSubst data type and instead uses Subst as the environment for both term and type level substitution. This change is partially motivated by the existential type proposal, which will introduce types that contain expressions and therefore forces us to carry around an "IdSubstEnv" even when substituting for types. It also reduces the amount of code because "Subst" and "TCvSubst" share a lot of common operations. There isn't any noticeable impact on performance (geo. mean for ghc/alloc is around 0.0% but we have -94 loc and one less data type to worry abount). Currently, the "TCvSubst" data type for substitution on types is identical to the "Subst" data type except the former doesn't store "IdSubstEnv". Using "Subst" for type-level substitution means there will be a redundant field stored in the data type. However, in cases where the substitution starts from the expression, using "Subst" for type-level substitution saves us from having to project "Subst" into a "TCvSubst". This probably explains why the allocation is mostly even despite the redundant field. The patch deletes "TCvSubst" and moves "Subst" and its relevant functions from "GHC.Core.Subst" into "GHC.Core.TyCo.Subst". Substitution on expressions is still defined in "GHC.Core.Subst" so we don't have to expose the definition of "Expr" in the hs-boot file that "GHC.Core.TyCo.Subst" must import to refer to "IdSubstEnv" (whose codomain is "CoreExpr"). Most functions named fooTCvSubst are renamed into fooSubst with a few exceptions (e.g. "isEmptyTCvSubst" is a distinct function from "isEmptySubst"; the former ignores the emptiness of "IdSubstEnv"). These exceptions mainly exist for performance reasons and will go away when "Expr" and "Type" are mutually recursively defined (we won't be able to take those shortcuts if we can't make the assumption that expressions don't appear in types).
*	Switch off eta-expansion in rules and unfoldings	Simon Peyton Jones	2022-07-25	1	-8/+14
\| \| \| \| \| \|	I think this change will make little difference except to reduce clutter. But that's it -- if it causes problems we can switch it on again.
*	Refactored Simplify pass	Dominik Peteler	2022-07-22	4	-175/+4628
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Removed references to driver from GHC.Core.LateCC, GHC.Core.Simplify namespace and GHC.Core.Opt.Stats. Also removed services from configuration records. * Renamed GHC.Core.Opt.Simplify to GHC.Core.Opt.Simplify.Iteration. * Inlined `simplifyPgm` and renamed `simplifyPgmIO` to `simplifyPgm` and moved the Simplify driver to GHC.Core.Opt.Simplify. * Moved `SimplMode` and `FloatEnable` to GHC.Core.Opt.Simplify.Env. * Added a configuration record `TopEnvConfig` for the `SimplTopEnv` environment in GHC.Core.Opt.Simplify.Monad. * Added `SimplifyOpts` and `SimplifyExprOpts`. Provide initialization functions for those in a new module GHC.Driver.Config.Core.Opt.Simplify. Also added initialization functions for `SimplMode` to that module. * Moved `CoreToDo` and friends to a new module GHC.Core.Pipeline.Types and the counting types and functions (`SimplCount` and `Tick`) to new module GHC.Core.Opt.Stats. * Added getter functions for the fields of `SimplMode`. The pedantic bottoms option and the platform are retrieved from the ArityOpts and RuleOpts and the getter functions allow us to retrieve values from `SpecEnv` without the knowledge where the data is stored exactly. * Moved the coercion optimization options from the top environment to `SimplMode`. This way the values left in the top environment are those dealing with monadic functionality, namely logging, IO related stuff and counting. Added a note "The environments of the Simplify pass". * Removed `CoreToDo` from GHC.Core.Lint and GHC.CoreToStg.Prep and got rid of `CoreDoSimplify`. Pass `SimplifyOpts` in the `CoreToDo` type instead. * Prep work before removing `InteractiveContext` from `HscEnv`.
*	Flags to disable local let-floating; -flocal-float-out, ↵	Vanessa McHale	2022-06-22	2	-7/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-flocal-float-out-top-level CLI flags These flags affect the behaviour of local let floating. If `-flocal-float-out` is disabled (the default) then we disable all local floating. ``` …(let x = let y = e in (a,b) in body)... ===> …(let y = e; x = (a,b) in body)... ``` Further to this, top-level local floating can be disabled on it's own by passing -fno-local-float-out-top-level. ``` x = let y = e in (a,b) ===> y = e; x = (a,b) ``` Note that this is only about local floating, ie, floating two adjacent lets past each other and doesn't say anything about the global floating pass which is controlled by `-fno-float`. Fixes #13663
*	Simplify: Take care with eta reduction in recursive RHSs (#21652)	Sebastian Graf	2022-06-20	2	-27/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to the fix to #20836 in CorePrep, we now track the set of enclosing recursive binders in the SimplEnv and SimpleOptEnv. See Note [Eta reduction in recursive RHSs] for details. I also updated Note [Arity robustness] with the insights Simon and I had in a call discussing the issue. Fixes #21652. Unfortunately, we get a 5% ghc/alloc regression in T16577. That is due to additional eta reduction in GHC.Read.choose1 and the resulting ANF-isation of a large list literal at the top-level that didn't happen before (presumably because it was too interesting to float to the top-level). There's not much we can do about that. Metric Increase: T16577
*	A bunch of changes related to eta reduction	Simon Peyton Jones	2022-05-30	2	-143/+177
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a large collection of changes all relating to eta reduction, originally triggered by #18993, but there followed a long saga. Specifics: * Move state-hack stuff from GHC.Types.Id (where it never belonged) to GHC.Core.Opt.Arity (which seems much more appropriate). * Add a crucial mkCast in the Cast case of GHC.Core.Opt.Arity.eta_expand; helps with T18223 * Add clarifying notes about eta-reducing to PAPs. See Note [Do not eta reduce PAPs] * I moved tryEtaReduce from GHC.Core.Utils to GHC.Core.Opt.Arity, where it properly belongs. See Note [Eta reduce PAPs] * In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, pull out the code for when eta-expansion is wanted, to make wantEtaExpansion, and all that same function in GHC.Core.Opt.Simplify.simplStableUnfolding. It was previously inconsistent, but it's doing the same thing. * I did a substantial refactor of ArityType; see Note [ArityType]. This allowed me to do away with the somewhat mysterious takeOneShots; more generally it allows arityType to describe the function, leaving its clients to decide how to use that information. I made ArityType abstract, so that clients have to use functions to access it. * Make GHC.Core.Opt.Simplify.Utils.rebuildLam (was stupidly called mkLam before) aware of the floats that the simplifier builds up, so that it can still do eta-reduction even if there are some floats. (Previously that would not happen.) That means passing the floats to rebuildLam, and an extra check when eta-reducting (etaFloatOk). * In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, make use of call-info in the idDemandInfo of the binder, as well as the CallArity info. The occurrence analyser did this but we were failing to take advantage here. In the end I moved the heavy lifting to GHC.Core.Opt.Arity.findRhsArity; see Note [Combining arityType with demand info], and functions idDemandOneShots and combineWithDemandOneShots. (These changes partly drove my refactoring of ArityType.) * In GHC.Core.Opt.Arity.findRhsArity * I'm now taking account of the demand on the binder to give extra one-shot info. E.g. if the fn is always called with two args, we can give better one-shot info on the binders than if we just look at the RHS. * Don't do any fixpointing in the non-recursive case -- simple short cut. * Trim arity inside the loop. See Note [Trim arity inside the loop] * Make SimpleOpt respect the eta-reduction flag (Some associated refactoring here.) * I made the CallCtxt which the Simplifier uses distinguish between recursive and non-recursive right-hand sides. data CallCtxt = ... \| RhsCtxt RecFlag \| ... It affects only one thing: - We call an RHS context interesting only if it is non-recursive see Note [RHS of lets] in GHC.Core.Unfold * Remove eta-reduction in GHC.CoreToStg.Prep, a welcome simplification. See Note [No eta reduction needed in rhsToBody] in GHC.CoreToStg.Prep. Other incidental changes * Fix a fairly long-standing outright bug in the ApplyToVal case of GHC.Core.Opt.Simplify.mkDupableContWithDmds. I was failing to take the tail of 'dmds' in the recursive call, which meant the demands were All Wrong. I have no idea why this has not caused problems before now. * Delete dead function GHC.Core.Opt.Simplify.Utils.contIsRhsOrArg Metrics: compile_time/bytes allocated Test Metric Baseline New value Change --------------------------------------------------------------------------------------- MultiLayerModulesTH_OneShot(normal) ghc/alloc 2,743,297,692 2,619,762,992 -4.5% GOOD T18223(normal) ghc/alloc 1,103,161,360 972,415,992 -11.9% GOOD T3064(normal) ghc/alloc 201,222,500 184,085,360 -8.5% GOOD T8095(normal) ghc/alloc 3,216,292,528 3,254,416,960 +1.2% T9630(normal) ghc/alloc 1,514,131,032 1,557,719,312 +2.9% BAD parsing001(normal) ghc/alloc 530,409,812 525,077,696 -1.0% geo. mean -0.1% Nofib: Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- banner +0.0% +0.4% -8.9% -8.7% 0.0% exact-reals +0.0% -7.4% -36.3% -37.4% 0.0% fannkuch-redux +0.0% -0.1% -1.0% -1.0% 0.0% fft2 -0.1% -0.2% -17.8% -19.2% 0.0% fluid +0.0% -1.3% -2.1% -2.1% 0.0% gg -0.0% +2.2% -0.2% -0.1% 0.0% spectral-norm +0.1% -0.2% 0.0% 0.0% 0.0% tak +0.0% -0.3% -9.8% -9.8% 0.0% x2n1 +0.0% -0.2% -3.2% -3.2% 0.0% -------------------------------------------------------------------------------- Min -3.5% -7.4% -58.7% -59.9% 0.0% Max +0.1% +2.2% +32.9% +32.9% 0.0% Geometric Mean -0.0% -0.1% -14.2% -14.8% -0.0% Metric Decrease: MultiLayerModulesTH_OneShot T18223 T3064 T15185 T14766 Metric Increase: T9630
*	Make findRhsArity take RecFlag	Simon Peyton Jones	2022-05-30	1	-3/+3
\| \| \| \| \|	This avoids a fixpoint iteration for the common case of non-recursive bindings.
*	Do arity trimming at bindings, rather than in exprArity	Simon Peyton Jones	2022-05-30	1	-0/+4
\| \| \| \| \|	Sometimes there are very large casts, and coercionRKind can be slow.
*	Factor out `initArityOps` to `GHC.Driver.Config.*` module	John Ericson	2022-05-26	1	-6/+2
\| \| \| \|	We want `DynFlags` only mentioned in `GHC.Driver`.