diff options
author | Simon Peyton Jones <simon.peytonjones@gmail.com> | 2023-02-21 10:51:34 +0000 |
---|---|---|
committer | Ben Gamari <ben@smart-cactus.org> | 2023-03-02 10:31:49 -0500 |
commit | b73b70bfa976eb57f1c4ec0c0db97a5fab2bf9ff (patch) | |
tree | b92f61bbab8d9dbcd63f545d99dc5db3b6118e61 | |
parent | 5451b48c5affeb443e2ba1eef60a9c10e0a5842b (diff) | |
download | haskell-b73b70bfa976eb57f1c4ec0c0db97a5fab2bf9ff.tar.gz |
Take more care with unlifted bindings in the specialiser
As #22998 showed, we were floating an unlifted binding to top
level, which breaks a Core invariant.
The fix is easy, albeit a little bit conservative. See
Note [Care with unlifted bindings] in GHC.Core.Opt.Specialise
(cherry picked from commit 7192ef91c855e1fae6997f75cfde76aafd0b4bcf)
-rw-r--r-- | compiler/GHC/Core.hs | 94 | ||||
-rw-r--r-- | compiler/GHC/Core/Opt/Specialise.hs | 29 | ||||
-rw-r--r-- | testsuite/tests/simplCore/should_run/T22998.hs | 10 | ||||
-rw-r--r-- | testsuite/tests/simplCore/should_run/T22998.stdout | 1 | ||||
-rw-r--r-- | testsuite/tests/simplCore/should_run/all.T | 1 |
5 files changed, 101 insertions, 34 deletions
diff --git a/compiler/GHC/Core.hs b/compiler/GHC/Core.hs index f504a7cbd5..6ecd144988 100644 --- a/compiler/GHC/Core.hs +++ b/compiler/GHC/Core.hs @@ -366,39 +366,20 @@ a Coercion, (sym c). Note [Core letrec invariant] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The right hand sides of all top-level and recursive @let@s -/must/ be of lifted type (see "Type#type_classification" for -the meaning of /lifted/ vs. /unlifted/). +The Core letrec invariant: -There is one exception to this rule, top-level @let@s are -allowed to bind primitive string literals: see -Note [Core top-level string literals]. + The right hand sides of all + /top-level/ or /recursive/ + bindings must be of lifted type -Note [Core top-level string literals] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -As an exception to the usual rule that top-level binders must be lifted, -we allow binding primitive string literals (of type Addr#) of type Addr# at the -top level. This allows us to share string literals earlier in the pipeline and -crucially allows other optimizations in the Core2Core pipeline to fire. -Consider, - - f n = let a::Addr# = "foo"# - in \x -> blah - -In order to be able to inline `f`, we would like to float `a` to the top. -Another option would be to inline `a`, but that would lead to duplicating string -literals, which we want to avoid. See #8472. - -The solution is simply to allow top-level unlifted binders. We can't allow -arbitrary unlifted expression at the top-level though, unlifted binders cannot -be thunks, so we just allow string literals. + There is one exception to this rule, top-level @let@s are + allowed to bind primitive string literals: see + Note [Core top-level string literals]. -We allow the top-level primitive string literals to be wrapped in Ticks -in the same way they can be wrapped when nested in an expression. -CoreToSTG currently discards Ticks around top-level primitive string literals. -See #14779. +See "Type#type_classification" in GHC.Core.Type +for the meaning of "lifted" vs. "unlifted"). -Also see Note [Compilation plan for top-level string literals]. +For the non-top-level, non-recursive case see Note [Core let-can-float invariant]. Note [Compilation plan for top-level string literals] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -420,14 +401,16 @@ parts of the compilation pipeline. the _bytes suffix. Note [Core let-can-float invariant] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The let-can-float invariant: - The right hand side of a non-recursive 'Let' - /may/ be of unlifted type, but only if + The right hand side of a /non-top-level/, /non-recursive/ binding + may be of unlifted type, but only if the expression is ok-for-speculation or the 'Let' is for a join point. + (For top-level or recursive lets see Note [Core letrec invariant].) + This means that the let can be floated around without difficulty. For example, this is OK: @@ -466,6 +449,53 @@ we need to allow lots of things in the arguments of a call. TL;DR: we relaxed the let/app invariant to become the let-can-float invariant. +Note [Core top-level string literals] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +As an exception to the usual rule that top-level binders must be lifted, +we allow binding primitive string literals (of type Addr#) of type Addr# at the +top level. This allows us to share string literals earlier in the pipeline and +crucially allows other optimizations in the Core2Core pipeline to fire. +Consider, + + f n = let a::Addr# = "foo"# + in \x -> blah + +In order to be able to inline `f`, we would like to float `a` to the top. +Another option would be to inline `a`, but that would lead to duplicating string +literals, which we want to avoid. See #8472. + +The solution is simply to allow top-level unlifted binders. We can't allow +arbitrary unlifted expression at the top-level though, unlifted binders cannot +be thunks, so we just allow string literals. + +We allow the top-level primitive string literals to be wrapped in Ticks +in the same way they can be wrapped when nested in an expression. +CoreToSTG currently discards Ticks around top-level primitive string literals. +See #14779. + +Also see Note [Compilation plan for top-level string literals]. + +Note [Compilation plan for top-level string literals] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Here is a summary on how top-level string literals are handled by various +parts of the compilation pipeline. + +* In the source language, there is no way to bind a primitive string literal + at the top level. + +* In Core, we have a special rule that permits top-level Addr# bindings. See + Note [Core top-level string literals]. Core-to-core passes may introduce + new top-level string literals. + + See GHC.Core.Utils.exprIsTopLevelBindable, and exprIsTickedString + +* In STG, top-level string literals are explicitly represented in the syntax + tree. + +* A top-level string literal may end up exported from a module. In this case, + in the object file, the content of the exported literal is given a label with + the _bytes suffix. + Note [Case expression invariants] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Case expressions are one of the more complicated elements of the Core diff --git a/compiler/GHC/Core/Opt/Specialise.hs b/compiler/GHC/Core/Opt/Specialise.hs index f028bd428e..f48aeb50d7 100644 --- a/compiler/GHC/Core/Opt/Specialise.hs +++ b/compiler/GHC/Core/Opt/Specialise.hs @@ -27,7 +27,7 @@ import GHC.Core import GHC.Core.Make ( mkLitRubbish ) import GHC.Core.Unify ( tcMatchTy ) import GHC.Core.Rules -import GHC.Core.Utils ( exprIsTrivial +import GHC.Core.Utils ( exprIsTrivial, exprIsTopLevelBindable , mkCast, exprType , stripTicksTop, mkInScopeSetBndrs ) import GHC.Core.FVs @@ -1515,7 +1515,10 @@ specBind top_lvl env (NonRec fn rhs) do_body = [mkDB $ NonRec b r | (b,r) <- pairs] ++ fromOL dump_dbs - ; if float_all then + can_float_this_one = exprIsTopLevelBindable rhs (idType fn) + -- exprIsTopLevelBindable: see Note [Care with unlifted bindings] + + ; if float_all && can_float_this_one then -- Rather than discard the calls mentioning the bound variables -- we float this (dictionary) binding along with the others return ([], body', all_free_uds `snocDictBinds` final_binds) @@ -1876,6 +1879,28 @@ even in the case that @x = False@! Instead, we add a dummy 'Void#' argument to the specialisation '$sfInt' ($sfInt :: Void# -> Array# Int) in order to preserve laziness. +Note [Care with unlifted bindings] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider (#22998) + f x = let x::ByteArray# = <some literal> + n::Natural = NB x + in wombat @192827 (n |> co) +where + co :: Natural ~ KnownNat 192827 + wombat :: forall (n:Nat). KnownNat n => blah + +Left to itself, the specialiser would float the bindings for `x` and `n` to top +level, so we can specialise `wombat`. But we can't have a top-level ByteArray# +(see Note [Core letrec invariant] in GHC.Core). Boo. + +This is pretty exotic, so we take a simple way out: in specBind (the NonRec +case) do not float the binding itself unless it satisfies exprIsTopLevelBindable. +This is conservative: maybe the RHS of `x` has a free var that would stop it +floating to top level anyway; but that is hard to spot (since we don't know what +the non-top-level in-scope binders are) and rare (since the binding must satisfy +Note [Core let-can-float invariant] in GHC.Core). + + Note [Specialising Calls] ~~~~~~~~~~~~~~~~~~~~~~~~~ Suppose we have a function with a complicated type: diff --git a/testsuite/tests/simplCore/should_run/T22998.hs b/testsuite/tests/simplCore/should_run/T22998.hs new file mode 100644 index 0000000000..459f576d82 --- /dev/null +++ b/testsuite/tests/simplCore/should_run/T22998.hs @@ -0,0 +1,10 @@ +{-# LANGUAGE DataKinds #-} +module Main where + +import Data.Proxy (Proxy(Proxy)) +import GHC.TypeLits (natVal) + +main :: IO () +main = print x + where + x = natVal @18446744073709551616 Proxy + natVal @18446744073709551616 Proxy diff --git a/testsuite/tests/simplCore/should_run/T22998.stdout b/testsuite/tests/simplCore/should_run/T22998.stdout new file mode 100644 index 0000000000..1ce484120a --- /dev/null +++ b/testsuite/tests/simplCore/should_run/T22998.stdout @@ -0,0 +1 @@ +36893488147419103232 diff --git a/testsuite/tests/simplCore/should_run/all.T b/testsuite/tests/simplCore/should_run/all.T index a887d5cedb..2de90b6541 100644 --- a/testsuite/tests/simplCore/should_run/all.T +++ b/testsuite/tests/simplCore/should_run/all.T @@ -107,3 +107,4 @@ test('T21229', normal, compile_and_run, ['-O']) test('T21575', normal, compile_and_run, ['-O']) test('T21575b', [], multimod_compile_and_run, ['T21575b', '-O']) test('T20836', normal, compile_and_run, ['-O0']) # Should not time out; See #20836 +test('T22998', normal, compile_and_run, ['-O0 -fspecialise -dcore-lint']) |