summaryrefslogtreecommitdiff
path: root/compiler/GHC/Unit/Home/ModInfo.hs
Commit message (Collapse)AuthorAgeFilesLines
* Interface Files with Core DefinitionsMatthew Pickering2022-10-111-7/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds three new flags * -fwrite-if-simplified-core: Writes the whole core program into an interface file * -fbyte-code-and-object-code: Generate both byte code and object code when compiling a file * -fprefer-byte-code: Prefer to use byte-code if it's available when running TH splices. The goal for including the core bindings in an interface file is to be able to restart the compiler pipeline at the point just after simplification and before code generation. Once compilation is restarted then code can be created for the byte code backend. This can significantly speed up start-times for projects in GHCi. HLS already implements its own version of these extended interface files for this reason. Preferring to use byte-code means that we can avoid some potentially expensive code generation steps (see #21700) * Producing object code is much slower than producing bytecode, and normally you need to compile with `-dynamic-too` to produce code in the static and dynamic way, the dynamic way just for Template Haskell execution when using a dynamically linked compiler. * Linking many large object files, which happens once per splice, can be quite expensive compared to linking bytecode. And you can get GHC to compile the necessary byte code so `-fprefer-byte-code` has access to it by using `-fbyte-code-and-object-code`. Fixes #21067
* Multiple Home UnitsMatthew Pickering2021-12-281-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multiple home units allows you to load different packages which may depend on each other into one GHC session. This will allow both GHCi and HLS to support multi component projects more naturally. Public Interface ~~~~~~~~~~~~~~~~ In order to specify multiple units, the -unit @⟨filename⟩ flag is given multiple times with a response file containing the arguments for each unit. The response file contains a newline separated list of arguments. ``` ghc -unit @unitLibCore -unit @unitLib ``` where the `unitLibCore` response file contains the normal arguments that cabal would pass to `--make` mode. ``` -this-unit-id lib-core-0.1.0.0 -i -isrc LibCore.Utils LibCore.Types ``` The response file for lib, can specify a dependency on lib-core, so then modules in lib can use modules from lib-core. ``` -this-unit-id lib-0.1.0.0 -package-id lib-core-0.1.0.0 -i -isrc Lib.Parse Lib.Render ``` Then when the compiler starts in --make mode it will compile both units lib and lib-core. There is also very basic support for multiple home units in GHCi, at the moment you can start a GHCi session with multiple units but only the :reload is supported. Most commands in GHCi assume a single home unit, and so it is additional work to work out how to modify the interface to support multiple loaded home units. Options used when working with Multiple Home Units There are a few extra flags which have been introduced specifically for working with multiple home units. The flags allow a home unit to pretend it’s more like an installed package, for example, specifying the package name, module visibility and reexported modules. -working-dir ⟨dir⟩ It is common to assume that a package is compiled in the directory where its cabal file resides. Thus, all paths used in the compiler are assumed to be relative to this directory. When there are multiple home units the compiler is often not operating in the standard directory and instead where the cabal.project file is located. In this case the -working-dir option can be passed which specifies the path from the current directory to the directory the unit assumes to be it’s root, normally the directory which contains the cabal file. When the flag is passed, any relative paths used by the compiler are offset by the working directory. Notably this includes -i and -I⟨dir⟩ flags. -this-package-name ⟨name⟩ This flag papers over the awkward interaction of the PackageImports and multiple home units. When using PackageImports you can specify the name of the package in an import to disambiguate between modules which appear in multiple packages with the same name. This flag allows a home unit to be given a package name so that you can also disambiguate between multiple home units which provide modules with the same name. -hidden-module ⟨module name⟩ This flag can be supplied multiple times in order to specify which modules in a home unit should not be visible outside of the unit it belongs to. The main use of this flag is to be able to recreate the difference between an exposed and hidden module for installed packages. -reexported-module ⟨module name⟩ This flag can be supplied multiple times in order to specify which modules are not defined in a unit but should be reexported. The effect is that other units will see this module as if it was defined in this unit. The use of this flag is to be able to replicate the reexported modules feature of packages with multiple home units. Offsetting Paths in Template Haskell splices ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When using Template Haskell to embed files into your program, traditionally the paths have been interpreted relative to the directory where the .cabal file resides. This causes problems for multiple home units as we are compiling many different libraries at once which have .cabal files in different directories. For this purpose we have introduced a way to query the value of the -working-dir flag to the Template Haskell API. By using this function we can implement a makeRelativeToProject function which offsets a path which is relative to the original project root by the value of -working-dir. ``` import Language.Haskell.TH.Syntax ( makeRelativeToProject ) foo = $(makeRelativeToProject "./relative/path" >>= embedFile) ``` > If you write a relative path in a Template Haskell splice you should use the makeRelativeToProject function so that your library works correctly with multiple home units. A similar function already exists in the file-embed library. The function in template-haskell implements this function in a more robust manner by honouring the -working-dir flag rather than searching the file system. Closure Property for Home Units ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For tools or libraries using the API there is one very important closure property which must be adhered to: > Any dependency which is not a home unit must not (transitively) depend on a home unit. For example, if you have three packages p, q and r, then if p depends on q which depends on r then it is illegal to load both p and r as home units but not q, because q is a dependency of the home unit p which depends on another home unit r. If you are using GHC by the command line then this property is checked, but if you are using the API then you need to check this property yourself. If you get it wrong you will probably get some very confusing errors about overlapping instances. Limitations of Multiple Home Units ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There are a few limitations of the initial implementation which will be smoothed out on user demand. * Package thinning/renaming syntax is not supported * More complicated reexports/renaming are not yet supported. * It’s more common to run into existing linker bugs when loading a large number of packages in a session (for example #20674, #20689) * Backpack is not yet supported when using multiple home units. * Dependency chasing can be quite slow with a large number of modules and packages. * Loading wired-in packages as home units is currently not supported (this only really affects GHC developers attempting to load template-haskell). * Barely any normal GHCi features are supported, it would be good to support enough for ghcid to work correctly. Despite these limitations, the implementation works already for nearly all packages. It has been testing on large dependency closures, including the whole of head.hackage which is a total of 4784 modules from 452 packages. Internal Changes ~~~~~~~~~~~~~~~~ * The biggest change is that the HomePackageTable is replaced with the HomeUnitGraph. The HomeUnitGraph is a map from UnitId to HomeUnitEnv, which contains information specific to each home unit. * The HomeUnitEnv contains: - A unit state, each home unit can have different package db flags - A set of dynflags, each home unit can have different flags - A HomePackageTable * LinkNode: A new node type is added to the ModuleGraph, this is used to place the linking step into the build plan so linking can proceed in parralel with other packages being built. * New invariant: Dependencies of a ModuleGraphNode can be completely determined by looking at the value of the node. In order to achieve this, downsweep now performs a more complete job of downsweeping and then the dependenices are recorded forever in the node rather than being computed again from the ModSummary. * Some transitive module calculations are rewritten to use the ModuleGraph which is more efficient. * There is always an active home unit, which simplifies modifying a lot of the existing API code which is unit agnostic (for example, in the driver). The road may be bumpy for a little while after this change but the basics are well-tested. One small metric increase, which we accept and also submodule update to haddock which removes ExtendedModSummary. Closes #10827 ------------------------- Metric Increase: MultiLayerModules ------------------------- Co-authored-by: Fendor <power.walross@gmail.com>
* Temporary fix for leak with -fno-code (#20509)Matthew Pickering2021-10-191-0/+4
| | | | | | | | | | | | | | | | | | This hack inserted for backpack caused a very bad leak when using -fno-code where EPS entries would end up retaining stale HomePackageTables. For any interactive user, such as HLS, this is really bad as once the entry makes it's way into the EPS then it's there for the rest of the session. This is a temporary fix which "solves" the issue by filtering the HPT to only the part which is needed for the hack to work, but in future we want to separate out hole modules from the HPT entirely to avoid needing to do this kind of special casing. ------------------------- Metric Decrease: MultiLayerModulesDefsGhci -------------------------
* More strictness around HomePackageTableMatthew Pickering2021-10-121-1/+3
| | | | | | | | | This patch makes some operations to do with HomePackageTable stricter * Adding a new entry into the HPT would not allow the old HomeModInfo to be collected because the function used by insertWith wouldn't be forced. * We're careful to force the new MVar value before it's inserted into the global MVar as otherwise we retain references to old entries.
* Fix annoying warning about Data.List unqualified importSylvain Henry2021-09-171-1/+1
|
* Driver rework pt3: the upsweepMatthew Pickering2021-08-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch specifies and simplifies the module cycle compilation in upsweep. How things work are described in the Note [Upsweep] Note [Upsweep] ~~~~~~~~~~~~~~ Upsweep takes a 'ModuleGraph' as input, computes a build plan and then executes the plan in order to compile the project. The first step is computing the build plan from a 'ModuleGraph'. The output of this step is a `[BuildPlan]`, which is a topologically sorted plan for how to build all the modules. ``` data BuildPlan = SingleModule ModuleGraphNode -- A simple, single module all alone but *might* have an hs-boot file which isn't part of a cycle | ResolvedCycle [ModuleGraphNode] -- A resolved cycle, linearised by hs-boot files | UnresolvedCycle [ModuleGraphNode] -- An actual cycle, which wasn't resolved by hs-boot files ``` The plan is computed in two steps: Step 1: Topologically sort the module graph without hs-boot files. This returns a [SCC ModuleGraphNode] which contains cycles. Step 2: For each cycle, topologically sort the modules in the cycle *with* the relevant hs-boot files. This should result in an acyclic build plan if the hs-boot files are sufficient to resolve the cycle. The `[BuildPlan]` is then interpreted by the `interpretBuildPlan` function. * `SingleModule nodes` are compiled normally by either the upsweep_inst or upsweep_mod functions. * `ResolvedCycles` need to compiled "together" so that the information which ends up in the interface files at the end is accurate (and doesn't contain temporary information from the hs-boot files.) - During the initial compilation, a `KnotVars` is created which stores an IORef TypeEnv for each module of the loop. These IORefs are gradually updated as the loop completes and provide the required laziness to typecheck the module loop. - At the end of typechecking, all the interface files are typechecked again in the retypecheck loop. This time, the knot-tying is done by the normal laziness based tying, so the environment is run without the KnotVars. * UnresolvedCycles are indicative of a proper cycle, unresolved by hs-boot files and are reported as an error to the user. The main trickiness of `interpretBuildPlan` is deciding which version of a dependency is visible from each module. For modules which are not in a cycle, there is just one version of a module, so that is always used. For modules in a cycle, there are two versions of 'HomeModInfo'. 1. Internal to loop: The version created whilst compiling the loop by upsweep_mod. 2. External to loop: The knot-tied version created by typecheckLoop. Whilst compiling a module inside the loop, we need to use the (1). For a module which is outside of the loop which depends on something from in the loop, the (2) version is used. As the plan is interpreted, which version of a HomeModInfo is visible is updated by updating a map held in a state monad. So after a loop has finished being compiled, the visible module is the one created by typecheckLoop and the internal version is not used again. This plan also ensures the most important invariant to do with module loops: > If you depend on anything within a module loop, before you can use the dependency, the whole loop has to finish compiling. The end result of `interpretBuildPlan` is a `[MakeAction]`, which are pairs of `IO a` actions and a `MVar (Maybe a)`, somewhere to put the result of running the action. This list is topologically sorted, so can be run in order to compute the whole graph. As well as this `interpretBuildPlan` also outputs an `IO [Maybe (Maybe HomeModInfo)]` which can be queried at the end to get the result of all modules at the end, with their proper visibility. For example, if any module in a loop fails then all modules in that loop will report as failed because the visible node at the end will be the result of retypechecking those modules together. Along the way we also fix a number of other bugs in the driver: * Unify upsweep and parUpsweep. * Fix #19937 (static points, ghci and -j) * Adds lots of module loop tests due to Divam. Also related to #20030 Co-authored-by: Divam Narula <dfordivam@gmail.com> ------------------------- Metric Decrease: T10370 -------------------------
* Linker: reorganize linker related codeSylvain Henry2020-11-031-1/+1
| | | | | | | Move linker related code into GHC.Linker. Previously it was scattered into GHC.Unit.State, GHC.Driver.Pipeline, GHC.Runtime.Linker, etc. Add documentation in GHC.Linker
* Split GHC.Driver.TypesSylvain Henry2020-10-291-0/+117
I was working on making DynFlags stateless (#17957), especially by storing loaded plugins into HscEnv instead of DynFlags. It turned out to be complicated because HscEnv is in GHC.Driver.Types but LoadedPlugin isn't: it is in GHC.Driver.Plugins which depends on GHC.Driver.Types. I didn't feel like introducing yet another hs-boot file to break the loop. Additionally I remember that while we introduced the module hierarchy (#13009) we talked about splitting GHC.Driver.Types because it contained various unrelated types and functions, but we never executed. I didn't feel like making GHC.Driver.Types bigger with more unrelated Plugins related types, so finally I bit the bullet and split GHC.Driver.Types. As a consequence this patch moves a lot of things. I've tried to put them into appropriate modules but nothing is set in stone. Several other things moved to avoid loops. * Removed Binary instances from GHC.Utils.Binary for random compiler things * Moved Typeable Binary instances into GHC.Utils.Binary.Typeable: they import a lot of things that users of GHC.Utils.Binary don't want to depend on. * put everything related to Units/Modules under GHC.Unit: GHC.Unit.Finder, GHC.Unit.Module.{ModGuts,ModIface,Deps,etc.} * Created several modules under GHC.Types: GHC.Types.Fixity, SourceText, etc. * Split GHC.Utils.Error (into GHC.Types.Error) * Finally removed GHC.Driver.Types Note that this patch doesn't put loaded plugins into HscEnv. It's left for another patch. Bump haddock submodule