author | Matthew Pickering <matthewtpickering@gmail.com> | 2022-11-09 11:48:23 +0000 |
---|---|---|
committer | Matthew Pickering <matthewtpickering@gmail.com> | 2022-11-29 11:10:32 +0000 |
commit | f212f609f137c7f10455ee34cbd82f15843cb6de (patch) | |
tree | 46a56a7fe9423078eb5f0881427428c2b3710b15 /docs/users_guide/debugging.rst | |
parent | def47dd32491311289bff26230b664c895f178cc (diff) | |
download | haskell-wip/par-stats.tar.gz |
driver: Add timing information to upsweep and some simple analysis scripts
This commit adds a new flag `-ddump-make-stats` which
shows some statistics about the project build graph after compilation has finished.
These can be useful for identifying bottlenecks in your project's module structure.
The statistics which are currently output are:
* The modules which took longest to compile.
* The modules which have the largest "flow". The initial flow is 1 and is split
evenly between all roots of the dependency graph. The flow is then propagated
through the graph: each node accumulates the flow it receives and splits it
evenly between its children.
The result is that any synchronisation point has a flow equal to 1,
and other important modules likewise have a high flow value.
* The length of the longest (critical) path through the project. This provides
a lower bound on the project's compilation time.
* The "parallelism score", which is the sum of the compile times of all nodes divided by
the length of the critical path. This should be a more stable metric than
critical path length because it doesn't depend on how fast your computer is.
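The flow, critical path, and parallelism score described above can be sketched on a toy graph. The Python below is an illustrative sketch, not the code GHC uses: the function names, the `children` adjacency map, and the `time` table are hypothetical, and the edge direction (which end of an import counts as a "root") is an assumption that the description does not pin down.

```python
def topo_order(children):
    """Return the nodes in topological order (Kahn's algorithm).

    `children` maps every node to the list of nodes its flow is passed to.
    """
    indegree = {n: 0 for n in children}
    for cs in children.values():
        for c in cs:
            indegree[c] += 1
    ready = [n for n, d in indegree.items() if d == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for c in children[n]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return order


def flows(children):
    """Split a total flow of 1 between the roots, then push it through the graph."""
    has_parent = {c for cs in children.values() for c in cs}
    roots = [n for n in children if n not in has_parent]
    flow = {n: 0.0 for n in children}
    for r in roots:
        flow[r] = 1.0 / len(roots)
    for n in topo_order(children):
        if children[n]:
            share = flow[n] / len(children[n])
            for c in children[n]:
                flow[c] += share
    return flow


def critical_path(children, time):
    """Length of the longest path, weighting each node by its compile time."""
    longest = {}
    for n in reversed(topo_order(children)):
        longest[n] = time[n] + max((longest[c] for c in children[n]), default=0.0)
    return max(longest.values())


def parallelism_score(children, time):
    """Sum of all node times divided by the critical path length."""
    return sum(time.values()) / critical_path(children, time)


# A diamond: A feeds B and C, which both feed D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
times = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 1.0}
node_flows = flows(graph)
score = parallelism_score(graph, times)
```

On this diamond, the synchronisation points `A` and `D` both end up with a flow of 1.0, while `B` and `C` get 0.5 each. The critical path is `A`, `C`, `D` with length 5.0, so the score is 7.0 / 5.0 = 1.4.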
For example, here is the output from compiling the Cabal
library.
```
===== Maximum Duration (s) =====
000 M: main:Distribution.Simple.Setup (59): time: 1.40 allocs: 1489.93
001 M: main:Distribution.PackageDescription.Check (82): time: 0.68 allocs: 732.85
...
104 M: main:Distribution.Compat.GetShortPathName (9): time: 0.00 allocs: 3.62
105 M: main:Distribution.Compat.FilePath (8): time: 0.00 allocs: 3.46
===== Maximum Flows =====
000 M: main:Distribution.Simple (105): 1.000
001 M: main:Distribution.Simple.Configure (97): 0.346
...
104 M: main:Distribution.Simple.Program.Types (50): 0.002
105 M: main:Distribution.Simple.GHC.ImplInfo (46): 0.000
===== Flows x Time =====
000 M: main:Distribution.Simple (105): 0.175
001 M: main:Distribution.Simple.Configure (97): 0.127
...
104 M: main:Distribution.Backpack.PreExistingComponent (4): 0.000
105 M: main:Distribution.Simple.GHC.ImplInfo (46): 0.000
===== Statistics =====
longest path: 4.291s
parallelism score: 2.247
sequential time: 9.642s
```
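As a rough illustration of how a script might consume this output, the sketch below parses one line of the "Maximum Duration" section and checks that the reported parallelism score equals sequential time divided by longest path. The regular expression and the field names are assumptions inferred from the sample above, not a documented format.

```python
import re

# Assumed shape of a duration line, inferred from the sample output:
#   <rank> M: <unit>:<module> (<node id>): time: <seconds> allocs: <number>
DURATION_LINE = re.compile(
    r"^(\d+) M: (\S+) \((\d+)\): time: ([\d.]+) allocs: ([\d.]+)$"
)


def parse_duration_line(line):
    """Parse one 'Maximum Duration' line, or return None if it does not match."""
    m = DURATION_LINE.match(line.strip())
    if m is None:
        return None
    rank, module, node_id, seconds, allocs = m.groups()
    return {
        "rank": int(rank),
        "module": module,
        "node_id": int(node_id),
        "time": float(seconds),
        "allocs": float(allocs),
    }


sample = "000 M: main:Distribution.Simple.Setup (59): time: 1.40 allocs: 1489.93"
entry = parse_duration_line(sample)

# Figures copied from the "Statistics" section of the sample output.
sequential_time = 9.642
longest_path = 4.291
computed_score = round(sequential_time / longest_path, 3)
```

With the figures from the sample, `computed_score` comes out as 2.247, matching the reported parallelism score.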
In addition, the build graph is emitted to the eventlog.
For each node in the build graph, an event of the following form is emitted to the eventlog:
```
node: { "node_id": 0, "node_deps": [0, 1,2,3], "node_desc": "GHC.Driver.Make" }
```
This allows external tooling to reconstruct the actual build
graph used by GHC and analyse it.
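A minimal sketch of such tooling is shown below. It assumes the event messages have already been extracted from the eventlog as plain strings (how to do that is not covered here), and the function names are hypothetical. Only the `node: ` prefix and the three JSON keys come from the example above.

```python
import json

PREFIX = "node: "


def parse_node_event(message):
    """Decode one 'node: {...}' message, or return None for unrelated events."""
    if not message.startswith(PREFIX):
        return None
    return json.loads(message[len(PREFIX):])


def build_graph(messages):
    """Collect node dependencies and descriptions keyed by node id."""
    deps = {}
    names = {}
    for message in messages:
        node = parse_node_event(message)
        if node is None:
            continue
        deps[node["node_id"]] = list(node["node_deps"])
        names[node["node_id"]] = node["node_desc"]
    return deps, names


events = [
    'node: { "node_id": 0, "node_deps": [0, 1,2,3], "node_desc": "GHC.Driver.Make" }',
    "some unrelated event",
]
deps, names = build_graph(events)
```

Here `deps` maps each node id to the ids listed in `node_deps`, and `names` maps it to the module description; messages without the `node: ` prefix are skipped.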
Diffstat (limited to 'docs/users_guide/debugging.rst')
-rw-r--r-- | docs/users_guide/debugging.rst | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/docs/users_guide/debugging.rst b/docs/users_guide/debugging.rst
index 83d093cd06..1a1791bbbd 100644
--- a/docs/users_guide/debugging.rst
+++ b/docs/users_guide/debugging.rst
@@ -111,6 +111,28 @@ Dumping out compiler intermediate structures
     Show allocation and runtime statistics for various stages of compilation.
     Allocations are measured in bytes. Timings are measured in milliseconds.
 
+.. ghc-flag:: -ddump-make-stats
+    :shortdesc: Dump information about the project build time and build graph.
+    :type: dynamic
+
+    Show some statistics about the project build graph after compilation has finished.
+    These can be useful identifying bottlenecks in your projects module structure.
+
+    The statistics which are currently outputted are:
+
+    * The modules which took longest to compile.
+    * The modules which have the largest "flow". The initial flow is 1, and split
+      evenly between all roots of the dependency graph. The flow is propagated
+      through the graph, accumulated on each node and split evenly on children.
+      The result is that any synchronisation points will have a flow equal to 1,
+      and likewise other important modules will have a high flow value.
+    * The length of the longest (critical) path through the project. This provides
+      a lower bound on the projects compilation time.
+    * The "parallelism score" which is the sum of compiling all nodes divided by
+      the length of the critical path. This should be a more stable metric then
+      critical path length because it doesn't depend on how fast your computer is.
+
+
 GHC is a large program consisting of a number of stages. You can tell GHC to
 dump information from various stages of compilation using the ``-ddump-⟨pass⟩``
 flags listed below. Note that some of these tend to produce a lot of output.