author Matthew Pickering <matthewtpickering@gmail.com> 2021-02-19 14:24:40 +0000
committer Matthew Pickering <matthewtpickering@gmail.com> 2021-12-06 16:27:35 +0000
commit d72720f9b75fbed51a24edfb691d0c77c6e96dbe
tree a8e460da6d257e0d5c739d1a7056a296fd68997f
parent a9e035a430c7fdc228d56d21b27b3b8e815fd06b
Add section to the user guide about OS memory usage
-rw-r--r-- docs/users_guide/hints.rst 101
-rw-r--r-- docs/users_guide/profiling.rst 4
-rw-r--r-- docs/users_guide/runtime_control.rst 7
3 files changed, 110 insertions, 2 deletions
diff --git a/docs/users_guide/hints.rst b/docs/users_guide/hints.rst
index ea7ff0e9fb..0d011a2563 100644
--- a/docs/users_guide/hints.rst
+++ b/docs/users_guide/hints.rst
@@ -427,5 +427,102 @@ Inlining generics
There are also flags specific to the inlining of generics:
-:ghc-flag:`-finline-generics`
-:ghc-flag:`-finline-generics-aggressively`
+* :ghc-flag:`-finline-generics`
+* :ghc-flag:`-finline-generics-aggressively`
+
+
+.. _hints-os-memory:
+
+Understanding how OS memory usage corresponds to live data
+----------------------------------------------------------
+
+A confusing aspect of the RTS is the sometimes large difference between
+the memory usage reported by the OS and
+the amount of live data reported by heap profiling or ``GHC.Stats``.
+
+There are two main factors which determine OS memory usage.
+
+The first is the collection strategy used by the oldest generation. By default a copying
+strategy is used, which requires at least 2 times the amount of currently live
+data in order to perform a major collection. For example, if your program's live data
+is 1G then you would expect the OS to report at minimum 2G.
+
+If instead you are using the compacting (:rts-flag:`-c`) or nonmoving (:rts-flag:`-xn`) strategies
+for the oldest generation then less overhead is required, as the strategy immediately
+reuses already allocated memory by overwriting. For a program with 1G of live data
+you might expect the OS to report at minimum a small percentage above 1G.
+
+The second is that, after doing some allocation, GHC is quite reluctant to return
+the memory to the OS. This is because after performing a major collection the program might
+still be allocating a lot, and requesting more memory from the OS is costly.
+Therefore the RTS keeps an extra amount to reuse, which
+depends on the :rts-flag:`-F ⟨factor⟩` option. By default
+the RTS will keep up to ``(2 + F) * live_bytes`` after performing a major collection due to
+exhausting the available heap. The default value is ``F = 2``, so you
+can see OS memory usage reported to be as high as 4 times the amount of live data
+in your program.
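
The bound above is plain arithmetic, sketched here in Python purely as an illustration. The function name ``retained_upper_bound`` is hypothetical and is not something the RTS exposes; the sketch models only the ``(2 + F) * live_bytes`` figure quoted above.

```python
GIB = 1024 ** 3

def retained_upper_bound(live_bytes: float, f: float = 2.0) -> float:
    """Upper bound on memory kept after a major collection that was
    triggered by exhausting the heap: (2 + F) * live_bytes."""
    return (2 + f) * live_bytes

# 1.5G of live data with the default F = 2 gives a bound of 6G.
print(retained_upper_bound(1.5 * GIB) / GIB)  # 6.0
```

Passing a smaller ``f`` shows how lowering ``-F`` tightens the bound, at the cost of more frequent major collections.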
+
+Without further intervention, once your program has topped out at this high
+threshold, no more memory would be returned to the OS, so memory usage would always remain
+at 4 times the live data. For example, if a server with 1.5G of live data had a memory
+spike up to 6G for a short period, OS reported memory would never dip below 6G. This
+is what happened before GHC 9.2. In GHC 9.2 memory is gradually returned to the OS, so OS memory
+usage returns closer to the theoretical minimums.
+
+The :rts-flag:`-Fd ⟨factor⟩` option controls the rate at which memory is returned to
+the OS. On consecutive major collections which are not triggered by heap overflows, a
+counter (``t``) is incremented and the ``F`` factor is inversely scaled according to the
+value of ``t`` and ``Fd``. The factor is scaled by the equation:
+
+.. math::
+
+ \texttt{F}' = \texttt{F} \times {2^{\frac{-\texttt{t}}{\texttt{Fd}}}}
+
+By default ``Fd = 4``; increasing ``Fd`` decreases the rate at which memory is returned.
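
As a rough illustration of the scaling equation, the following Python sketch evaluates ``F'`` for a few values of ``t``. The function name ``scaled_factor`` is hypothetical, and the sketch assumes ``Fd > 0``; it does not model any special handling the RTS gives to other values.

```python
def scaled_factor(f: float, t: int, fd: float) -> float:
    """F' = F * 2 ** (-t / Fd), assuming Fd > 0."""
    if fd <= 0:
        raise ValueError("this sketch only models Fd > 0")
    return f * 2 ** (-t / fd)

# With the defaults F = 2 and Fd = 4, F' halves every 4 consecutive
# major collections that were not triggered by a heap overflow.
for t in (0, 4, 8, 12):
    print(t, scaled_factor(2.0, t, 4.0))
# 0 2.0
# 4 1.0
# 8 0.5
# 12 0.25
```

In other words, ``Fd`` behaves like a half-life for the ``F`` factor, measured in such collections.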
+
+Major collections which are not triggered by heap overflows arise mainly in two ways.
+
+ 1. Idle collections (controlled by :rts-flag:`-I ⟨seconds⟩`)
+ 2. Explicit trigger using ``performMajorGC``.
+
+For example, idle collections happen by default after 0.3 seconds of inactivity.
+Suppose you are running your application with ``-Iw30``, so that the minimum
+period between idle GCs is 30 seconds, and it does a small amount of work every 5 seconds.
+There will then be about 10 idle collections in about 5 minutes. This number of consecutive
+idle collections will scale the ``F`` factor as follows:
+
+.. math::
+
+ \texttt{F}' = 2 \times {2^{\frac{-10}{4}}} \approx 0.35
+
+and hence we will only retain ``(0.35 + 2) * live_bytes``
+rather than the original 4 times. If you want less frequent idle collections then
+you should also decrease ``Fd`` so that more memory is returned each time
+a collection takes place.
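
The worked example can be reproduced with a short Python sketch. Both function names are hypothetical, and the sketch assumes that exactly one idle collection happens per ``-Iw`` interval and that no collection in the window is triggered by a heap overflow.

```python
def idle_collections(window_seconds: float, min_interval_seconds: float) -> int:
    """Idle collections in a window, assuming one per minimum interval."""
    return int(window_seconds // min_interval_seconds)

def retained_multiple(f: float, t: int, fd: float) -> float:
    """Multiple of live_bytes retained after t such collections:
    2 + F * 2 ** (-t / Fd), assuming Fd > 0."""
    return 2 + f * 2 ** (-t / fd)

t = idle_collections(5 * 60, 30)
print(t)                                       # 10
print(round(retained_multiple(2.0, t, 4.0), 2))  # 2.35 (default Fd = 4)
print(retained_multiple(2.0, t, 2.0))          # 2.0625 (Fd = 2)
```

The last line illustrates the advice about decreasing ``Fd``: for the same number of idle collections, a smaller ``Fd`` brings the retained memory closer to the ``2 * live_bytes`` baseline.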
+
+If you set ``-Fd0`` then GHC will not attempt to return memory, which corresponds
+to the behaviour of releases prior to 9.2. You probably don't want to do this:
+unless your program has idle periods, the behaviour will be similar anyway.
+If you want to retain a specific amount of memory then it's better to set ``-H1G``
+in order to communicate that you are happy with a heap size of ``1G``. If you do this
+then OS reported memory will never decrease below this amount once it has reached this
+threshold.
+
+The collection strategy also affects the fragmentation of the heap and hence how easy
+it is to return memory to a theoretical baseline. Memory is first allocated
+in units of megablocks, which are then further divided into blocks. Block-level
+fragmentation is the amount of unused space within the allocated megablocks.
+In a fragmented heap there will be many megablocks which are only partially full.
+
+In theory the compacting
+strategy has a lower memory baseline, but in practice it can be hard to reach the
+baseline because compacting never defragments. On the other hand, the copying
+collector has a higher theoretical baseline, but we can often get very close to
+it because the act of copying leads to lower fragmentation.
+
+There are some other flags which affect the amount of retained memory as well.
+Setting the maximum heap size using :rts-flag:`-M ⟨size⟩` will make sure we don't try
+to retain more memory than the maximum size, and explicitly setting :rts-flag:`-H [⟨size⟩]`
+will mean that we will always try to retain at least ``H`` bytes irrespective of
+the amount of live data.
diff --git a/docs/users_guide/profiling.rst b/docs/users_guide/profiling.rst
index f5a99c82a4..0aa437a4dc 100644
--- a/docs/users_guide/profiling.rst
+++ b/docs/users_guide/profiling.rst
@@ -746,6 +746,10 @@ You might also want to take a look at
`hp2any <https://www.haskell.org/haskellwiki/Hp2any>`__, a more advanced
suite of tools (not distributed with GHC) for displaying heap profiles.
+Note that there might be a big difference between the OS reported memory usage
+of your program and the amount of live data as reported by heap profiling.
+The reasons for the difference are explained in :ref:`hints-os-memory`.
+
.. _rts-options-heap-prof:
RTS options for heap profiling
diff --git a/docs/users_guide/runtime_control.rst b/docs/users_guide/runtime_control.rst
index 8f8b9b3fcb..9ebc5db7f3 100644
--- a/docs/users_guide/runtime_control.rst
+++ b/docs/users_guide/runtime_control.rst
@@ -730,6 +730,11 @@ performance.
and too small an interval could adversely affect interactive
responsiveness.
+ The idle period timer only resets after some activity
+ by a Haskell thread. If your program is doing literally nothing, then
+ once the first idle collection has been triggered no further collections
+ will be scheduled until more work is performed.
+
This is an experimental feature, please let us know if it causes
problems and/or could benefit from further tuning.
@@ -961,6 +966,8 @@ performance.
calling the ``getRTSStats()`` function from C, or
``GHC.Stats.getRTSStats`` from Haskell.
+
+
.. _rts-options-statistics:
RTS options to produce runtime statistics