author Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> 2022-04-26 22:04:31 +0200
committer Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> 2022-04-28 15:46:44 +0200
commit 6f83ea60e90b18e44cc979834aae2947afa66834
tree 3737967c9eda6e1961ae4d0dbd66727d1227a17d
parent c0a96b1b1d19a06a3828885b10a275c423a5e6f2
man: beef up the description of systemd-oomd.service
The gist of the description is moved from systemd.resource-control to systemd-oomd man page. Cross-references to OOMPolicy, memory.oom.group, oomctl, ManagedOOMSwap and ManagedOOMMemoryPressure are added in all places. The descriptions are also more down-to-earth: instead of talking about "taking action" let's just say "kill". We *might* add configuration for different actions in the future, but we're not there yet, so let's just describe what we do now.
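As a reading aid, the candidate rule that the new systemd-oomd description spells out (only descendants of a unit set to kill are eligible, and among those only leaf cgroups and cgroups with memory.oom.group set to 1) can be sketched in a few lines of Python. This is only an illustration of the wording in the man page, not how systemd-oomd is implemented: the Cgroup class, its field names, and kill_candidates() are hypothetical, and the sketch assumes that cgroups nested below a memory.oom.group=1 cgroup are still listed individually, which the text does not specify.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Cgroup:
    """Hypothetical in-memory model of a cgroup; not a systemd data structure."""
    path: str
    managed_kill: bool = False  # ManagedOOMSwap= or ManagedOOMMemoryPressure= is "kill"
    oom_group: bool = False     # memory.oom.group is set to 1
    children: List["Cgroup"] = field(default_factory=list)


def descendants(cg: Cgroup) -> Iterator[Cgroup]:
    """Yield every cgroup strictly below cg, depth first."""
    for child in cg.children:
        yield child
        yield from descendants(child)


def kill_candidates(monitored: Cgroup) -> List[str]:
    """Paths of descendants that the description calls eligible candidates.

    The monitored cgroup itself is never returned; a descendant qualifies
    if it is a leaf or has memory.oom.group set to 1.
    """
    if not monitored.managed_kill:
        return []
    return [d.path for d in descendants(monitored)
            if not d.children or d.oom_group]


# Example tree (all paths are made up for illustration).
firefox = Cgroup("user.slice/user-1000.slice/user@1000.service/app.slice/firefox.scope")
app = Cgroup("user.slice/user-1000.slice/user@1000.service/app.slice", children=[firefox])
user_service = Cgroup("user.slice/user-1000.slice/user@1000.service",
                      oom_group=True, children=[app])
session = Cgroup("user.slice/user-1000.slice/session-2.scope")
user_1000 = Cgroup("user.slice/user-1000.slice", children=[session, user_service])
user_slice = Cgroup("user.slice", managed_kill=True, children=[user_1000])

for path in kill_candidates(user_slice):
    print(path)
```

In this made-up tree, user.slice is monitored but is not itself a candidate, and the intermediate user-1000.slice and app.slice are skipped because they are neither leaves nor marked with memory.oom.group.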
-rw-r--r-- man/systemd-oomd.service.xml 79
-rw-r--r-- man/systemd.resource-control.xml 30
-rw-r--r-- man/systemd.service.xml 14
3 files changed, 71 insertions, 52 deletions
diff --git a/man/systemd-oomd.service.xml b/man/systemd-oomd.service.xml
index e87a753987..11c9237645 100644
--- a/man/systemd-oomd.service.xml
+++ b/man/systemd-oomd.service.xml
@@ -29,23 +29,36 @@
<refsect1>
<title>Description</title>
- <para><command>systemd-oomd</command> is a system service that uses cgroups-v2 and pressure stall information (PSI)
- to monitor and take action on processes before an OOM occurs in kernel space.</para>
-
- <para>You can enable monitoring and actions on units by setting <varname>ManagedOOMSwap=</varname> and/or
- <varname>ManagedOOMMemoryPressure=</varname> to the appropriate value. <command>systemd-oomd</command> will
- periodically poll enabled units' cgroup data to detect when corrective action needs to occur. When an action needs
- to happen, it will only be performed on the descendant cgroups of the enabled units. More precisely, only cgroups with
- <filename>memory.oom.group</filename> set to <constant>1</constant> and leaf cgroup nodes are eligible candidates.
- Action will be taken recursively on all of the processes under the chosen candidate.</para>
-
- <para>See
- <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
+ <para><command>systemd-oomd</command> is a system service that uses cgroups-v2 and pressure stall
+ information (PSI) to monitor and take corrective action before an OOM occurs in the kernel space.</para>
+
+ <para>You can enable monitoring and actions on units by setting <varname>ManagedOOMSwap=</varname> and
+ <varname>ManagedOOMMemoryPressure=</varname> in the unit configuration, see
+ <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>.
+ <command>systemd-oomd</command> retrieves information about such units from <command>systemd</command>
+ when it starts and watches for subsequent changes.</para>
+
+ <para>Cgroups of units with <varname>ManagedOOMSwap=</varname> or
+ <varname>ManagedOOMMemoryPressure=</varname> set to <option>kill</option> will be monitored.
+ <command>systemd-oomd</command> periodically polls PSI statistics for the system and those cgroups to
+ decide when to take action. If the configured limits are exceeded, <command>systemd-oomd</command> will
+ select a cgroup to terminate, and send <constant>SIGKILL</constant> to all processes in it. Note that
+ only descendant cgroups are eligible candidates for killing; the unit with its property set to
+ <option>kill</option> is not a candidate (unless one of its ancestors set their property to
+ <option>kill</option>). Also only leaf cgroups and cgroups with <filename>memory.oom.group</filename> set
+ to <constant>1</constant> are eligible candidates; see <varname>OOMPolicy=</varname> in
+ <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>.
+ </para>
+
+ <para><citerefentry><refentrytitle>oomctl</refentrytitle><manvolnum>1</manvolnum></citerefentry> can
+ be used to list monitored cgroups and pressure information.</para>
+
+ <para>See <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
for more information about the configuration of this service.</para>
</refsect1>
<refsect1>
- <title>Setup Information</title>
+ <title>System requirements and configuration</title>
<para>The system must be running systemd with a full unified cgroup hierarchy for the expected cgroups-v2 features.
Furthermore, memory accounting must be turned on for all units monitored by <command>systemd-oomd</command>.
@@ -53,23 +66,25 @@
is set to <constant>true</constant> in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
- <para>You will need a kernel compiled with PSI support. This is available in Linux 4.20 and above.</para>
+ <para>The kernel must be compiled with PSI support. This is available in Linux 4.20 and above.</para>
- <para>It is highly recommended for the system to have swap enabled for <command>systemd-oomd</command> to function
- optimally. With swap enabled, the system spends enough time swapping pages to let <command>systemd-oomd</command> react.
- Without swap, the system enters a livelocked state much more quickly and may prevent <command>systemd-oomd</command>
- from responding in a reasonable amount of time. See
- <ulink url="https://chrisdown.name/2018/01/02/in-defence-of-swap.html">"In defence of swap: common misconceptions"</ulink>
- for more details on swap. Any swap-based actions on systems without swap will be ignored. While
- <command>systemd-oomd</command> can perform pressure-based actions on a system without swap, the pressure increases
- will be more abrupt and may require more tuning to get the desired thresholds and behavior.</para>
+ <para>It is highly recommended for the system to have swap enabled for <command>systemd-oomd</command> to
+ function optimally. With swap enabled, the system spends enough time swapping pages to let
+ <command>systemd-oomd</command> react. Without swap, the system enters a livelocked state much more
+ quickly and may prevent <command>systemd-oomd</command> from responding in a reasonable amount of
+ time. See <ulink url="https://chrisdown.name/2018/01/02/in-defence-of-swap.html">"In defence of swap:
+ common misconceptions"</ulink> for more details on swap. Any swap-based actions on systems without swap
+ will be ignored. While <command>systemd-oomd</command> can perform pressure-based actions on such a
+ system, the pressure increases will be more abrupt and may require more tuning to get the desired
+ thresholds and behavior.</para>
<para>Be aware that if you intend to enable monitoring and actions on <filename>user.slice</filename>,
- <filename>user-$UID.slice</filename>, or their ancestor cgroups, it is highly recommended that your programs be
- managed by the systemd user manager to prevent running too many processes under the same session scope (and thus
- avoid a situation where memory intensive tasks trigger <command>systemd-oomd</command> to kill everything under the
- cgroup). If you're using a desktop environment like GNOME, it already spawns many session components with the
- systemd user manager.</para>
+ <filename>user-$UID.slice</filename>, or their ancestor cgroups, it is highly recommended that your
+ programs be managed by the systemd user manager to prevent running too many processes under the same
+ session scope (and thus avoid a situation where memory intensive tasks trigger
+ <command>systemd-oomd</command> to kill everything under the cgroup). If you're using a desktop
+ environment like GNOME or KDE, it already spawns many session components with the systemd user manager.
+ </para>
</refsect1>
<refsect1>
@@ -79,11 +94,11 @@
<filename>-.slice</filename>, and allowing all descendant cgroups to be eligible candidates may make the most
sense.</para>
- <para><varname>ManagedOOMMemoryPressure=</varname> tends to work better on the cgroups below the root slice
- <filename>-.slice</filename>. For units which tend to have processes that are less latency sensitive (e.g.
- <filename>system.slice</filename>), a higher limit like the default of 60% may be acceptable, as those processes
- can usually ride out slowdowns caused by lack of memory without serious consequences. However, something like
- <filename>user@$UID.service</filename> may prefer a much lower value like 40%.</para>
+ <para><varname>ManagedOOMMemoryPressure=</varname> tends to work better on the cgroups below the root
+ slice. For units which tend to have processes that are less latency sensitive (e.g.
+ <filename>system.slice</filename>), a higher limit like the default of 60% may be acceptable, as those
+ processes can usually ride out slowdowns caused by lack of memory without serious consequences. However,
+ something like <filename>user@$UID.service</filename> may prefer a much lower value like 40%.</para>
</refsect1>
<refsect1>
diff --git a/man/systemd.resource-control.xml b/man/systemd.resource-control.xml
index d9edb6ab74..ce03a2f1a6 100644
--- a/man/systemd.resource-control.xml
+++ b/man/systemd.resource-control.xml
@@ -1108,24 +1108,24 @@ DeviceAllow=/dev/loop-control
<citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
will act on this unit's cgroups. Defaults to <option>auto</option>.</para>
- <para>When set to <option>kill</option>, <command>systemd-oomd</command> will actively monitor this unit's
- cgroup metrics to decide whether it needs to act. If the cgroup passes the limits set by
- <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> or its
- overrides, <command>systemd-oomd</command> will send a <constant>SIGKILL</constant> to all of the processes
- under the chosen candidate cgroup. Note that only descendant cgroups can be eligible candidates for killing;
- the unit that set its property to <option>kill</option> is not a candidate (unless one of its ancestors set
- their property to <option>kill</option>). You can find more details on candidates and kill behavior at
+ <para>When set to <option>kill</option>, the unit becomes a candidate for monitoring by
+ <command>systemd-oomd</command>. If the cgroup passes the limits set by
+ <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> or
+ the unit configuration, <command>systemd-oomd</command> will select a descendant cgroup and send
+ <constant>SIGKILL</constant> to all of the processes under it. You can find more details on
+ candidates and kill behavior at
<citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
- and <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>. Setting
- either of these properties to <option>kill</option> will also automatically acquire
+ and
+ <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
+
+ <para>Setting either of these properties to <option>kill</option> will also result in
<varname>After=</varname> and <varname>Wants=</varname> dependencies on
- <filename>systemd-oomd.service</filename> unless <varname>DefaultDependencies=no</varname>.
- </para>
+ <filename>systemd-oomd.service</filename> unless <varname>DefaultDependencies=no</varname>.</para>
- <para>When set to <option>auto</option>, <command>systemd-oomd</command> will not actively use this cgroup's
- data for monitoring and detection. However, if an ancestor cgroup has one of these properties set to
- <option>kill</option>, a unit with <option>auto</option> can still be an eligible candidate for
- <command>systemd-oomd</command> to act on.</para>
+ <para>When set to <option>auto</option>, <command>systemd-oomd</command> will not actively use this
+ cgroup's data for monitoring and detection. However, if an ancestor cgroup has one of these
+ properties set to <option>kill</option>, a unit with <option>auto</option> can still be a candidate
+ for <command>systemd-oomd</command> to terminate.</para>
</listitem>
</varlistentry>
diff --git a/man/systemd.service.xml b/man/systemd.service.xml
index 4e4a9732e4..ad303d440b 100644
--- a/man/systemd.service.xml
+++ b/man/systemd.service.xml
@@ -1130,8 +1130,12 @@
killed by the kernel's OOM killer this is logged but the service continues running. If set to
<constant>stop</constant> the event is logged but the service is terminated cleanly by the service
manager. If set to <constant>kill</constant> and one of the service's processes is killed by the OOM
- killer the kernel is instructed to kill all remaining processes of the service, too. Defaults to the
- setting <varname>DefaultOOMPolicy=</varname> in
+ killer the kernel is instructed to kill all remaining processes of the service too, by setting the
+ <filename>memory.oom.group</filename> attribute to <constant>1</constant>; also see <ulink
+ url="https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html">kernel documentation</ulink>.
+ </para>
+
+ <para>Defaults to the setting <varname>DefaultOOMPolicy=</varname> in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
is set to, except for services where <varname>Delegate=</varname> is turned on, where it defaults to
<constant>continue</constant>.</para>
@@ -1142,9 +1146,9 @@
<citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry> for
details.</para>
- <para>This setting also applies to <command>systemd-oomd</command>, similar to kernel OOM kills
- this setting determines the state of the service after <command>systemd-oomd</command> kills a cgroup associated
- with the service.</para></listitem>
+ <para>This setting also applies to <command>systemd-oomd</command>, similar to the kernel OOM kills
+ this setting determines the state of the service after <command>systemd-oomd</command> kills a cgroup
+ associated with the service.</para></listitem>
</varlistentry>
</variablelist>