Diffstat (limited to 'man/systemd.exec.xml')
-rw-r--r-- man/systemd.exec.xml 152
1 files changed, 117 insertions, 35 deletions
diff --git a/man/systemd.exec.xml b/man/systemd.exec.xml
index 3bd790b485..6419bee499 100644
--- a/man/systemd.exec.xml
+++ b/man/systemd.exec.xml
@@ -1,4 +1,4 @@
-<?xml version='1.0'?> <!--*- Mode: nxml; nxml-child-indent: 2; indent-tabs-mode: nil -*-->
+<?xml version='1.0'?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
@@ -81,6 +81,9 @@
<refsect1>
<title>Paths</title>
+ <para>The following settings may be used to change a service's view of the filesystem. Please note that the paths
+ must be absolute and must not contain a <literal>..</literal> path component.</para>
+
<variablelist class='unit-directives'>
<varlistentry>
@@ -121,7 +124,16 @@
partition table, or a file system within an MBR/MS-DOS or GPT partition table with only a single
Linux-compatible partition, or a set of file systems within a GPT partition table that follows the <ulink
url="https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/">Discoverable Partitions
- Specification</ulink>.</para></listitem>
+ Specification</ulink>.</para>
+
+ <para>When <varname>DevicePolicy=</varname> is set to <literal>closed</literal> or <literal>strict</literal>,
+ or set to <literal>auto</literal> and <varname>DeviceAllow=</varname> is set, then this setting adds
+ <filename>/dev/loop-control</filename> with <constant>rw</constant> mode, <literal>block-loop</literal> and
+ <literal>block-blkext</literal> with <constant>rwm</constant> mode to <varname>DeviceAllow=</varname>. See
+ <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>
+ for details about <varname>DevicePolicy=</varname> and <varname>DeviceAllow=</varname>. Also, see
+ <varname>PrivateDevices=</varname> below, as it may change the setting of <varname>DevicePolicy=</varname>.
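+ As an illustrative rendering of the sentence above (a sketch, not literal output of the service manager), the
+ effect is roughly as if the unit additionally contained:
+ <programlisting>DeviceAllow=/dev/loop-control rw
+DeviceAllow=block-loop rwm
+DeviceAllow=block-blkext rwm</programlisting>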
+ </para></listitem>
</varlistentry>
<varlistentry>
@@ -738,6 +750,20 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
<refsect1>
<title>Sandboxing</title>
+ <para>The following sandboxing options are an effective way to limit the exposure of the system towards the unit's
+ processes. It is recommended to turn on as many of these options for each unit as is possible without negatively
+ affecting the process' ability to operate. Note that many of these sandboxing features are gracefully turned off on
+ systems where the underlying security mechanism is not available. For example, <varname>ProtectSystem=</varname>
+ has no effect if the kernel is built without file system namespacing or if the service manager runs in a container
+ manager that makes file system namespacing unavailable to its payload. Similarly,
+ <varname>RestrictRealtime=</varname> has no effect on systems that lack support for SECCOMP system call filtering,
+ or in containers where support for this is turned off.</para>
+
+ <para>Also note that some sandboxing functionality is generally not available in user services (i.e. services run
+ by the per-user service manager). Specifically, the various settings requiring file system namespacing support
+ (such as <varname>ProtectSystem=</varname>) are not available, as the underlying kernel functionality is only
+ accessible to privileged processes.</para>
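+
+ <para>Example: as a minimal, illustrative sketch (the path <filename>/var/lib/example</filename> is
+ hypothetical), a long-running system service could combine several of the settings described below:
+ <programlisting>[Service]
+ProtectSystem=strict
+ProtectHome=yes
+ReadWritePaths=/var/lib/example
+RestrictRealtime=yes</programlisting>
+ Whether such a combination is appropriate depends on what the service actually needs to access.</para>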
+
<variablelist>
<varlistentry>
@@ -755,9 +781,9 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
recommended to enable this setting for all long-running services, unless they are involved with system updates
or need to modify the operating system in other ways. If this option is used,
<varname>ReadWritePaths=</varname> may be used to exclude specific directories from being made read-only. This
- setting is implied if <varname>DynamicUser=</varname> is set. For this setting the same restrictions regarding
- mount propagation and privileges apply as for <varname>ReadOnlyPaths=</varname> and related calls, see
- below. Defaults to off.</para></listitem>
+ setting is implied if <varname>DynamicUser=</varname> is set. This setting cannot ensure protection in all
+ cases. In general it has the same limitations as <varname>ReadOnlyPaths=</varname>, see below. Defaults to
+ off.</para></listitem>
</varlistentry>
<varlistentry>
@@ -776,11 +802,11 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
<varname>ReadOnlyPaths=</varname>, and <literal>tmpfs</literal> is mostly equivalent to
<varname>TemporaryFileSystem=</varname>.</para>
- <para> It is recommended to enable this setting for all long-running services (in particular network-facing ones),
- to ensure they cannot get access to private user data, unless the services actually require access to the user's
- private data. This setting is implied if <varname>DynamicUser=</varname> is set. For this setting the same
- restrictions regarding mount propagation and privileges apply as for <varname>ReadOnlyPaths=</varname> and related
- calls, see below.</para></listitem>
+ <para> It is recommended to enable this setting for all long-running services (in particular network-facing
+ ones), to ensure they cannot get access to private user data, unless the services actually require access to
+ the user's private data. This setting is implied if <varname>DynamicUser=</varname> is set. This setting cannot
+ ensure protection in all cases. In general it has the same limitations as <varname>ReadOnlyPaths=</varname>,
+ see below.</para></listitem>
</varlistentry>
<varlistentry>
@@ -793,15 +819,18 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
<listitem><para>These options take a whitespace-separated list of directory names. The specified directory
names must be relative, and may not include <literal>..</literal>. If set, one or more
directories by the specified names will be created (including their parents) below the locations
- defined in the following table, when the unit is started.</para>
+ defined in the following table, when the unit is started. Also, the corresponding environment variable
+ is set to the full paths of the directories. If multiple directories are set, the paths are concatenated
+ with a colon (<literal>:</literal>) in the environment variable.</para>
<table>
- <title>Automatic directory creation</title>
- <tgroup cols='3'>
+ <title>Automatic directory creation and environment variables</title>
+ <tgroup cols='4'>
<thead>
<row>
<entry>Locations</entry>
<entry>for system</entry>
<entry>for users</entry>
+ <entry>Environment variable</entry>
</row>
</thead>
<tbody>
@@ -809,26 +838,31 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
<entry><varname>RuntimeDirectory=</varname></entry>
<entry><filename>/run</filename></entry>
<entry><varname>$XDG_RUNTIME_DIR</varname></entry>
+ <entry><varname>$RUNTIME_DIRECTORY</varname></entry>
</row>
<row>
<entry><varname>StateDirectory=</varname></entry>
<entry><filename>/var/lib</filename></entry>
<entry><varname>$XDG_CONFIG_HOME</varname></entry>
+ <entry><varname>$STATE_DIRECTORY</varname></entry>
</row>
<row>
<entry><varname>CacheDirectory=</varname></entry>
<entry><filename>/var/cache</filename></entry>
<entry><varname>$XDG_CACHE_HOME</varname></entry>
+ <entry><varname>$CACHE_DIRECTORY</varname></entry>
</row>
<row>
<entry><varname>LogsDirectory=</varname></entry>
<entry><filename>/var/log</filename></entry>
<entry><varname>$XDG_CONFIG_HOME</varname><filename>/log</filename></entry>
+ <entry><varname>$LOGS_DIRECTORY</varname></entry>
</row>
<row>
<entry><varname>ConfigurationDirectory=</varname></entry>
<entry><filename>/etc</filename></entry>
<entry><varname>$XDG_CONFIG_HOME</varname></entry>
+ <entry><varname>$CONFIGURATION_DIRECTORY</varname></entry>
</row>
</tbody>
</tgroup>
@@ -878,7 +912,13 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
<filename>/run/foo/bar</filename>, and <filename>/run/baz</filename>. The directories
<filename>/run/foo/bar</filename> and <filename>/run/baz</filename> except <filename>/run/foo</filename> are
owned by the user and group specified in <varname>User=</varname> and <varname>Group=</varname>, and removed
- when the service is stopped.</para></listitem>
+ when the service is stopped.</para>
+
+ <para>Example: if a system service unit has the following,
+ <programlisting>RuntimeDirectory=foo/bar
+StateDirectory=aaa/bbb ccc</programlisting>
+ then the environment variable <literal>RUNTIME_DIRECTORY</literal> is set to <literal>/run/foo/bar</literal>, and
+ <literal>STATE_DIRECTORY</literal> is set to <literal>/var/lib/aaa/bbb:/var/lib/ccc</literal>.</para></listitem>
</varlistentry>
<varlistentry>
@@ -934,8 +974,7 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
<varname>BindPaths=</varname>, or <varname>BindReadOnlyPaths=</varname> inside it. For a more flexible option,
see <varname>TemporaryFileSystem=</varname>.</para>
- <para>Note that restricting access with these options does not extend to submounts of a directory that are
- created later on. Non-directory paths may be specified as well. These options may be specified more than once,
+ <para>Non-directory paths may be specified as well. These options may be specified more than once,
in which case all paths listed will have limited access from within the namespace. If the empty string is
assigned to this option, the specific list is reset, and all prior assignments have no effect.</para>
@@ -947,11 +986,19 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting>
<literal>+</literal> on the same path make sure to specify <literal>-</literal> first, and <literal>+</literal>
second.</para>
- <para>Note that using this setting will disconnect propagation of mounts from the service to the host
- (propagation in the opposite direction continues to work). This means that this setting may not be used for
- services which shall be able to install mount points in the main mount namespace. Note that the effect of these
- settings may be undone by privileged processes. In order to set up an effective sandboxed environment for a
- unit it is thus recommended to combine these settings with either
+ <para>Note that these settings will disconnect propagation of mounts from the unit's processes to the
+ host. This means that these settings may not be used for services which shall be able to install mount points in
+ the main mount namespace. For <varname>ReadWritePaths=</varname> and <varname>ReadOnlyPaths=</varname>
+ propagation in the other direction is not affected, i.e. mounts created on the host generally appear in the
+ unit processes' namespace, and mounts removed on the host disappear there too. In particular, note that
+ mount propagation from host to unit will result in unmodified mounts being created in the unit's namespace,
+ i.e. writable mounts appearing on the host will be writable in the unit's namespace too, even when propagated
+ below a path marked with <varname>ReadOnlyPaths=</varname>! Restricting access with these options hence does
+ not extend to submounts of a directory that are created later on. This means the lock-down offered by that
+ setting is not complete, and does not offer full protection.</para>
+
+ <para>Note that the effect of these settings may be undone by privileged processes. In order to set up an
+ effective sandboxed environment for a unit it is thus recommended to combine these settings with either
<varname>CapabilityBoundingSet=~CAP_SYS_ADMIN</varname> or
<varname>SystemCallFilter=~@mount</varname>.</para></listitem>
</varlistentry>
@@ -1043,9 +1090,13 @@ BindReadOnlyPaths=/var/lib/systemd</programlisting>
Defaults to false. It is possible to run two or more units within the same private network namespace by using
the <varname>JoinsNamespaceOf=</varname> directive, see
<citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry> for
- details. Note that this option will disconnect all socket families from the host, this includes AF_NETLINK and
- AF_UNIX. The latter has the effect that AF_UNIX sockets in the abstract socket namespace will become
- unavailable to the processes (however, those located in the file system will continue to be accessible).</para>
+ details. Note that this option will disconnect all socket families from the host, including
+ <constant>AF_NETLINK</constant> and <constant>AF_UNIX</constant>. Effectively, for
+ <constant>AF_NETLINK</constant> this means that device configuration events received from
+ <citerefentry><refentrytitle>systemd-udevd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> are
+ not delivered to the unit's processes. And for <constant>AF_UNIX</constant> this has the effect that
+ <constant>AF_UNIX</constant> sockets in the abstract socket namespace of the host will become unavailable to
+ the unit's processes (however, those located in the file system will continue to be accessible).</para>
<para>Note that the implementation of this setting might be impossible (for example if network namespaces are
not available), and the unit should be written in a way that does not solely rely on this setting for
@@ -1311,10 +1362,7 @@ RestrictNamespaces=~cgroup net</programlisting>
settings (see the discussion in <varname>PrivateMounts=</varname> above) will implicitly disable mount and
unmount propagation from the unit's processes towards the host by changing the propagation setting of all mount
points in the unit's file system namepace to <option>slave</option> first. Setting this option to
- <option>shared</option> does not reestablish propagation in that case. Conversely, if this option is set, but
- no other file system namespace setting is used, then new file system namespaces will be created for the unit's
- processes and this propagation flag will be applied right away to all mounts within it, without the
- intermediary application of <option>slave</option>.</para>
+ <option>shared</option> does not reestablish propagation in that case.</para>
<para>If not set – but file system namespaces are enabled through another file system namespace unit setting –
<option>shared</option> mount propagation is used, but — as mentioned — as <option>slave</option> is applied
@@ -1597,7 +1645,13 @@ SystemCallErrorNumber=EPERM</programlisting>
<para>
See <citerefentry
project='man-pages'><refentrytitle>environ</refentrytitle><manvolnum>7</manvolnum></citerefentry> for details
- about environment variables.</para></listitem>
+ about environment variables.</para>
+
+ <para>Note that environment variables are not suitable for passing secrets (such as passwords, key material, …)
+ to service processes. Environment variables set for a unit are exposed to unprivileged clients via D-Bus IPC,
+ and generally not understood as being data that requires protection. Moreover, environment variables are
+ propagated down the process tree, including across security boundaries (such as setuid/setgid executables), and
+ hence might leak to processes that should not have access to the secret data.</para></listitem>
</varlistentry>
<varlistentry>
@@ -1739,7 +1793,13 @@ SystemCallErrorNumber=EPERM</programlisting>
<citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry> for more
details about named file descriptors and their ordering.</para>
- <para>This setting defaults to <option>null</option>.</para></listitem>
+ <para>This setting defaults to <option>null</option>.</para>
+
+ <para>Note that services which specify <option>DefaultDependencies=no</option> and use
+ <varname>StandardInput=</varname> or <varname>StandardOutput=</varname> with
+ <option>tty</option>/<option>tty-force</option>/<option>tty-fail</option>, should specify
+ <option>After=systemd-vconsole-setup.service</option>, to make sure that the tty initialization is
+ finished before they start.</para></listitem>
</varlistentry>
<varlistentry>
@@ -1749,8 +1809,8 @@ SystemCallErrorNumber=EPERM</programlisting>
of <option>inherit</option>, <option>null</option>, <option>tty</option>, <option>journal</option>,
<option>syslog</option>, <option>kmsg</option>, <option>journal+console</option>,
<option>syslog+console</option>, <option>kmsg+console</option>,
- <option>file:<replaceable>path</replaceable></option>, <option>socket</option> or
- <option>fd:<replaceable>name</replaceable></option>.</para>
+ <option>file:<replaceable>path</replaceable></option>, <option>append:<replaceable>path</replaceable></option>,
+ <option>socket</option> or <option>fd:<replaceable>name</replaceable></option>.</para>
<para><option>inherit</option> duplicates the file descriptor of standard input for standard output.</para>
@@ -1781,11 +1841,17 @@ SystemCallErrorNumber=EPERM</programlisting>
<para>The <option>file:<replaceable>path</replaceable></option> option may be used to connect a specific file
system object to standard output. The semantics are similar to the same option of
- <varname>StandardInput=</varname>, see above. If standard input and output are directed to the same file path,
- it is opened only once, for reading as well as writing and duplicated. This is particular useful when the
- specified path refers to an <constant>AF_UNIX</constant> socket in the file system, as in that case only a
+ <varname>StandardInput=</varname>, see above. If <replaceable>path</replaceable> refers to a regular file
+ on the filesystem, it is opened (created if it doesn't exist yet) for writing at the beginning of the file,
+ but without truncating it.
+ If standard input and output are directed to the same file path, it is opened only once, for reading as well
+ as writing and duplicated. This is particularly useful when the specified path refers to an
+ <constant>AF_UNIX</constant> socket in the file system, as in that case only a
single stream connection is created for both input and output.</para>
+ <para><option>append:<replaceable>path</replaceable></option> is similar to
+ <option>file:<replaceable>path</replaceable></option> above, but it opens the file in append mode.</para>
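+
+ <para>Example: a minimal sketch, assuming a hypothetical log file <filename>/var/log/example.log</filename>,
+ that appends standard output to the end of the file rather than writing from the beginning of it:
+ <programlisting>StandardOutput=append:/var/log/example.log</programlisting></para>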
+
<para><option>socket</option> connects standard output to a socket acquired via socket activation. The
semantics are similar to the same option of <varname>StandardInput=</varname>, see above.</para>
@@ -1906,6 +1972,22 @@ StandardInputData=SWNrIHNpdHplIGRhIHVuJyBlc3NlIEtsb3BzLAp1ZmYgZWVtYWwga2xvcHAncy
</varlistentry>
<varlistentry>
+ <term><varname>LogRateLimitIntervalSec=</varname></term>
+ <term><varname>LogRateLimitBurst=</varname></term>
+
+ <listitem><para>Configures the rate limiting that is applied to messages generated by this unit. If, in the
+ time interval defined by <varname>LogRateLimitIntervalSec=</varname>, more messages than specified in
+ <varname>LogRateLimitBurst=</varname> are logged by a service, all further messages within the interval are
+ dropped until the interval is over. A message about the number of dropped messages is generated. The time
+ specification for <varname>LogRateLimitIntervalSec=</varname> may be specified in the following units: "s",
+ "min", "h", "ms", "us" (see
+ <citerefentry><refentrytitle>systemd.time</refentrytitle><manvolnum>7</manvolnum></citerefentry> for details).
+ The default settings are set by <varname>RateLimitIntervalSec=</varname> and <varname>RateLimitBurst=</varname>
+ configured in <citerefentry><refentrytitle>journald.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.
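+ Example: a minimal sketch with purely illustrative values, under which messages beyond the first 200 logged by
+ the service within a 10 second interval would be dropped:
+ <programlisting>LogRateLimitIntervalSec=10s
+LogRateLimitBurst=200</programlisting>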
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><varname>SyslogIdentifier=</varname></term>
<listitem><para>Sets the process name ("<command>syslog</command> tag") to prefix log lines sent to the logging