| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In production we've had several incidents over the years where a process
has a signal handler registered for SIGHUP or one of the SIGUSR signals
which can be used to signal a request to reload configs, rotate log
files, and the like. While this may seem harmless enough, what we've
seen happen repeatedly is something like the following:
1. A process is using SIGHUP/SIGUSR[12] to request some
application-handled state change -- reloading configs, rotating a log
file, etc;
2. This kind of request is deprecated and removed, so the signal handler
is removed. However, a site where the signal might be sent from is
missed (often logrotate or a service manager);
3. Because the default disposition of these signals is terminal, sooner
or later these applications are going to be sent SIGHUP or similar
and end up unexpectedly killed.
I know for a fact that we're not the only organisation experiencing
this: in general, signal use is pretty tricky to reason about and safely
remove because of the fairly aggressive SIG_DFL behaviour for some
common signals, especially for SIGHUP which has a particularly ambiguous
meaning. Especially in a large, highly interconnected codebase,
reasoning about signal interactions between system configuration and
applications can be highly complex, and it's inevitable that on occasion
a callsite will be missed.
In some cases the right call to avoid this will be to migrate services
towards other forms of IPC for this purpose, but inevitably there will
be some services which must continue using signals, so we need a safe
way to support them.
This patch adds support for the -H/--require-handler flag, which matches
on processes with a userspace handler present for the signal being sent.
With this flag we can enforce that all SIGHUP reload cases and SIGUSR
equivalents use --require-handler. This effectively mitigates the case
we've seen time and time again where SIGHUP is used to rotate log files
or reload configs, but the sending site is mistakenly left present after
the removal of signal handler, resulting in unintended termination of
the process.
Signed-off-by: Chris Down <chris@chrisdown.name>
|
|
|
|
|
|
|
|
| |
Commit c8384e682c1c ("pgrep: add pwait") changed from the old i_am_pkill
logic, but mistakenly missed a break in the pkill case. This results in
showing -e/--echo twice when running `pkill -h'.
Signed-off-by: Chris Down <chris@chrisdown.name>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace AC_CHECK_FUNC by AC_CHECK_FUNCS otherwise HAVE_PIDFD_OPEN will
never be defined resulting in the following build failure if pidfd_open
is available but __NR_pidfd_open is not available:
pgrep.c: In function 'pidfd_open':
pgrep.c:748:17: error: '__NR_pidfd_open' undeclared (first use in this function); did you mean 'pidfd_open'?
748 | return syscall(__NR_pidfd_open, pid, flags);
| ^~~~~~~~~~~~~~~
| pidfd_open
This build failure is raised since the addition of pwait in version
3.3.17 and
https://gitlab.com/procps-ng/procps/-/commit/c8384e682c1cfb3b2dc797e0f8a3cbaaccf7a3da
Fixes:
- http://autobuild.buildroot.org/results/f23a5156e641b2ebdd673973dec0f9c87760c688
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
|
|
|
|
|
|
|
| |
Previously we mistakenly only checked one previous level of the
hierarchy.
Signed-off-by: Chris Down <chris@chrisdown.name>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pgrep and friends naturally filter their own processes from their
matches. The same issue can occur when elevating with tools like sudo or
doas, where the elevating shim layers linger as a parent and are
returned in the results. For example:
% sudo pkill -9 -cf someelevatedcmdline
1
zsh: killed sudo pkill -9 -cf someelevatedcmdline
This is a situation we've actually seen in production, where some poor
soul changes how permission management works (for example with Linux's
hidepid option), needs to elevate a pgrep or pkill call, and now ends up
with more than they bargained for. Even after the issue is noticed,
resolving it requires reinventing some of the pgrep logic, which is
unfortunate.
This commit adds the -A/--ignore-ancestors option which excludes pgrep's
ancestors from the results:
% sudo ./pkill -9 -Acf someelevatedcmdline
0
We looks at multiple layers of the process hierarchy because, while
things like sudo only have one layer of shimming, some mechanisms (like
those found in a typical container manager like those found in Docker or
Kubernetes) may have many more.
Signed-off-by: Chris Down <chris@chrisdown.name>
|
|
|
|
|
|
| |
All the dependent programs needed to have their includes moved too
Signed-off-by: Craig Small <csmall@dropbear.xyz>
|
|
*.c -> src/
ps/* src/ps/
top/* src/top/
Signed-off-by: Craig Small <csmall@dropbear.xyz>
|