Skip to content

Harden signalfd read semantics#7

Merged
jserv merged 1 commit intomainfrom
signalfd
May 5, 2026
Merged

Harden signalfd read semantics#7
jserv merged 1 commit intomainfrom
signalfd

Conversation

@jserv
Copy link
Copy Markdown
Contributor

@jserv jserv commented May 5, 2026

  1. Off-by-one in signal_collect_signalfd / signal_take_signalfd_exact excluded signum == LINUX_NSIG (SIGRTMAX, 64 on aarch64) from the iteration. Bare-musl applications targeting SIGRTMAX directly were silently dropped from signalfd reads even when the signal was present in sig_state.pending. Both loops now use inclusive bounds.
  2. signalfd_read previously took the rt-queue head before writing to the guest, which forced a re-queue on guest_write_small EFAULT. The re-queue path had three latent hazards Codex flagged in review: it could exceed RT_SIGQUEUE_MAX under concurrent same-signal pressure, it issued duplicate signalfd_notify writes that desynced the pipe-byte count from the actual pending-signal count (causing spurious EAGAIN on later blocking reads), and an earlier draft also risked duplicate delivery of records that already reached the guest. Restructured to peek -> write -> take only the prefix that wrote successfully. An EFAULT before any record lands returns -EFAULT with the rt-queue intact (preserving the elfuse promise locked in by tests/test-tier-b's test_signalfd_efault_preserves_pending). A partial fault after N>0 records returns N*sizeof(signalfd_siginfo) bytes and leaves the unwritten entries pending; if a concurrent consumer advanced the rt-queue head between peek and take, the read loop restarts via a retry label so the caller never sees stale records.
  3. Standard signals (1-31) used to fabricate SI_USER/proc_pid/proc_uid defaults at signalfd-read time, dropping the sender pid/uid and any sigval payload that sigqueue() supplied. Linux coalesces standard signals on the pending bitmask but preserves one siginfo for the pending instance. Mirrored that in sig_state via std_info[] + std_info_valid[]; new signal_queue_info() routes both standard and RT queued signals through one path. signal_deliver, signal_queue, signal_queue_rt, and sc_rt_tgsigqueueinfo all read or write through the new path. signal_default_info() consolidates the SI_USER fallback construction.
  4. Wire SYS_rt_sigqueueinfo (138) so glibc and musl sigqueue() works. sc_rt_sigqueueinfo is a thin forwarder to sc_rt_tgsigqueueinfo with tgid == tid == pid. Single-VM divergence: targeting a pid that is not the current guest returns -ESRCH because guest threads of another guest_t are unreachable. sc_rt_tgsigqueueinfo also now surfaces -EFAULT when guest_read_small fails on the siginfo pointer instead of silently queueing a zero-payload signal.

Per-thread blocked mask non-interference is documented inline at signalfd_read in src/syscall/fd.c: signalfd is the standard mechanism for reading signals blocked from synchronous delivery via sigprocmask, so consulting the pthread mask would defeat the purpose.

The hardening test covers RT multiplicity FIFO with distinct si_int payloads, standard-signal coalescing, SIGRTMAX reachability (the regression for the off-by-one), ssi_int / ssi_ptr sigval round-trip, sender ssi_pid / ssi_uid carry-through, signalfd-mask-only filtering, libc sigqueue() smoke, standard-signal sigqueue metadata round-trip, partial-fault preservation via mmap guard page, and rt_sigqueueinfo EFAULT on unreadable siginfo. The partial-fault assertion accepts both Linux's looser "lose the failed record" semantics and elfuse's stricter "preserve every unwritten record" semantics so the test is green under both runners.


Summary by cubic

Hardens signalfd reads to match Linux: includes SIGRTMAX, preserves sender metadata and sigval for standard signals, and switches to write-then-take to avoid lost or duplicated records. Implements SYS_rt_sigqueueinfo so libc sigqueue() works, and adds a focused test suite.

  • Bug Fixes

    • Iterate signals 1..64 inclusive; SIGRTMAX is now delivered via signalfd.
    • signalfd_read now peek → write → take; -EFAULT before any write leaves the queue intact; partial faults return partial bytes; avoids RT queue overflow, duplicate notifications, stale records; retries on races.
    • Surface -EFAULT in rt_tgsigqueueinfo when the siginfo pointer is unreadable; return -ESRCH for pids outside the current guest.
  • New Features

    • Preserve one siginfo for standard signals (1–31): sender pid/uid and sigval survive coalescing; ssi_int/ssi_ptr are populated.
    • Add signal_queue_info() and signal_default_info(); route standard and RT queued signals through one path.
    • Implement sc_rt_sigqueueinfo forwarding; libc sigqueue() works; accepts thread tids and routes to the process, matching Linux.
    • Add test-signalfd-hardening to the matrix: RT FIFO, standard coalescing, SIGRTMAX, sigval round‑trip, sender ids, mask-only filtering, libc sigqueue(), standard-signal metadata, partial‑fault behavior, unreadable‑siginfo EFAULT, foreign‑pid ESRCH, and thread‑tid routing.

Written for commit 3458687. Summary will update on new commits.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 9 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/syscall/syscall.c">

<violation number="1" location="src/syscall/syscall.c:760">
P2: `rt_sigqueueinfo` is incorrectly accepting thread IDs by forwarding `pid` as `tid`; it should be process/thread-group scoped and return `ESRCH` for non-process IDs.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread src/syscall/syscall.c
(void) x3;
(void) x4;
(void) x5;
return sc_rt_tgsigqueueinfo(g, x0, x0, x1, x2, 0, 0, verbose);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: rt_sigqueueinfo is incorrectly accepting thread IDs by forwarding pid as tid; it should be process/thread-group scoped and return ESRCH for non-process IDs.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/syscall/syscall.c, line 760:

<comment>`rt_sigqueueinfo` is incorrectly accepting thread IDs by forwarding `pid` as `tid`; it should be process/thread-group scoped and return `ESRCH` for non-process IDs.</comment>

<file context>
@@ -717,25 +724,42 @@ static int64_t sc_rt_tgsigqueueinfo(guest_t *g,
+    (void) x3;
+    (void) x4;
+    (void) x5;
+    return sc_rt_tgsigqueueinfo(g, x0, x0, x1, x2, 0, 0, verbose);
+}
+
</file context>
Suggested change
return sc_rt_tgsigqueueinfo(g, x0, x0, x1, x2, 0, 0, verbose);
if ((int) x0 != (int) proc_get_pid())
return -LINUX_ESRCH;
return sc_rt_tgsigqueueinfo(g, x0, (uint64_t) proc_get_pid(), x1, x2, 0, 0,
verbose);

1. Off-by-one in signal_collect_signalfd / signal_take_signalfd_exact
   excluded signum == LINUX_NSIG (SIGRTMAX, 64 on aarch64) from the
   iteration. Bare-musl applications targeting SIGRTMAX directly were
   silently dropped from signalfd reads even when the signal was
   present in sig_state.pending. Both loops now use inclusive bounds.
2. signalfd_read previously took the rt-queue head before writing to
   the guest, which forced a re-queue on guest_write_small EFAULT. The
   re-queue path had three latent hazards Codex flagged in review: it
   could exceed RT_SIGQUEUE_MAX under concurrent same-signal pressure,
   it issued duplicate signalfd_notify writes that desynced the
   pipe-byte count from the actual pending-signal count (causing
   spurious EAGAIN on later blocking reads), and an earlier draft also
   risked duplicate delivery of records that already reached the guest.
   Restructured to peek -> write -> take only the prefix that wrote
   successfully. An EFAULT before any record lands returns -EFAULT
   with the rt-queue intact (preserving the elfuse promise locked in
   by tests/test-tier-b's test_signalfd_efault_preserves_pending). A
   partial fault after N>0 records returns N*sizeof(signalfd_siginfo)
   bytes and leaves the unwritten entries pending; if a concurrent
   consumer advanced the rt-queue head between peek and take, the
   read loop restarts via a retry label so the caller never sees
   stale records.
3. Standard signals (1-31) used to fabricate SI_USER/proc_pid/proc_uid
   defaults at signalfd-read time, dropping the sender pid/uid and any
   sigval payload that sigqueue() supplied. Linux coalesces standard
   signals on the pending bitmask but preserves one siginfo for the
   pending instance. Mirrored that in sig_state via std_info[] +
   std_info_valid[]; new signal_queue_info() routes both standard and
   RT queued signals through one path. signal_deliver, signal_queue,
   signal_queue_rt, and sc_rt_tgsigqueueinfo all read or write through
   the new path. signal_default_info() consolidates the SI_USER
   fallback construction.
4. Wire SYS_rt_sigqueueinfo (138) so glibc and musl sigqueue() works.
   sc_rt_sigqueueinfo is a thin forwarder to sc_rt_tgsigqueueinfo with
   tgid == tid == pid. Single-VM divergence: targeting a pid that is
   not the current guest returns -ESRCH because guest threads of
   another guest_t are unreachable. sc_rt_tgsigqueueinfo also now
   surfaces -EFAULT when guest_read_small fails on the siginfo
   pointer instead of silently queueing a zero-payload signal.

Per-thread blocked mask non-interference is documented inline at
signalfd_read in src/syscall/fd.c: signalfd is the standard mechanism
for reading signals blocked from synchronous delivery via sigprocmask,
so consulting the pthread mask would defeat the purpose.

The hardening test covers RT multiplicity FIFO with distinct si_int
payloads, standard-signal coalescing, SIGRTMAX reachability (the
regression for the off-by-one), ssi_int / ssi_ptr sigval round-trip,
sender ssi_pid / ssi_uid carry-through, signalfd-mask-only filtering,
libc sigqueue() smoke, standard-signal sigqueue metadata round-trip,
partial-fault preservation via mmap guard page, and rt_sigqueueinfo
EFAULT on unreadable siginfo. The partial-fault assertion accepts
both Linux's looser "lose the failed record" semantics and elfuse's
stricter "preserve every unwritten record" semantics so the test is
green under both runners.
@jserv jserv merged commit 39b1076 into main May 5, 2026
4 checks passed
@jserv jserv deleted the signalfd branch May 5, 2026 02:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant