Upstream issue draft — maddy: nil-pointer dereference in moduleReload teardown (v0.9.4)
Target repo: https://github.com/foxcpp/maddy
Issue type: Bug report
Status (local): filed in our TODO.md as R-4 — defensive systemd override
landed on the affected host 2026-05-19 (drop-in clears ExecReload=);
upstream report below pending submission to GitHub.
Title
SIGSEGV in moduleReload teardown — daemon panics on config reload (v0.9.4)
Summary
systemctl reload maddy (which sends SIGUSR2 to the maddy process via the
unit file's ExecReload=/bin/kill -USR2 $MAINPID) consistently crashes the
running daemon with a nil-pointer dereference in moduleReload.func3 at
maddy.go:520. The new server starts successfully and binds its listeners
before the panic — the crash is in the teardown of the old server, not in
configuration parsing. systemd then transitions the service to
exit-code/INVALIDARGUMENT (status=2), leaving the host with no mail
service running until systemctl restart maddy is issued.
Reproduces 100% of the time on a Debian-based VPS running maddy 0.9.4
with a typical mail-server configuration (port 25 + Tailscale-bound 587 + LMTP
target + rspamd check).
Environment
- maddy version: 0.9.4 (per journalctl: new server started {"version":"0.9.4"})
- OS: Debian 12 (bookworm), systemd-based
- Configuration shape:
    smtp tcp://0.0.0.0:25 { ... check { dkim spf rspamd } ... }
    smtp tcp://<tailscale-ip>:587 { ... } (submission on a private interface)
    target.lmtp inbound_bridge { targets tcp://127.0.0.1:8025 }
    target.remote outbound_delivery { ... }
    target.queue outbound_queue { ... }
- TLS: Let's Encrypt fullchain + key files
Steps to reproduce
- Start maddy with any configuration that combines an SMTP listener
stanza with target.lmtp and target.remote blocks (we have not
isolated which module triggers the panic).
- Make a trivial edit to /etc/maddy/maddy.conf (we changed only the
top-of-file comment block; no listener or check-block edits).
- Run systemctl reload maddy.
- Observe the panic in journalctl -u maddy.
Observed log
maddy[160492]: signal received (user defined signal 2), reloading configuration
maddy[160492]: reloading server...
systemd[1]: Reloaded maddy.service - maddy mail server.
maddy[160492]: loading new configuration...
maddy[160492]: configuration loaded
maddy[160492]: starting new server
maddy[160492]: smtp: listening on tcp://0.0.0.0:25
maddy[160492]: smtp: listening on tcp://<tailscale-ip>:587
maddy[160492]: new server started {"version":"0.9.4"}
maddy[160492]: stopping old server
maddy[160492]: old server stopped
maddy[160492]: panic: runtime error: invalid memory address or nil pointer dereference
maddy[160492]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x7f6d18708689]
maddy[160492]: goroutine 5230 [running]:
maddy[160492]: github.com/foxcpp/maddy.moduleReload.func3()
maddy[160492]: github.com/foxcpp/maddy/maddy.go:520 +0x109
maddy[160492]: created by github.com/foxcpp/maddy.moduleReload in goroutine 1
maddy[160492]: github.com/foxcpp/maddy/maddy.go:507 +0x296
systemd[1]: maddy.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
systemd[1]: maddy.service: Failed with result 'exit-code'.
Expected
systemctl reload maddy should swap to the new configuration atomically
and keep the daemon running, which is the sequence the "new server
started" / "old server stopped" log lines describe.
Actual
The atomic-swap teardown panics. The new server has already bound its
listeners (and accepted no traffic in the brief window), but the panicking
goroutine takes the whole process down with it.
Workaround we use
Set ExecReload= (empty) in a systemd drop-in to make
systemctl reload maddy fail cleanly with "Job type reload is not
applicable" instead of crashing. Operationally we use systemctl restart maddy for all config changes (including the certbot renewal deploy hook).
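For reference, the workaround amounts to a two-line drop-in. The path and file name below are assumptions (any .conf file under the unit's drop-in directory should behave the same); only the empty ExecReload= assignment is taken from our setup.

```ini
# /etc/systemd/system/maddy.service.d/override.conf  (path and file name assumed)
[Service]
# An empty assignment resets the reload command list inherited from the unit file.
ExecReload=
```

Run systemctl daemon-reload afterwards so systemd picks up the drop-in.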
Code pointer + likely root cause
maddy.go:520 in moduleReload.func3 (the goroutine spawned at :507)
is the panic site. In the v0.9.4 source, line 520 is
oldContainer.DefaultLogger.Out.Close(), inside the async goroutine
that runs immediately after the "old server stopped" log message.
The crash is consistent with Out being nil on the old container's
logger. Most likely moduleStop (or whatever teardown runs before this
goroutine fires) cleared or closed the logger's underlying writer,
leaving Out nil by the time .Close() is called on it.
Suggested guard: nil-check oldContainer.DefaultLogger.Out before
the Close() call, or move the close into the synchronous teardown
path so it can't race with whatever zeroed Out.
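A minimal sketch of the nil-check variant, to show the shape of the guard rather than propose an exact patch. Logger, closeOut and fakeOut are hypothetical stand-ins written for this example; we have not checked them against maddy's actual logger type, and Out is modelled here as a plain io.Closer.

```go
package main

import (
	"fmt"
	"io"
)

// Logger is a hypothetical stand-in for the daemon's logger; only the
// Out field matters for this sketch. The real type may differ.
type Logger struct {
	Out io.Closer
}

// closeOut closes l.Out only when both the logger and its output are
// set, so an output already cleared by earlier teardown is skipped
// instead of triggering a nil-pointer dereference.
func closeOut(l *Logger) error {
	if l == nil || l.Out == nil {
		return nil
	}
	return l.Out.Close()
}

// fakeOut records whether Close was called (illustration only).
type fakeOut struct{ closed bool }

func (f *fakeOut) Close() error {
	f.closed = true
	return nil
}

func main() {
	out := &fakeOut{}
	fmt.Println(closeOut(&Logger{Out: out}), out.closed) // output set: closed normally
	fmt.Println(closeOut(&Logger{}))                     // Out is nil: skipped
	fmt.Println(closeOut(nil))                           // logger itself is nil: skipped
}
```

A check like this only covers the case where Out is already nil when the goroutine runs. If another goroutine can clear Out concurrently, the read is still racy, which is why moving the close into the synchronous teardown path may be the more robust option.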
The SIGUSR2 reload mechanism itself was introduced in #750
(2026-03-25); the bug appears to be in the reload teardown half of
that feature, not in the new-server-start half.
Happy to test a patch or provide a minimal reproducer if helpful.
Why this might be hard to spot in CI
The crash is in teardown of the old server — the new server reports a
healthy start. Anyone testing systemctl reload by observing port
binding would see "yes, listeners are up" before the panic. We caught it
only because we read journalctl -u maddy after the reload and noticed
the daemon had exited.