feat(rpm): replace init-pki.sh with openshell-gateway generate-certs #1258

Draft
TaylorMutch wants to merge 3 commits into main from tmutch/rpm-certgen-cutover


@TaylorMutch
Collaborator

Summary

RPM cutover: the gateway systemd user unit's ExecStartPre now invokes openshell-gateway generate-certs --output-dir %S/openshell/tls instead of the 197-line deploy/rpm/init-pki.sh openssl wrapper. One PKI implementation, one file layout, real test coverage.

Depends on #1257 (which adds the --output-dir mode to generate-certs). Held as draft until that lands.

Changes

  • Spec rewire (openshell.spec):
    • ExecStartPre=/usr/bin/openshell-gateway generate-certs --output-dir %S/openshell/tls (was init-pki.sh %S/openshell/tls).
    • Removed the install -pm 0755 deploy/rpm/init-pki.sh ... line and the matching %files gateway entry.
  • deploy/rpm/init-pki.sh deleted (-197 lines).
  • pki.rs::DEFAULT_SERVER_SANS gains host.containers.internal so podman parity is built in. Docker (host.docker.internal) and Kubernetes (cluster.local DNS) were already covered. The RPM systemd unit needs no extra --server-san flag, and the k8s Helm chart picks up the new SAN automatically.
  • Docs: man page (deploy/man/openshell-gateway.8.md), RPM CONFIGURATION.md, and the comment in init-gateway-env.sh all point at the new entrypoint.
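
The SAN change can be pictured as a defaults-plus-extras merge. The following is a minimal sketch, not the code in pki.rs: the `build_server_sans` signature is an assumption (the PR names only the test `build_server_sans_includes_defaults_and_extras`), and the default list is copied from the `openssl x509` output in the Testing section.

```rust
/// Default SANs, copied from the `openssl x509` output in this PR.
/// The real constant lives in `pki.rs`; its exact type and ordering are assumptions.
const DEFAULT_SERVER_SANS: &[&str] = &[
    "openshell",
    "openshell.openshell.svc",
    "openshell.openshell.svc.cluster.local",
    "localhost",
    "host.docker.internal",
    "host.containers.internal",
    "127.0.0.1",
];

/// Hypothetical helper: defaults first, then caller-supplied extras
/// (for example from `--server-san`), skipping entries already present.
fn build_server_sans(extras: &[&str]) -> Vec<String> {
    let mut sans: Vec<String> = Vec::new();
    for san in DEFAULT_SERVER_SANS.iter().chain(extras.iter()) {
        if !sans.iter().any(|seen| seen.as_str() == *san) {
            sans.push((*san).to_string());
        }
    }
    sans
}
```

Because the merge skips entries that are already present, passing `host.containers.internal` explicitly on an existing deployment would not produce a duplicate SAN under this sketch.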

The output paths, file modes, and CLI auto-discovery copy are byte-for-byte identical to what init-pki.sh produced — every OPENSHELL_TLS_* / OPENSHELL_PODMAN_TLS_* path in the unit stays valid without edits.

Testing

Local binary smoke

$ openshell-gateway generate-certs --output-dir /tmp/test/state/openshell/tls
INFO openshell_server::certgen: PKI files created. dir=/tmp/test/state/openshell/tls

$ ls -la /tmp/test/state/openshell/tls/{ca.crt,ca.key,server,client}/...
-rw-r--r--   ca.crt
-rw-------   ca.key
-rw-r--r--   server/tls.crt
-rw-------   server/tls.key
-rw-r--r--   client/tls.crt
-rw-------   client/tls.key

$ openssl x509 -in tls/server/tls.crt -noout -ext subjectAltName
DNS:openshell, DNS:openshell.openshell.svc, DNS:openshell.openshell.svc.cluster.local,
DNS:localhost, DNS:host.docker.internal, DNS:host.containers.internal, IP Address:127.0.0.1

$ openshell-gateway generate-certs --output-dir /tmp/test/state/openshell/tls
INFO openshell_server::certgen: PKI files already exist, skipping.

Helm cluster regression check

Deleted both Secrets, redeployed via Skaffold, confirmed:

  • Both kubernetes.io/tls Secrets created with 3 keys each, chain verifies via openssl verify.
  • Server cert SANs include the new host.containers.internal alongside the existing 6 — no duplicates.
  • StatefulSet stabilized, no regression.

Pre-commit

  • mise run pre-commit passes (clippy -D warnings, fmt, markdownlint, tests).
  • pki.rs::tests::build_server_sans_includes_defaults_and_extras continues to pass — uses DEFAULT_SERVER_SANS.len(), auto-adapts.

What this PR does not test locally

  • COPR/Packit RPM build. Triggered automatically on PR; will surface any spec-level issues.
  • systemd ExecStartPre execution on a real Fedora host. Plan: install the COPR-built RPM in a Fedora VM (or podman run --systemd=always fedora) and run systemctl --user enable --now openshell-gateway.service, then verify the 6 PEMs land under ~/.local/state/openshell/tls/.

Checklist

@copy-pr-bot

copy-pr-bot Bot commented May 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Introduce `openshell-gateway generate-certs` modeled on envoyproxy/gateway's
certgen pattern. The Helm pre-install/pre-upgrade hook now runs the gateway
image instead of an alpine + openssl shell job — one image to mirror in
air-gapped environments, one PKI implementation, real test coverage.

Reuses `openshell_bootstrap::pki::generate_pki` for CA/server/client cert
generation. Idempotency contract preserved: both Secrets exist → skip; one
exists → fail with `kubectl delete` recovery hint; neither exists → POST
both `kubernetes.io/tls` Secrets.

The server CLI gains optional subcommand support: bare `openshell-gateway`
still runs the gateway, `generate-certs` runs the new path. `--db-url`
moved from clap-required to call-site validated to avoid the clap flatten +
required-field landmine.

Presence of `--output-dir <DIR>` switches the subcommand from Kubernetes
Secret writes to filesystem writes, making the kube flags optional.
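
One way to picture the mode switch and the call-site validation is a small function over already-parsed options. This is a sketch under stated assumptions: `CertTarget`, `select_target`, and the `--namespace` flag name are hypothetical, and the real command parses its flags with clap, which is omitted here to keep the example dependency-free.

```rust
use std::path::PathBuf;

/// Where `generate-certs` writes its output. Variant and field names are hypothetical.
#[derive(Debug, PartialEq)]
enum CertTarget {
    /// `--output-dir <DIR>` was given: write PEM files under DIR.
    Filesystem { dir: PathBuf },
    /// No `--output-dir`: write Kubernetes Secrets into a namespace.
    Kubernetes { namespace: String },
}

/// Validate at the call site instead of marking flags as required in the parser,
/// so each mode only demands the flags it actually uses.
fn select_target(
    output_dir: Option<PathBuf>,
    namespace: Option<String>,
) -> Result<CertTarget, String> {
    match (output_dir, namespace) {
        // Filesystem mode applies whenever --output-dir is present; kube flags are ignored.
        (Some(dir), _) => Ok(CertTarget::Filesystem { dir }),
        (None, Some(namespace)) => Ok(CertTarget::Kubernetes { namespace }),
        // `--namespace` is an assumed flag name used only for this illustration.
        (None, None) => Err("either --output-dir or --namespace is required".to_string()),
    }
}
```

Keeping every option as an `Option` and checking combinations in one place avoids the parser rejecting a valid filesystem-mode invocation because a Kubernetes-only flag is absent.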

Local layout matches `deploy/rpm/init-pki.sh` exactly:
  <dir>/{ca.crt, ca.key, server/tls.{crt,key}, client/tls.{crt,key}}

Stages writes to a sibling `<dir>.certgen.tmp` and renames into place for
atomic per-file installation. Sets 0o700 on directories and 0o600 on key
files. Reuses `openshell_bootstrap::mtls::store_pki_bundle` to populate
the CLI auto-discovery directory at $XDG_CONFIG_HOME/openshell/gateways/
openshell/mtls/, mirroring init-pki.sh's local-CLI UX.
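
The stage-then-rename step can be sketched with the standard library alone. This is an illustration rather than the PR's implementation: `install_staged` and its parameters are hypothetical, it handles a single flat file name, and it is Unix-only because it sets POSIX file modes.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Write `contents` to `staging/<name>` with the given mode, then rename it to
/// `dest/<name>`. A rename within one filesystem is atomic, so readers never
/// observe a half-written file. Name and parameters are hypothetical.
fn install_staged(
    staging: &Path,
    dest: &Path,
    name: &str,
    contents: &[u8],
    mode: u32,
) -> io::Result<()> {
    // Both directories are private to the owner (0o700).
    for dir in [staging, dest] {
        fs::create_dir_all(dir)?;
        fs::set_permissions(dir, fs::Permissions::from_mode(0o700))?;
    }
    let tmp = staging.join(name);
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(mode)
        .open(&tmp)?;
    file.write_all(contents)?;
    file.sync_all()?;
    // The mode passed to open() is filtered by the umask, so set it explicitly.
    fs::set_permissions(&tmp, fs::Permissions::from_mode(mode))?;
    fs::rename(&tmp, dest.join(name))
}
```

Using a sibling of the target directory for staging keeps both paths on the same filesystem, which is what makes the final rename atomic.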

Same idempotency contract as the Kubernetes path: all six files present →
skip (and self-heal the CLI mTLS copy if missing); partial → fail with a
recovery hint; nothing → generate and write.
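
The three-way contract reduces to counting which of the six files exist. A minimal sketch, assuming hypothetical names (`PkiState`, `classify`) and treating "present" as "is a regular file"; the real command may check more than existence:

```rust
use std::path::Path;

/// The six files in the layout described by this PR, relative to the output dir.
const PKI_FILES: [&str; 6] = [
    "ca.crt",
    "ca.key",
    "server/tls.crt",
    "server/tls.key",
    "client/tls.crt",
    "client/tls.key",
];

/// Outcome of inspecting the output directory. Names are hypothetical.
#[derive(Debug, PartialEq)]
enum PkiState {
    /// All six files present: skip generation.
    Complete,
    /// Some but not all present: fail with a recovery hint.
    Partial { missing: Vec<&'static str> },
    /// None present: generate and write.
    Empty,
}

fn classify(dir: &Path) -> PkiState {
    let missing: Vec<&'static str> = PKI_FILES
        .iter()
        .copied()
        .filter(|rel| !dir.join(rel).is_file())
        .collect();
    if missing.is_empty() {
        PkiState::Complete
    } else if missing.len() == PKI_FILES.len() {
        PkiState::Empty
    } else {
        PkiState::Partial { missing }
    }
}
```

Returning the list of missing files in the partial case gives the caller what it needs to print a specific recovery hint instead of a generic failure.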

Sets up the seam for a follow-up PR that swaps init-pki.sh for the Rust
command in the systemd unit. The shell script and unit are untouched here.

Cuts the RPM gateway over to the unified Rust certgen path. The systemd
user unit's first ExecStartPre now invokes:

  /usr/bin/openshell-gateway generate-certs --output-dir %S/openshell/tls

producing the same six-PEM layout init-pki.sh built (ca.{crt,key},
server/tls.{crt,key}, client/tls.{crt,key}) and the same CLI mTLS copy
under $XDG_CONFIG_HOME/openshell/gateways/openshell/mtls/. None of the
OPENSHELL_TLS_* / OPENSHELL_PODMAN_TLS_* paths in the unit change.

Adds host.containers.internal to the gateway's built-in SAN list so
podman containers reaching their host validate cleanly with no
per-deployment --server-san flag. Docker (host.docker.internal) and
Kubernetes (cluster.local DNS) were already covered.

Drops 197 lines of openssl shell, the install/file lines for the script
itself, and updates the docs (man page, RPM CONFIGURATION.md, env-file
generator comment) to point at the new entrypoint. The %S state dir,
unit security hardening, and consumer paths are untouched.
@TaylorMutch TaylorMutch force-pushed the tmutch/rpm-certgen-cutover branch from bea306f to 6c7d354 Compare May 8, 2026 04:36
@copy-pr-bot

copy-pr-bot Bot commented May 8, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.
