30 changes: 30 additions & 0 deletions Cargo.lock

2 changes: 2 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@ tokio = { version = "1.43", features = ["full"] }
# gRPC/Protobuf
tonic = "0.12"
tonic-build = "0.12"
tonic-health = "0.12"
tonic-reflection = "0.12"
prost = "0.13"
prost-types = "0.13"

13 changes: 12 additions & 1 deletion architecture/gateway.md
Expand Up @@ -327,6 +327,17 @@ These complement the unit tests inside `supervisor_session.rs` (registry-only be

## gRPC Services

### Standard health and reflection (infrastructure)

The gateway multiplexes these alongside application services on the same TLS port (`crates/openshell-server/src/multiplex.rs`):

| Service | Path prefix | Notes |
|---------|-------------|--------|
| `grpc.health.v1.Health` | `/grpc.health.v1.Health/` | `Check` reports **process liveness** only (same semantics as legacy `OpenShell/Health` today—not database, driver, or inference readiness). Registered names include `openshell.v1.OpenShell`, `openshell.inference.v1.Inference`, and aggregate probe `service: ""`. `Watch` returns `UNIMPLEMENTED`. |
| `grpc.reflection.v1.ServerReflection` | `/grpc.reflection.v1.ServerReflection/` | Reflection aggregates multiple encoded `FileDescriptorSet`s: OpenShell protos from `openshell-core` (`crates/openshell-core/build.rs`), `grpc.health.v1` from `tonic_health::pb::FILE_DESCRIPTOR_SET`, and `grpc.reflection.v1` from the set `tonic_reflection::server::Builder::build_v1()` registers by default ([reflection proto](https://github.com/grpc/grpc/blob/master/src/proto/grpc/reflection/v1/reflection.proto)). When OIDC is enabled, `/grpc.reflection.*` remains **unauthenticated** by design (see `UNAUTHENTICATED_PREFIXES` in `crates/openshell-server/src/auth/oidc.rs`). |

Prefer **`grpc.health.v1.Health/Check`** for probes and tooling; `openshell.v1.OpenShell/Health` is **deprecated** in `proto/openshell.proto` but retained for backward compatibility.
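
The `Check` semantics described above can be illustrated with a small, std-only toy model. This is a sketch under stated assumptions, not the gateway's or `tonic-health`'s implementation: `HealthRegistry` and `gateway_defaults` are hypothetical names, and the only behavior modeled is that registered names report a status while an unknown name is reported as `NOT_FOUND`, as the standard health protocol specifies.

```rust
use std::collections::HashMap;

/// Mirrors two of the `ServingStatus` values in `grpc.health.v1.HealthCheckResponse`.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ServingStatus {
    Serving,
    NotServing,
}

/// Hypothetical stand-in for the health service's per-name status table.
struct HealthRegistry {
    statuses: HashMap<String, ServingStatus>,
}

impl HealthRegistry {
    /// Registers the names listed in the table above, all reporting `Serving`
    /// (process liveness only; no database, driver, or inference checks).
    fn gateway_defaults() -> Self {
        let names = ["", "openshell.v1.OpenShell", "openshell.inference.v1.Inference"];
        let statuses = names
            .iter()
            .map(|name| (name.to_string(), ServingStatus::Serving))
            .collect();
        Self { statuses }
    }

    /// Models `Check`: a registered name yields its status; an unknown name
    /// yields `None`, which the standard protocol surfaces as `NOT_FOUND`.
    fn check(&self, service: &str) -> Option<ServingStatus> {
        self.statuses.get(service).copied()
    }
}

fn main() {
    let registry = HealthRegistry::gateway_defaults();
    for service in ["", "openshell.v1.OpenShell", "unknown.Service"] {
        println!("{service:?} -> {:?}", registry.check(service));
    }
}
```

The empty service name acts as the aggregate probe, so a client that does not care about a specific logical service can send `service: ""`.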

### OpenShell Service

Defined in `proto/openshell.proto`, implemented in `crates/openshell-server/src/grpc/mod.rs` as `OpenShellService`. Per-concern handlers live in `crates/openshell-server/src/grpc/` submodules.
Expand All @@ -335,7 +346,7 @@ Defined in `proto/openshell.proto`, implemented in `crates/openshell-server/src/

| RPC | Description | Key behavior |
|-----|-------------|--------------|
| `Health` | Returns service status and version | Always returns `HEALTHY` with `CARGO_PKG_VERSION` |
| `Health` | Returns service status and version | **Deprecated** — use `grpc.health.v1.Health/Check`. When called, reflects process liveness (`HEALTHY` / `UNHEALTHY`) and `CARGO_PKG_VERSION` (or git-derived version). |
| `CreateSandbox` | Create a new sandbox | Validates spec and policy, validates provider names exist (fail-fast), persists to store, creates the compute-driver sandbox. On driver failure, rolls back the store record and index entry. |
| `GetSandbox` | Fetch sandbox by name | Looks up by name via `store.get_message_by_name()` |
| `ListSandboxes` | List sandboxes | Paginated (default limit 100), decodes protobuf payloads from store records |
7 changes: 4 additions & 3 deletions architecture/oidc-auth.md
Expand Up @@ -158,10 +158,11 @@ These methods require no authentication at all — health probes and infrastruct

| Method / Prefix | Reason |
|---|---|
| `OpenShell/Health` | Kubernetes liveness/readiness probes |
| `Inference/Health` | Inference service health probes |
| `OpenShell/Health` | Deprecated custom health RPC on `OpenShell` for legacy clients/tooling. Kubernetes **gRPC probes** call `grpc.health.v1.Health/Check` (see `/grpc.health.*` below), not this method |
| `/grpc.reflection.*` | gRPC server reflection (debugging tools) |
| `/grpc.health.*` | gRPC health check protocol |
| `/grpc.health.*` | Standard gRPC health check protocol (covers logical services by name, e.g. `openshell.inference.v1.Inference`; there is no separate `Inference/Health` RPC on `Inference`) |

Note: The `Inference` service in `proto/inference.proto` never defined a `Health` method — only bundle/cluster inference RPCs. Any historical OIDC allowlist entry for `Inference/Health` pointed at a non-existent method.
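
The allowlist above can be read as one exact-match list plus one prefix list. The following is a minimal sketch of that check, assuming exact matching for methods and `starts_with` matching for prefixes. The two constants mirror `crates/openshell-server/src/auth/oidc.rs` as changed in this PR; the helper `is_unauthenticated` is a hypothetical name, and the real middleware may structure the check differently.

```rust
/// Exact method paths that skip authentication.
const UNAUTHENTICATED_METHODS: &[&str] = &["/openshell.v1.OpenShell/Health"];

/// Path prefixes that skip authentication (gRPC reflection, health probes).
const UNAUTHENTICATED_PREFIXES: &[&str] = &["/grpc.reflection.", "/grpc.health."];

/// Hypothetical helper: returns true when a request path needs no credentials.
/// Assumes exact match on methods and prefix match on prefixes.
fn is_unauthenticated(path: &str) -> bool {
    UNAUTHENTICATED_METHODS.iter().any(|method| *method == path)
        || UNAUTHENTICATED_PREFIXES
            .iter()
            .any(|prefix| path.starts_with(prefix))
}

fn main() {
    let paths = [
        "/grpc.health.v1.Health/Check",
        "/grpc.reflection.v1.ServerReflection/ServerReflectionInfo",
        "/openshell.v1.OpenShell/Health",
        "/openshell.inference.v1.Inference/Health",
        "/openshell.v1.OpenShell/CreateSandbox",
    ];
    for path in paths {
        println!("{path}: unauthenticated = {}", is_unauthenticated(path));
    }
}
```

Under these assumptions the removed `Inference/Health` entry no longer matches anything, which is harmless because that method never existed on the `Inference` service.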

### Sandbox-Secret Authenticated

2 changes: 2 additions & 0 deletions crates/openshell-cli/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,8 @@ bundled-z3 = ["openshell-prover/bundled-z3"]
dev-settings = ["openshell-core/dev-settings"]

[dev-dependencies]
openshell-server = { path = "../openshell-server" }
tonic-health = { workspace = true }
futures = { workspace = true }
rcgen = { version = "0.13", features = ["crypto", "pem"] }
reqwest = { workspace = true }
5 changes: 4 additions & 1 deletion crates/openshell-cli/src/run.rs
Expand Up @@ -696,7 +696,10 @@ fn is_progress_status(status: &str) -> bool {
}

/// Show gateway status.
#[allow(clippy::branches_sharing_code)]
///
/// Still calls legacy `OpenShell/Health` for version text; migrate to
/// `grpc.health.v1.Health/Check` when a generated client is wired here.
#[allow(clippy::branches_sharing_code, deprecated)]
pub async fn gateway_status(gateway_name: &str, server: &str, tls: &TlsOptions) -> Result<()> {
println!("{}", "Server Status".cyan().bold());
println!();
26 changes: 20 additions & 6 deletions crates/openshell-cli/tests/mtls_integration.rs
@@ -1,7 +1,7 @@
// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0

use openshell_cli::tls::{TlsOptions, grpc_client};
use openshell_cli::tls::{TlsOptions, build_channel};
use openshell_core::proto::{
CreateProviderRequest, CreateSshSessionRequest, CreateSshSessionResponse,
DeleteProviderRequest, DeleteProviderResponse, ExecSandboxEvent, ExecSandboxRequest,
Expand All @@ -10,6 +10,7 @@ use openshell_core::proto::{
UpdateProviderRequest,
open_shell_server::{OpenShell, OpenShellServer},
};
use openshell_server::{GatewayStandardHealth, MAX_GRPC_DECODE_SIZE, OPENSHELL_SERVICE_NAME};
use rcgen::{
BasicConstraints, Certificate, CertificateParams, ExtendedKeyUsagePurpose, IsCa, KeyPair,
};
Expand All @@ -21,6 +22,9 @@ use tonic::{
Response, Status,
transport::{Certificate as TlsCertificate, Identity, Server, ServerTlsConfig},
};
use tonic_health::pb::{
HealthCheckRequest, health_check_response::ServingStatus, health_client::HealthClient,
};

struct EnvVarGuard {
key: &'static str,
Expand Down Expand Up @@ -407,7 +411,10 @@ async fn run_server(
Server::builder()
.tls_config(tls)
.unwrap()
.add_service(OpenShellServer::new(TestOpenShell))
.add_service(GatewayStandardHealth::server(MAX_GRPC_DECODE_SIZE))
.add_service(
OpenShellServer::new(TestOpenShell).max_decoding_message_size(MAX_GRPC_DECODE_SIZE),
)
.serve_with_incoming(incoming)
.await
.unwrap();
Expand Down Expand Up @@ -437,9 +444,16 @@ async fn cli_connects_with_client_cert() {

let tls = TlsOptions::new(Some(ca_path), Some(cert_path), Some(key_path));
let endpoint = format!("https://localhost:{}", addr.port());
let mut client = grpc_client(&endpoint, &tls).await.unwrap();
let response = client.health(HealthRequest {}).await.unwrap();
assert_eq!(response.get_ref().status, ServiceStatus::Healthy as i32);
let channel = build_channel(&endpoint, &tls).await.unwrap();
let mut health = HealthClient::new(channel);
let response = health
.check(HealthCheckRequest {
service: OPENSHELL_SERVICE_NAME.to_string(),
})
.await
.unwrap()
.into_inner();
assert_eq!(response.status, ServingStatus::Serving as i32);
}

#[tokio::test]
Expand All @@ -461,6 +475,6 @@ async fn cli_requires_client_cert_for_https() {

let tls = TlsOptions::new(Some(ca_path), None, None);
let endpoint = format!("https://localhost:{}", addr.port());
let result = grpc_client(&endpoint, &tls).await;
let result = build_channel(&endpoint, &tls).await;
assert!(result.is_err());
}
5 changes: 5 additions & 0 deletions crates/openshell-core/build.rs
Expand Up @@ -40,10 +40,15 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
collect_proto_files(&proto_root, &mut proto_files)?;
proto_files.sort();

// Serialized FileDescriptorSet for gRPC server reflection on the gateway.
let out_dir = PathBuf::from(env::var("OUT_DIR")?);
let descriptor_path = out_dir.join("openshell_file_descriptor_set.bin");

// Configure tonic-build
tonic_build::configure()
.build_server(true)
.build_client(true)
.file_descriptor_set_path(&descriptor_path)
.compile_protos(&proto_files, &[proto_root.as_path()])?;

Ok(())
15 changes: 15 additions & 0 deletions crates/openshell-core/src/proto/descriptor_set.rs
@@ -0,0 +1,15 @@
// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0

//! Embedded protobuf `FileDescriptorSet` for gRPC server reflection.
//!
//! This blob covers **`OpenShell`** protos only. The gateway reflection service also registers
//! `grpc.health.v1` and `grpc.reflection.v1` using the embedded sets exported by the
//! `tonic-health` and `tonic-reflection` crates (`tonic_health::pb::FILE_DESCRIPTOR_SET` and the
//! set `tonic_reflection::server::Builder::build_v1()` adds by default).

/// Serialized `FileDescriptorSet` covering `OpenShell` gateway protos (see `build.rs`).
pub const FILE_DESCRIPTOR_SET: &[u8] = include_bytes!(concat!(
env!("OUT_DIR"),
"/openshell_file_descriptor_set.bin"
));
4 changes: 4 additions & 0 deletions crates/openshell-core/src/proto/mod.rs
Expand Up @@ -5,6 +5,10 @@
//!
//! This module re-exports the generated protobuf types and service definitions.

mod descriptor_set;

pub use descriptor_set::FILE_DESCRIPTOR_SET;

#[allow(
clippy::all,
clippy::pedantic,
2 changes: 2 additions & 0 deletions crates/openshell-server/Cargo.toml
Expand Up @@ -29,6 +29,8 @@ tokio = { workspace = true }

# gRPC
tonic = { workspace = true, features = ["channel", "tls"] }
tonic-health = { workspace = true }
tonic-reflection = { workspace = true }
prost = { workspace = true }
prost-types = { workspace = true }

5 changes: 1 addition & 4 deletions crates/openshell-server/src/auth/oidc.rs
Expand Up @@ -31,10 +31,7 @@ pub const INTERNAL_AUTH_SOURCE_HEADER: &str = "x-openshell-auth-source";
pub const AUTH_SOURCE_SANDBOX_SECRET: &str = "sandbox-secret";

/// Truly unauthenticated methods — health probes and infrastructure.
const UNAUTHENTICATED_METHODS: &[&str] = &[
"/openshell.v1.OpenShell/Health",
"/openshell.inference.v1.Inference/Health",
];
const UNAUTHENTICATED_METHODS: &[&str] = &["/openshell.v1.OpenShell/Health"];

/// Path prefixes that bypass OIDC validation (gRPC reflection, health probes).
const UNAUTHENTICATED_PREFIXES: &[&str] = &["/grpc.reflection.", "/grpc.health."];