Skip to content

test: cover push scheduling in client integration tests#1685

Open
abhinavgautam01 wants to merge 4 commits into
apache:mainfrom
abhinavgautam01:issue-1463-push-scheduling-integration-tests
Open

test: cover push scheduling in client integration tests#1685
abhinavgautam01 wants to merge 4 commits into
apache:mainfrom
abhinavgautam01:issue-1463-push-scheduling-integration-tests

Conversation

@abhinavgautam01
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Closes #1463.

Rationale for this change

Issue #1463 asked for integration tests to cover push-staged scheduling as well as pull-staged. The in-process test cluster was effectively pull-only: the standalone scheduler used a fixed identity while listening on a random port, and the standalone executor always used the pull poll_loop. That gap meant we did not exercise the same scheduler/executor path as the default push policy in real deployments.
In addition, exposing push mode in tests surfaced a real correctness issue: push executors report task status via gRPC clients keyed by the scheduler’s advertised host:port; a hard-coded placeholder did not match the actual listener, which could cause hangs or failed status updates.

What changes are included in this PR?

  • Scheduler (standalone): bind 127.0.0.1:0 first, then build the in-memory cluster / SchedulerServer with scheduler_endpoint = addr.to_string() so task metadata and curator IDs match the real gRPC listener.
  • Executor (standalone): add push-staged support using the existing executor_server::startup path; keep pull-staged behavior on poll_loop as before.
  • executor_server::startup: wait until the executor gRPC address accepts a TCP connection before register_executor, reducing a race with asynchronously started listeners.
  • Client integration tests: helpers and rstest cases for push (remote_push, remote_state_push); context_setup remote tests parameterized over TaskSchedulingPolicy; align test URLs with 127.0.0.1 for consistency with the binder.
  • Examples test common: same 127.0.0.1 host alignment for cluster URLs.
  • Minor: cast ephemeral grpc_port to u32 for protobuf registration fields.
    Tests: cargo test -p ballista --features standalone (with protoc available).

Are there any user-facing changes?

Yes, but small and additive. New public APIs were added on the scheduler and executor standalone modules (policy-aware constructors / _with_scheduling_policy helpers). Existing entry points retain prior behavior where applicable (e.g. default standalone scheduler helpers still target pull-staged integration defaults as before).

No changes to CLI flags or documented user workflows beyond optional use of these APIs by embedders/tests.

- Add standalone scheduler/executor APIs for TaskSchedulingPolicy.
- Run push-staged executor path in tests; fix scheduler endpoint and gRPC readiness.
- Extend rstest fixtures and context_setup with push cases.
@abhinavgautam01 abhinavgautam01 force-pushed the issue-1463-push-scheduling-integration-tests branch from 1f1ec15 to 3a70d60 Compare May 11, 2026 15:50
Comment thread ballista/executor/src/standalone.rs Outdated
Comment thread ballista/executor/src/standalone.rs Outdated
Comment thread ballista/executor/src/standalone.rs Outdated
Comment thread ballista/executor/src/standalone.rs Outdated
Comment thread ballista/executor/src/standalone.rs Outdated
Comment thread ballista/executor/src/standalone.rs
Comment thread ballista/executor/src/executor_server.rs Outdated
Comment thread ballista/executor/src/standalone.rs Outdated
Comment thread ballista/executor/src/executor_server.rs Outdated
Comment thread ballista/client/tests/context_checks.rs Outdated
@killzoner
Copy link
Copy Markdown
Contributor

You added standalone push scheduler, but this is not covered by tests ATM ?

@abhinavgautam01
Copy link
Copy Markdown
Contributor Author

You added standalone push scheduler, but this is not covered by tests ATM ?

added coverage in context_setup.rs: standalone_push_scheduling_executes_sql starts the standalone scheduler/executor with push scheduling and verifies a real SQL query against it...

@abhinavgautam01 abhinavgautam01 requested a review from killzoner May 18, 2026 13:14
@killzoner
Copy link
Copy Markdown
Contributor

You added standalone push scheduler, but this is not covered by tests ATM ?

added coverage in context_setup.rs: standalone_push_scheduling_executes_sql starts the standalone scheduler/executor with push scheduling and verifies a real SQL query against it...

Sorry, not quite what I meant. I meant having 0 change on the test cases themselves, and just splitting what you did to

#[fixture]
   async fn remote(
       #[values(TaskSchedulingPolicy::PullStaged, TaskSchedulingPolicy::PushStaged)]
       policy: TaskSchedulingPolicy,
   ) -> SessionContext {
       common::remote_context_with_scheduling(policy).await
   }

and the same for standalone.
This way you keep the 2 remote vs standalone, do 0 change on the existing code, and cover pull / push for both paths.

Also now you have diverging code style with ballista/client/tests/context_unsupported.rs

@abhinavgautam01
Copy link
Copy Markdown
Contributor Author

updated... i removed the standalone one-off test and changed the context tests to keep the two context cases (standalone and remote) while expanding TaskSchedulingPolicy::{PullStaged, PushStaged} through #[values(...)]...

i also aligned context_unsupported.rs to use the same pattern... the exact #[fixture] async fn remote(#[values(...)] ...) syntax does not compile with the repo’s rstest 0.26, so this uses the supported equivalent with context builders plus policy values...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve integration tests to test push scheduler mode as well

3 participants