feat(cli): add --service-tier flag and --no-wait default for execute #189
spillai wants to merge 2 commits into
…mands Expose the service_tier option (standard/flex/priority) across all three CLI entry points so users can control delivery tier and pricing from the command line. Also adds service_tier to AgentExecutionOrCreationConfig so the execute path can carry the value through to the API.
Remove 'auto' and 'standard' from the accepted service_tier values across GenerationConfig, AgentExecutionOrCreationConfig, CLI flags, and tests to match the currently supported tiers.
Code Review
This pull request introduces the service_tier option across the chat, execute, and generate CLI commands, allowing users to specify 'default', 'flex', or 'priority' delivery tiers. Corresponding updates were made to the client types and validation logic, and the default behavior for the execute command was changed to not wait for completion. Feedback indicates that the help text for the --service-tier option in several files incorrectly references a 'standard' tier, which should be updated to 'default' to align with the implementation.
    service_tier: Optional[str] = typer.Option(
        None,
        "--service-tier",
        help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",

The help text refers to 'standard' as a service tier, but this value is not included in AVAILABLE_SERVICE_TIERS and has been removed from the supported tiers in the client types. It should be updated to 'default' to match the implementation and avoid user confusion.

Suggested change:

    - help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",
    + help="Delivery tier: default, flex (50%% discount), or priority (1.8x premium).",
    service_tier: Optional[str] = typer.Option(
        None,
        "--service-tier",
        help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",

🟡 CLI help text lists "standard" as a valid service tier but the actual valid tier is "default"

The --service-tier help string in chat.py:590 says "Delivery tier: standard, flex ..." but AVAILABLE_SERVICE_TIERS at chat.py:150 and the Literal type in types.py:607 define the valid tiers as ["default", "flex", "priority"]. A user who follows the help text and passes --service-tier standard will get an error: "Invalid service tier 'standard'". The same bug exists in execute.py:312 and generate.py:178.

Suggested change:

    - help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",
    + help="Delivery tier: default, flex (50%% discount), or priority (1.8x premium).",
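The failure mode described in this comment can be reproduced with a minimal sketch. The helper name validate_service_tier and the exact wording after the quoted tier are hypothetical; only AVAILABLE_SERVICE_TIERS and the "Invalid service tier" prefix come from the review itself, and the real check in chat.py may be structured differently.

```python
from typing import Optional

# Assumed to mirror the constant the review cites (chat.py:150).
AVAILABLE_SERVICE_TIERS = ["default", "flex", "priority"]


def validate_service_tier(service_tier: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return the tier unchanged, or raise on unknown values."""
    if service_tier is None:
        # Flag omitted on the command line, so there is nothing to validate.
        return None
    if service_tier not in AVAILABLE_SERVICE_TIERS:
        raise ValueError(
            f"Invalid service tier '{service_tier}'. "
            f"Choose one of: {', '.join(AVAILABLE_SERVICE_TIERS)}"
        )
    return service_tier


if __name__ == "__main__":
    print(validate_service_tier("flex"))
    try:
        # The value the current help text advertises is rejected.
        validate_service_tier("standard")
    except ValueError as exc:
        print(exc)
```

Under these assumptions, the value advertised by the help text is exactly the one the check rejects, which is why the help string rather than the validation should change.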
    service_tier: Optional[str] = typer.Option(
        None,
        "--service-tier",
        help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",

🟡 CLI help text lists "standard" as a valid service tier but the actual valid tier is "default"

Same issue as in chat.py: the --service-tier help string in execute.py:312 says "standard" but the valid tier is "default" (per AVAILABLE_SERVICE_TIERS at execute.py:38).

Suggested change:

    - help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",
    + help="Delivery tier: default, flex (50%% discount), or priority (1.8x premium).",
    service_tier: Optional[str] = typer.Option(
        None,
        "--service-tier",
        help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",

🟡 CLI help text lists "standard" as a valid service tier but the actual valid tier is "default"

Same issue as in chat.py: the --service-tier help string in generate.py:178 says "standard" but the valid tier is "default" (per AVAILABLE_SERVICE_TIERS at generate.py:33).

Suggested change:

    - help="Delivery tier: standard, flex (50%% discount), or priority (1.8x premium).",
    + help="Delivery tier: default, flex (50%% discount), or priority (1.8x premium).",
    "Delivery tier: 'default' (baseline), 'flex' (50%% discount, higher latency), "
    "or 'priority' (1.8x premium)."

🟡 Double percent %% in Pydantic Field descriptions renders as literal %% instead of %

The description strings in both AgentExecutionOrCreationConfig.service_tier (types.py:245) and GenerationConfig.service_tier (types.py:610) use 50%% which is a Click/typer escaping convention. However, Pydantic Field(description=...) strings are plain Python strings: %% is not processed as an escape and will render literally as "50%% discount" instead of "50% discount" in generated JSON schemas and documentation.

Suggested change:

    - "Delivery tier: 'default' (baseline), 'flex' (50%% discount, higher latency), "
    - "or 'priority' (1.8x premium)."
    + "Delivery tier: 'default' (baseline), 'flex' (50% discount, higher latency), "
    + "or 'priority' (1.8x premium)."
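The point about plain strings can be checked without Pydantic. The sketch below uses an ordinary Python string as a stand-in for the value passed to description=; the variable names are illustrative only. It shows that %% survives untouched unless the string is explicitly run through printf-style formatting.

```python
# Stand-in for the text passed to a description= argument (plain Python string).
description = "'flex' (50%% discount, higher latency)"

# A plain string literal has no % escape processing, so both characters remain.
assert "%%" in description

# Only printf-style formatting collapses %% into a single %.
formatted = description % ()
assert formatted == "'flex' (50% discount, higher latency)"

if __name__ == "__main__":
    print(description)
    print(formatted)
```

Since nothing in a schema or documentation generator is expected to apply the % operator to a description, writing a single % in the Field description is the safer form.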
Summary
- Add --service-tier CLI flag to the execute, generate, and chat commands, supporting the 3 tiers: default, flex (50% discount, higher latency), and priority (1.8x premium)
- Add service_tier field to AgentExecutionOrCreationConfig so the execute path carries the value through to the API
- Narrow GenerationConfig.service_tier from 5 values to the 3 currently supported: default, flex, priority
- Change the execute default from --wait to --no-wait (fire-and-forget by default)

Test plan
- vlmrun execute --help shows --service-tier and --wait/--no-wait with correct defaults
- vlmrun generate --help shows --service-tier
- vlmrun chat --help shows --service-tier
- pytest -sv tests/test_predictions.py to confirm service_tier tests pass
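The flag behavior listed in the summary can be sketched with the standard library. This is not the PR's typer code: it swaps in argparse so the example runs without third-party packages, and the program name and help wording are assumptions. It shows a paired --wait/--no-wait option that defaults to not waiting, plus a --service-tier option restricted to the three supported values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser illustrating the execute flags (argparse, not typer)."""
    parser = argparse.ArgumentParser(prog="execute-sketch")
    # BooleanOptionalAction (Python 3.9+) registers both --wait and --no-wait.
    parser.add_argument(
        "--wait",
        action=argparse.BooleanOptionalAction,
        default=False,  # fire-and-forget unless --wait is passed
        help="Wait for the execution to complete before returning.",
    )
    parser.add_argument(
        "--service-tier",
        choices=["default", "flex", "priority"],
        default=None,
        # argparse applies %-formatting to help strings, so % is written as %%.
        help="Delivery tier: default, flex (50%% discount), or priority (1.8x premium).",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--service-tier", "flex"])
    print(args.wait, args.service_tier)
```

Restricting the option with choices means an unsupported value such as standard is rejected at parse time, before any request is built.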