Python: Support OpenAI and Gemini allowed_tools tool choice #5322

giles17 wants to merge 6 commits into microsoft:main from

Conversation
Add `allowed_tools` field to the `ToolMode` TypedDict, enabling users to restrict which tools the model may call via the OpenAI `allowed_tools` tool_choice type. This preserves prompt caching by keeping all tools in the tools list while limiting which ones the model can invoke.

- Add `allowed_tools: list[str]` to the `ToolMode` TypedDict
- Add validation in `validate_tool_mode()` (only valid when `mode == "auto"`)
- Convert to OpenAI API format in `_prepare_options()`
- Add tests for validation and API payload generation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
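The shape this commit describes can be sketched as a minimal stand-alone version. This is illustrative only: the framework's actual `ToolMode` and `validate_tool_mode` carry more fields and checks than shown here.

```python
from typing import TypedDict


class ToolMode(TypedDict, total=False):
    mode: str                   # "auto", "required", or "none"
    required_function_name: str
    allowed_tools: list[str]    # only valid together with mode == "auto"


def validate_tool_mode(tool_choice: ToolMode) -> ToolMode:
    # Mirrors the validation rule in the bullet list above:
    # allowed_tools is rejected unless mode == "auto".
    if "allowed_tools" in tool_choice and tool_choice.get("mode") != "auto":
        raise ValueError("allowed_tools is only valid when mode == 'auto'")
    return tool_choice
```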
giles17
left a comment
Automated Code Review
Reviewers: 4 | Confidence: 93%
✓ Correctness
This PR contains only cosmetic/formatting changes: a single blank line added after the ToolMode class in _types.py, and several multi-line expressions collapsed into single lines in test_hyperlight_codeact.py. There are no logic changes and no correctness issues. The allowed_tools feature referenced in the issue context is already fully implemented in the codebase (ToolMode TypedDict, validate_tool_mode, and OpenAI client conversion).
✓ Security Reliability
This PR contains only cosmetic changes: an extra blank line added in _types.py (line 3157) and reformatting of multi-line expressions into single lines in test_hyperlight_codeact.py. There are no functional, security, or reliability changes. The allowed_tools field referenced in context lines already existed prior to this diff.
✓ Test Coverage
The PR adds `allowed_tools` support to `ToolMode` with good test coverage for the core validation and OpenAI Responses API client conversion. Tests cover valid single/multiple tools, invalid mode combinations, and regression for plain auto mode. Two test coverage gaps are notable: (1) no test for an empty `allowed_tools` list (`[]`), which passes validation and produces a likely-invalid API payload `{"type": "allowed_tools", "tools": []}`, and (2) the Chat Completions client (`_chat_completion_client.py` lines 665-666) silently drops `allowed_tools` by falling through to `run_options["tool_choice"] = mode` (i.e., just `"auto"`), but there is no test documenting this behavior or warning the user.
✗ Design Approach
The diff itself is trivial: a blank line added to `_types.py` and cosmetic test reformatting. No logic is changed. However, the `allowed_tools` field that this PR exposes in `ToolMode` is not fully wired up: `_chat_completion_client.py` (lines 655–666) never checks for `allowed_tools` and silently falls through to emitting plain `tool_choice: "auto"`, making the feature a no-op for users of that client. `_chat_client.py` (lines 1218–1224) handles it correctly, creating an inconsistency between the two clients.
Flagged Issues
- `_chat_completion_client.py` `_prepare_options` (lines 655–666): the `allowed_tools` branch is missing. When `mode == "auto"` and `allowed_tools` is set, the code falls through to `run_options["tool_choice"] = mode`, silently discarding the list and emitting plain `"auto"`. `_chat_client.py` lines 1218–1224 show the correct pattern to mirror. Without this fix the feature is non-functional for the Chat Completions client.
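A minimal sketch of what the missing branch could look like. The function name and the bare-name `tools` entries here are simplifications for illustration: the real client mutates `run_options` rather than returning a value, and the exact per-tool wire shape may differ.

```python
def prepare_tool_choice(tool_mode: dict) -> "str | dict":
    mode = tool_mode.get("mode", "auto")
    allowed = tool_mode.get("allowed_tools")
    if mode == "auto" and allowed is not None:
        # The branch the review says is missing: emit the allowed_tools
        # wire shape instead of discarding the list.
        return {"type": "allowed_tools", "mode": "auto", "tools": list(allowed)}
    # Previous behavior: plain "auto" (or whatever mode was set).
    return mode
```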
Suggestions
- Add a test for `validate_tool_mode({"mode": "auto", "allowed_tools": []})`: an empty list passes validation today but would produce `{"type": "allowed_tools", "tools": []}` at the API level. Consider whether validation should reject it, and add a test either way to document the expected behavior.
- Add a test in `test_openai_chat_completion_client.py` covering `tool_choice={"mode": "auto", "allowed_tools": ["fn"]}` to lock in the expected API payload (or to document that `allowed_tools` is silently dropped), analogous to `test_prepare_options_allowed_tools` in `test_openai_chat_client.py`. If `allowed_tools` is intentionally unsupported in the Chat Completions client, consider raising a warning so users don't silently lose the restriction.
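The suggested empty-list test could look like this sketch. `validate_tool_mode` is stubbed locally so the example is self-contained; the real test would import it from the framework instead.

```python
def validate_tool_mode(tool_choice: dict) -> dict:
    # Local stub mirroring the behavior the review describes: an empty
    # allowed_tools list currently passes validation.
    if "allowed_tools" in tool_choice and tool_choice.get("mode") != "auto":
        raise ValueError("allowed_tools requires mode == 'auto'")
    return tool_choice


def test_empty_allowed_tools_list_passes_validation():
    # Documents today's behavior; change this assertion if validation is
    # later tightened to reject empty lists.
    result = validate_tool_mode({"mode": "auto", "allowed_tools": []})
    assert result["allowed_tools"] == []
```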
Automated review by giles17's agents
Pull request overview
Adds Python SDK support for OpenAI/Azure OpenAI tool_choice.type="allowed_tools" so callers can restrict tool invocation without removing tools from the prompt/tool list.
Changes:
- Extend core `ToolMode` to include optional `allowed_tools` (only valid with `mode="auto"`) and update validation.
- Update OpenAI chat client option preparation to translate `allowed_tools` into the OpenAI wire format, with accompanying unit tests.
- Adjust samples to suppress pyright optional-dependency import errors for `orjson`.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| python/packages/core/agent_framework/_types.py | Adds allowed_tools to ToolMode and extends validate_tool_mode constraints. |
| python/packages/core/tests/core/test_types.py | Adds unit tests for ToolMode.allowed_tools and validation behavior. |
| python/packages/openai/agent_framework_openai/_chat_client.py | Maps ToolMode.allowed_tools into OpenAI tool_choice “allowed_tools” payload. |
| python/packages/openai/tests/openai/test_openai_chat_client.py | Adds tests ensuring _prepare_options emits correct OpenAI tool_choice format. |
| python/samples/02-agents/conversations/file_history_provider.py | Adds pyright ignore for optional orjson import. |
| python/samples/02-agents/conversations/file_history_provider_conversation_persistence.py | Adds pyright ignore for optional orjson import. |
| python/packages/hyperlight/tests/hyperlight/test_hyperlight_codeact.py | Minor test formatting adjustments. |
…ions client support

- `validate_tool_mode` now checks `allowed_tools` is a non-string sequence of strings and normalizes to `list[str]`, raising `ContentError` for invalid types
- Add missing `allowed_tools` branch in `_chat_completion_client._prepare_options` so `allowed_tools` is emitted as the OpenAI allowed_tools wire format instead of being silently dropped
- Add tests for invalid `allowed_tools` types (string, int, mixed), empty list, tuple normalization, and Chat Completions client payload generation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
giles17
left a comment
Automated Code Review
Reviewers: 4 | Confidence: 91%
✓ Correctness
The diff adds support for `allowed_tools` in `ToolMode`, following the same pattern as the existing `required_function_name` field. The validation logic in `_types.py` correctly checks type constraints (non-string sequence of strings), normalizes tuples to lists, and gates the field to `mode == 'auto'`. Both the Chat Completion client and the Responses API client correctly convert the validated `allowed_tools` into the OpenAI API format. The walrus operator chain in the client's if/elif branches is correct: `mode` is assigned even when the first condition short-circuits. Tests cover the key cases including invalid types, empty lists, tuple normalization, and single/multiple tool names. No correctness issues found.
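The walrus-chain property noted above can be seen in a minimal stand-alone sketch (names are illustrative, not the client's actual code): `mode` is bound inside the first condition even when that branch is not taken, so the `elif` can safely reuse it.

```python
def prepare(tool_mode: dict) -> "str | dict":
    # The walrus assignment binds `mode` regardless of whether the
    # "required" branch is taken.
    if (mode := tool_mode.get("mode", "auto")) == "required":
        return {"type": "function", "name": tool_mode.get("required_function_name")}
    elif mode == "auto" and (allowed := tool_mode.get("allowed_tools")) is not None:
        # `is not None` (rather than truthiness) lets an empty list through.
        return {"type": "allowed_tools", "mode": mode, "tools": list(allowed)}
    return mode
```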
✓ Security Reliability
The implementation is clean and follows the established patterns for ToolMode validation and client conversion. Input validation is thorough (type-checking allowed_tools as a non-string sequence of strings), and the conversion to OpenAI API format is correct. The validation function properly prevents conflicting fields (e.g., both required_function_name and allowed_tools). No security or reliability issues found.
✓ Test Coverage
The new `allowed_tools` feature has solid test coverage for validation logic (type checks, normalization, invalid mode combinations) and basic client payload generation (single and multiple tools). Two minor gaps: (1) no client-level test for an empty `allowed_tools` list, which the validation explicitly permits and would produce `"tools": []` in the payload; (2) no regression test verifying that `{"mode": "auto"}` without `allowed_tools` still falls through to produce `tool_choice = "auto"` (though this is indirectly covered by the existing parametrized test at line 1627). Overall the coverage is good and assertions are meaningful: each test verifies specific structural properties of the output rather than just asserting no exception.
✓ Design Approach
The PR adds `allowed_tools` support to `ToolMode` following the same pattern as `required_function_name`: extend the TypedDict, validate centrally in `validate_tool_mode`, and convert to the provider-specific API format in the client. The implementation is consistent with the existing framework design at every layer. No fundamental design problems found. One minor observation: when `allowed_tools` is already a `list` (the common case), `validate_tool_mode` returns the original dict object unchanged (final `return tool_choice`), while a tuple input returns a newly constructed dict. This asymmetry is harmless and matches the existing behavior for `required_function_name`, but is worth being aware of. There are no missing cases in the validation logic, and the `is not None` guard in the client correctly passes empty lists through to the API.
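The list/tuple asymmetry can be illustrated with a stand-alone sketch (hypothetical helper, not the framework's actual function):

```python
def normalize_allowed_tools(tool_choice: dict) -> dict:
    allowed = tool_choice.get("allowed_tools")
    if allowed is None or isinstance(allowed, list):
        # Common case: the original dict object comes back unchanged.
        return tool_choice
    # Tuple (or other sequence) input: a new dict is constructed
    # with the sequence normalized to a list.
    return {**tool_choice, "allowed_tools": list(allowed)}
```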
Suggestions
- Add a client-level test for an empty `allowed_tools` list (e.g., `{"mode": "auto", "allowed_tools": []}`) to verify `_prepare_options` produces `{"type": "allowed_tools", "mode": "auto", "tools": []}` rather than falling through to `run_options["tool_choice"] = mode`. Validation tests confirm the empty list is accepted, but no client test exercises the resulting payload shape.
- Consider adding a regression test verifying that `{"mode": "auto"}` (without `allowed_tools`) still produces `tool_choice = "auto"` through `_prepare_options`, since the new `elif` branch could theoretically interfere if the walrus operator condition were wrong. The existing parametrized test at line 1627 covers `"auto"` as a string but not `{"mode": "auto"}` as a dict without `allowed_tools`.
Automated review by giles17's agents
Thanks for looking into this, just to add that https://github.com/openai/openai-python/blob/main/src/openai/types/responses/tool_choice_allowed.py
OpenAI's allowed_tools tool_choice type supports both mode 'auto' and 'required'. Update validation, client conversion, and tests to allow both modes instead of restricting to 'auto' only. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
eavanvalkenburg
left a comment
some additional work tbd, but removing the request changes
giles17
left a comment
Automated Code Review
Reviewers: 4 | Confidence: 89%
✗ Correctness
The PR adds `allowed_tools` support to the core `ToolMode` TypedDict and implements provider-specific handling across OpenAI, Gemini, Anthropic, Bedrock, and Ollama. The core validation logic is thorough (mutual exclusion with `required_function_name`, mode gating, type checks, normalization). The OpenAI clients correctly map to the `allowed_tools` API payload. Provider warnings are properly placed. There is one correctness bug in the Gemini client: the new `allowed_tools` override block (lines 838-840) can set `allowed_names` to an empty list `[]`, which the pre-existing truthiness check on line 843 (`if allowed_names:`) treats as falsy, causing the empty list to be silently dropped while the mode is still set to `ANY`. This means `allowed_tools: []` on Gemini produces "model must call at least one function from the full set", the semantic opposite of the user's intent ("no tools callable"). The core validation and tests explicitly support empty lists as valid.
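A minimal illustration of the truthiness bug described above, with hypothetical helper names: guarding with `if allowed_names:` drops an empty list, while an identity check preserves it.

```python
def build_config(allowed_names: "list[str] | None") -> dict:
    config: dict = {"mode": "ANY"}
    if allowed_names:  # buggy guard: [] is falsy, so the restriction is dropped
        config["allowed_function_names"] = allowed_names
    return config


def build_config_fixed(allowed_names: "list[str] | None") -> dict:
    config: dict = {"mode": "ANY"}
    if allowed_names is not None:  # identity check keeps the empty list
        config["allowed_function_names"] = allowed_names
    return config
```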
✗ Security Reliability
The `allowed_tools` feature is well-implemented overall with solid input validation in the core layer. One reliability issue: the Gemini client's interaction between the new `allowed_tools` code and the existing truthiness check on `allowed_names` causes incorrect behavior when `allowed_tools` is an empty list. The mode is silently changed to `ANY` (requiring a function call) without restricting to any names, producing the opposite of the caller's intent. The empty list is explicitly allowed by the validation layer (tested in `test_types.py` line 1178) and handled correctly by both OpenAI clients, making this a Gemini-specific regression. All other provider changes (warnings in Anthropic/Bedrock/Ollama, OpenAI conversion logic, core validation) look correct and well-tested.
✓ Test Coverage
Test coverage for the `allowed_tools` feature is generally solid: core validation has thorough tests (happy paths, edge cases, type coercion, mutual exclusion), both OpenAI clients have good coverage, and all unsupported providers have warning tests. There is one gap worth noting: the Gemini client silently degrades `allowed_tools: []` (empty list) into `ANY` mode with no function name filtering (equivalent to "required": call any tool), because the `if allowed_names:` guard on line 843 is falsy for `[]`. This is a behavioral inconsistency with the OpenAI clients, which faithfully pass through empty lists. A test for this edge case would help document the intended behavior. The Responses API client tests (`test_openai_chat_client.py`) also lack an empty `allowed_tools` test, unlike the Chat Completions client tests, which do have one.
✗ Design Approach
The implementation is largely well-structured and consistent with the framework's provider-extension pattern. One genuine design issue: the Gemini client silently promotes `mode='auto'` to `ANY` (i.e., required/forced tool use) when `allowed_tools` is present, because Gemini's API only supports `allowedFunctionNames` with `ANY` mode. This is a silent semantic change: the user requested optional tool use but gets mandatory tool use, with no warning to the caller. All other providers either honour the mode value as-is (OpenAI, Azure) or log a warning that the feature is unsupported (Anthropic, Bedrock, Ollama). The Gemini case is worse than unsupported: it partially honours the feature but changes the contract without telling the user.
Automated review by giles17's agents
    # allowed_tools overrides: Gemini requires ANY mode for allowedFunctionNames
    if "allowed_tools" in tool_mode:
        allowed_names = list(tool_mode["allowed_tools"])
        function_calling_mode = types.FunctionCallingConfigMode.ANY
mode='auto' is silently promoted to ANY (forced tool call) here because Gemini's API only accepts allowedFunctionNames with ANY mode. This contradicts the user's intent: auto means the model may optionally call tools, while ANY means it must. Every other provider either honours the requested mode or emits a warning. This should log a warning so callers are not surprised by the changed behaviour.
Suggested change:

    allowed_names = list(tool_mode["allowed_tools"])
    if tool_mode.get("mode") == "auto":
        logger.warning(
            "Gemini does not support allowedFunctionNames with AUTO mode; "
            "promoting to ANY (required) mode to honour allowed_tools"
        )
    function_calling_mode = types.FunctionCallingConfigMode.ANY
Changing to `ANY` is still an issue, because `ANY` basically maps to required (see https://ai.google.dev/gemini-api/docs/function-calling?example=meeting#function_calling_modes). I think that for Gemini, `allowed_tools` is not the way; it is `required_function_name` that should be combined with `ANY`. There is no real equivalence, except maybe `VALIDATED`; we could look at that.
I did notice this and I agree. However, VALIDATED was added to google-genai in v1.32.0, and we currently pin >=1.0.0. So this would require bumping the minimum to >=1.32.0 in pyproject.toml. Is that okay?
yeah, let's go for that, thanks
Force-pushed from f7ca2fd to 0a28079
… providers

- Use `FunctionCallingConfigMode.VALIDATED` instead of `ANY` when `allowed_tools` is set with auto mode in Gemini, preserving optional tool-call semantics.
- Handle `allowed_tools` in required mode with `required_function_name` precedence.
- Fix the `allowed_names` guard to use an identity check (`is not None`) so empty lists are preserved.
- Bump the google-genai minimum to >=1.32.0 (`VALIDATED` added in that version).
- Add warnings in Anthropic and Bedrock when `allowed_tools` is set but not supported.
- Add Gemini unit tests for `allowed_tools` with auto, required, empty list, and `required_function_name` precedence scenarios.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
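The behavior this commit describes can be sketched stand-alone. The enum values follow google-genai's `FunctionCallingConfigMode` names, but the function below and its exact precedence rules are an assumption reconstructed from the commit message, not the client's actual code.

```python
from enum import Enum


class Mode(Enum):
    AUTO = "AUTO"
    ANY = "ANY"
    VALIDATED = "VALIDATED"  # added to google-genai in v1.32.0


def gemini_tool_config(tool_mode: dict) -> dict:
    allowed = tool_mode.get("allowed_tools")
    if tool_mode.get("mode") == "required":
        # required_function_name takes precedence over allowed_tools.
        if "required_function_name" in tool_mode:
            names = [tool_mode["required_function_name"]]
        else:
            names = allowed
        cfg: dict = {"mode": Mode.ANY}
        if names is not None:  # identity check so [] is preserved
            cfg["allowed_function_names"] = list(names)
        return cfg
    if allowed is not None:
        # VALIDATED restricts the tool set while keeping the call optional.
        return {"mode": Mode.VALIDATED, "allowed_function_names": list(allowed)}
    return {"mode": Mode.AUTO}
```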
Motivation and Context
OpenAI and Azure OpenAI support an `allowed_tools` tool choice type that lets callers restrict which tools the model may invoke without removing tools from the prompt, preserving prompt caching benefits. The Agent Framework had no way to express this constraint.

Fixes #5309
Description
The `ToolMode` TypedDict gains an optional `allowed_tools: list[str]` field, validated to only be used with `mode="auto"`. The OpenAI chat client's `_prepare_options` translates this into the wire format (`{"type": "allowed_tools", "mode": "auto", "tools": [...]}`) expected by the OpenAI API. Additionally, `finish_reason` is now propagated through `AgentResponse` and `AgentResponseUpdate` so callers can inspect why the model stopped generating, and Pydantic-based tool models (used by providers like Gemini) are properly serialized in `_tools_to_dict`.

Contribution Checklist