Expose popular agent CLIs as a small OpenAI-compatible HTTP API (/v1/*).
Works great as a local gateway (localhost) or behind a reverse proxy.
Think of it as LiteLLM for agent CLIs: you point existing OpenAI SDKs/tools at base_url, and choose a backend by model.
Supported backends:
- OpenAI Codex - defaults to backend `/responses` for vision and image generation (DALL-E / gpt-image-class output); falls back to `codex exec`
- Cursor Agent - via `cursor-agent` CLI
- Claude Code - via CLI or direct API (auto-detects `~/.claude/settings.json` config)
- Gemini - via CLI or CloudCode direct (set `GEMINI_USE_CLOUDCODE_API=1`)
Why this exists:
- Many tools/SDKs only speak the OpenAI API (`/v1/chat/completions`) - this lets you plug agent CLIs into that ecosystem.
- One gateway, multiple CLIs: pick a backend by `model` (with optional prefixes like `cursor:` / `claude:` / `gemini:`).
- Expose your ChatGPT Plus / Pro subscription's image generation as an HTTP API. No `OPENAI_API_KEY` required: the gateway reuses the OAuth token from `codex login`, lets you call `image_generation` via plain chat completions, and returns the PNG inline (data URI). See Image generation (ChatGPT subscription).
- Requirements
- Install
- Run (No `.env` Needed)
- Core Configuration
- API
- Image generation (ChatGPT subscription)
- OpenAI SDK examples
- Security notes
- Logging & Performance Diagnosis
- Performance notes (important)
- Advanced setup (optional)
- Keywords (SEO)
- Python 3.10+ (tested on 3.13)
- Install and authenticate the CLI(s) you want to use (`codex`, `cursor-agent`, `claude`, `gemini`)
uv sync
# or, with plain pip:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Pick a provider and start the gateway:
uv run agent-cli-to-api codex
uv run agent-cli-to-api gemini
uv run agent-cli-to-api claude
uv run agent-cli-to-api cursor-agent
uv run agent-cli-to-api doctor

By default `agent-cli-to-api` does NOT load `.env` implicitly.
Optional auth:
CODEX_GATEWAY_TOKEN=devtoken uv run agent-cli-to-api codex

Custom bind host/port:
uv run agent-cli-to-api codex --host 127.0.0.1 --port 8000

Log request curl commands (optional):
uv run agent-cli-to-api codex curl
# or
uv run agent-cli-to-api codex --log-curl

Notes:
- If `CODEX_WORKSPACE` is unset, the gateway creates an empty temp workspace under `/tmp` (so you don't need to configure a repo path).
- When you start with a fixed provider (e.g. `... gemini`), the client-sent `model` string is accepted but ignored by default (gateway uses the provider's default model).
- Each provider still requires its own local CLI login state (no API key is required for Codex / Gemini CloudCode / Claude OAuth).
- Claude auto-detects `~/.claude/settings.json` and uses direct API mode if `ANTHROPIC_AUTH_TOKEN` and `ANTHROPIC_BASE_URL` are configured.
- `uv run agent-cli-to-api cursor-agent` defaults to Cursor Auto routing (`CURSOR_AGENT_MODEL=auto`). If you want faster responses, run with `--preset cursor-fast`.
- When running in an interactive terminal (TTY), the gateway enables colored logs and Markdown rendering by default. To disable: `CODEX_RICH_LOGS=0` or `CODEX_LOG_RENDER_MARKDOWN=0`.
Quick smoke test (optional):
# In another terminal, run:
# uv run agent-cli-to-api codex
# Then:
BASE_URL=http://127.0.0.1:8000/v1 ./scripts/smoke.sh
# If you enabled auth:
TOKEN=devtoken BASE_URL=http://127.0.0.1:8000/v1 ./scripts/smoke.sh

export CODEX_PRESET=codex-fast
uv run agent-cli-to-api codex

Supported presets:
- `codex-fast`
- `autoglm-phone`
- `cursor-auto`
- `cursor-fast` (Cursor model pinned for speed)
- `gemini-cloudcode` (defaults to `gemini-3-flash-preview`)
- `claude-oauth`
Use `CODEX_PROVIDER=auto` and select providers per-request by prefixing `model`:
- Codex: `"gpt-5.5"`
- Cursor: `"cursor:<model>"`
- Claude: `"claude:<model>"`
- Gemini: `"gemini:<model>"`
- Web search is enabled by default for the Codex backend API (`CODEX_ENABLE_SEARCH=1`). The gateway adds the native Responses `web_search` tool to Codex `/responses` requests.
- Set `CODEX_CODEX_ALLOW_TOOLS=0` to disable Codex backend tool calls (default: enabled).
- OpenAI `tools` / `tool_choice` are mapped for Codex backend, Claude OAuth, and Gemini CloudCode (best-effort).
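To make the prefix-based provider selection concrete, here is a minimal sketch of how a `model` string could be split into a provider and a model name. The helper `split_model` and the `KNOWN_PREFIXES` tuple are hypothetical names used for illustration; the gateway's actual routing code may work differently.

```python
# Hypothetical helper: illustrates the "prefix:model" convention only.
# The gateway's real routing logic may differ.
KNOWN_PREFIXES = ("cursor", "claude", "gemini")


def split_model(model: str, default_provider: str = "codex") -> tuple[str, str]:
    """Split "cursor:<model>" into ("cursor", "<model>").

    Names without a known prefix are routed to the default provider unchanged.
    """
    prefix, sep, rest = model.partition(":")
    if sep and prefix in KNOWN_PREFIXES and rest:
        return prefix, rest
    return default_provider, model


for name in ("gpt-5.5", "cursor:auto", "claude:claude-sonnet-4-6"):
    print(name, "->", split_model(name))
```

Treating unknown prefixes as part of the model name (rather than raising) is a design choice made for this sketch, so that unprefixed Codex model names keep working.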
The gateway auto-detects your Claude CLI configuration from ~/.claude/settings.json:
# If you have Claude CLI configured with a custom API endpoint (e.g. Xiaomi MiMo, Tencent Hunyuan, etc.)
# Just run - no extra config needed:
uv run agent-cli-to-api claude

The gateway will automatically:
- Read `ANTHROPIC_AUTH_TOKEN` and `ANTHROPIC_BASE_URL` from `~/.claude/settings.json`
- Use direct HTTP API calls (fast, ~0ms gateway overhead)
- Log timing breakdown: `auth_ms`, `prepare_ms`, `api_latency_ms`
Alternative: Claude OAuth (Anthropic official):
uv run python -m codex_gateway.claude_oauth_login
CLAUDE_USE_OAUTH_API=1 uv run agent-cli-to-api claude

uvx --from git+https://github.com/leeguooooo/agent-cli-to-api agent-cli-to-api codex

CODEX_GATEWAY_TOKEN=devtoken uv run agent-cli-to-api codex
cloudflared tunnel --url http://127.0.0.1:8000

For advanced env vars, see `.env.example` and `codex_gateway/config.py`.
- `GET /healthz`
- `GET /debug/config` (effective runtime config; requires auth if `CODEX_GATEWAY_TOKEN` is set)
- `GET /v1/models`
- `POST /v1/embeddings` (proxies to OpenAI embeddings; requires `OPENAI_API_KEY` or `~/.codex/auth.json` with `OPENAI_API_KEY`)
- `POST /v1/chat/completions` (supports `stream`)
- `POST /v1/messages` (Anthropic Messages-compatible; supports `stream`)
- `POST /v1/messages/count_tokens` (Anthropic-compatible; currently heuristic token counting)
Tip: any OpenAI SDK that supports base_url should work by pointing it at this server.
Tip: Claude Code can point ANTHROPIC_BASE_URL at this server and use ANTHROPIC_AUTH_TOKEN for gateway auth.
Auth note: include Authorization: Bearer <token> only when you set CODEX_GATEWAY_TOKEN on the gateway.
curl -s http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-d '{
"model":"gpt-5.5",
"messages":[{"role":"user","content":"Summarize the structure of this repository"}],
"reasoning": {"effort":"low"},
"stream": false
}'

curl -s http://127.0.0.1:8000/v1/embeddings \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-d '{
"model":"text-embedding-3-small",
"input":"hello world"
}'

curl -N http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-H "X-Codex-Session-Id: 0f3d5b6f-2a3b-4d78-9f50-123456789abc" \
-d '{
"model":"gpt-5-codex",
"messages":[{"role":"user","content":"Explain the purpose of this project in one sentence"}],
"stream": true
}'

curl -s http://127.0.0.1:8000/v1/messages \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model":"claude-sonnet-4-6",
"max_tokens": 256,
"messages":[
{"role":"user","content":"Explain what this project does in one sentence"}
]
}'

curl -s http://127.0.0.1:8000/v1/messages/count_tokens \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model":"claude-sonnet-4-6",
"messages":[
{"role":"user","content":"hello"}
]
}'

When `CODEX_LOG_MODE=full` (or `CODEX_LOG_EVENTS=1`), the gateway logs `image[0] ext=... bytes=...` and `decoded_images=N` so you can confirm images are being received/decoded.
python - <<'PY' > /tmp/payload.json
import base64, json
img_b64 = base64.b64encode(open("screenshot.png","rb").read()).decode()
print(json.dumps({
"model": "gpt-5-codex",
"stream": False,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "Read the text in the image and output only the text itself"},
{"type": "image_url", "image_url": {"url": "data:image/png;base64," + img_b64}},
],
}],
}))
PY
curl -s http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-d @/tmp/payload.json

PDF input uses OpenAI-style `type: "file"` parts:
python - <<'PY' > /tmp/pdf-payload.json
import base64, json
pdf_b64 = base64.b64encode(open("label.pdf","rb").read()).decode()
print(json.dumps({
"model": "gpt-5.5",
"stream": False,
"messages": [{
"role": "user",
"content": [
{"type": "file", "file": {"filename": "label.pdf", "file_data": pdf_b64}},
{"type": "text", "text": "Check these rules and summarize the key constraints."},
],
}],
}))
PY
curl -s http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-d @/tmp/pdf-payload.json

TL;DR: turn your ChatGPT Plus / Pro / Team subscription into an OpenAI-compatible image-generation HTTP API. No `OPENAI_API_KEY`, no per-image billing on top of your subscription, no separate `/v1/images/generations` upstream. Just call `/v1/chat/completions` and the gateway hands you back a PNG.
The Codex CLI's built-in image_gen capability is implemented as a native Responses API tool ({"type": "image_generation"}) hosted on ChatGPT's internal backend-api/codex endpoint — and your ~/.codex/auth.json OAuth token is what authorises it. This gateway:
- Reuses that OAuth token (no API key needed).
- Injects `{"type": "image_generation"}` into the `tools` array on every chat completion request when `CODEX_ENABLE_IMAGE_GEN=1`. Default is OFF so plain-text completions don't get the tool silently attached.
- Streams the upstream Responses events, intercepts the `image_generation_call` output items, and embeds the resulting base64 PNG into the assistant message content as a markdown data URI (an image link whose target starts with `data:image/png;base64,`).
- Returns a standard OpenAI Chat Completion response, so any client that understands the OpenAI SDK gets the image for free.
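Because the image arrives as a data URI inside the message content, a client has to pull it out before saving it. The following is a minimal sketch of that step, using the same regular expression as the examples in this README; `extract_images` and `DATA_URI_RE` are hypothetical names, and the sample string is synthetic rather than real gateway output.

```python
import base64
import re

# Matches data URIs such as data:image/png;base64,AAAA... inside message content.
DATA_URI_RE = re.compile(r"data:image/(\w+);base64,([A-Za-z0-9+/=]+)")


def extract_images(content: str) -> list[tuple[str, bytes]]:
    """Return (extension, decoded bytes) for every inline image.

    Returns an empty list when the reply contains no image, so callers can
    handle text-only responses without an exception.
    """
    return [(ext, base64.b64decode(b64)) for ext, b64 in DATA_URI_RE.findall(content)]


# Synthetic content standing in for resp.choices[0].message.content
fake_png = base64.b64encode(b"\x89PNG fake").decode()
sample = f"Here you go: ![img](data:image/png;base64,{fake_png})"

for index, (ext, data) in enumerate(extract_images(sample)):
    print(index, ext, len(data), "bytes")
    # To save: open(f"image-{index}.{ext}", "wb").write(data)
```

Returning a list rather than a single match keeps the helper usable if a reply ever carries more than one image, and avoids an `AttributeError` when the model answers with text only.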
- Logged-in Codex CLI (`codex login` once; creates `~/.codex/auth.json`).
- `CODEX_USE_CODEX_RESPONSES_API=1` (this is the default).
- `CODEX_ENABLE_IMAGE_GEN=1` (must be set explicitly; default is OFF). Without this the gateway does not inject the `image_generation` tool and `/v1/chat/completions` returns text only.
curl -sS http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer devtoken" \
-d '{
"model": "gpt-5.5",
"stream": false,
"messages": [
{"role": "user",
"content": "Use the image_generation tool to draw a minimal flat-design icon of a green leaf on white, 1024x1024."}
]
}' | jq -r '.choices[0].message.content' \
| python3 -c "import sys,re,base64; m=re.search(r'data:image/(\w+);base64,([A-Za-z0-9+/=]+)', sys.stdin.read()); open(f'leaf.{m.group(1)}','wb').write(base64.b64decode(m.group(2)))"

The script above pipes the data URI out and writes `leaf.png`.
import base64, re
from openai import OpenAI
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="devtoken")
resp = client.chat.completions.create(
model="gpt-5.5",
messages=[{"role": "user", "content": "Use the image_generation tool to render a watercolour cat."}],
)
m = re.search(r"data:image/(\w+);base64,([A-Za-z0-9+/=]+)", resp.choices[0].message.content)
open(f"cat.{m.group(1)}", "wb").write(base64.b64decode(m.group(2)))

A turnkey CLI helper for any agent (Claude Code, Codex, Cursor, your own scripts) ships in this repo:
python3 skills/imagegen/scripts/generate.py \
"Studio photo of a red ceramic teacup on a wooden table, soft morning light" \
-o assets/hero.png \
--size 1536x1024 \
--quiet
# stdout = assets/hero.png (the agent can capture and use it)

Drop the `skills/imagegen/` directory into any agent's skill directory (or symlink it). The accompanying `SKILL.md` gives agents everything they need: when to use it, sizing recipes, save-path policy, error handling, and known limits.
| Param | Status | Notes |
|---|---|---|
| `size` | ✅ honoured | auto, 1024x1024, 1536x1024, 1024x1536, 2048x2048, 3840x2160, … |
| `output_format` | ✅ honoured | png (default), jpeg, webp |
| `quality: low/medium/auto` | ✅ honoured | model picks medium by default |
| `quality: high` | capped at `medium` | ChatGPT subscription tier cap; use `OPENAI_API_KEY` and direct `/v1/images/generations` for true high |
| `background: transparent` | ❌ not supported on subscription path | requires gpt-image-1.5 via `OPENAI_API_KEY`; or use chroma-key + local alpha extraction |
| `model` (e.g. `gpt-image-2`) | passthrough | hosted model is whatever the subscription provides; modern subscription serves gpt-image-2-class output |
| Edits (`/v1/images/edits`) | ❌ not yet exposed | open issue if you need it |
- Calls consume your ChatGPT subscription image quota — shared with the ChatGPT web app and Codex CLI.
- One image typically takes 15–40 seconds at default quality.
- This is a thin gateway, not a "free image API for everyone": it's meant for personal automation, agent workflows, and dogfooding from your own developer machine. Putting it behind a public proxy violates OpenAI's ToS for your subscription. Use a token (`CODEX_GATEWAY_TOKEN`) and bind to `127.0.0.1`.
The ChatGPT subscription backend handles concurrent image_generation requests fine — measured on a Plus account, 4 simultaneous requests all returned 200 with total_wall ≈ slowest_single (~27s), i.e. fully parallel, no serialization, no 429. You don't need a semaphore in the gateway for this on personal use.
When you might want to add one (CODEX_IMAGE_GEN_CONCURRENCY is not currently a knob — open an issue if you need it):
- Multi-user / team-shared gateway: a burst of slow image requests can fill the worker pool (`CODEX_MAX_CONCURRENCY=100` by default) and make text completions queue behind them.
- High-frequency batch generation (>10 images/min sustained): you'll eventually hit subscription rate limits.
Either way, streaming chat completions and image generation are mutually exclusive — stream=true requests get HTTP 400 if CODEX_ENABLE_IMAGE_GEN=1, since image bytes can't be chunked back through SSE in a way that any OpenAI SDK understands. Set stream=false for image gen requests.
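Since the gateway has no image-specific concurrency knob, one option is to cap concurrency on the client side. The sketch below shows the general pattern with `asyncio.Semaphore`; `run_limited` and `fake_image_request` are hypothetical names, and the fake request only sleeps. In real use it would be replaced by a non-streaming (`stream=false`) call to `/v1/chat/completions`.

```python
import asyncio


async def run_limited(jobs, limit: int = 2):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        async with sem:  # blocks while `limit` jobs are already running
            return await job()

    # gather() returns results in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))


# Stand-in for a real image request (hypothetical).
async def fake_image_request(i: int) -> str:
    await asyncio.sleep(0.01)
    return f"image-{i}"


results = asyncio.run(
    run_limited([lambda i=i: fake_image_request(i) for i in range(5)], limit=2)
)
print(results)
```

Keeping the limit in the client leaves the gateway's worker pool free for text completions from other callers.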
If you don't need the HTTP gateway and just want to generate images from your terminal or from an AI agent (Claude Code / Cursor / Codex Agent / OpenClaw…), use the sister project:
➡️ leeguooooo/chatgpt-imagegen — single-file Python CLI + agent skill, zero deps, same ChatGPT-subscription backend. Install via npx skills add leeguooooo/chatgpt-imagegen -g.
| You want | Use |
|---|---|
| OpenAI-compatible HTTP API, multi-app, team-shared | this repo (agent-cli-to-api) |
| Local CLI only, agent-driven, no server | chatgpt-imagegen |
Python:
from openai import OpenAI
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="devtoken")
resp = client.chat.completions.create(
model="gpt-5.5",
messages=[{"role": "user", "content": "Hi"}],
)
print(resp.choices[0].message.content)

TypeScript:
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "http://127.0.0.1:8000/v1",
apiKey: process.env.CODEX_GATEWAY_TOKEN ?? "devtoken",
});
const resp = await client.chat.completions.create({
model: "gpt-5.5",
messages: [{ role: "user", content: "Hi" }],
});
console.log(resp.choices[0].message.content);

You are exposing an agent that can read files and run commands depending on `CODEX_SANDBOX`.
Keep it private by default, use a token, and run in an isolated environment when deploying.
The gateway provides detailed timing logs to help diagnose latency:
INFO claude-oauth request: url=https://api.example.com/v1/messages model=xxx auth_ms=0 prepare_ms=0
INFO claude-oauth response: status=200 api_latency_ms=2886 parse_ms=0 total_ms=2887
| Metric | Description |
|---|---|
| `auth_ms` | Time to load/refresh credentials |
| `prepare_ms` | Time to build request payload |
| `api_latency_ms` | Upstream API response time (main bottleneck) |
| `parse_ms` | Time to parse response |
| `total_ms` | Total gateway processing time |
If api_latency_ms ≈ total_ms, the latency is entirely from the upstream API (not the gateway).
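To apply that rule of thumb to real logs, the timing fields can be parsed out of a response line and the gateway's own share computed as `total_ms - api_latency_ms`. This is a small sketch, assuming log lines keep the `key=value` shape shown in the sample above; `parse_timing` is a hypothetical helper, not part of the gateway.

```python
import re


def parse_timing(line: str) -> dict[str, int]:
    """Collect every key=<int> pair whose key ends in _ms."""
    return {key: int(value) for key, value in re.findall(r"(\w+_ms)=(\d+)", line)}


# Sample response line copied from the log excerpt in this README.
line = "INFO claude-oauth response: status=200 api_latency_ms=2886 parse_ms=0 total_ms=2887"

timing = parse_timing(line)
overhead_ms = timing["total_ms"] - timing["api_latency_ms"]
print(timing)
print("gateway overhead:", overhead_ms, "ms")
```

For the sample line the overhead works out to 1 ms, which matches the claim that nearly all of the latency comes from the upstream API.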
CODEX_LOG_MODE=summary # one line per request (default)
CODEX_LOG_MODE=qa # show Q (question) and A (answer)
CODEX_LOG_MODE=full # full prompt + response

If your normal `~/.codex/config.toml` has many `mcp_servers.*` entries, Codex will start them for every `codex exec` call
and include their tool schemas in the prompt. This can add seconds of startup time and 10k+ prompt tokens per request.
For an HTTP gateway, it's usually best to run Codex with a minimal config (no MCP servers).
By default the gateway uses your system ~/.codex (so auth stays in sync).
If you want a minimal, isolated config (no MCP servers), set CODEX_CLI_HOME to a gateway-local directory.
On first run it will try to copy ~/.codex/auth.json into that directory (so you don't have to).
If you want to set it up manually or customize it:
export CODEX_CLI_HOME=$PWD/.codex-gateway-home
mkdir -p "$CODEX_CLI_HOME/.codex"
cp ~/.codex/auth.json "$CODEX_CLI_HOME/.codex/auth.json" # or set CODEX_API_KEY instead
cat > "$CODEX_CLI_HOME/.codex/config.toml" <<'EOF'
model = "gpt-5.5"
model_reasoning_effort = "low"
[projects."/path/to/your/workspace"]
trust_level = "trusted"
EOF

cp .env.example .env
uv run agent-cli-to-api codex --env-file .env

Tip: you can also opt in to loading `.env` from the current directory with `--auto-env`.
This installs a user LaunchAgent and keeps the gateway running after reboot.
chmod +x scripts/install_launchd.sh
scripts/install_launchd.sh --provider codex --host 127.0.0.1 --port 8000

Optional env/token:
scripts/install_launchd.sh --env-file "$PWD/.env" --token devtoken

Uninstall:
scripts/install_launchd.sh --uninstall

Logs:
- `~/Library/Logs/com.codex-api.gateway.out.log`
- `~/Library/Logs/com.codex-api.gateway.err.log`
Note: uv must be on your PATH (e.g. /opt/homebrew/bin/uv).
Enable colored logs (Rich handler):
export CODEX_RICH_LOGS=1
uv run agent-cli-to-api codex

Render assistant output as Markdown in the terminal (best-effort; prints a separate block to stderr):
export CODEX_LOG_RENDER_MARKDOWN=1
uv run agent-cli-to-api codex

Log request curl commands (useful for replay/debug):
export CODEX_LOG_REQUEST_CURL=1
uv run agent-cli-to-api codex

OpenAI-compatible API, chat completions, SSE streaming, agent gateway, CLI to API proxy, Codex CLI, Cursor Agent, Claude Code, Gemini CLI.
Image generation specifically: ChatGPT subscription image generation API, ChatGPT Plus image API, ChatGPT Pro image API, use ChatGPT image generation without OPENAI_API_KEY, expose ChatGPT image generation as HTTP API, gpt-image-1 / gpt-image-2 via ChatGPT subscription, Codex CLI image_gen as API, DALL-E via ChatGPT Plus subscription, no-API-key image generation proxy, OAuth-backed OpenAI image generation, /v1/chat/completions image_generation tool, Responses API image_generation tool, image_generation_call SSE events, ChatGPT subscription as image API gateway, free-tier-friendly image generation gateway, agent skill for image generation, save generated image to project directory.
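Chinese search terms (translated): ChatGPT subscription image generation API, ChatGPT Plus image generation API, generate images without an API key, turn a ChatGPT subscription into an OpenAI-compatible image generation API, ChatGPT subscription image generation proxy, Codex CLI image generation exposed as an API, call gpt-image-2 through a subscription, ChatGPT Plus image generation to API, image_generation tool gateway, image generation skill for agents, save generated images to the project directory.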