Drop-in LiteLLM provider for OpenCode with zero configuration.
Auto-detect a running LiteLLM proxy, pull every model from /v1/models, and register them in OpenCode.
No model lists to hand-maintain. No restart loops. No surprises.
Quickstart · Configuration · How it works · FAQ · Contributing
npm package: `opencode-plugin-litellm` · GitHub repo: `yuseferi/opencode-litellm`. (The unscoped `opencode-litellm` npm name was already taken by another author.)
Maintaining a `models` block in `opencode.json` for every model your LiteLLM proxy exposes is a chore: every new entry in your `model_list` means a config edit, a restart, and a context-switch.
opencode-litellm removes that loop entirely. It hooks into OpenCode's config lifecycle, queries your LiteLLM proxy at startup, and merges the discovered models into your config in memory. The result: every model in your LiteLLM config.yaml shows up in OpenCode's picker the moment you start it, automatically.
```bash
# 1. Install
npm install opencode-plugin-litellm
# or: bun add opencode-plugin-litellm

# 2. Add the plugin to opencode.json (see Configuration below)

# 3. Start LiteLLM (if it isn't already)
litellm --config config.yaml --port 4000

# 4. Run OpenCode: every model in your LiteLLM model_list is now available.
opencode
```

| Feature | Description |
|---|---|
| Auto-detection | Probes localhost:4000, :8000, :8080 and adopts the first responsive proxy. |
| Dynamic discovery | Queries /v1/models so your OpenCode model picker always reflects your live model_list. |
| Smart formatting | Turns anthropic/claude-3-5-sonnet into Claude 3.5 Sonnet in the picker; handles versions, sizes, quantizations, and brand-cased names like gpt-4o (sketched below the table). |
| Modality-aware | Infers chat / embedding / image / audio from the model mode field or id, and writes proper modalities metadata. |
| Reasoning-aware routing | Auto-routes gpt-5* / o1/o3/o4* models through a sibling litellm-responses provider that uses /v1/responses, so tools + reasoning_effort actually work. Override per model via responsesApiModels / chatApiModels. |
| Provider extraction | Pulls litellm_provider (or the provider/model prefix) into organizationOwner so models group correctly in the UI. |
| Auth-aware | Honours LITELLM_API_KEY / LITELLM_MASTER_KEY env vars or provider.litellm.options.apiKey. |
| Gateway-friendly | Supports customHeaders for proxies behind Cloudflare Access or other API gateways requiring extra HTTP headers. |
| Non-blocking startup | Discovery is capped at 5 s, so a slow or offline proxy never delays OpenCode boot. |
| Non-destructive merge | Only adds models you don't already have configured. Hand-curated entries are preserved verbatim. |
| Zero runtime deps | Only depends on @opencode-ai/plugin. No build step, no bundler. |
| TypeScript strict | Strict-mode compiled, fully typed public API. |
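To make the "Smart formatting" and "Modality-aware" rows concrete, here is a rough sketch of the kind of heuristics involved. It is illustrative only: the helper names (`formatModelName`, `inferModality`), the brand-casing table, and the LiteLLM `mode` values shown are assumptions, not the plugin's actual implementation.

```ts
// Illustrative sketch, not the plugin's real code.
const BRAND_CASED: Record<string, string> = { gpt: "GPT", "4o": "4o" }; // assumed examples

function formatModelName(id: string): string {
  const bare = id.includes("/") ? id.split("/").pop()! : id; // drop the "anthropic/" prefix
  const words = bare.split(/[-_.]/).filter(Boolean);
  const cased = words.map((w) =>
    BRAND_CASED[w.toLowerCase()] ?? (/^\d+$/.test(w) ? w : w.charAt(0).toUpperCase() + w.slice(1))
  );
  // Re-join runs of bare digits into a version: "3 5" -> "3.5"
  return cased.join(" ").replace(/\b(\d)\s(\d)\b/g, "$1.$2");
}

type Modality = "chat" | "embedding" | "image" | "audio";

function inferModality(id: string, mode?: string): Modality {
  if (mode === "embedding" || /embed/i.test(id)) return "embedding";
  if (mode === "image_generation" || /dall-e|image/i.test(id)) return "image";
  if (mode === "audio_transcription" || /whisper|tts|audio/i.test(id)) return "audio";
  return "chat"; // default: treat everything else as a chat model
}

console.log(formatModelName("anthropic/claude-3-5-sonnet")); // "Claude 3.5 Sonnet"
console.log(inferModality("text-embedding-3-large"));        // "embedding"
```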
Point at your LiteLLM proxy; the plugin discovers all models automatically:

```json
{
"$schema": "https://opencode.ai/config.json",
"plugin": ["opencode-plugin-litellm@latest"],
"provider": {
"litellm": {
"npm": "@ai-sdk/openai-compatible",
"options": {
"baseURL": "http://localhost:4000/v1"
}
}
}
}
```

You do not need to list any models; the plugin still discovers them from /v1/models automatically. Use this form only when you need to point at a non-default URL or pass an API key:

```json
{
"$schema": "https://opencode.ai/config.json",
"plugin": ["opencode-plugin-litellm@latest"],
"provider": {
"litellm": {
"npm": "@ai-sdk/openai-compatible",
"name": "LiteLLM (proxy)",
"options": {
"baseURL": "http://litellm.internal.example.com/v1",
"apiKey": "{env:LITELLM_API_KEY}"
}
}
}
}
```

That's the whole config: every model in your LiteLLM model_list will appear in the picker.
If you want to rename a model in the picker, pin its organizationOwner, or otherwise hand-curate metadata, add it under models. The plugin preserves your entries verbatim and only injects discovered models whose key isn't already defined:
```json
{
"provider": {
"litellm": {
"options": {
"baseURL": "http://litellm.internal.example.com/v1",
"apiKey": "{env:LITELLM_API_KEY}"
},
"models": {
"openai/gpt-4o": {
"name": "GPT-4o (curated)",
"organizationOwner": "openai"
}
}
}
}
}
```

Here, openai/gpt-4o keeps your custom name; every other model from the proxy is still discovered and added automatically.
OpenAI's reasoning-tier models reject requests that combine reasoning_effort
with function tools when sent to /v1/chat/completions. The OpenAI Responses
API (/v1/responses) has no such restriction, so the plugin routes those
models through a second provider entry named litellm-responses that
uses an SDK speaking the Responses API.
You don't need to do anything for the default behaviour: the plugin
detects reasoning-tier models from their id (gpt-5*, o1*, o3*,
o4*) and from LiteLLM's mode === 'responses' field, and creates the
sibling provider lazily.
To override the routing per model:

```jsonc
{
"provider": {
"litellm": {
"options": {
"baseURL": "http://localhost:4000/v1",
// "auto" (default) | "chat" | "responses"
"transport": "auto",
// Force these into /v1/responses (highest precedence)
"responsesApiModels": ["gpt-5-4-high", "my-custom-reasoning-model"],
// Force these into /v1/chat/completions
"chatApiModels": ["o1-mini-cheap"]
}
}
}
}
```

The two providers share baseURL and apiKey. Models curated by hand
under either provider's models block are preserved verbatim, and a
discovered model is skipped if its key already exists under either
provider.
Note: this requires LiteLLM ≥ 1.40 (which proxies /v1/responses) and an `@ai-sdk/openai` version that supports the Responses API. Older AI SDKs may silently fall back to chat-completions, in which case set `responsesApiModels` to an empty list and fix the upstream LiteLLM config instead (e.g. `use_responses_api: true` per model).
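For reference, the routing decision described above can be pictured as a small helper. This is a sketch under assumptions: `pickTransport` is a made-up name, and the precedence shown (per-model lists first, then the global `transport` setting, then the id/`mode` heuristic) is an interpretation of the options above, not a guarantee about the plugin's internals.

```ts
// Illustrative sketch of the transport decision; pickTransport is a made-up helper.
type Transport = "chat" | "responses";

interface RoutingOptions {
  transport?: "auto" | "chat" | "responses";
  responsesApiModels?: string[]; // force /v1/responses (highest precedence)
  chatApiModels?: string[];      // force /v1/chat/completions
}

function pickTransport(id: string, mode: string | undefined, opts: RoutingOptions): Transport {
  if (opts.responsesApiModels?.includes(id)) return "responses";
  if (opts.chatApiModels?.includes(id)) return "chat";
  if (opts.transport === "chat" || opts.transport === "responses") return opts.transport;
  // "auto": reasoning-tier ids and LiteLLM's mode === "responses" go to /v1/responses
  const bare = id.split("/").pop() ?? id;
  const reasoningTier = /^(gpt-5|o1|o3|o4)/.test(bare);
  return reasoningTier || mode === "responses" ? "responses" : "chat";
}

console.log(pickTransport("gpt-5-mini", undefined, {}));                                      // "responses"
console.log(pickTransport("o1-mini-cheap", undefined, { chatApiModels: ["o1-mini-cheap"] })); // "chat"
console.log(pickTransport("anthropic/claude-3-5-sonnet", "chat", {}));                        // "chat"
```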
If your LiteLLM proxy requires a master key, expose it via one of these approaches:
| Method | Example |
|---|---|
| Env var | export LITELLM_API_KEY=sk-... |
| Env var (alias) | export LITELLM_MASTER_KEY=sk-... |
| Config | "options": { "apiKey": "{env:LITELLM_API_KEY}" } |
The env var path lets you commit opencode.json without leaking secrets.
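For reference, the key lookup can be pictured as a small helper like the one below. This is a sketch; the precedence shown (explicit `apiKey` first, then `LITELLM_API_KEY`, then the `LITELLM_MASTER_KEY` alias) is an assumption rather than documented behaviour.

```ts
// Sketch only: the precedence between the three sources is assumed, not documented.
function resolveApiKey(optionsApiKey?: string): string | undefined {
  return (
    optionsApiKey ??                 // provider.litellm.options.apiKey (e.g. "{env:LITELLM_API_KEY}", presumably expanded by OpenCode)
    process.env.LITELLM_API_KEY ??   // preferred env var
    process.env.LITELLM_MASTER_KEY   // alias
  );
}
```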
If your LiteLLM proxy is behind Cloudflare Access or another gateway that requires extra HTTP headers, use the `customHeaders` option:

```json
{
"provider": {
"litellm": {
"options": {
"baseURL": "https://litellm.internal.example.com/v1",
"apiKey": "{env:LITELLM_API_KEY}",
"customHeaders": {
"CF-Access-Client-Id": "{env:CF_ACCESS_CLIENT_ID}",
"CF-Access-Client-Secret": "{env:CF_ACCESS_CLIENT_SECRET}"
}
}
}
}
}
```

These headers are included in every request the plugin makes during model discovery (the health check and /v1/models). To obtain a Cloudflare Access Service Token, follow the Cloudflare docs.
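As a sketch of what that means in practice, the discovery requests can simply spread `customHeaders` over the usual auth header. The helper below is hypothetical, not part of the plugin's API.

```ts
// Illustrative sketch only; buildDiscoveryHeaders is a made-up helper.
function buildDiscoveryHeaders(
  apiKey?: string,
  customHeaders: Record<string, string> = {},
): Record<string, string> {
  return {
    ...(apiKey ? { Authorization: `Bearer ${apiKey}` } : {}),
    ...customHeaders, // e.g. CF-Access-Client-Id / CF-Access-Client-Secret for Cloudflare Access
  };
}

// The same headers go on both discovery calls (the health check and GET /v1/models):
const headers = buildDiscoveryHeaders(process.env.LITELLM_API_KEY, {
  "CF-Access-Client-Id": process.env.CF_ACCESS_CLIENT_ID ?? "",
  "CF-Access-Client-Secret": process.env.CF_ACCESS_CLIENT_SECRET ?? "",
});
// ...then: fetch(`${baseURL}/models`, { headers })
```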
```mermaid
sequenceDiagram
participant OC as OpenCode
participant Plugin as opencode-litellm
participant LL as LiteLLM proxy
OC->>Plugin: config(initial)
alt provider.litellm configured
Plugin->>LL: GET /v1/models @ baseURL
else not configured
Plugin->>LL: probe :4000, :8000, :8080
LL-->>Plugin: 200 OK on one
Plugin->>Plugin: auto-create provider entry
end
Plugin->>LL: GET /v1/models (with auth if set)
LL-->>Plugin: { data: [...models] }
Plugin->>Plugin: format names, infer modalities, extract owner
Plugin->>Plugin: bucket each model by transport (chat vs responses)
Plugin->>OC: merge chat-completions models into provider.litellm
Plugin->>OC: merge responses models into provider.litellm-responses (lazy)
OC->>OC: render model picker with all discovered models
```
- On OpenCode startup the `config` lifecycle hook fires.
- If `provider.litellm` exists, its `baseURL` is used. Otherwise common ports are probed.
- A health check (`GET /v1/models`) verifies the proxy is reachable and authorized.
- Models from the response are converted into OpenCode model entries with `id`, formatted `name`, `organizationOwner`, and inferred `modalities`.
- Each model is bucketed by transport: reasoning-tier models (`gpt-5*`, `o1/o3/o4*`, or anything with `mode === 'responses'`) go into the `litellm-responses` provider; everything else goes into `litellm`. Per-model overrides via `responsesApiModels` / `chatApiModels` win.
- Discovered models are merged on top of any user-defined ones, never overwriting them. A model is skipped if its key already exists under either provider.
- The whole flow is wrapped in a `Promise.race` against a 5 s timeout so a slow proxy never blocks boot (see the sketch below).
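Condensed into code, the flow could look roughly like this. It is a simplified illustration, not the plugin's real source: the names `discoverModels`, `mergeDiscovered`, and `enhanceConfig` are made up, and it assumes Node ≥ 20 for the global `fetch`.

```ts
// Illustrative sketch of the startup flow; simplified, not the plugin's real code.
interface DiscoveredModel { id: string; mode?: string; litellm_provider?: string }
interface ProviderConfig { models: Record<string, { name: string }> }

async function discoverModels(baseURL: string, headers: Record<string, string>): Promise<DiscoveredModel[]> {
  // baseURL already ends in /v1, so this hits GET /v1/models (doubling as the health check).
  const res = await fetch(`${baseURL}/models`, { headers });
  if (!res.ok) throw new Error(`LiteLLM proxy answered ${res.status}`);
  const body = (await res.json()) as { data: DiscoveredModel[] };
  return body.data;
}

function mergeDiscovered(provider: ProviderConfig, discovered: DiscoveredModel[]): ProviderConfig {
  for (const m of discovered) {
    // Additive merge: hand-curated entries win, discovered models only fill the gaps.
    if (provider.models[m.id]) continue;
    provider.models[m.id] = { name: m.id }; // the real plugin also formats name, owner, modalities
  }
  return provider;
}

async function enhanceConfig(provider: ProviderConfig, baseURL: string): Promise<ProviderConfig> {
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error("LiteLLM discovery timed out")), 5_000)
  );
  try {
    // Cap the whole flow at 5 s so a slow or offline proxy never blocks OpenCode boot.
    const models = await Promise.race([
      discoverModels(baseURL, { Authorization: `Bearer ${process.env.LITELLM_API_KEY ?? ""}` }),
      timeout,
    ]);
    return mergeDiscovered(provider, models);
  } catch {
    console.warn("LiteLLM proxy unreachable; skipping model discovery");
    return provider; // OpenCode still starts with whatever was configured by hand
  }
}
```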
- OpenCode ≥ 0.1.x with plugin support (`@opencode-ai/plugin ^1.0.166`)
- A running LiteLLM proxy:

  ```bash
  pip install 'litellm[proxy]'
  litellm --config config.yaml --port 4000
  ```

- Node.js ≥ 20 (or Bun ≥ 1.0)
| LiteLLM version | OpenCode version | Status |
|---|---|---|
| ≥ 1.40 | ≥ 0.1.x | ✅ Tested |
| 1.30 – 1.39 | ≥ 0.1.x | ⚠️ Untested (older /v1/models schema) |
| < 1.30 | any | ❌ Unsupported |
**Why doesn't a model appear in OpenCode after I add it to LiteLLM?**
OpenCode reads the plugin output once at startup. After updating your LiteLLM config.yaml, restart both LiteLLM and OpenCode to refresh the model list.
**Can I use this with a remote LiteLLM proxy?**
Yes. Set provider.litellm.options.baseURL to your remote URL and (optionally) apiKey. Auto-detection only probes localhost, but explicit configuration works against any URL.
**What happens if LiteLLM is offline at startup?**
The plugin logs a warning and is a no-op. OpenCode starts normally; you just won't see LiteLLM-discovered models until you restart with the proxy up.
**Will my hand-curated model entries be overwritten?**
No. The merge is additive: anything you've already defined under provider.litellm.models is preserved exactly as-is. Discovered models are only added if their key isn't already present.
**Why is the npm name opencode-plugin-litellm and not opencode-litellm?**
The unscoped opencode-litellm was already published by another author when this project was started. The GitHub repo and exported plugin symbol still use the cleaner opencode-litellm name.
**Does this work with Ollama through LiteLLM?**
Yes: anything in your LiteLLM model_list shows up, including Ollama, Bedrock, Azure, OpenAI, Anthropic, Google, etc. That's the whole point of LiteLLM.
**My LiteLLM proxy is behind Cloudflare Access; how do I authenticate?**
Cloudflare Access intercepts requests before they reach LiteLLM, so a plain Authorization: Bearer header isn't enough. Create a Cloudflare Access Service Token and pass the credentials via customHeaders:
```json
{
"provider": {
"litellm": {
"options": {
"baseURL": "https://litellm.your-company.com/v1",
"customHeaders": {
"CF-Access-Client-Id": "{env:CF_ACCESS_CLIENT_ID}",
"CF-Access-Client-Secret": "{env:CF_ACCESS_CLIENT_SECRET}"
}
}
}
}
}
```

The customHeaders map works for any gateway that requires extra HTTP headers, not just Cloudflare.
**I get `Function tools with reasoning_effort are not supported …` in /v1/chat/completions; what do I do?**
This error comes from OpenAI: their reasoning-tier models (gpt-5, o1, o3, o4) refuse function-tool calls on /v1/chat/completions when reasoning_effort is set. They require /v1/responses instead.
As of 0.2.0, opencode-litellm automatically routes those models through a sibling litellm-responses provider that uses the Responses API. If your model id doesn't match the heuristic (e.g. you renamed it in LiteLLM), add it explicitly:
"provider": {
"litellm": {
"options": {
"responsesApiModels": ["my-renamed-gpt-5-high"]
}
}
}
```

The model will appear under the LiteLLM (responses) provider in the picker; pick it from there and tool-calling will work.
**Why are there suddenly two providers (litellm and litellm-responses) in the picker?**
Same LiteLLM proxy, different transport. litellm talks to /v1/chat/completions; litellm-responses talks to /v1/responses. The split is required for OpenAI reasoning models; see the FAQ entry above.
The responses provider is created lazily and only appears if at least one discovered model needs it. To collapse everything back into a single provider, set "transport": "chat" in provider.litellm.options (you'll lose tool-calling on reasoning models in exchange).
```bash
git clone https://github.com/yuseferi/opencode-litellm.git
cd opencode-litellm
npm install
npm run typecheck
```

The project is intentionally tiny:
```text
src/
├── index.ts                  # Public exports
├── types/index.ts            # LiteLLM API types
├── utils/
│   ├── litellm-api.ts        # health check, discovery, auto-detect
│   └── format-model-name.ts  # owner extraction, name formatting, categorization
└── plugin/
    ├── index.ts              # LiteLLMPlugin entry
    ├── config-hook.ts        # OpenCode config-lifecycle hook (5 s timeout)
    └── enhance-config.ts     # core merge logic
```
See CONTRIBUTING.md for the full contributor workflow.
- Optional cost/latency overlay using LiteLLM's `/spend` and `/health` endpoints
- In-memory cache with TTL to avoid re-querying on rapid restarts
- Model categorization based on `litellm.proxy.config.model_list[].model_info`
- Tests with vitest
- `chat.params` hook for injecting LiteLLM routing tags / fallbacks
Have an idea? Open an issue.
Inspired by opencode-lmstudio by @agustif, the architectural blueprint for OpenCode model-discovery plugins.
Built on top of LiteLLM by the BerriAI team and OpenCode by the OpenCode contributors.
If this project saved you time, consider giving it a ⭐ on GitHub.