⚡ Agent Lightning v0.3

Drop-in AI agent optimization toolkit. Better than Microsoft's.

100+ models via OpenRouter — not hardcoded to OpenAI
APO beam search with textual gradients (same algo, model-agnostic)
Multi-model consensus — novel feature, not in Microsoft's version
Autonomous orchestrator — fires, trains, A/B tests, deploys automatically
OTel spans — plug into Jaeger, Datadog, Grafana Tempo
Zero deps core — SQLite only, works offline with Ollama/Qwen3

pip install agent-lightning
pip install "agent-lightning[openrouter]"   # + aiohttp for OpenRouter
pip install "agent-lightning[all]"           # everything

60-second quickstart

import agent_lightning as al

store  = al.LightningStore()
tracer = al.AsyncTracer(agent_id="my_agent", store=store)

# Record runs exactly as before
with tracer.run() as run:
    tracer.log_prompt("What is Python?")
    answer = my_llm("What is Python?")
    tracer.log_response(answer)
    run.add_reward(score(answer))

# Get optimized prompt back (set by orchestrator automatically)
prompt = tracer.get_optimized_prompt(fallback="You are a helpful assistant.")

OpenAI — 1-line wrap (v0.1 compat)

client = al.wrap_openai(OpenAI(), agent_id="my_agent")
# use exactly as before — all calls traced automatically

The Autonomous Loop

from agent_lightning import *

store  = LightningStore()
tracer = AsyncTracer(agent_id="my_agent", store=store)

# 1. Setup APO with OpenRouter (cost arbitrage)
apo = APO(
    store=store, agent_id="my_agent",
    gradient_backend=OpenRouterBackend.fast(api_key="sk-or-..."),   # cheap: Qwen3-8B free
    eval_backend=OpenRouterBackend.strong(api_key="sk-or-..."),     # strong: Claude/GPT-4o
    initial_prompt="You are a helpful assistant.",
    beam_width=4, beam_rounds=3,
)

# 2. Orchestrator fires automatically every 50 runs
orchestrator = Orchestrator(
    store=store, algorithm=apo,
    config=OrchestratorConfig(
        agent_id="my_agent",
        trigger=EveryNRuns(50),
        auto_deploy=True,
    ),
    on_improvement=lambda p, score: print(f"Better prompt deployed! score={score:.3f}"),
)
orchestrator.start()   # background thread, non-blocking

# 3. Your agent runs unchanged
while True:
    prompt = tracer.get_optimized_prompt("You are a helpful assistant.")
    with tracer.run() as run:
        tracer.log_prompt(user_input, role="user")
        response = call_llm(user_input, system=prompt)
        tracer.log_response(response)
        run.add_reward(evaluate(response))

OpenRouter — 100+ Models

# Free tier for bulk gradient computation
gradient = OpenRouterBackend(api_key="sk-or-...", model="qwen/qwen3-8b:free")

# Strong model for final eval
eval_b   = OpenRouterBackend(api_key="sk-or-...", model="anthropic/claude-3-haiku")

# Or use tier factories
fast     = OpenRouterBackend.fast(api_key="sk-or-...")       # cheapest
balanced = OpenRouterBackend.balanced(api_key="sk-or-...")   # mid
strong   = OpenRouterBackend.strong(api_key="sk-or-...")     # best quality

Available tiers:

Tier	Models
fast	`qwen/qwen3-8b:free`, `llama-3.1-8b:free`, `gemma-3-4b:free`, `mistral-7b:free`
balanced	`claude-3-haiku`, `gemini-flash-1.5`, `qwen-2.5-72b`, `mistral-nemo`
strong	`claude-sonnet-4`, `gemini-2.5-pro`, `gpt-4o`, `deepseek-r1`

Multi-Model Consensus (novel vs Microsoft)

consensus = MultiModelConsensus(
    backends=[
        OpenRouterBackend(key, model="qwen/qwen3-8b:free"),
        OpenRouterBackend(key, model="meta-llama/llama-3.1-8b-instruct:free"),
        OpenRouterBackend(key, model="mistralai/mistral-7b-instruct:free"),
    ],
    judge=OpenRouterBackend(key, model="anthropic/claude-3-haiku"),
)
best_prompt = await consensus.optimize_prompt(current, good_examples, bad_examples)

Offline / Air-Gapped (Ollama)

backend = OllamaBackend(model="qwen3:8b")  # local, no internet
apo = APO(store=store, agent_id="x", gradient_backend=backend)

APO — Automatic Prompt Optimization

Textual gradients + beam search. Beats single-shot reflection.

apo = APO(
    store=store, agent_id="my_agent",
    gradient_backend=OpenRouterBackend.fast(api_key="..."),
    eval_backend=OpenRouterBackend.strong(api_key="..."),
    initial_prompt="You are a helpful assistant.",
    beam_width=4,       # keep top 4 candidates each round
    branch_factor=3,    # 3 gradient edits per parent
    beam_rounds=3,      # 3 optimization rounds
    reward_fn=my_scorer,  # optional: your own eval function
)
result = await apo.run(train_dataset=train, val_dataset=val)
print(result["best_prompt"])
print(result["best_score"])
print(result["history"])   # full optimization trace

Async Rollouts + Hooks

class MyHook(al.Hook):
    async def on_rollout_start(self, rollout, **kw):
        print(f"Starting rollout {rollout.rollout_id}")
    async def on_rollout_end(self, rollout, attempt, status, **kw):
        print(f"Done: reward={attempt.reward}")

tracer = AsyncTracer(agent_id="x", store=store, hooks=[MyHook()])

async with tracer.rollout(task=my_task) as (rollout, attempt):
    tracer.log_prompt("input")
    tracer.log_response("output")
    attempt.reward = 0.9

OTel Compatibility

from agent_lightning.otel import OTelExporter, OTelSpanAdapter

exporter = OTelExporter(endpoint="http://jaeger:4318/v1/traces")
await exporter.export(spans)          # send to Jaeger/Datadog/Tempo

# Or export to file
exporter.export_to_file(spans, "traces.jsonl")

Triggers

EveryNRuns(n=50)                       # after every N completed runs
Scheduled(interval_seconds=3600)       # every hour
Manual()                               # only when .trigger() called
OnImprovement(min_avg_reward=0.6)      # when quality drops below threshold

CLI

agl dashboard                          # web UI at localhost:7860
agl stats my_agent                     # run statistics
agl rollouts my_agent                  # rollout queue status
agl prompt my_agent                    # show current optimized prompt
agl export my_agent -o data.jsonl      # export runs for fine-tuning
agl models                             # list OpenRouter model tiers
agl train --agent-id my_agent \
          --api-key sk-or-... \
          --model qwen/qwen3-8b:free \
          --rounds 3

How We Beat Microsoft

Feature	Microsoft Agent Lightning	Ours v0.2
LLM backend	Hardcoded `AsyncOpenAI`	OpenRouter = 100+ models
Cost arbitrage	❌	✅ cheap gradient, strong eval
Multi-model consensus	❌	✅ novel feature
Offline / air-gapped	❌	✅ via OllamaBackend
APO algorithm	✅ OpenAI only	✅ any backend
Beam search	✅	✅
Textual gradients	✅	✅
Autonomous orchestrator	✅	✅
OTel spans	✅	✅
Rollout queue	✅ complex	✅ simpler, SQLite
Async tracer	✅	✅
Hooks	✅	✅
A/B eval + t-test	❌	✅
GPU RL (VERL)	✅	❌ (roadmap)
Install complexity	Docker + vLLM + FSDP	`pip install`

Examples

File	What it shows
`examples/quickstart.py`	v0.1 minimal tracing
`examples/v2_quickstart.py`	v0.2 full autonomous loop
`examples/v2_openrouter_apo.py`	APO with cost arbitrage
`examples/v2_orchestrator.py`	Background auto-optimization
`examples/v2_multi_model_consensus.py`	3-model consensus
`examples/v2_ollama_offline.py`	Offline with Ollama/Qwen3
`examples/openai_agent.py`	1-line OpenAI wrap
`examples/tool_agent.py`	Tool call tracing
`examples/multi_agent.py`	Multi-agent pipeline

Tests

cd agent_lightning
python -m pytest tests/ -v     # requires pytest
# or run directly:
python -c "import sys; sys.path.insert(0,'.'); exec(open('tests/test_models.py').read())"

License

MIT — use it, fork it, ship it.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.github/workflows		.github/workflows
agent_lightning		agent_lightning
contrib		contrib
docs		docs
examples		examples
scripts		scripts
tests		tests
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
RAI_README.md		RAI_README.md
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

⚡ Agent Lightning v0.3

60-second quickstart

OpenAI — 1-line wrap (v0.1 compat)

The Autonomous Loop

OpenRouter — 100+ Models

Multi-Model Consensus (novel vs Microsoft)

Offline / Air-Gapped (Ollama)

APO — Automatic Prompt Optimization

Async Rollouts + Hooks

OTel Compatibility

Triggers

CLI

How We Beat Microsoft

Examples

Tests

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

⚡ Agent Lightning v0.3

60-second quickstart

OpenAI — 1-line wrap (v0.1 compat)

The Autonomous Loop

OpenRouter — 100+ Models

Multi-Model Consensus (novel vs Microsoft)

Offline / Air-Gapped (Ollama)

APO — Automatic Prompt Optimization

Async Rollouts + Hooks

OTel Compatibility

Triggers

CLI

How We Beat Microsoft

Examples

Tests

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages