Run agent workflows faster and cheaper across inference providers.
Each LLM call has a different latency budget, cost profile, and reliability requirement. Blackbox routes the workflow as a graph instead of treating calls as opaque traffic.
Your agent is a graph. Most routers see a queue.
Each request has a different cost profile, a different latency budget, and different reliability requirements. Blackbox treats the workflow as a first-class object, not as opaque traffic.
- One click annotation & deployment
- Tradeoffs you can control. Find the optimal latency/cost
- Reroute before throttling hurts. Shift traffic before provider limits slow down your workflow.
- No provider lock-in: credentials live in your dashboard, not in your code.
- Intelligent quota controls. Your Provider limits turned into live signals.
One API in front of every provider.
Set api-key to the Blackbox API key minted in the dashboard. Blackbox handles provider selection, retries, and quota arbitration.
# Annotate LLM nodes. Blackbox extracts the graph from these declarations.from langgraph.graph import StateGraphfrom agentir_langgraph.decorators import llm_call @llm_call(model="claude-opus-4.7", static_vars=["SUMMARIZER_SYSTEM_PROMPT"])def summarizer(state: DocState, config=None): ... workflow = StateGraph(DocState)workflow.add_node("summarizer", summarizer)Automatically generate a pull request with SDK annotations configured for Blackbox. At no charge!
Workflow-aware provider selection.
Configure provider limits, cost/latency targets, health, and quotas. Blackbox uses them to pick the best provider for each LLM call.
Blackbox sees the graph, not isolated calls.
Pick the provider tradeoff you want per workflow.
Avoid slow or failing providers before they hurt latency.
Turn RPM, TPM, and spend limits into live routing signals.
Catch critical-path changes before users notice.
Register workflows, manage providers, tune tradeoffs, and inspect runs.