← Back to Blog

Weave Router vs OpenRouter, LiteLLM, and Portkey: When Does Local Model Routing Pay Off?

June 28, 2026 · 9 min read

Switchboard with cables routing connections

Another Router Enters the Ring

Weave hit Hacker News on June 28, 2026 as a local-first intelligent model router. Install is one line — npx @workweave/router — and it runs as a localhost proxy on port 8080. The pitch: a cluster-scoring algorithm based on Avengers-Pro 1 picks the best model per request, you keep your provider keys locally, and tracing flows out through OTLP for self-hosted observability.

Weave joins a crowded category. OpenRouter, LiteLLM, and Portkey have been routing LLM calls for over a year. The question for any team adding a router today is not "which is best in the abstract?" — it's "what does my workload actually need, and when does a local router beat a cloud one?"

Side-by-Side Capability Map

Capability Weave (Local) OpenRouter (Cloud) LiteLLM (Self-Host) Portkey (Cloud)
Key custody Local, encrypted Cloud Self-host Cloud
Auto routing Avengers-Pro 1 Pareto curves Manual rules Cluster scoring
Markup on traffic 0% 5% 0% 0-3% tier-based
Observability OTLP self-host Built-in dashboard Self-host Built-in dashboard
Client compatibility Claude Code, Codex, Cursor OpenAI-compat OpenAI-compat OpenAI-compat
Data residency control Full (you hold) Partial (region routing) Full Partial

When the Local Router Wins

Three workloads make Weave (or LiteLLM self-hosted) the right pick over OpenRouter or Portkey:

Sensitive data flows. If your prompts contain customer PII, source code under NDA, or regulated content, sending them through a cloud proxy adds a vendor to your trust boundary. Local routing keeps the data inside your network.

High-volume workloads at frontier scale. OpenRouter's 5% markup on a $50K/month spend is $2,500/month — call it $30K/year. That funds an engineer-day per month of LiteLLM ops easily. Above $100K/month, the markup math gets harder to justify.

Single-developer or single-team setup. Weave specifically targets the local Claude Code / Codex / Cursor user who wants smart routing without standing up a cloud gateway. The installation is essentially zero-friction.

When the Cloud Router Wins

OpenRouter and Portkey hold three structural advantages that local routers cannot replicate:

Provider failover. OpenRouter routes around 30-90 second provider outages automatically using its multi-region presence. A localhost router on your machine cannot do this without you also running multi-region infrastructure.

Built-in spend governance. Cloud routers offer team-level spend caps, per-user quotas, and email alerts out of the box. Replicating these with OTLP + Grafana on a local router is doable but takes 1-2 days of setup.

Negotiated provider rates. OpenRouter has volume contracts with several providers that price-match or beat direct API rates after their markup. For low-volume teams the cloud router is sometimes cheaper than going direct.

A Decision Rule

Three thresholds to make the choice numerical, not philosophical:

  • Under $5K/month spend: use OpenRouter or Portkey. The setup and operations cost of a local router is not worth it at this volume.
  • $5K-$50K/month: Weave for a single team, LiteLLM self-hosted for multiple teams. Cloud router markup starts to bite.
  • $50K+/month: LiteLLM self-hosted with proper observability, OR direct provider relationships with custom rate cards. Cloud router markup is real money.

Weave's Specific Niche

Weave is positioned at the smaller end of the market — individual developers and small teams using Claude Code, Codex, or Cursor. The Avengers-Pro 1 cluster scorer is the differentiator: it claims to pick the right model per request based on prompt features (length, code language, task type) rather than blanket rules.

Whether the auto-routing actually saves money over a simpler "Sonnet 4.6 for code, Haiku 4.5 for tools" hand-coded rule depends on workload shape. Run a 1-week shadow test before committing.

The Real Cost Lever

The model routing market is now saturated. The router itself contributes 5-15% of total savings. The other 85% comes from which models you route to. A router that diligently sends every request to Claude Opus 4.8 will spend more than a hand-coded script that sends most requests to DeepSeek V4-Flash. Pick the router that fits your trust model and team size; spend the engineering time on the routing policy, not the router selection.

Want to calculate exact costs for your project?

Frequently Asked Questions

Is Weave free?

The router itself is free and open source. You pay only for the underlying provider API calls, which Weave forwards using your own keys.

Can I use Weave behind a corporate proxy?

Yes, but you need to expose the OTLP endpoint outbound for tracing. The provider API calls themselves go through standard HTTPS.

Does Weave support self-hosted models like DeepSeek or Llama on your own GPU?

Currently yes via OpenAI-compatible endpoints — point Weave at your inference server URL and it routes there like any other provider.

How does Avengers-Pro 1 actually pick models?

It clusters prompts by feature embeddings and historical accuracy/cost data, then selects the model with the best score in that cluster. The training data is publicly documented in the Avengers-Pro 1 paper.