Anthropic Launches Claude Apps Gateway for Bedrock and Google Cloud: Enterprise Cost Control, Decoded
By Eric Bush · June 30, 2026 · 8 min read
What Shipped Today
On June 30, 2026, Anthropic released the Claude apps gateway — a self-hosted control plane that lets enterprises run Claude Code against Amazon Bedrock or Google Cloud Vertex AI without sending any inference traffic back to Anthropic. The whole thing ships as a single stateless container that you point at a PostgreSQL database. It is generally available today and free to deploy on top of your existing Bedrock or Vertex commitments.
The headline features are not new ideas individually — SSO via OIDC, role-based permissions, daily/weekly/monthly spend caps, and OpenTelemetry export to whatever observability stack you already pay for. What is new is that you can compose all of these onto Claude Code itself, instead of writing a custom proxy. Anthropic explicitly states the gateway does not send inference or usage data back to them unless you also configure the Claude API as a backend.
Where Enterprise Money Actually Leaks
Before this release, an enterprise running Claude Code on Bedrock had three options, each costly in its own way. The table lists them alongside the gateway for comparison:
| Approach | Setup or Recurring Cost | Failure Mode |
|---|---|---|
| Direct Bedrock access per developer | $0 | No spend caps; one bad agent loop can burn $5K overnight |
| Custom proxy (LiteLLM, Portkey, in-house) | 2–8 engineer-weeks | Maintenance burden; auth + observability rewritten poorly |
| Enterprise gateway SaaS | $15–50/seat/month | Inference traffic routes through third party; data residency complications |
| Claude apps gateway (today) | ~1 engineer-day deploy | Limited to Anthropic-supported backends |
The most expensive failure mode in production AI coding deployments is not the steady-state inference bill; it is an unaudited agent that retries 200 times against a 200K-token context. Per-user daily caps are the single highest-ROI control most enterprises do not have.
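To illustrate why a per-user daily cap bounds the damage from a runaway loop, here is a minimal Python sketch. It is not the gateway's implementation or API: the `DailySpendCap` class, the `simulate_runaway_loop` helper, the $0.50 per-request cost, and the user name are all hypothetical, chosen only to show the mechanism.

```python
from collections import defaultdict
from datetime import date


class DailySpendCap:
    """Hypothetical per-user daily cap (illustrative, not the gateway's code)."""

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        # Maps (user, day) to dollars spent so far on that day.
        self._spent = defaultdict(float)

    def try_charge(self, user: str, cost_usd: float, day: date) -> bool:
        """Record the cost and return True if it fits under the cap, else reject."""
        key = (user, day)
        if self._spent[key] + cost_usd > self.cap_usd:
            return False
        self._spent[key] += cost_usd
        return True

    def spent(self, user: str, day: date) -> float:
        return self._spent[(user, day)]


def simulate_runaway_loop(cap: DailySpendCap, user: str, day: date,
                          retries: int, cost_per_request: float) -> int:
    """Count how many of `retries` identical requests the cap lets through."""
    return sum(cap.try_charge(user, cost_per_request, day) for _ in range(retries))


if __name__ == "__main__":
    cap = DailySpendCap(cap_usd=25.0)      # the $25/day figure used in this article
    today = date(2026, 6, 30)
    # Assumed cost of $0.50 per large-context retry (hypothetical).
    allowed = simulate_runaway_loop(cap, "dev-1", today, retries=200, cost_per_request=0.50)
    print(f"allowed {allowed} of 200 retries, spent ${cap.spent('dev-1', today):.2f}")
```

Under these assumed numbers, the loop stops being served once the user reaches $25 for the day instead of running all 200 retries, and the counter starts fresh the next day because spend is keyed by date.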
Cost Math: 50-Person Engineering Team
Take a team of 50 engineers each using Claude Code through Bedrock. Average usage sits around $180/seat/month based on 2026 baselines. That is $9,000/month base spend, ignoring the long tail of accidents.
In our observations of teams without per-user caps, the long tail adds 15–35% on top: runaway agent loops, accidental large-codebase ingestion, repeated retries against a broken tool. At the high end, that is $3,150/month (35% of $9,000) of pure waste. Putting per-user $25/day caps in place collapses that long tail to under 5%, or about $450/month, saving up to roughly $2,700/month for a team this size.
The gateway runs as a single Linux container plus a small managed PostgreSQL instance. Even a generous deployment in your Bedrock region, with t3.medium-equivalent compute and a db.t3.small RDS instance, sits at ~$60/month. Net savings at the high end: ~$2,640/month for a 50-person team. At 200 engineers the same math scales roughly linearly to $10K+/month in clawed-back waste.
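The arithmetic in this section can be reproduced with a short script. The function name and parameter names are illustrative; the default figures are the ones quoted in this section, and holding hosting cost flat at $60/month as seat count grows is an assumption.

```python
def cap_savings(seats: int,
                base_per_seat: float = 180.0,    # $/seat/month baseline
                tail_rate: float = 0.35,         # long-tail waste without caps (high end)
                capped_tail_rate: float = 0.05,  # long-tail waste with per-user caps
                hosting: float = 60.0) -> dict:  # gateway hosting, assumed flat
    """Return the monthly dollar figures for a team of `seats` engineers."""
    base = seats * base_per_seat
    waste_before = base * tail_rate
    waste_after = base * capped_tail_rate
    gross = waste_before - waste_after
    return {
        "base": round(base, 2),
        "waste_before": round(waste_before, 2),
        "waste_after": round(waste_after, 2),
        "gross_savings": round(gross, 2),
        "net_savings": round(gross - hosting, 2),
    }


if __name__ == "__main__":
    for seats, tail in [(50, 0.35), (50, 0.15), (200, 0.35)]:
        result = cap_savings(seats, tail_rate=tail)
        print(f"{seats} seats, {tail:.0%} tail: net ${result['net_savings']:,.0f}/month")
```

With the same inputs but the low-end 15% tail, net savings for 50 seats come to about $840/month, so the $2,640 figure is best read as an upper bound.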
What the Gateway Does Not Do
It is not a multi-provider router. If your strategy depends on falling back from Claude Opus 4.8 to DeepSeek V4-Flash when a request looks cheap enough, you still need LiteLLM, Portkey, or OpenRouter in the path. The Claude apps gateway is purpose-built for Anthropic's model family on Anthropic-blessed backends. That is a feature for governance teams who want one vendor's audit trail, and a limitation for cost-routing teams who want every dollar second-guessed.
It also does not change underlying token prices. Bedrock and Vertex pricing remain whatever your cloud commit negotiated. The gateway claws back wasted spend; it does not negotiate cheaper tokens.
When to Deploy It
Deploy if: your engineering org is over ~30 seats, you have Bedrock or Vertex commitments, and you currently have no per-user spend cap on Claude Code usage. Payback is well under a month.
Skip if: you route across multiple providers as a cost strategy (you need a gateway with multi-provider support), or your team is small enough that one overspending developer is easily caught manually.
Frequently Asked Questions
Does the Claude apps gateway send any data back to Anthropic?
No — Anthropic explicitly states the gateway does not transmit inference traffic or usage telemetry back to them, unless you also configure the Claude API as a backend. Inference stays inside your Bedrock or Vertex environment.
How much does the gateway itself cost to run?
Hosting cost is negligible — a single stateless container plus PostgreSQL. Expect roughly $50–80/month for a production-grade deployment on AWS (small EC2 + RDS) or GCP equivalents.
Can the gateway route to non-Anthropic models like DeepSeek or Gemini?
No. It is purpose-built for Anthropic's model family on Bedrock and Vertex. For multi-provider routing you still need a separate layer like LiteLLM, Portkey, or OpenRouter.
What is the single biggest cost saving from deploying this?
Per-user daily spend caps. Teams running without them typically see runaway agent loops and accidents add 15–35% on top of baseline spend. Caps collapse that to under 5%.