← Back to Blog

Memory Prices Surging 40–50% in Q3 2026: Samsung + SK Hynix's $590B Bet and Your AI Coding API Bill

By Eric Bush · June 30, 2026 · 8 min read

Close-up of memory chips on a server motherboard with copper heatsinks

The Forecast That Reframes 2026 Budgets

Jefferies analysts projected on June 30, 2026 that memory prices will rise 40–50% in Q3 2026, another 30–40% in Q4, and 40–45% through 2027. Capacity relief from new factories doesn't arrive until 15-20% of new lines come online in 2028. Two companies — Samsung and SK Hynix — control roughly 80% of the global high-bandwidth memory (HBM) chip market, the kind that sits next to every Nvidia and AMD AI accelerator.

In response, Samsung and SK Hynix announced a combined $590 billion (590 trillion KRW) capacity build-out: 800T KRW for four new fabs, 81T KRW for packaging centers, and 30T KRW over 15 years for next-gen research. SK Group separately committed to 15GW of AI data center capacity by 2035, totaling 1,000T KRW. Apple has already raised Mac and MacBook prices in response. Your AI API bill is the next thing to move.

The Path From DRAM to Token Prices

AI inference accelerators are HBM-bound. A single H100 GPU carries 80GB of HBM3; B200 carries 192GB of HBM3e; GB300 (announced for Microsoft Foundry rollouts) carries even more. Memory is roughly 40-50% of the bill of materials on a modern AI accelerator.

Walk the math: if HBM goes up 50% in Q3, the BOM of a B200 rises by roughly 20-25%. Cloud providers buying B200s under fixed contracts absorb part of the hit; new orders renegotiate at the higher rate. Inference services priced per-token then face a 10-15% upward pressure on the underlying cost basis within 6-9 months.

Provider Exposure by Hardware Mix

Provider Primary Hardware HBM Exposure Likely 2026 Price Move
Anthropic (AWS Trainium + Bedrock) Trainium + GB300 High Flat-to-up; cushioned by Apollo silicon deal
OpenAI (Microsoft Azure) Mixed H100/B200/GB300 Very High Up 10-15% on flagship; mid-tier may hold
Google (TPU + GPU mix) TPU v6+ primary Moderate Likely stable; TPU memory architecture differs
DeepSeek (China-domestic) Domestic AI accelerators Insulated Continues aggressive pricing
AMD MI355X-class inference AMD CDNA + HBM High Squeezed; aggressive HBM stockpiling reported

What This Means for Your 2026-2027 Coding Budget

The cost-collapse story of AI coding over 2024-2026 was driven by software efficiency outrunning hardware. Now hardware is pushing back hard. Three concrete moves for budgeting teams:

1. Lock long-context-heavy contracts now. Long-context inference is the most memory-intensive workload type. If your team relies on Claude Sonnet 4.6 or Gemini 3.1 Pro for full-codebase reasoning, this is the cheapest those tokens will be for the next 18 months.

2. Build DeepSeek into your routing now. A China-domestic provider sitting outside the HBM supply chain is structurally insulated from this price wave. Lindy's 100% switch to DeepSeek looks more rational every month.

3. Watch prompt caching like a hawk. Cached prompts skip the memory-heavy KV-cache rebuild. If your cache hit rate is currently 40%, getting it to 70% becomes structurally more valuable as raw token prices rise.

The Long View

The $590B Korean capex eventually relieves the shortage — but eventually means 2028 at the earliest, and that's assuming demand growth stalls. For the next 18 months, AI coding cost optimization shifts from "find the cheapest token" to "structurally insulate yourself from the memory squeeze." That looks like a mix of cached prompts, mid-tier models, multi-provider routing, and selective use of frontier capacity only where it pays.

Want to calculate exact costs for your project?

Frequently Asked Questions

How big is the projected memory price surge?

Jefferies forecasts DRAM and HBM rising 40-50% in Q3 2026, an additional 30-40% in Q4 2026, and 40-45% through 2027. Capacity relief is not expected until 15-20% of new Korean fab capacity comes online in 2028.

Which AI providers are most exposed to the HBM squeeze?

OpenAI on Azure has highest exposure due to heavy B200/GB300 inventory needs. AMD MI355X-based deployments are also squeezed. Google's TPU architecture is more insulated, and DeepSeek's China-domestic stack sits largely outside the supply chain affected.

Will Claude or GPT prices definitely go up in 2026?

Flagship tier prices face strong upward pressure of 10-15% on the underlying cost basis. Whether that's passed to customers depends on each provider's strategy. Anthropic's Apollo silicon deal cushions Claude somewhat; OpenAI is more exposed.

What's the single best hedge for an AI coding team?

Multi-provider routing with DeepSeek or another structurally-insulated provider in the mix, combined with aggressive prompt caching. Cache hits skip the memory-heavy KV-cache rebuild and become structurally more valuable as raw token prices rise.