OpenAI Moderation API vs Building Your Own: Cost of Content Safety in AI Apps
June 5, 2026
OpenAI Embeds Moderation Directly in API Responses
OpenAI now returns moderation scores directly in Responses API and Completions API responses. Instead of making a separate call to the moderation endpoint, apps receive moderation signals in the same request flow. The scores cover categories like hate speech, self-harm, sexual content, and violence, and can be used for logging, routing, auditing, or blocking without additional API calls.
This changes the cost equation for content safety. Previously, developers had to choose between OpenAI's free-but-separate moderation endpoint, building their own classifiers, or paying for third-party moderation APIs. Now, moderation comes bundled with inference at zero additional token cost.
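As a minimal sketch of the routing use case, assume the scores arrive as a mapping from category name to a confidence between 0 and 1. The exact field layout of the inline scores is not shown in this article, so `scores` below is a plain dict, and `route`, `BLOCK_AT`, and `REVIEW_AT` are hypothetical names with illustrative values.

```python
from typing import Mapping, Optional, Tuple

# Illustrative thresholds, not values recommended by OpenAI.
BLOCK_AT = 0.85
REVIEW_AT = 0.40


def route(scores: Mapping[str, float],
          block_at: float = BLOCK_AT,
          review_at: float = REVIEW_AT) -> Tuple[str, Optional[str]]:
    """Return (action, category) based on the highest category score."""
    if not scores:
        return "allow", None
    # Pick the category the moderation signal is most confident about.
    category, top = max(scores.items(), key=lambda kv: kv[1])
    if top >= block_at:
        return "block", category
    if top >= review_at:
        return "review", category
    return "allow", None


print(route({"hate": 0.02, "violence": 0.91}))  # ('block', 'violence')
print(route({"hate": 0.55, "violence": 0.10}))  # ('review', 'hate')
print(route({"hate": 0.01, "violence": 0.03}))  # ('allow', None)
```

The same return value can feed a log line or an audit record, so one decision function covers several of the uses listed above.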
The Four Approaches to Content Safety
Every AI application that accepts user input needs content moderation. The question is how much it costs and how well it works. Here are the four main approaches:
1. OpenAI's built-in moderation — now included in API responses at no extra cost. Previously available as a free separate endpoint. Covers standard harm categories with confidence scores.
2. Third-party moderation APIs — services like Perspective API (Google), Azure Content Safety, or specialized providers like Hive Moderation. Pay per request with varying coverage.
3. Fine-tuned custom classifiers — train your own model on domain-specific content. Higher upfront cost, lower marginal cost at scale, and catches domain-specific abuse patterns.
4. Human review teams — manual moderation for edge cases or as a final layer. Most expensive per request but highest accuracy for nuanced content.
Cost Per 1M Requests: Head-to-Head Comparison
| Approach | Cost per 1M requests | Latency added | Setup cost |
|---|---|---|---|
| OpenAI inline moderation | $0 (bundled) | 0ms (same response) | $0 |
| OpenAI separate endpoint | $0 (free) | 50-100ms | $0 |
| Azure Content Safety | $1,000-$1,500 | 30-80ms | $0 |
| Hive Moderation | $1,200-$2,000 | 100-200ms | $0 |
| Custom fine-tuned classifier | $50-$200 (inference) | 10-30ms | $5,000-$20,000 |
| Human review (outsourced) | $50,000-$200,000 | minutes-hours | $2,000-$5,000 |
The numbers tell a clear story: if you're already using OpenAI for inference, their bundled moderation is effectively free. The real cost question only arises when you need moderation beyond what OpenAI covers or when you're not using OpenAI as your primary provider.
When Free Moderation Isn't Enough
OpenAI's moderation covers broad categories but has limitations that may force you to invest in additional layers:
Domain-specific abuse. A children's education platform needs stricter thresholds than a developer tool. A financial app needs to detect scam patterns. OpenAI's generic categories don't cover these.
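Multi-modal content. If your app handles images or audio alongside text, you need moderation for each modality. Text moderation won't catch harmful images uploaded by users. OpenAI's standalone moderation endpoint accepts image inputs through its omni-moderation model, but that is a separate call and does not cover audio.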
Regulatory compliance. GDPR, COPPA, DSA, and industry-specific regulations may require audit trails, appeal mechanisms, and specific categorization that generic moderation doesn't provide.
Provider independence. If you use Claude, Gemini, or open-source models for inference, you don't get OpenAI's bundled moderation and need your own solution.
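The domain-specific case can be sketched as a per-application profile layered on top of generic scores. Everything here is an assumption for illustration: the `DomainProfile` class, the threshold values, and the two scam patterns are hypothetical and would need tuning against real traffic.

```python
import re
from dataclasses import dataclass, field
from typing import List, Mapping


@dataclass
class DomainProfile:
    """Per-application moderation settings (all values are illustrative)."""
    name: str
    block_threshold: float  # lower means stricter
    extra_patterns: List[re.Pattern] = field(default_factory=list)


KIDS_EDUCATION = DomainProfile("kids_education", block_threshold=0.20)
DEVELOPER_TOOL = DomainProfile("developer_tool", block_threshold=0.80)
FINANCE_APP = DomainProfile(
    "finance_app",
    block_threshold=0.60,
    extra_patterns=[
        re.compile(r"guaranteed\s+returns?", re.IGNORECASE),
        re.compile(r"send\s+.*\bgift\s+cards?\b", re.IGNORECASE),
    ],
)


def is_blocked(text: str, scores: Mapping[str, float],
               profile: DomainProfile) -> bool:
    """Block if a generic score crosses the profile threshold
    or one of the profile's domain rules matches the text."""
    if any(score >= profile.block_threshold for score in scores.values()):
        return True
    return any(p.search(text) for p in profile.extra_patterns)


scores = {"violence": 0.35}
print(is_blocked("some message", scores, KIDS_EDUCATION))  # True
print(is_blocked("some message", scores, DEVELOPER_TOOL))  # False
```

The same score of 0.35 is blocked on the children's platform and allowed in the developer tool, which is the threshold difference described above.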
Building a Custom Classifier: Real Cost Breakdown
For teams that need custom moderation, here's what building your own actually costs:
Training data collection: $2,000-$8,000. You need 10K-50K labeled examples of both safe and unsafe content in your domain. This often requires specialized annotators who understand context.
Model training: $500-$3,000. Fine-tuning a small BERT-class model on your labeled data. GPU compute for training is relatively cheap; the iteration cycles add up.
Infrastructure: $100-$500/month. Hosting a small classifier model for inference. A distilled model can run on a single GPU instance or even CPU for moderate traffic.
Maintenance: $500-$2,000/month ongoing. Adversarial content evolves. You need regular retraining, threshold tuning, and false positive review.
Total first-year cost: roughly $15,000-$40,000. At scale (10M+ requests/month), the per-request cost can drop below $0.0001, far cheaper than third-party APIs. The breakeven point versus Azure Content Safety is around 15-20M cumulative requests.
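The arithmetic behind these figures can be checked with a short sketch. The line items come from the breakdown above; the breakeven formula is a simplifying assumption (total custom cost divided by the API's per-request price) and the function names are hypothetical.

```python
from typing import Tuple

# Line items from the breakdown above, as (low, high) in USD.
ONE_TIME = {"training_data": (2_000, 8_000), "model_training": (500, 3_000)}
MONTHLY = {"infrastructure": (100, 500), "maintenance": (500, 2_000)}


def first_year_total(months: int = 12) -> Tuple[int, int]:
    """Sum one-time costs plus recurring costs over the given months."""
    low = sum(v[0] for v in ONE_TIME.values()) \
        + months * sum(v[0] for v in MONTHLY.values())
    high = sum(v[1] for v in ONE_TIME.values()) \
        + months * sum(v[1] for v in MONTHLY.values())
    return low, high


def breakeven_requests(custom_total_usd: float,
                       api_usd_per_million: float) -> float:
    """Requests at which cumulative API spend equals the custom total."""
    return custom_total_usd / api_usd_per_million * 1_000_000


print(first_year_total())                 # (9700, 41000)
print(breakeven_requests(15_000, 1_000))  # 15000000.0
print(breakeven_requests(40_000, 1_500))  # about 26.7 million
```

Taking every line item at its minimum gives about $9,700, so the $15,000 lower bound presumably assumes that not all minimums occur together. Under this simple formula the 15-20M breakeven corresponds to the lower end of the first-year cost range; a $40,000 build would break even closer to 27-40M requests.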
The Hybrid Approach: Optimizing for Cost and Coverage
Most production systems use a layered approach:
Layer 1: OpenAI's bundled moderation (free) catches an estimated 90% of clearly harmful content with zero additional latency. Use this as your first gate.
Layer 2: Lightweight custom rules or a small classifier ($50-$200/1M requests) handles domain-specific patterns that OpenAI misses. Only triggered for content that passes Layer 1 but matches suspicious patterns.
Layer 3: Human review ($0.05-$0.20 per item) for edge cases flagged by Layers 1-2 but not clearly actionable. Used for less than 1% of total content.
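The three layers can be sketched as a single decision function. This is one possible arrangement, not a reference design: `looks_suspicious` stands in for cheap custom rules, `layer2_risk` for a small classifier, and all thresholds are illustrative assumptions.

```python
from typing import Callable, Mapping


def moderate(text: str,
             layer1_scores: Mapping[str, float],
             looks_suspicious: Callable[[str], bool],
             layer2_risk: Callable[[str], float],
             l1_block_at: float = 0.85,
             l1_review_at: float = 0.40,
             l2_block_at: float = 0.70,
             l2_review_at: float = 0.40) -> str:
    """Return 'block', 'human_review', or 'allow'."""
    top = max(layer1_scores.values(), default=0.0)
    # Layer 1: bundled scores act as the first gate.
    if top >= l1_block_at:
        return "block"
    if top >= l1_review_at:
        # Flagged by Layer 1 but not clearly actionable.
        return "human_review"
    # Layer 2 runs only on content that passed Layer 1
    # and matches a suspicious pattern.
    if not looks_suspicious(text):
        return "allow"
    risk = layer2_risk(text)
    if risk >= l2_block_at:
        return "block"
    if risk >= l2_review_at:
        # Layer 3: uncertain Layer 2 result goes to a person.
        return "human_review"
    return "allow"


# Toy stand-ins for the custom rules and classifier.
suspicious = lambda t: "wire" in t.lower()
risk = lambda t: 0.9 if "urgent" in t.lower() else 0.5

print(moderate("hello there", {"hate": 0.01}, suspicious, risk))
print(moderate("Urgent: wire the money", {"hate": 0.01}, suspicious, risk))
print(moderate("Please wire the deposit", {"hate": 0.01}, suspicious, risk))
```

Because the classifier is only called when the cheap check fires, most traffic never reaches Layer 2, which is what keeps the marginal cost low.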
This hybrid costs approximately $200-$800 per 1M requests with near-human accuracy, compared to $50,000+ for full human review at the same volume.
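A rough check of the hybrid figure, assuming Layer 2 is priced at $50-$200 per 1M requests and Layer 3 at $0.05-$0.20 per reviewed item. The 0.3% review rate is an assumed value chosen to sit under the "less than 1%" stated above, and the function name is hypothetical.

```python
def hybrid_cost_per_million(layer2_usd_per_million: float,
                            review_fraction: float,
                            review_usd_per_item: float) -> float:
    """USD to moderate 1M requests with Layers 2 and 3 (Layer 1 is free)."""
    reviewed_items = 1_000_000 * review_fraction
    return layer2_usd_per_million + reviewed_items * review_usd_per_item


# Assumed 0.3% of items reach human review.
low = hybrid_cost_per_million(50, 0.003, 0.05)
high = hybrid_cost_per_million(200, 0.003, 0.20)
print(round(low), round(high))

# A full 1% review rate, for comparison.
low_1pct = hybrid_cost_per_million(50, 0.01, 0.05)
high_1pct = hybrid_cost_per_million(200, 0.01, 0.20)
print(round(low_1pct), round(high_1pct))
```

At a 0.3% review rate this works out to about $200-$800 per 1M requests; at a full 1% it rises to about $550-$2,200, so the human review rate dominates the hybrid's cost.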
Cost Impact on AI Application Architecture
Content safety costs should be factored into your AI app's unit economics from the start. If you're building a chatbot that handles 100K messages/day (about 3M per month), the additional monthly cost looks like this:
Using only OpenAI's bundled moderation: $0/month.
Using Azure Content Safety as a second layer: $3,000-$4,500/month.
Using a custom classifier: $200-$500/month after initial investment.
Sending 1% to human review: $1,500-$6,000/month.
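The Azure and human review figures for the 100K messages/day chatbot follow from the per-million prices in the comparison table. The sketch below assumes a 30-day month; `cost_range` is a hypothetical helper name.

```python
from typing import Tuple

DAYS_PER_MONTH = 30  # assumption used to match the figures above


def cost_range(requests: int,
               low_usd_per_million: int,
               high_usd_per_million: int) -> Tuple[float, float]:
    """Scale a per-1M-requests price range to a given request count."""
    return (requests * low_usd_per_million / 1_000_000,
            requests * high_usd_per_million / 1_000_000)


monthly_requests = 100_000 * DAYS_PER_MONTH  # 3,000,000
reviewed = monthly_requests // 100           # 1% sent to human review

print(cost_range(monthly_requests, 1_000, 1_500))  # (3000.0, 4500.0)
print(cost_range(reviewed, 50_000, 200_000))       # (1500.0, 6000.0)
```

Changing `monthly_requests` to your own traffic gives a quick first estimate before pricing is confirmed with each vendor.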
For most AI coding tools and developer-facing applications, OpenAI's built-in moderation is sufficient. The investment in custom moderation only makes sense for consumer-facing applications with high-volume user-generated content where the risk of harmful content slipping through has legal or reputational consequences.
Recommendation by Scale
| Monthly requests | Recommended approach | Monthly cost |
|---|---|---|
| <100K | OpenAI bundled only | $0 |
| 100K-1M | OpenAI + keyword rules | $0-$50 |
| 1M-10M | OpenAI + custom classifier | $200-$500 |
| 10M+ | Custom classifier + human review layer | $1,000-$5,000 |
Frequently Asked Questions
How much does OpenAI's moderation API cost?
OpenAI's moderation is free — both the standalone endpoint and the new inline moderation scores bundled with Responses API and Completions API responses. There is no additional token charge for moderation signals.
What does it cost to build a custom content moderation system?
First-year costs range from $15,000-$40,000 including training data ($2K-$8K), model training ($500-$3K), infrastructure ($100-$500/month), and ongoing maintenance ($500-$2K/month). Per-request costs drop below $0.0001 at scale.
When should I build custom moderation vs using OpenAI's?
Build custom moderation when you need domain-specific abuse detection, multi-modal coverage, regulatory compliance beyond standard categories, or when you don't use OpenAI as your inference provider. For most developer tools, OpenAI's built-in moderation is sufficient.
How much does content moderation cost per million API requests?
OpenAI bundled: $0. Azure Content Safety: $1,000-$1,500. Hive Moderation: $1,200-$2,000. Custom classifier: $50-$200. Human review: $50,000-$200,000. A hybrid approach typically costs $200-$800 per million requests.
Does the new OpenAI moderation add latency to API responses?
No. The inline moderation scores are returned as part of the same API response, adding zero additional latency. The separate moderation endpoint adds 50-100ms per call.