AI Cost Estimator

Estimate your AI coding costs

← Back to Blog

How to Calculate Cost per AI Agent Task: A Practical Formula for Developers

May 21, 2026 · 6 min read

Cost per Prompt Is the Wrong Metric

Developers often ask how much an AI coding prompt costs. That number is useful, but it is not the metric that matters. The better metric is cost per completed agent task: the total model spend required to finish a bug fix, generate tests, migrate a component, review a pull request, or ship a feature.

A cheap prompt can become expensive if it leads to five retries. An expensive premium-model prompt can be cheap if it solves a difficult task in one pass. Task-level accounting is the only fair way to compare models, tools, and workflows.

The Basic Formula

The direct token cost of an AI agent task is straightforward:

Task cost = input tokens × input price + output tokens × output price

Because most providers quote prices per million tokens, divide token counts by 1,000,000 before multiplying. For example, 200,000 input tokens on a $3/M input model costs $0.60. 40,000 output tokens on a $15/M output model costs $0.60. The total direct token cost is $1.20.

Cost component What to count
Input tokensPrompts, files, logs, diffs, tool output, history
Output tokensCode, explanations, plans, tests, summaries
RetriesFailed attempts, test-fix loops, review changes
Parallel agentsResearch, implementation, QA, security review workers

Add the Retry Multiplier

Most real coding tasks do not finish on the first turn. The agent writes a patch, runs tests, sees failures, edits again, and responds to reviewer feedback. That retry loop is why cost per task can be several times higher than cost per initial prompt.

A simple bug fix may have a retry multiplier of 1.2. A multi-file refactor may be 2.0. A migration with hidden tests may be 3.0 or higher. The multiplier should reflect your actual workflow, not an optimistic demo.

Do Not Ignore Human Review

Token cost is only part of the economics. If a cheap model produces code that takes 45 minutes to review, it may be more expensive than a premium model that produces a clean patch in 10 minutes. For teams, the best metric is often fully loaded task cost: token spend plus human review time.

You do not need a perfect accounting system. Even rough estimates help. Track agent task type, model used, number of turns, whether tests passed, and review time. Patterns appear quickly.

Bottom Line

Cost per AI agent task is the right way to compare coding models. Count input tokens, output tokens, retries, parallel agents, and review time. Then compare the final cost per completed engineering outcome.

Use the AI Cost Estimator to model different task sizes and see how model choice changes total spend.

Want to calculate exact costs for your project?