AI API Cost Calculator
See exactly what GPT-4, Claude, and Gemini will cost you per call, per day, and per month at the volume and prompt size you actually use. Use it to decide which model offers the best value for your workload.
| Model | Per call | Per day | Per month |
|---|---|---|---|
| GPT-4o (OpenAI · $2.5/M in · $10/M out) | $0.00750 | $0.7500 | $22.50 |
| GPT-4 Turbo (OpenAI · $10/M in · $30/M out) | $0.0250 | $2.50 | $75.00 |
| GPT-3.5 Turbo (OpenAI · $0.5/M in · $1.5/M out) | $0.00125 | $0.1250 | $3.75 |
| Claude 3.5 Sonnet (Anthropic · $3/M in · $15/M out) | $0.0105 | $1.05 | $31.50 |
| Claude 3 Opus (Anthropic · $15/M in · $75/M out) | $0.0525 | $5.25 | $157.50 |
| Claude 3 Haiku (Anthropic · $0.25/M in · $1.25/M out) | $0.00088 | $0.0875 | $2.63 |
| Gemini 1.5 Pro (Google · $1.25/M in · $5/M out) | $0.00375 | $0.3750 | $11.25 |
| Gemini 1.5 Flash, cheapest (Google · $0.075/M in · $0.3/M out) | $0.00022 | $0.0225 | $0.6750 |
Frequently asked questions
How do AI APIs charge for usage?
All major AI APIs charge per token, billed separately for input tokens (what you send) and output tokens (what the model returns). Output tokens almost always cost more — typically 3-5x the input rate. Pricing is quoted per million tokens. For example, GPT-4o is $2.50 per million input tokens and $10 per million output tokens as of 2026.
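The per-token billing described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `call_cost` is made up for the example, and the workload figures (1,000 input tokens, 500 output tokens, 100 calls per day, 30 days per month) are not stated on the page. They are an assumption, chosen because they are consistent with the figures in the table.

```python
def call_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Cost in USD of one call, with prices quoted per million tokens."""
    return (input_tokens * in_price_per_m + output_tokens * out_price_per_m) / 1_000_000


# Assumed workload (inferred from the table, not stated on the page).
INPUT_TOKENS = 1_000
OUTPUT_TOKENS = 500
CALLS_PER_DAY = 100
DAYS_PER_MONTH = 30

# GPT-4o list prices as quoted in the table: $2.50/M input, $10/M output.
per_call = call_cost(INPUT_TOKENS, OUTPUT_TOKENS, 2.50, 10.00)
per_day = per_call * CALLS_PER_DAY
per_month = per_day * DAYS_PER_MONTH

print(f"per call ${per_call:.5f}, per day ${per_day:.4f}, per month ${per_month:.2f}")
```

Under those assumptions the result matches the GPT-4o row of the table. Swapping in another model's input and output prices gives that model's row.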
Which AI API is cheapest?
For most general use cases in 2026, Gemini 1.5 Flash and Claude 3 Haiku are the cheapest, followed by GPT-3.5 Turbo. For higher-quality outputs, Gemini 1.5 Pro and GPT-4o are the best value among the strong models. Claude 3.5 Sonnet sits in the middle on price but is favored for writing and reasoning quality.
What is prompt caching and how does it reduce costs?
Prompt caching stores parts of your prompt (typically the system prompt or long context) on the vendor side, so repeated requests pay 50-90% less on the cached portion. Claude, GPT, and Gemini all offer some form of caching. For applications that send the same system prompt many times, caching can cut bills by 70% or more. Caching is not reflected in this calculator because the savings depend on your specific usage.
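As a rough way to gauge the effect, the sketch below estimates the input cost of one call when part of the prompt is billed at a cached rate. The function name, the 800-token cached portion, and the 90% discount are hypothetical values picked for illustration (the discount sits at the top of the 50-90% range quoted above). It covers input tokens only and ignores any cache-write charges or expiry rules a vendor may apply.

```python
def input_cost_with_cache(input_tokens, cached_tokens, in_price_per_m, cache_discount):
    """USD input cost of one call when `cached_tokens` are billed at a discount.

    cache_discount is the fraction taken off the list input price
    for the cached portion (for example 0.9 for 90% off).
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    discounted = cached_tokens * (1 - cache_discount)
    return (uncached + discounted) * in_price_per_m / 1_000_000


# Hypothetical call: 1,000-token prompt, 800 tokens of it a shared system prompt,
# $3/M input list price, 90% discount on cached tokens.
full = input_cost_with_cache(1_000, 0, 3.00, 0.9)
cached = input_cost_with_cache(1_000, 800, 3.00, 0.9)
savings = 1 - cached / full

print(f"input cost ${full:.5f} -> ${cached:.5f} ({savings:.0%} lower)")
```

Output tokens are billed at the normal rate either way, so the saving on the total bill is smaller than the saving on input alone.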
How do I estimate token counts before calling the API?
Use our companion AI Token Counter tool. For English prose, one token is roughly 4 characters or three-quarters of a word. A 500-word email is about 650 tokens. A 5-page PDF is roughly 2,500-3,500 tokens depending on density. For exact counts, use the official tokenizer for the model.
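The two rules of thumb above can be turned into a quick estimator. This is only a heuristic sketch for English prose: the function name `estimate_tokens` is hypothetical, and averaging the two estimates is a choice made for this example, not something the vendors' tokenizers do.

```python
def estimate_tokens(text):
    """Rough token estimate for English prose from two rules of thumb."""
    by_chars = len(text) / 4              # about 4 characters per token
    by_words = len(text.split()) / 0.75   # about 0.75 words per token
    # Averaging the two estimates is an arbitrary choice for this sketch.
    return round((by_chars + by_words) / 2)


# A stand-in for a 500-word message: 500 short words separated by spaces.
sample = "word " * 500
print(estimate_tokens(sample))
```

For the 500-word sample the estimate lands near the 650 tokens quoted above. Code, non-English text, and unusual formatting can tokenize very differently, which is why the official tokenizer is the right tool when the count matters.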
Are these prices accurate?
They are the published list prices as of 2026 from each vendor. Real bills can be lower if you use batch APIs (typically 50% discount with delayed processing), prompt caching (50-90% off cached inputs), or volume commitments. For billing-critical decisions, always verify against the vendor pricing page on the day you commit.
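To see how a batch discount might change a monthly figure, here is a small sketch. The function name and the 60% batch share are hypothetical, and the 50% default simply mirrors the typical discount mentioned above. Actual terms vary by vendor.

```python
def monthly_cost_with_batch(monthly_list_cost, batch_share, batch_discount=0.5):
    """Monthly USD cost when a share of traffic moves to a discounted batch API.

    batch_share is the fraction of spend that can tolerate delayed processing;
    batch_discount is the fraction taken off the list price for that share.
    """
    if not 0 <= batch_share <= 1:
        raise ValueError("batch_share must be between 0 and 1")
    return monthly_list_cost * (1 - batch_share * batch_discount)


# Hypothetical: a $22.50 list-price month with 60% of calls sent through batch.
adjusted = monthly_cost_with_batch(22.50, 0.6)
print(f"${adjusted:.2f}")
```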