Paste Your Prompt. See Real Costs.

Enter the exact prompt you use in production. See what each API call costs you today — then see what it costs after distillation.

Prompt length: ~62 tokens

Estimated at ~4 chars / token · actual count may vary ±10%

Output length: 200 tokens

Presets: Short (50) · Tweet (280) · Paragraph (500) · Long (2000)

Input tokens: 62

Output tokens: 200

Total / call: 262

Requests / month: 10K

Cost at 10K requests / month

Model (provider) · price per 1M tokens · per call · monthly total

GPT-4o (OpenAI) · $5/M in, $15/M out · $0.0033 / call · $33.10 (₹2.8K)

GPT-4 Turbo (OpenAI) · $10/M in, $30/M out · $0.0066 / call · $66.20 (₹5.6K)

Claude Sonnet (Anthropic) · $3/M in, $15/M out · $0.0032 / call · $31.86 (₹2.7K)

Gemini 1.5 Pro (Google) · $3.5/M in, $10.5/M out · $0.0023 / call · $23.17 (₹1.9K)

Distillfast (your model) · $0.10/M in, $0.20/M out · $0.00005 / call · $0.46 (₹39)
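If you want to check these figures yourself, here is a minimal Python sketch of the arithmetic. It assumes cost is simply tokens times the per-million-token price. The prices are copied from the comparison above, while the names `PRICES`, `cost_per_call`, and `monthly_cost` are illustrative and not part of any provider's API.

```python
# USD per 1M tokens (input, output), copied from the comparison on this page.
PRICES = {
    "GPT-4o": (5.00, 15.00),
    "GPT-4 Turbo": (10.00, 30.00),
    "Claude Sonnet": (3.00, 15.00),
    "Gemini 1.5 Pro": (3.50, 10.50),
    "Distillfast": (0.10, 0.20),
}


def cost_per_call(input_tokens, output_tokens, in_price, out_price):
    """USD cost of one call, given prices in USD per 1M tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def monthly_cost(input_tokens, output_tokens, in_price, out_price, requests):
    """USD cost of `requests` identical calls."""
    return cost_per_call(input_tokens, output_tokens, in_price, out_price) * requests


# The example on this page: 62 input tokens, 200 output tokens, 10K requests.
for model, (in_price, out_price) in PRICES.items():
    per_call = cost_per_call(62, 200, in_price, out_price)
    total = monthly_cost(62, 200, in_price, out_price, 10_000)
    print(f"{model}: ${per_call:.5f} / call, ${total:.2f} / month")
```

Real bills can differ, since the token counts here are estimates and provider prices change over time.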

vs GPT-4o at 10K requests: 99% cheaper with Distillfast

You save / month: ₹2.7K

Per call: before $0.0033 · after $0.00005
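The savings figure can be reproduced from the two monthly totals on this page ($33.10 for GPT-4o and $0.46 for Distillfast at 10K requests). The sketch below is only an illustration: `savings` is a hypothetical helper, and it stays in USD because the page does not state the exchange rate behind its ₹ figures.

```python
def savings(before, after):
    """Return (absolute saving, percent saving) when cost drops from `before` to `after`."""
    saved = before - after
    return saved, 100 * saved / before


# Monthly totals at 10K requests, taken from this page.
saved, pct = savings(33.10, 0.46)
print(f"${saved:.2f} saved per month ({pct:.0f}% cheaper)")
# prints: $32.64 saved per month (99% cheaper)
```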

Fine-tune for this use case — it's free

No credit card · Free plan includes 2 models · Under 5 min setup

Frequently Asked Questions

How much does GPT-4o cost per 1,000 requests?

A typical prompt with 500 input + 200 output tokens costs ~$0.0055 per call at $5/M input and $15/M output. That is $5.50 per 1,000 requests, $55 per 10K requests, and $5,500 per month at 1M requests.
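To see how that per-call cost scales with volume, here is a small sketch using the GPT-4o prices listed on this page ($5/M input, $15/M output). `call_cost` is an illustrative name, and the sketch assumes every request uses the same 500 input and 200 output tokens.

```python
IN_PRICE, OUT_PRICE = 5.00, 15.00  # GPT-4o, USD per 1M tokens (from this page)


def call_cost(input_tokens, output_tokens):
    """USD cost of one call at the prices above."""
    return (input_tokens * IN_PRICE + output_tokens * OUT_PRICE) / 1_000_000


per_call = call_cost(500, 200)  # 0.0055
for requests in (1_000, 10_000, 1_000_000):
    print(f"{requests:>9,} requests: ${per_call * requests:,.2f}")
```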

How many tokens is my prompt?

1 token ≈ 4 characters (¾ of a word on average). Paste your prompt above — the calculator estimates tokens in real time.
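The rule of thumb is easy to sketch in Python. This is not a real tokenizer and not necessarily how the calculator rounds. `estimate_tokens` is a hypothetical helper, and rounding up is an assumption made here so that any non-empty prompt counts as at least one token.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate from character count (assumes ~4 chars per token)."""
    return math.ceil(len(text) / chars_per_token)


prompt = "Summarize the following support ticket in two sentences."
print(f"~{estimate_tokens(prompt)} tokens")
```

For an exact count, run the prompt through the tokenizer of the model you plan to call.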

How can I reduce my LLM API costs by 80%?

Fine-tune a smaller open-source model (for example Llama 3.1 or Mistral 7B) on your specific task. The model learns your use case and can run at 5–10% of GPT-4's price, a 90–95% reduction, with comparable accuracy on that task.

What is the difference between input and output tokens?

Input tokens = everything you send (system prompt + user message). Output tokens = what the model generates. Per token, output costs 3–5× as much as input on the hosted providers compared above.
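Because output is priced higher, it can dominate the bill even for short prompts. The sketch below splits one call into its input and output cost, using the GPT-4o prices and the 62-in / 200-out example from this page. `cost_split` is an illustrative helper, not a provider API.

```python
def cost_split(input_tokens, output_tokens, in_price, out_price):
    """Return (input_cost, output_cost, output_share); prices are USD per 1M tokens."""
    input_cost = input_tokens * in_price / 1_000_000
    output_cost = output_tokens * out_price / 1_000_000
    total = input_cost + output_cost
    share = output_cost / total if total else 0.0
    return input_cost, output_cost, share


in_cost, out_cost, share = cost_split(62, 200, 5.00, 15.00)
print(f"input ${in_cost:.5f}, output ${out_cost:.5f}, output share {share:.0%}")
# prints: input $0.00031, output $0.00300, output share 91%
```

With these numbers, trimming the response length saves more than trimming the prompt.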

Does token count vary between GPT-4, Claude, and Gemini?

Yes. Each model uses a different tokenizer, so the same prompt produces a different count on each. The calculator uses the common ~4 chars/token estimate, which is typically within about ±10% across all three for English prose.

When does fine-tuning beat prompt engineering?

When you run the same class of task thousands of times per day. Fine-tuning compresses your system prompt into model weights — shorter prompts, faster inference, lower cost.