The most common cost optimization: use a cheaper model for tasks that don't need the expensive one. We ran the numbers.
The price gap
GPT-4o: $2.50/$10.00 per million tokens. Claude Haiku: $0.25/$1.25. That's 10x on input, 8x on output. For an agent making 10K calls/day, switching eligible tasks saves $200-400/month.
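The arithmetic can be sanity-checked in a few lines. The per-call token counts below (300 input / 100 output) are assumed for illustration, not measurements from this article; the 73% eligible share comes from the finding discussed later.

```python
# Back-of-envelope check of the monthly savings claim.
# Assumed workload (not from the article): 300 input / 100 output
# tokens per call; 73% of calls eligible to switch to Haiku.

GPT4O = {"in": 2.50, "out": 10.00}  # $ per million tokens
HAIKU = {"in": 0.25, "out": 1.25}

def monthly_cost(price, calls_per_day, in_tok, out_tok, days=30):
    """Dollar cost for a month of traffic at the given per-token prices."""
    m_in = calls_per_day * in_tok * days / 1e6    # input tokens, millions
    m_out = calls_per_day * out_tok * days / 1e6  # output tokens, millions
    return m_in * price["in"] + m_out * price["out"]

calls, in_tok, out_tok, eligible = 10_000, 300, 100, 0.73

all_gpt4o = monthly_cost(GPT4O, calls, in_tok, out_tok)
mixed = (monthly_cost(GPT4O, calls * (1 - eligible), in_tok, out_tok)
         + monthly_cost(HAIKU, calls * eligible, in_tok, out_tok))

print(f"all GPT-4o: ${all_gpt4o:.2f}/mo")
print(f"routed:     ${mixed:.2f}/mo")
print(f"savings:    ${all_gpt4o - mixed:.2f}/mo")
```

With these assumed token counts the savings land around $340/month, consistent with the $200-400 range above; heavier traffic or longer outputs push it higher.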
Task-by-task breakdown
Classification. Across 5,000 support tickets, Haiku matched GPT-4o within 2% accuracy on structured classification. Verdict: switch to Haiku.
Data extraction. For well-structured schemas, Haiku's results are indistinguishable from GPT-4o's. For messy, free-form text, GPT-4o holds a 5-8% accuracy edge. Verdict: Haiku for structured input, GPT-4o for messy input.
Summarization. For documents under 10K tokens, both produce good summaries. For long documents that need nuanced synthesis, premium models win. Verdict: Haiku for short documents, premium for long ones.
Code generation. Premium models clearly win here: Claude Sonnet and GPT-4o produce meaningfully better code than Haiku. Verdict: keep premium.
Conversational QA. For FAQ-style answers grounded in a knowledge base, Haiku is sufficient; for open-ended reasoning, GPT-4o is better. Verdict: depends on complexity.
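The verdicts above amount to a small routing table. The function below is an illustrative sketch of that table, not a library API; the task labels, the 10K-token cutoff, and the flags are assumptions made for the example.

```python
# Illustrative router encoding the task-by-task verdicts above.
# Task names, flags, and thresholds are assumptions for this sketch.

def pick_model(task: str, *, structured: bool = True,
               doc_tokens: int = 0, open_ended: bool = False) -> str:
    if task == "classification":
        return "haiku"                              # matches within 2%
    if task == "extraction":
        return "haiku" if structured else "gpt-4o"  # messy text favors premium
    if task == "summarization":
        return "haiku" if doc_tokens < 10_000 else "gpt-4o"
    if task == "codegen":
        return "gpt-4o"                             # premium clearly wins on code
    if task == "qa":
        return "gpt-4o" if open_ended else "haiku"  # FAQ vs open-ended reasoning
    return "gpt-4o"                                 # unknown task: don't degrade

print(pick_model("summarization", doc_tokens=50_000))  # long doc -> premium
```

Defaulting unknown tasks to the premium model keeps the router safe to extend: a new task type costs more until it is explicitly classified, rather than silently losing quality.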
The 73% finding
Across all deployments we analyzed, 73% of GPT-4o calls were for tasks where a cheaper model would produce equivalent results. The fix isn't replacing GPT-4o everywhere; it's matching each task to the cheapest model that handles it. AgentCostPilot's Model Comparison shows exactly which calls to switch.