KJP AI Cost Estimate — Token & Model Breakdown
Consolidated from Notion 2026-04-23. Notion is deprecated. Status: Phase 2 — deferred to fall/winter 2026.
Scope: KJP AI Tagging System — backfill + ongoing + optional descriptions
Last updated: 2026-03-10 (source) · 2026-04-23 (migrated here)
Note: All estimates use prompt caching as the baseline and assume images are pre-resized to 1024px max before API submission.
How the token math works
Every request sends three things:
- Image — ~1,400 tokens after resize to 1024px max (see warning below)
- Tag list + prompt — ~8,500 tokens (the full tag taxonomy is the expensive part)
- Output — ~150 tokens (JSON blob of selected tags + colors)
Large image warning: KJP's catalog includes images up to 3,000×3,000px or larger. Without a resize step, image token cost scales with resolution — a 3000×3000 image hits ~12,000 image tokens vs ~1,400 post-resize. That's roughly 9× the image-token cost with zero classification benefit (full resolution isn't needed to identify a flower species or color). The batch runner must resize every image to 1024px max on the longest side before sending to the API. All estimates on this page assume that step is in place. If skipped, multiply the backfill cost by 3–9× depending on the image size mix.
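The resize guard is just dimension math. A minimal sketch (pure function; the actual batch runner would apply these dimensions with an image library — e.g. Pillow's `Image.thumbnail` does the equivalent in one call):

```python
def resize_target(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Cap the longest edge at max_side, preserving aspect ratio.
    Images already within the cap are left alone (never upscale)."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)

# A 3000x3000 catalog image comes back as 1024x1024 before submission.
```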
The tag list is static per batch, making it a perfect candidate for prompt caching — the first request pays full rate, every subsequent one reads the cached version at 10% of normal input price. This is the biggest single lever.
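Wiring the cache in is one field on the request. A sketch of the payload shape, following Anthropic's prompt-caching convention of a `cache_control` block on the static system content — the model slug and prompt wording here are placeholders, not confirmed values:

```python
def build_request(tag_taxonomy: str, image_b64: str,
                  model: str = "claude-haiku-4-5") -> dict:
    """Assemble a Messages API payload. The static tag list carries
    cache_control, so every request after the first reads it from cache
    at the discounted rate; the per-image content is never cached."""
    return {
        "model": model,
        "max_tokens": 512,
        "system": [
            {
                "type": "text",
                "text": tag_taxonomy,  # ~8,500 tokens, identical for the whole batch
                "cache_control": {"type": "ephemeral"},  # cached after the first request
            }
        ],
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image",
                     "source": {"type": "base64", "media_type": "image/jpeg",
                                "data": image_b64}},
                    {"type": "text",
                     "text": "Return the matching tags and colors as JSON."},
                ],
            }
        ],
    }
```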
Per-image cost by model
| Model | Input | Output | No cache | With cache | Cache + Batch |
|---|---|---|---|---|---|
| Claude Sonnet 4.6 | $3.00/MTok | $15.00/MTok | ~$0.032 | ~$0.011 | ~$0.006 |
| Claude Haiku 4.5 | $0.80/MTok | $4.00/MTok | ~$0.009 | ~$0.003 | ~$0.0015 |
| Claude Opus 4.6 | $15.00/MTok | $75.00/MTok | ~$0.161 | ~$0.055 | ~$0.028 |
| GPT-4o | $2.50/MTok | $10.00/MTok | ~$0.027 | n/a | n/a |
| GPT-4o mini | $0.15/MTok | $0.60/MTok | ~$0.002 | n/a | n/a |
| Gemini 2.0 Flash | $0.075/MTok | $0.30/MTok | ~$0.001 | n/a | n/a |
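The table figures can be reproduced from the token counts above. A sketch of the arithmetic, assuming cache reads bill at 10% of the input rate and the image is never cached (the table's cached figures run slightly higher than this model because they include cache-write surcharge and rounding buffer):

```python
def per_image_cost(input_per_mtok: float, output_per_mtok: float,
                   cached: bool = False, image_tok: int = 1400,
                   prompt_tok: int = 8500, output_tok: int = 150) -> float:
    """Estimated dollars per image. When cached, the static prompt
    (tag list) bills at 10% of the input rate; the image does not."""
    prompt_rate = input_per_mtok * (0.1 if cached else 1.0)
    return (image_tok * input_per_mtok
            + prompt_tok * prompt_rate
            + output_tok * output_per_mtok) / 1_000_000

# Haiku 4.5 with cache lands around $0.0024/image, so the ~$0.003
# table figure carries a small buffer on top of the raw math.
```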
Phase-by-phase cost breakdown
Recommended stack: Sonnet for calibration, Haiku for bulk backfill.
| Phase | Images | Model | Why | Est. cost |
|---|---|---|---|---|
| Phase 1 — Tag calibration | ~150 | Sonnet 4.6 + cache | Highest quality while tuning prompt | ~$1.65 |
| Phase 2 — Color calibration | ~50 | Sonnet 4.6 + cache | Nail the color logic before bulk run | ~$0.55 |
| Haiku quality test | ~50 | Both (compare) | Validate Haiku matches Sonnet | ~$0.70 |
| Phase 3 — Full backfill | ~8,250 | Haiku 4.5 + cache | Prompt locked; Haiku handles classification well | ~$24.75 |
| Phase 4 — Ongoing (~100/mo) | ~100/mo | Haiku 4.5 + cache | Same as backfill | ~$0.30/mo |
| Phase 5 — Descriptions (if added) | ~6,124 | Sonnet 4.6 + cache | Voice/tone quality matters | ~$34 |
Recommended budget line for client invoice:
- Tags + color only: ~$30 (rounds up for buffer)
- Tags + color + descriptions: ~$70
- Monthly ongoing: ~$1/mo (absorb into retainer or bill quarterly)
Scenario comparison — full backfill, tags + color only
| Scenario | Total cost | Notes |
|---|---|---|
| Sonnet, no optimization | ~$268 | Worst case — never do this |
| Sonnet + prompt caching | ~$94 | Acceptable fallback if Haiku quality disappoints |
| Sonnet + caching + Batch API | ~$47 | Good option without OpenRouter |
| Haiku + caching (recommended) | ~$27 | Best default after Sonnet calibration proves the prompt |
| Smart routing via OpenRouter | ~$15–25 | Best total cost + most flexibility (see below) |
OpenRouter — Worth using?
Recommendation: Yes. Use OpenRouter.
OpenRouter is a routing layer in front of every major provider (Anthropic, OpenAI, Google, Meta, Mistral) with a single OpenAI-compatible API. Write the integration once, swap or mix models without touching code.
1. Billing passthrough — the real win for agencies. OpenRouter tracks cost per request with full transparency. Two clean options:
- Give KJP their own OpenRouter account and have them fund it directly — they pay exactly what AI costs, nothing more, and G&M never touches the billing
- Run through our account and forward the OpenRouter dashboard line items
Either way, client gets an exact number. No "we think it was about $30" conversation.
2. Model routing — right tool for the job. Classification is a solved task; it doesn't need Claude for every image. A two-tier strategy:
- Tier 1 (default): Gemini 2.0 Flash or GPT-4o mini — cheap, fast, handles clear compositions well (~$0.001/image)
- Tier 2 (escalate): Sonnet — if Tier 1 returns confidence < 0.85 or the JSON fails validation, re-run (~$0.011/image)
Expected split: ~85–90% stay Tier 1, ~10–15% escalate. Total backfill: ~$18–22.
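The escalation logic is a few lines in the batch runner. A sketch, with `tier1`/`tier2` as stand-in callables wrapping the actual API calls (names and the None-on-invalid-JSON convention are assumptions for illustration):

```python
CONFIDENCE_FLOOR = 0.85  # Tier 1 results below this get re-run on Sonnet

def classify(image_b64, tier1, tier2):
    """Two-tier routing: cheap model first, escalate on low confidence
    or failed JSON validation (tier1 returns None in that case)."""
    result = tier1(image_b64)
    if result is None or result.get("confidence", 0.0) < CONFIDENCE_FLOOR:
        result = dict(tier2(image_b64))  # copy before tagging the record
        result["escalated"] = True       # track the split for cost reporting
    return result
```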
3. One SDK, no sprawl. OpenRouter is OpenAI-compatible. No separate Anthropic, Google, and OpenAI imports in the codebase. One client, one API key.
4. Descriptions always use Sonnet. Routing is config-based: tag/color batches use the tiered strategy, description batches hardcode Sonnet. One flag.
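That "one flag" could look like the following. A hypothetical config sketch — the model slugs follow OpenRouter's vendor/model naming convention but are assumptions, not confirmed identifiers:

```python
# Batch-type routing config: tag/color batches use the tiered chain,
# description batches pin Sonnet regardless of escalation.
ROUTING = {
    "tags_color": {
        "tier1": "google/gemini-2.0-flash-001",
        "tier2": "anthropic/claude-sonnet-4.6",
    },
    "descriptions": {
        "tier1": "anthropic/claude-sonnet-4.6",  # no cheap tier: voice matters
        "tier2": None,
    },
}

def model_for(batch_type: str, escalated: bool = False) -> str:
    """Pick the model slug for a request; falls back to tier1
    when the batch type has no escalation tier."""
    chain = ROUTING[batch_type]
    return chain["tier2"] if escalated and chain["tier2"] else chain["tier1"]
```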
OpenRouter markup: 0–5% over provider cost. On this project, ~$1–3 extra on the full backfill. Negligible.
OpenRouter cost model
| Task | Model | Per image | Full backfill |
|---|---|---|---|
| Tags + color, Tier 1 | Gemini 2.0 Flash | ~$0.001 | ~$8 (every image takes a Tier 1 pass) |
| Tags + color, Tier 2 escalation (~10–15%) | Claude Sonnet 4.6 | ~$0.011 | ~$10 |
| Descriptions (Phase 5 only) | Claude Sonnet 4.6 | ~$0.0055 | ~$34 |
| Total — tags + color | Mixed | — | ~$18–22 |
| Total — with descriptions | Mixed | — | ~$52–56 |
What to put on the client invoice
| Line item | Amount | Notes |
|---|---|---|
| AI API costs — tags + color backfill | $30 | Padded above expected actual (~$18–25) |
| AI API costs — descriptions (if Phase 5) | $40 | Separate line item, only if activated |
| Ongoing AI costs | ~$1–5/mo | Absorb into retainer or pass through quarterly |
Simplest billing approach: If KJP funds their own OpenRouter account, this line item disappears from our invoice entirely. They load credits, we show them the dashboard.
Haiku quality validation (important)
Haiku 4.5 is a legitimate classification model but hasn't been tested on this specific task. Don't assume — test it.
At the end of Phase 2, run a 50-image parallel test: same images, same prompt, Haiku vs. Sonnet. Compare outputs side by side. If Haiku matches on 90%+ of images, use it for the bulk run. If not, fall back to Sonnet + caching + Batch API (~$47).
The test costs ~$0.70 total and protects an $8,000+ build from a $15 quality assumption.
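Scoring the parallel test is mechanical once both runs are saved. A sketch using strict exact-match on tag sets (a per-tag overlap threshold would be a gentler alternative; the dict shape is an assumption):

```python
def agreement_rate(haiku: dict, sonnet: dict) -> float:
    """Share of images where Haiku's tag set exactly matches Sonnet's.
    Keys are image IDs, values are lists of tags; order is ignored."""
    matches = sum(1 for k in sonnet
                  if set(haiku.get(k, [])) == set(sonnet[k]))
    return matches / len(sonnet)

# Run both models over the same 50 images, then:
# use Haiku for the bulk run if agreement_rate(...) >= 0.90
```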