KJP AI Cost Estimate — Token & Model Breakdown

Consolidated from Notion 2026-04-23. Notion is deprecated. Status: Phase 2 — deferred to fall/winter 2026.

Scope: KJP AI Tagging System — backfill + ongoing + optional descriptions
Last updated: 2026-03-10 (source) · 2026-04-23 (migrated here)
Note: All estimates use prompt caching as the baseline and assume images are pre-resized to 1024px max before API submission.


How the token math works

Every request sends three things:

  1. Image — ~1,400 tokens after resize to 1024px max (see warning below)
  2. Tag list + prompt — ~8,500 tokens (the full tag taxonomy is the expensive part)
  3. Output — ~150 tokens (JSON blob of selected tags + colors)

Large image warning: KJP's catalog includes images up to 3,000×3,000px or larger. Without a resize step, image token cost scales with resolution — a 3000×3000 image hits ~12,000 image tokens vs ~1,400 post-resize. That's 9× the cost with zero classification benefit (no need for full resolution to identify a flower species or color). The batch runner must resize every image to 1024px max on the longest side before sending to the API. All estimates on this page assume that step is in place. If skipped, multiply the backfill cost by 3–9× depending on image size mix.
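The resize-and-estimate math above can be sketched as a couple of pure functions. A minimal sketch: `resize_dims` and `image_tokens` are hypothetical helpers (a real batch runner would do the actual resizing with an image library such as Pillow), and the `(width × height) / 750` formula is Anthropic's published rule of thumb for Claude image tokens, so actual counts will vary slightly.

```python
def resize_dims(width, height, max_edge=1024):
    """Target dimensions after scaling the longest side down to max_edge.

    Returns the original size unchanged if it already fits.
    """
    longest = max(width, height)
    if longest <= max_edge:
        return width, height
    scale = max_edge / longest
    return round(width * scale), round(height * scale)


def image_tokens(width, height):
    """Approximate Claude vision token cost: (width * height) / 750."""
    return (width * height) // 750


# A 3000x3000 catalog image, before and after the mandatory resize step:
w, h = resize_dims(3000, 3000)      # (1024, 1024)
print(image_tokens(3000, 3000))     # ~12,000 tokens unresized
print(image_tokens(w, h))           # ~1,398 tokens after resize
```

This is where the "9× the cost" figure in the warning comes from: 12,000 vs ~1,400 image tokens for the same classification result.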

The tag list is static per batch, making it a perfect candidate for prompt caching — the first request pays full rate, every subsequent one reads the cached version at 10% of normal input price. This is the biggest single lever.
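A minimal sketch of how the static tag list would be marked cacheable with Anthropic's prompt caching (`cache_control` on a system block). `TAG_TAXONOMY`, the model id, and `build_request` are placeholders, not final values; only the payload shape reflects the real API.

```python
# Placeholder for the full ~8,500-token tag list and instructions.
TAG_TAXONOMY = "...full tag taxonomy and tagging instructions..."

def build_request(image_b64: str) -> dict:
    """Assemble one tagging request for the Anthropic Messages API."""
    return {
        "model": "claude-haiku-4-5",  # placeholder model id
        "max_tokens": 300,
        "system": [
            {
                "type": "text",
                "text": TAG_TAXONOMY,
                # Marks the static prompt as cacheable; after the first
                # request, reads bill at ~10% of the normal input price.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/jpeg",
                            "data": image_b64,
                        },
                    },
                    {"type": "text",
                     "text": "Return the matching tags and colors as JSON."},
                ],
            }
        ],
    }
```

Only the per-image content (image plus the short user instruction) changes between requests, which is exactly what makes the cache hit rate near 100% for a batch run.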


Per-image cost by model

| Model | Input | Output | No cache | With cache | Cache + Batch |
|---|---|---|---|---|---|
| Claude Sonnet 4.6 | $3.00/MTok | $15.00/MTok | ~$0.032 | ~$0.011 | ~$0.006 |
| Claude Haiku 4.5 | $0.80/MTok | $4.00/MTok | ~$0.009 | ~$0.003 | ~$0.0015 |
| Claude Opus 4.6 | $15.00/MTok | $75.00/MTok | ~$0.161 | ~$0.055 | ~$0.028 |
| GPT-4o | $2.50/MTok | $10.00/MTok | ~$0.027 | n/a | n/a |
| GPT-4o mini | $0.15/MTok | $0.60/MTok | ~$0.002 | n/a | n/a |
| Gemini 2.0 Flash | $0.075/MTok | $0.30/MTok | ~$0.001 | n/a | n/a |
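The per-image figures above fall out of one formula, sketched below using the ~1,400 image / ~8,500 prompt / ~150 output token counts from the section above. Assumptions: cache reads bill at 10% of the input price and the Batch API halves the total (both per Anthropic's pricing); the one-time cache-write surcharge is ignored since it amortizes to near zero over a batch, which is why the cached figures in the table sit slightly above the raw math.

```python
MTOK = 1_000_000

def per_image_cost(in_price, out_price, image_tok=1_400, prompt_tok=8_500,
                   out_tok=150, cached=False, batch=False):
    """Per-image API cost in dollars for one tagging request.

    cached: the static prompt reads from cache at 10% of the input price.
    batch:  Batch API halves the total.
    """
    prompt_rate = in_price * 0.1 if cached else in_price
    cost = (image_tok * in_price + prompt_tok * prompt_rate
            + out_tok * out_price) / MTOK
    return cost * 0.5 if batch else cost

print(per_image_cost(3.00, 15.00))                # Sonnet, no cache: ~$0.032
print(per_image_cost(3.00, 15.00, cached=True))   # Sonnet + cache:   ~$0.009
print(per_image_cost(0.80, 4.00, cached=True, batch=True))  # Haiku:  ~$0.0012
```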

Phase-by-phase cost breakdown

Recommended stack: Sonnet for calibration, Haiku for bulk backfill.

| Phase | Images | Model | Why | Est. cost |
|---|---|---|---|---|
| Phase 1 — Tag calibration | ~150 | Sonnet 4.6 + cache | Highest quality while tuning prompt | ~$1.65 |
| Phase 2 — Color calibration | ~50 | Sonnet 4.6 + cache | Nail the color logic before bulk run | ~$0.55 |
| Haiku quality test | ~50 | Both (compare) | Validate Haiku matches Sonnet | ~$0.70 |
| Phase 3 — Full backfill | ~8,250 | Haiku 4.5 + cache | Prompt locked; Haiku handles classification well | ~$24.75 |
| Phase 4 — Ongoing | ~100/mo | Haiku 4.5 + cache | Same as backfill | ~$0.30/mo |
| Phase 5 — Descriptions (if added) | ~6,124 | Sonnet 4.6 + cache | Voice/tone quality matters | ~$34 |

Recommended budget line for client invoice:

  - Tags + color only: ~$30 (rounds up for buffer)
  - Tags + color + descriptions: ~$70
  - Monthly ongoing: ~$1/mo (absorb into retainer or bill quarterly)


Scenario comparison — full backfill, tags + color only

| Scenario | Total cost | Notes |
|---|---|---|
| Sonnet, no optimization | ~$268 | Worst case — never do this |
| Sonnet + prompt caching | ~$94 | Acceptable fallback if Haiku quality disappoints |
| Sonnet + caching + Batch API | ~$47 | Good option without OpenRouter |
| Haiku + caching (recommended) | ~$27 | Best default after Sonnet calibration proves the prompt |
| Smart routing via OpenRouter | ~$15–25 | Best total cost + most flexibility (see below) |

OpenRouter — Worth using?

Recommendation: Yes. Use OpenRouter.

OpenRouter is a routing layer in front of every major provider (Anthropic, OpenAI, Google, Meta, Mistral) with a single OpenAI-compatible API. Write the integration once, swap or mix models without touching code.

1. Billing passthrough — the real win for agencies. OpenRouter tracks cost per request with full transparency. Two clean options:

  - Give KJP their own OpenRouter account and have them fund it directly — they pay exactly what AI costs, nothing more, G&M never touches the billing
  - Or run through our account and forward the OpenRouter dashboard line-item

Either way, client gets an exact number. No "we think it was about $30" conversation.

2. Model routing — right tool for the job. Classification is a solved task. It doesn't need Claude for every image. A two-tier strategy:

  - Tier 1 (default): Gemini 2.0 Flash or GPT-4o-mini — cheap, fast, handles clear compositions well (~$0.001/image)
  - Tier 2 (escalate): Sonnet — if Tier 1 returns confidence < 0.85 or JSON fails validation, re-run (~$0.011/image)

Expected split: ~85–90% stay Tier 1, ~10–15% escalate — that range is what the escalation cost math supports (at ~35% escalation the backfill would run closer to $40). Total backfill: ~$18–22.
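The escalation rule described above can be sketched as a small decision function. A minimal sketch: the 0.85 threshold comes from the tiering rule, but the function name and the expected response shape (`tags` plus `confidence` fields) are assumptions about what the prompt would ask for.

```python
import json

CONFIDENCE_THRESHOLD = 0.85  # from the tiering rule above

def needs_escalation(tier1_raw: str) -> bool:
    """Decide whether a Tier 1 (cheap model) response must re-run on Sonnet.

    Escalate when the JSON fails to parse, required fields are missing,
    or reported confidence falls below the threshold.
    """
    try:
        result = json.loads(tier1_raw)
    except json.JSONDecodeError:
        return True                      # malformed JSON -> escalate
    if "tags" not in result or "confidence" not in result:
        return True                      # failed validation -> escalate
    return result["confidence"] < CONFIDENCE_THRESHOLD

# Clear composition, confident Tier 1 answer: stays on the cheap model
assert not needs_escalation('{"tags": ["tulip", "red"], "confidence": 0.93}')
# Low confidence: re-run on Sonnet
assert needs_escalation('{"tags": ["rose?"], "confidence": 0.41}')
# Garbage output: re-run on Sonnet
assert needs_escalation("not json at all")
```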

3. One SDK, no sprawl. OpenRouter is OpenAI-compatible. No separate Anthropic, Google, and OpenAI imports in the codebase. One client, one API key.

4. Descriptions always use Sonnet. Routing is config-based: tag/color batches use the tiered strategy, description batches hardcode Sonnet. One flag.

OpenRouter markup: 0–5% over provider cost. On this project, ~$1–3 extra on the full backfill. Negligible.

OpenRouter cost model

| Task | Model | Per image | Full backfill |
|---|---|---|---|
| Tags + color, Tier 1 | Gemini 2.0 Flash | ~$0.001 | ~$8 (if all stayed here) |
| Tags + color, Tier 2 escalation (~10–15%) | Claude Sonnet 4.6 | ~$0.011 | ~$10 |
| Descriptions (Phase 5 only) | Claude Sonnet 4.6 | ~$0.013 | ~$34 |
| Total — tags + color | Mixed | n/a | ~$18–22 |
| Total — with descriptions | Mixed | n/a | ~$52–56 |
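As a sanity check on the blended tags + color total: assuming an escalation rate around 12% — the rate the stated ~$18–22 range implies; the real rate is unknown until the calibration phases run — the arithmetic works out as:

```python
IMAGES = 8_250
TIER1_COST = 0.001       # Gemini 2.0 Flash per image
TIER2_COST = 0.011       # Sonnet + cache per image
ESCALATION_RATE = 0.12   # assumed mid-range; measure during calibration

tier1_total = IMAGES * TIER1_COST                    # every image tries Tier 1
tier2_total = IMAGES * ESCALATION_RATE * TIER2_COST  # escalated subset re-runs
total = tier1_total + tier2_total
print(round(total, 2))   # ~$19, inside the ~$18-22 band
```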

What to put on the client invoice

| Line item | Amount | Notes |
|---|---|---|
| AI API costs — tags + color backfill | $30 | Padded above expected actual (~$18–25) |
| AI API costs — descriptions (if Phase 5) | $40 | Separate line item, only if activated |
| Ongoing AI costs | ~$1–5/mo | Absorb into retainer or pass through quarterly |

Simplest billing approach: If KJP funds their own OpenRouter account, this line item disappears from our invoice entirely. They load credits, we show them the dashboard.


Haiku quality validation (important)

Haiku 4.5 is a legitimate classification model but hasn't been tested on this specific task. Don't assume — test it.

At the end of Phase 2, run a 50-image parallel test: same images, same prompt, Haiku vs. Sonnet. Compare outputs side by side. If Haiku matches on 90%+ of images, use it for the bulk run. If not, fall back to Sonnet + caching + Batch API (~$47).
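One way to score the side-by-side comparison is a per-image set-overlap metric rolled up into a pass rate. A sketch under stated assumptions: the "match" criterion (Jaccard overlap ≥ 0.9 on the tag sets) is not specified in the source and is illustrative, as are the function names.

```python
def tag_agreement(haiku_tags, sonnet_tags):
    """Jaccard overlap between the two models' tag sets for one image."""
    a, b = set(haiku_tags), set(sonnet_tags)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def pass_rate(pairs, match_threshold=0.9):
    """Fraction of test images where Haiku's tags agree with Sonnet's.

    pairs: list of (haiku_tags, sonnet_tags) tuples from the 50-image test.
    An image counts as a match when agreement >= match_threshold.
    """
    matches = sum(1 for h, s in pairs if tag_agreement(h, s) >= match_threshold)
    return matches / len(pairs)

pairs = [
    (["tulip", "red", "spring"], ["tulip", "red", "spring"]),  # exact match
    (["rose", "pink"], ["rose", "pink", "romantic"]),          # partial: 2/3
]
print(pass_rate(pairs))   # 0.5 -> only the exact match clears 0.9
```

Whatever metric is used, the decision rule stays the same: 90%+ pass rate clears Haiku for the bulk run; anything less falls back to Sonnet + caching + Batch API.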

The test costs ~$0.70 total and protects an $8,000+ build from a $15 quality assumption.