Tokenizer — LLM Cost Calculator

Overview

Total Spend

$2822

↑ 3.6%

vs. prior 30D

prev 30D

this 30D

Requests

118.2K

↑ 1.7%

vs. prior 30D

Burn Rate

$2822

$2860 proj/mo

→ steady vs prev 30D

prev: $2725

Cache Savings

$454.76

↓ 0.3%

vs. prior 30D

Cache Hit Rate

27.9%

of input processed

Tokens / request

7.3K

↓ 2.5%

vs. prior 30D

Agents

Chat Assistant

Code Review

Doc Search

Token usage

Input

Output

Cache write

Cache read

Daily spend

Under threshold

Over threshold

Latency

Select a wider time range to see latency trends

P50

P95

Latency × cost

P95 latency vs. avg cost per request

Cost by agent

Chat Assistant

$1435 · 51%

Code Review

$1058 · 37%

Doc Search

$328.76 · 12%

Token usage by models

Claude Sonnet 4.5

$1647 · 46.0K req

input 156.6M · output 66.4M

Claude Haiku 4.5

$635.43 · 210.9K req

input 455.4M · output 34.7M

GPT-4o

$540.17 · 12.7K req

input 147.8M · output 5.0M

Insights

Spend trend

Spend is up 4% this period. Identify which agents are driving the increase and set a daily threshold to catch overruns before they compound.
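
A daily threshold check can be sketched in a few lines. This is a minimal illustration, not the dashboard's own logic: the function name `days_over_threshold`, the day labels, the spend values, and the $100 threshold are all hypothetical.

```python
def days_over_threshold(daily_spend: dict[str, float], threshold: float) -> list[str]:
    """Return the labels of days whose spend exceeded the threshold, sorted."""
    return sorted(day for day, spend in daily_spend.items() if spend > threshold)


# Hypothetical sample data; none of these values come from the dashboard.
sample = {
    "day-01": 91.20,
    "day-02": 131.75,
    "day-03": 95.10,
}

over = days_over_threshold(sample, threshold=100.0)
print(over)  # ['day-02']
```

Comparing with a strict `>` means a day that lands exactly on the threshold is not flagged; switch to `>=` if the limit should be exclusive.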

Cache savings

Caching is cutting 14% off your prompt costs. Move repeated context into system prompts and widen your cache window to push the hit rate even higher.
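
One way to arrive at a figure close to 14% is to divide cache savings by what the bill would have been without caching. The sketch below assumes that definition; the dashboard does not state its formula, and it may measure savings against input (prompt) costs only. The function name `cache_savings_share` is hypothetical.

```python
def cache_savings_share(spend: float, savings: float) -> float:
    """Savings as a fraction of the estimated uncached bill (spend + savings)."""
    uncached_cost = spend + savings
    if uncached_cost <= 0:
        raise ValueError("spend + savings must be positive")
    return savings / uncached_cost


# Figures from the Overview panel: $2822 total spend, $454.76 cache savings.
share = cache_savings_share(spend=2822.0, savings=454.76)
print(f"{share:.1%}")  # 13.9%
```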

Top cost driver

Chat Assistant accounts for 51% of total spend. Audit its prompt length and call frequency — small reductions here have the largest impact on your bill.
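
The 51% figure can be reproduced from the per-agent costs in the "Cost by agent" panel. The following is a small sketch of that arithmetic; the function name `cost_shares` is hypothetical, and the shares are computed against the sum of the three listed agents.

```python
def cost_shares(costs: dict[str, float]) -> dict[str, float]:
    """Return each entry's cost as a fraction of the total."""
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    return {name: cost / total for name, cost in costs.items()}


# Per-agent spend as shown in the "Cost by agent" panel.
agent_costs = {
    "Chat Assistant": 1435.0,
    "Code Review": 1058.0,
    "Doc Search": 328.76,
}

shares = cost_shares(agent_costs)
top = max(shares, key=shares.get)
print(top, f"{shares[top]:.0%}")  # Chat Assistant 51%
```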

Peak traffic

Traffic spiked to 6.5K requests on May 5. Pre-warm your cache before known high-traffic windows to avoid cold-start cost surges.

Ask follow-up questions

Ask anything about your usage data