LLM Cost Attribution

Which feature is burning your AI budget?

OpenAI sends you one number at the end of the month. CostFerret sniffs out what's actually behind it — by feature, user, and model — instrumented in under 30 minutes.

No credit card required. Free tier included.

Used by teams spending $500–$15K/month on AI APIs

costferret.com/dashboard

Jun 2026 · Total spend

$3,847

↑ 12% vs May

Cost by feature · last 30 days

/ai-search$1,847
/summarize$1,203
/classify$542
/draft-reply$255
💡

Recommendation

Semantic caching on /summarize could save $340/mo

78%of engineering teams make AI optimization decisions based on intuition, not data.
36%average growth in AI API bills last year. Most teams can't tell you where it went.
30 minto instrument CostFerret. See cost by feature and user before your next deploy.

You're shipping AI features. You have no idea what each one costs.

Your OpenAI invoice is a single number. Tokens consumed. Dollars spent. Model breakdown if you're lucky. That's it.

It doesn't tell you that your /summarize endpoint costs $0.04 per call while your /ai-search costs $1.12. It doesn't tell you that one user cohort generates 60% of your spend. It doesn't tell you that last Tuesday's deployment tripled your average request cost.

You find out when the bill arrives. You shrug. You ship more features. The number keeps going up.

Most teams only discover a cost problem when the monthly invoice doubles. By then, the wrong architectural decisions are already in production.

"Which feature is this?"

You added 4 AI features this quarter. Your bill went up $2,400. You have no idea which one did it.

"Why did costs spike?"

Something changed last Thursday. Costs jumped 40%. You can't tell if it was a new feature, a new user, or a bug.

"What do I tell the board?"

Investors are asking for cost-per-user. You're about to make up a number.

Attribute every dollar. In 30 minutes.

One wrapper. One tag. All the context you've been missing.

01

Instrument once

main.py
# Before
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[...]
)

# After — add two lines
from costferret import wrap
client = wrap(openai, feature="summarize", user_id=user.id)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...]
)

Works with OpenAI and Anthropic. Python and Node.js. Streaming supported. No change to response shape. Prefer zero installs? Change one line — point your base_url to our proxy. Done.

02

See attribution in real time

Cost by feature. Cost by user cohort. Cost by model. Per-request logs with token counts, latency, and full tag context. Daily trend line. Projected end-of-month based on your last 7 days.

No SQL. No dashboards to configure. It's ready when your first tagged request lands.

03

Know what to fix

CostFerret digs through your usage patterns and surfaces specific savings opportunities — with estimated dollar impact:

Your /summarize feature made 1,200 calls with near-identical prompts. Semantic caching could reduce costs 60–80% — approximately $340/month.
Your /classify feature uses GPT-4o for calls averaging 90 tokens. GPT-4o-mini is 15× cheaper for this use case — approximately $180/month.

Every recommendation links to a one-paragraph implementation guide. You don't need to become an LLM cost expert.

Everything between “what did it cost?” and “here's how to fix it.”

Cost Attribution Dashboard

Feature-level and user-level cost breakdown. 7, 30, and 90-day windows. Daily trend line. Projected month-end cost. Untagged calls roll into a catch-all bucket — nothing is lost.

Per-Request Logs

Every call logged: model, input/output tokens, estimated cost, latency, and your tags. 30-day retention on paid plans. Filter by feature, user, model, or date range.

Spend Alerts

Set a daily or monthly threshold. Get an email within 15 minutes when it's crossed. One-click snooze. Per-project configuration. No webhooks, no $800/month plan required.

Savings Recommendations

Rule-based analysis of your actual usage patterns. Identifies semantic caching candidates, model downsizing opportunities, and prompt caching wins. Each recommendation shows estimated monthly savings in dollars.

SDK + Proxy, Your Choice

Install the SDK wrapper (Python or Node.js) or change your base_url to our proxy. Both are production-safe. The proxy adds under 50ms at p95. Use whichever fits your stack.

Multi-Project Support

Separate projects for separate products or environments. Alerts, thresholds, and retention configured per-project. Starter tier includes 3 projects.

Built for cost — not as an afterthought.

Most observability tools track cost as a column in a logs table. CostFerret is built around the dollar number from the ground up.

FeatureCostFerretHeliconeProvider Dashboard
Cost by feature / user
HQL, $79+/mo
Email spend alerts
Starter ($49/mo)
Team only ($799/mo)
Savings recommendations
Actively maintained
Maintenance mode since Mar '26
N/A
Proxy latency p95<50ms<50msN/A
OpenAI + Anthropic
Per-provider only
Entry price$49/mo$79/moFree

A note on Helicone.

Helicone was acquired by Mintlify in March 2026. Both founders joined Mintlify's team. The Helicone product is in maintenance mode — security patches and pricing updates only. No new features.

Simple pricing. No surprises.

All plans include full attribution, per-request logs, and recommendations. Upgrade when your request volume grows — not when you need a feature.

Free

$0/month

Explore attribution with no commitment.

Start free →
  • 50,000 requests/month
  • 7-day log retention
  • 2 metadata tags
  • Basic dashboard
  • 1 spend alert
  • 1 project
Most popular

Starter

$49/month

$0.50 per 10K over 500K

Full attribution for growing AI products.

Start free, upgrade anytime →
  • 500,000 requests/month
  • 30-day log retention
  • Unlimited tags
  • Full attribution dashboard
  • 5 spend alerts + Slack
  • 3 projects
  • Savings recommendations

Growth

$149/month

$0.30 per 10K over 5M

Cost-per-user visibility for Series A.

Start free →
  • 5,000,000 requests/month
  • 90-day log retention
  • Everything in Starter
  • Cost-per-user export (CSV)
  • Anomaly detection
  • 5 team seats
  • Monthly savings PDF report

Scale

$399/month

Save 20% with annual — $319/mo

For teams that need the full picture.

Talk to us →
  • 50,000,000 requests/month
  • 180-day log retention
  • Everything in Growth
  • Unlimited team seats
  • Custom webhooks
  • Dedicated Slack support
  • Early access to new features

Pricing FAQ

From the first tagged request.

I had no idea /ai-search was costing $1.12 per call until CostFerret. We redesigned the feature and cut our monthly bill in half.

JK

CTO

AI-powered SaaS

We spent three weeks building a custom logging layer. It broke twice and nobody maintained it. CostFerret replaced the whole thing in an afternoon.

ML

Senior Engineer

Series A startup

Our Series A investors asked for cost-per-user. I had that number ready in five minutes. That used to be a two-day project.

AS

Founder

Developer tools startup

Your AI costs can't hide from a ferret.

Instrument in 30 minutes. See attribution in real time. Get your first savings recommendation before the end of the day.

No credit card required. Free tier included.

Free tier includes 50,000 requests/month. No time limit.