LLM Cost Attribution

Which feature is burning your AI budget?

OpenAI sends you one number at the end of the month. CostFerret sniffs out what's actually behind it — by feature, user, and model — instrumented in under 30 minutes.

No credit card required. Free tier included.

Used by teams spending $500–$15K/month on AI APIs

costferret.com/dashboard

Jun 2026 · Total spend

$3,847

↑ 12% vs May

Cost by feature · last 30 days

/ai-search$1,847

/summarize$1,203

/classify$542

/draft-reply$255

💡

Recommendation

Semantic caching on /summarize could save $340/mo

78%of engineering teams make AI optimization decisions based on intuition, not data.

36%average growth in AI API bills last year. Most teams can't tell you where it went.

30 minto instrument CostFerret. See cost by feature and user before your next deploy.

The Problem

You're shipping AI features. You have no idea what each one costs.

Your OpenAI invoice is a single number. Tokens consumed. Dollars spent. Model breakdown if you're lucky. That's it.

It doesn't tell you that your /summarize endpoint costs $0.04 per call while your /ai-search costs $1.12. It doesn't tell you that one user cohort generates 60% of your spend. It doesn't tell you that last Tuesday's deployment tripled your average request cost.

You find out when the bill arrives. You shrug. You ship more features. The number keeps going up.

Most teams only discover a cost problem when the monthly invoice doubles. By then, the wrong architectural decisions are already in production.

"Which feature is this?"

You added 4 AI features this quarter. Your bill went up $2,400. You have no idea which one did it.

"Why did costs spike?"

Something changed last Thursday. Costs jumped 40%. You can't tell if it was a new feature, a new user, or a bug.

"What do I tell the board?"

Investors are asking for cost-per-user. You're about to make up a number.

How it works

Attribute every dollar. In 30 minutes.

One wrapper. One tag. All the context you've been missing.

Instrument once

index.js

// Before
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [...]
});

// After — add two lines
import { wrap } from 'costferret';
const client = wrap(openai, { feature: "summarize", userId: user.id });
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [...]
});

Works with OpenAI and Anthropic. Node.js and Python. Streaming supported. No change to response shape. Prefer zero installs? Change one line — point your baseURL to our proxy. Done.

See attribution in real time

Cost by feature. Cost by user cohort. Cost by model. Per-request logs with token counts, latency, and full tag context. Daily trend line. Projected end-of-month based on your last 7 days.

No SQL. No dashboards to configure. It's ready when your first tagged request lands.

Know what to fix

CostFerret digs through your usage patterns and surfaces specific savings opportunities — with estimated dollar impact:

“Your /summarize feature made 1,200 calls with near-identical prompts. Semantic caching could reduce costs 60–80% — approximately $340/month.”

“Your /classify feature uses GPT-4o for calls averaging 90 tokens. GPT-4o-mini is 15× cheaper for this use case — approximately $180/month.”

Every recommendation links to a one-paragraph implementation guide. You don't need to become an LLM cost expert.

Features

Everything between “what did it cost?” and “here's how to fix it.”

Cost Attribution Dashboard

Feature-level and user-level cost breakdown. 7, 30, and 90-day windows. Daily trend line. Projected month-end cost. Untagged calls roll into a catch-all bucket — nothing is lost.

Per-Request Logs

Every call logged: model, input/output tokens, estimated cost, latency, and your tags. 30-day retention on paid plans. Filter by feature, user, model, or date range.

Spend Alerts

Set a daily or monthly threshold. Get an email within 15 minutes when it's crossed. One-click snooze. Per-project configuration. No webhooks, no $800/month plan required.

Savings Recommendations

Rule-based analysis of your actual usage patterns. Identifies semantic caching candidates, model downsizing opportunities, and prompt caching wins. Each recommendation shows estimated monthly savings in dollars.

SDK + Proxy, Your Choice

Install the SDK wrapper (Node.js or Python) or change your baseURL to our proxy. Both are production-safe. The proxy adds under 50ms at p95. Use whichever fits your stack.

Multi-Project Support

Separate projects for separate products or environments. Alerts, thresholds, and retention configured per-project. Starter tier includes 3 projects.

How it compares

Built for cost — not as an afterthought.

Most observability tools track cost as a column in a logs table. CostFerret is built around the dollar number from the ground up.

Feature	CostFerret	Helicone	Provider Dashboard
Cost by feature / user		HQL, $79+/mo
Email spend alerts	Starter ($49/mo)	Team only ($799/mo)
Savings recommendations
Actively maintained		Maintenance mode since Mar '26	N/A
Proxy latency p95	<50ms	<50ms	N/A
OpenAI + Anthropic			Per-provider only
Entry price	$49/mo	$79/mo	Free

A note on Helicone.

Helicone was acquired by Mintlify in March 2026. Both founders joined Mintlify's team. The Helicone product is in maintenance mode — security patches and pricing updates only. No new features.

Pricing

Simple pricing. No surprises.

All plans include full attribution, per-request logs, and recommendations. Upgrade when your request volume grows — not when you need a feature.

Free

$0/month

Explore attribution with no commitment.

Start free →

50,000 requests/month
7-day log retention
2 metadata tags
Basic dashboard
1 spend alert
1 project

Pricing FAQ

What teams are saying

From the first tagged request.

“I had no idea /ai-search was costing $1.12 per call until CostFerret. We redesigned the feature and cut our monthly bill in half.”

CTO

AI-powered SaaS

“We spent three weeks building a custom logging layer. It broke twice and nobody maintained it. CostFerret replaced the whole thing in an afternoon.”

Senior Engineer

Series A startup

“Our Series A investors asked for cost-per-user. I had that number ready in five minutes. That used to be a two-day project.”

Founder

Developer tools startup

Your AI costs can't hide from a ferret.

Instrument in 30 minutes. See attribution in real time. Get your first savings recommendation before the end of the day.

No credit card required. Free tier included.

Free tier includes 50,000 requests/month. No time limit.