Which feature is burning your AI budget?
OpenAI sends you one number at the end of the month. CostFerret sniffs out what's actually behind it — by feature, user, and model — instrumented in under 30 minutes.
No credit card required. Free tier included.
Used by teams spending $500–$15K/month on AI APIs
Jun 2026 · Total spend
$3,847
↑ 12% vs May
Cost by feature · last 30 days
Recommendation
Semantic caching on /summarize could save $340/mo
The Problem
You're shipping AI features. You have no idea what each one costs.
Your OpenAI invoice is a single number. Tokens consumed. Dollars spent. Model breakdown if you're lucky. That's it.
It doesn't tell you that your /summarize endpoint costs $0.04 per call while your /ai-search costs $1.12. It doesn't tell you that one user cohort generates 60% of your spend. It doesn't tell you that last Tuesday's deployment tripled your average request cost.
You find out when the bill arrives. You shrug. You ship more features. The number keeps going up.
Most teams only discover a cost problem when the monthly invoice doubles. By then, the wrong architectural decisions are already in production.
"Which feature is this?"
You added 4 AI features this quarter. Your bill went up $2,400. You have no idea which one did it.
"Why did costs spike?"
Something changed last Thursday. Costs jumped 40%. You can't tell if it was a new feature, a new user, or a bug.
"What do I tell the board?"
Investors are asking for cost-per-user. You're about to make up a number.
How it works
Attribute every dollar. In 30 minutes.
One wrapper. One tag. All the context you've been missing.
Instrument once
# Before response = openai.chat.completions.create( model="gpt-4o", messages=[...] ) # After — add two lines from costferret import wrap client = wrap(openai, feature="summarize", user_id=user.id) response = client.chat.completions.create( model="gpt-4o", messages=[...] )
Works with OpenAI and Anthropic. Python and Node.js. Streaming supported. No change to response shape. Prefer zero installs? Change one line — point your base_url to our proxy. Done.
See attribution in real time
Cost by feature. Cost by user cohort. Cost by model. Per-request logs with token counts, latency, and full tag context. Daily trend line. Projected end-of-month based on your last 7 days.
No SQL. No dashboards to configure. It's ready when your first tagged request lands.
Know what to fix
CostFerret digs through your usage patterns and surfaces specific savings opportunities — with estimated dollar impact:
Every recommendation links to a one-paragraph implementation guide. You don't need to become an LLM cost expert.
Features
Everything between “what did it cost?” and “here's how to fix it.”
Cost Attribution Dashboard
Feature-level and user-level cost breakdown. 7, 30, and 90-day windows. Daily trend line. Projected month-end cost. Untagged calls roll into a catch-all bucket — nothing is lost.
Per-Request Logs
Every call logged: model, input/output tokens, estimated cost, latency, and your tags. 30-day retention on paid plans. Filter by feature, user, model, or date range.
Spend Alerts
Set a daily or monthly threshold. Get an email within 15 minutes when it's crossed. One-click snooze. Per-project configuration. No webhooks, no $800/month plan required.
Savings Recommendations
Rule-based analysis of your actual usage patterns. Identifies semantic caching candidates, model downsizing opportunities, and prompt caching wins. Each recommendation shows estimated monthly savings in dollars.
SDK + Proxy, Your Choice
Install the SDK wrapper (Python or Node.js) or change your base_url to our proxy. Both are production-safe. The proxy adds under 50ms at p95. Use whichever fits your stack.
Multi-Project Support
Separate projects for separate products or environments. Alerts, thresholds, and retention configured per-project. Starter tier includes 3 projects.
How it compares
Built for cost — not as an afterthought.
Most observability tools track cost as a column in a logs table. CostFerret is built around the dollar number from the ground up.
| Feature | CostFerret | Helicone | Provider Dashboard |
|---|---|---|---|
| Cost by feature / user | HQL, $79+/mo | ||
| Email spend alerts | Starter ($49/mo) | Team only ($799/mo) | |
| Savings recommendations | |||
| Actively maintained | Maintenance mode since Mar '26 | N/A | |
| Proxy latency p95 | <50ms | <50ms | N/A |
| OpenAI + Anthropic | Per-provider only | ||
| Entry price | $49/mo | $79/mo | Free |
A note on Helicone.
Helicone was acquired by Mintlify in March 2026. Both founders joined Mintlify's team. The Helicone product is in maintenance mode — security patches and pricing updates only. No new features.
Pricing
Simple pricing. No surprises.
All plans include full attribution, per-request logs, and recommendations. Upgrade when your request volume grows — not when you need a feature.
Free
Explore attribution with no commitment.
- 50,000 requests/month
- 7-day log retention
- 2 metadata tags
- Basic dashboard
- 1 spend alert
- 1 project
Starter
$0.50 per 10K over 500K
Full attribution for growing AI products.
- 500,000 requests/month
- 30-day log retention
- Unlimited tags
- Full attribution dashboard
- 5 spend alerts + Slack
- 3 projects
- Savings recommendations
Growth
$0.30 per 10K over 5M
Cost-per-user visibility for Series A.
- 5,000,000 requests/month
- 90-day log retention
- Everything in Starter
- Cost-per-user export (CSV)
- Anomaly detection
- 5 team seats
- Monthly savings PDF report
Scale
Save 20% with annual — $319/mo
For teams that need the full picture.
- 50,000,000 requests/month
- 180-day log retention
- Everything in Growth
- Unlimited team seats
- Custom webhooks
- Dedicated Slack support
- Early access to new features
Pricing FAQ
What teams are saying
From the first tagged request.
“I had no idea /ai-search was costing $1.12 per call until CostFerret. We redesigned the feature and cut our monthly bill in half.”
CTO
AI-powered SaaS
“We spent three weeks building a custom logging layer. It broke twice and nobody maintained it. CostFerret replaced the whole thing in an afternoon.”
Senior Engineer
Series A startup
“Our Series A investors asked for cost-per-user. I had that number ready in five minutes. That used to be a two-day project.”
Founder
Developer tools startup
Your AI costs can't hide from a ferret.
Instrument in 30 minutes. See attribution in real time. Get your first savings recommendation before the end of the day.
No credit card required. Free tier included.
Free tier includes 50,000 requests/month. No time limit.