Real-time feedback while you iterate. Automatic drift detection once you're stable. Foil learns how your agents should behave — and tells you when they don't.
Trusted by teams shipping AI agents in production
Add Foil to your existing codebase with one command.
Foil in Action
A SaaS company deploys an AI support agent handling thousands of conversations daily. Foil monitors every conversation in real time.
Key Insight
After a docs update, profile learning detected hallucinated answers. Alerting caught a 3x spike in "I don't know" responses within minutes.
A financial services firm processes documents through an AI pipeline - parsing, classification, extraction, and compliance checks.
Key Insight
Anchored invariants enforce at least 97% accuracy and processing times under 30 seconds. The change detector flagged a 12% accuracy drift after a model update.
An agent walks new users through product setup - answering questions, configuring settings, and escalating to humans when stuck.
Key Insight
Drift detection caught the agent routing 3x more users to human escalation after a docs update changed the setup flow.
An engineering team uses AI to review PRs and triage CI failures. Foil monitors invocations and developer feedback.
Key Insight
A Sunday volume spike revealed a broken dependency that was causing cascading CI failures and burning API credits unnoticed.
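To make the anchored invariants from the document-processing example concrete, here is a minimal sketch of how such a check could work. It is not Foil's implementation: the `checkInvariants` function, the `INVARIANTS` object, and the shape of each result record are hypothetical, and it assumes the 30-second limit applies to each document rather than to an average.

```javascript
// Hypothetical thresholds taken from the example above:
// at least 97% accuracy and under 30 seconds of processing per document.
const INVARIANTS = { minAccuracy: 0.97, maxProcessingSeconds: 30 };

// Checks one window of pipeline results against the invariants.
// Assumed record shape: { correct: boolean, seconds: number }.
// Returns an array of violations (empty when every invariant holds).
function checkInvariants(results, invariants = INVARIANTS) {
  if (results.length === 0) return [];

  const correctCount = results.filter((r) => r.correct).length;
  const accuracy = correctCount / results.length;
  const slowest = Math.max(...results.map((r) => r.seconds));

  const violations = [];
  if (accuracy < invariants.minAccuracy) {
    violations.push({
      invariant: 'accuracy',
      observed: accuracy,
      limit: invariants.minAccuracy,
    });
  }
  // The limit is exclusive: processing must finish in under 30 seconds.
  if (slowest >= invariants.maxProcessingSeconds) {
    violations.push({
      invariant: 'processingSeconds',
      observed: slowest,
      limit: invariants.maxProcessingSeconds,
    });
  }
  return violations;
}
```

Calling `checkInvariants` on each batch of pipeline results returns an empty array while the agent stays within bounds, and a list of named violations when it does not.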
Agent Learning
Quantitative metrics update in real time while AI-generated behavioral profiles deepen with every learning cycle.
The moment your agent sends its first trace, Foil tracks latency, error rates, tool usage, and volume - updating every 60 seconds.
Foil generates your agent's first behavioral profile - identity, tool patterns, error analysis - alongside health anchors that are continuously validated.
The profile re-learns at geometric intervals (125, 313, 783+ traces). When 2 consecutive cycles find no material changes, it converges.
Learning becomes change-driven. Foil monitors for behavioral drift and automatically re-learns when something meaningful changes.
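The re-learning schedule and convergence rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not Foil's actual algorithm: the growth factor of 2.5 is inferred from the quoted thresholds (125, 313, 783), and the function names `learningThresholds` and `hasConverged` are hypothetical.

```javascript
// Generates trace-count thresholds that grow geometrically.
// Assumption: each threshold is the previous one times 2.5, rounded to the
// nearest whole trace. This reproduces the figures quoted above.
function learningThresholds(first = 125, factor = 2.5, count = 3) {
  const thresholds = [];
  let threshold = first;
  for (let i = 0; i < count; i++) {
    thresholds.push(threshold);
    threshold = Math.round(threshold * factor);
  }
  return thresholds;
}

// Decides whether the profile has converged.
// `cycleHadMaterialChange` lists one boolean per completed learning cycle,
// oldest first; true means that cycle found a material change.
function hasConverged(cycleHadMaterialChange, requiredQuietCycles = 2) {
  if (cycleHadMaterialChange.length < requiredQuietCycles) return false;
  return cycleHadMaterialChange
    .slice(-requiredQuietCycles)
    .every((changed) => !changed);
}
```

With the defaults, `learningThresholds()` yields 125, 313, and 783, and `hasConverged([true, false, false])` is true because the two most recent cycles found nothing material.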
Control Center surfaces the traces that matter most. Review flagged invocations, give feedback, and improve your agent in a single workflow.
"Your refund has been processed and $247 will be returned..." - fabricated refund amount
"I don't have access to that information" - repeated 4x in conversation
"The PR modifies the auth middleware" - file not in changeset
User sentiment dropped after 3rd redirect to documentation
"LGTM" response with no code analysis on 500-line PR
Foil builds a behavioral profile for each agent - separating always-fresh quantitative metrics from AI-generated behavioral intelligence. Live metrics update continuously. AI insights deepen as your agent processes more traces.
Customer support agent for vehicle inquiries, test drives, and dealership operations
Every trace is evaluated against 9 built-in checks, including hallucination, PII, and prompt injection. Unlike generic tools, Foil's evaluations use your agent's behavioral profile as context: a response that's normal for one agent might be anomalous for another.
support-agent · "How do I track my order?"
tool patterns, error baselines, behavioral norms
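One way to picture profile-aware evaluation is to compare an observed rate against each agent's own baseline instead of a global threshold. The sketch below assumes a hypothetical profile shape with a learned `unknownRate` (the fraction of "I don't know" style responses) and borrows the 3x spike factor from the support example earlier on this page; none of these names come from Foil's SDK.

```javascript
// Flags an observed rate as anomalous relative to one agent's baseline.
// Assumed profile shape: { name: string, baseline: { unknownRate: number } }.
function isAnomalousForAgent(profile, observedRate, spikeFactor = 3) {
  const baselineRate = profile.baseline.unknownRate;
  // With no history of such responses, any occurrence stands out.
  if (baselineRate <= 0) return observedRate > 0;
  return observedRate / baselineRate >= spikeFactor;
}

// Two hypothetical agents with different learned norms.
const supportProfile = { name: 'support-agent', baseline: { unknownRate: 0.02 } };
const researchProfile = { name: 'research-agent', baseline: { unknownRate: 0.1 } };
```

The same 9% rate of "I don't know" responses is a spike for `supportProfile` (4.5 times its baseline) but ordinary for `researchProfile`, which is the distinction a fixed global threshold would miss.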
Platform Features
Purpose-built for AI agents. See what no other tool can show you.
See every step your agent takes: each LLM call, tool invocation, memory read, and branching decision.
Catch the moment your agent makes things up. Flag responses that contradict context in real time.
Detect policy violations, harmful outputs, and prompt injections before they reach users.
Track token usage and API spend down to individual decisions. Find what's burning through your budget.
Link user feedback directly to agent decisions. Understand which reasoning paths lead to good outcomes and which do not.
Replay any failed execution step-by-step. Rewind to the exact moment things went wrong.
Capabilities
From support bots to research agents, trace every thought and decision. Know exactly what went wrong and why.
Trace every thought, every decision, every retrieval. See the complete chain of reasoning and find exactly where the failure occurred.
Watch your agent's memory in action. See which documents it retrieves, which passages it focuses on, and whether it's using the context you gave it.
Visualize decision trees, track tool calls, and understand why your agent chose one path over another. Catch runaway loops before they drain your budget.
Understand how your agent crafts each output. Catch hallucinations, safety issues, and quality problems with deep inspection.
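As an illustration of what catching a runaway loop can mean in practice, here is one simple heuristic: flag a trace when the same tool is called with identical arguments several times in a row. This is a sketch only, and not necessarily how Foil detects loops. The `findRunawayLoop` name, the tool-call shape, and the default threshold of five repeats are all assumptions.

```javascript
// Scans an ordered list of tool calls for consecutive identical calls.
// Assumed call shape: { tool: string, args: object }.
// Returns details of the first run that reaches `maxRepeats`, or null.
function findRunawayLoop(toolCalls, maxRepeats = 5) {
  let runLength = 0;
  let previousKey = null;

  for (let i = 0; i < toolCalls.length; i++) {
    const call = toolCalls[i];
    // Two calls count as identical when tool name and serialized args match.
    const key = `${call.tool}:${JSON.stringify(call.args)}`;
    runLength = key === previousKey ? runLength + 1 : 1;
    previousKey = key;

    if (runLength >= maxRepeats) {
      return { tool: call.tool, repeats: runLength, endIndex: i };
    }
  }
  return null;
}
```

A check like this is cheap enough to run on every trace as it streams in, so a loop can be surfaced after a handful of repeats instead of after the budget is spent.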
Integration
Our AI-powered wizard scans your codebase and adds tracing automatically. Built on OpenTelemetry - zero code changes to your LLM calls.
Scans your code, installs the SDK, adds tracing - done.
One command analyzes your codebase, detects LLM calls, and adds Foil instrumentation automatically.
The wizard creates a branch with all changes. Review the diff, merge, and deploy as normal.
Every OpenAI, Anthropic, or other LLM call is traced. Agent profiles build from real usage.
$ npx @getfoil/wizard
Foil Wizard Setup
? API Key: sk_live_••••••••
? Agent name: support-agent
? Target directory: ./my-app
# Auto-detects your LLM provider and app pattern
✔ Provider: OpenAI (gpt-4o-mini)
✔ Pattern: One-shot script
# Instruments your code automatically
foil.js — created (Foil config & shutdown handlers)
index.js — modified
├─ Foil import added as first require
├─ LLM calls wrapped in ctx.llmCall()
└─ main() wrapped in agent.trace()
✔ Branch created: foil-setup
$ npm install @getfoil/foil-js && node index.js
Pricing
Try the full evaluation suite free. Pay only for what you use.
14 days of Pro features
$0.001 / interaction
$0.005 / interaction
Real-time metrics from trace one. Behavioral intelligence that deepens over time. Evaluations that know your agent.