Now available

Understand Every Decision Your AI Agent Makes Instantly

Trace logic, detect failures, and improve AI agent performance with real-time observability.
The only platform that learns how your agents actually behave.

Dashboard

Traces

12,847 +12%

Latency

1.24s -8%

Errors

0.8% -23%

Cost

$847 +5%

TRACE	AGENT	DURATION	TOKENS	COST	STATUS
trc_8f3a2b	support-agent	1.24s	2,847	$0.023	success
trc_7e2b3c	research-agent	3.87s	8,234	$0.089	warning
trc_6d1c4e	code-agent	2.13s	4,521	$0.041	success
trc_5c0b5f	support-agent	0.89s	1,203	$0.012	success

Learning Active

247 traces analyzed · Next threshold: 250

v3.2

Learning Progress 247 / 250 traces

Thresholds: 10 · 25 · 50 · 100 · 250 · 500 · 1000

Domain 96% confidence

Customer Support

E-commerce Technical

Behavior

Style: Professional

Verbosity: Moderate

Format: Structured

Terminology

Intermediate

order status refund tracking RMA

Evaluation Context

Hallucination: Order data grounding

PII: Email, phone, address focus

Safety: Refund policy boundaries

Signals

3,241 +18%

Positive

2,847 +22%

Negative

394 -15%

Rating

4.2 +0.3

TRACE	AGENT	TYPE	FEEDBACK	TIME
trc_8f3a2b	support-agent	thumbs	Positive	2m ago
trc_7e2b3c	research-agent	rating		5m ago
trc_6d1c4e	code-agent	thumbs	Negative "Incorrect syntax"	8m ago
trc_5c0b5f	support-agent	rating		12m ago

End-to-end trace timeline: See every step your agent takes

Live agent profiles: Metrics that update in real-time, intelligence that deepens over time

Latency & cost breakdown: Optimize performance per step

Use Cases

Understand why your agent does what it does

From support bots to research agents, trace every thought and decision. Know exactly what went wrong and why.

Customer Support

Pinpoint the exact moment your agent went wrong

Trace every thought, every decision, every retrieval. See the complete chain of reasoning and find exactly where the failure occurred. No more guessing.

Thought chain replay
Hallucination detection
Context tracing

Conversation Replay

How do I reset my password?

You can reset your password by clicking "Forgot Password" on the login page...

Grounded · 847 tokens · 0.89s

RAG Applications

See what your agent actually reads and remembers

Watch your agent's memory in action. See which documents it retrieves, which passages it focuses on, and whether it's actually using the context you gave it.

Memory inspection
Grounding scores
Source attribution

Retrieval Analysis

Retrieved Documents

docs/password-reset.md · 0.94

docs/security-faq.md · 0.87

docs/account-setup.md · 0.62

Response fully grounded in sources

Autonomous Agents

Follow every step your agent takes

Watch your agent think in real-time. Visualize decision trees, track tool calls, and understand exactly why it chose one path over another. Catch runaway loops before they drain your API budget.

Decision tree view
Tool call tracking
Loop detection

Span Timeline

llm llm:openai-agent

6.18s

llm gpt-4o-mini

3.06s · 237 tok

tool get_current_weather

105ms

tool get_current_weather

1ms

tool calculate

1ms

llm gpt-4o-mini

3.08s · 435 tok
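As a rough illustration of loop detection over a span timeline like the one above, the sketch below counts identical tool calls within one trace and flags any that repeat too often. The span shape, the `args` values, and the `findRepeatedToolCalls` helper are hypothetical and are not part of Foil's SDK.

```javascript
// Hypothetical span shape: { type: 'llm' | 'tool', name: string, args?: string }.
// Flags any tool called with identical arguments more than `maxRepeats` times.
function findRepeatedToolCalls(spans, maxRepeats = 3) {
  const counts = new Map();
  for (const span of spans) {
    if (span.type !== 'tool') continue;
    const key = `${span.name}:${span.args ?? ''}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count > maxRepeats)
    .map(([key, count]) => ({ key, count }));
}

// Spans modeled on the timeline above; the argument strings are made up.
const spans = [
  { type: 'llm', name: 'gpt-4o-mini' },
  { type: 'tool', name: 'get_current_weather', args: '{"city":"Paris"}' },
  { type: 'tool', name: 'get_current_weather', args: '{"city":"Paris"}' },
  { type: 'tool', name: 'calculate', args: '2+2' },
  { type: 'llm', name: 'gpt-4o-mini' },
];

console.log(findRepeatedToolCalls(spans, 1));
```

With a strict limit of one call per distinct input, the duplicated `get_current_weather` call is flagged. With the default limit of three, nothing is.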

Content & Code Generation

Inspect every generation before it ships

Understand how your agent crafts each output. See the reasoning behind every word. Catch hallucinations, safety issues, and quality problems with deep inspection.

Generation breakdown
Safety checks
Feedback collection

Quality Signals

User Feedback

4.2

Safety Score

95%

Safe On-brand Factual

Ready to understand your AI agents?

Foil in Action

Real teams, real agents, real results

External-facing

Customer Support AI

A SaaS company deploys an AI support agent handling thousands of conversations daily. Foil monitors every conversation, tracking latency, tool usage, and quality scores in real time.

Key Insight

After a docs update, profile learning detected hallucinated answers. Alerting caught a 3x spike in "I don't know" responses within minutes.

Tracing Profiles Alerting

Internal

Document Processing & Compliance

A financial services firm processes documents through an AI pipeline — PDF parsing, classification, extraction, and compliance checks. Foil traces each stage end to end.

Key Insight

Anchored invariants enforce 97% accuracy and <30s processing. Change detector flagged 12% accuracy drift after a model update before any compliance violation.

Tracing Evaluations Anchors

External-facing

AI Onboarding Agent

A SaaS company deploys an agent that walks new users through product setup — answering questions, configuring settings, and escalating to humans when stuck. Foil monitors every onboarding session end-to-end.

Key Insight

Drift detection caught the agent routing 3× more users to human escalation after a docs update changed the setup flow — the agent's help guide was now outdated.

Tracing Alerting Drift Detection

Internal

Code Review & CI Triage

An engineering team uses AI to review PRs and triage CI failures. Foil monitors invocations, suggestion acceptance rates, and developer feedback.

Key Insight

A Sunday volume spike revealed a broken dependency causing cascading CI failures — burning API credits unnoticed until Foil flagged the anomaly.

Signals Alerting Analytics

Agent Learning

Your agents reveal themselves over time

Foil continuously monitors your agents in production — quantitative metrics update in real-time while AI-generated behavioral profiles deepen with every learning cycle. The result: observability that's always accurate and always getting smarter.

Live Metrics Always On

From trace 1

Live Metrics, Instantly

The moment your agent sends its first trace, Foil starts tracking latency, error rates, tool usage, and volume — updating every 60 seconds. By the time your AI profile generates at 50 traces, you already have a full operational picture.

Real-time latency & errors · Tool usage distribution · Volume & temporal patterns

From 50 traces

First Profile & Anchors

At 50 traces, Foil generates your agent's first behavioral profile — identity, tool patterns, error analysis, and insights — alongside health anchors: falsifiable claims like "error rate stays below 5%" that are continuously validated against live data.

Behavioral identity & analysis · Falsifiable health anchors · AI-generated insights
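To make the idea of a falsifiable anchor concrete, here is a minimal sketch that treats each anchor as a metric, a comparison, and a limit, then checks it against a snapshot of live metrics. The anchor object shape, the `checkAnchors` helper, and the sample metric values are assumptions for illustration. Foil's actual anchor format is not described on this page.

```javascript
// Hypothetical anchor shape: a human-readable claim plus a machine-checkable rule.
const anchors = [
  { claim: 'Error rate stays below 5%', metric: 'errorRate', op: 'lt', limit: 0.05 },
  { claim: 'Daily volume exceeds 200 traces', metric: 'dailyVolume', op: 'gt', limit: 200 },
];

const comparators = {
  lt: (value, limit) => value < limit,
  gt: (value, limit) => value > limit,
};

// Evaluates every anchor against a snapshot of live metrics.
function checkAnchors(anchorList, metrics) {
  return anchorList.map((anchor) => ({
    claim: anchor.claim,
    passing: comparators[anchor.op](metrics[anchor.metric], anchor.limit),
  }));
}

// Sample snapshot (illustrative values only).
const results = checkAnchors(anchors, { errorRate: 0.008, dailyVolume: 240 });
const passingCount = results.filter((r) => r.passing).length;
console.log(`${passingCount}/${results.length} anchors passing`);
```

Because each anchor is a plain comparison, any new metrics snapshot can falsify it, which is what allows continuous validation.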

50 → convergence

Rapid Refinement

The profile re-learns at geometric intervals (125, 313, 783+ traces), refining with each cycle. When 2 consecutive cycles find no material changes, the profile converges and transitions to steady-state.

Geometric learning cycles · Convergence detection · Progressively deeper analysis
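The thresholds quoted above (50, 125, 313, 783) are consistent with multiplying by 2.5 and rounding up each cycle. The sketch below reproduces that schedule and the "two consecutive quiet cycles" convergence rule. The 2.5 factor is inferred from the published numbers, and both function names are hypothetical.

```javascript
// Assumption: a 2.5x growth factor, inferred from 50 -> 125 -> 313 -> 783.
const FIRST_PROFILE_AT = 50;
const GROWTH_FACTOR = 2.5;

// Returns the first `count` trace thresholds at which the profile re-learns.
function learningThresholds(count) {
  const thresholds = [FIRST_PROFILE_AT];
  while (thresholds.length < count) {
    const previous = thresholds[thresholds.length - 1];
    thresholds.push(Math.ceil(previous * GROWTH_FACTOR));
  }
  return thresholds;
}

// Converged once the last `required` cycles each reported no material change.
// `cycleChanged` holds one boolean per completed learning cycle, oldest first.
function hasConverged(cycleChanged, required = 2) {
  if (cycleChanged.length < required) return false;
  return cycleChanged.slice(-required).every((changed) => !changed);
}

console.log(learningThresholds(4)); // [ 50, 125, 313, 783 ]
console.log(hasConverged([true, false, false])); // true
```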

Converged

Drift Detection & Self-Healing

Re-enters bootstrap on anchor break

Once converged, learning becomes change-driven. Foil monitors for behavioral drift — distribution shifts, rate changes, new tool or error types — and automatically re-learns when something meaningful changes. If enough anchors break, it re-enters rapid learning to self-correct.

Automated drift detection · Change-driven re-learning · Anchor-break self-healing
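One common way to quantify a distribution shift in tool usage is total variation distance between a baseline and a recent window. The sketch below uses that measure and also reports tools that never appeared in the baseline. The choice of metric, the 0.15 threshold, and the sample distributions are assumptions for illustration, not a description of how Foil computes drift.

```javascript
// Total variation distance between two discrete distributions,
// each given as { label: probability }. Ranges from 0 (identical) to 1.
function totalVariationDistance(baseline, current) {
  const labels = new Set([...Object.keys(baseline), ...Object.keys(current)]);
  let sum = 0;
  for (const label of labels) {
    sum += Math.abs((baseline[label] ?? 0) - (current[label] ?? 0));
  }
  return sum / 2;
}

// Hypothetical threshold; a real system would tune this per agent.
const DRIFT_THRESHOLD = 0.15;

// Reports the distance, any tools not seen in the baseline, and a drift verdict.
function detectDrift(baseline, current) {
  const distance = totalVariationDistance(baseline, current);
  const newLabels = Object.keys(current).filter((label) => !(label in baseline));
  return { distance, newLabels, drifted: distance > DRIFT_THRESHOLD || newLabels.length > 0 };
}

// Illustrative tool-usage shares for a baseline and a recent window.
const baselineUsage = { search_inventory: 0.5, check_slots: 0.3, book_appt: 0.2 };
const recentUsage = { search_inventory: 0.3, check_slots: 0.2, transfer_call: 0.5 };
console.log(detectDrift(baselineUsage, recentUsage));
```

In this example the distance is about 0.5 and `transfer_call` is reported as a new tool, so the window counts as drifted on both signals.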

Agent Profiles

Every agent gets a living profile

Foil builds a behavioral profile for each agent that separates always-fresh quantitative metrics from AI-generated behavioral intelligence. Live metrics like latency, error rates, and tool distribution update continuously. AI insights deepen as your agent processes more traces.

Live metrics from the first trace — latency, errors, tool usage updated in real-time
AI behavioral profiles bootstrap from just 50 traces
Health anchors: falsifiable claims that are continuously validated
Drift detection alerts when behavior shifts from baseline
Lock specific profile fields to prevent auto-updates

support-agent Steady State

240 traces · 2/2 anchors passing

Auto Learning

Identity

Customer support agent for vehicle inquiries, test drives, and dealership operations

high confidence established

Behavioral Overview

240

Daily Volume

0.0%

Error Rate

1.2s

Median Latency

Active Hours

Tool Usage

search_inventory 46.7% · gpt-4o-mini 29.8% · check_slots 13.3% · transfer_call 9.6% · book_appt 6.3%

Insights AI-derived

◆ High daily volume (240 traces) indicates robust customer engagement

◆ Predominant use of search_inventory (46.7%) suggests most interactions check vehicle availability

Health Anchors 2/2 passing

Error rate stays below 1%

Daily volume exceeds 200 traces

Confidence: low

Smart Evaluations

Profile-powered evaluations that learn your agent

Every trace is evaluated against 9 built-in checks — hallucination, PII, prompt injection, and more. But unlike generic monitoring tools, Foil's evaluations use your agent's behavioral profile as context. A response that's normal for one agent might be anomalous for another. Foil knows the difference.

9 built-in evaluations: hallucination, PII, injection, jailbreak, quality, frustration, satisfaction, stuck detection, NSFW
Evaluations use live agent context — tool patterns, error baselines, behavioral norms
Create custom evaluations with few-shot examples from your own traces
Anomaly detection powered by agent profiles — catch deviations generic tools miss
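To illustrate why the same observation can be normal for one agent and anomalous for another, the sketch below scores a latency against a per-agent baseline using a z-score. The baseline numbers, the threshold of 3, and the `isAnomalous` helper are hypothetical. Foil's evaluations draw on richer profile context than this.

```javascript
// Hypothetical per-agent baselines: mean and standard deviation of latency in seconds.
const baselines = {
  'support-agent': { mean: 1.2, stdDev: 0.3 },
  'research-agent': { mean: 3.9, stdDev: 0.8 },
};

// Flags a latency that sits more than `zThreshold` standard deviations from the agent's mean.
function isAnomalous(agentName, latencySeconds, zThreshold = 3) {
  const baseline = baselines[agentName];
  if (!baseline) return false; // no profile yet, nothing to compare against
  const zScore = Math.abs(latencySeconds - baseline.mean) / baseline.stdDev;
  return zScore > zThreshold;
}

console.log(isAnomalous('support-agent', 3.8));  // true: far outside this agent's norm
console.log(isAnomalous('research-agent', 3.8)); // false: typical for this agent
```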

Evaluation Pipeline

Incoming Trace trc_9f2a3b

support-agent · "How do I track my order?"

Agent Profile Context

support-agent · tool patterns, error baselines, behavioral norms

Built-in Evaluations

Hallucination

PII

Injection

Quality

NSFW

Jailbreak

Stuck/Loop

Frustration

Satisfaction

Custom Evaluations Pro

Brand Voice Compliance: Pass

Refund Policy Accuracy: Warning

Platform Features

The deepest visibility into agent behavior

Purpose-built for AI agents. See what no other tool can show you.

Full Thought Tracing

See every step your agent takes: each LLM call, tool invocation, memory read, and branching decision. Understand its complete reasoning chain.

Hallucination Detection

Catch the moment your agent makes things up. Flag responses that contradict context or fabricate sources, in real-time.

Safety Monitoring

Detect policy violations, harmful outputs, and prompt injections before they reach users. See why they happened.

Cost Analytics

Track token usage and API spend down to individual decisions. Find exactly what's burning through your budget.

User Signals

Link user feedback directly to agent decisions. Understand which reasoning paths lead to good or bad outcomes.

Failure Replay

Replay any failed execution step-by-step. Rewind to the exact moment things went wrong and see every detail.

Deep Search

Find any conversation instantly

Natural language search across all your traces. Ask questions like "show me all conversations where the agent mentioned a refund" and get instant, semantically relevant results.

conversations about password resets

trc_abc1 · 2h ago

I can help you reset your password...

98%

trc_def2 · 5h ago

Your password has been reset successfully...

94%

trc_ghi3 · 1d ago

For security, we require email verification...

89%

847 results found

Cost Intelligence

Know exactly where your budget goes

Track costs per request, model, and agent. Get budget alerts, monthly projections, and identify your most expensive workflows before they drain your budget.

$1,247 This month

gpt-4o: $892

gpt-4o-mini: $234

claude-3.5: $121

Projected monthly: $2,494

Budget remaining: $753
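The figures in this example are internally consistent under a simple run-rate model: the three per-model amounts sum to $1,247, and doubling that gives the $2,494 projection, as it would halfway through a month. The sketch below works through that arithmetic. The linear projection, the 15-of-30-days assumption, and the $2,000 budget (month-to-date spend plus the remaining amount) are inferences from the numbers shown, not documented Foil behavior.

```javascript
// Spend per model, taken from the example figures above.
const spendByModel = { 'gpt-4o': 892, 'gpt-4o-mini': 234, 'claude-3.5': 121 };

function totalSpend(spend) {
  return Object.values(spend).reduce((sum, amount) => sum + amount, 0);
}

// Linear run-rate projection: assumes spend continues at the average daily rate so far.
function projectMonthlySpend(spentSoFar, daysElapsed, daysInMonth) {
  if (daysElapsed <= 0) throw new RangeError('daysElapsed must be positive');
  return (spentSoFar * daysInMonth) / daysElapsed;
}

const spent = totalSpend(spendByModel);                 // 1247
const projected = projectMonthlySpend(spent, 15, 30);   // 2494
const assumedBudget = spent + 753;                      // 2000, inferred from "budget remaining"
const projectedOverage = Math.max(0, projected - assumedBudget);

console.log({ spent, projected, assumedBudget, projectedOverage });
```

Under these assumptions, the projection exceeds the budget by $494, which is the kind of early warning a budget alert is meant to surface.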

Integration

Up and running in minutes

Built on OpenTelemetry. Zero code changes to your LLM calls. Agent profiles build automatically.

Install the SDK

Run npm install @foil/foil-js for JavaScript or pip install foil-ai for Python. Built on OpenTelemetry for automatic instrumentation of all LLM calls.

npm install @foil/foil-js

Initialize Foil

One call at app startup. Foil auto-instruments your LLM calls via OpenTelemetry, so no code changes are needed.

Foil.init({ apiKey, agentName })

Use your LLM as normal

Every OpenAI, Anthropic, or other LLM call is automatically traced. Agent profiles build from real usage.

// Calls are traced automatically!

agent.js

const { Foil } = require('@foil/foil-js/otel');
const OpenAI = require('openai');

// Initialize Foil (do this once at app startup)
Foil.init({
  apiKey: process.env.FOIL_API_KEY,
  agentName: 'my-first-agent',
});

// Use OpenAI as normal - it's automatically traced!
// The client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();

// CommonJS modules do not support top-level await, so the call runs inside an async function.
async function main() {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'What is the capital of France?' }],
  });

  console.log(response.choices[0].message.content);
  // ↑ This call was automatically traced to Foil!
}

main().catch(console.error);

Pricing

Simple, transparent pricing

Get started with a free trial. Upgrade as you grow.

Pro Trial

Free

14 days of Pro features

All Pro features included
10,000 spans included
No credit card required

Start Free Trial

Starter

$49/mo

$0.001 / interaction

Unlimited spans, agents & retention
Alerts
Exports
Deep Search
Email support

Get Started

Pro

14-day free trial

$149/mo

$0.005 / interaction

Everything in Starter, plus:
Model training on your data
Deep Search + Semantic Search + Smart Search
Prompts
SSO & RBAC
Priority support
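For a rough comparison of the two paid plans, the sketch below estimates a monthly bill as the base price plus the per-interaction rate times volume. The plan figures come from the table above. The additive billing formula and the `estimateMonthlyBill` helper are assumptions, so check the full pricing details before relying on the result.

```javascript
// Plan figures from the pricing table; the additive formula is an assumption.
const plans = {
  starter: { baseMonthly: 49, perInteraction: 0.001 },
  pro: { baseMonthly: 149, perInteraction: 0.005 },
};

// Estimated monthly bill in dollars, rounded to the nearest cent.
function estimateMonthlyBill(planName, interactions) {
  const plan = plans[planName];
  if (!plan) throw new Error(`Unknown plan: ${planName}`);
  const total = plan.baseMonthly + plan.perInteraction * interactions;
  return Math.round(total * 100) / 100;
}

console.log(estimateMonthlyBill('starter', 100000)); // 149
console.log(estimateMonthlyBill('pro', 100000));     // 649
```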

Start Free Trial


Stop guessing. Start understanding.

Real-time metrics from trace one. Behavioral intelligence that deepens over time. Evaluations that know your agent.