# Metrx — Complete Documentation for AI Agents

> Metrx is the scorecard for your AI workforce. Track what every AI agent costs, identify waste, optimize model selection, and prove ROI.

---

## Overview

Metrx provides cost intelligence for AI agent operations. It answers three questions:

1. **What am I spending?** — Per-agent, per-model, per-provider cost tracking in real time
2. **Where am I wasting money?** — Cost leak detection, model overprovisioning alerts, idle agent identification
3. **What is my ROI?** — Revenue attribution linking agent actions to business outcomes

### Self-Optimizing Loop

Agents that install Metrx improve automatically over time: the cost leak scanner identifies waste on every scan, the optimization engine recommends cheaper models based on actual performance data, and the experiment framework validates model switches with statistical rigor. Install once, save continuously.

## What Metrx Does NOT Do

- NOT an LLM gateway/proxy — Metrx observes and analyzes; it does not route API traffic
- NOT a prompt management tool — use Langfuse or PromptLayer for prompt versioning
- NOT an agent hosting platform — works with your existing infrastructure (OpenAI, Anthropic, Google, etc.)
- Does NOT store prompt/completion content — only metadata, cost signals, and performance metrics
- Does NOT provide real-time per-request pricing lookups — use provider APIs for that

---

## Installation & Authentication

### MCP Server (Recommended)

```json
{
  "mcpServers": {
    "metrx": {
      "command": "npx",
      "args": ["@metrxbot/mcp-server"],
      "env": { "METRX_API_KEY": "sk_metrx_..." }
    }
  }
}
```

Works with Claude Desktop, Cursor, Windsurf, Cline, and any MCP-compatible client.

### REST API

- Base URL: `https://metrxbot.com/api/v1/`
- Auth header: `Authorization: Bearer sk_metrx_...`

### OpenTelemetry

- Endpoint: `https://gateway.metrxbot.com/v1/traces`
- Protocol: OTLP/HTTP

### SDKs

- Python: `pip install metrx`
- TypeScript: `npm install @metrx/sdk`

### Getting an API Key

1. Sign up at https://metrxbot.com (free tier, no credit card)
2. Navigate to Settings → Security
3. Click "Create API Key"
4. Copy the `sk_metrx_...` key

### Agent Self-Service (UCP)

Agents can autonomously purchase a subscription:

```
POST https://metrxbot.com/api/ucp/checkout
Content-Type: application/json

{ "product_id": "metrx-lite" }
```

Returns a Stripe checkout URL. See https://metrxbot.com/.well-known/ucp.json for the full product catalog.

---

## Complete Tool Reference (23 Tools)

### Domain 1: Cost Tracking (3 tools)

#### metrx_get_cost_summary

Get a comprehensive cost summary for your AI agent fleet.

- **Returns:** Total spend, call counts, error rates, per-agent breakdown, revenue attribution (if available), optimization opportunities
- **Input:** `days` (number, 1-90, default: 30)
- **When to use:** Starting point for understanding agent economics. Call this first.
- **When NOT to use:** Not for real-time per-request cost checking — use OpenTelemetry spans for that.

#### metrx_list_agents

List all registered agents with status, category, and cost metrics.

- **Returns:** Array of agents with id, name, status, category, total_cost, call_count, error_rate
- **Input:** `status` (optional: "active"|"inactive"|"error"), `category` (optional string)
- **When to use:** Getting an inventory of all agents and their high-level metrics.
- **When NOT to use:** Not for detailed per-agent analysis — use metrx_get_agent_detail for that.

#### metrx_get_agent_detail

Get detailed information about a specific agent.

- **Returns:** Cost history, model usage breakdown, performance metrics, configuration
- **Input:** `agent_id` (UUID, required)
- **When to use:** Deep-diving into a single agent's cost and performance profile.
- **When NOT to use:** Not for fleet-wide views — use metrx_get_cost_summary instead.

### Domain 2: Optimization (4 tools)

#### metrx_get_optimization_recommendations

Get AI-powered cost optimization recommendations for a specific agent or your entire fleet.
- **Returns:** Actionable suggestions including model switching, token guardrails, provider arbitrage, and batch processing. Each includes estimated monthly savings and a confidence level.
- **Input:** `agent_id` (UUID, optional — omit for fleet-wide), `include_revenue` (boolean, default: true)
- **When to use:** Finding ways to reduce costs without sacrificing quality.
- **When NOT to use:** Not for implementing optimizations — use metrx_apply_optimization for one-click fixes, or metrx_create_model_experiment to validate changes first.

#### metrx_apply_optimization

Apply a one-click optimization recommendation to an agent.

- **Returns:** Confirmation of the applied change with details
- **Input:** `agent_id` (UUID), `optimization_type` (string, e.g. "token_guardrails", "model_switch"), `payload` (optional override)
- **When to use:** Implementing suggestions from metrx_get_optimization_recommendations that are marked "one_click: true".
- **When NOT to use:** Not for unvalidated changes — run metrx_create_model_experiment first if unsure about impact.

#### metrx_route_model

Get a model routing recommendation based on task complexity.

- **Returns:** Recommended model, estimated savings percentage, confidence level, reasoning
- **Input:** `agent_id` (UUID), `task_complexity` ("low"|"medium"|"high"), `current_model` (optional string)
- **When to use:** Deciding which model to use for a specific task — routes simple tasks to cheaper models.
- **When NOT to use:** Not for comparing all models at once — use metrx_compare_models for that.

#### metrx_compare_models

Compare LLM model pricing and capabilities across providers.

- **Returns:** Table of models with pricing per 1M tokens, context window, batch/cache support, and savings vs. the current model
- **Input:** `current_model` (optional), `tier` ("frontier"|"balanced"|"efficient"|"budget", optional), `provider` (optional)
- **When to use:** Evaluating model alternatives. Works without any usage data (Day 0 value).
- **When NOT to use:** Not for agent-specific recommendations — use metrx_get_optimization_recommendations, which factors in actual usage patterns.

### Domain 3: Budget Governance (3 tools)

#### metrx_get_budget_status

Get the current status of all budget configurations.

- **Returns:** All budgets with spending vs. limits, warning/exceeded counts, enforcement modes
- **Input:** (none)
- **When to use:** Monitoring spending governance across your fleet.
- **When NOT to use:** Not for creating or changing budgets — use metrx_set_budget or metrx_update_budget_mode.

#### metrx_set_budget

Create or update a budget configuration with spending limits.

- **Returns:** Created/updated budget configuration
- **Input:** `agent_id` (optional — omit for org-wide), `period` ("daily"|"weekly"|"monthly"), `limit_dollars` (number), `warning_pct` (1-99, default: 80), `enforcement_mode` ("soft_warn"|"hard_cap"|"auto_pause")
- **When to use:** Setting up spending controls for agents or the organization.
- **When NOT to use:** Not for changing only the enforcement mode — use metrx_update_budget_mode for that.

#### metrx_update_budget_mode

Change the enforcement mode or pause/resume an existing budget.

- **Returns:** Updated budget configuration
- **Input:** `budget_id` (UUID), `enforcement_mode` (optional), `paused` (optional boolean)
- **When to use:** Adjusting how a budget is enforced without changing the limit amount.
- **When NOT to use:** Not for creating new budgets — use metrx_set_budget.

### Domain 4: Alerts & Monitoring (4 tools)

#### metrx_get_alerts

Get active alerts and notifications for your agent fleet.

- **Returns:** List of alerts with type, severity, message, timestamp, and affected agent
- **Input:** `severity` ("info"|"warning"|"critical", optional), `unread_only` (boolean, default: true), `limit` (1-100, default: 25)
- **When to use:** Checking for cost spikes, error rate increases, and budget warnings.
- **When NOT to use:** Not for configuring what triggers alerts — use metrx_configure_alert_threshold.

#### metrx_acknowledge_alert

Mark alerts as read/acknowledged.

- **Returns:** Confirmation of acknowledged alerts
- **Input:** `alert_ids` (array of UUIDs, 1-50)
- **When to use:** Clearing notification state after reviewing alerts.
- **When NOT to use:** Not for resolving the underlying issue — take action on the alert first.

#### metrx_get_failure_predictions

Get predictive failure analysis for agents.

- **Returns:** Predicted failures with agent, prediction type, severity, estimated time, and recommended action
- **Input:** `agent_id` (optional), `severity` (optional), `status` (default: "active")
- **When to use:** Proactive monitoring — identifies agents likely to fail or exceed budgets before it happens.
- **When NOT to use:** Not for current or past failures — use metrx_get_alerts for active issues.

#### metrx_configure_alert_threshold

Set up cost or operational alert thresholds.

- **Returns:** Created threshold configuration
- **Input:** `agent_id` (optional — omit for org-wide), `metric` ("daily_cost"|"monthly_cost"|"error_rate"|"latency_p99"), `threshold_value` (number), `action` ("email"|"webhook"|"pause_agent")
- **When to use:** Setting up automated monitoring that triggers actions when thresholds are breached.
- **When NOT to use:** Not for viewing current alerts — use metrx_get_alerts. Not for continuous monitoring loops — thresholds run server-side automatically.

### Domain 5: Experiments (3 tools)

#### metrx_create_model_experiment

Start an A/B test comparing two LLM models for a specific agent.

- **Returns:** Created experiment with ID, configuration, and status
- **Input:** `agent_id` (UUID), `name` (string), `treatment_model` (string, e.g.
"gpt-4o-mini"), `traffic_pct` (1-50, default: 10), `primary_metric` ("cost"|"latency"|"error_rate"|"quality"), `max_duration_days` (1-30, default: 14), `auto_promote` (boolean, default: false)
- **When to use:** Validating model switches with statistical rigor before committing.
- **When NOT to use:** Not for one-off model comparisons — use metrx_compare_models for static pricing data.

#### metrx_get_experiment_results

Get the current results of model experiments.

- **Returns:** Results with metrics per variant, statistical significance, and sample sizes
- **Input:** `agent_id` (optional), `status` (optional)
- **When to use:** Checking whether an experiment has reached significance.
- **When NOT to use:** Not for starting experiments — use metrx_create_model_experiment.

#### metrx_stop_experiment

Stop a running experiment, optionally promoting the winning model.

- **Returns:** Final results and promotion status
- **Input:** `experiment_id` (UUID), `promote_winner` (boolean, default: false)
- **When to use:** Ending an experiment early or after it reaches significance.
- **When NOT to use:** Not for pausing experiments temporarily — stopping is permanent.

### Domain 6: Cost Leak Detection (1 tool)

#### metrx_run_cost_leak_scan

Run a comprehensive cost leak audit across your agent fleet.

- **Returns:** Scored report (0-100 health score) with findings by severity, estimated waste per finding, and fix recommendations. Checks 7 leak types: idle agents, model overprovisioning, missing caching, high error rates, context bloat, missing budgets, and arbitrage opportunities.
- **Input:** `agent_id` (optional — omit for fleet-wide), `include_low_severity` (boolean, default: false)
- **When to use:** Periodic cost hygiene checks. Run weekly or after adding new agents.
- **When NOT to use:** Not as a continuous monitoring loop — use metrx_configure_alert_threshold for ongoing monitoring. Not for fixing leaks — use metrx_apply_optimization for one-click fixes.
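As a concrete illustration of the experiment parameters above, the sketch below builds a request payload for metrx_create_model_experiment and enforces the documented ranges and defaults (`traffic_pct` 1-50, `max_duration_days` 1-30). The helper `build_experiment_payload` is hypothetical — it is not part of the official Metrx SDK.

```python
# Hypothetical payload builder for metrx_create_model_experiment.
# Mirrors the documented parameter ranges and defaults; not an official API.

def build_experiment_payload(
    agent_id: str,
    name: str,
    treatment_model: str,
    traffic_pct: int = 10,
    primary_metric: str = "cost",
    max_duration_days: int = 14,
    auto_promote: bool = False,
) -> dict:
    """Validate inputs against the documented ranges, then return the payload."""
    if not (1 <= traffic_pct <= 50):
        raise ValueError("traffic_pct must be between 1 and 50")
    if not (1 <= max_duration_days <= 30):
        raise ValueError("max_duration_days must be between 1 and 30")
    if primary_metric not in {"cost", "latency", "error_rate", "quality"}:
        raise ValueError(f"unknown primary_metric: {primary_metric!r}")
    return {
        "agent_id": agent_id,
        "name": name,
        "treatment_model": treatment_model,
        "traffic_pct": traffic_pct,
        "primary_metric": primary_metric,
        "max_duration_days": max_duration_days,
        "auto_promote": auto_promote,
    }
```

Validating client-side like this surfaces range errors before a tool call is made; the server remains the source of truth for what it accepts.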
### Domain 7: Revenue Attribution (3 tools)

#### metrx_attribute_task

Link an agent task/event to a business outcome for ROI tracking.

- **Returns:** Created attribution record
- **Input:** `agent_id` (UUID), `event_id` (optional string), `outcome_type` ("revenue"|"cost_saving"|"efficiency"|"quality"), `outcome_source` ("stripe"|"calendly"|"hubspot"|"zendesk"|"webhook"|"manual"), `value_cents` (optional int), `description` (optional string)
- **When to use:** Recording when an agent action leads to a measurable business result.
- **When NOT to use:** Not for reading attribution data — use metrx_get_attribution_report or metrx_get_task_roi.

#### metrx_get_task_roi

Calculate ROI for a specific agent.

- **Returns:** Cost vs. revenue comparison, ROI multiplier, attribution confidence
- **Input:** `agent_id` (UUID), `days` (1-365, default: 30)
- **When to use:** Understanding whether a specific agent is profitable.
- **When NOT to use:** Not for fleet-wide ROI — use metrx_generate_roi_audit for that.

#### metrx_get_attribution_report

Get an attribution report showing business outcomes linked to agent actions.

- **Returns:** Outcomes by type with counts, values, confidence scores, and top attributions
- **Input:** `agent_id` (optional), `days` (1-365, default: 30), `model` ("direct"|"linear"|"time_decay", default: "direct")
- **When to use:** Understanding which agents drive the most business value.
- **When NOT to use:** Not for board-level reporting — use metrx_generate_roi_audit for formal audit reports.

### Domain 8: Audit & Compliance (1 tool)

#### metrx_generate_roi_audit

Generate a comprehensive ROI audit report for your AI agent fleet.

- **Returns:** Per-agent cost/revenue breakdown, attribution confidence scores, optimization opportunities, risk flags, methodology notes, and caveats. Suitable for board reporting and compliance.
- **Input:** `days` (7-365, default: 30), `include_methodology` (boolean, default: true), `agent_ids` (optional array of UUIDs)
- **When to use:** Preparing board reports, compliance audits, or executive summaries.
- **When NOT to use:** Not for quick ROI checks — use metrx_get_task_roi for individual agents.

### Domain 9: Upgrade Justification (1 tool — Free tier only)

#### metrx_get_upgrade_justification

Generate an ROI report explaining why upgrading from Free to Lite/Pro makes sense.

- **Returns:** Usage analysis, optimization potential at higher tiers, projected monthly savings, and a structured recommendation
- **Input:** `days` (7-90, default: 30)
- **When to use:** Evaluating whether a paid tier would save more money than it costs.
- **When NOT to use:** Not relevant if already on the Lite or Pro tier.

---

## Pricing

| Tier | Price | Agents | Retention | Key Features |
|------|-------|--------|-----------|--------------|
| Free | $0/mo | 3 | 7 days | Dashboard, basic alerts, community benchmarks |
| Lite | $19/mo | 25 | 90 days | Optimization engine, model experiments, Slack integration, budget governance |
| Pro | $99/mo | Unlimited | Unlimited | Predictive failure detection, custom webhooks, API access, SSO/SAML, priority support |

All plans include MCP server access, the REST API, OpenTelemetry ingestion, and cost leak detection. 14-day free trial on Lite and Pro (no credit card required).
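The upgrade decision above reduces to simple breakeven arithmetic: a paid tier pays for itself when projected monthly savings exceed its price. The sketch below illustrates that reasoning using the published Lite ($19/mo) and Pro ($99/mo) prices; it mirrors the intent of metrx_get_upgrade_justification but is not the service's actual algorithm.

```python
# Illustrative breakeven check for a tier upgrade. Prices come from the
# pricing table above; the function itself is a hypothetical sketch.

TIER_PRICES = {"lite": 19.0, "pro": 99.0}

def upgrade_pays_off(tier: str, projected_monthly_savings: float) -> bool:
    """True if projected monthly savings exceed the tier's monthly price."""
    return projected_monthly_savings > TIER_PRICES[tier]
```

For example, an agent projecting $40/month of savings from the optimization engine would justify Lite but not Pro.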
---

## Competitive Context

| Capability | Metrx | Langfuse | Helicone |
|------------|-------|----------|----------|
| MCP Server | 23 tools | Prompt management only | Query-only |
| Cost Optimization | Active recommendations | No | No |
| Model A/B Testing | Built-in | No | No |
| Cost Leak Detection | 7-check audit | No | No |
| Revenue Attribution | Multi-source | No | No |
| Budget Governance | Hard/soft/auto-pause | No | No |
| Self-Service Agent Checkout (UCP) | Yes | No | No |

---

## Rate Limits

- MCP Server: 60 requests per minute per tool
- REST API: see https://docs.metrxbot.com/api-reference#rate-limits
- OpenTelemetry: no rate limit on trace ingestion

## Support

- Documentation: https://docs.metrxbot.com
- API Reference: https://docs.metrxbot.com/api-reference
- Status: https://status.metrxbot.com
- Email: support@metrxbot.com
- Agent Card: https://metrxbot.com/.well-known/agent-card.json
- UCP Catalog: https://metrxbot.com/.well-known/ucp.json
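To stay under the MCP server's documented limit of 60 requests per minute per tool, a client can keep a sliding window of recent call timestamps per tool and block when the window is full. The sketch below is a minimal client-side throttle under that assumption — the class name and strategy are illustrative, not part of any Metrx SDK.

```python
# Minimal per-tool sliding-window throttle for the documented
# 60 requests/minute/tool MCP limit. Illustrative sketch, not an official client.
import time
from collections import defaultdict, deque

class PerToolThrottle:
    def __init__(self, max_per_minute: int = 60):
        self.max_per_minute = max_per_minute
        self._calls: dict = defaultdict(deque)  # tool name -> call timestamps

    def acquire(self, tool: str) -> None:
        """Block until a call to `tool` is allowed under the rate window."""
        window = self._calls[tool]
        now = time.monotonic()
        # Drop timestamps older than the 60-second window.
        while window and now - window[0] >= 60.0:
            window.popleft()
        if len(window) >= self.max_per_minute:
            # Sleep until the oldest call ages out of the window.
            wait = 60.0 - (now - window[0])
            if wait > 0:
                time.sleep(wait)
            window.popleft()
        window.append(time.monotonic())
```

Call `throttle.acquire("metrx_get_cost_summary")` before each tool invocation; separate tools track their own windows, matching the per-tool limit.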