
AgentCostPilot Docs

Everything you need to set up, integrate, and optimize your AI agent costs.

Quick Start Guide

Get up and running in under 2 minutes.

1. Create an account

Sign up at agentcostpilot.com/signup with email or Google OAuth. No credit card required.

2. Connect a provider

Go to Dashboard → Providers → Connect. Paste your OpenAI or Anthropic API key. We use read-only access.

3. See your costs

Your dashboard populates within minutes. View cost by agent, model, and task. Set alerts and get optimization recommendations.

Connect a Provider

AgentCostPilot supports 9 LLM providers.

Provider       Key Format       Scope
OpenAI         sk-proj-...      usage.read
Anthropic      sk-ant-...       admin (usage read)
Google AI      AIza...          Vertex AI usage
AWS Bedrock    AKIA...          bedrock:GetUsage
Azure OpenAI   Endpoint + key   Metrics read
Mistral AI     sk-...           Usage API
Cohere         co-...           Usage read
Groq           gsk_...          Usage read
Together AI    tog-...          Usage read

All API keys are encrypted with AES-256.

Create Your First Agent

Two ways to create agents:

Manual (Dashboard)

Dashboard → Agents → + Add agent.

Automatic (API/MCP)

Sending a new agent_id via MCP or the REST API creates the agent automatically.

Cost Overview

The main dashboard shows total spend (month to date), active agents, potential savings, and token usage. Filter by 24h/7d/30d/90d. It also shows budget burn rate, a provider breakdown, and top agents by cost.

Agent Management

Each agent has: cost history (30-day chart), model usage breakdown, budget utilization, health score (0-100), and personalized recommendations.

Alerts & Budgets

Alert types: threshold-based (daily spend exceeds $X) and smart alerts (anomaly detection using statistical analysis). Budgets can be enforced with a kill switch.
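
The two alert types can be sketched as follows. This is only an illustration: the docs do not say which statistical method ACP uses, so the z-score test, the function names, and the default `z_limit` are assumptions.

```python
from statistics import mean, stdev


def threshold_alert(daily_spend_usd: float, limit_usd: float) -> bool:
    """Threshold alert: fire when a day's spend exceeds a fixed limit."""
    return daily_spend_usd > limit_usd


def anomaly_alert(history_usd: list[float], today_usd: float,
                  z_limit: float = 3.0) -> bool:
    """Smart alert sketch (assumed z-score test): fire when today's spend is
    more than z_limit standard deviations above the recent mean."""
    if len(history_usd) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history_usd)
    sigma = stdev(history_usd)
    if sigma == 0:
        return today_usd > mu  # flat history: any increase is unusual
    return (today_usd - mu) / sigma > z_limit


history = [10.0, 12.0, 11.0, 9.0, 13.0, 10.0, 12.0]
if anomaly_alert(history, today_usd=40.0):
    print("anomaly: daily spend is far above the recent average")
```

A fixed threshold catches hard limits, while the statistical check adapts to each agent's normal spend level.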

Recommendations

Actionable recommendations, each with an Apply button:

  • Switch model — e.g., GPT-4o → Claude Haiku for 73% savings
  • Cascade routing — start cheap, escalate on uncertainty
  • Kill idle agents — no output for 24h+
  • Enable prompt caching — repeated prompts detected, save 90%
  • Batch requests — reduce individual API calls

MCP Server Setup

ACP exposes a standard MCP server. Works with Claude, GPT, Gemini, LangChain, CrewAI, AutoGen.

// claude_desktop_config.json
{
  "mcpServers": {
    "agentcostpilot": {
      "url": "https://mcp.agentcostpilot.com",
      "headers": { "Authorization": "Bearer acp_sk_..." }
    }
  }
}

Tool                     Description
acp_check_budget         Check remaining budget before expensive calls
acp_report_usage         Report token usage after each LLM call
acp_get_recommendation   Get model recommendation for current task
acp_check_limits         Check limits and agent status
acp_heartbeat            Confirm agent is alive and active

REST API Reference

Base URL: https://api.agentcostpilot.com/v1

Check Budget

GET /v1/agents/{agent_id}/budget

Response:
{ "agent_id": "support-bot", "budget_usd": 500.00,
  "spent_usd": 357.50, "remaining_usd": 142.50,
  "utilization_pct": 71.5, "status": "active" }

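A pre-flight check based on the budget response above might look like this. It is a sketch: the `can_afford` helper is hypothetical, and it assumes any `status` other than "active" means the agent should not spend.

```python
import json


def can_afford(budget: dict, estimated_cost_usd: float) -> bool:
    """True if the agent is active and the estimated cost fits the remaining budget."""
    return (budget["status"] == "active"
            and estimated_cost_usd <= budget["remaining_usd"])


# Sample body copied from the Check Budget response.
budget = json.loads("""
{ "agent_id": "support-bot", "budget_usd": 500.00,
  "spent_usd": 357.50, "remaining_usd": 142.50,
  "utilization_pct": 71.5, "status": "active" }
""")

if can_afford(budget, estimated_cost_usd=0.05):
    print("ok to call the model")
```
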
Report Usage

POST /v1/usage
{ "agent_id": "support-bot", "model": "gpt-4o",
  "input_tokens": 1250, "output_tokens": 380,
  "task": "customer_reply" }

Response:
{ "cost_usd": 0.0284, "budget_remaining_usd": 142.47 }
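
Without the SDK, the same report can be built with the Python standard library. This is a minimal sketch: it assumes the REST API accepts the same `Authorization: Bearer acp_sk_...` header shown in the MCP config, and `build_usage_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.agentcostpilot.com/v1"


def build_usage_request(api_key: str, agent_id: str, model: str,
                        input_tokens: int, output_tokens: int,
                        task: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/usage request."""
    body = {
        "agent_id": agent_id,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "task": task,
    }
    return urllib.request.Request(
        f"{BASE_URL}/usage",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_usage_request("acp_sk_...", "support-bot", "gpt-4o",
                          1250, 380, "customer_reply")
# To send it: urllib.request.urlopen(req, timeout=10)
```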

Cache Analysis

GET /v1/cache/analysis

Response:
{ "total_duplicates": 47, "cacheable_spend_usd": 734.70,
  "potential_savings_usd": 661.23,
  "top_agents": [
    { "agent_id": "support-bot", "duplicates": 127, "savings_usd": 256.05 }
  ] }

Python SDK

pip install agentcostpilot

import openai
from openai import OpenAI
from agentcostpilot import ACPClient, wrap_openai, track

acp = ACPClient(api_key="acp_sk_...")

# Option 1: OTEL wrapper (calls made through the wrapped client are tracked)
client = wrap_openai(OpenAI(), acp, agent_id="my-agent")

# Option 2: Decorator (tracks calls made inside the decorated function)
@track(acp, agent_id="summarizer", task="summarize")
def summarize(text):
    return openai.chat.completions.create(...)

# Option 3: Manual (report usage yourself, then flush buffered events)
acp.report_usage(agent_id="bot", model="gpt-4o",
    provider="openai", input_tokens=150,
    output_tokens=20, cost_usd=0.0001)
acp.flush()

Authentication

User Auth (Dashboard)

Supabase Auth with email/password or Google OAuth. Sessions use JWT tokens, and data access is restricted with row-level security (RLS).

Agent Auth (API/MCP)

API keys (acp_sk_...) are stored as SHA-256 hashes. Rate limit: 100 req/min.
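
Storing only a hash means a presented key is checked by hashing it again and comparing digests. The sketch below illustrates that idea with the standard library. The helper names are hypothetical and this is not ACP's actual server code.

```python
import hashlib
import hmac


def hash_api_key(api_key: str) -> str:
    """SHA-256 hex digest of the key; only this value needs to be stored."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Compare digests in constant time to avoid leaking timing information."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


stored = hash_api_key("acp_sk_example")  # made-up key for illustration
print(verify_api_key("acp_sk_example", stored))
```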

OpenAI Integration

Endpoint: GET https://api.openai.com/v1/organization/usage
Auth: Bearer token (Admin API key, usage.read scope)
Polling: Every hour | Rate limit: 60 req/min

Models (input/output price in USD per 1M tokens): gpt-4o ($2.50/$10.00), gpt-4o-mini ($0.15/$0.60),
        o1 ($15.00/$60.00), text-embedding-3-small ($0.02, input only)
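
Per-call cost follows from these per-million-token prices. Below is a minimal sketch of the arithmetic. The `estimate_cost_usd` helper is hypothetical, and the figure ACP itself reports may be calculated differently.

```python
# USD per 1M tokens as (input, output), copied from the model list above.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "o1": (15.00, 60.00),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """cost = (input_tokens * input_price + output_tokens * output_price) / 1,000,000"""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# 2,000 input + 500 output tokens on gpt-4o:
# (2000 * 2.50 + 500 * 10.00) / 1e6 = 0.01 USD
print(estimate_cost_usd("gpt-4o", 2000, 500))
```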

Anthropic Integration

Endpoint: GET https://api.anthropic.com/v1/organizations/{org}/usage
Auth: x-api-key header (Admin API key)
Polling: Every hour | Rate limit: 60 req/min

Models (input/output price in USD per 1M tokens): claude-opus-4-6 ($15.00/$75.00),
        claude-sonnet-4-6 ($3.00/$15.00),
        claude-haiku-4-5 ($0.80/$4.00)

Google AI Integration

Endpoint: Vertex AI Monitoring API
Auth: Service Account JSON key | Polling: Every hour

Models (input/output price in USD per 1M tokens): gemini-2.5-pro ($1.25/$5.00),
        gemini-2.5-flash ($0.075/$0.30)

LiteLLM Gateway Integration

Using LiteLLM as your AI gateway? Connect ACP with one line in your config.

Option 1: LiteLLM Config (recommended)

# litellm config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o

litellm_settings:
  success_callback: ["agentcostpilot"]

environment_variables:
  ACP_API_KEY: "acp_sk_your_key_here"

Option 2: Python callback

from agentcostpilot.litellm_callback import ACPCallback

callback = ACPCallback(api_key="acp_sk_...", agent_id="litellm-proxy")

import litellm
litellm.callbacks = [callback]

Stripe Billing

Plans: Free ($0), Pro ($99/mo), Business ($299/mo), Enterprise ($999+/mo). Checkout via Stripe, self-service portal, webhook sync.

OTEL Integration

Track LLM costs via OpenTelemetry wrapper SDKs.

from agentcostpilot import ACPClient, wrap_openai
from openai import OpenAI

acp = ACPClient(api_key="acp_sk_...")
client = wrap_openai(OpenAI(), acp, agent_id="my-agent")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)
# ACP captures: model, tokens, cost, latency

Ingest Endpoint

POST /v1/ingest/otlp
Content-Type: application/json
{ "traces": [...] }  // Standard OTLP format

POST /v1/ingest/batch
{ "events": [
    { "agent_id": "bot", "model": "gpt-4o",
      "input_tokens": 150, "output_tokens": 20,
      "cost_usd": 0.0001, "timestamp": 1711600000 }
  ]
}
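
When many events are buffered, they can be grouped into batch payloads before posting. The sketch below only shapes the data. The docs state no per-request limit, so `max_events=100` is an assumption, and both helper names are hypothetical.

```python
import time


def make_event(agent_id: str, model: str, input_tokens: int,
               output_tokens: int, cost_usd: float, timestamp=None) -> dict:
    """One usage event in the shape shown for POST /v1/ingest/batch."""
    return {
        "agent_id": agent_id,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": cost_usd,
        # Unix seconds, matching the sample timestamp above.
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }


def to_batches(events: list, max_events: int = 100) -> list:
    """Split events into {"events": [...]} payloads of at most max_events each."""
    return [{"events": events[i:i + max_events]}
            for i in range(0, len(events), max_events)]


events = [make_event("bot", "gpt-4o", 150, 20, 0.0001, timestamp=1711600000 + i)
          for i in range(250)]
payloads = to_batches(events)
print(len(payloads))
```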

Value Reports

Generate PDF/HTML reports showing AI agent ROI for stakeholders.

POST /v1/reports/value
{ "period_start": "2026-06-01", "period_end": "2026-06-30",
  "tasks_completed": 12500, "hours_saved": 480 }

Response:
{ "id": "rpt_xxx", "total_cost_usd": 4872.00,
  "total_savings_usd": 1364.16, "roi_percentage": 312,
  "agentic_cost_ratio": 4.2, "pdf_url": "..." }

Savings API

GET /v1/savings/summary
Response:
{ "total_savings_usd": 2840.00,
  "by_category": {
    "model_downgrade": 1680, "caching": 620,
    "cascade_routing": 380, "waste_elimination": 160 },
  "savings_rate_pct": 28.4 }

Health Scores

GET /v1/agents/{agent_id}/health
Response:
{ "agent_id": "support-bot", "score": 87,
  "efficiency_score": 92, "utilization_score": 85,
  "waste_score": 16, "trend": "improving",
  "recommendations": ["Enable prompt caching", "Try cascade routing"] }

Cache Analysis

ACP scans usage logs every hour to find duplicate requests. With Anthropic prompt caching, cached input tokens can cost up to 90% less.

GET /v1/cache/analysis
Response:
{ "total_duplicates": 47,
  "cacheable_spend_usd": 734.70,
  "potential_savings_usd": 661.23,
  "agents": [
    { "agent_id": "support-bot", "model": "gpt-4o",
      "duplicates": 127, "potential_savings_usd": 256.05 }
  ] }
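
Duplicate detection of this kind can be sketched by fingerprinting each request and counting repeats. The approach below (whitespace normalization plus a SHA-256 fingerprint) is an assumption for illustration, not a description of ACP's scanner.

```python
import hashlib
from collections import Counter


def request_fingerprint(model: str, prompt: str) -> str:
    """Fingerprint of a request; whitespace differences are ignored."""
    normalized = " ".join(prompt.split())
    return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()


def count_duplicates(requests) -> int:
    """Number of requests that repeat an earlier (model, prompt) pair."""
    counts = Counter(request_fingerprint(m, p) for m, p in requests)
    return sum(n - 1 for n in counts.values())


log = [
    ("gpt-4o", "Summarize the refund policy."),
    ("gpt-4o", "Summarize  the refund policy."),   # same prompt, extra space
    ("gpt-4o", "Translate this to French."),
    ("gpt-4o-mini", "Summarize the refund policy."),  # different model
]
print(count_duplicates(log))
```

In the sample response, potential_savings_usd is 90% of cacheable_spend_usd (661.23 / 734.70), which matches the "up to 90%" caching discount.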

Cascade Routing

Start with cheap models, escalate only when needed (Haiku → Sonnet → Opus).

GET /v1/recommendations?type=cascade_routing
Response:
{ "recommendations": [
    { "agent_id": "support-bot", "cascade_from": "gpt-4o",
      "cascade_to": ["gpt-4o-mini", "gpt-4o", "o1"],
      "estimated_savings_usd": 580.00,
      "estimated_savings_pct": 65 }
  ] }
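
The "start cheap, escalate when needed" idea can be sketched as a loop over model tiers. This is an illustration under assumptions: each model call is a placeholder function that returns an answer with a confidence score, and the `min_confidence` cutoff is hypothetical. How ACP decides to escalate is not documented here.

```python
from typing import Callable, Sequence, Tuple

# A model call takes a prompt and returns (answer, confidence in [0, 1]).
ModelFn = Callable[[str], Tuple[str, float]]


def cascade(prompt: str, tiers: Sequence[Tuple[str, ModelFn]],
            min_confidence: float = 0.8) -> Tuple[str, str]:
    """Try tiers cheapest-first. Return (model_name, answer) from the first
    tier that is confident enough, or from the last tier if none is."""
    if not tiers:
        raise ValueError("at least one tier is required")
    for name, call in tiers:
        answer, confidence = call(prompt)
        if confidence >= min_confidence:
            break  # good enough: do not escalate further
    return name, answer


# Stub models standing in for real API calls.
tiers = [
    ("gpt-4o-mini", lambda p: ("not sure", 0.40)),
    ("gpt-4o", lambda p: ("refund within 30 days", 0.92)),
    ("o1", lambda p: ("refund within 30 days", 0.99)),
]
print(cascade("What is the refund window?", tiers))
```

Savings come from the requests that never leave the cheapest tier, since only uncertain answers incur the cost of a second call.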

Webhooks

Events: alert.triggered, budget.exceeded, agent.killed, anomaly.detected, recommendation.available, report.generated, cache.opportunity_found

POST /v1/webhooks
{ "url": "https://hooks.slack.com/services/...",
  "events": ["alert.triggered", "budget.exceeded"],
  "secret": "whsec_xxx" }

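The `secret` suggests that deliveries are signed so receivers can verify them. The signing scheme is not documented here, so the sketch below assumes a common pattern: an HMAC-SHA256 hex digest of the raw request body, delivered in a signature header whose name you would need to confirm with ACP.

```python
import hashlib
import hmac


def sign_payload(secret: str, raw_body: bytes) -> str:
    """HMAC-SHA256 hex digest of the raw body (assumed signing scheme)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: str, raw_body: bytes, received_signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign_payload(secret, raw_body), received_signature)


body = b'{"event": "budget.exceeded", "agent_id": "support-bot"}'
signature = sign_payload("whsec_xxx", body)  # what the sender would attach
print(is_valid_signature("whsec_xxx", body, signature))
```

Verify against the raw bytes as received, before any JSON parsing, because re-serializing the body can change its byte representation.
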
x402 Overview

x402 is an open protocol by Coinbase for internet-native payments. AI agents pay for premium ACP tools via HTTP 402 with USDC stablecoins on Base. No accounts, no subscriptions: just request, pay, receive.

How it works

1. Agent requests premium tool

POST /v1/tools/recommendations/premium without payment

2. ACP returns 402

Response includes price ($0.05 USDC), network (Base), currency (USDC)

3. Agent pays via x402

Uses Coinbase CDP facilitator to complete USDC payment

4. Agent retries with proof

Repeats request with payment authorization header

5. ACP delivers result

Verifies payment, executes tool, logs in ledger

x402 is production-ready: 75M+ transactions, $24M+ volume. Adopted by Stripe, AWS, Vercel, Cloudflare. It is built on an open standard, not a proprietary one.

Paid Tools (x402)

Some ACP tools are free, others are premium and require x402 payment.

Tool                         Type          Price
acp_check_budget             Free
acp_report_usage             Free
acp_heartbeat                Free
acp_get_recommendation       Free
acp_premium_recommendation   Paid (x402)   $0.05
acp_anomaly_scan             Paid (x402)   $0.50
acp_model_route              Paid (x402)   $0.05

# Agent requests premium recommendation
POST /v1/tools/recommendations/premium

# Without payment:
HTTP 402 Payment Required
{ "x402": {
    "price": "0.05",
    "currency": "USDC",
    "network": "base",
    "facilitator": "https://x402.coinbase.com",
    "description": "Premium model recommendation"
  }
}

# After payment:
HTTP 200 OK
{ "recommendation": {
    "current_model": "gpt-4o",
    "suggested_model": "claude-haiku",
    "estimated_savings_pct": 73,
    "quality_impact": "< 2% degradation",
    "confidence": 0.94
  }
}
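
The request, pay, retry loop can be sketched on the client side as follows. The HTTP transport and the payment step are injected as plain functions so the flow is visible without a wallet. The `X-PAYMENT` header name, the local price guard, and `call_paid_tool` itself are assumptions for illustration, so check the x402 specification and the ACP docs for the exact header and proof format.

```python
from typing import Callable, Tuple

# transport(headers) -> (status_code, json_body)
Transport = Callable[[dict], Tuple[int, dict]]
# pay(quote) -> payment proof string (produced by an x402 client or facilitator)
Payer = Callable[[dict], str]


def call_paid_tool(transport: Transport, pay: Payer,
                   max_price_usdc: float = 0.10,
                   payment_header: str = "X-PAYMENT") -> dict:
    """Request a tool; on HTTP 402, pay the quoted price and retry once."""
    status, body = transport({})
    if status != 402:
        return body  # free tool, or already authorized
    quote = body["x402"]
    # Local guard: never pay more than the caller allows.
    if quote["currency"] != "USDC" or float(quote["price"]) > max_price_usdc:
        raise RuntimeError("quote rejected by local price guard")
    proof = pay(quote)
    status, body = transport({payment_header: proof})
    if status != 200:
        raise RuntimeError(f"payment not accepted (HTTP {status})")
    return body


# Fake server and payer, for demonstration only.
def fake_transport(headers: dict):
    if "X-PAYMENT" in headers:
        return 200, {"recommendation": {"suggested_model": "claude-haiku"}}
    return 402, {"x402": {"price": "0.05", "currency": "USDC", "network": "base"}}


result = call_paid_tool(fake_transport, pay=lambda quote: "demo-proof")
print(result["recommendation"]["suggested_model"])
```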

Wallet & Policies

Manage your USDC balance and spend policies for autonomous agent payments. In the dashboard, see the Wallet and Policies pages.

Wallet API

GET /v1/wallet/balance
Response:
{ "balance_usd": 49.35, "currency": "USDC", "network": "base" }

POST /v1/wallet/deposit
{ "amount_usd": 50.00, "currency": "USDC" }

GET /v1/ledger
Response:
{ "entries": [
    { "type": "credit", "amount": 50.00, "desc": "USDC deposit" },
    { "type": "debit", "amount": 0.05, "desc": "Premium recommendation",
      "agent": "support-bot", "tx_hash": "0x7a3f...e291" }
  ]
}

Policy API

GET /v1/policies
Response:
{ "max_per_call_usd": 1.00,
  "max_daily_usd": 25.00,
  "approval_mode": "auto",
  "allowed_tools": ["acp_premium_recommendation", "acp_anomaly_scan"] }

POST /v1/policies
{ "max_per_call_usd": 0.50,
  "approval_mode": "approve_above_threshold" }

Approval modes

  • auto — agents pay within policy limits without human approval
  • approve_above_threshold — auto below limit, require approval above
  • manual_first_vendor — first payment to new vendor needs approval
  • simulation_only — agents can request quotes but cannot pay
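
How the policy fields and approval modes might combine into a single decision is sketched below. The exact semantics are assumptions: here `max_per_call_usd` is treated as the threshold for `approve_above_threshold`, the daily cap and tool allowlist are always enforced, and `decide` is a hypothetical helper.

```python
POLICY = {
    "max_per_call_usd": 1.00,
    "max_daily_usd": 25.00,
    "approval_mode": "auto",
    "allowed_tools": ["acp_premium_recommendation", "acp_anomaly_scan"],
}


def decide(policy: dict, tool: str, price_usd: float,
           spent_today_usd: float = 0.0, vendor_known: bool = True) -> str:
    """Return "pay", "needs_approval", "quote_only", or "deny"."""
    mode = policy["approval_mode"]
    if mode == "simulation_only":
        return "quote_only"  # quotes allowed, payments never made
    if tool not in policy["allowed_tools"]:
        return "deny"
    if spent_today_usd + price_usd > policy["max_daily_usd"]:
        return "deny"
    over_call_limit = price_usd > policy["max_per_call_usd"]
    if mode == "auto":
        return "deny" if over_call_limit else "pay"
    if mode == "approve_above_threshold":
        return "needs_approval" if over_call_limit else "pay"
    if mode == "manual_first_vendor":
        if over_call_limit:
            return "deny"
        return "pay" if vendor_known else "needs_approval"
    raise ValueError(f"unknown approval_mode: {mode!r}")


print(decide(POLICY, "acp_premium_recommendation", 0.05))
```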

Wallet available on Business and Enterprise tiers. Free and Pro users can still pay per call without a wallet.

Self-Hosted (Docker)

Run ACP on your own infrastructure. Available on Business ($299/mo) and Enterprise ($999+/mo) plans.

git clone https://github.com/Dmitrze/agentcostpilot.git
cd agentcostpilot
cp .env.example .env
docker-compose up -d

# Dashboard at http://localhost:3000

Circuit Breaker

If ACP is down, agents continue working. The SDK buffers usage data locally and syncs it once ACP recovers.

from agentcostpilot import ACPClient

acp = ACPClient(
    api_key="acp_sk_...",
    circuit_breaker_threshold=3,
    circuit_breaker_timeout=60
)

GET /v1/mcp/health
Response:
{ "status": "ok", "latency_ms": 5, "uptime_seconds": 86400 }
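
The buffer-and-sync behavior can be pictured with a small stand-alone circuit breaker. This is not the SDK's implementation. It is a sketch that assumes `circuit_breaker_threshold` counts consecutive failures and `circuit_breaker_timeout` is a cool-down in seconds, and the `BufferedReporter` class is hypothetical.

```python
import time
from collections import deque


class BufferedReporter:
    """Buffers events locally and stops calling `send` while the circuit is open."""

    def __init__(self, send, threshold=3, timeout=60.0, clock=time.monotonic):
        self._send = send            # callable that raises on failure
        self._threshold = threshold  # consecutive failures before opening
        self._timeout = timeout      # seconds to wait before retrying
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self.buffer = deque()

    def report(self, event):
        self.buffer.append(event)
        if (self._opened_at is not None
                and self._clock() - self._opened_at < self._timeout):
            return  # circuit open: keep buffering, skip the network
        self.flush()

    def flush(self):
        while self.buffer:
            try:
                self._send(self.buffer[0])
            except Exception:
                self._failures += 1
                if self._failures >= self._threshold:
                    self._opened_at = self._clock()  # open (or re-open) the circuit
                return
            # Success: drop the sent event and close the circuit.
            self.buffer.popleft()
            self._failures = 0
            self._opened_at = None
```

The agent's own work never waits on a failing `send`: events accumulate in order and are delivered oldest-first once a retry succeeds.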

Questions? Contact support@agentcostpilot.com