Agent Budget Management

This guide covers spending caps, auto-recharge, predictive billing, cost alerts, multi-agent allocation, and cost optimization for autonomous AI systems.

Why Agents Need Budget Management

Autonomous agents can make hundreds of API calls per day. Without budget controls:

  • Costs can grow unchecked
  • Unexpected bills surprise the humans who pay them
  • Agents have no spending signal to optimize against
  • Nothing stops a runaway loop from burning through credits

SkillBoss gives agents financial autonomy with guardrails.


Setting Spending Limits

Daily, Weekly, Monthly Caps

import requests

headers = {"Authorization": f"Bearer {API_KEY}"}

# Configure budget limits
response = requests.post(
    "https://api.skillboss.co/v1/agents/budget",
    headers=headers,
    json={
        "daily_limit": 25.00,      # Max $25/day
        "weekly_limit": 150.00,    # Max $150/week
        "monthly_limit": 500.00,   # Max $500/month
        "hard_stop": True          # Stop all operations when limit hit
    }
)

print(response.json())
# {
#   "budget_configured": True,
#   "limits_active": {
#     "daily": 25.00,
#     "weekly": 150.00,
#     "monthly": 500.00
#   }
# }
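
With three caps active at once, the amount an agent can still spend is set by whichever cap is closest to exhaustion. The sketch below works through that arithmetic, assuming the agent tracks its own spend per period. The function name and the `spent` figures are illustrative and not part of the SkillBoss API.

```python
from decimal import Decimal

def remaining_budget(limits: dict, spent: dict) -> tuple[str, Decimal]:
    """Return the period whose cap binds first and the amount left under it."""
    headroom = {
        period: Decimal(str(limits[period])) - Decimal(str(spent.get(period, 0)))
        for period in limits
    }
    binding = min(headroom, key=headroom.get)
    # Never report negative headroom; an exhausted cap leaves 0 to spend.
    return binding, max(headroom[binding], Decimal("0"))

limits = {"daily": 25.00, "weekly": 150.00, "monthly": 500.00}
spent = {"daily": 5.00, "weekly": 140.00, "monthly": 300.00}  # made-up figures
period, left = remaining_budget(limits, spent)
print(period, left)
```

In this example the weekly cap binds first: only $10 remains under it, even though $20 is left for the day.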

Soft Limits

Soft limit: the agent receives warnings but can continue operating.

{
  "daily_limit": 25.00,
  "hard_stop": False,  # Warnings only
  "alert_at_80_percent": True
}

When 80% of the limit is reached:

  • The agent receives a webhook alert
  • The agent can decide whether to continue or pause
  • This lets the agent adjust its own spending, for example by deferring non-critical work
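
One way an agent might make the continue-or-pause decision after a soft-limit warning is sketched below. The policy (run only critical tasks that still fit in the remaining budget) and the function name are assumptions chosen for illustration, not behavior defined by SkillBoss.

```python
def should_continue(amount_remaining: float, estimated_cost: float, critical: bool) -> bool:
    """Decide whether to run a task after a soft-limit warning."""
    if estimated_cost > amount_remaining:
        return False  # the task would push spending past the cap
    return critical   # after a warning, only critical work proceeds

print(should_continue(5.00, 1.00, critical=True))
print(should_continue(5.00, 1.00, critical=False))
```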

Hard Limits

Hard limit: operations are blocked once the limit is hit.

{
  "daily_limit": 25.00,
  "hard_stop": True,  # Operations blocked
  "escalation_email": "human@company.com"
}

When the limit is reached:

  • All API calls return 402 Payment Required
  • Human receives email notification
  • Agent must wait until next period or human increases limit
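
On the client side, a 402 is worth distinguishing from other failures, because retrying will not help until the period resets or a human raises the limit. A minimal sketch of that classification follows. The `Outcome` enum and `classify_status` are hypothetical names, and only the 402 status code comes from the description above.

```python
from enum import Enum

class Outcome(Enum):
    OK = "ok"
    BUDGET_BLOCKED = "budget_blocked"
    OTHER_ERROR = "other_error"

def classify_status(status_code: int) -> Outcome:
    """Map an HTTP status code to the action category an agent cares about."""
    if status_code == 402:
        return Outcome.BUDGET_BLOCKED  # hard limit hit: stop, do not retry
    if 200 <= status_code < 300:
        return Outcome.OK
    return Outcome.OTHER_ERROR         # handle with normal error logic

print(classify_status(402))
```

An agent could pass `response.status_code` from `requests` into this function and pause its work queue on `BUDGET_BLOCKED`.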

Auto-Recharge

Never run out of credits mid-operation.

Basic Auto-Recharge

# Enable auto-recharge
requests.post(
    "https://api.skillboss.co/v1/agents/auto-recharge",
    headers=headers,
    json={
        "enabled": True,
        "trigger_threshold": 10.00,     # Recharge when balance < $10
        "recharge_amount": 100.00,      # Add $100 each time
        "max_recharges_per_month": 5    # Safety cap: max 5 recharges/month
    }
)

How it works:

  1. Balance drops below the threshold: the agent makes an API call and the balance drops to $9.50.
  2. Auto-recharge triggers: SkillBoss charges the payment method on file for $100.
  3. Credits are added: the balance increases to $109.50.
  4. Operation continues: the agent's API call completes successfully.

Safety Feature

Max recharges per month prevents runaway costs.

If the agent reaches 5 recharges in one month:

  • Auto-recharge pauses
  • Human receives alert: "Agent exceeded recharge limit"
  • Human reviews usage before enabling more recharges
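
The threshold and monthly-cap behavior described above can be modeled locally to reason about settings before enabling them. This is a simplified simulation under stated assumptions: the class and its fields are illustrative, and the real service may order the charge and the recharge differently.

```python
from decimal import Decimal

class RechargeSimulator:
    """Toy model of threshold-based auto-recharge with a monthly cap."""

    def __init__(self, balance: str, threshold: str, amount: str, max_per_month: int):
        self.balance = Decimal(balance)
        self.threshold = Decimal(threshold)
        self.amount = Decimal(amount)
        self.max_per_month = max_per_month
        self.recharges = 0
        self.paused = False  # set once the monthly cap blocks a recharge

    def charge(self, cost: str) -> Decimal:
        """Deduct a call's cost, then recharge if the balance fell below the threshold."""
        self.balance -= Decimal(cost)
        if self.balance < self.threshold:
            if self.recharges < self.max_per_month:
                self.balance += self.amount
                self.recharges += 1
            else:
                self.paused = True  # a human must review before more recharges
        return self.balance

sim = RechargeSimulator("10.00", "10.00", "100.00", max_per_month=5)
print(sim.charge("0.50"))  # 109.50
```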

Smart Auto-Recharge

Predict usage and recharge before depletion:

# AI-powered recharge predictions
requests.post(
    "https://api.skillboss.co/v1/agents/auto-recharge",
    headers=headers,
    json={
        "enabled": True,
        "mode": "predictive",  # vs "threshold"
        "recharge_amount": 100.00,
        "predict_hours_ahead": 24  # Recharge if predicted to run out in 24h
    }
)

How predictive recharge works:

  1. Agent's usage pattern: Averages $15/day
  2. Current balance: $20
  3. Prediction: Will run out in ~32 hours
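  4. With predict_hours_ahead: 24: No recharge yet, because 32 hours is outside the 24-hour window. The recharge triggers once the balance falls to about $15, when predicted depletion is 24 hours away (still before the balance runs out)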

Benefit: credits arrive before the balance runs out, so an agent with steady usage should not hit an insufficient-credits error.
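
The prediction arithmetic can be checked with a few lines. This sketch assumes a simple linear projection from average daily spend and a trigger condition of "predicted depletion within `predict_hours_ahead`". The actual prediction model is not documented here, and the function names are illustrative.

```python
def hours_until_depleted(balance: float, avg_daily_spend: float) -> float:
    """Linear projection: how long the balance lasts at the average spend rate."""
    if avg_daily_spend <= 0:
        return float("inf")
    return balance / (avg_daily_spend / 24)

def should_recharge(balance: float, avg_daily_spend: float, predict_hours_ahead: float) -> bool:
    """Recharge once predicted depletion falls inside the look-ahead window."""
    return hours_until_depleted(balance, avg_daily_spend) <= predict_hours_ahead

print(hours_until_depleted(20.0, 15.0))  # 32.0
print(should_recharge(20.0, 15.0, 24))   # False
print(should_recharge(15.0, 15.0, 24))   # True
```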


Budget Alerts & Webhooks

Get notified when spending thresholds are reached:

# Configure alerts
requests.post(
    "https://api.skillboss.co/v1/agents/alerts",
    headers=headers,
    json={
        "alert_at_50_percent": True,   # Alert at 50% of daily limit
        "alert_at_80_percent": True,   # Alert at 80% of daily limit
        "alert_at_95_percent": True,   # Alert at 95% of daily limit
        "webhook_url": "https://your-agent.com/budget-webhook",
        "email_human": "human@company.com"  # Also email human
    }
)

Webhook payload when alert triggers:

{
  "alert_type": "budget_warning",
  "agent_id": "agent_abc123",
  "limit_type": "daily",
  "limit_amount": 25.00,
  "amount_spent": 20.00,
  "percent_used": 80,
  "amount_remaining": 5.00,
  "estimated_time_until_depleted": "3.2 hours",
  "recommended_action": "Consider pausing non-critical operations",
  "top_cost_drivers": [
    {"model": "claude-4-5-sonnet", "cost": 12.00, "calls": 80},
    {"model": "dall-e-3", "cost": 6.00, "calls": 60}
  ]
}
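
Before acting on an alert, a receiver may want to sanity-check that the payload's figures agree with each other. The sketch below is one way to do that using the field names from the sample payload. The `check_alert` helper and its specific checks are illustrative assumptions.

```python
import json

def check_alert(payload: dict) -> list[str]:
    """Return a list of inconsistencies found in a budget alert payload."""
    problems = []
    limit = payload["limit_amount"]
    spent = payload["amount_spent"]
    if round(limit - spent, 2) != round(payload["amount_remaining"], 2):
        problems.append("amount_remaining != limit_amount - amount_spent")
    if limit > 0 and round(100 * spent / limit) != payload["percent_used"]:
        problems.append("percent_used does not match amount_spent / limit_amount")
    drivers_total = sum(d["cost"] for d in payload.get("top_cost_drivers", []))
    if drivers_total > spent + 0.005:
        problems.append("top_cost_drivers sum exceeds amount_spent")
    return problems

sample = json.loads("""
{
  "limit_amount": 25.00, "amount_spent": 20.00,
  "percent_used": 80, "amount_remaining": 5.00,
  "top_cost_drivers": [
    {"model": "claude-4-5-sonnet", "cost": 12.00, "calls": 80},
    {"model": "dall-e-3", "cost": 6.00, "calls": 60}
  ]
}
""")
print(check_alert(sample))  # []
```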

Agent response to webhook:

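# Assumes a FastAPI app; switch_to_economy_mode, pause_background_tasks,
# and notify_human are helpers the agent defines elsewhere.
from fastapi import FastAPI

app = FastAPI()

@app.post("/budget-webhook")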
def handle_budget_alert(alert: dict):
    if alert["percent_used"] >= 80:
        # Switch to cheaper models
        switch_to_economy_mode()

    if alert["percent_used"] >= 95:
        # Pause non-critical operations
        pause_background_tasks()
        notify_human("Budget nearly depleted")

    return {"received": True}

Cost Tracking & Analytics

Real-Time Usage Monitoring

# Check current usage
usage = requests.get(
    "https://api.skillboss.co/v1/agents/usage",
    headers=headers,
    params={"period": "today"}
).json()

print(f"""
Today's Usage:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Spent:           ${usage['amount_spent']:.2f}
Daily limit:     ${usage['daily_limit']:.2f}
Remaining:       ${usage['amount_remaining']:.2f}
Operations:      {usage['operation_count']}

Top Models:
1. {usage['top_models'][0]['model']} - ${usage['top_models'][0]['cost']:.2f}
2. {usage['top_models'][1]['model']} - ${usage['top_models'][1]['cost']:.2f}
3. {usage['top_models'][2]['model']} - ${usage['top_models'][2]['cost']:.2f}

Optimization Opportunity:
{usage['optimization_suggestion']}
""")

Sample output:

Today's Usage:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Spent:           $18.45
Daily limit:     $25.00
Remaining:       $6.55
Operations:      1,247

Top Models:
1. claude-4-5-sonnet - $12.20
2. dall-e-3 - $4.80
3. gemini/gemini-2.5-flash - $1.45

Optimization Opportunity:
72% of Claude calls could use Gemini Flash instead. Potential savings: $8.78/day
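
The savings figure in the sample appears to be 72% of the day's Claude spend ($12.20 × 0.72 ≈ $8.78), which treats the replacement calls as free. The sketch below reproduces that number and shows how to account for the cheaper model's cost. The `cheaper_cost_ratio` parameter is an assumption added for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

def potential_savings(model_spend: str, reroutable_fraction: str,
                      cheaper_cost_ratio: str = "0") -> Decimal:
    """Estimate daily savings from rerouting a fraction of calls to a cheaper model.

    cheaper_cost_ratio is the cheaper model's cost relative to the current one
    (0 means the replacement calls are treated as free).
    """
    saved = Decimal(model_spend) * Decimal(reroutable_fraction) * (1 - Decimal(cheaper_cost_ratio))
    return saved.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(potential_savings("12.20", "0.72"))          # 8.78
print(potential_savings("12.20", "0.72", "0.01"))  # 8.70
```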

Historical Analytics

# Get usage over time
analytics = requests.get(
    "https://api.skillboss.co/v1/agents/analytics",
    headers=headers,
    params={
        "start_date": "2026-02-01",
        "end_date": "2026-02-25",
        "group_by": "day"
    }
).json()

# Analyze trends
import pandas as pd

df = pd.DataFrame(analytics['usage_by_day'])
print(f"Average daily cost: ${df['cost'].mean():.2f}")
print(f"Peak day: {df.loc[df['cost'].idxmax(), 'date']} (${df['cost'].max():.2f})")
print(f"Cheapest day: {df.loc[df['cost'].idxmin(), 'date']} (${df['cost'].min():.2f})")

Multi-Agent Budget Allocation

Parent-Child Budget Hierarchy

Parent agent allocates budgets to child agents:

# Parent creates child agents with sub-budgets
children = [
    {"name": "ResearchAgent", "monthly_budget": 100.00},
    {"name": "ContentAgent", "monthly_budget": 200.00},
    {"name": "MarketingAgent", "monthly_budget": 150.00}
]

for child in children:
    response = requests.post(
        "https://api.skillboss.co/v1/agents/create-child",
        headers={"Authorization": f"Bearer {PARENT_API_KEY}"},
        json={
            "child_name": child["name"],
            "monthly_budget": child["monthly_budget"],
            "budget_overrun_policy": "block"  # or "alert"
        }
    )

    child["api_key"] = response.json()["api_key"]

# Parent monitors all children
children_usage = requests.get(
    "https://api.skillboss.co/v1/agents/children/usage",
    headers={"Authorization": f"Bearer {PARENT_API_KEY}"}
).json()

print(f"""
Total Budget: ${sum(c['monthly_budget'] for c in children):.2f}
Total Spent:  ${sum(c['spent'] for c in children_usage['children']):.2f}

Child Breakdown:
""")

for child in children_usage['children']:
    percent = (child['spent'] / child['budget']) * 100
    print(f"  {child['name']}: ${child['spent']:.2f} / ${child['budget']:.2f} ({percent:.0f}%)")
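
Before creating children, a parent may want to confirm that the sub-budgets fit inside its own cap. A minimal sketch follows, assuming the parent's monthly limit is the $500 used earlier. Whether SkillBoss enforces this server-side is not stated, and `check_allocation` is a hypothetical helper.

```python
def check_allocation(parent_monthly_limit: float, children: list[dict]) -> float:
    """Return the unallocated budget, or raise if children exceed the parent's cap."""
    allocated = sum(c["monthly_budget"] for c in children)
    if allocated > parent_monthly_limit:
        raise ValueError(
            f"Children allocated ${allocated:.2f}, parent limit is ${parent_monthly_limit:.2f}"
        )
    return parent_monthly_limit - allocated

children = [
    {"name": "ResearchAgent", "monthly_budget": 100.00},
    {"name": "ContentAgent", "monthly_budget": 200.00},
    {"name": "MarketingAgent", "monthly_budget": 150.00},
]
print(check_allocation(500.00, children))  # 50.0
```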

Cost Optimization Strategies

1. Model Selection Optimization

class CostOptimizer:
    """Automatically route to cheapest model that meets quality needs."""

    def __init__(self, quality_threshold: float = 0.8):
        self.quality_threshold = quality_threshold

    def select_model(self, task_complexity: str):
        """Choose model based on task complexity."""

        models = {
            "simple": {
                "model": "gemini/gemini-2.5-flash",
                "cost_per_1m": 0.075,
                "expected_quality": 0.85
            },
            "medium": {
                "model": "deepseek/deepseek-r1",
                "cost_per_1m": 0.14,
                "expected_quality": 0.90
            },
            "complex": {
                "model": "claude-4-5-sonnet",
                "cost_per_1m": 15.00,
                "expected_quality": 0.98
            }
        }

        return models[task_complexity]["model"]

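    def evaluate_quality(self, result) -> float:
        """Score a result from 0 to 1. Placeholder: supply your own evaluator."""
        raise NotImplementedError

    def fallback_if_needed(self, result, current_model):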
        """Upgrade to better model if quality insufficient."""

        if self.evaluate_quality(result) < self.quality_threshold:
            # Try next tier up
            if "gemini" in current_model:
                return "deepseek/deepseek-r1"
            elif "deepseek" in current_model:
                return "claude-4-5-sonnet"

        return current_model  # Quality acceptable
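
The tier-by-tier fallback can also be written as a loop that starts with the cheapest model and escalates only when quality falls short. This is a sketch under assumptions: `run` and `score` are caller-supplied callables (stand-ins for a real model call and a real quality evaluator), and the tier order follows the cost figures in the class above.

```python
from typing import Any, Callable

# Cheapest to most expensive, per the cost_per_1m values above.
TIERS = ["gemini/gemini-2.5-flash", "deepseek/deepseek-r1", "claude-4-5-sonnet"]

def run_with_escalation(
    prompt: str,
    run: Callable[[str, str], Any],
    score: Callable[[Any], float],
    quality_threshold: float = 0.8,
) -> tuple[str, Any]:
    """Try each tier in order; return the first result that meets the threshold.

    If no tier passes, the top tier's result is returned.
    """
    model, result = TIERS[0], None
    for model in TIERS:
        result = run(model, prompt)
        if score(result) >= quality_threshold:
            break
    return model, result

# Fake backends for demonstration only.
fake_run = lambda model, prompt: f"{model}: {prompt}"
fake_score = lambda result: 0.9 if result.startswith("deepseek") else 0.5

model, result = run_with_escalation("Summarize Q3", fake_run, fake_score)
print(model)  # deepseek/deepseek-r1
```

Note that every escalation pays for the failed cheaper attempt as well, so this pattern saves money only when most tasks pass on a lower tier.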

2. Batch Processing

Reduce API calls by batching:

# Instead of 100 separate API calls
for item in items:
    result = process_single(item)  # 100 API calls

# Batch into 10 calls of 10 items each
for batch in chunks(items, size=10):
    results = process_batch(batch)  # 10 API calls

# 90% fewer API calls; per-call overhead falls accordingly,
# though token costs for the items themselves still apply
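
The `chunks` helper used above is not defined in the snippet. One possible implementation is sketched here. It assumes `items` is a sequence that supports slicing, such as a list.

```python
from typing import Iterator, Sequence

def chunks(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

items = list(range(100))
batches = list(chunks(items, size=10))
print(len(batches))  # 10
```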

3. Caching

Cache responses for repeated queries:

from functools import lru_cache

@lru_cache(maxsize=1000)
def cached_llm_call(prompt: str, model: str):
    """Cache LLM responses for identical prompts."""

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )

    return response.choices[0].message.content

# Identical prompts hit cache instead of API
result1 = cached_llm_call("What is AI?", "gemini/gemini-2.5-flash")
result2 = cached_llm_call("What is AI?", "gemini/gemini-2.5-flash")  # Cached, $0 cost
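
To confirm that repeated prompts are served from the cache, `lru_cache` exposes hit and miss counters through `cache_info()`. The sketch below swaps the real client for a counting stand-in (`fake_llm`, a hypothetical function) so it runs without network access.

```python
from functools import lru_cache

stats = {"api_calls": 0}

def fake_llm(prompt: str, model: str) -> str:
    """Stand-in for a real (billable) LLM call; counts invocations."""
    stats["api_calls"] += 1
    return f"[{model}] answer to: {prompt}"

@lru_cache(maxsize=1000)
def cached_call(prompt: str, model: str) -> str:
    return fake_llm(prompt, model)

cached_call("What is AI?", "gemini/gemini-2.5-flash")
cached_call("What is AI?", "gemini/gemini-2.5-flash")  # served from cache
cached_call("What is AI?", "deepseek/deepseek-r1")     # different model: new call
info = cached_call.cache_info()
print(stats["api_calls"], info.hits, info.misses)  # 2 1 2
```

The cache key includes every argument, so the same prompt sent to a different model is a separate entry. Exact-match caching also returns the same text each time, which may not suit prompts where varied output is wanted.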


Budget Approval Workflows

For expensive operations, agents can request human approval:

import logging
import time

def expensive_operation(cost_estimate: float):
    """Request human approval for operations over $10."""

    if cost_estimate > 10.00:
        # Request approval
        approval = requests.post(
            "https://api.skillboss.co/v1/agents/approvals/request",
            headers=headers,
            json={
                "operation": "generate_100_videos",
                "estimated_cost": cost_estimate,
                "justification": "Monthly content batch for social media",
                "urgency": "medium"
            }
        ).json()

        # Poll until a human approves or denies the request
        while approval["status"] == "pending":
            time.sleep(60)  # Check every minute
            approval = check_approval_status(approval["approval_id"])

        if approval["status"] == "approved":
            return execute_operation()
        else:
            logging.info(f"Operation denied: {approval['denial_reason']}")
            return None

    else:
        # Auto-approved for operations of $10 or less
        return execute_operation()
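
The polling loop above waits indefinitely if nobody responds. A bounded variant is sketched below, assuming a caller-supplied `check_status` function that returns a status string for an approval ID. The `"timed_out"` result and all names here are illustrative, not part of the SkillBoss API.

```python
import time
from typing import Callable

def wait_for_approval(
    check_status: Callable[[str], str],
    approval_id: str,
    timeout_s: float = 3600.0,
    poll_interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status leaves "pending" or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = check_status(approval_id)
        if status != "pending":
            return status
        if time.monotonic() >= deadline:
            return "timed_out"
        sleep(poll_interval_s)

# Demonstration with a fake status source and no real sleeping.
responses = iter(["pending", "pending", "approved"])
outcome = wait_for_approval(lambda _id: next(responses), "appr_demo", sleep=lambda s: None)
print(outcome)  # approved
```

Treating a timeout as a denial keeps an unattended agent from blocking forever on an expensive request.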

Next Steps

  • Cost Optimization: Advanced strategies to reduce costs by 70%+
  • Usage Tracking: Monitor and analyze your spending
  • Multi-Model Routing: Automatically route to the cheapest model
  • Quick Start: Get started with SkillBoss