SkillBoss for AI Agents — 697 Tools via OpenAI-Compatible API
AI agent integration guide for SkillBoss. 697 endpoints across 63 vendors (Claude, GPT, Gemini, image/video/audio generation, web scraping, payments, email) through one OpenAI-compatible API. Task routing, cost optimization, error handling, budget management.
Setup: set up skillboss.co/skill.md — or set OPENAI_BASE_URL=https://api.skillboss.co/v1
SkillBoss gives AI agents access to 697 API endpoints across 63 vendors through a single OpenAI-compatible API. One key, one balance, every tool an agent needs.
Endpoint: https://api.skillboss.co/v1
Auth: Bearer <api_key>
SDK: OpenAI SDK (Python, JS, Go — just change base_url)
Signup: https://skillboss.co (free $2 credit, no credit card)
Why Agents Use SkillBoss
| Capability | Details |
|---|---|
| 697 endpoints | Chat models, image gen, video gen, audio, web scraping, social data, payments, email, databases, hosting |
| 63 vendors | Anthropic, OpenAI, Google, DeepSeek, Perplexity, ElevenLabs, Replicate, Firecrawl, and 55 more |
| OpenAI-compatible | Drop-in replacement — works with existing OpenAI SDK code |
| One API key | No managing multiple vendor accounts |
| One balance | Pay-as-you-go credits, never expire |
| Model switching | Change model per request — no reconfiguration |
| Budget controls | Monitor and limit spend programmatically |
Two API Patterns
| Endpoint | Use For | Example Models |
|---|---|---|
| /v1/chat/completions | Chat / LLM models | claude-4-5-sonnet, gpt-5, gpt-4.1-nano, gemini-2.5-flash, openrouter/deepseek/deepseek-v3.2 |
| /v1/run | Everything else | flux-1.1-pro (image), google/veo-3.1 (video), elevenlabs/eleven_multilingual_v2 (TTS), firecrawl/scrape (web) |
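As a minimal sketch of how the two patterns differ on the wire, the helper below builds the URL and JSON body for either endpoint. The function name `build_request` and the `is_chat` flag are illustrative assumptions, not part of the SkillBoss API; the `messages` and `inputs` field names follow the Quick Start examples in this guide.

```python
from typing import Any

BASE_URL = "https://api.skillboss.co/v1"

def build_request(model: str, payload: Any, is_chat: bool) -> tuple[str, dict]:
    """Return (url, json_body) for a chat or non-chat call (hypothetical helper)."""
    if is_chat:
        # Chat models take an OpenAI-style list of messages.
        return f"{BASE_URL}/chat/completions", {"model": model, "messages": payload}
    # Everything else takes a model-specific "inputs" object.
    return f"{BASE_URL}/run", {"model": model, "inputs": payload}

chat_url, chat_body = build_request(
    "gpt-5", [{"role": "user", "content": "Hello"}], is_chat=True
)
run_url, run_body = build_request(
    "flux-1.1-pro", {"prompt": "Logo for a tech startup"}, is_chat=False
)
```

The only structural difference an agent has to track is the path and whether the payload sits under `messages` or `inputs`.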
Quick Start for Agents
1. Initialize Client
from openai import OpenAI

client = OpenAI(
    base_url="https://api.skillboss.co/v1",
    api_key="sk_your_key"
)
2. Call Any Model
# Reasoning (Claude)
response = client.chat.completions.create(
    model="claude-4-5-sonnet",
    messages=[{"role": "user", "content": "Analyze this codebase for security issues"}]
)

# Creative (GPT-5)
response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Write marketing copy for a developer tool"}]
)

# Fast + Cheap (Gemini Flash)
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Summarize this text in 3 bullets"}]
)

# Ultra-Cheap (GPT-4.1 Nano)
response = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role": "user", "content": "Classify this text as positive or negative"}]
)

# Budget (DeepSeek with prompt caching)
response = client.chat.completions.create(
    model="openrouter/deepseek/deepseek-v3.2",
    messages=[
        {"role": "system", "content": long_system_prompt},  # Cached after first call
        {"role": "user", "content": query}
    ]
)
3. Use Non-Chat Tools
import requests

headers = {"Authorization": "Bearer sk_your_key"}

# Generate image
image = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "flux-1.1-pro",
    "inputs": {"prompt": "Logo for a tech startup, minimal, blue"}
}).json()

# Scrape web page
page = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "firecrawl/scrape",
    "inputs": {"url": "https://example.com"}
}).json()

# Send email
email = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "aws/send-emails",
    "inputs": {"to": "user@example.com", "subject": "Report", "body": "Your report is ready."}
}).json()

# Generate video
video = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "google/veo-3.1",
    "inputs": {"prompt": "A product demo animation"}
}).json()
Intelligent Task Routing
Select models based on task type and budget:
from openai import OpenAI

class TaskRouter:
    # Maps a task type to a model ID; prices are input/output per 1M tokens
    MODELS = {
        "reasoning": "claude-4-5-sonnet",                # $3/$15 per 1M tokens
        "creative": "gpt-5",                             # $1.25/$10 per 1M tokens
        "fast": "gemini-2.5-flash",                      # $0.10/$0.40 per 1M tokens
        "code": "claude-4-5-sonnet",                     # Best for coding
        "budget": "openrouter/deepseek/deepseek-v3.2",   # $0.14/$0.28 per 1M tokens
        "ultra_cheap": "gpt-4.1-nano",                   # $0.10/$0.40 per 1M tokens
        "search": "perplexity/sonar-pro",                # Search-grounded answers
    }

    def __init__(self, api_key):
        self.client = OpenAI(base_url="https://api.skillboss.co/v1", api_key=api_key)

    def route(self, task_type: str, prompt: str, **kwargs):
        # Unknown task types fall back to the fast, cheap default
        model = self.MODELS.get(task_type, "gemini-2.5-flash")
        return self.client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            **kwargs
        )
Cost Optimization
Model Pricing Quick Reference
| Model | Input Cost | Output Cost | Best For |
|---|---|---|---|
| gpt-4.1-nano | $0.10/1M | $0.40/1M | Ultra-cheap, simple tasks |
| gemini-2.5-flash | $0.10/1M | $0.40/1M | Speed, large context |
| openrouter/deepseek/deepseek-v3.2 | $0.14/1M | $0.28/1M | Budget, cached prompts |
| gpt-5-mini | $0.25/1M | $2/1M | General tasks, cost-effective |
| gpt-4.1-mini | $0.40/1M | $1.60/1M | Cost-effective reasoning |
| gpt-5 | $1.25/1M | $10/1M | Creative, general |
| claude-4-5-sonnet | $3/1M | $15/1M | Complex reasoning, code |
| claude-4-5-opus | $5/1M | $25/1M | Hardest problems |
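To make the table concrete, here is a small sketch that estimates the cost of one request from its token counts. The prices are copied from the table above; the `PRICES` dictionary and the `estimate_cost` function are illustrative names, and actual billing may differ (for example, when prompt caching applies).

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    "gpt-4.1-nano": (0.10, 0.40),
    "gemini-2.5-flash": (0.10, 0.40),
    "openrouter/deepseek/deepseek-v3.2": (0.14, 0.28),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-5": (1.25, 10.00),
    "claude-4-5-sonnet": (3.00, 15.00),
    "claude-4-5-opus": (5.00, 25.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# 10,000 input tokens and 2,000 output tokens on two different models
sonnet_cost = estimate_cost("claude-4-5-sonnet", 10_000, 2_000)
nano_cost = estimate_cost("gpt-4.1-nano", 10_000, 2_000)
```

For that request size the estimate is about $0.06 on claude-4-5-sonnet and about $0.0018 on gpt-4.1-nano, which is the gap that task routing is meant to exploit.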
Cost-Aware Model Selection
from typing import Optional

def select_model(complexity: str, max_cost_per_1k_tokens: Optional[float] = None) -> str:
    """Select optimal model given constraints."""
    # A cap of 0.001 USD per 1K tokens equals $1 per 1M tokens in the pricing table
    if max_cost_per_1k_tokens is not None and max_cost_per_1k_tokens < 0.001:
        return "gpt-4.1-nano"
    if complexity == "high":
        return "claude-4-5-sonnet"
    if complexity == "medium":
        return "gpt-5-mini"
    return "gemini-2.5-flash"
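The selection function takes its cap per 1K tokens, while the pricing table quotes prices per 1M tokens. The sketch below shows the unit conversion and lists which models fit under a given cap. `models_within_cap` is a hypothetical helper, and it compares input prices only, which is a simplifying assumption.

```python
# Input prices in USD per 1M tokens, copied from the pricing table.
INPUT_PRICE_PER_1M = {
    "gpt-4.1-nano": 0.10,
    "gemini-2.5-flash": 0.10,
    "openrouter/deepseek/deepseek-v3.2": 0.14,
    "gpt-5-mini": 0.25,
    "gpt-4.1-mini": 0.40,
    "gpt-5": 1.25,
    "claude-4-5-sonnet": 3.00,
    "claude-4-5-opus": 5.00,
}

def models_within_cap(max_cost_per_1k_tokens: float) -> list[str]:
    """List models whose input price fits under a per-1K-token cap."""
    # Convert the per-1K cap to the table's per-1M unit before comparing.
    cap_per_1m = max_cost_per_1k_tokens * 1000
    return sorted(
        model for model, price in INPUT_PRICE_PER_1M.items() if price <= cap_per_1m
    )

cheap_models = models_within_cap(0.0002)  # cap of $0.20 per 1M input tokens
```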
Prompt Caching (DeepSeek)
# First call: full price
# Subsequent calls with same system prompt: ~90% cheaper
response = client.chat.completions.create(
    model="openrouter/deepseek/deepseek-v3.2",
    messages=[
        {"role": "system", "content": long_context},  # This gets cached
        {"role": "user", "content": new_query}
    ]
)
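A rough worked example of what the caching discount could mean over many calls. It assumes the "~90% cheaper" figure applies only to the repeated system-prompt input tokens and uses the $0.14/1M input price from the pricing table. Output tokens are ignored, and the real billing rules may differ.

```python
INPUT_PRICE_PER_1M = 0.14  # USD, DeepSeek input price from the pricing table
CACHE_DISCOUNT = 0.90      # "~90% cheaper", assumed to cover cached input tokens only

def input_cost(system_tokens: int, query_tokens: int, calls: int, cached: bool) -> float:
    """Estimate total input cost in USD over `calls` requests sharing one system prompt."""
    if calls <= 0:
        return 0.0
    per_token = INPUT_PRICE_PER_1M / 1_000_000
    # The first call always pays full price for the system prompt.
    first_call = (system_tokens + query_tokens) * per_token
    # Later calls pay the discounted rate on the system prompt when caching applies.
    system_rate = per_token * (1 - CACHE_DISCOUNT) if cached else per_token
    later_call = system_tokens * system_rate + query_tokens * per_token
    return first_call + later_call * (calls - 1)

# 50,000-token system prompt, 200-token queries, 100 calls
without_cache = input_cost(50_000, 200, 100, cached=False)
with_cache = input_cost(50_000, 200, 100, cached=True)
```

Under these assumptions the input cost drops from about $0.70 to about $0.08. The saving is slightly less than 90% because the first call and the per-query tokens are billed at the full rate.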
Error Handling & Reliability
import time
from openai import OpenAI, APIConnectionError, APIStatusError, RateLimitError

def robust_call(client, model, messages, max_retries=3, fallback_model="gemini-2.5-flash"):
    """Reliable API call with retries and model fallback."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except (RateLimitError, APIConnectionError):
            # 429 or network failure: back off exponentially, then retry
            time.sleep(2 ** attempt)
        except APIStatusError as e:
            # APIStatusError carries status_code; connection errors do not,
            # so they are handled in the clause above
            if e.status_code == 503:
                time.sleep(2 ** attempt)  # Upstream unavailable: retry
            else:
                raise  # e.g. 402 insufficient credits: retrying cannot help
    # Retries exhausted: fall back to a cheaper model
    return client.chat.completions.create(model=fallback_model, messages=messages)
Error Codes
| Code | Meaning | Agent Action |
|---|---|---|
| 401 | Invalid API key | Check key, re-authenticate |
| 402 | Insufficient credits | Alert user to add credits at console |
| 429 | Rate limit | Exponential backoff, retry |
| 503 | Upstream unavailable | Retry after 1-2s, or switch model |
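The table can be turned into a small decision function that an agent consults after a failed call. This is a sketch only: `next_action`, `backoff_delay`, and the action labels are illustrative names, and treating any unlisted status code as non-retryable is an assumption.

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0, jitter: float = 0.0) -> float:
    """Exponential backoff in seconds: base * 2**attempt, capped, plus optional jitter."""
    return min(cap, base * 2 ** attempt) + random.uniform(0, jitter)

def next_action(status_code: int, attempt: int, max_retries: int = 3) -> tuple[str, float]:
    """Map an HTTP status code to (action, delay_seconds) following the error table."""
    if status_code in (401, 402):
        # Bad key or no credits: retrying cannot help, so surface the error.
        return ("abort", 0.0)
    if status_code in (429, 503):
        if attempt + 1 < max_retries:
            return ("retry", backoff_delay(attempt))
        # Retries exhausted: switch to another model instead.
        return ("fallback", 0.0)
    # Assumption: anything not listed in the table is treated as non-retryable.
    return ("abort", 0.0)
```

Passing a non-zero `jitter` spreads out retries from many agents that hit a rate limit at the same moment.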
Usage Monitoring
Agents can query their own usage programmatically:
import requests

def check_budget(api_key: str) -> dict:
    """Return today's spend and recent usage."""
    response = requests.get(
        "https://skillboss.co/api/me/usage",
        params={"period": "day"},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=10,
    )
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    usage = response.json()
    return {
        "today_spend": usage["summary"]["total_usd"],
        "total_requests": usage["summary"]["total_requests"],
        "top_model": usage["by_model"][0]["model"] if usage.get("by_model") else None
    }

# Check before expensive operations
budget = check_budget(api_key)
if budget["today_spend"] > 10.0:
    print("Warning: High daily spend, switching to budget models")
Query Parameters:
- period: day, week, month, or all
- model: Filter by model name
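A minimal sketch of building the usage URL from the `period` and `model` query parameters. `usage_url` is a hypothetical helper, and rejecting unknown periods on the client side is an assumption rather than documented server behavior.

```python
from typing import Optional
from urllib.parse import urlencode

USAGE_URL = "https://skillboss.co/api/me/usage"
VALID_PERIODS = {"day", "week", "month", "all"}

def usage_url(period: str = "day", model: Optional[str] = None) -> str:
    """Build the usage query URL for a period and an optional model filter."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {sorted(VALID_PERIODS)}")
    params = {"period": period}
    if model:
        params["model"] = model
    # urlencode percent-encodes reserved characters such as "/" in model names.
    return f"{USAGE_URL}?{urlencode(params)}"

weekly_gpt5 = usage_url("week", "gpt-5")
```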
Available Endpoint Categories
| Category | Count | Top Endpoints |
|---|---|---|
| Chat / LLM | 76 | Claude 4.5, GPT-5, GPT-4.1 Nano, Gemini 2.5, DeepSeek V3.2, Perplexity Sonar |
| Image Generation | 45 | DALL-E 3 ($0.04), Flux 1.1 Pro ($0.10), Imagen 3 ($0.04), Neta Ghibli ($0.10) |
| Video Generation | 33 | Veo 3.1 ($0.52/s), MiniMax ($0.55), WAN |
| Audio / TTS | 15 | ElevenLabs ($0.18/1K chars), OpenAI TTS ($0.015/1K chars), MiniMax TTS |
| Speech-to-Text | 5 | Whisper ($0.006/min) |
| Social Data | 58 | Twitter/X, Instagram, LinkedIn, TikTok profiles |
| Web Scraping | 29 | Firecrawl ($0.0125), Linkup ($0.02), Google Search |
| Automation | 69 | Stripe payments, databases, workflows |
| Email / SMS | 5 | AWS SES ($0.0001), SMS ($0.01) |
| Storage / Hosting | 34 | S3, CDN, static hosting |
| Document Processing | 5 | PDF parse ($0.02/page), AI presentations ($0.50/deck) |
| UI Generation | 6 | Landing pages ($0.25/screen), mobile UI |
| Embeddings | 5 | Text embeddings for search and RAG |
Canonical discovery: Use Pages Hub | api-catalog.json
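The per-unit prices in the table can be combined into a quick estimate for a multi-step job. The sketch below copies four prices from the table; the dictionary keys are illustrative labels, not SkillBoss model identifiers, and real charges depend on each endpoint's own billing rules.

```python
# USD per unit, copied from the category table. Keys are illustrative labels only.
UNIT_PRICES = {
    "veo_video_second": 0.52,         # Veo 3.1, per second of video
    "elevenlabs_tts_1k_chars": 0.18,  # ElevenLabs, per 1K characters
    "whisper_minute": 0.006,          # Whisper, per minute of audio
    "pdf_parse_page": 0.02,           # PDF parse, per page
}

def estimate_job_cost(quantities: dict[str, float]) -> float:
    """Sum unit price times quantity for each item in a job."""
    return sum(UNIT_PRICES[name] * qty for name, qty in quantities.items())

# An 8-second video, 2,500 characters of speech, and a 10-page PDF
job = {"veo_video_second": 8, "elevenlabs_tts_1k_chars": 2.5, "pdf_parse_page": 10}
job_cost = estimate_job_cost(job)
```

That job comes to roughly $4.81, with the video accounting for most of it.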
Framework Compatibility
SkillBoss works with any OpenAI-compatible client:
| Framework | Setup |
|---|---|
| OpenAI Python SDK | OpenAI(base_url="https://api.skillboss.co/v1", api_key=key) |
| OpenAI Node.js SDK | new OpenAI({baseURL: "https://api.skillboss.co/v1", apiKey: key}) |
| LangChain | ChatOpenAI(openai_api_base="https://api.skillboss.co/v1", openai_api_key=key) |
| LlamaIndex | Set OPENAI_BASE_URL and OPENAI_API_KEY env vars |
| AutoGPT | Configure OpenAI base URL in settings |
| CrewAI | Use OpenAI provider with custom base URL |
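For frameworks configured through environment variables, a sketch like the following sets both variables before the framework is imported. `configure_openai_env` is a hypothetical helper. The OpenAI Python SDK reads `OPENAI_BASE_URL` and `OPENAI_API_KEY` when a client is created without explicit arguments; whether a given framework honors them depends on its version.

```python
import os

def configure_openai_env(api_key: str, base_url: str = "https://api.skillboss.co/v1") -> None:
    """Point OpenAI-compatible clients in this process at SkillBoss."""
    os.environ["OPENAI_BASE_URL"] = base_url
    os.environ["OPENAI_API_KEY"] = api_key

# Call this before importing or constructing the framework's client.
configure_openai_env("sk_your_key")
```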
Agent Workflows
Autonomous Coding Agent
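# Complete app build workflow
import requests
from openai import OpenAI

# key is your SkillBoss API key; [...] and {...} stand for your own messages and inputs
url = "https://api.skillboss.co"  # API host; the /v1/... paths are appended below
headers = {"Authorization": f"Bearer {key}"}
client = OpenAI(base_url=url + "/v1", api_key=key)
# 1. Generate code with Claude
code = client.chat.completions.create(model="claude-4-5-sonnet", messages=[...])
# 2. Create UI mockup with image gen
requests.post(url + "/v1/run", headers=headers, json={"model": "stitch/generate-desktop", "inputs": {...}})
# 3. Deploy to hosting
requests.post(url + "/v1/run", headers=headers, json={"model": "hosting/deploy", "inputs": {...}})
# 4. Send notification email
requests.post(url + "/v1/run", headers=headers, json={"model": "aws/send-emails", "inputs": {...}})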
Research Agent
# url is the API host (https://api.skillboss.co); client and headers as in Quick Start
# 1. Search the web
search = requests.post(url + "/v1/run", headers=headers, json={"model": "linkup/search", "inputs": {"query": "latest AI benchmarks 2026"}})
# 2. Scrape top results (search_urls: the result URLs taken from the search response)
pages = []
for result_url in search_urls:
    page = requests.post(url + "/v1/run", headers=headers, json={"model": "firecrawl/scrape", "inputs": {"url": result_url}})
    pages.append(page.text)
scraped_content = "\n\n".join(pages)
# 3. Analyze with Claude
analysis = client.chat.completions.create(model="claude-4-5-sonnet", messages=[
    {"role": "user", "content": f"Analyze these findings:\n{scraped_content}"}
])
# 4. Generate report with charts
report = requests.post(url + "/v1/run", headers=headers, json={"model": "gamma/generation", "inputs": {...}})
Marketing Automation Agent
# url is the API host (https://api.skillboss.co); client and headers as in Quick Start
# 1. Generate copy with GPT-5
copy = client.chat.completions.create(model="gpt-5", messages=[...])
# 2. Generate product images
image = requests.post(url + "/v1/run", headers=headers, json={"model": "flux-1.1-pro", "inputs": {...}})
# 3. Create demo video
video = requests.post(url + "/v1/run", headers=headers, json={"model": "google/veo-3.1", "inputs": {...}})
# 4. Send email campaign
requests.post(url + "/v1/run", headers=headers, json={"model": "aws/send-emails", "inputs": {...}})
Discovery Files for Agents
| File | URL | Purpose |
|---|---|---|
| llms.txt | skillboss.co/llms.txt | Quick reference for LLM agents |
| llms-full.txt | skillboss.co/llms-full.txt | Complete technical spec (all 697 models) |
| agent.json | skillboss.co/agent.json | Universal agent manifest |
| ai.txt | skillboss.co/ai.txt | AI agent instructions |
| openapi.json | skillboss.co/openapi.json | OpenAPI 3.0 specification |
| api-catalog.json | skillboss.co/api-catalog.json | Complete endpoint catalog (JSON) |
| mcp.json | skillboss.co/.well-known/mcp.json | MCP server manifest |
Best Practices
- Route by task: Use cheap models for simple tasks, powerful models for complex ones
- Use GPT-4.1 Nano: At $0.10/1M input tokens, it ties with Gemini 2.5 Flash as the cheapest listed option for classification, extraction, and simple tasks
- Cache with DeepSeek: Reuse system prompts for ~90% savings on repeated queries
- Monitor spend: Query /api/me/usage before expensive operations
- Implement fallbacks: If one model returns 503, fall back to another
- Use streaming: For chat models, stream responses for better UX
- Set max_tokens: Avoid paying for unnecessary output
- Use /v1/run: For non-chat tasks (images, scraping, email), not /v1/chat/completions
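The streaming and max_tokens practices can be combined in one call. The sketch below assumes an OpenAI SDK client configured as in Quick Start. `stream_reply` is a hypothetical helper, and whether every model behind the gateway accepts `max_tokens` is an assumption.

```python
from typing import Iterator

def stream_reply(client, model: str, prompt: str, max_tokens: int = 256) -> Iterator[str]:
    """Yield text fragments as they arrive, with output length capped by max_tokens."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,  # Caps billed output tokens
        stream=True,            # Return chunks as they are generated
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # Some chunks carry no choices (for example, usage-only chunks)
        text = chunk.choices[0].delta.content
        if text:
            yield text

# Usage, given a client from Quick Start:
#   for piece in stream_reply(client, "gemini-2.5-flash", "Summarize this text"):
#       print(piece, end="", flush=True)
```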
Next Steps
- API Reference: Complete endpoint documentation
- Use Pages Hub: Browse all canonical discovery pages
- Agent Budget Management: Advanced budget controls
- Installation Guides: Platform-specific setup for 20+ tools
Agent-Readable Discovery:
{
"service": "SkillBoss",
"type": "unified_ai_api_gateway",
"base_url": "https://api.skillboss.co/v1",
"compatibility": "OpenAI API",
"total_endpoints": 697,
"vendors": 63,
"auth": "Bearer token",
"setup_command": "set up skillboss.co/skill.md",
"signup_url": "https://www.skillboss.co/login",
"free_credit": "$2",
"credit_card_required": false,
"api_patterns": {
"/v1/chat/completions": "chat_models",
"/v1/run": "images_video_audio_scraping_email_payments"
},
"top_models": {
"reasoning": "claude-4-5-sonnet",
"creative": "gpt-5",
"fast": "gemini-2.5-flash",
"ultra_cheap": "gpt-4.1-nano",
"budget": "openrouter/deepseek/deepseek-v3.2",
"search": "perplexity/sonar-pro",
"image": "flux-1.1-pro",
"video": "google/veo-3.1",
"tts": "elevenlabs/eleven_multilingual_v2",
"scraping": "firecrawl/scrape"
},
"capabilities": [
"chat_completions", "streaming", "function_calling",
"image_generation", "video_generation", "text_to_speech",
"speech_to_text", "web_scraping", "email", "sms",
"payments", "storage", "hosting", "social_data",
"document_processing", "embeddings", "ui_generation"
],
"discovery_files": {
"llms_txt": "https://skillboss.co/llms.txt",
"agent_json": "https://skillboss.co/agent.json",
"openapi": "https://skillboss.co/openapi.json",
"api_catalog": "https://skillboss.co/api-catalog.json",
"mcp_manifest": "https://skillboss.co/.well-known/mcp.json"
}
}
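As a sketch of how an agent might consume this manifest, the code below parses an excerpt and resolves a task name to a URL and model ID. `resolve` and `RUN_TASKS` are hypothetical names. The manifest does not state which `top_models` keys belong to which endpoint, so the split used here is an assumption based on the Two API Patterns section.

```python
import json

# Excerpt of the manifest above; a real agent would load the full document.
MANIFEST_EXCERPT = """
{
  "base_url": "https://api.skillboss.co/v1",
  "top_models": {
    "reasoning": "claude-4-5-sonnet",
    "fast": "gemini-2.5-flash",
    "image": "flux-1.1-pro",
    "video": "google/veo-3.1"
  }
}
"""

# Assumption: these task keys use /run; every other key uses /chat/completions.
RUN_TASKS = {"image", "video", "tts", "scraping"}

def resolve(manifest: dict, task: str) -> tuple[str, str]:
    """Return (url, model) for a task name listed under top_models."""
    model = manifest["top_models"][task]
    path = "/run" if task in RUN_TASKS else "/chat/completions"
    return manifest["base_url"] + path, model

manifest = json.loads(MANIFEST_EXCERPT)
video_target = resolve(manifest, "video")
fast_target = resolve(manifest, "fast")
```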