Updated: 2026-03-01

Chat Completions API - SkillBoss Docs

Chat Completions API reference for Claude, GPT, Gemini, and 679+ endpoints. Streaming, function calling, vision, JSON mode, best practices, and code examples.

The Chat Completions API is 100% OpenAI-compatible and supports 679+ endpoints including Claude 4.5 Sonnet, GPT-5, Gemini 2.5 Flash, and DeepSeek R1.

Endpoint

POST https://api.skillboss.co/v1/chat/completions

Authentication

All API requests require an API key in the Authorization header:

curl https://api.skillboss.co/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-4.5-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Get your API key →

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g., claude-4.5-sonnet, gpt-5) |
| messages | array | Yes | Array of message objects |
| max_tokens | integer | No | Maximum tokens to generate (default: 4096) |
| temperature | number | No | Sampling temperature 0-2 (default: 1) |
| top_p | number | No | Nucleus sampling 0-1 (default: 1) |
| stream | boolean | No | Enable streaming responses (default: false) |
| stop | string/array | No | Stop sequences |
| presence_penalty | number | No | Penalize tokens already present, encouraging new topics; -2 to 2 (default: 0) |
| frequency_penalty | number | No | Penalize repetition; -2 to 2 (default: 0) |
| user | string | No | End-user ID for tracking |
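For reference, here is a request body that exercises most of the optional parameters above. The values are illustrative, not recommendations:

```typescript
// Illustrative request body combining the optional parameters above.
const body = {
  model: 'claude-4.5-sonnet',
  messages: [{ role: 'user', content: 'List three haiku themes.' }],
  max_tokens: 200,        // cap the response length
  temperature: 0.8,       // slightly creative
  top_p: 0.95,            // nucleus sampling cutoff
  stop: ['\n\n'],         // stop at the first blank line
  presence_penalty: 0.3,  // nudge toward new topics
  user: 'end-user-42',    // your own end-user ID, for tracking
};

console.log(Object.keys(body).length); // → 8
```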

Messages Format

{
  "messages": [
    {
      "role": "system" | "user" | "assistant",
      "content": string | array
    }
  ]
}
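The API is stateless, so a multi-turn conversation is represented by replaying the full history in messages on every request. A minimal sketch:

```typescript
// Each request carries the whole conversation; the last user turn can refer
// back to earlier assistant turns ("its" below resolves to Paris).
const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is the capital of France?' },
  { role: 'assistant', content: 'Paris.' },
  { role: 'user', content: 'And what is its population?' },
];

console.log(messages.length); // → 4
```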

Response Format

Non-Streaming Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709251200,
  "model": "claude-4.5-sonnet",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}

Streaming Response

When stream: true, responses are sent as Server-Sent Events (SSE):

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1709251200,"model":"claude-4.5-sonnet","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1709251200,"model":"claude-4.5-sonnet","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1709251200,"model":"claude-4.5-sonnet","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
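If you are not using an SDK, the chunks above can be parsed by hand. The helper below is a sketch (not part of any SDK) that pulls the delta text out of raw SSE lines; in a real stream you would also buffer partial lines across network chunks:

```typescript
// Hypothetical helper: extract delta text from raw SSE lines like the ones above.
function extractDeltas(sseText: string): string[] {
  const out: string[] = [];
  for (const line of sseText.split('\n')) {
    if (!line.startsWith('data: ')) continue;   // SSE data lines only
    const payload = line.slice('data: '.length).trim();
    if (payload === '[DONE]') continue;         // end-of-stream sentinel
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof delta === 'string') out.push(delta);
  }
  return out;
}

const sample = [
  'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
  'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
  'data: [DONE]',
].join('\n');

console.log(extractDeltas(sample).join('')); // → Hello!
```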


Code Examples

Node.js / TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.SKILLBOSS_API_KEY,
  baseURL: 'https://api.skillboss.co/v1'
});

const completion = await client.chat.completions.create({
  model: 'claude-4.5-sonnet',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms.' }
  ],
  max_tokens: 500,
  temperature: 0.7
});

console.log(completion.choices[0].message.content);

Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-sb-YOUR_API_KEY",
    base_url="https://api.skillboss.co/v1"
)

response = client.chat.completions.create(
    model="claude-4.5-sonnet",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=500,
    temperature=0.7
)

print(response.choices[0].message.content)

cURL

curl https://api.skillboss.co/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-4.5-sonnet",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms."
      }
    ],
    "max_tokens": 500,
    "temperature": 0.7
  }'

Go

package main

import (
    "context"
    "fmt"
    "github.com/sashabaranov/go-openai"
)

func main() {
    config := openai.DefaultConfig("sk-sb-YOUR_API_KEY")
    config.BaseURL = "https://api.skillboss.co/v1"
    client := openai.NewClientWithConfig(config)

    resp, err := client.CreateChatCompletion(
        context.Background(),
        openai.ChatCompletionRequest{
            Model: "claude-4.5-sonnet",
            Messages: []openai.ChatCompletionMessage{
                {
                    Role:    openai.ChatMessageRoleSystem,
                    Content: "You are a helpful assistant.",
                },
                {
                    Role:    openai.ChatMessageRoleUser,
                    Content: "Explain quantum computing in simple terms.",
                },
            },
            MaxTokens:   500,
            Temperature: 0.7,
        },
    )

    if err != nil {
        panic(err)
    }

    fmt.Println(resp.Choices[0].Message.Content)
}

Available Models

Premium Models (Recommended)

| Model | ID | Context | Input Price | Output Price |
|---|---|---|---|---|
| Claude 4.5 Sonnet | claude-4.5-sonnet | 200K | $3/1M tokens | $15/1M tokens |
| GPT-5 | gpt-5 | 128K | $10/1M tokens | $30/1M tokens |
| Gemini 2.5 Flash | gemini-2.5-flash | 1M | $0.15/1M tokens | $0.60/1M tokens |

High-Performance Models

| Model | ID | Context | Best For |
|---|---|---|---|
| Claude 4.5 Haiku | claude-4.5-haiku | 200K | Fast responses, high throughput |
| GPT-4o mini | gpt-4o-mini | 128K | Cost-efficient chat |
| DeepSeek R1 | deepseek-r1 | 64K | Reasoning tasks |

See all 679+ endpoints →

Multimodal Support (Vision)

Send images in messages:

const completion = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: "What's in this image?" },
        {
          type: 'image_url',
          image_url: {
            url: 'https://example.com/image.jpg'
          }
        }
      ]
    }
  ]
});

Supported models:

  • gpt-5, gpt-4o, gpt-4o-mini
  • claude-4.5-sonnet, claude-4.5-opus
  • gemini-2.5-flash, gemini-2.5-pro
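For local files, a base64 data URL can be used in place of a remote URL (assuming the API accepts data: URLs in image_url, as the OpenAI API does). A sketch:

```typescript
// Sketch: sending a local image as a base64 data URL.
// The byte array stands in for readFileSync('photo.jpg') to keep this runnable.
const bytes = Buffer.from([0xff, 0xd8, 0xff]);
const imagePart = {
  type: 'image_url',
  image_url: { url: `data:image/jpeg;base64,${bytes.toString('base64')}` },
};

console.log(imagePart.image_url.url.startsWith('data:image/jpeg;base64,')); // → true
```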

Advanced Features

Function Calling (Tools)

const completion = await client.chat.completions.create({
  model: 'claude-4.5-sonnet',
  messages: [{ role: 'user', content: "What's the weather in SF?" }],
  tools: [
    {
      type: 'function',
      function: {
        name: 'get_weather',
        description: 'Get current weather for a location',
        parameters: {
          type: 'object',
          properties: {
            location: { type: 'string', description: 'City name' },
            unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
          },
          required: ['location']
        }
      }
    }
  ],
  tool_choice: 'auto'
});

// Handle tool calls
if (completion.choices[0].finish_reason === 'tool_calls') {
  const toolCall = completion.choices[0].message.tool_calls[0];
  console.log(toolCall.function.name); // "get_weather"
  console.log(toolCall.function.arguments); // '{"location":"San Francisco","unit":"celsius"}'
}
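After executing the function yourself, the result goes back to the model as a tool role message, tied to the call by tool_call_id. A sketch of building that message (the helper name is ours, not part of the SDK):

```typescript
// Hypothetical helper: wrap a function result as the `tool` message that is
// appended to `messages` (after the assistant turn) for the follow-up request.
type ToolCall = { id: string; function: { name: string; arguments: string } };

function buildToolResultMessage(toolCall: ToolCall, result: unknown) {
  return {
    role: 'tool' as const,
    tool_call_id: toolCall.id,       // ties the result to the matching call
    content: JSON.stringify(result), // tool content must be a string
  };
}

const call: ToolCall = {
  id: 'call_abc',
  function: { name: 'get_weather', arguments: '{"location":"San Francisco"}' },
};
const msg = buildToolResultMessage(call, { temp_c: 18, condition: 'foggy' });
console.log(msg.content); // → {"temp_c":18,"condition":"foggy"}
```

Append the assistant message (the one carrying tool_calls) and this tool message to the original messages array, then call client.chat.completions.create again; the model answers in natural language using the result.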

JSON Mode

Force model to output valid JSON:

const completion = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'Extract user info as JSON.' },
    { role: 'user', content: "My name is John, I'm 30, from NYC." }
  ],
  response_format: { type: 'json_object' }
});

const data = JSON.parse(completion.choices[0].message.content);
// { "name": "John", "age": 30, "location": "NYC" }

Reproducible Outputs (Seed)

const completion = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [{ role: 'user', content: 'Generate a random number.' }],
  seed: 12345, // same seed + same parameters gives best-effort reproducibility
  temperature: 1
});

Error Handling

HTTP Status Codes

| Code | Meaning | Solution |
|---|---|---|
| 200 | Success | - |
| 400 | Bad Request | Check request format |
| 401 | Unauthorized | Verify API key |
| 429 | Rate Limited | Reduce request rate |
| 500 | Server Error | Retry with exponential backoff |

Error Response Format

{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
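When handling raw HTTP responses, it helps to narrow an unknown body to this error shape before reading its fields. A minimal sketch:

```typescript
// Sketch: type guard for the error envelope shown above.
type ApiError = { error: { message: string; type: string; code: string } };

function isApiError(body: unknown): body is ApiError {
  const e = (body as any)?.error;
  return typeof e?.message === 'string' && typeof e?.code === 'string';
}

const parsed = JSON.parse(
  '{"error":{"message":"Invalid API key provided","type":"invalid_request_error","code":"invalid_api_key"}}'
);
if (isApiError(parsed)) console.log(parsed.error.code); // → invalid_api_key
```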

Retry Logic Example

async function chatWithRetry(messages, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await client.chat.completions.create({
        model: 'claude-4.5-sonnet',
        messages
      });
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        // Exponential backoff: 1s, 2s, 4s
        await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
        continue;
      }
      throw error;
    }
  }
}

Rate Limits

SkillBoss enforces the following rate limits:

| Tier | Requests/min | Tokens/min |
|---|---|---|
| Free | 60 | 90,000 |
| Paid | Unlimited* | 10M |

*Soft limit - contact support for higher limits
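To stay under the free-tier request cap client-side, a sliding-window limiter is enough. A sketch (the clock is passed in so the logic is testable without real time):

```typescript
// Sketch: minimal sliding-window limiter for the free tier (60 requests/min).
class RateLimiter {
  private timestamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  // Returns true if a request may be sent at time `now` (ms), false otherwise.
  tryAcquire(now: number): boolean {
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}

const limiter = new RateLimiter(60, 60_000);
let allowed = 0;
for (let i = 0; i < 70; i++) if (limiter.tryAcquire(0)) allowed++;
console.log(allowed); // → 60
```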

Best Practices

1. Use System Messages for Context

Good:

messages: [
  { role: 'system', content: 'You are a Python expert. Be concise.' },
  { role: 'user', content: 'Explain list comprehensions.' }
]

Bad:

messages: [
  { role: 'user', content: 'You are a Python expert. Explain list comprehensions.' }
]

2. Set max_tokens to Control Cost

{
  model: 'claude-4.5-sonnet',
  messages: [...],
  max_tokens: 500 // Limit output to 500 tokens
}

3. Use temperature Wisely

  • 0.0-0.3: Factual, deterministic (code, math)
  • 0.7-1.0: Creative, varied (stories, brainstorming)
  • 1.5-2.0: Highly creative (experimental)
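As a rule of thumb, these bands can be encoded in a small helper; the names and exact values below are ours, adjust to taste:

```typescript
// Illustrative default temperature per task type, following the bands above.
const temperatureFor = (task: 'code' | 'chat' | 'brainstorm'): number =>
  ({ code: 0.2, chat: 0.7, brainstorm: 1.2 })[task];

console.log(temperatureFor('code')); // → 0.2
```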

4. Handle Streaming for Better UX

const stream = await client.chat.completions.create({
  model: 'claude-4.5-sonnet',
  messages: [...],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

See API overview →

Frequently Asked Questions

Can I use the OpenAI SDK?

Yes! SkillBoss is 100% OpenAI-compatible. Just change the baseURL:

const client = new OpenAI({
  apiKey: 'sk-sb-YOUR_KEY',
  baseURL: 'https://api.skillboss.co/v1' // Change this line
});

What's the difference between models?

  • Claude: Best reasoning, long context (200K)
  • GPT: Best general-purpose, fastest updates
  • Gemini: Best multimodal, 1M context window
  • DeepSeek: Best for coding & math


How is billing calculated?

You pay for:

  • Input tokens: Tokens in your messages
  • Output tokens: Tokens in the response

Example:

Input: 100 tokens × $3/1M = $0.0003
Output: 50 tokens × $15/1M = $0.00075
Total: $0.00105 per request
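The arithmetic above generalizes to a small helper driven by the usage object returned with every response, with prices taken from the per-million rates in the model tables:

```typescript
// Sketch: estimate request cost from token counts and per-million-token prices.
function estimateCost(
  promptTokens: number,
  completionTokens: number,
  inputPerMillion: number,   // e.g. 3 for claude-4.5-sonnet input
  outputPerMillion: number   // e.g. 15 for claude-4.5-sonnet output
): number {
  return (promptTokens / 1_000_000) * inputPerMillion
       + (completionTokens / 1_000_000) * outputPerMillion;
}

// The example above: 100 input + 50 output tokens on claude-4.5-sonnet.
console.log(estimateCost(100, 50, 3, 15).toFixed(5)); // → 0.00105
```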

Can I use streaming with all models?

Yes! All models support streaming via stream: true.

