The Chat Completions endpoint generates model responses from a conversation. It follows the OpenAI Chat Completions format.

Endpoint

POST https://api.cline.bot/api/v1/chat/completions

Request Headers

| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer YOUR_API_KEY |
| Content-Type | Yes | application/json |
| HTTP-Referer | No | Your application URL (for usage tracking) |
| X-Title | No | Your application name (for usage logs) |
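
As a rough illustration, the headers above could be assembled in Python as follows. The helper name build_headers and its parameters are hypothetical and not part of any official SDK; only the header names and value formats come from the table.

```python
from typing import Optional


def build_headers(api_key: str,
                  referer: Optional[str] = None,
                  title: Optional[str] = None) -> dict:
    """Assemble request headers (hypothetical helper, not an official SDK call)."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # required
        "Content-Type": "application/json",    # required
    }
    # The two optional headers are only included when a value is supplied.
    if referer:
        headers["HTTP-Referer"] = referer
    if title:
        headers["X-Title"] = title
    return headers


print(build_headers("YOUR_API_KEY", title="My App"))
```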

Request Body

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | | Model ID in provider/model format. See Models. |
| messages | array | Yes | | Conversation messages. Each has role (system, user, assistant) and content. |
| stream | boolean | No | true | Return the response as a stream of Server-Sent Events. |
| tools | array | No | | Tool/function definitions in OpenAI format. |
| temperature | number | No | Model default | Sampling temperature (0.0 to 2.0). Lower values are more deterministic. |
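
A minimal sketch of building a request body from these parameters is shown below. The function build_request_body is a hypothetical helper, and its client-side checks (the provider/model format and the temperature range) are assumptions that mirror the table, not a description of how the server validates input.

```python
import json
from typing import Optional


def build_request_body(model: str,
                       messages: list,
                       stream: bool = True,
                       tools: Optional[list] = None,
                       temperature: Optional[float] = None) -> dict:
    """Build a request body dict (hypothetical helper with assumed client-side checks)."""
    if "/" not in model:
        raise ValueError("model should be in provider/model format")
    body = {"model": model, "messages": messages, "stream": stream}
    # Optional parameters are omitted entirely when not set,
    # so the API can apply its own defaults.
    if tools is not None:
        body["tools"] = tools
    if temperature is not None:
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        body["temperature"] = temperature
    return body


body = build_request_body(
    "anthropic/claude-sonnet-4-6",
    [{"role": "user", "content": "Hello"}],
    stream=False,
    temperature=0.2,
)
print(json.dumps(body, indent=2))
```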

Message Format

Each message in the messages array has this structure:
{
  "role": "user",
  "content": "Your message here"
}
Roles:
| Role | Purpose |
|---|---|
| system | Sets the model’s behavior and persona. Place first in the array. |
| user | The human’s input. |
| assistant | Previous model responses (for multi-turn conversations). |

Multi-Turn Conversation

Include previous messages to maintain context:
{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "What is a closure in JavaScript?"},
    {"role": "assistant", "content": "A closure is a function that..."},
    {"role": "user", "content": "Can you show me an example?"}
  ]
}
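
One way to keep this history on the client is a small wrapper that appends each turn and emits the full message list with every request. The Conversation class below is a hypothetical sketch, not part of the API or any SDK.

```python
class Conversation:
    """Keeps the running message list for one chat (hypothetical helper)."""

    def __init__(self, model, system_prompt=None):
        self.model = model
        self.messages = []
        # The system message, if any, goes first in the array.
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def add_user(self, text):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text):
        # Store each model reply so the next request carries the context.
        self.messages.append({"role": "assistant", "content": text})

    def to_request(self, stream=True):
        # Copy the list so later turns do not mutate an already-built request.
        return {"model": self.model, "messages": list(self.messages), "stream": stream}


chat = Conversation("anthropic/claude-sonnet-4-6", "You are a helpful coding assistant.")
chat.add_user("What is a closure in JavaScript?")
chat.add_assistant("A closure is a function that...")
chat.add_user("Can you show me an example?")
print(len(chat.to_request()["messages"]))
```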

Streaming Response

When stream: true (the default), the response is a series of Server-Sent Events:
data: {"id":"gen-abc123","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":"The capital"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":" of France"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":" is Paris."},"index":0,"finish_reason":"stop"}],"model":"anthropic/claude-sonnet-4-6","usage":{"prompt_tokens":14,"completion_tokens":8,"cost":0.000066}}

data: [DONE]
Each data: line contains a JSON chunk. Key fields:
| Field | Description |
|---|---|
| id | Generation ID, consistent across all chunks |
| choices[0].delta.content | The new text in this chunk |
| choices[0].delta.reasoning | Reasoning/thinking content (for reasoning models) |
| choices[0].finish_reason | stop when complete, error on failure |
| usage | Token counts and cost (included in the final chunk) |
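
The following sketch shows one way a client might consume such a stream: strip the data: prefix, stop at [DONE], and accumulate the content deltas. The function names iter_chunks and collect_stream are hypothetical, and the sample lines are abbreviated copies of the events shown above rather than live output.

```python
import json


def iter_chunks(lines):
    """Yield parsed JSON chunks from an iterable of SSE text lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        yield json.loads(payload)


def collect_stream(lines):
    """Return (full_text, finish_reason, usage) for a complete stream."""
    parts, finish_reason, usage = [], None, None
    for chunk in iter_chunks(lines):
        choices = chunk.get("choices") or []
        if choices:
            choice = choices[0]
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
            finish_reason = choice.get("finish_reason") or finish_reason
        usage = chunk.get("usage") or usage
    return "".join(parts), finish_reason, usage


sample = [
    'data: {"id":"gen-abc123","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    '',
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":"The capital"},"index":0}]}',
    '',
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":" of France"},"index":0}]}',
    '',
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":" is Paris."},"index":0,'
    '"finish_reason":"stop"}],"usage":{"prompt_tokens":14,"completion_tokens":8,"cost":0.000066}}',
    '',
    'data: [DONE]',
]
text, reason, usage = collect_stream(sample)
print(text, reason, usage)
```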

Usage Object

The final chunk includes token usage and cost:
{
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "cost": 0.000315
  }
}
| Field | Description |
|---|---|
| prompt_tokens | Total input tokens |
| completion_tokens | Total output tokens |
| prompt_tokens_details.cached_tokens | Tokens served from cache (reduces cost) |
| cost | Total cost in USD for this request |
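
For logging or budgeting, the usage object can be reduced to a few derived numbers. The sketch below uses a hypothetical helper, summarize_usage, and assumes that cached_tokens is counted as part of prompt_tokens; the source does not state this explicitly, so treat the uncached figure as an approximation.

```python
def summarize_usage(usage: dict) -> dict:
    """Derive simple totals from a usage object (hypothetical helper)."""
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    return {
        "total_tokens": prompt + completion,
        # Assumption: cached tokens are a subset of prompt_tokens.
        "uncached_prompt_tokens": prompt - cached,
        "cached_fraction": cached / prompt if prompt else 0.0,
        "cost_usd": usage.get("cost"),  # may be absent in some responses
    }


summary = summarize_usage({
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "prompt_tokens_details": {"cached_tokens": 0},
    "cost": 0.000315,
})
print(summary)
```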

Non-Streaming Response

When stream: false, the response is a single JSON object:
{
  "id": "gen-abc123",
  "model": "anthropic/claude-sonnet-4-6",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 8
  }
}

Tool Calling

You can define tools that the model can call using the OpenAI function calling format:
{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "user", "content": "What's the weather in San Francisco?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
When the model decides to call a tool, the response includes a tool_calls array:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
To continue the conversation after a tool call, include the tool result:
{
  "messages": [
    {"role": "user", "content": "What's the weather in San Francisco?"},
    {"role": "assistant", "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"location\": \"San Francisco, CA\"}"}}]},
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temperature\": 62, \"condition\": \"foggy\"}"}
  ]
}
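
The round trip above can be sketched on the client side as follows: parse each tool call's arguments string, run a local function, and build the matching tool message. The names run_tool_calls and TOOLS are hypothetical, and get_weather here is a stand-in that returns fixed data rather than a real weather lookup.

```python
import json


def get_weather(location: str) -> dict:
    """Stand-in tool implementation that returns fixed data for illustration."""
    return {"temperature": 62, "condition": "foggy"}


# Maps tool names from the request's tools array to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and return the matching tool-role messages."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
        output = TOOLS[fn["name"]](**args)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the original call
            "content": json.dumps(output),
        })
    return results


assistant_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"San Francisco, CA\"}",
        },
    }],
}
messages = [
    {"role": "user", "content": "What's the weather in San Francisco?"},
    assistant_message,
    *run_tool_calls(assistant_message),
]
print(json.dumps(messages[-1]))
```

The resulting messages list can then be sent in the next request so the model can produce its final answer.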

Reasoning Models

Some models support extended thinking (reasoning). When using these models, the response may include reasoning content in the streaming delta:
{"choices":[{"delta":{"reasoning":"Let me think about this step by step..."}}]}
Reasoning tokens are separate from the main content and appear in the delta.reasoning field. Some providers return encrypted reasoning blocks via delta.reasoning_details that can be passed back in subsequent requests to preserve the reasoning trace.
Not all models support reasoning. See Models for which models have reasoning capabilities.
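
As a minimal sketch, already-parsed stream chunks can be split into reasoning text and answer text by reading the two delta fields separately. The function split_reasoning is hypothetical, and it deliberately ignores delta.reasoning_details because the format of those blocks is not described here.

```python
def split_reasoning(chunks):
    """Return (reasoning_text, answer_text) from parsed stream chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            # Reasoning and content arrive in separate fields of the delta.
            if delta.get("reasoning"):
                reasoning.append(delta["reasoning"])
            if delta.get("content"):
                answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)


chunks = [
    {"choices": [{"delta": {"reasoning": "Let me think about this step by step..."}}]},
    {"choices": [{"delta": {"content": "The answer"}}]},
    {"choices": [{"delta": {"content": " is 4."}}]},
]
thinking, final = split_reasoning(chunks)
print(thinking)
print(final)
```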

Complete Example

curl -X POST https://api.cline.bot/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [
      {"role": "system", "content": "You are a concise assistant. Answer in one sentence."},
      {"role": "user", "content": "Explain what an API is."}
    ],
    "stream": true
  }'
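
The same request can be sketched in Python using only the standard library. The helper names make_request and stream_lines are hypothetical; the URL, headers, and body mirror the curl command above. Nothing is sent until stream_lines is iterated, so the block can be read or imported without a valid key.

```python
import json
import urllib.request

API_URL = "https://api.cline.bot/api/v1/chat/completions"

BODY = {
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a concise assistant. Answer in one sentence."},
        {"role": "user", "content": "Explain what an API is."},
    ],
    "stream": True,
}


def make_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_lines(api_key: str, body: dict):
    """Send the request and yield each non-empty response line as text."""
    with urllib.request.urlopen(make_request(api_key, body)) as resp:
        for raw in resp:  # the response object iterates line by line
            line = raw.decode("utf-8").strip()
            if line:
                yield line


# To run against the live API, replace the placeholder key and uncomment:
# for line in stream_lines("YOUR_API_KEY", BODY):
#     print(line)
```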

Models

Browse available models and their capabilities.

Errors

Handle errors and implement retry logic.

SDK Examples

Use this endpoint from Python, Node.js, and more.

Authentication

API key management and security practices.