The Chat Completions endpoint generates model responses from a conversation. It follows the OpenAI Chat Completions format.

Endpoint

POST https://api.cline.bot/api/v1/chat/completions

Request Headers

| Header | Required | Description |
| --- | --- | --- |
| Authorization | Yes | Bearer YOUR_API_KEY |
| Content-Type | Yes | application/json |
| HTTP-Referer | No | Your application URL (for usage tracking) |
| X-Title | No | Your application name (for usage logs) |

Request Body

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | Yes | | Model ID in provider/model format. See Models. |
| messages | array | Yes | | Conversation messages. Each has role (system, user, assistant) and content. |
| stream | boolean | No | true | Return the response as a stream of Server-Sent Events. |
| tools | array | No | | Tool/function definitions in OpenAI format. |
| temperature | number | No | Model default | Sampling temperature (0.0 to 2.0). Lower values are more deterministic. |
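
The parameter table above can be turned into a small request-body builder. The following is a minimal sketch, not part of the API: `build_body` and `ALLOWED_ROLES` are hypothetical names, and the checks only mirror the constraints stated in the table (provider/model format, temperature range). The `tool` role is included because it appears in the Tool Calling section.

```python
import json

# Roles used in this document; "tool" appears in the Tool Calling section.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def build_body(model, messages, stream=True, temperature=None, tools=None):
    """Assemble a request body, omitting optional fields that were not set."""
    if "/" not in model:
        raise ValueError("model should be in provider/model format")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unexpected role: {message.get('role')!r}")

    body = {"model": model, "messages": messages, "stream": stream}
    if temperature is not None:
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        body["temperature"] = temperature
    if tools:
        body["tools"] = tools
    return body


body = build_body(
    "anthropic/claude-sonnet-4-6",
    [{"role": "user", "content": "Explain what an API is."}],
    stream=False,
    temperature=0.2,
)
print(json.dumps(body, indent=2))
```

Leaving `temperature` and `tools` out of the body when unset lets the documented defaults apply.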

Message Format

Each message in the messages array has this structure:
{
  "role": "user",
  "content": "Your message here"
}
Roles:
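| Role | Purpose |
| --- | --- |
| system | Sets the model’s behavior and persona. Place first in the array. |
| user | The human’s input. |
| assistant | Previous model responses (for multi-turn conversations). |
| tool | The result of a tool call, identified by tool_call_id. See Tool Calling. |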

Multi-Turn Conversation

Include previous messages to maintain context:
{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "What is a closure in JavaScript?"},
    {"role": "assistant", "content": "A closure is a function that..."},
    {"role": "user", "content": "Can you show me an example?"}
  ]
}
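
One way to keep the history in order is a small wrapper that appends each turn and produces the request body on demand. This is an illustrative sketch only: the `Conversation` class and its method names are hypothetical and not part of the API.

```python
class Conversation:
    """Accumulates messages so each request carries the full history."""

    def __init__(self, model, system_prompt=None):
        self.model = model
        self.messages = []
        if system_prompt:
            # The system message goes first in the array.
            self.messages.append({"role": "system", "content": system_prompt})

    def add_user(self, text):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text):
        self.messages.append({"role": "assistant", "content": text})

    def to_request(self):
        # Copy the list so later turns do not mutate a body already sent.
        return {"model": self.model, "messages": list(self.messages)}


chat = Conversation("anthropic/claude-sonnet-4-6", "You are a helpful coding assistant.")
chat.add_user("What is a closure in JavaScript?")
chat.add_assistant("A closure is a function that...")
chat.add_user("Can you show me an example?")
request_body = chat.to_request()
print([m["role"] for m in request_body["messages"]])
```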

Streaming Response

When stream: true (the default), the response is a series of Server-Sent Events:
data: {"id":"gen-abc123","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":"The capital"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":" of France"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":" is Paris."},"index":0,"finish_reason":"stop"}],"model":"anthropic/claude-sonnet-4-6","usage":{"prompt_tokens":14,"completion_tokens":8,"cost":0.000066}}

data: [DONE]
Each data: line contains a JSON chunk, except the final data: [DONE] line, which marks the end of the stream. Key fields:
| Field | Description |
| --- | --- |
| id | Generation ID, consistent across all chunks |
| choices[0].delta.content | The new text in this chunk |
| choices[0].delta.reasoning | Reasoning/thinking content (for reasoning models) |
| choices[0].finish_reason | stop when complete, error on failure |
| usage | Token counts and cost (included in the final chunk) |
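
The stream can be consumed by reading it line by line, decoding each data: payload, and stopping at [DONE]. The sketch below assumes the lines have already been read from the HTTP response as strings; `iter_chunks` and `collect` are hypothetical helper names, and the sample lines are a shortened copy of the events shown above.

```python
import json


def iter_chunks(lines):
    """Yield decoded JSON chunks from an iterable of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        yield json.loads(payload)


def collect(lines):
    """Join the content deltas and keep the last finish_reason and usage seen."""
    parts, finish_reason, usage = [], None, None
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            finish_reason = choice.get("finish_reason") or finish_reason
        usage = chunk.get("usage") or usage
    return "".join(parts), finish_reason, usage


sample = [
    'data: {"id":"gen-abc123","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    "",
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":"The capital"},"index":0}]}',
    "",
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":" of France"},"index":0}]}',
    "",
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":" is Paris."},"index":0,'
    '"finish_reason":"stop"}],"usage":{"prompt_tokens":14,"completion_tokens":8,"cost":0.000066}}',
    "",
    "data: [DONE]",
]

text, finish_reason, usage = collect(sample)
print(text)
```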

Usage Object

The final chunk includes token usage and cost:
{
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "cost": 0.000315
  }
}
| Field | Description |
| --- | --- |
| prompt_tokens | Total input tokens |
| completion_tokens | Total output tokens |
| prompt_tokens_details.cached_tokens | Tokens served from cache (reduces cost) |
| cost | Total cost in USD for this request |
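
To track spend across several requests, the usage objects can be folded into running totals. This is a rough sketch: `UsageTotals` is a hypothetical helper, and it treats any missing field as zero, on the assumption that not every response carries every field.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cached_tokens: int = 0
    cost: float = 0.0

    def add(self, usage):
        """Fold one usage object into the running totals."""
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        details = usage.get("prompt_tokens_details") or {}
        self.cached_tokens += details.get("cached_tokens", 0)
        self.cost += usage.get("cost", 0.0)


totals = UsageTotals()
totals.add({
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "prompt_tokens_details": {"cached_tokens": 0},
    "cost": 0.000315,
})
totals.add({"prompt_tokens": 14, "completion_tokens": 8, "cost": 0.000066})
print(totals)
```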

Non-Streaming Response

When stream: false, the response is a single JSON object:
{
  "id": "gen-abc123",
  "model": "anthropic/claude-sonnet-4-6",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 8
  }
}
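
Reading a non-streaming response comes down to indexing into the first choice. The helper below is a minimal sketch with a hypothetical name (`read_completion`); it uses `.get` for content because an assistant message that requests a tool call may carry tool_calls instead of text.

```python
def read_completion(response):
    """Pull the commonly needed fields out of a non-streaming response."""
    choice = response["choices"][0]
    message = choice["message"]
    return {
        "text": message.get("content"),
        "tool_calls": message.get("tool_calls", []),
        "finish_reason": choice.get("finish_reason"),
    }


response = {
    "id": "gen-abc123",
    "model": "anthropic/claude-sonnet-4-6",
    "choices": [
        {
            "message": {"role": "assistant", "content": "The capital of France is Paris."},
            "finish_reason": "stop",
            "index": 0,
        }
    ],
    "usage": {"prompt_tokens": 14, "completion_tokens": 8},
}

result = read_completion(response)
print(result["text"])
```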

Tool Calling

Define the tools the model may call using the OpenAI function-calling format:
{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "user", "content": "What's the weather in San Francisco?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
When the model decides to call a tool, the response includes a tool_calls array:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
To continue the conversation after a tool call, include the tool result:
{
  "messages": [
    {"role": "user", "content": "What's the weather in San Francisco?"},
    {"role": "assistant", "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"location\": \"San Francisco, CA\"}"}}]},
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temperature\": 62, \"condition\": \"foggy\"}"}
  ]
}
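
The round trip above can be sketched in client code: decode each call's arguments (a JSON-encoded string, not an object), run the matching local function, and append a tool message that echoes the call's id. Everything here is illustrative: `get_weather` is a stand-in that returns fixed data, and `TOOLS` and `run_tool_calls` are hypothetical names.

```python
import json


def get_weather(location):
    # Stand-in implementation; a real tool would query a weather service.
    return {"temperature": 62, "condition": "foggy"}


# Maps tool names from the request's "tools" array to local functions.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message, tools=TOOLS):
    """Execute each requested tool call and build the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        function = call["function"]
        arguments = json.loads(function["arguments"])  # arguments is a JSON string
        output = tools[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results


assistant_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"San Francisco, CA\"}",
        },
    }],
}

messages = [{"role": "user", "content": "What's the weather in San Francisco?"}]
messages.append(assistant_message)
messages.extend(run_tool_calls(assistant_message))
print(json.dumps(messages[-1]))
```

The resulting `messages` list is what goes into the follow-up request.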

Reasoning Models

Some models support extended thinking (reasoning). When using these models, the response may include reasoning content in the streaming delta:
{"choices":[{"delta":{"reasoning":"Let me think about this step by step..."}}]}
Reasoning tokens are separate from the main content and appear in the delta.reasoning field. Some providers also return encrypted reasoning blocks in delta.reasoning_details; pass these back in subsequent requests to preserve the reasoning trace.
Not all models support reasoning. See Models for which models have reasoning capabilities.
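
A client that wants to show the answer separately from the thinking can route the two delta fields into separate buffers. The sketch below assumes the chunks are already decoded from the stream; `split_stream` is a hypothetical name, the second sample chunk is made up for illustration, and delta.reasoning_details is left out because its structure is not described here.

```python
def split_stream(chunks):
    """Separate reasoning text from answer text across decoded stream chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("reasoning"):
                reasoning.append(delta["reasoning"])
            if delta.get("content"):
                answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)


chunks = [
    {"choices": [{"delta": {"reasoning": "Let me think about this step by step..."}}]},
    {"choices": [{"delta": {"content": "Paris."}}]},  # illustrative answer chunk
]

reasoning_text, answer_text = split_stream(chunks)
print(answer_text)
```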

Complete Example

curl -X POST https://api.cline.bot/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [
      {"role": "system", "content": "You are a concise assistant. Answer in one sentence."},
      {"role": "user", "content": "Explain what an API is."}
    ],
    "stream": true
  }'
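
The same request can be made from Python using only the standard library. This is a sketch under the assumptions already stated on this page (endpoint URL, headers, SSE framing with a [DONE] sentinel); `build_request` and `stream_text` are hypothetical helper names, and the sending step is left commented out because it needs a valid API key and network access.

```python
import json
import urllib.request

API_URL = "https://api.cline.bot/api/v1/chat/completions"


def build_request(api_key, body):
    """Build (but do not send) a POST request for the endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_text(request):
    """Send the request and yield content deltas as they arrive."""
    with urllib.request.urlopen(request) as response:
        for raw in response:  # the response object iterates line by line
            line = raw.decode("utf-8").strip()
            if not line.startswith("data:"):
                continue
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                return
            chunk = json.loads(payload)
            for choice in chunk.get("choices", []):
                text = choice.get("delta", {}).get("content")
                if text:
                    yield text


body = {
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a concise assistant. Answer in one sentence."},
        {"role": "user", "content": "Explain what an API is."},
    ],
    "stream": True,
}
request = build_request("YOUR_API_KEY", body)

# Sending requires a valid key and network access:
# for piece in stream_text(request):
#     print(piece, end="", flush=True)
```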