> ## Documentation Index
> Fetch the complete documentation index at: https://docs.cline.bot/llms.txt
> Use this file to discover all available pages before exploring further.

# Baseten

> Learn how to configure and use Baseten's Model APIs with Cline. Access frontier open-source models with enterprise-grade performance, reliability, and competitive pricing.

Baseten provides on-demand frontier model APIs designed for production applications, not just experimentation. Built on the Baseten Inference Stack, these APIs deliver optimized inference for leading open-source models from OpenAI, DeepSeek, Moonshot AI, and Alibaba Cloud.

**Website:** [https://www.baseten.co/products/model-apis/](https://www.baseten.co/products/model-apis/)

### Getting an API Key

1. **Sign Up/Sign In:** Go to [Baseten](https://www.baseten.co/) and create an account or sign in.
2. **Navigate to API Keys:** Access your dashboard and go to the API Keys section.
3. **Create a Key:** Generate a new API key. Give it a descriptive name (e.g., "Cline").
4. **Copy the Key:** Copy the API key immediately and store it securely.

### Configuration in Cline

1. **Open Cline Settings:** Click the settings icon (⚙️) in the Cline panel.
2. **Select Provider:** Choose "Baseten" from the "API Provider" dropdown.
3. **Enter API Key:** Paste your Baseten API key into the "Baseten API Key" field.
4. **Select Model:** Choose your desired model from the "Model" dropdown.

**IMPORTANT: For Kimi K2 Thinking:** To use the `moonshotai/Kimi-K2-Thinking` model, you must enable **Native Tool Call (Experimental)** in Cline settings. This setting allows Cline to call tools through their native tool processor and is required for this reasoning model to function properly.

### Supported Models

Cline supports all current models under Baseten Model APIs, including:
For the most updated pricing, please visit: [https://www.baseten.co/products/model-apis/](https://www.baseten.co/products/model-apis/)

* `moonshotai/Kimi-K2-Thinking` (Moonshot AI) - Enhanced reasoning capabilities with step-by-step thought processes (262K context) - \$0.60/\$2.50 per 1M tokens
* `zai-org/GLM-4.6` (Z AI) - Frontier open model with advanced agentic, reasoning and coding capabilities by Z AI (200k context) \$0.60/\$2.20 per 1M tokens
* `moonshotai/Kimi-K2-Instruct-0905` (Moonshot AI) - September update with enhanced capabilities (262K context) - \$0.60/\$2.50 per 1M tokens
* `openai/gpt-oss-120b` (OpenAI) - 120B MoE with strong reasoning capabilities (128K context) - \$0.10/\$0.50 per 1M tokens
* `Qwen/Qwen3-Coder-480B-A35B-Instruct`- Advanced coding and reasoning (262K context) - \$0.38/\$1.53 per 1M tokens
* `Qwen/Qwen3-235B-A22B-Instruct-2507` - Math and reasoning expert (262K context) - \$0.22/\$0.80 per 1M tokens
* `deepseek-ai/DeepSeek-R1` - DeepSeek's first-generation reasoning model (163K context) - \$2.55/\$5.95 per 1M tokens
* `deepseek-ai/DeepSeek-R1-0528` - Latest revision of DeepSeek's reasoning model (163K context) - \$2.55/\$5.95 per 1M tokens
* `deepseek-ai/DeepSeek-V3-0324` - Fast general-purpose with enhanced reasoning (163K context) - \$0.77/\$0.77 per 1M tokens
* `deepseek-ai/DeepSeek-V3.1` - Hybrid reasoning with advanced tool calling (163K context) - \$0.50/\$1.50 per 1M tokens
* `deepseek-ai/DeepSeek-V3.2` - Hybrid reasoning with efficient long context scaling (163K context) - \$0.30/\$0.45 per 1M tokens

### Production-First Architecture

Baseten's Model APIs are built for production environments with several key advantages:

#### Enterprise-Grade Reliability

* **Four nines of uptime** (99.99%) through active-active redundancy
* **Cloud-agnostic, multi-cluster autoscaling** for consistent availability
* **SOC 2 Type II certified** and **HIPAA compliant** for security requirements

#### Optimized Performance

* **Pre-optimized models** shipped with the Baseten Inference Stack
* **Latest-generation GPUs** with multi-cloud infrastructure
* **Ultra-fast inference** optimized from the bottom up for production workloads

#### Cost Efficiency

* **5-10x less expensive** than closed alternatives
* **Optimized multi-cloud infrastructure** for efficient resource utilization
* **Transparent pricing** with no hidden costs or rate limit surprises

#### Developer Experience

* **OpenAI compatible API** - migrate by swapping a single URL
* **Drop-in replacement** for closed models with comprehensive observability and analytics
* **Seamless scaling** from Model APIs to dedicated deployments

### Special Features

#### Function Calling & Tool Use

All Baseten models support structured outputs, function calling, and tool use as part of the Baseten Inference Stack, making them ideal for agentic applications and coding workflows.

### Tips and Notes

* **Dynamic Model Updates:** Cline automatically fetches the latest model list from Baseten, ensuring access to new models as they're released in real time.
* **Multi-Cloud Capacity Management (MCM):** Baseten's multi-cloud infrastructure ensures high availability and low latency globally.
* **Support:** Baseten provides dedicated support for production deployments and can work with you on dedicated resources as you scale.

### Pricing Information

Current pricing is highly competitive and transparent. For the most up-to-date pricing, visit the [Baseten Model APIs page](https://www.baseten.co/products/model-apis/). Prices typically range from \$0.10-\$6.00 per million tokens, making Baseten significantly more cost-effective than many closed-model alternatives while providing access to state-of-the-art open-source models.
