Nebius AI Studio provides inference for a wide range of open-source models, including DeepSeek, Qwen, and Llama, with competitive pricing and both fast and standard speed tiers. Website: https://studio.nebius.com/

Getting an API Key

  1. Sign Up/Sign In: Go to Nebius AI Studio. Create an account or sign in.
  2. Navigate to API Keys: Access the API key section in your dashboard.
  3. Create a Key: Generate a new API key.
  4. Copy the Key: Copy the API key immediately and store it securely.
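A common way to store the key securely is an environment variable rather than a plain-text note. The variable name `NEBIUS_API_KEY` below is purely illustrative — Cline itself reads the key from its settings UI (see "Configuration in Cline" below), so this is only a sketch of safe local storage:

```shell
# Store the key in an environment variable instead of hard-coding it.
# NEBIUS_API_KEY is an illustrative name, not something Cline requires.
export NEBIUS_API_KEY="your-key-here"   # placeholder value

# Confirm it is set without printing the full key.
echo "Key length: ${#NEBIUS_API_KEY}"
```

Add the `export` line to your shell profile (e.g. `~/.bashrc`) if you want it to persist across sessions.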

Supported Models

Cline supports the following Nebius models:

DeepSeek Models

  • deepseek-ai/DeepSeek-V3 - General-purpose model (0.50/0.50/1.50 per 1M tokens)
  • deepseek-ai/DeepSeek-V3-0324-fast - Fast variant (2.00/2.00/6.00 per 1M tokens)
  • deepseek-ai/DeepSeek-R1 - Reasoning model (0.80/0.80/2.40 per 1M tokens)
  • deepseek-ai/DeepSeek-R1-fast - Fast reasoning (2.00/2.00/6.00 per 1M tokens)
  • deepseek-ai/DeepSeek-R1-0528 - Latest reasoning version (163K context, 0.80/0.80/2.40 per 1M tokens)
  • deepseek-ai/DeepSeek-R1-0528-fast - Fast latest reasoning (2.00/2.00/6.00 per 1M tokens)

Qwen Models

  • Qwen/Qwen3-Coder-480B-A35B-Instruct - 480B coding model (262K context, 0.40/0.40/1.80 per 1M tokens)
  • Qwen/Qwen3-235B-A22B - 235B MoE model (0.20/0.20/0.60 per 1M tokens)
  • Qwen/Qwen3-235B-A22B-Instruct-2507 - Latest instruct version (262K context, 0.20/0.20/0.60 per 1M tokens)
  • Qwen/Qwen3-32B / Qwen/Qwen3-32B-fast - Dense 32B model
  • Qwen/Qwen3-30B-A3B / Qwen/Qwen3-30B-A3B-fast - Compact MoE model
  • Qwen/Qwen3-4B-fast - Small fast model (0.08/0.08/0.24 per 1M tokens)
  • Qwen/Qwen2.5-Coder-32B-Instruct-fast - Coding-optimized (0.10/0.10/0.30 per 1M tokens)
  • Qwen/Qwen2.5-32B-Instruct-fast (Default) - General-purpose (0.13/0.13/0.40 per 1M tokens)

Other Models

  • moonshotai/Kimi-K2-Instruct - Kimi K2 with prompt caching (131K context, 0.50/0.50/2.40 per 1M tokens)
  • openai/gpt-oss-120b - OpenAI’s 120B open-weight model (0.15/0.15/0.60 per 1M tokens)
  • openai/gpt-oss-20b - OpenAI’s 20B open-weight model (0.05/0.05/0.20 per 1M tokens)
  • zai-org/GLM-4.5 / zai-org/GLM-4.5-Air - Z AI models with prompt caching
  • meta-llama/Llama-3.3-70B-Instruct-fast - Fast Llama 3.3 (0.25/0.25/0.75 per 1M tokens)
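The model IDs above are the exact strings sent in API requests. As a rough sketch of what a chat request to one of these models looks like — assuming Nebius AI Studio's OpenAI-compatible chat-completions endpoint, with the base URL below being an assumption to verify against the Nebius docs:

```python
import json

# Assumed OpenAI-compatible endpoint -- confirm in the Nebius documentation.
BASE_URL = "https://api.studio.nebius.com/v1"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build the JSON body for a POST to {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Use a model ID from the list above, verbatim.
body = build_chat_request("Qwen/Qwen2.5-32B-Instruct-fast", "Hello!")
print(json.dumps(body, indent=2))
```

Cline builds and sends these requests for you; the sketch only shows why the full, case-sensitive model IDs matter.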

Configuration in Cline

  1. Open Cline Settings: Click the settings icon (⚙️) in the Cline panel.
  2. Select Provider: Choose “Nebius AI Studio” from the “API Provider” dropdown.
  3. Enter API Key: Paste your Nebius API key.
  4. Select Model: Choose your desired model from the “Model” dropdown.

Tips and Notes

  • Speed Tiers: Models with the -fast suffix offer faster inference at higher prices.
  • Wide Selection: Access models from DeepSeek, Qwen, Meta, Moonshot, OpenAI, and Z AI.
  • Competitive Pricing: Generally lower prices than direct provider APIs.
  • Pricing: Check the Nebius documentation for current rates.
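As the model list shows, the fast tier follows a simple naming pattern: the fast variant appends -fast to the base model ID. A tiny illustrative helper (not part of Cline or the Nebius API — and note that not every model has a fast variant, so check availability before using the result):

```python
def fast_variant(model_id: str) -> str:
    """Return the fast-tier ID for a base model ID.

    Purely illustrative: the "-fast" suffix convention comes from the
    model list above; not every base model offers a fast variant.
    """
    return model_id if model_id.endswith("-fast") else model_id + "-fast"

print(fast_variant("deepseek-ai/DeepSeek-R1"))   # deepseek-ai/DeepSeek-R1-fast
print(fast_variant("Qwen/Qwen3-4B-fast"))        # already fast; unchanged
```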