Learn how to configure and use Cerebras’s ultra-fast inference with Cline. Experience up to 2,600 tokens per second with wafer-scale chip architecture and real-time reasoning models.
- `qwen-3-coder-480b-free` (Free tier) - High-performance coding model at no cost
- `qwen-3-coder-480b` - Flagship 480B parameter coding model
- `qwen-3-235b-a22b-instruct-2507` - Advanced instruction-following model
- `qwen-3-235b-a22b-thinking-2507` - Reasoning model with step-by-step thinking
- `llama-3.3-70b` - Meta’s Llama 3.3 model optimized for speed
- `qwen-3-32b` - Compact yet powerful model for general tasks

The `qwen-3-coder-480b-free` model provides access to high-performance inference at no cost, which is unique among speed-focused providers.
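
If you script around these model IDs (for example, to check which ones cost nothing before choosing a default), a small lookup table can help. The sketch below is illustrative only: the IDs and descriptions are copied from the list above, while `CEREBRAS_MODELS`, `is_free_tier`, and `free_models` are hypothetical names, and treating the `-free` suffix as the free-tier marker is an assumption based on the single free model listed here.

```python
# Model IDs and descriptions copied from the list above.
CEREBRAS_MODELS = {
    "qwen-3-coder-480b-free": "High-performance coding model at no cost (free tier)",
    "qwen-3-coder-480b": "Flagship 480B parameter coding model",
    "qwen-3-235b-a22b-instruct-2507": "Advanced instruction-following model",
    "qwen-3-235b-a22b-thinking-2507": "Reasoning model with step-by-step thinking",
    "llama-3.3-70b": "Meta's Llama 3.3 model optimized for speed",
    "qwen-3-32b": "Compact yet powerful model for general tasks",
}


def is_free_tier(model_id: str) -> bool:
    """Hypothetical helper: assumes the '-free' suffix marks a free-tier model."""
    if model_id not in CEREBRAS_MODELS:
        raise KeyError(f"Unknown model ID: {model_id}")
    return model_id.endswith("-free")


def free_models() -> list[str]:
    """Return the listed model IDs that are on the free tier."""
    return [model_id for model_id in CEREBRAS_MODELS if is_free_tier(model_id)]
```

Calling `free_models()` returns only the IDs that end in `-free`; every other entry is assumed to be a paid model.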
Reasoning models such as `qwen-3-235b-a22b-thinking-2507` can complete complex multi-step reasoning in under a second, making them practical for interactive development workflows.
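
To get a feel for what "under a second" means in tokens, the arithmetic can be sketched as follows. This is a rough estimate, not a benchmark: it assumes the "up to 2,600 tokens per second" figure from the introduction as a peak rate and ignores network latency, queueing, and prompt processing. The function names are hypothetical.

```python
# Peak rate quoted in the introduction ("up to"); real throughput may be lower.
PEAK_TOKENS_PER_SECOND = 2600


def estimated_generation_seconds(
    output_tokens: int, tokens_per_second: float = PEAK_TOKENS_PER_SECOND
) -> float:
    """Estimate generation time as output_tokens / tokens_per_second."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def max_tokens_within(
    seconds: float, tokens_per_second: float = PEAK_TOKENS_PER_SECOND
) -> int:
    """Largest whole number of tokens that fits in the given time budget."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return int(seconds * tokens_per_second)
```

Under these assumptions, a 1,300-token reasoning trace would take about half a second, and a one-second budget covers at most 2,600 generated tokens.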