Getting an API Key
International Users
- Sign Up/Sign In: Go to https://z.ai/model-api. Create an account or sign in.
- Navigate to API Keys: Access your account dashboard and find the API keys section.
- Create a Key: Generate a new API key for your application.
- Copy the Key: Copy the API key immediately and store it securely.
China Mainland Users
- Sign Up/Sign In: Go to https://open.bigmodel.cn/. Create an account or sign in.
- Navigate to API Keys: Access your account dashboard and find the API keys section.
- Create a Key: Generate a new API key for your application.
- Copy the Key: Copy the API key immediately and store it securely.
Supported Models
Z AI provides different model catalogs based on your selected region:GLM-4.5 Series
- GLM-4.5 - Flagship model with 355B total parameters, 32B active parameters
- GLM-4.5-Air - Compact model with 106B total parameters, 12B active parameters
GLM-4.5 Hybrid Reasoning Models
- GLM-4.5 (Thinking Mode) - Advanced reasoning with step-by-step analysis
- GLM-4.5-Air (Thinking Mode) - Efficient reasoning for mainstream hardware
- 128,000 token context window for extensive document processing
- Mixture of Experts (MoE) architecture for optimal performance
- Agent-native design integrating reasoning, coding, and tool usage
- Open-source availability under MIT license
Configuration in Cline
- Open Cline Settings: Click the settings icon (⚙️) in the Cline panel.
- Select Provider: Choose “Z AI” from the “API Provider” dropdown.
- Select Region: Choose your region:
- “International” for global access
- “China” for mainland China access
- Enter API Key: Paste your Z AI API key into the “Z AI API Key” field.
- Select Model: Choose your desired model from the “Model” dropdown.
GLM Coding Plans
Z AI offers subscription plans specifically designed for coding applications. These plans provide cost-effective access to GLM-4.5 models through a prompt-based structure rather than traditional API usage billing.Plan Options
GLM Coding Lite - $3/month- 120 prompts per 5-hour cycle
- Access to GLM-4.5 model
- Works exclusively through coding tools like Cline
- 600 prompts per 5-hour cycle
- Access to GLM-4.5 model
- Works exclusively through coding tools like Cline

Setting up GLM Coding Plans
To use the GLM Coding Plans with Cline:- Subscribe: Go to https://z.ai/subscribe and choose your plan.
- Create API Key: After subscribing, log into your zAI dashboard and create an API key for your coding plan.
- Configure in Cline: Open Cline settings, select “Z AI” as your provider, and paste your API key into the “Z AI API Key” field.

Z AI’s Hybrid Intelligence
Z AI’s GLM-4.5 series introduces revolutionary capabilities that set it apart from conventional language models:Hybrid Reasoning Architecture
GLM-4.5 operates in two distinct modes:- Thinking Mode: Designed for complex reasoning tasks and tool usage, engaging in deeper analytical processes
- Non-Thinking Mode: Provides immediate responses for straightforward queries, optimizing efficiency
Exceptional Performance
GLM-4.5 achieves a comprehensive score of 63.2 across 12 benchmarks spanning agentic tasks, reasoning, and coding challenges, securing 3rd place among all proprietary and open-source models. GLM-4.5-Air maintains competitive performance with a score of 59.8 while delivering superior efficiency.Mixture of Experts Excellence
The sophisticated MoE architecture optimizes performance while maintaining computational efficiency:- GLM-4.5: 355B total parameters with 32B active parameters
- GLM-4.5-Air: 106B total parameters with 12B active parameters
Extended Context Capabilities
The 128,000-token context window enables comprehensive understanding of lengthy documents and codebases, with real-world testing confirming effective processing of nearly 2,000-line codebases while maintaining remarkable performance.Open-Source Leadership
Released under MIT license, GLM-4.5 provides researchers and developers with access to state-of-the-art capabilities without proprietary restrictions, including base models, hybrid reasoning versions, and optimized FP8 variants.Regional Optimization
API Endpoints
- International: Uses
https://api.z.ai/api/paas/v4
- China: Uses
https://open.bigmodel.cn/api/paas/v4
Model Availability
The region setting determines both API endpoint and available models, with automatic filtering to ensure compatibility with your selected region.Special Features
Agentic Capabilities
GLM-4.5’s unified architecture makes it particularly suitable for complex intelligent agent applications requiring integrated reasoning, coding, and tool utilization capabilities.Comprehensive Benchmarking
Performance evaluation encompasses:- 3 agentic task benchmarks
- 7 reasoning benchmarks
- 2 coding benchmarks
Developer Integration
Models support integration through multiple frameworks:- transformers
- vLLM
- SGLang
Performance Comparisons
vs Claude 4 Sonnet
GLM-4.5 shows competitive performance in agentic coding and reasoning tasks, though Claude Sonnet 4 maintains advantages in coding success rates and autonomous multi-feature application development.vs GPT-4.5
GLM-4.5 ranks competitively in reasoning and agent benchmarks, with GPT-4.5 generally leading in raw task accuracy on professional benchmarks like MMLU and AIME.Tips and Notes
- Region Selection: Choose the appropriate region for optimal performance and compliance with local regulations.
- Model Selection: GLM-4.5 for maximum performance, GLM-4.5-Air for efficiency and mainstream hardware compatibility.
- Context Advantage: Large 128K context window enables processing of substantial codebases and documents.
- Open Source Benefits: MIT license enables both commercial use and secondary development.
- Agentic Applications: Particularly strong for applications requiring reasoning, coding, and tool usage integration.
- Hybrid Reasoning: Use Thinking Mode for complex problems, Non-Thinking Mode for simple queries.
- API Compatibility: OpenAI-compatible API provides streaming responses and usage reporting.
- Framework Support: Multiple integration options available for different deployment scenarios.