Vercel AI Gateway gives you a single API to access models from many providers. You switch by model id without swapping SDKs or juggling multiple keys. Cline integrates directly: pick a Gateway model from the dropdown, use it like any other provider, and see token and cache usage in the stream.

What you get

  • One endpoint for 100+ models with a single key
  • Automatic retries and fallbacks that you configure on the dashboard
  • Spend monitoring with requests by model, token counts, cache usage, latency percentiles, and cost
  • OpenAI-compatible surface so existing clients work
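
Because the surface is OpenAI-compatible, you can exercise the Gateway outside Cline with any OpenAI-style client. A minimal sketch in Python that builds (but does not send) a chat completion request; the base URL and env var name are assumptions, so confirm both against your dashboard:

```python
import json
import os

# Assumed Gateway endpoint; confirm the base URL in your Vercel dashboard.
GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat completion request for the Gateway."""
    api_key = os.environ.get("AI_GATEWAY_API_KEY", "")
    return {
        "url": f"{GATEWAY_BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        # Same body shape an OpenAI client would send; only the model id changes.
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_chat_request("anthropic/claude-sonnet-4", "Hello")
```

Swapping providers then means changing only the `model` string; headers, body shape, and endpoint stay identical.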

Getting an API Key

  1. Sign in at https://vercel.com
  2. Dashboard → AI Gateway → API Keys → Create key
  3. Copy the key
For more on authentication and OIDC options, see https://vercel.com/docs/ai-gateway/authentication

Configuration in Cline

  1. Open Cline settings
  2. Select Vercel AI Gateway as the API Provider
  3. Paste your Gateway API Key
  4. Pick a model from the list. Cline fetches the catalog automatically, and you can also paste an exact id
Notes:
  • Model ids usually follow the provider/model format. Copy the exact id from the catalog
    Examples:
    • openai/gpt-5
    • anthropic/claude-sonnet-4
    • google/gemini-2.5-pro
    • groq/llama-3.1-70b
    • deepseek/deepseek-v3
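
If you validate ids before sending them, splitting on the first slash recovers the provider and model halves. A small sketch of that check (the helper name is ours, not a Gateway API):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a Gateway model id of the form provider/model."""
    # partition splits on the first "/" only, so model names may contain dots or dashes.
    provider, _, model = model_id.partition("/")
    if not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model

split_model_id("anthropic/claude-sonnet-4")  # → ("anthropic", "claude-sonnet-4")
```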

Observability you can act on

[Screenshot: the Vercel AI Gateway observability dashboard, showing requests by model, tokens, cache usage, latency, and cost.]
What to watch:
  • Requests by model - confirm routing and adoption
  • Tokens - input vs output, including reasoning if exposed
  • Cache - cached input and cache creation tokens
  • Latency - p75 duration and p75 time to first token
  • Cost - per project and per model
Use it to:
  • Compare output tokens per request before and after a model change
  • Validate cache strategy by tracking cached reads and cache creation tokens
  • Catch TTFT regressions during experiments
  • Align budgets with real usage
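
To reproduce the dashboard's p75 numbers from your own request logs, a nearest-rank percentile over latency samples is enough. A sketch in pure Python; the sample values are made up:

```python
import math

def p75(samples_ms: list[float]) -> float:
    """Nearest-rank 75th percentile, usable for both duration and TTFT samples."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = math.ceil(0.75 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# Hypothetical time-to-first-token samples in milliseconds.
ttft = [120, 340, 95, 410, 205, 180, 260, 300]
p75(ttft)  # → 300
```

Tracking this per model before and after a change is the quickest way to catch a TTFT regression.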

Supported models

The gateway supports a large and changing set of models. Cline pulls the list from the Gateway API and caches it locally. For the current catalog, see https://vercel.com/ai-gateway/models

Tips

Use separate gateway keys per environment (dev, staging, prod). It keeps dashboards clean and budgets isolated.
Pricing is pass-through at provider list prices, and bring-your-own-key usage has 0% markup. You still pay provider and processing fees.
Vercel does not add rate limits. Upstream providers may. New accounts receive $5 credits every 30 days until the first payment.
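
The per-environment key tip can be wired up with a small lookup. The env var names below are our assumptions, not a Cline or Vercel convention; adapt them to your secrets setup:

```python
import os

# Assumed env var names — one Gateway key per environment keeps spend isolated.
KEY_VARS = {
    "dev": "AI_GATEWAY_KEY_DEV",
    "staging": "AI_GATEWAY_KEY_STAGING",
    "prod": "AI_GATEWAY_KEY_PROD",
}

def gateway_key(environment: str) -> str:
    """Pick the Gateway key for the given environment."""
    var = KEY_VARS.get(environment)
    if var is None:
        raise ValueError(f"unknown environment: {environment!r}")
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set")
    return key
```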

Troubleshooting

  • 401 - make sure the Gateway key is sent to the Gateway endpoint, not to an upstream provider URL
  • 404 model - copy the exact id from the Vercel catalog
  • Slow first token - check p75 TTFT in the dashboard and try a model optimized for streaming
  • Cost spikes - break down by model in the dashboard and cap or route traffic

Inspiration

  • Multi-model evals - swap only the model id in Cline and compare latency and output tokens
  • Progressive rollout - route a small percentage to a new model in the dashboard and ramp with metrics
  • Budget enforcement - set per-project limits without code changes

Useful links

  • OpenAI-Compatible setup: /provider-config/openai-compatible
  • Model Selection Guide: /getting-started/model-selection-guide
  • Understanding Context Management: /getting-started/understanding-context-management