Vercel AI Gateway gives you a single API to access models from many providers. You switch by model id without swapping SDKs or juggling multiple keys. Cline integrates directly: pick a Gateway model from the dropdown, use it like any other provider, and see token and cache usage in the stream.

What you get

  • One endpoint for 100+ models with a single key
  • Automatic retries and fallbacks that you configure on the dashboard
  • Spend monitoring with requests by model, token counts, cache usage, latency percentiles, and cost
  • OpenAI-compatible surface so existing clients work
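
Because the surface is OpenAI-compatible, you can exercise the Gateway outside Cline with any OpenAI-style client. A minimal sketch in Python that builds (but does not send) a chat completion request; the base URL and env var name are assumptions, so confirm both against your dashboard:

```python
import json
import os

# Assumed Gateway endpoint; confirm the base URL in your Vercel dashboard.
GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat completion request for the Gateway."""
    api_key = os.environ.get("AI_GATEWAY_API_KEY", "")
    return {
        "url": f"{GATEWAY_BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        # Same body shape an OpenAI client would send; only the model id changes.
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_chat_request("anthropic/claude-sonnet-4", "Hello")
```

Swapping providers then means changing only the `model` string; headers, body shape, and endpoint stay identical.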

Getting an API Key

  1. Sign in at https://vercel.com
  2. Dashboard → AI Gateway → API Keys → Create key
  3. Copy the key
For more on authentication and OIDC options, see https://vercel.com/docs/ai-gateway/authentication

Configuration in Cline

  1. Open Cline settings
  2. Select Vercel AI Gateway as the API Provider
  3. Paste your Gateway API Key
  4. Pick a model from the list. Cline fetches the catalog automatically, and you can also paste an exact id
Notes:
  • Model ids usually follow the provider/model format. Copy the exact id from the catalog
    Examples:
    • openai/gpt-5
    • anthropic/claude-sonnet-4
    • google/gemini-2.5-pro
    • groq/llama-3.1-70b
    • deepseek/deepseek-v3
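
If you validate ids before sending them, splitting on the first slash recovers the provider and model halves. A small sketch of that check (the helper name is ours, not a Gateway API):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a Gateway model id of the form provider/model."""
    # partition splits on the first "/" only, so model names may contain dots or dashes.
    provider, _, model = model_id.partition("/")
    if not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model

split_model_id("anthropic/claude-sonnet-4")  # → ("anthropic", "claude-sonnet-4")
```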

Observability you can act on

[Screenshot: the Vercel AI Gateway observability dashboard, showing requests by model, tokens, cache usage, latency, and cost.]
What to watch:
  • Requests by model - confirm routing and adoption
  • Tokens - input vs output, including reasoning if exposed
  • Cache - cached input and cache creation tokens
  • Latency - p75 duration and p75 time to first token
  • Cost - per project and per model
Use it to:
  • Compare output tokens per request before and after a model change
  • Validate cache strategy by tracking cached reads and cache creation tokens
  • Catch TTFT regressions during experiments
  • Align budgets with real usage
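
To reproduce the dashboard's p75 numbers from your own request logs, a nearest-rank percentile over latency samples is enough. A sketch in pure Python; the sample values are made up:

```python
import math

def p75(samples_ms: list[float]) -> float:
    """Nearest-rank 75th percentile, usable for both duration and TTFT samples."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = math.ceil(0.75 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# Hypothetical time-to-first-token samples in milliseconds.
ttft = [120, 340, 95, 410, 205, 180, 260, 300]
p75(ttft)  # → 300
```

Tracking this per model before and after a change is the quickest way to catch a TTFT regression.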

Supported models

The gateway supports a large and changing set of models. Cline pulls the list from the Gateway API and caches it locally. For the current catalog, see https://vercel.com/ai-gateway/models

Tips

Use separate gateway keys per environment (dev, staging, prod). It keeps dashboards clean and budgets isolated.
Pricing is pass-through at provider list prices, and bring-your-own-key usage has 0% markup. You still pay provider and processing fees.
Vercel does not add rate limits. Upstream providers may. New accounts receive $5 credits every 30 days until the first payment.
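
The per-environment key tip can be wired up with a small lookup. The env var names below are our assumptions, not a Cline or Vercel convention; adapt them to your secrets setup:

```python
import os

# Assumed env var names — one Gateway key per environment keeps spend isolated.
KEY_VARS = {
    "dev": "AI_GATEWAY_KEY_DEV",
    "staging": "AI_GATEWAY_KEY_STAGING",
    "prod": "AI_GATEWAY_KEY_PROD",
}

def gateway_key(environment: str) -> str:
    """Pick the Gateway key for the given environment."""
    var = KEY_VARS.get(environment)
    if var is None:
        raise ValueError(f"unknown environment: {environment!r}")
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set")
    return key
```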

Troubleshooting

  • 401 - make sure the Gateway key is sent to the Gateway endpoint, not to an upstream provider URL
  • 404 model - copy the exact id from the Vercel catalog
  • Slow first token - check p75 TTFT in the dashboard and try a model optimized for streaming
  • Cost spikes - break down by model in the dashboard and cap or route traffic

Inspiration

  • Multi-model evals - swap only the model id in Cline and compare latency and output tokens
  • Progressive rollout - route a small percentage to a new model in the dashboard and ramp with metrics
  • Budget enforcement - set per-project limits without code changes

Useful links

  • OpenAI-Compatible setup: /provider-config/openai-compatible
  • Model Selection Guide: /getting-started/model-selection-guide
  • Understanding Context Management: /getting-started/understanding-context-management