Learn how to configure and use Groq’s lightning-fast inference with Cline. Access models from OpenAI, Meta, DeepSeek, and more on Groq’s purpose-built LPU architecture.
- `llama-3.3-70b-versatile` (Meta) - Balanced performance with 131K context
- `llama-3.1-8b-instant` (Meta) - Fast inference with 131K context
- `openai/gpt-oss-120b` (OpenAI) - Featured flagship model with 131K context
- `openai/gpt-oss-20b` (OpenAI) - Featured compact model with 131K context
- `moonshotai/kimi-k2-instruct` (Moonshot AI) - 1 trillion parameter model with prompt caching
- `deepseek-r1-distill-llama-70b` (DeepSeek/Meta) - Reasoning-optimized model
- `qwen/qwen3-32b` (Alibaba Cloud) - Enhanced for Q&A tasks
- `meta-llama/llama-4-maverick-17b-128e-instruct` (Meta) - Latest Llama 4 variant
- `meta-llama/llama-4-scout-17b-16e-instruct` (Meta) - Latest Llama 4 variant
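Under the hood, Groq serves these models through an OpenAI-compatible chat-completions endpoint, which is what Cline talks to once a model ID from the list above is configured. As a minimal sketch (the URL and `build_request` helper are illustrative; only the model IDs come from the list above), a request body looks like this:

```python
import json

# Assumption: Groq's OpenAI-compatible chat-completions endpoint.
# Check Groq's API docs for the current base URL and auth header.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build the JSON body for an OpenAI-style chat-completion call.

    `model` is one of the Groq model IDs listed above, passed through
    verbatim (e.g. "llama-3.3-70b-versatile").
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("llama-3.3-70b-versatile",
                        "Explain LPUs in one sentence.")
print(json.dumps(payload, indent=2))
```

POSTing this body to the endpoint with an `Authorization: Bearer <GROQ_API_KEY>` header returns a standard chat-completion response, so any OpenAI-compatible client can be pointed at Groq by swapping the base URL.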