Ollama
Cline supports running models locally using Ollama. This approach offers privacy, offline access, and potentially reduced costs, but it requires some initial setup and a sufficiently powerful computer. Given the current state of consumer hardware, using Ollama with Cline is not recommended for most users: on an average hardware configuration, performance will likely be poor.
Website: https://ollama.com/
Setting up Ollama
- Download and Install Ollama: Obtain the Ollama installer for your operating system from the Ollama website and follow their installation guide. Ensure Ollama is running. You can typically start it with:
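```
ollama serve
```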
- Download a Model: Ollama supports a wide variety of models. A list of available models can be found on the Ollama model library. Some models recommended for coding tasks include:
- codellama:7b-code (a good, smaller starting point)
- codellama:13b-code (offers better quality, larger size)
- codellama:34b-code (provides even higher quality, very large)
- qwen2.5-coder:32b
- mistralai/Mistral-7B-Instruct-v0.1 (a solid general-purpose model)
- deepseek-coder:6.7b-base (effective for coding)
- llama3:8b-instruct-q5_1 (suitable for general tasks)
To download a model, open your terminal and execute:
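```
ollama pull <model-name>
```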
For instance:
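```
ollama pull qwen2.5-coder:32b
```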
- Configure the Model’s Context Window: By default, Ollama models often use a context window of 2048 tokens, which can be insufficient for many Cline requests. A minimum of 12,000 tokens is advisable for decent results, with 32,000 tokens being ideal. To adjust this, you’ll modify the model’s parameters and save it as a new version.
First, load the model (using qwen2.5-coder:32b as an example):
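```
ollama run qwen2.5-coder:32b
```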
Once the model is loaded within the Ollama interactive session, set the context size parameter:
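```
/set parameter num_ctx 32768
```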
Then, save this configured model with a new name:
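```
/save your_custom_model_name
```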
(Replace your_custom_model_name with a name of your choice.)
- Configure Cline:
- Open the Cline sidebar (usually indicated by the Cline icon).
- Click the settings gear icon (⚙️).
- Select “ollama” as the API Provider.
- Enter the Model name you saved in the previous step (e.g., your_custom_model_name).
- (Optional) Adjust the base URL if Ollama is running on a different machine or port. The default is http://localhost:11434 (see the connectivity check after this list).
- (Optional) Configure the Model context size in Cline’s Advanced settings. This helps Cline manage its context window effectively with your customized Ollama model.
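If you point Cline at a non-default base URL, it can help to confirm the Ollama server is reachable from the machine running Cline before changing the setting. A minimal check (the URL shown assumes the default local address; substitute your own host and port):
```
curl http://localhost:11434/api/tags
```
A JSON response listing your downloaded models indicates the server is up and the base URL is correct.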
Tips and Notes
- Resource Demands: Running large language models locally can be demanding on system resources. Ensure your computer meets the requirements for your chosen model.
- Model Choice: Experiment with various models to discover which best fits your specific tasks and preferences.
- Offline Capability: After downloading a model, you can use Cline with that model even without an internet connection.
- Token Usage Tracking: Cline tracks token usage for models accessed via Ollama, allowing you to monitor consumption.
- Ollama’s Own Documentation: For more detailed information, consult the official Ollama documentation.