Running a local language model is easier than ever thanks to Ollama — a lightweight engine that downloads, loads, and serves LLMs directly on your machine. This guide shows you how to install Ollama from the official website, pull your first model, run it locally, and interact with it through the API. Clean, private, and fully offline once installed.
1) Install Ollama (Official Website Method)
Go to the official download page:
https://ollama.com/download

Choose your OS:
- Windows: download and run the installer (`OllamaSetup.exe`).
- macOS: download the `.dmg` and drag Ollama into Applications.
- Linux: download the `.deb` or `.rpm` package and install it with your package manager.
After installing, restart your terminal and verify:
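```shell
# Print the installed Ollama version to confirm the install worked.
ollama --version
```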
If the command prints a version, you’re ready to roll.
2) Pull a Model (Example: gpt-oss:120b)
Ollama uses a simple command to download LLMs:
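```shell
# Download the model from the heading above; note that gpt-oss:120b
# is very large, so smaller models are a gentler first pull.
ollama pull gpt-oss:120b
```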
You can replace `gpt-oss:120b` with any model listed in the Ollama Model Library.
Once downloaded, the model is stored locally and can be used offline.
3) Run the Model Locally
To start an interactive session:
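```shell
# Starts an interactive chat with the model pulled in step 2.
ollama run gpt-oss:120b
```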
You’ll enter a prompt session where you can talk to the model directly.
Example:
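A sample exchange (the model's reply here is illustrative, not verbatim output):

```
>>> Why is the sky blue?
Sunlight is scattered by molecules in the atmosphere, and shorter blue
wavelengths scatter the most, so the sky looks blue.
```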
Exit with `/bye` (or Ctrl + D).
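Interactive sessions aren't the only way in: while the Ollama service is running, the same model is available over its local HTTP API on port 11434. A minimal non-streaming call to the `/api/generate` endpoint (the prompt is just a placeholder):

```shell
# Request a single completion from the local Ollama API.
curl http://localhost:11434/api/generate -d '{
  "model": "gpt-oss:120b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```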
4) Create Your Own Custom Model (Optional)
Ollama supports custom model definitions with a Modelfile.
Example:
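A minimal `Modelfile` (the parameter value and system prompt below are placeholders to adapt):

```
# Base the custom model on an already-pulled model.
FROM gpt-oss:120b

# Sampling temperature: lower is more deterministic, higher more creative.
PARAMETER temperature 0.7

# System prompt baked into the custom model.
SYSTEM "You are a concise assistant that answers in plain English."
```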
Build it:
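Here `my-assistant` is a name of your choosing, and `-f` points at the Modelfile:

```shell
# Register the custom model defined in ./Modelfile under a new name.
ollama create my-assistant -f Modelfile
```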
Use it:
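```shell
# Chat with the custom model exactly like any pulled model
# (assuming it was created as "my-assistant" above).
ollama run my-assistant
```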
