How-to: Run a Local LLM with Ollama

Running a local language model is easier than ever thanks to Ollama — a lightweight engine that downloads, loads, and serves LLMs directly on your machine. This guide shows you how to install Ollama from the official website, pull your first model, run it locally, and interact with it through the API. Clean, private, and fully offline once your models are downloaded.

1) Install Ollama (Official Website Method)

  1. Go to the official download page:
    https://ollama.com/download

  2. Choose your OS:

    • Windows: download and run the installer (OllamaSetup.exe).

    • macOS: download the .dmg and drag Ollama into Applications.

    • Linux: run the one-line install script shown on the page: curl -fsSL https://ollama.com/install.sh | sh

  3. After installing, restart your terminal and verify:

ollama --version

If the command prints a version, you’re ready to roll.

2) Pull a Model (Example: gpt-oss:120b)

Ollama downloads a model with a single command:

ollama pull gpt-oss:120b

You can replace gpt-oss:120b with any model listed in the Ollama Model Library. Note that gpt-oss:120b is a very large model; if your hardware is limited, pick a smaller one.

Once downloaded, the model is stored locally and can be used offline.

3) Run the Model Locally

To start an interactive session:

ollama run gpt-oss:120b

You’ll enter a prompt session where you can talk to the model directly.

Example:

>>> Explain quantum computing in one short paragraph.

Exit with /bye (or Ctrl + D).
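Besides the interactive prompt, Ollama also serves a local REST API (by default on port 11434), which is what the intro means by interacting through the API. A minimal Python sketch against the /api/generate endpoint, assuming the Ollama server is running and gpt-oss:120b has been pulled:

```python
import json
from urllib import request

# Default local endpoint for single-prompt completions.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> request.Request:
    # Ollama expects a JSON body; stream=False returns one complete JSON response
    # instead of a stream of partial chunks.
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )

def ask(model: str, prompt: str) -> str:
    # Requires the Ollama server to be running locally.
    with request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, print(ask("gpt-oss:120b", "Explain quantum computing in one short paragraph.")) returns the model's reply as a plain string.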

4) Create Your Own Custom Model (Optional)

Ollama supports custom model definitions with a Modelfile.

Example:

FROM gpt-oss:120b

SYSTEM """
You are a concise cybersecurity assistant.
"""

Build it:

ollama create cyber-gpt -f Modelfile

Use it:

ollama run cyber-gpt
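The custom model is reachable through the same local API, this time via the chat endpoint, which takes a list of messages rather than a single prompt. A sketch assuming the Ollama server is running and cyber-gpt has been created as above:

```python
import json
from urllib import request

# Default local endpoint for multi-turn chat completions.
CHAT_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, user_message: str) -> request.Request:
    # The chat endpoint takes a message list; stream=False returns one JSON object.
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }).encode()
    return request.Request(
        CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(model: str, user_message: str) -> str:
    # Requires a running Ollama server with the model available locally.
    with request.urlopen(build_chat_request(model, user_message)) as resp:
        return json.loads(resp.read())["message"]["content"]
```

With the server up, print(chat("cyber-gpt", "What is port scanning?")) answers in the persona set by the Modelfile's SYSTEM prompt.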

Conclusion

With Ollama installed, you now have a fully local, private LLM environment ready for experiments, development, and cybersecurity research. Pull a model, run it, and start building your own tools on top of it.
