---
title: Ollama Model Management
description: Pulling, verifying, and managing models on the Gremlin stack
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, ollama, models, runbook
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Ollama Model Management

## Pull Required Models

Run on docker4 after any fresh deploy or after the Ollama container is recreated:

```bash
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull llama3.2:3b
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull qwen2.5-coder:7b
```
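
The two commands above fail with a confusing error if the container is not running, because the `$(docker ps ...)` substitution expands to nothing. A minimal sketch of a guarded version follows. The function name `pull_required_models` and the variable `REQUIRED_MODELS` are hypothetical names introduced for illustration; the container filter and model names come from this runbook.

```bash
# Required models, taken from the Model Reference table (names contain no spaces).
REQUIRED_MODELS="llama3.2:3b qwen2.5-coder:7b"

# Pulls every required model into the running Ollama container.
# Returns non-zero if no container is found or any pull fails.
pull_required_models() {
  local ctr model
  # Take the first match in case more than one container matches the filter.
  ctr="$(docker ps -qf name=gremlin_ollama | head -n 1)"
  if [ -z "$ctr" ]; then
    echo "No running container matches name=gremlin_ollama" >&2
    return 1
  fi
  for model in $REQUIRED_MODELS; do
    docker exec "$ctr" ollama pull "$model" || return 1
  done
}
```

After defining the function in a shell on docker4, run `pull_required_models`. It stops at the first failed pull so a partial deploy is not mistaken for a complete one.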

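## Verify Models Loaded

Both required models should appear in the output. Note that `ollama list` shows models downloaded to local storage, not models currently loaded in memory.
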
```bash
docker exec $(docker ps -qf name=gremlin_ollama) ollama list
```
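
Reading the listing by eye is easy to get wrong, so the check can be scripted. The sketch below assumes the usual `ollama list` layout (a header row, then one model per line with the name in the first column). `missing_models` and `REQUIRED_MODELS` are hypothetical names, not part of Ollama or this stack.

```bash
# Required models, taken from the Model Reference table (names contain no spaces).
REQUIRED_MODELS="llama3.2:3b qwen2.5-coder:7b"

# Reads `ollama list` output on stdin and prints each required model that is absent.
# Returns non-zero if anything is missing.
missing_models() {
  local names model rc
  rc=0
  # Column 1 of `ollama list` is the model name; row 1 is the header.
  names="$(awk 'NR > 1 { print $1 }')"
  for model in $REQUIRED_MODELS; do
    if ! printf '%s\n' "$names" | grep -qxF "$model"; then
      echo "$model"
      rc=1
    fi
  done
  return "$rc"
}
```

To use it, pipe the listing into the function: `docker exec $(docker ps -qf name=gremlin_ollama) ollama list | missing_models`. No output and a zero exit status mean both models are present.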

## Model Reference

| Model | Size | Pull Time (CPU) | Used By |
|-------|------|----------------|---------|
| `llama3.2:3b` | ~2 GB | ~5 min | Kuma triage, Open WebUI |
| `qwen2.5-coder:7b` | ~5 GB | ~15 min | Forgejo audit, Open WebUI |

## Models Storage Path

`/DockerVol/ollama`. Models stored here survive container restarts and redeployments.
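
Since the two models total roughly 7 GB according to the table above, it can be worth confirming free space on the storage path before pulling. This is a minimal sketch: `has_free_space` is a hypothetical helper, and the 8 GB threshold in the usage note is an assumed margin, not a documented requirement.

```bash
# Succeeds if the filesystem holding PATH has at least REQUIRED_GB gigabytes free.
# Usage: has_free_space PATH REQUIRED_GB
has_free_space() {
  local path required_gb avail_kb
  path="$1"
  required_gb="$2"
  # df -Pk prints POSIX-format output in 1 KiB blocks; column 4 is "Available".
  avail_kb="$(df -Pk "$path" | awk 'NR == 2 { print $4 }')"
  [ "$avail_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}
```

For example: `has_free_space /DockerVol/ollama 8 || echo "Low disk space on model volume" >&2`.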

## ⚠ Pull Before Workflows Run

n8n workflows can fail silently when a required model is missing: Ollama returns a model-not-found error, but n8n may not surface it as an obvious failure. Always pull the models immediately after a deploy, before enabling any workflows.
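
One way to enforce this is a pre-flight check against the same HTTP API that n8n talks to. The sketch below queries Ollama's `GET /api/tags` endpoint, which lists locally available models, and reports any that are absent. The function names are hypothetical, and the base URL is an assumption: 11434 is Ollama's default port, but whether it is reachable from where you run the check depends on how the stack publishes it.

```bash
# Reads an /api/tags JSON body on stdin; succeeds if a model with this exact name is listed.
tags_has_model() {
  # Strip whitespace so the match works on compact or pretty-printed JSON.
  tr -d '[:space:]' | grep -qF "\"name\":\"$1\""
}

# Usage: check_models_via_api BASE_URL MODEL...
# Returns 0 if all models are listed, 1 if any is missing, 2 if the API is unreachable.
check_models_via_api() {
  local base_url body model rc
  base_url="$1"
  shift
  rc=0
  if ! body="$(curl -fsS "$base_url/api/tags")"; then
    echo "Ollama API unreachable at $base_url" >&2
    return 2
  fi
  for model in "$@"; do
    if ! printf '%s' "$body" | tags_has_model "$model"; then
      echo "MISSING: $model" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `check_models_via_api http://localhost:11434 llama3.2:3b qwen2.5-coder:7b` (URL assumed) exits non-zero if either model is missing, so it can gate whatever step enables the workflows.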