---
title: Gremlin Grimoire
description: Netgrimoire's local AI — the gremlin that runs the machine
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, ai, ollama, n8n
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Gremlin Grimoire

![gremlin-badge](/images/gremlin-badge.png)

Gremlin is the local AI layer of Netgrimoire. It's not just a chat interface — it's an autonomous agent that watches the infrastructure, audits the codebase, triages alerts, and answers questions about the lab. The gremlin lives inside the machine and knows every dark corner of it.

---

## What Gremlin Is

Gremlin is a stack of four services, all pinned to the same Swarm node, `docker4`:

| Service | Role | URL |
|---------|------|-----|
| **Ollama** | Local LLM inference (CPU-only, Ryzen) | `http://ollama:11434` · `ollama.netgrimoire.com:11434` |
| **Open WebUI** | Chat interface + RAG frontend | `https://ai.netgrimoire.com` |
| **Qdrant** | Vector database for RAG knowledge base | `http://qdrant:6333` · dashboard `:6333/dashboard` |
| **n8n** | Automation brain — autonomous workflows | `https://n8n.netgrimoire.com` |

---

## What Gremlin Does Today

| Capability | Status | Workflow |
|-----------|--------|---------|
| Weekly YAML audit of all compose files | ✅ Live | Forgejo Audit — Monday 06:00 |
| Uptime Kuma alert triage | ✅ Live | Kuma Triage — webhook-triggered |
| Interactive chat with lab context | ✅ Live | Open WebUI + Ollama |
| RAG over wiki/docs | 🔧 Wired, not populated | Qdrant connected, knowledge base empty |
| Doc generation from compose files | 🟡 Parked | CPU quality insufficient — awaiting GPU |
| Email triage | 📋 Planned | Phase 3 — not built |
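
The live Kuma Triage workflow runs in n8n, so the following is only a minimal Python sketch of what one triage call to Ollama might look like. It uses Ollama's `/api/generate` endpoint and the `llama3.2:3b` model that this page lists for alert triage. The alert fields (`monitor`, `status`, `msg`) and the prompt wording are assumptions for illustration, not the actual webhook payload or the workflow's prompt.

```python
import json
import urllib.request

OLLAMA_URL = "http://ollama:11434"  # internal service URL from the stack table
TRIAGE_MODEL = "llama3.2:3b"        # model this page lists for alert triage


def build_triage_request(alert: dict) -> dict:
    """Turn an alert dict into a request body for Ollama's /api/generate."""
    # Assumed (hypothetical) alert fields: "monitor", "status", "msg".
    # Adjust these to whatever the real webhook actually sends.
    prompt = (
        "You are triaging a homelab monitoring alert.\n"
        f"Monitor: {alert.get('monitor', 'unknown')}\n"
        f"Status: {alert.get('status', 'unknown')}\n"
        f"Message: {alert.get('msg', '')}\n"
        "Reply with a severity (low, medium, high) and one likely cause."
    )
    # stream=False asks Ollama for a single JSON object instead of a stream.
    return {"model": TRIAGE_MODEL, "prompt": prompt, "stream": False}


def triage(alert: dict, base_url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send the alert to Ollama and return the model's text reply."""
    body = json.dumps(build_triage_request(alert)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `triage({"monitor": "caddy", "status": "down", "msg": "timeout"})` from a container on the same network would return the model's free-text assessment. The generous timeout reflects CPU-only inference, which can be slow.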

---

## Models

| Model | Size | Used For |
|-------|------|---------|
| `qwen2.5-coder:7b` | ~5 GB | Code review, YAML audits, compose analysis |
| `llama3.2:3b` | ~2 GB | Alert triage, Q&A, summarization |

Models must be pulled before workflows run. See [Ollama Model Management](/Gremlin-Grimoire/Runbooks/Model-Management).
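
As a rough pre-flight check, a script could ask Ollama which models are installed and compare the result against the two models above. This sketch assumes the internal `http://ollama:11434` URL is reachable from where it runs and relies on Ollama's `/api/tags` endpoint, which lists local models. The function names are hypothetical.

```python
import json
import urllib.request

# The two models this page lists as required by the workflows.
REQUIRED_MODELS = ["qwen2.5-coder:7b", "llama3.2:3b"]


def missing_models(tags_response: dict, required: list[str]) -> list[str]:
    """Return the required model names absent from an /api/tags response."""
    installed = {model.get("name") for model in tags_response.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url: str = "http://ollama:11434", timeout: float = 10.0) -> dict:
    """Fetch the list of locally installed models from Ollama."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return json.loads(response.read())


def check_models() -> list[str]:
    """Convenience wrapper: query Ollama and report what still needs pulling."""
    return missing_models(fetch_tags(), REQUIRED_MODELS)
```

An empty list from `check_models()` means both models are present. Anything else names what still needs to be pulled before the workflows run.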

---

## Sections

| Section | Contents |
|---------|----------|
| [Stack](/Gremlin-Grimoire/Stack/Build-Config) | Full build config, volumes, env vars, compose YAML |
| [Workflows](/Gremlin-Grimoire/Workflows/Forgejo-Audit) | All n8n workflows — architecture, patterns, gotchas |
| [Runbooks](/Gremlin-Grimoire/Runbooks/Deploy) | Deploy, model management, troubleshooting |

---

## Planned Evolution

- **Homelable MCP backend**: next up. Provides tool use for infra Q&A (topology, running services, resource usage). Blocked until the Homelable stack is deployed.
- **GPU support**: unlocks doc generation and larger models. The GPU block in the compose file is commented out and ready to enable.
- **Gremlin role variants**: specialized personas per domain (Proxy Gremlin, Storage Gremlin, Security Gremlin, etc.) with mood states and dynamic badge serving via Caddy.
- **RAG knowledge base population**: index all Wiki.js pages and the compose template standard into Qdrant.
- **Gremlin Router**: a dedicated Flask container for webhook routing (currently handled directly by n8n).
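
The RAG population item is not built yet, so the following is only a sketch of one possible shape for it: split a page into overlapping chunks and build the JSON body for a Qdrant points upsert (`PUT /collections/<name>/points`). The chunk sizes, payload fields, and function names are assumptions. The embedding function is passed in as a parameter because this page does not name an embedding model.

```python
import uuid
from typing import Callable


def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    if not text:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this chunk already reached the end of the text
        start += max_chars - overlap
    return chunks


def build_points(
    page_path: str,
    text: str,
    embed: Callable[[str], list[float]],
) -> dict:
    """Build a Qdrant upsert body with one point per chunk of the page."""
    points = []
    for index, chunk in enumerate(chunk_text(text)):
        # Deterministic UUID per (page, chunk index): re-indexing the same page
        # overwrites points with the same index instead of duplicating them.
        point_id = str(uuid.uuid5(uuid.NAMESPACE_URL, f"{page_path}#{index}"))
        points.append({
            "id": point_id,
            "vector": embed(chunk),
            "payload": {"path": page_path, "chunk": index, "text": chunk},
        })
    return {"points": points}
```

Character-based chunking is the simplest option and ignores markdown structure. Splitting on headings would likely give better retrieval, at the cost of more parsing.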