| title | description | published | date | tags | editor | dateCreated |
| --- | --- | --- | --- | --- | --- | --- |
| Gremlin Grimoire | Netgrimoire's local AI, the gremlin that runs the machine | true | 2026-04-12T00:00:00.000Z | gremlin, ai, ollama, n8n | markdown | 2026-04-12T00:00:00.000Z |

Gremlin Grimoire

Gremlin is the local AI layer of Netgrimoire. It's not just a chat interface — it's an autonomous agent that watches the infrastructure, audits the codebase, triages alerts, and answers questions about the lab. The gremlin lives inside the machine and knows every dark corner of it.
What Gremlin Is
Gremlin is a stack of four services running together on docker4, all pinned to the same Swarm node:
| Service | Role | URL |
| --- | --- | --- |
| Ollama | Local LLM inference (CPU-only, Ryzen) | http://ollama:11434 · ollama.netgrimoire.com:11434 |
| Open WebUI | Chat interface + RAG frontend | https://ai.netgrimoire.com |
| Qdrant | Vector database for RAG knowledge base | http://qdrant:6333 · dashboard :6333/dashboard |
| n8n | Automation brain: autonomous workflows | https://n8n.netgrimoire.com |
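
A quick way to confirm that all four services are reachable is a small probe script. The sketch below is illustrative only: the base URLs come from the table above, but the health paths (`/api/tags`, `/healthz`, `/health`) are assumptions based on each project's commonly documented endpoints and may differ by version. The internal names `ollama` and `qdrant` only resolve from a container attached to the same Swarm network.

```python
import urllib.error
import urllib.request

# Base URLs are taken from the service table; the health paths are assumptions.
HEALTH_ENDPOINTS = {
    "ollama": "http://ollama:11434/api/tags",
    "qdrant": "http://qdrant:6333/healthz",
    "open-webui": "https://ai.netgrimoire.com/health",
    "n8n": "https://n8n.netgrimoire.com/healthz",
}


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, TLS error, etc.
        return None


def check_stack(endpoints=HEALTH_ENDPOINTS, probe=http_status):
    """Map each service name to True (HTTP 200) or False (anything else)."""
    return {name: probe(url) == 200 for name, url in endpoints.items()}
```

Calling `check_stack()` from a container on the Gremlin network returns a dict such as `{"ollama": True, ...}`. The `probe` parameter exists so the logic can be exercised without touching the network.
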
What Gremlin Does Today
| Capability | Status | Workflow |
| --- | --- | --- |
| Weekly YAML audit of all compose files | ✅ Live | Forgejo Audit, Monday 06:00 |
| Uptime Kuma alert triage | ✅ Live | Kuma Triage, webhook-triggered |
| Interactive chat with lab context | ✅ Live | Open WebUI + Ollama |
| RAG over wiki/docs | 🔧 Wired, not populated | Qdrant connected, knowledge base empty |
| Doc generation from compose files | 🟡 Parked | CPU quality insufficient; awaiting GPU |
| Email triage | 📋 Planned | Phase 3, not built |
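
To make the alert triage row concrete, here is a minimal Python sketch of the same idea outside n8n: turn an Uptime Kuma webhook payload into a prompt and send it to Ollama. It is not the actual Kuma Triage workflow, which lives in n8n. The payload fields (`monitor.name`, `heartbeat.status`, `msg`) and the status codes are assumptions about Kuma's webhook format, and the prompt wording is invented for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://ollama:11434/api/generate"  # internal URL from the service table
TRIAGE_MODEL = "llama3.2:3b"                     # model listed for alert triage

# Assumption: Kuma reports 0 for down and 1 for up; other codes are treated as unknown.
STATUS_LABELS = {0: "DOWN", 1: "UP"}


def build_triage_prompt(alert):
    """Turn an Uptime Kuma webhook payload into a short triage prompt."""
    monitor = alert.get("monitor") or {}
    heartbeat = alert.get("heartbeat") or {}
    state = STATUS_LABELS.get(heartbeat.get("status"), "UNKNOWN")
    return (
        "You are triaging a homelab alert.\n"
        f"Monitor: {monitor.get('name', 'unknown')}\n"
        f"State: {state}\n"
        f"Message: {alert.get('msg', '')}\n"
        "Give a likely cause and one first check, in two sentences."
    )


def build_generate_request(prompt, model=TRIAGE_MODEL):
    """Build a non-streaming request body for Ollama's /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}


def triage(alert, url=OLLAMA_URL):
    """POST the prompt to Ollama and return the generated text."""
    body = json.dumps(build_generate_request(build_triage_prompt(alert))).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # CPU-only inference can be slow, so the timeout is generous.
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"]
```

Setting `"stream": False` makes Ollama return a single JSON object instead of a stream of chunks, which is easier to handle in a webhook-style flow.
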
Models
| Model | Size | Used For |
| --- | --- | --- |
| `qwen2.5-coder:7b` | ~5 GB | Code review, YAML audits, compose analysis |
| `llama3.2:3b` | ~2 GB | Alert triage, Q&A, summarization |

Models must be pulled before workflows run. See Ollama Model Management.
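
One way to check this precondition before enabling workflows is to compare Ollama's model list against the two models above. The sketch below assumes the `/api/tags` response has the shape `{"models": [{"name": ...}, ...]}`; the helper names are hypothetical and not part of the runbooks.

```python
import json
import urllib.request

REQUIRED_MODELS = ("qwen2.5-coder:7b", "llama3.2:3b")  # from the Models table


def installed_models(tags_response):
    """Extract model names from a decoded Ollama /api/tags response."""
    return {m["name"] for m in tags_response.get("models", [])}


def missing_models(tags_response, required=REQUIRED_MODELS):
    """Return the required models that are not yet pulled, in table order."""
    have = installed_models(tags_response)
    return [name for name in required if name not in have]


def fetch_tags(base_url="http://ollama:11434"):
    """GET /api/tags from Ollama and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return json.load(resp)
```

`missing_models(fetch_tags())` returns an empty list when both models are present. Anything it does return can be fetched with `ollama pull <name>` inside the Ollama container.
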
Sections
| Section | Contents |
| --- | --- |
| Stack | Full build config, volumes, env vars, compose YAML |
| Workflows | All n8n workflows: architecture, patterns, gotchas |
| Runbooks | Deploy, model management, troubleshooting |
Planned Evolution
- Homelable MCP backend — next up. Provides tool-use for infra Q&A (topology, running services, resource usage). Blocked until Homelable stack is deployed.
- GPU support — unlocks doc generation and larger models. Compose GPU block is commented out, ready to enable.
- Gremlin role variants — specialized personas per domain (Proxy Gremlin, Storage Gremlin, Security Gremlin, etc.) with mood states and dynamic badge serving via Caddy.
- RAG knowledge base population — index all Wiki.js pages and the compose template standard into Qdrant.
- Gremlin Router — dedicated Flask container for webhook routing (currently handled directly by n8n).
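
As a rough illustration of the RAG population item, the sketch below splits a wiki page into overlapping chunks and builds the JSON body for a Qdrant point upsert (`PUT /collections/<name>/points`). None of this is built yet, so everything here is an assumption: the chunk sizes are arbitrary, the page path is a placeholder, and `embed` is a hypothetical callable standing in for whichever embedding model is eventually chosen (this page does not name one).

```python
import uuid


def chunk_text(text, max_chars=800, overlap=100):
    """Split text into overlapping character windows."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    if not text:
        return []
    chunks, start = [], 0
    while True:
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reaches the end of the text
        start += max_chars - overlap
    return chunks


def build_points(page_path, text, embed, **chunk_opts):
    """Build a Qdrant upsert body with one point per chunk of the page.

    embed is a hypothetical callable mapping a string to a list of floats.
    """
    points = []
    for i, chunk in enumerate(chunk_text(text, **chunk_opts)):
        points.append({
            # Deterministic UUID, so re-indexing a page reuses the same point IDs.
            "id": str(uuid.uuid5(uuid.NAMESPACE_URL, f"{page_path}#{i}")),
            "vector": embed(chunk),
            "payload": {"path": page_path, "chunk": i, "text": chunk},
        })
    return {"points": points}
```

Deriving point IDs from the page path and chunk index means a re-index overwrites existing points rather than duplicating them, although chunks left over from a page that has since shrunk would still need to be deleted separately.
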