New Grimoire

2026-04-12 09:53:51 -05:00 · cc574f8aed
commit cc574f8aed
parent 77d589a13d
157 changed files with 29420 additions and 0 deletions
--- a/Gremlin-Grimoire/Overview.md
+++ b/Gremlin-Grimoire/Overview.md
@@ -0,0 +1,72 @@
+---
+title: Gremlin Grimoire
+description: Netgrimoire's local AI — the gremlin that runs the machine
+published: true
+date: 2026-04-12T00:00:00.000Z
+tags: gremlin, ai, ollama, n8n
+editor: markdown
+dateCreated: 2026-04-12T00:00:00.000Z
+---
+
+# Gremlin Grimoire
+
+![gremlin-badge](/images/gremlin-badge.png)
+
+Gremlin is the local AI layer of Netgrimoire. It's not just a chat interface — it's an autonomous agent that watches the infrastructure, audits the codebase, triages alerts, and answers questions about the lab. The gremlin lives inside the machine and knows every dark corner of it.
+
+---
+
+## What Gremlin Is
+
+Gremlin is a stack of four services, all pinned to the same Swarm node, `docker4`:
+
+| Service | Role | URL |
+|---------|------|-----|
+| **Ollama** | Local LLM inference (CPU-only, Ryzen) | `http://ollama:11434` · `ollama.netgrimoire.com:11434` |
+| **Open WebUI** | Chat interface + RAG frontend | `https://ai.netgrimoire.com` |
+| **Qdrant** | Vector database for RAG knowledge base | `http://qdrant:6333` · dashboard `:6333/dashboard` |
+| **n8n** | Automation brain — autonomous workflows | `https://n8n.netgrimoire.com` |
+
+---
+
+## What Gremlin Does Today
+
+| Capability | Status | Workflow |
+|-----------|--------|---------|
+| Weekly YAML audit of all compose files | ✅ Live | Forgejo Audit — Monday 06:00 |
+| Uptime Kuma alert triage | ✅ Live | Kuma Triage — webhook-triggered |
+| Interactive chat with lab context | ✅ Live | Open WebUI + Ollama |
+| RAG over wiki/docs | 🔧 Wired, not populated | Qdrant connected, knowledge base empty |
+| Doc generation from compose files | 🟡 Parked | Output quality on CPU-only inference is insufficient; awaiting GPU |
+| Email triage | 📋 Planned | Phase 3 — not built |
+
+---
+
+## Models
+
+| Model | Size | Used For |
+|-------|------|---------|
+| `qwen2.5-coder:7b` | ~5 GB | Code review, YAML audits, compose analysis |
+| `llama3.2:3b` | ~2 GB | Alert triage, Q&A, summarization |
+
+Models must be pulled before workflows run. See [Ollama Model Management](/Gremlin-Grimoire/Runbooks/Model-Management).
+
+---
+
+## Sections
+
+| Section | Contents |
+|---|---|
+| [Stack](/Gremlin-Grimoire/Stack/Build-Config) | Full build config, volumes, env vars, compose YAML |
+| [Workflows](/Gremlin-Grimoire/Workflows/Forgejo-Audit) | All n8n workflows — architecture, patterns, gotchas |
+| [Runbooks](/Gremlin-Grimoire/Runbooks/Deploy) | Deploy, model management, troubleshooting |
+
+---
+
+## Planned Evolution
+
+- **Homelable MCP backend** — next up. Provides tool-use for infra Q&A (topology, running services, resource usage). Blocked until Homelable stack is deployed.
+- **GPU support** — unlocks doc generation and larger models. Compose GPU block is commented out, ready to enable.
+- **Gremlin role variants** — specialized personas per domain (Proxy Gremlin, Storage Gremlin, Security Gremlin, etc.) with mood states and dynamic badge serving via Caddy.
+- **RAG knowledge base population** — index all Wiki.js pages and the compose template standard into Qdrant.
+- **Gremlin Router** — dedicated Flask container for webhook routing (currently handled directly by n8n).
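
Note on the Models section of `Overview.md`: the page says models must be pulled before workflows run. A minimal preflight sketch in Python is shown here, assuming Ollama's `GET /api/tags` endpoint (which lists locally available models) is reachable at the internal URL from the services table. The helper names `missing_models`, `fetch_tags`, and `check_ollama` are hypothetical and are not part of the committed files.

```python
import json
import urllib.request

# Model tags copied from the Models table in Overview.md.
REQUIRED_MODELS = ("qwen2.5-coder:7b", "llama3.2:3b")

# Internal Ollama URL from the services table (assumes the caller is on
# the same Swarm overlay network; adjust otherwise).
OLLAMA_URL = "http://ollama:11434"


def missing_models(tags_response, required=REQUIRED_MODELS):
    """Return the required tags that are absent from an /api/tags response."""
    installed = {m.get("name") for m in tags_response.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url=OLLAMA_URL, timeout=10):
    """Fetch the locally available models from Ollama's GET /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def check_ollama(base_url=OLLAMA_URL):
    """Raise RuntimeError if any required model has not been pulled yet."""
    missing = missing_models(fetch_tags(base_url))
    if missing:
        raise RuntimeError("Missing Ollama models: " + ", ".join(missing))
```

Calling `check_ollama()` at the start of a workflow, or from a one-off script on `docker4`, would fail fast when a model is absent. The comparison logic lives in `missing_models` so it can be exercised without a running Ollama instance.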