---
title: Gremlin Grimoire
description: Netgrimoire's local AI — the gremlin that runs the machine
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, ai, ollama, n8n
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Gremlin Grimoire


Gremlin is the local AI layer of Netgrimoire. It's not just a chat interface — it's an autonomous agent that watches the infrastructure, audits the codebase, triages alerts, and answers questions about the lab. The gremlin lives inside the machine and knows every dark corner of it.


## What Gremlin Is

Gremlin is a stack of four services running together on docker4, all pinned to the same Swarm node:

| Service | Role | URL |
|---------|------|-----|
| Ollama | Local LLM inference (CPU-only, Ryzen) | `http://ollama:11434` · `ollama.netgrimoire.com:11434` |
| Open WebUI | Chat interface + RAG frontend | `https://ai.netgrimoire.com` |
| Qdrant | Vector database for RAG knowledge base | `http://qdrant:6333` · dashboard at `:6333/dashboard` |
| n8n | Automation brain — autonomous workflows | `https://n8n.netgrimoire.com` |
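Pinning all four services to the same Swarm node keeps Ollama's model files and Qdrant's vectors on local storage. A minimal sketch of how that pinning looks in a stack file — this is an illustration, not the actual compose config (node hostname `docker4` matches the text above, but image tags and volume names here are assumptions):

```yaml
# Hypothetical excerpt — the real build config lives in the Stack page.
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-models:/root/.ollama
    deploy:
      placement:
        constraints:
          - node.hostname == docker4   # pin to the same node as the rest of the stack
  qdrant:
    image: qdrant/qdrant:latest
    volumes:
      - qdrant-data:/qdrant/storage
    deploy:
      placement:
        constraints:
          - node.hostname == docker4

volumes:
  ollama-models:
  qdrant-data:
```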

## What Gremlin Does Today

| Capability | Status | Workflow |
|------------|--------|----------|
| Weekly YAML audit of all compose files | Live | Forgejo Audit — Monday 06:00 |
| Uptime Kuma alert triage | Live | Kuma Triage — webhook-triggered |
| Interactive chat with lab context | Live | Open WebUI + Ollama |
| RAG over wiki/docs | 🔧 Wired, not populated | Qdrant connected, knowledge base empty |
| Doc generation from compose files | 🟡 Parked | CPU quality insufficient — awaiting GPU |
| Email triage | 📋 Planned | Phase 3 — not built |
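The Kuma Triage workflow essentially turns an Uptime Kuma webhook into an LLM prompt. A rough sketch of that step — the field names follow Uptime Kuma's generic webhook payload (`monitor.name`, `heartbeat.status`, `heartbeat.msg`), and the prompt wording is invented here, not lifted from the actual n8n workflow:

```python
def triage_prompt(payload: dict) -> str:
    """Build a short triage prompt from an Uptime Kuma webhook payload."""
    monitor = payload.get("monitor", {}).get("name", "unknown monitor")
    beat = payload.get("heartbeat", {})
    state = "DOWN" if beat.get("status") == 0 else "UP"  # Kuma: 0 = down, 1 = up
    msg = beat.get("msg", "")
    return (
        f"Monitor '{monitor}' is {state}. Detail: {msg or 'n/a'}. "
        "Summarize the likely cause and suggest one next step."
    )

# Example: a downed-service alert
alert = {"monitor": {"name": "forgejo"},
         "heartbeat": {"status": 0, "msg": "connect ECONNREFUSED"}}
print(triage_prompt(alert))
```

In the real stack, n8n receives the webhook, builds a prompt like this, and sends it to Ollama for summarization.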

## Models

| Model | Size | Used For |
|-------|------|----------|
| `qwen2.5-coder:7b` | ~5 GB | Code review, YAML audits, compose analysis |
| `llama3.2:3b` | ~2 GB | Alert triage, Q&A, summarization |

Models must be pulled before workflows run. See Ollama Model Management.
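A workflow run fails silently if its model hasn't been pulled yet. A small sketch of a preflight check that lists installed models via Ollama's `GET /api/tags` and pulls anything missing via `POST /api/pull` (both are real Ollama API endpoints; the in-stack base URL and the required-model list mirror the tables above but are assumptions for this example):

```python
import json
import urllib.request

REQUIRED = ["qwen2.5-coder:7b", "llama3.2:3b"]  # the two models the workflows use
OLLAMA = "http://ollama:11434"                  # in-stack URL; adjust if running remotely

def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Return the required models not present in the installed tag list."""
    return [m for m in required if m not in installed]

def installed_tags(base: str = OLLAMA) -> list[str]:
    """List model tags known to Ollama via GET /api/tags."""
    with urllib.request.urlopen(f"{base}/api/tags") as resp:
        return [m["name"] for m in json.load(resp).get("models", [])]

def pull(model: str, base: str = OLLAMA) -> None:
    """Ask Ollama to pull a model via POST /api/pull (streams JSON progress lines)."""
    req = urllib.request.Request(
        f"{base}/api/pull",
        data=json.dumps({"name": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:
            print(f"{model}: {json.loads(line).get('status', '')}")

# Usage (against a live Ollama):
#   for model in missing_models(REQUIRED, installed_tags()):
#       pull(model)
```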


## Sections

| Section | Contents |
|---------|----------|
| Stack | Full build config, volumes, env vars, compose YAML |
| Workflows | All n8n workflows — architecture, patterns, gotchas |
| Runbooks | Deploy, model management, troubleshooting |

## Planned Evolution

- **Homelable MCP backend** — next up. Provides tool use for infra Q&A (topology, running services, resource usage). Blocked until the Homelable stack is deployed.
- **GPU support** — unlocks doc generation and larger models. The compose GPU block is commented out, ready to enable.
- **Gremlin role variants** — specialized personas per domain (Proxy Gremlin, Storage Gremlin, Security Gremlin, etc.) with mood states and dynamic badge serving via Caddy.
- **RAG knowledge base population** — index all Wiki.js pages and the compose template standard into Qdrant.
- **Gremlin Router** — dedicated Flask container for webhook routing (currently handled directly by n8n).
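The RAG population step could be scripted roughly as below: chunk each wiki page, embed each chunk through Ollama's `POST /api/embeddings`, and upsert the vectors into Qdrant's REST points endpoint. This is a sketch under stated assumptions — the collection name, chunk size, and embedding model (`nomic-embed-text` is a common Ollama choice, not confirmed by this doc) are all invented for illustration:

```python
import json
import urllib.request

QDRANT = "http://qdrant:6333"
OLLAMA = "http://ollama:11434"
EMBED_MODEL = "nomic-embed-text"  # assumption: any Ollama embedding model works here
COLLECTION = "wiki"               # assumption: collection name not specified in this doc

def chunk(text: str, max_chars: int = 1500) -> list[str]:
    """Split a page into paragraph-aligned chunks no longer than max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

def embed(text: str) -> list[float]:
    """Embed one chunk via Ollama's POST /api/embeddings."""
    req = urllib.request.Request(
        f"{OLLAMA}/api/embeddings",
        data=json.dumps({"model": EMBED_MODEL, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def upsert(idx: int, vector: list[float], payload: dict) -> None:
    """Write one point into Qdrant via PUT /collections/{name}/points."""
    body = {"points": [{"id": idx, "vector": vector, "payload": payload}]}
    req = urllib.request.Request(
        f"{QDRANT}/collections/{COLLECTION}/points",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req).close()

# Usage (against live services): for each wiki page,
#   for i, c in enumerate(chunk(page_text)):
#       upsert(i, embed(c), {"text": c, "page": page_title})
```

Paragraph-aligned chunking keeps semantically related sentences in the same vector, which generally retrieves better than fixed-width splits.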