---
title: Gremlin Grimoire
description: Netgrimoire's local AI — the gremlin that runs the machine
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, ai, ollama, n8n
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Gremlin Grimoire


Gremlin is the local AI layer of Netgrimoire. It's not just a chat interface — it's an autonomous agent that watches the infrastructure, audits the codebase, triages alerts, and answers questions about the lab. The gremlin lives inside the machine and knows every dark corner of it.


## What Gremlin Is

Gremlin is a stack of four services running together on docker4, all pinned to the same Swarm node:

| Service | Role | URL |
|---------|------|-----|
| Ollama | Local LLM inference (CPU-only, Ryzen) | `http://ollama:11434` · `ollama.netgrimoire.com:11434` |
| Open WebUI | Chat interface + RAG frontend | `https://ai.netgrimoire.com` |
| Qdrant | Vector database for RAG knowledge base | `http://qdrant:6333` · dashboard at `:6333/dashboard` |
| n8n | Automation brain — autonomous workflows | `https://n8n.netgrimoire.com` |
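Pinning all four services to the same Swarm node keeps Ollama's model files and Qdrant's vectors on local storage. A minimal sketch of how that pinning looks in a stack file — this is an illustration, not the actual compose config (node hostname `docker4` matches the text above, but image tags and volume names here are assumptions):

```yaml
# Hypothetical excerpt — the real build config lives in the Stack page.
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-models:/root/.ollama
    deploy:
      placement:
        constraints:
          - node.hostname == docker4   # pin to the same node as the rest of the stack
  qdrant:
    image: qdrant/qdrant:latest
    volumes:
      - qdrant-data:/qdrant/storage
    deploy:
      placement:
        constraints:
          - node.hostname == docker4

volumes:
  ollama-models:
  qdrant-data:
```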

## What Gremlin Does Today

| Capability | Status | Workflow |
|------------|--------|----------|
| Weekly YAML audit of all compose files | Live | Forgejo Audit — Monday 06:00 |
| Uptime Kuma alert triage | Live | Kuma Triage — webhook-triggered |
| Interactive chat with lab context | Live | Open WebUI + Ollama |
| RAG over wiki/docs | 🔧 Wired, not populated | Qdrant connected, knowledge base empty |
| Doc generation from compose files | 🟡 Parked | CPU quality insufficient — awaiting GPU |
| Email triage | 📋 Planned | Phase 3 — not built |
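The Kuma Triage workflow essentially turns an Uptime Kuma webhook into an LLM prompt. A rough sketch of that step — the field names follow Uptime Kuma's generic webhook payload (`monitor.name`, `heartbeat.status`, `heartbeat.msg`), and the prompt wording is invented here, not lifted from the actual n8n workflow:

```python
def triage_prompt(payload: dict) -> str:
    """Build a short triage prompt from an Uptime Kuma webhook payload."""
    monitor = payload.get("monitor", {}).get("name", "unknown monitor")
    beat = payload.get("heartbeat", {})
    state = "DOWN" if beat.get("status") == 0 else "UP"  # Kuma: 0 = down, 1 = up
    msg = beat.get("msg", "")
    return (
        f"Monitor '{monitor}' is {state}. Detail: {msg or 'n/a'}. "
        "Summarize the likely cause and suggest one next step."
    )

# Example: a downed-service alert
alert = {"monitor": {"name": "forgejo"},
         "heartbeat": {"status": 0, "msg": "connect ECONNREFUSED"}}
print(triage_prompt(alert))
```

In the real stack, n8n receives the webhook, builds a prompt like this, and sends it to Ollama for summarization.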

## Models

| Model | Size | Used For |
|-------|------|----------|
| `qwen2.5-coder:7b` | ~5 GB | Code review, YAML audits, compose analysis |
| `llama3.2:3b` | ~2 GB | Alert triage, Q&A, summarization |

Models must be pulled before workflows run. See Ollama Model Management.
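A workflow run fails silently if its model hasn't been pulled yet. A small sketch of a preflight check that lists installed models via Ollama's `GET /api/tags` and pulls anything missing via `POST /api/pull` (both are real Ollama API endpoints; the in-stack base URL and the required-model list mirror the tables above but are assumptions for this example):

```python
import json
import urllib.request

REQUIRED = ["qwen2.5-coder:7b", "llama3.2:3b"]  # the two models the workflows use
OLLAMA = "http://ollama:11434"                  # in-stack URL; adjust if running remotely

def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Return the required models not present in the installed tag list."""
    return [m for m in required if m not in installed]

def installed_tags(base: str = OLLAMA) -> list[str]:
    """List model tags known to Ollama via GET /api/tags."""
    with urllib.request.urlopen(f"{base}/api/tags") as resp:
        return [m["name"] for m in json.load(resp).get("models", [])]

def pull(model: str, base: str = OLLAMA) -> None:
    """Ask Ollama to pull a model via POST /api/pull (streams JSON progress lines)."""
    req = urllib.request.Request(
        f"{base}/api/pull",
        data=json.dumps({"name": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:
            print(f"{model}: {json.loads(line).get('status', '')}")

# Usage (against a live Ollama):
#   for model in missing_models(REQUIRED, installed_tags()):
#       pull(model)
```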


## Sections

| Section | Contents |
|---------|----------|
| Stack | Full build config, volumes, env vars, compose YAML |
| Workflows | All n8n workflows — architecture, patterns, gotchas |
| Runbooks | Deploy, model management, troubleshooting |

## Planned Evolution

- **Homelable MCP backend** — next up. Provides tool use for infra Q&A (topology, running services, resource usage). Blocked until the Homelable stack is deployed.
- **GPU support** — unlocks doc generation and larger models. The compose GPU block is commented out, ready to enable.
- **Gremlin role variants** — specialized personas per domain (Proxy Gremlin, Storage Gremlin, Security Gremlin, etc.) with mood states and dynamic badge serving via Caddy.
- **RAG knowledge base population** — index all Wiki.js pages and the compose template standard into Qdrant.
- **Gremlin Router** — dedicated Flask container for webhook routing (currently handled directly by n8n).
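The RAG population step could be scripted roughly as below: chunk each wiki page, embed each chunk through Ollama's `POST /api/embeddings`, and upsert the vectors into Qdrant's REST points endpoint. This is a sketch under stated assumptions — the collection name, chunk size, and embedding model (`nomic-embed-text` is a common Ollama choice, not confirmed by this doc) are all invented for illustration:

```python
import json
import urllib.request

QDRANT = "http://qdrant:6333"
OLLAMA = "http://ollama:11434"
EMBED_MODEL = "nomic-embed-text"  # assumption: any Ollama embedding model works here
COLLECTION = "wiki"               # assumption: collection name not specified in this doc

def chunk(text: str, max_chars: int = 1500) -> list[str]:
    """Split a page into paragraph-aligned chunks no longer than max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

def embed(text: str) -> list[float]:
    """Embed one chunk via Ollama's POST /api/embeddings."""
    req = urllib.request.Request(
        f"{OLLAMA}/api/embeddings",
        data=json.dumps({"model": EMBED_MODEL, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def upsert(idx: int, vector: list[float], payload: dict) -> None:
    """Write one point into Qdrant via PUT /collections/{name}/points."""
    body = {"points": [{"id": idx, "vector": vector, "payload": payload}]}
    req = urllib.request.Request(
        f"{QDRANT}/collections/{COLLECTION}/points",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req).close()

# Usage (against live services): for each wiki page,
#   for i, c in enumerate(chunk(page_text)):
#       upsert(i, embed(c), {"text": c, "page": page_title})
```

Paragraph-aligned chunking keeps semantically related sentences in the same vector, which generally retrieves better than fixed-width splits.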