New Grimoire

traveler 2026-04-12 09:53:51 -05:00
parent 77d589a13d
commit cc574f8aed
157 changed files with 29420 additions and 0 deletions

---
title: Forgejo Audit Workflow
description: Weekly automated YAML compliance audit via n8n + Ollama
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, n8n, audit, forgejo
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---
# Forgejo Audit Workflow
**Status:** ✅ Live and confirmed working
Runs every Monday at 06:00. Walks all compose YAML files in `services/swarm/` and `services/swarm/stack/*/`, audits each one against the Swarm template standard using `qwen2.5-coder:7b`, and commits full reports to Forgejo + sends a summary to ntfy.

---
## What It Audits
Each file is checked for:
- Homepage labels on all services
- Uptime Kuma labels on all services
- Caddy labels on exposed services
- `node.platform.arch` exclusion constraints (ARM default)
- Volume paths following the `/DockerVol/` or `/data/nfs/znas/Docker/` convention
- No forbidden fields (`version:`, `container_name:`, `restart:`, `depends_on:`)
- `endpoint_mode: dnsrr` not used
- `diun.enable: "true"` present
- Network references `netgrimoire` external overlay
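
The shape of one of these checks can be sketched as a plain string scan. This is a minimal illustration only; the function name is hypothetical, and the real workflow delegates the full audit to the model rather than hard-coding rules:

```javascript
// Sketch of the forbidden-fields rule from the checklist above.
// findForbiddenFields is an illustrative name, not the workflow's actual code.
const FORBIDDEN = ['version:', 'container_name:', 'restart:', 'depends_on:'];

function findForbiddenFields(yamlText) {
  const hits = [];
  yamlText.split('\n').forEach((line, i) => {
    const trimmed = line.trim();
    for (const field of FORBIDDEN) {
      // Flag any line that introduces a forbidden key.
      if (trimmed.startsWith(field)) hits.push({ line: i + 1, field });
    }
  });
  return hits;
}
```
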
---
## Scope
~67 files total across `swarm/` (flat single-service YAMLs) and `swarm/stack/*/` (grouped stacks).

---
## Outputs
| Output | Where | Content |
|--------|-------|---------|
| ntfy notification | `gremlin-audits` topic | Short FAIL summary per file |
| Forgejo commit | `Netgrimoire/Audits/AUDIT-<name>-<date>.md` | Full audit report (POST new / PUT+SHA update) |
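
The report path in the table can be assembled in a Code node along these lines. The helper name and date format are assumptions; the `Netgrimoire` repo prefix comes from the API URL rather than the path itself:

```javascript
// Build the AUDIT-<name>-<date>.md path for the Forgejo commit.
// auditReportPath is an illustrative name; date format assumed to be YYYY-MM-DD.
function auditReportPath(fileName, when = new Date()) {
  const base = fileName.replace(/\.ya?ml$/, ''); // strip .yml/.yaml extension
  const date = when.toISOString().slice(0, 10);  // YYYY-MM-DD
  return `Audits/AUDIT-${base}-${date}.md`;
}
```
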
---
## n8n Architecture
```
Schedule Trigger (Mon 06:00)
→ Forgejo API: list all files in swarm/ and swarm/stack/*/
→ Loop Over Items (splitInBatches, batch=1)
→ Code node: fetch file content via Forgejo API
→ Code node: build Ollama prompt
→ Code node: POST to Ollama (qwen2.5-coder:7b)
→ Code node: parse result, build report markdown
→ Code node: commit report to Forgejo (POST or PUT+SHA)
→ Code node: send ntfy summary if FAIL
→ Loop feedback connection drives iteration
```
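
The "build Ollama prompt" step might look like the sketch below. The prompt wording is an assumption, not the workflow's actual prompt, and the Ollama host in the comment is a placeholder:

```javascript
// Hypothetical sketch of the prompt-builder Code node.
function buildAuditPrompt(filePath, yamlContent) {
  return (
    `Audit this Docker Swarm compose file (${filePath}) against the Swarm ` +
    `template standard. Answer PASS or FAIL per rule, with a short reason.\n\n` +
    yamlContent
  );
}

// A following Code node would then send it to Ollama, e.g.:
// const res = await this.helpers.httpRequest({
//   method: 'POST',
//   url: 'http://ollama:11434/api/generate',   // host/port assumed
//   body: { model: 'qwen2.5-coder:7b', prompt, stream: false },
//   json: true,
// });
```
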
---
## Critical Patterns
- All Forgejo and Ollama API calls use `this.helpers.httpRequest()` inside Code nodes, **not** HTTP Request nodes: HTTP Request nodes hit body expression limits on large prompts.
- Code nodes in "Run Once for Each Item" mode must return `{ json: ... }`, not `[{ json: ... }]`.
- A Loop Over Items node (splitInBatches, batch=1) with a feedback connection from the last node back to the loop drives iteration over the file list.
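
The POST-new / PUT+SHA-update decision can be isolated in a small helper. The endpoint shape follows the Forgejo contents API; the host and helper name are assumptions:

```javascript
// Pick create (POST) vs update (PUT + SHA) for the Forgejo contents API.
// buildCommitRequest is an illustrative name; forgejo.example is a placeholder host.
function buildCommitRequest(owner, repo, filePath, contentB64, existingSha) {
  const url = `https://forgejo.example/api/v1/repos/${owner}/${repo}/contents/${filePath}`;
  const body = { content: contentB64, message: `audit: ${filePath}` };
  if (existingSha) {
    body.sha = existingSha; // updating requires the current blob SHA
    return { method: 'PUT', url, body };
  }
  return { method: 'POST', url, body }; // file does not exist yet: create it
}
```
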

---
## Critical Environment Variables
| Variable | Value | Why |
|----------|-------|-----|
| `N8N_BLOCK_ENV_ACCESS_IN_NODE` | `false` | Allows env var access inside Code nodes |
| `N8N_RUNNERS_TASK_TIMEOUT` | `3600` | Prevents timeouts on 67-file audit runs (value is in seconds) |
---
## Forgejo API Tokens
| Token | Scope |
|-------|-------|
| Read token | Fetch file content from `traveler/services` |
| Write token | Commit audit reports to `traveler/Netgrimoire` |
Tokens stored in n8n credentials, not in compose env vars.

---
## Forgejo Webhook Gotcha
If Forgejo webhooks fail to reach n8n, add to Forgejo `app.ini`:
```ini
[webhook]
ALLOWED_HOST_LIST = *
```
Required when `OFFLINE_MODE = true`. Restart Forgejo after editing.

---
title: Kuma Alert Triage Workflow
description: Uptime Kuma webhook → Ollama analysis → ntfy alert
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, n8n, kuma, alerts
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---
# Kuma Alert Triage Workflow
**Status:** ✅ Live and confirmed working
Triggered by Uptime Kuma webhook on service DOWN or RECOVERED events. DOWN events are analyzed by `llama3.2:3b` before alerting. RECOVERED events skip AI and send a simple notification.

---
## Webhook URL
```
https://n8n.netgrimoire.com/webhook/gremlin-kuma-alert
```
Configure in Uptime Kuma: Settings → Notifications → Webhook → apply to all monitors.

---
## Flow
```
Kuma Webhook
├── DOWN path:
│ → Parse payload (service name, URL, error)
│ → Ollama (llama3.2:3b): triage prompt
│ → ntfy gremlin-alerts (urgent priority) with AI analysis
└── RECOVERED path:
→ ntfy gremlin-alerts (normal priority, no AI call)
```
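
The branch decision above can be sketched as a pure parsing step. The payload field names (`heartbeat.status`, `monitor.name`, etc.) follow Uptime Kuma's webhook format but are assumptions here, as is the function name:

```javascript
// Route a Kuma webhook payload to the DOWN or RECOVERED path.
// In Kuma's heartbeat, status 0 means down and 1 means up.
function routeKumaEvent(payload) {
  const down = payload.heartbeat && payload.heartbeat.status === 0;
  return {
    path: down ? 'DOWN' : 'RECOVERED',
    service: payload.monitor?.name ?? 'unknown',
    url: payload.monitor?.url ?? '',
    error: down ? (payload.heartbeat.msg || '') : '', // only DOWN carries an error
  };
}
```
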
---
## Why Two Paths
AI triage is only useful for DOWN events — there's nothing to analyze on a recovery. Skipping Ollama on RECOVERED keeps notification latency near-instant for good news.

---
## ntfy Output Format
DOWN alert includes:
- Service name and URL
- Kuma error message
- Ollama's triage assessment (probable cause, suggested first step)

The RECOVERED alert is a simple one-liner.
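
A sketch of how the two ntfy messages could be assembled. ntfy takes the message as the POST body with `Title` and `Priority` headers; the topic name comes from this page, while the host and helper name are placeholders:

```javascript
// Build the ntfy publish request for a triaged Kuma event.
// buildNtfyRequest is an illustrative name; ntfy.example is a placeholder host.
function buildNtfyRequest(event) {
  const down = event.path === 'DOWN';
  return {
    method: 'POST',
    url: 'https://ntfy.example/gremlin-alerts',
    headers: {
      Title: down ? `DOWN: ${event.service}` : `RECOVERED: ${event.service}`,
      Priority: down ? 'urgent' : 'default', // urgent only for outages
    },
    body: down
      ? `${event.service} (${event.url})\n${event.error}\n\n${event.analysis || ''}`
      : `${event.service} is back up.`, // recovery skips the AI analysis
  };
}
```
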

---
## Parked: Doc Generation Workflows
Two additional doc-generation workflows were built but are currently inactive: with CPU-only `llama3.2:3b`, the generated docs barely improve on a reformatted copy of the source compose file, so they are not useful enough to commit. They will be revisited when GPU support is added to the Gremlin stack.