docs: update Netgrimoire/Gremlin-Grimoire/CICD_UserGuide

Administrator 2026-04-30 18:33:16 +00:00 committed by John Smith
parent 87ef61d184
commit 74d78af239


@ -2,7 +2,7 @@
title: Gremlin CI/CD User Guide
description:
published: true
date: 2026-04-29T13:20:50.963Z
date: 2026-04-30T18:33:09.881Z
tags:
editor: markdown
dateCreated: 2026-04-28T20:56:45.863Z
@ -67,7 +67,7 @@ services:
caddy.import_2: authentik
monitor.name: MyService
monitor.url: https://myservice.netgrimoire.com
monitor.url: http://myservice:8080 # internal URL preferred
homepage.group: NetGrimoire
homepage.name: MyService
@ -89,7 +89,7 @@ networks:
| Path type | Example | Placement constraint |
|---|---|---|
| `/DockerVol/` | `/DockerVol/myservice:/data` | `node.hostname` **required** |
| `/data/nfs/znas/` | `/data/nfs/znas/Docker/myservice:/data` | `node.hostname` **forbidden** |
| `/data/nfs/znas/` | `/data/nfs/znas/Docker/myservice:/data` | `node.hostname` **not required** |
Valid hostnames for `node.hostname`: `docker3`, `docker4`, `docker5`, `znas`, `dockerpi1`
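
The two path types in the table can be sketched as a pair of services. This is a minimal fragment for illustration only: the service names `localapp` and `nfsapp`, the image names, and the choice of `docker4` as the pinned node are hypothetical, and other required labels are omitted.

```yaml
services:
  localapp:
    image: example/localapp:latest          # hypothetical image
    volumes:
      - /DockerVol/localapp:/data           # /DockerVol/ path
    deploy:
      placement:
        constraints:
          - node.hostname == docker4        # required for /DockerVol/ paths

  nfsapp:
    image: example/nfsapp:latest            # hypothetical image
    volumes:
      - /data/nfs/znas/Docker/nfsapp:/data  # NFS path
    deploy:
      placement:
        constraints:
          - node.platform.arch != aarch64   # no node.hostname constraint needed
```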
@ -109,13 +109,15 @@ environment:
user: "1964:1964"
```
**Exemption** — Images that manage their own users (Authentik, MailCow):
**Exemption** — Images that manage their own users (Authentik, MailCow, Postgres, Redis):
```yaml
labels:
gremlin.uid.exempt: "true"
gremlin.uid.reason: "Authentik manages its own internal user context"
gremlin.uid.reason: "Postgres manages its own user — requires UID 999"
```
When `uid.exempt` is set, Prepare Volumes will `mkdir` the service's volume paths but will **not** `chown` them. The image manages its own ownership.
---
## Caddy Label Rules
@ -123,8 +125,8 @@ labels:
```yaml
caddy: myservice.netgrimoire.com # hostname only — no https:// prefix
caddy.reverse_proxy: myservice:8080 # service name and port — no IP addresses
caddy.import_1: crowdsec # mandatory
caddy.import_2: authentik # mandatory
caddy.import_1: crowdsec # always required
caddy.import_2: authentik # required unless gremlin.authentik.skip is set
```
Services without a public URL (internal sidecars, databases):
@ -132,29 +134,39 @@ Services without a public URL (internal sidecars, databases):
gremlin.caddy.skip: "true"
```
Services that should bypass Authentik but still go through CrowdSec:
```yaml
gremlin.authentik.skip: "true"
```
---
## Monitor Labels
Gremlin writes monitor endpoints to Gatus after each successful deploy.
Gremlin writes monitor endpoints to Gatus after each successful deploy. Monitor URLs should use the internal service name and port so Gatus checks the container directly without depending on Caddy or Authentik being up.
```yaml
monitor.name: MyService # display name in Gatus
monitor.url: https://myservice.netgrimoire.com
monitor.url: http://myservice:8080 # internal URL preferred
monitor.type: http # optional: http | tcp | ping | dns (default: http)
monitor.interval: "60" # optional: seconds, minimum 20 (default: 60)
```
For non-HTTP services (mail, databases):
```yaml
monitor.type: tcp
monitor.url: tcp://myservice:5432
```
Services that should not be monitored:
```yaml
gremlin.monitor.skip: "true"
```
TCP example (for non-HTTP services):
```yaml
monitor.type: tcp
monitor.url: myservice:5432
```
Gatus determines the check condition from the URL scheme or `monitor.type`:
- `http://` or `https://` → `[STATUS] == 200`
- `tcp://` or `type: tcp` → `[CONNECTED] == true`
- `type: ping` → `[CONNECTED] == true`
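
For orientation, the entries written to Gatus might look roughly like the sketch below. The `endpoints`, `name`, `url`, `interval`, and `conditions` keys are standard Gatus configuration, but the exact output Gremlin generates is an assumption here and may differ. `MyDatabase` is a hypothetical display name.

```yaml
endpoints:
  - name: MyService
    url: "http://myservice:8080"   # HTTP check against the internal service URL
    interval: 60s
    conditions:
      - "[STATUS] == 200"
  - name: MyDatabase               # hypothetical non-HTTP service
    url: "tcp://myservice:5432"    # TCP check, no HTTP status available
    interval: 60s
    conditions:
      - "[CONNECTED] == true"
```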
---
@ -177,116 +189,86 @@ gremlin.homepage.skip: "true"
---
## Gremlin Directives
## Gremlin Directives Reference
All directives go inside `deploy.labels`. All are opt-out — a stack with no `gremlin.*` labels gets full treatment.
### Pipeline Control
```yaml
gremlin.enable: "true"
# Set false to have Gremlin ignore this file entirely on push.
# Default: true
gremlin.checks: "all"
# Comma-separated checker IDs to run, or "all".
# Example: "swarm-syntax,identity,caddy"
# Default: all
gremlin.checks.skip: ""
# Comma-separated checker IDs to skip.
# Example: "homepage,monitor"
# Default: (none)
```
| Directive | Default | Description |
|---|---|---|
| `gremlin.enable` | `true` | Set `false` to have Gremlin ignore this file entirely on push |
| `gremlin.checks` | `all` | Comma-separated checker IDs to run, or `all` |
| `gremlin.checks.skip` | _(none)_ | Comma-separated checker IDs to skip |
| `gremlin.version` | _(auto)_ | Stamped automatically — do not set manually |
| `gremlin.context` | _(none)_ | Free text passed to Ollama as ground truth — Ollama will not flag anything this explains |
### Auto-fix Control
```yaml
gremlin.autofix: "true"
# Set false to disable all auto-fixing.
# Default: true
gremlin.autofix.skip: "false"
# Set true to notify but never attempt to fix.
# Default: false
gremlin.autofix.skip_fields: ""
# Comma-separated fields to skip during fix.
# Example: "hostname,uid"
# Default: (none)
```
### Port exposure
```yaml
gremlin.port: ""
# Internal container port for Caddy reverse_proxy and monitor URL derivation.
# Only needed when no ports: mapping is defined in the service.
# Gremlin checks ports: first, falls back to this directive.
# Example: gremlin.port: "9000"
# If neither ports: nor gremlin.port is set, caddy.reverse_proxy
# cannot be derived and will be flagged as unfixable.
```
| Directive | Default | Description |
|---|---|---|
| `gremlin.autofix` | `true` | Set `false` to disable all auto-fixing |
| `gremlin.autofix.skip` | `false` | Set `true` to notify but never attempt to fix |
| `gremlin.autofix.skip_fields` | _(none)_ | Comma-separated fields to skip during fix (e.g. `uid,hostname`) |
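
As a minimal sketch, the labels below keep auto-fix enabled while excluding two fields from it. The service name `myservice` is a placeholder, and the field names are the ones used as examples in this guide.

```yaml
services:
  myservice:
    deploy:
      labels:
        gremlin.autofix: "true"                       # default, shown for clarity
        gremlin.autofix.skip_fields: "uid,hostname"   # leave these fields alone during auto-fix
```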
### Deploy Control
```yaml
gremlin.deploy: "true"
# Set false to run checks and fixes but never deploy.
# Use for test stacks or stacks managed manually.
# Default: true
| Directive | Default | Description |
|---|---|---|
| `gremlin.deploy` | `true` | Set `false` to run checks and fixes but never deploy |
| `gremlin.deploy.strategy` | `stack` | Deployment method — currently only `stack` is implemented |
| `gremlin.port` | _(none)_ | Internal container port when no `ports:` mapping exists — used to derive `caddy.reverse_proxy` and `monitor.url` |
gremlin.deploy.strategy: "stack"
# Deployment method. Values: stack | helm | kubectl
# Default: stack
```
### Identity
### Identity Exemptions
| Directive | Default | Description |
|---|---|---|
| `gremlin.uid.exempt` | `false` | Skip PUID/PGID/user checks and skip chown on volumes for this service |
| `gremlin.uid.reason` | _(none)_ | Documents why uid.exempt is set — include with every exemption |
```yaml
gremlin.uid.exempt: "false"
# Set true to skip PUID/PGID/user checks.
# Use for images that manage their own users.
# Default: false
### Placement
gremlin.uid.reason: ""
# Documents why uid.exempt is set.
# Required when uid.exempt is true.
```
| Directive | Default | Description |
|---|---|---|
| `gremlin.arm.allow` | `false` | Allow ARM/Pi deployment — removes ARM exclusion constraint requirement |
### Placement Control
### Caddy
```yaml
gremlin.arm.allow: "false"
# Set true to allow ARM/Pi deployment.
# Removes ARM exclusion constraints.
# Default: false
```
| Directive | Default | Description |
|---|---|---|
| `gremlin.caddy.skip` | `false` | Skip all Caddy label checks for this service |
| `gremlin.authentik.skip` | `false` | Skip `caddy.import_2: authentik` requirement only — CrowdSec still required |
### Service-level Skip Labels
### Homepage
```yaml
gremlin.caddy.skip: "false" # skip Caddy label validation
gremlin.homepage.skip: "false" # skip Homepage label validation
gremlin.monitor.skip: "false" # skip monitor label validation
gremlin.diun.skip: "false" # skip Diun label validation
gremlin.network.skip: "false" # skip network validation (whole stack)
```
| Directive | Default | Description |
|---|---|---|
| `gremlin.homepage.skip` | `false` | Skip Homepage label checks for this service |
### Ollama Context
### Monitor
```yaml
gremlin.context: ""
# Free text passed to Ollama audit as ground truth.
# Ollama will not flag anything the context explains.
# Example: "OIDC_CLIENT_SECRET in plain text is intentional — no secrets manager in use"
```
| Directive | Default | Description |
|---|---|---|
| `gremlin.monitor.skip` | `false` | Skip monitor label checks for this service |
### Notification Control
### Network
```yaml
gremlin.notify: "true" # false = suppress all ntfy for this stack
gremlin.notify.level: "all" # all | failures | none
```
| Directive | Default | Description |
|---|---|---|
| `gremlin.network.skip` | `false` | Skip netgrimoire network checks for this service |
### Diun
| Directive | Default | Description |
|---|---|---|
| `gremlin.diun.skip` | `false` | Skip `diun.enable` check for this service |
### Notifications
| Directive | Default | Description |
|---|---|---|
| `gremlin.notify` | `true` | Set `false` to suppress all ntfy notifications for this stack |
| `gremlin.notify.level` | `all` | `all` \| `failures` \| `none` |
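
A short sketch of a stack that lowers its notification volume. The service name is a placeholder, and the comment on `failures` is an assumption based on the value's name rather than documented behaviour.

```yaml
services:
  myservice:
    deploy:
      labels:
        gremlin.notify: "true"             # default: notifications stay enabled
        gremlin.notify.level: "failures"   # one of: all | failures | none (assumed: failures only)
```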
---
@ -298,13 +280,14 @@ Use these IDs with `gremlin.checks` and `gremlin.checks.skip`:
|---|---|
| `swarm-syntax` | Forbidden fields: version, container_name, hostname, restart, depends_on, dnsrr |
| `identity` | PUID/PGID 1964 or user: "1964:1964" |
| `network` | netgrimoire overlay network |
| `network` | netgrimoire overlay network attached |
| `placement` | ARM exclusions, DockerVol/hostname rules, restart_policy |
| `caddy` | caddy: label, reverse_proxy, import_1/import_2 |
| `caddy` | caddy: label, reverse_proxy format, import_1/import_2 |
| `homepage` | group, name, icon, href, description |
| `monitor` | monitor.name, monitor.url, optional type/interval |
| `legacy-labels` | Flags kuma.* labels for removal |
| `diun` | diun.enable: "true" |
| `version` | gremlin.version stamp matches current config version |
| `diun` | diun.enable: "true" present |
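
The IDs in this table go into the two selector directives as comma-separated strings. A minimal sketch follows, with `myservice` as a placeholder. How Gremlin behaves when both directives are set at once is not described in this guide, so the second option is left commented out.

```yaml
services:
  myservice:
    deploy:
      labels:
        # Option A: run only the listed checkers
        gremlin.checks: "swarm-syntax,identity,caddy"
        # Option B: run every checker except the listed ones
        # gremlin.checks.skip: "homepage,monitor"
```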
---
@ -333,6 +316,29 @@ Use these IDs with `gremlin.checks` and `gremlin.checks.skip`:
- node.platform.arch != aarch64
- node.platform.arch != arm
- node.hostname == docker4
labels:
gremlin.uid.exempt: "true"
gremlin.uid.reason: "Postgres requires UID 999"
gremlin.caddy.skip: "true"
gremlin.homepage.skip: "true"
gremlin.monitor.skip: "true"
diun.enable: "true"
```
### Service without Authentik (remote browser, public endpoint)
```yaml
labels:
caddy: firefox.netgrimoire.com
caddy.reverse_proxy: firefox:5800
caddy.import_1: crowdsec
gremlin.authentik.skip: "true"
# ... other labels
```
### Service with no web UI and no public port
```yaml
labels:
gremlin.caddy.skip: "true"
gremlin.homepage.skip: "true"
@ -359,12 +365,20 @@ Use these IDs with `gremlin.checks` and `gremlin.checks.skip`:
- node.hostname == dockerpi1
```
### Image requiring root
### Service with no ports: mapping
```yaml
labels:
gremlin.uid.exempt: "true"
gremlin.uid.reason: "Image requires root — does not support PUID/PGID"
gremlin.port: "8080"
# tells Gremlin the internal port for caddy and monitor derivation
# ... other labels
```
### Ollama false positive suppression
```yaml
labels:
gremlin.context: "shm_size is set to 1gb — required for this browser application"
# ... other labels
```
@ -381,33 +395,39 @@ These fields are automatically removed by Gremlin:
| `hostname:` (service-level) | Conflicts with Swarm DNS |
| `restart:` (service-level) | Use `deploy.restart_policy` instead |
| `depends_on:` | Not supported in Swarm mode |
| `links:` | Not supported in Swarm mode |
These fields cause an **unfixable** block:
These fields cause an **unfixable** block — Gremlin cannot fix them automatically:
| Field | Reason |
|---|---|
| `endpoint_mode: dnsrr` | Breaks internal DNS resolution |
| `endpoint_mode: dnsrr` | Breaks internal DNS resolution — VIP mode required |
| Missing `deploy:` block | File treated as plain Compose, not Swarm |
| `/DockerVol/` without `node.hostname` | Gremlin cannot guess the target node |
---
## Troubleshooting
**"Missing deploy: block" — file skipped as non-Swarm**
Your compose file has no `deploy:` section. Add a `deploy:` block to each service for Swarm compatibility.
Your compose file has no `deploy:` section. Add a `deploy:` block to each service.
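
A minimal sketch of a service with a `deploy:` block. The image name, replica count, and restart condition are illustrative assumptions, not NetGrimoire requirements.

```yaml
services:
  myservice:
    image: example/myservice:latest   # hypothetical image
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      labels:
        diun.enable: "true"
```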
**"uses /DockerVol/ but has no node.hostname constraint" — unfixable**
Add a `node.hostname` constraint to your `deploy.placement.constraints`. Gremlin cannot guess which node to pin it to.
Add a `node.hostname` constraint to `deploy.placement.constraints`. Gremlin cannot guess which node to pin it to.
**PUID/PGID landing under volumes:**
Your service has no `environment:` block. Gremlin now creates one before `volumes:` automatically. If it still happens, add an `environment:` block manually with at least one entry.
**Ollama keeps blocking on legitimate config**
Add `gremlin.context` to explain the situation. Ollama treats context as ground truth and will not flag it.
Add `gremlin.context` explaining the situation. Ollama treats it as ground truth.
**Auto-fix loop — fixes applied but same issues keep appearing**
The fixer is finding the labels but the checker isn't recognizing them after insertion. Check label indentation — labels inside `deploy.labels` must be indented 8 spaces.
**Auto-fix loop — same issues reappear after fix**
Check label indentation — labels inside `deploy.labels` must be indented 8 spaces consistently.
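
Assuming the usual two-space YAML indentation, the 8-space rule works out as follows (placeholder service name):

```yaml
services:                                       # column 0
  myservice:                                    # 2 spaces
    deploy:                                     # 4 spaces
      labels:                                   # 6 spaces
        caddy: myservice.netgrimoire.com        # 8 spaces
        caddy.reverse_proxy: myservice:8080     # 8 spaces
```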
**Deploy skipped every time**
Check `gremlin.deploy` — if set to `"false"` the pipeline validates and fixes but never deploys.
Check `gremlin.deploy` in the stack labels and in `gremlin/config.yaml`. Global `deploy: false` overrides all stacks unless the stack explicitly sets `gremlin.deploy: "true"`.
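
The stack-level side of that override can be sketched as below. The layout of `gremlin/config.yaml` is not shown in this guide, so only the label is illustrated; `myservice` is a placeholder.

```yaml
services:
  myservice:
    deploy:
      labels:
        gremlin.deploy: "true"   # explicit opt-in when the global config sets deploy: false
```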
**Service shows up as "netgrimoire" in checker errors**
The file has a blank line between `services:` and the first service name — this was a known bug fixed in pipeline v2026-04-30.
---
@ -415,4 +435,4 @@ Check `gremlin.deploy` — if set to `"false"` the pipeline validates and fixes
- [Gremlin CI/CD Pipeline](gremlin-cicd-wiki.md)
- [NetGrimoire Stack Standards](stack-standards.md)
- [Gatus](gatus.md)