| title | description | published | date | tags | editor | dateCreated |
|---|---|---|---|---|---|---|
| Gremlin CI/CD User Guide | | true | 2026-04-28T20:56:45.863Z | | markdown | 2026-04-28T20:56:45.863Z |
Gremlin CI/CD — Operator Guide
NetGrimoire Infrastructure Reference
How to write, structure, and manage Swarm stacks for the Gremlin CI/CD pipeline. For pipeline architecture, see Gremlin CI/CD Pipeline.
How It Works
Push any .yml or .yaml file under swarm/ to traveler/services and Gremlin takes over:
- Fetches the file and classifies it (Swarm, Pocket, or plain Compose)
- Runs all schema checkers
- If issues are found and all are fixable: auto-fixes and recommits
- If any issue is unfixable: sends an ntfy alert and stops
- If all checks pass: runs the Ollama audit, then deploys
- After deploy: updates the Gatus monitoring config
You get ntfy notifications at every stage. A clean push produces one notification: ✅ Deploy Complete.
Required Stack Structure
Every Swarm service must have these elements. Missing any will block deployment.
services:
  myservice:
    image: vendor/image:tag
    environment:
      PUID: "1964"
      PGID: "1964"
      TZ: America/Chicago
    volumes:
      - /DockerVol/myservice:/data   # pinned: requires node.hostname
      # or, as an alternative to the line above:
      # - /data/nfs/znas/Docker/myservice:/data   # floating: no hostname needed
    networks:
      - netgrimoire
    deploy:
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
        window: 120s
      placement:
        constraints:
          - node.platform.arch != aarch64
          - node.platform.arch != arm
          - node.hostname == znas   # required when using /DockerVol/
      labels:
        caddy: myservice.netgrimoire.com
        caddy.reverse_proxy: myservice:8080
        caddy.import_1: crowdsec
        caddy.import_2: authentik
        monitor.name: MyService
        monitor.url: https://myservice.netgrimoire.com
        homepage.group: NetGrimoire
        homepage.name: MyService
        homepage.icon: myservice.png
        homepage.href: https://myservice.netgrimoire.com
        homepage.description: My service description
        diun.enable: "true"
networks:
  netgrimoire:
    external: true
Volume Path Rules
| Path type | Example | Placement constraint |
|---|---|---|
| /DockerVol/ | /DockerVol/myservice:/data | node.hostname required |
| /data/nfs/znas/ | /data/nfs/znas/Docker/myservice:/data | node.hostname forbidden |
Valid hostnames for node.hostname: docker3, docker4, docker5, znas, dockerpi1
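The two rows above can be sketched side by side. This is a minimal fragment rather than a complete stack: the service names pinned-app and floating-app are hypothetical, docker4 is just one of the valid hostnames, and the other required elements (identity, network, restart_policy, labels) are left out for brevity.

```yaml
services:
  pinned-app:                      # hypothetical service name
    image: vendor/image:tag
    volumes:
      - /DockerVol/pinned-app:/data            # local path, so the service is pinned
    deploy:
      placement:
        constraints:
          - node.platform.arch != aarch64
          - node.platform.arch != arm
          - node.hostname == docker4           # required with /DockerVol/
  floating-app:                    # hypothetical service name
    image: vendor/image:tag
    volumes:
      - /data/nfs/znas/Docker/floating-app:/data   # NFS path, so any node can run it
    deploy:
      placement:
        constraints:
          - node.platform.arch != aarch64
          - node.platform.arch != arm
          # no node.hostname constraint here: it is forbidden for NFS paths
```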
Identity Rules
Method 1 — LinuxServer.io and homelab images (preferred):
environment:
  PUID: "1964"
  PGID: "1964"
Method 2 — Official Docker Hub images:
user: "1964:1964"
Exemption — Images that manage their own users (Authentik, MailCow):
labels:
  gremlin.uid.exempt: "true"
  gremlin.uid.reason: "Authentik manages its own internal user context"
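As a sketch of where each identity setting sits in the file: user is a service-level key (a sibling of image), while the exemption labels go under deploy.labels like every other gremlin.* directive. Both service names below are hypothetical.

```yaml
services:
  official-app:                  # hypothetical: an official Docker Hub image
    image: vendor/image:tag
    user: "1964:1964"            # Method 2: service-level key, not under environment
  self-managed-app:              # hypothetical: an image that manages its own users
    image: vendor/image:tag
    deploy:
      labels:
        gremlin.uid.exempt: "true"
        gremlin.uid.reason: "Image manages its own internal user context"
```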
Caddy Label Rules
caddy: myservice.netgrimoire.com # hostname only — no https:// prefix
caddy.reverse_proxy: myservice:8080 # service name and port — no IP addresses
caddy.import_1: crowdsec # mandatory
caddy.import_2: authentik # mandatory
Services without a public URL (internal sidecars, databases):
gremlin.caddy.skip: "true"
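A sketch of values that follow the Caddy rules, next to ones that would violate them. The commented-out lines and the address 10.0.0.5 are hypothetical counterexamples; how Gremlin reports such violations is not covered in this guide.

```yaml
labels:
  caddy: myservice.netgrimoire.com             # hostname only
  # caddy: https://myservice.netgrimoire.com   # violates the rule: scheme prefix
  caddy.reverse_proxy: myservice:8080          # Swarm service name and port
  # caddy.reverse_proxy: 10.0.0.5:8080         # violates the rule: IP address
  caddy.import_1: crowdsec
  caddy.import_2: authentik
```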
Monitor Labels
Gremlin writes monitor endpoints to Gatus after each successful deploy.
monitor.name: MyService # display name in Gatus
monitor.url: https://myservice.netgrimoire.com
monitor.type: http # optional: http | tcp | ping | dns (default: http)
monitor.interval: "60" # optional: seconds, minimum 20 (default: 60)
Services that should not be monitored:
gremlin.monitor.skip: "true"
TCP example (for non-HTTP services):
monitor.type: tcp
monitor.url: myservice:5432
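For orientation, the monitor labels carry roughly the information a Gatus endpoint entry needs. The fragment below is a sketch in Gatus's own configuration format, assuming a one-to-one mapping from labels to endpoints. The entries, conditions, and naming that Gremlin actually writes are not documented in this guide and may differ; the name "MyService DB" is hypothetical.

```yaml
endpoints:
  - name: MyService                              # from monitor.name
    url: "https://myservice.netgrimoire.com"     # from monitor.url (type http)
    interval: 60s                                # from monitor.interval, default 60
    conditions:
      - "[STATUS] == 200"                        # assumed health condition
  - name: MyService DB                           # hypothetical TCP endpoint
    url: "tcp://myservice:5432"                  # monitor.type tcp plus monitor.url
    interval: 60s
    conditions:
      - "[CONNECTED] == true"                    # assumed health condition
```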
Homepage Labels
homepage.group: Media # dashboard group
homepage.name: MyService # display name
homepage.icon: myservice.png # icon filename
homepage.href: https://myservice.netgrimoire.com
homepage.description: Brief description
Services that should not appear on Homepage:
gremlin.homepage.skip: "true"
Auto-fix note: If homepage labels are missing, Gremlin derives them from the caddy: label and service name. Group defaults to "New", icon defaults to "servicename.png". Review and correct after auto-fix.
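As an illustration of the auto-fix, suppose a service named myservice carries only caddy: myservice.netgrimoire.com. The group and icon defaults are the documented ones; the name, href, and description values are assumptions about how the remaining labels might be derived.

```yaml
labels:
  caddy: myservice.netgrimoire.com
  homepage.group: New                                # documented default
  homepage.icon: myservice.png                       # documented default: servicename.png
  homepage.name: myservice                           # assumed: taken from the service name
  homepage.href: https://myservice.netgrimoire.com   # assumed: https:// plus the caddy hostname
  homepage.description: myservice                    # assumed placeholder
```

After a fix like this, the group and description are the values most likely to need a manual correction.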
Gremlin Directives
All directives go inside deploy.labels. All are opt-out — a stack with no gremlin.* labels gets full treatment.
Pipeline Control
gremlin.enable: "true"
# Set false to have Gremlin ignore this file entirely on push.
# Default: true
gremlin.checks: "all"
# Comma-separated checker IDs to run, or "all".
# Example: "swarm-syntax,identity,caddy"
# Default: all
gremlin.checks.skip: ""
# Comma-separated checker IDs to skip.
# Example: "homepage,monitor"
# Default: (none)
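For example, a stack that should be validated for everything except dashboard and monitoring labels could carry the fragment below. This is a sketch: the guide does not say how gremlin.checks and gremlin.checks.skip interact when both are set, so only one of them is active here.

```yaml
deploy:
  labels:
    gremlin.checks.skip: "homepage,monitor"   # every other checker still runs
    # Alternative: an allow-list instead of a skip-list
    # gremlin.checks: "swarm-syntax,identity,caddy"
```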
Auto-fix Control
gremlin.autofix: "true"
# Set false to disable all auto-fixing.
# Default: true
gremlin.autofix.skip: "false"
# Set true to notify but never attempt to fix.
# Default: false
gremlin.autofix.skip_fields: ""
# Comma-separated fields to skip during fix.
# Example: "hostname,uid"
# Default: (none)
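A sketch of the middle ground between full auto-fix and none: leave auto-fix on but protect specific fields. The field names are the ones from the example above; the complete list of valid field names is not given in this guide.

```yaml
deploy:
  labels:
    gremlin.autofix: "true"
    gremlin.autofix.skip_fields: "hostname,uid"   # do not rewrite these fields
```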
Deploy Control
gremlin.deploy: "true"
# Set false to run checks and fixes but never deploy.
# Use for test stacks or stacks managed manually.
# Default: true
gremlin.deploy.strategy: "stack"
# Deployment method. Values: stack | helm | kubectl
# Default: stack
Identity Exemptions
gremlin.uid.exempt: "false"
# Set true to skip PUID/PGID/user checks.
# Use for images that manage their own users.
# Default: false
gremlin.uid.reason: ""
# Documents why uid.exempt is set.
# Required when uid.exempt is true.
Placement Control
gremlin.arm.allow: "false"
# Set true to allow ARM/Pi deployment.
# Removes ARM exclusion constraints.
# Default: false
Service-level Skip Labels
gremlin.caddy.skip: "false" # skip Caddy label validation
gremlin.homepage.skip: "false" # skip Homepage label validation
gremlin.monitor.skip: "false" # skip monitor label validation
gremlin.diun.skip: "false" # skip Diun label validation
gremlin.network.skip: "false" # skip network validation (whole stack)
Ollama Context
gremlin.context: ""
# Free text passed to Ollama audit as ground truth.
# Ollama will not flag anything the context explains.
# Example: "OIDC_CLIENT_SECRET in plain text is intentional — no secrets manager in use"
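Longer context can be easier to maintain as a YAML folded scalar, which still produces a single-line string value. A sketch that reuses the example sentence above:

```yaml
deploy:
  labels:
    # ">-" folds the lines into one string and drops the trailing newline
    gremlin.context: >-
      OIDC_CLIENT_SECRET in plain text is intentional.
      No secrets manager is in use.
```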
Notification Control
gremlin.notify: "true" # false = suppress all ntfy for this stack
gremlin.notify.level: "all" # all | failures | none
Checker IDs
Use these IDs with gremlin.checks and gremlin.checks.skip:
| ID | What it checks |
|---|---|
| swarm-syntax | Forbidden fields: version, container_name, hostname, restart, depends_on, dnsrr |
| identity | PUID/PGID 1964 or user: "1964:1964" |
| network | netgrimoire overlay network |
| placement | ARM exclusions, DockerVol/hostname rules, restart_policy |
| caddy | caddy: label, reverse_proxy, import_1/import_2 |
| homepage | group, name, icon, href, description |
| monitor | monitor.name, monitor.url, optional type/interval |
| legacy-labels | Flags kuma.* labels for removal |
| diun | diun.enable: "true" |
Common Patterns
Internal sidecar (database, cache)
postgres:
  image: postgres:15
  environment:
    POSTGRES_USER: myapp
    POSTGRES_PASSWORD: secret
  volumes:
    - /DockerVol/myapp/postgres:/var/lib/postgresql/data
  networks:
    - netgrimoire
  deploy:
    restart_policy:
      condition: any
      delay: 5s
      max_attempts: 3
      window: 120s
    placement:
      constraints:
        - node.platform.arch != aarch64
        - node.platform.arch != arm
        - node.hostname == docker4
    labels:
      gremlin.caddy.skip: "true"
      gremlin.homepage.skip: "true"
      gremlin.monitor.skip: "true"
      diun.enable: "true"
Test stack (never deployed)
labels:
  gremlin.deploy: "false"
  # ... other labels
ARM/Pi service
labels:
  gremlin.arm.allow: "true"
  # ... other labels
placement:
  constraints:
    - node.hostname == dockerpi1
Image requiring root
labels:
  gremlin.uid.exempt: "true"
  gremlin.uid.reason: "Image requires root and does not support PUID/PGID"
  # ... other labels
Forbidden Fields
These fields are automatically removed by Gremlin:
| Field | Reason |
|---|---|
| version: (top-level) | Obsolete under the Compose Specification |
| container_name: | Conflicts with Swarm service naming |
| hostname: (service-level) | Conflicts with Swarm DNS |
| restart: (service-level) | Use deploy.restart_policy instead |
| depends_on: | Not supported in Swarm mode |
| links: | Not supported in Swarm mode |
These fields cause an unfixable block:
| Field | Reason |
|---|---|
| endpoint_mode: dnsrr | Breaks internal DNS resolution |
| Missing deploy: block | File treated as plain Compose, not Swarm |
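A before-and-after sketch of the automatic removals, written as two YAML documents. The input is hypothetical, and the second document shows only the removals listed in the first table, not any other fixes Gremlin might apply.

```yaml
# Before: Compose-style fields that do not belong in a Swarm stack
version: "3.8"
services:
  myservice:
    image: vendor/image:tag
    container_name: myservice
    hostname: myservice
    restart: unless-stopped
    depends_on:
      - postgres
    links:
      - postgres
    deploy:
      restart_policy:
        condition: any
---
# After: forbidden fields removed, deploy: block left as written
services:
  myservice:
    image: vendor/image:tag
    deploy:
      restart_policy:
        condition: any
```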
Troubleshooting
"Missing deploy: block" — file skipped as non-Swarm
Your compose file has no deploy: section. Add a deploy: block to each service for Swarm compatibility.
"uses /DockerVol/ but has no node.hostname constraint" — unfixable
Add a node.hostname constraint to your deploy.placement.constraints. Gremlin cannot guess which node to pin it to.
Ollama keeps blocking on legitimate config
Add gremlin.context to explain the situation. Ollama treats context as ground truth and will not flag it.
Auto-fix loop — fixes applied but same issues keep appearing
The fixer inserts the labels, but the checker does not recognize them afterwards. Check label indentation: entries under deploy.labels must be indented 8 spaces, which is four levels of 2-space indentation.
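Counting the indentation in a 2-space layout, assuming the usual services > service > deploy > labels nesting:

```yaml
services:                       # 0 spaces
  myservice:                    # 2 spaces
    image: vendor/image:tag
    deploy:                     # 4 spaces
      labels:                   # 6 spaces
        diun.enable: "true"     # 8 spaces: label entries belong at this depth
```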
Deploy skipped every time
Check gremlin.deploy — if set to "false" the pipeline validates and fixes but never deploys.