---
title: Stashapp Workflow
description:
published: true
date: 2026-02-20T04:25:56.467Z
tags:
editor: markdown
dateCreated: 2026-02-18T13:08:53.604Z
---

# StashApp: Automated Library Management with Community Scrapers

> **Goal:** Automatically identify, tag, rename, and organize your media library with minimal manual intervention using StashDB, ThePornDB, and the CommunityScrapers repository.

---

## Table of Contents

1. [Prerequisites](#1-prerequisites)
2. [Installing CommunityScrapers](#2-installing-communityscrapers)
3. [Configuring Metadata Providers](#3-configuring-metadata-providers)
   - [StashDB](#31-stashdb)
   - [ThePornDB (TPDB)](#32-theporndb-tpdb)
4. [Configuring Your Library](#4-configuring-your-library)
5. [Automated File Naming & Moving](#5-automated-file-naming--moving)
6. [The Core Workflow](#6-the-core-workflow)
7. [Handling ABMEA & Amateur Content](#7-handling-abmea--amateur-content)
8. [Automation with Scheduled Tasks](#8-automation-with-scheduled-tasks)
9. [Tips & Troubleshooting](#9-tips--troubleshooting)

---

## 1. Prerequisites

Before starting, make sure you have:

- **StashApp installed and running** — see the [official install docs](https://github.com/stashapp/stash/wiki/Installation)
- **Git installed** on your system (needed to clone the scrapers repo)
- **A ThePornDB account** — free tier available at [metadataapi.net](https://metadataapi.net)
- **A StashDB account** — requires a community invite; request one on [the Discord](https://discord.gg/2TsNFKt)
- Your Stash config directory noted — default locations:

| OS | Default Path |
|----|-------------|
| Windows | `%APPDATA%\stash` |
| macOS | `~/.stash` |
| Linux | `~/.stash` |
| Docker | `/root/.stash` |

---

## 2. Installing CommunityScrapers

The [CommunityScrapers](https://github.com/stashapp/CommunityScrapers) repository contains community-maintained scrapers for hundreds of sites. It is the primary source for site-specific scrapers, including ABMEA.

### Step 1 — Navigate to your Stash config directory

```bash
cd ~/.stash
```

### Step 2 — Create a scrapers directory if it doesn't exist

```bash
mkdir -p scrapers
cd scrapers
```

### Step 3 — Clone the CommunityScrapers repository

```bash
git clone https://github.com/stashapp/CommunityScrapers.git
```

This creates `~/.stash/scrapers/CommunityScrapers/` containing all available scrapers.

### Step 4 — Verify Stash detects the scrapers

1. Open Stash in your browser (default: `http://localhost:9999`)
2. Go to **Settings → Metadata Providers → Scrapers**
3. Click **Reload Scrapers**
4. You should now see a long list of scrapers, including entries for ABMEA, ManyVids, Clips4Sale, etc.

### Step 5 — Keep scrapers updated

The community scrapers are actively maintained, so pull updates periodically:

```bash
cd ~/.stash/scrapers/CommunityScrapers
git pull
```

> 💡 **Tip:** You can automate this with a cron job or scheduled task. See [Section 8](#8-automation-with-scheduled-tasks).

### Installing Python Dependencies (if prompted)

Some scrapers require Python packages. If you see scraper errors mentioning missing modules:

```bash
pip install requests cloudscraper py-cord lxml
```
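Before hunting through logs, you can check directly which of these modules the system Python can already import. This is a small diagnostic sketch; it assumes Stash invokes plain `python3`, so adjust the interpreter name if you have pointed Stash at a different one:

```shell
# Report which scraper dependencies are importable by python3.
# Assumes Stash uses the system python3; adjust if yours differs.
STATUS=""
for mod in requests cloudscraper lxml; do
  if python3 -c "import $mod" 2>/dev/null; then
    STATUS="$STATUS $mod=ok"
  else
    STATUS="$STATUS $mod=missing"
  fi
done
echo "Module status:$STATUS"
```

Anything reported `missing` is a candidate for the `pip install` line above.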

---

## 3. Configuring Metadata Providers

Stash uses **metadata providers** to automatically match scenes by fingerprint (phash/oshash). This is what enables true automation — no filename matching required.

### 3.1 StashDB

StashDB is the official community-run fingerprint and metadata database. It is the most reliable source for mainstream and studio content.

1. Go to **Settings → Metadata Providers**
2. Under **Stash-Box Endpoints**, click **Add**
3. Fill in:
   - **Name:** `StashDB`
   - **Endpoint:** `https://stashdb.org/graphql`
   - **API Key:** *(generate this from your StashDB account → API Keys)*
4. Click **Confirm**

### 3.2 ThePornDB (TPDB)

TPDB aggregates metadata from a large number of sites and is especially useful for amateur, clip site, and ABMEA content that may not be on StashDB.

1. Log in at [metadataapi.net](https://metadataapi.net) and go to your **API Settings** to get your key
2. In Stash, go to **Settings → Metadata Providers**
3. Under **Stash-Box Endpoints**, click **Add**
4. Fill in:
   - **Name:** `ThePornDB`
   - **Endpoint:** `https://theporndb.net/graphql`
   - **API Key:** *(your TPDB API key)*
5. Click **Confirm**

### Provider Priority Order

Set your Identify task to query providers in this order for best results:

1. **StashDB** — highest quality, community-verified
2. **ThePornDB** — broad coverage, including amateur/clip sites
3. **CommunityScrapers** (site-specific) — for anything not matched above

---

## 4. Configuring Your Library

### Adding Library Paths

1. Go to **Settings → Library**
2. Under **Directories**, click **Add** and point to your media folders
3. You can add multiple directories (e.g., separate drives or folders)

> ⚠️ **Do not** set your organized output folder as a source directory. Keep source and destination separate until you are confident in your setup.

### Recommended Directory Structure

```
/media/
├── stash-incoming/   ← Source: where new files land
└── stash-library/    ← Destination: where Stash moves organized files
    ├── Studios/
    │   └── ABMEA/
    └── Amateur/
```
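The layout above can be created in one shot. A sketch, with the top-level path made overridable so you can dry-run it somewhere harmless first:

```shell
# Create the recommended incoming/library layout.
# BASE defaults to a scratch path for a dry run; set BASE=/media for real use.
BASE="${BASE:-/tmp/stash-layout-demo}"
mkdir -p "$BASE/stash-incoming" \
         "$BASE/stash-library/Studios/ABMEA" \
         "$BASE/stash-library/Amateur"
find "$BASE" -type d | sort
```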

---

## 5. Automated File Naming & Moving

This is the section that does the heavy lifting. Stash will rename and move files **only when a scene is marked as Organized**, which gives you a review gate before anything is touched.

### Enable File Moving

1. Go to **Settings → Library**
2. Enable **"Move files to organized folder on organize"**
3. Set your **Organized folder path** (e.g., `/media/stash-library`)

### Configure the File Naming Template

Still in **Settings → Library**, set your **Filename template**. These use Go template syntax with Stash variables.

**Recommended template for mixed studio/amateur libraries:**

```
{studio}/{date} {title}
```

**For performer-centric amateur libraries:**

```
{performers}/{studio}/{date} {title}
```

**Full example with fallbacks:**

```
{{if .Studio}}{{.Studio.Name}}{{else}}Unknown{{end}}/{{if .Date}}{{.Date}}{{else}}0000-00-00{{end}} {{.Title}}
```
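As a sanity check, here is what these templates produce using the sample values from the variable table in this section (studio `ABMEA`, date `2024-03-15`, title `Scene Title Here`):

```
{studio}/{date} {title}        →  ABMEA/2024-03-15 Scene Title Here.mp4
Fallback template, no match    →  Unknown/0000-00-00 Scene Title Here.mp4
```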

### Available Template Variables

| Variable | Example Output |
|----------|---------------|
| `{title}` | `Scene Title Here` |
| `{date}` | `2024-03-15` |
| `{studio}` | `ABMEA` |
| `{performers}` | `Jane Doe` |
| `{resolution}` | `1080p` |
| `{duration}` | `00-32-15` |
| `{rating}` | `5` |

> 💡 If a field is empty (e.g., no studio), Stash skips that path segment. Test with a few scenes before running on your whole library.

---

## 6. The Core Workflow

Follow these steps **in order** every time you add new content. This is the automated pipeline.

```
New Files → Scan → Generate Fingerprints → Identify → Review → Organize (Move + Rename)
```

### Step 1 — Scan

**Tasks → Scan**

- Discovers new files and adds them to the database
- Does not move or rename anything yet
- Options to enable: **Generate covers on scan**

### Step 2 — Generate Fingerprints

**Tasks → Generate**

Select these options:

| Option | Purpose |
|--------|---------|
| ✅ **Phashes** | Used for fingerprint matching against StashDB/TPDB |
| ✅ **Checksums (MD5/SHA256)** | Used for duplicate detection |
| ✅ **Previews** | Thumbnail previews in the UI |
| ✅ **Sprites** | Timeline scrubber images |

> ⏳ This step is CPU/GPU intensive. Let it complete before proceeding. On a large library it may take hours.

### Step 3 — Identify (Auto-Scrape by Fingerprint)

**Tasks → Identify**

This is the magic step. Stash sends your file fingerprints to StashDB and TPDB and pulls back metadata automatically.

Configure the task:

1. Click **Add Source** and add **StashDB** first
2. Click **Add Source** again and add **ThePornDB**
3. Under **Options**, enable:
   - ✅ Set cover image
   - ✅ Set performers
   - ✅ Set studio
   - ✅ Set tags
   - ✅ Set date
4. Click **Identify**

Stash will now automatically match and populate metadata for any scene it recognizes by fingerprint.

### Step 4 — Auto Tag (Filename-Based Fallback)

For scenes that didn't match by fingerprint (common with amateur content), use Auto Tag to extract metadata from filenames.

**Tasks → Auto Tag**

- Matches **Performers**, **Studios**, and **Tags** from filenames against your existing database entries
- Works best when filenames contain names (e.g., `JaneDoe_SceneTitle_1080p.mp4`)

### Step 5 — Review Unmatched Scenes

Filter to find scenes that still need attention:

1. Go to **Scenes**
2. Filter by **Organized = false** and **Studio = none** (or **Performers = none**)
3. Use the **Tagger view** (icon in the top right of Scenes) for rapid URL-based scraping

In Tagger view:

- Paste the original source URL into the scrape field
- Click **Scrape** — Stash fills in all metadata from that URL
- Review and click **Save**

### Step 6 — Organize (Move & Rename)

Once you're satisfied with a scene's metadata:

1. Open the scene and click the **Organize** button (checkmark icon), or
2. Use **bulk organize**: select multiple scenes → Edit → Mark as Organized

When a scene is marked Organized, Stash will:

- ✅ Rename the file according to your template
- ✅ Move it to your organized folder
- ✅ Update the database path

> ⚠️ **This action cannot be easily undone at scale.** Always verify metadata on a small batch first.

---

## 7. Handling ABMEA & Amateur Content

ABMEA and amateur clips often lack fingerprint matches. Use these additional strategies:

### ABMEA-Specific Scraper

The CommunityScrapers repo includes an ABMEA scraper. To use it manually:

1. Open a scene in Stash
2. Click **Edit → Scrape with → ABMEA**
3. If the scene URL is known, enter it; otherwise the scraper will search by title

### Batch URL Scraping Workflow for ABMEA

If you have many files sourced from ABMEA:

1. Before ingesting files, **rename them to include the ABMEA scene ID** in the filename if possible (e.g., `ABMEA-0123_title.mp4`)
2. After scanning, go to **Tagger View**
3. Filter to unmatched scenes and paste ABMEA URLs one by one

### Amateur Content Without a Source Site

For truly anonymous amateur clips:

1. Create a **Studio** entry called `Amateur` (or more specific names like `Amateur - Reddit`)
2. Create **Performer** entries for recurring people you can identify
3. Use **Auto Tag** to match these once entries exist
4. Use tags liberally to compensate for missing structured metadata: `amateur`, `homemade`, `POV`, etc.

### Tag Hierarchy Recommendation

Set up tag parents in **Settings → Tags** to create a browsable hierarchy:

```
Content Type
├── Amateur
├── Professional
└── Compilation

Source
├── ABMEA
├── Clip Site
└── Unknown

Quality
├── 4K
├── 1080p
└── SD
```

---

## 8. Automation with Scheduled Tasks

Minimize manual steps by scheduling recurring tasks.

### Setting Up Scheduled Tasks in Stash

Go to **Settings → Tasks → Scheduled Tasks** and create:

| Task | Schedule | Purpose |
|------|----------|---------|
| Scan | Every 6 hours | Pick up new files automatically |
| Generate (Phashes only) | Every 6 hours | Fingerprint new files |
| Identify | Daily at 2am | Match newly fingerprinted files |
| Auto Tag | Daily at 3am | Filename-based fallback tagging |
| Clean | Weekly | Remove missing files from the database |

### Auto-Update CommunityScrapers (Linux/macOS)

Add to your crontab (`crontab -e`):

```bash
# Update CommunityScrapers every Sunday at midnight
0 0 * * 0 cd ~/.stash/scrapers/CommunityScrapers && git pull
```
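Cron discards output by default, so a failed pull can go unnoticed for weeks. A slightly more defensive variant of the same entry keeps a record; the log path here is an arbitrary choice, and `--ff-only` makes the pull refuse unexpected merges rather than guessing:

```bash
# Weekly update that keeps a log you can inspect
0 0 * * 0 cd "$HOME/.stash/scrapers/CommunityScrapers" && git pull --ff-only >> "$HOME/.stash/scrapers/update.log" 2>&1
```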

### Auto-Update CommunityScrapers (Windows)

Create a scheduled task in Task Scheduler running:

```powershell
cd C:\Users\YourUser\.stash\scrapers\CommunityScrapers; git pull
```

---

## 9. Tips & Troubleshooting

### Scraper not appearing in Stash

- Go to **Settings → Metadata Providers → Scrapers** and click **Reload Scrapers**
- Check that the `.yml` scraper file is in a subdirectory of your scrapers folder
- Check the Stash logs (**Settings → Logs**) for scraper loading errors

### Identify finds no matches

- Confirm phashes were generated (check scene details — the phash should be populated)
- Confirm your StashDB/TPDB API keys are correctly entered and not expired
- The file may simply not be in either database — proceed to manual URL scraping

### Files not moving after marking as Organized

- Confirm **"Move files to organized folder"** is enabled in Settings → Library
- Confirm the organized folder path is set and the folder exists
- Check that Stash has write permissions to both source and destination

### Duplicate files

Run **Tasks → Clean → Find Duplicates** before organizing to avoid moving duplicates into your library. Stash uses phash to find visual duplicates even if filenames differ.

### Metadata keeps getting overwritten

In **Settings → Scraping**, set the **Scrape behavior** to `If not set` instead of `Always` to prevent already-populated fields from being overwritten during re-scrapes.

### Useful Stash Plugins

Install via **Settings → Plugins → Browse Available Plugins**:

| Plugin | Purpose |
|--------|---------|
| **Performer Image Cleanup** | Remove duplicate performer images |
| **Tag Graph** | Visualize tag relationships |
| **Duplicate Finder** | Advanced duplicate management |
| **Stats** | Library analytics dashboard |

---

## Quick Reference Checklist

Use this checklist every time you add new content:

```
[ ] Drop files into the stash-incoming directory
[ ] Tasks → Scan
[ ] Tasks → Generate → Phashes + Checksums
[ ] Tasks → Identify (StashDB → TPDB)
[ ] Tasks → Auto Tag
[ ] Review unmatched scenes in Tagger View
[ ] Manually scrape remaining unmatched scenes by URL
[ ] Spot-check metadata on a sample of scenes
[ ] Bulk select reviewed scenes → Mark as Organized
[ ] Verify a few files moved and renamed correctly
[ ] Done ✓
```
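The task steps in this checklist can also be queued headlessly through Stash's GraphQL API, which is handy if you want a single script to run the whole pipeline. This is a hedged sketch: the mutation names below (`metadataScan`, `metadataGenerate`, `metadataIdentify`, `metadataAutoTag`) match Stash's task API as far as I know, but verify them in your instance's GraphQL playground before relying on this, and note that each mutation only queues an asynchronous job:

```shell
# Sketch: queue the checklist's task steps in order via GraphQL.
# Mutation names are assumptions -- confirm in your instance's playground.
STASH_GQL="${STASH_GQL:-http://localhost:9999/graphql}"
PAYLOADS=""
for m in metadataScan metadataGenerate metadataIdentify metadataAutoTag; do
  p="{\"query\":\"mutation { $m(input: {}) }\"}"
  PAYLOADS="$PAYLOADS$p"$'\n'
  # Uncomment to queue each task on a live instance:
  # curl -s -X POST "$STASH_GQL" -H "Content-Type: application/json" \
  #      -H "ApiKey: $STASH_API_KEY" -d "$p"
done
printf '%s' "$PAYLOADS"
```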

---

*Last updated: February 2026 | Stash version compatibility: 0.25+*
*Community resources: [Stash Discord](https://discord.gg/2TsNFKt) | [GitHub](https://github.com/stashapp/stash) | [Wiki](https://github.com/stashapp/stash/wiki)*
---
title: Green Grimoire
description: Adult media stack — the satyr's private library
published: true
date: 2026-04-12T00:00:00.000Z
tags: green, adult, stash
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Green Grimoire



The Green Grimoire is the self-hosted adult media stack. It runs on a separate host and domain from Netgrimoire. All services sit behind `*.wasted-bandwidth.net` and Authelia. Homepage tab: **Nucking-Futz**.

Data lives at `/data/nfs/Baxter/Green/` with two libraries: Clips and Movies.

---

## Services

| Service | URL | Port | Purpose | Host |
|---------|-----|------|---------|------|
| Stash (main) | `stash.wasted-bandwidth.net` | 9999 | Primary adult content library | znas / Compose |
| GreenFin (Jellyfinx) | Internal | 7096 | Green Door media server | docker5 / Compose |
| Namer | `namer.wasted-bandwidth.net` | 6980 | Scene file namer | znas / Compose |
| Whisparr | — | — | Adult content acquisition | znas / Swarm |
| NZBGet | — | — | Downloader | znas / Swarm |
| PocketStash | Internal | 9998 | Stash instance for Pocket Grimoire sync | znas / Compose |

---

## Data Structure

```
/data/nfs/Baxter/Green/
├── Clips/    ← Clips library
├── Movies/   ← Movies library
└── Pocket/   ← Synced to Pocket Grimoire pre-travel
```

---

## Pocket Integration

PocketStash (port 9998) is a separate Stash instance that maintains a curated subset for travel. Before a trip, `syncoid` pushes `vault/Green/Pocket` to the Pocket Grimoire laptop. The Pocket instance runs in read-only travel mode — no writes while traveling.
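Sketched concretely, the pre-travel push might look like the following. The dataset name is taken from this page; the receiving user and hostname (`root@pocket-grimoire`) and the recursive flag are assumptions to adapt to the actual laptop:

```shell
# Sketch of the pre-travel sync command. Target user/host are placeholders;
# vault/Green/Pocket is the dataset named above. -r recurses into child datasets.
SYNC_CMD='syncoid -r vault/Green/Pocket root@pocket-grimoire:vault/Green/Pocket'
echo "$SYNC_CMD"
# Run it on the sending host before a trip:
# eval "$SYNC_CMD"
```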

See [Stash Integration](/Pocket-Grimoire/Software/Stash-Integration) in the Pocket Grimoire docs.

---

## Sections

| | |
|---|---|
| [Stash Management](/Green-Grimoire/Library/Stash-Management) | Library config, scrapers, metadata workflow |
| [VHS Restoration](/Green-Grimoire/Scripts/VHS-Restoration) | Encoding, deinterlace, restoration scripts |
---
title: Video Restoration Script
description: Restore VHS Video Captures
published: true
date: 2026-03-06T03:48:12.713Z
tags:
editor: markdown
dateCreated: 2026-03-06T03:48:05.841Z
---

# VHS Video Restoration — User Guide

A pipeline script for cleaning up and upscaling old VHS captures on Ubuntu 24.04. It runs in two modes: a fast FFmpeg-only cleanup pass, and a full AI upscale using Real-ESRGAN.

---

## Requirements

- **Ubuntu 24.04**
- **FFmpeg** — `sudo apt install ffmpeg`
- **bc** — `sudo apt install bc`
- **Real-ESRGAN** (optional, for AI upscaling — see setup below)

---

## File Setup

Place everything in a working folder with this structure:

```
~/your-folder/
├── vhs_restore.sh
├── realesrgan-ncnn-vulkan   ← AI upscaler binary (optional)
├── models/                  ← Real-ESRGAN model files
├── input/                   ← Put your source videos here
├── output/                  ← Restored videos appear here
└── work/                    ← Temporary scratch files (auto-created)
```

Supported input formats: `.mpg`, `.mpeg`, `.mp4`, `.avi`, `.mov`, `.mkv`, `.wmv`, `.m4v`, `.ts`

---

## First-Time Setup

```bash
# Make the script executable
chmod +x vhs_restore.sh

# Create the input folder and add your videos
mkdir input
cp /path/to/your/videos/*.mpg input/
```

### Installing Real-ESRGAN (one-time, for AI upscaling)

1. Download the latest Ubuntu release from https://github.com/xinntao/Real-ESRGAN/releases — look for `realesrgan-ncnn-vulkan-*-ubuntu.zip`
2. Unzip into your working folder
3. `chmod +x realesrgan-ncnn-vulkan`

---

## Running the Script

### Quick cleanup only (recommended first pass)

Fast — processes each file in a few minutes. No AI upscaling.

```bash
./vhs_restore.sh --no-ai
```

### Full pipeline with AI upscaling

Slow on CPU (plan for several hours per hour of footage). Produces the best results.

```bash
./vhs_restore.sh
```

### All options

| Flag | Description | Default |
|------|-------------|---------|
| `-i DIR` | Input directory | `./input` |
| `-o DIR` | Output directory | `./output` |
| `-w DIR` | Scratch/work directory | `./work` |
| `-b PATH` | Path to Real-ESRGAN binary | `./realesrgan-ncnn-vulkan` |
| `-s 2` or `-s 4` | Upscale factor | `2` |
| `-q 16` | Output quality (0–51, lower = better) | `16` |
| `--no-ai` | Skip AI upscaling, FFmpeg only | off |
| `--keep` | Keep extracted PNG frames after processing | off |
| `-h` | Show help | |

**Examples:**

```bash
# Process files from a custom folder
./vhs_restore.sh -i ~/Videos/VHS -o ~/Videos/Restored

# 4x upscale with a slightly smaller output file
./vhs_restore.sh -s 4 -q 18

# FFmpeg cleanup only, custom folders
./vhs_restore.sh -i ~/Videos/VHS -o ~/Videos/Restored --no-ai
```

---

## What the Script Does

**Stage 1 — FFmpeg cleanup** (always runs):

- Deinterlaces the video (`yadif`) — removes the horizontal combing artifacts common in VHS captures
- Denoises (`hqdn3d=2:1:2:2`) — gentle noise reduction that avoids motion blocking
- Sharpens edges (`unsharp`) — recovers detail softened by the denoise step
- Colour corrects — boosts washed-out VHS colour, adjusts contrast and gamma, corrects the green/yellow cast common in aged tape

**Stage 2 — Frame extraction** (AI mode only):

- Extracts every frame as a PNG into a temporary folder

**Stage 3 — Real-ESRGAN upscaling** (AI mode only):

- Runs the `realesr-animevideov3` model on each frame
- Default: 2× upscale (e.g. 640×480 → 1280×960)

**Reassembly:**

- Rebuilds the video from the upscaled frames with the original audio
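Stage 1 can be approximated as a standalone FFmpeg command if you want to experiment with the filter values outside the script. The `hqdn3d` and `saturation` numbers below are the ones quoted elsewhere in this guide; the `unsharp` and contrast/gamma values are illustrative placeholders, not the script's exact settings:

```shell
# Stage 1 cleanup as a one-off filter chain.
# hqdn3d/saturation values are from this guide; unsharp/contrast/gamma are placeholders.
VF='yadif=mode=1,hqdn3d=2:1:2:2,unsharp=5:5:0.8,eq=saturation=1.8:contrast=1.1:gamma=1.05'
echo "$VF"
# ffmpeg -i input/tape.mpg -vf "$VF" -c:v libx264 -crf 16 -preset slow -c:a aac work/tape_cleaned.mp4
```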

---

## Live Progress

The script shows live FFmpeg output. Watch for:

- `speed=3.5x` — processing at 3.5× realtime (good)
- `speed=0.5x` — slow, likely a very heavy filter load
- `corrupt decoded frame` — normal for damaged VHS files; FFmpeg will push through

---

## Troubleshooting

**Script hangs with no output**
Run with `--no-ai` first to confirm FFmpeg is working, then check that your Real-ESRGAN binary is executable (`chmod +x realesrgan-ncnn-vulkan`).

**Output looks blocky during motion**
The denoise values may still be too high for your footage. Edit the script and reduce `hqdn3d=2:1:2:2` to `hqdn3d=1:1:1:1`, or remove `hqdn3d` entirely — Real-ESRGAN handles noise well on its own.

**Colour looks over-saturated**
Reduce `saturation=1.8` in the filter chain to `saturation=1.4` or `1.2`.

**Real-ESRGAN not found**
Ensure the binary is in the same folder as the script and is executable, or pass the path explicitly: `./vhs_restore.sh -b /path/to/realesrgan-ncnn-vulkan`

**Error logs**
All FFmpeg and Real-ESRGAN logs are saved to `/tmp/` for diagnosis:

- `/tmp/ffmpeg_stage1.log`
- `/tmp/ffmpeg_extract.log`
- `/tmp/realesrgan.log`
- `/tmp/ffmpeg_reassemble.log`

---

## Workflow Recommendation

1. Run `--no-ai` first on one file to check the cleanup result
2. If it looks good, run the full pipeline on all files overnight
3. For heavily damaged footage, consider also running **CodeFormer** (face restoration) on top of the output — particularly effective if the video contains people

---

## Output

Restored files are saved to `./output/` as `<original_name>_restored.mp4`, encoded as H.264 with AAC audio.

## vhs_restore.sh Script
`#!/usr/bin/env bash
|
||||
# =============================================================================
|
||||
# vhs_restore.sh — Automated VHS Video Restoration Pipeline
|
||||
# Stages: Deinterlace → Denoise → Colour correct → AI Upscale → Reassemble
|
||||
#
|
||||
# Changes from v1:
|
||||
# - Gentle hqdn3d (2:1:2:2) to prevent motion blocking/pixelation
|
||||
# - Aggressive colour correction for washed-out VHS footage
|
||||
# - Live FFmpeg progress shown in terminal (no silent hanging)
|
||||
# - Logs still saved to /tmp/ for error diagnosis
|
||||
# =============================================================================
|
||||
set -euo pipefail
|
||||
|
||||
# ── Colour output helpers ────────────────────────────────────────────────────
|
||||
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'
|
||||
CYAN='\033[0;36m'; BOLD='\033[1m'; NC='\033[0m'
|
||||
info() { echo -e "${CYAN}[INFO]${NC} $*"; }
|
||||
success() { echo -e "${GREEN}[OK]${NC} $*"; }
|
||||
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
|
||||
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }
|
||||
header() { echo -e "\n${BOLD}${CYAN}══ $* ══${NC}"; }
|
||||
|
||||
# ── Default configuration ────────────────────────────────────────────────────
|
||||
INPUT_DIR="./input" # Folder containing your source VHS videos
|
||||
OUTPUT_DIR="./output" # Final restored videos land here
|
||||
WORK_DIR="./work" # Scratch space (frames, temp files)
|
||||
REALESRGAN_BIN="./realesrgan-ncnn-vulkan" # Path to Real-ESRGAN binary
|
||||
REALESRGAN_MODEL="realesr-animevideov3" # Best model for home video
|
||||
UPSCALE_FACTOR=2 # 2x or 4x (4x is very slow on CPU)
|
||||
OUTPUT_WIDTH=1920 # Target width used in --no-ai mode
|
||||
OUTPUT_HEIGHT=1080 # Target height used in --no-ai mode
|
||||
CRF=16 # Output quality 0-51, lower = better
|
||||
PRESET="slow" # FFmpeg encode preset
|
||||
SKIP_UPSCALE=false # --no-ai flag sets this true
|
||||
KEEP_FRAMES=false # --keep flag sets this true
|
||||
|
||||
# ── Parse CLI flags ──────────────────────────────────────────────────────────
|
||||
usage() {
|
||||
cat <<EOF
|
||||
Usage: $(basename "$0") [options]
|
||||
|
||||
Options:
|
||||
-i DIR Input directory (default: ./input)
|
||||
-o DIR Output directory (default: ./output)
|
||||
-w DIR Work/scratch dir (default: ./work)
|
||||
-b PATH Path to realesrgan-ncnn-vulkan binary
|
||||
-s FACTOR Upscale factor: 2 or 4 (default: 2)
|
||||
-q CRF Output quality 0-51, lower=better (default: 16)
|
||||
--no-ai Skip Real-ESRGAN; FFmpeg cleanup only (fast)
|
||||
--keep Keep extracted frames after processing
|
||||
-h Show this help
|
||||
|
||||
Examples:
|
||||
$(basename "$0") -i ~/Videos/VHS -o ~/Videos/Restored
|
||||
$(basename "$0") -i ~/Videos/VHS --no-ai # Quick cleanup only
|
||||
$(basename "$0") -i ~/Videos/VHS -s 4 -q 18 # 4x upscale
|
||||
EOF
|
||||
exit 0
|
||||
}
|
||||
|
||||
while [[ $# -gt 0 ]]; do
|
||||
case "$1" in
|
||||
-i) INPUT_DIR="$2"; shift 2 ;;
|
||||
-o) OUTPUT_DIR="$2"; shift 2 ;;
|
||||
-w) WORK_DIR="$2"; shift 2 ;;
|
||||
-b) REALESRGAN_BIN="$2"; shift 2 ;;
|
||||
-s) UPSCALE_FACTOR="$2"; shift 2 ;;
|
||||
-q) CRF="$2"; shift 2 ;;
|
||||
--no-ai) SKIP_UPSCALE=true; shift ;;
|
||||
--keep) KEEP_FRAMES=true; shift ;;
|
||||
-h|--help) usage ;;
|
||||
*) error "Unknown option: $1"; usage ;;
|
||||
esac
|
||||
done
|
||||
|
||||
# ── Dependency checks ────────────────────────────────────────────────────────
|
||||
header "Checking dependencies"
|
||||
|
||||
check_cmd() {
|
||||
if command -v "$1" &>/dev/null; then
|
||||
success "$1 found"
|
||||
else
|
||||
error "$1 not found. Install with: $2"
|
||||
exit 1
|
||||
fi
|
||||
}
|
||||
|
||||
check_cmd ffmpeg "sudo apt install ffmpeg"
|
||||
check_cmd ffprobe "sudo apt install ffmpeg"
|
||||
check_cmd bc "sudo apt install bc"
|
||||
|
||||
if [[ "$SKIP_UPSCALE" == false ]]; then
|
||||
if [[ ! -x "$REALESRGAN_BIN" ]]; then
|
||||
warn "Real-ESRGAN binary not found at: $REALESRGAN_BIN"
|
||||
echo
|
||||
echo -e "${YELLOW}To install Real-ESRGAN:${NC}"
|
||||
echo " 1. Download: https://github.com/xinntao/Real-ESRGAN/releases"
|
||||
echo " -> realesrgan-ncnn-vulkan-*-ubuntu.zip"
|
||||
echo " 2. Unzip into this directory"
|
||||
echo " 3. chmod +x realesrgan-ncnn-vulkan"
|
||||
echo " 4. Re-run this script"
|
||||
echo
|
||||
echo "Or run with --no-ai for FFmpeg-only cleanup (no upscaling)."
|
||||
exit 1
|
||||
fi
|
||||
success "Real-ESRGAN found"
|
||||
fi
|
||||
|
||||
# ── Locate input files ───────────────────────────────────────────────────────
|
||||
header "Scanning input directory: $INPUT_DIR"
|
||||
|
||||
if [[ ! -d "$INPUT_DIR" ]]; then
|
||||
error "Input directory not found: $INPUT_DIR"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
mapfile -t VIDEO_FILES < <(find "$INPUT_DIR" -maxdepth 1 \
|
||||
-type f \( -iname "*.mp4" -o -iname "*.avi" -o -iname "*.mov" \
|
||||
-o -iname "*.mkv" -o -iname "*.mpg" -o -iname "*.mpeg" \
|
||||
-o -iname "*.wmv" -o -iname "*.m4v" -o -iname "*.ts" \) \
|
||||
| sort)
|
||||
|
||||
if [[ ${#VIDEO_FILES[@]} -eq 0 ]]; then
|
||||
error "No video files found in $INPUT_DIR"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
info "Found ${#VIDEO_FILES[@]} video file(s):"
|
||||
for f in "${VIDEO_FILES[@]}"; do echo " * $(basename "$f")"; done
|
||||
|
||||
# ── Helpers ──────────────────────────────────────────────────────────────────
|
||||
probe() {
|
||||
ffprobe -v error -select_streams v:0 \
|
||||
-show_entries "stream=$2" -of csv=p=0 "$1" 2>/dev/null | head -1
|
||||
}
|
||||
|
||||
human_time() {
|
||||
local s="${1%.*}"
|
||||
printf '%dh %dm %ds' $((s/3600)) $(( (s%3600)/60 )) $((s%60))
|
||||
}
|
||||
|
||||
# ── Create directories ───────────────────────────────────────────────────────
mkdir -p "$OUTPUT_DIR" "$WORK_DIR"

# ── Overall stats ────────────────────────────────────────────────────────────
TOTAL_FILES=${#VIDEO_FILES[@]}
PROCESSED=0
FAILED=0
PIPELINE_START=$(date +%s)

# ════════════════════════════════════════════════════════════════════════════
# MAIN LOOP
# ════════════════════════════════════════════════════════════════════════════
for INPUT_FILE in "${VIDEO_FILES[@]}"; do

    BASENAME=$(basename "$INPUT_FILE")
    STEM="${BASENAME%.*}"
    CLEANED="$WORK_DIR/${STEM}_cleaned.mp4"
    FRAMES_IN="$WORK_DIR/${STEM}_frames_in"
    FRAMES_OUT="$WORK_DIR/${STEM}_frames_out"
    FINAL_OUTPUT="$OUTPUT_DIR/${STEM}_restored.mp4"

    header "Processing: $BASENAME ($((PROCESSED+1))/$TOTAL_FILES)"
    FILE_START=$(date +%s)

    # ── Probe source ──────────────────────────────────────────────────────────
    FPS=$(probe "$INPUT_FILE" "r_frame_rate")
    FPS_DEC=$(echo "scale=3; $FPS" | bc 2>/dev/null || echo "25")
    WIDTH=$(probe "$INPUT_FILE" "width")
    HEIGHT=$(probe "$INPUT_FILE" "height")
    FIELD_ORDER=$(probe "$INPUT_FILE" "field_order")
    DURATION=$(ffprobe -v error -show_entries format=duration \
        -of csv=p=0 "$INPUT_FILE" 2>/dev/null | head -1)

    info "Source: ${WIDTH}x${HEIGHT} ${FPS_DEC}fps $(human_time "${DURATION%.*}") field_order=${FIELD_ORDER:-unknown}"

    # Always deinterlace for VHS -- safe even if not flagged as interlaced
    if [[ "$FIELD_ORDER" =~ ^(tt|tb|bt|bb)$ ]]; then
        DEINTERLACE_FILTER="yadif=mode=1,"
        info "Interlacing detected — applying yadif deinterlacer"
    else
        DEINTERLACE_FILTER="yadif=mode=1,"
        warn "Interlacing not confirmed by probe — applying yadif anyway (safe for VHS)"
    fi

    # ── Stage 1: FFmpeg cleanup ───────────────────────────────────────────────
    header "Stage 1/3 — FFmpeg cleanup & colour correction"
    info "Watch fps= and speed= for live progress."
    info "Corrupt frame warnings are normal for old VHS captures."
    echo

    if [[ "$SKIP_UPSCALE" == true ]]; then
        SCALE_FILTER="scale=${OUTPUT_WIDTH}:${OUTPUT_HEIGHT}:flags=lanczos,"
    else
        SCALE_FILTER=""
    fi

    # Filter chain notes:
    #   hqdn3d=2:1:2:2 -- gentle denoise; low temporal values (3rd/4th)
    #                     prevent the motion blocking seen with higher values
    #   unsharp        -- moderate sharpening to recover edge detail
    #   eq             -- aggressive colour boost for washed-out VHS
    #   colorbalance   -- corrects the green/yellow cast common in aged VHS
    VFILTER="${DEINTERLACE_FILTER}\
hqdn3d=2:1:2:2,\
unsharp=3:3:0.5:3:3:0.3,\
eq=contrast=1.2:brightness=0.05:saturation=1.8:gamma=1.1,\
colorbalance=rs=0.1:gs=0.0:bs=-0.1,\
${SCALE_FILTER}\
format=yuv420p"
    if ! ffmpeg -y -i "$INPUT_FILE" \
        -vf "$VFILTER" \
        -c:v libx264 -crf 18 -preset medium \
        -c:a aac -b:a 192k -ac 2 \
        -stats \
        "$CLEANED" 2>&1 | tee /tmp/ffmpeg_stage1.log | \
        grep --line-buffered -E "(frame=|speed=|error|Error|Invalid)"; then
        error "FFmpeg stage 1 failed. Full log: /tmp/ffmpeg_stage1.log"
        FAILED=$((FAILED+1))
        continue
    fi

    echo
    success "Stage 1 complete -> $(du -sh "$CLEANED" | cut -f1)"

    if [[ "$SKIP_UPSCALE" == true ]]; then
        cp "$CLEANED" "$FINAL_OUTPUT"
        success "Output (no AI): $FINAL_OUTPUT"
        PROCESSED=$((PROCESSED+1))
        [[ "$KEEP_FRAMES" == false ]] && rm -f "$CLEANED"
        continue
    fi

    # ── Stage 2: Extract frames ───────────────────────────────────────────────
    header "Stage 2/3 — Extracting frames for AI upscaling"
    mkdir -p "$FRAMES_IN" "$FRAMES_OUT"

    FRAME_COUNT=$(ffprobe -v error -count_packets \
        -select_streams v:0 -show_entries stream=nb_read_packets \
        -of csv=p=0 "$CLEANED" 2>/dev/null | head -1)
    FRAME_COUNT=${FRAME_COUNT:-0}
    info "Extracting ~${FRAME_COUNT} frames..."

    if ! ffmpeg -y -i "$CLEANED" \
        -vsync 0 -stats \
        "$FRAMES_IN/frame%08d.png" 2>&1 | tee /tmp/ffmpeg_extract.log | \
        grep --line-buffered -E "(frame=|speed=|error|Error)"; then
        error "Frame extraction failed. Full log: /tmp/ffmpeg_extract.log"
        FAILED=$((FAILED+1))
        continue
    fi

    ACTUAL_FRAMES=$(find "$FRAMES_IN" -name "*.png" | wc -l)
    echo
    success "Extracted $ACTUAL_FRAMES frames"

    # ── Stage 3: Real-ESRGAN ──────────────────────────────────────────────────
    header "Stage 3/3 — Real-ESRGAN AI upscaling (${UPSCALE_FACTOR}x)"
    warn "Slow on CPU — est. $(echo "scale=0; $ACTUAL_FRAMES * 10 / 60" | bc)-$(echo "scale=0; $ACTUAL_FRAMES * 30 / 60" | bc) minutes"
    info "Upscaled frames will appear in: $FRAMES_OUT"
    echo

    UPSCALE_START=$(date +%s)
    if ! "$REALESRGAN_BIN" \
        -i "$FRAMES_IN" \
        -o "$FRAMES_OUT" \
        -n "$REALESRGAN_MODEL" \
        -s "$UPSCALE_FACTOR" \
        -f png 2>&1 | tee /tmp/realesrgan.log; then
        error "Real-ESRGAN failed. Full log: /tmp/realesrgan.log"
        FAILED=$((FAILED+1))
        continue
    fi

    UPSCALE_END=$(date +%s)
    UPSCALE_ELAPSED=$((UPSCALE_END - UPSCALE_START))
    success "AI upscaling complete in $(human_time $UPSCALE_ELAPSED)"
    # ── Reassemble ────────────────────────────────────────────────────────────
    REASSEMBLE_FPS=$(ffprobe -v error -select_streams v:0 \
        -show_entries stream=r_frame_rate \
        -of csv=p=0 "$CLEANED" 2>/dev/null | head -1)

    info "Reassembling video from upscaled frames..."
    echo

    # NB: the '?' on -map 1:a? keeps ffmpeg from failing when the capture
    # has no audio track.
    if ! ffmpeg -y \
        -framerate "$REASSEMBLE_FPS" \
        -i "$FRAMES_OUT/frame%08d.png" \
        -i "$CLEANED" \
        -map 0:v -map 1:a? \
        -c:v libx264 -crf "$CRF" -preset "$PRESET" \
        -c:a copy \
        -movflags +faststart \
        -stats \
        "$FINAL_OUTPUT" 2>&1 | tee /tmp/ffmpeg_reassemble.log | \
        grep --line-buffered -E "(frame=|speed=|error|Error)"; then
        error "Reassembly failed. Full log: /tmp/ffmpeg_reassemble.log"
        FAILED=$((FAILED+1))
        continue
    fi

    # ── Cleanup ───────────────────────────────────────────────────────────────
    if [[ "$KEEP_FRAMES" == false ]]; then
        rm -rf "$FRAMES_IN" "$FRAMES_OUT" "$CLEANED"
        info "Scratch files cleaned up"
    else
        info "Frames kept in: $FRAMES_IN / $FRAMES_OUT"
    fi

    FILE_END=$(date +%s)
    FILE_ELAPSED=$((FILE_END - FILE_START))
    PROCESSED=$((PROCESSED+1))

    OUT_SIZE=$(du -sh "$FINAL_OUTPUT" | cut -f1)
    echo
    success "Done: $FINAL_OUTPUT"
    info "  File size : $OUT_SIZE"
    info "  Time taken: $(human_time $FILE_ELAPSED)"

done

# ════════════════════════════════════════════════════════════════════════════
# Final summary
# ════════════════════════════════════════════════════════════════════════════
PIPELINE_END=$(date +%s)
PIPELINE_ELAPSED=$((PIPELINE_END - PIPELINE_START))

header "Pipeline Complete"
echo -e "  ${GREEN}Processed : $PROCESSED / $TOTAL_FILES${NC}"
[[ $FAILED -gt 0 ]] && echo -e "  ${RED}Failed    : $FAILED${NC}"
echo -e "  Total time: $(human_time $PIPELINE_ELAPSED)"
echo -e "  Output dir: $OUTPUT_DIR"
echo

if [[ $PROCESSED -gt 0 ]]; then
    echo "Restored files:"
    find "$OUTPUT_DIR" -name "*_restored.mp4" | while read -r f; do
        SIZE=$(du -sh "$f" | cut -f1)
        echo "  * $(basename "$f") ($SIZE)"
    done
fi
```
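The frame-extraction stage dominates scratch-disk usage, so it is worth estimating before a long run. A back-of-envelope sketch (the ~1 MB-per-PNG figure is an assumption for 720x576 PAL frames; measure one of your own captures to calibrate):

```shell
# Estimate scratch space for Stage 2 (assumption: ~1 MB per extracted PNG).
HOURS=2; FPS=25
FRAMES=$((HOURS * 3600 * FPS))
EST_GB=$((FRAMES / 1000))   # ~1 MB/frame, so frames/1000 gigabytes
echo "~${FRAMES} frames, roughly ${EST_GB} GB of PNGs before upscaling"
```

A two-hour PAL tape works out to about 180,000 frames and roughly 180 GB of intermediate PNGs, doubled again by the upscaled output frames, so check `df -h` on the work directory first.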

72
Netgrimoire/Gremlin-Grimoire/Overview.md
Normal file

@ -0,0 +1,72 @@
---
title: Gremlin Grimoire
description: Netgrimoire's local AI — the gremlin that runs the machine
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, ai, ollama, n8n
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Gremlin Grimoire



Gremlin is the local AI layer of Netgrimoire. It's not just a chat interface — it's an autonomous agent that watches the infrastructure, audits the codebase, triages alerts, and answers questions about the lab. The gremlin lives inside the machine and knows every dark corner of it.

---

## What Gremlin Is

Gremlin is a stack of four services running together on `docker4`, all pinned to the same Swarm node:

| Service | Role | URL |
|---------|------|-----|
| **Ollama** | Local LLM inference (CPU-only, Ryzen) | `http://ollama:11434` · `ollama.netgrimoire.com:11434` |
| **Open WebUI** | Chat interface + RAG frontend | `https://ai.netgrimoire.com` |
| **Qdrant** | Vector database for RAG knowledge base | `http://qdrant:6333` · dashboard `:6333/dashboard` |
| **n8n** | Automation brain — autonomous workflows | `https://n8n.netgrimoire.com` |

---

## What Gremlin Does Today

| Capability | Status | Workflow |
|-----------|--------|---------|
| Weekly YAML audit of all compose files | ✅ Live | Forgejo Audit — Monday 06:00 |
| Uptime Kuma alert triage | ✅ Live | Kuma Triage — webhook-triggered |
| Interactive chat with lab context | ✅ Live | Open WebUI + Ollama |
| RAG over wiki/docs | 🔧 Wired, not populated | Qdrant connected, knowledge base empty |
| Doc generation from compose files | 🟡 Parked | CPU quality insufficient — awaiting GPU |
| Email triage | 📋 Planned | Phase 3 — not built |

---

## Models

| Model | Size | Used For |
|-------|------|---------|
| `qwen2.5-coder:7b` | ~5 GB | Code review, YAML audits, compose analysis |
| `llama3.2:3b` | ~2 GB | Alert triage, Q&A, summarization |

Models must be pulled before workflows run. See [Ollama Model Management](/Gremlin-Grimoire/Runbooks/Model-Management).

---

## Sections

| | |
|---|---|
| [Stack](/Gremlin-Grimoire/Stack/Build-Config) | Full build config, volumes, env vars, compose YAML |
| [Workflows](/Gremlin-Grimoire/Workflows/Forgejo-Audit) | All n8n workflows — architecture, patterns, gotchas |
| [Runbooks](/Gremlin-Grimoire/Runbooks/Deploy) | Deploy, model management, troubleshooting |

---

## Planned Evolution

- **Homelable MCP backend** — next up. Provides tool-use for infra Q&A (topology, running services, resource usage). Blocked until the Homelable stack is deployed.
- **GPU support** — unlocks doc generation and larger models. The compose GPU block is commented out, ready to enable.
- **Gremlin role variants** — specialized personas per domain (Proxy Gremlin, Storage Gremlin, Security Gremlin, etc.) with mood states and dynamic badge serving via Caddy.
- **RAG knowledge base population** — index all Wiki.js pages and the compose template standard into Qdrant.
- **Gremlin Router** — dedicated Flask container for webhook routing (currently handled directly by n8n).
73
Netgrimoire/Gremlin-Grimoire/Runbooks/Deploy.md
Normal file

@ -0,0 +1,73 @@
---
title: Deploy Gremlin Stack
description: How to deploy and redeploy the Gremlin AI stack
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, deploy, runbook
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Deploy Gremlin Stack

All Gremlin services run on `docker4` (hermes), pinned via `node.hostname == docker4`.

---

## Prerequisites

```bash
# On docker4 — create volume directories
mkdir -p /DockerVol/ollama
mkdir -p /DockerVol/open-webui
mkdir -p /DockerVol/qdrant

# n8n requires specific ownership
mkdir -p /DockerVol/n8n
chown -R 1000:1000 /DockerVol/n8n
```

---

## Deploy

```bash
cd ~/services && git pull
cd swarm/stack/Gremlin
set -a && source .env && set +a
docker stack config --compose-file gremlin-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml gremlin
rm resolved.yml
docker stack services gremlin
```

---

## Pull Models After Deploy

Models must be pulled before n8n workflows run. Ollama returns a silent model-not-found error if workflows fire first.

```bash
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull llama3.2:3b
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull qwen2.5-coder:7b

# Verify
docker exec $(docker ps -qf name=gremlin_ollama) ollama list
```

---

## Verify Open WebUI Secret Key

Check that `WEBUI_SECRET_KEY` in `.env` on docker4 is set to a real secret, not the placeholder `change-this-secret-key`.
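A quick way to mint a replacement (any high-entropy string works; `openssl` is assumed to be available on docker4):

```shell
# Generate a 32-byte random secret, base64-encoded (44 characters).
SECRET=$(openssl rand -base64 32)
echo "WEBUI_SECRET_KEY=${SECRET}"
```

Paste the printed line into `.env`, then redeploy the stack so the new key takes effect.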
---

## Service URLs After Deploy

| Service | Internal | External |
|---------|----------|---------|
| Ollama | `http://ollama:11434` | `http://ollama.netgrimoire.com:11434` |
| Open WebUI | `http://open-webui:8080` | `https://ai.netgrimoire.com` |
| Qdrant | `http://qdrant:6333` | `http://qdrant.netgrimoire.com:6333/dashboard` |
| n8n | `http://n8n:5678` | `https://n8n.netgrimoire.com` |
41
Netgrimoire/Gremlin-Grimoire/Runbooks/Model-Management.md
Normal file

@ -0,0 +1,41 @@
---
title: Ollama Model Management
description: Pulling, verifying, and managing models on the Gremlin stack
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, ollama, models, runbook
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Ollama Model Management

## Pull Required Models

Run on docker4 after any fresh deploy or after the Ollama container is recreated:

```bash
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull llama3.2:3b
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull qwen2.5-coder:7b
```

## Verify Models Loaded

```bash
docker exec $(docker ps -qf name=gremlin_ollama) ollama list
```

## Model Reference

| Model | Size | Pull Time (CPU) | Used By |
|-------|------|----------------|---------|
| `llama3.2:3b` | ~2 GB | ~5 min | Kuma triage, Open WebUI |
| `qwen2.5-coder:7b` | ~5 GB | ~15 min | Forgejo audit, Open WebUI |

## Model Storage Path

`/DockerVol/ollama` — survives container restarts and redeployments.

## ⚠ Pull Before Workflows Run

n8n workflows fail silently if models aren't present. Ollama returns a model-not-found response, but n8n may not surface this as an obvious error. Always pull models immediately after deploy, before enabling workflows.
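For scripting that check, a small helper can gate workflow enablement on `ollama list` output (hypothetical helper; it just greps piped-in output for a model name in the first column):

```shell
# Check whether a model name appears in `ollama list` output read from stdin.
has_model() {
    grep -q "^$1[[:space:]]" -
}

# Example against captured output:
printf 'NAME             ID      SIZE\nllama3.2:3b      abc123  2.0 GB\n' \
    | has_model "llama3.2:3b" && echo "model present"
```

In practice: `docker exec $(docker ps -qf name=gremlin_ollama) ollama list | has_model "llama3.2:3b" || echo "pull it first"`.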
64
Netgrimoire/Gremlin-Grimoire/Runbooks/Troubleshooting.md
Normal file

@ -0,0 +1,64 @@
---
title: Gremlin Troubleshooting
description: Common Gremlin stack problems and fixes
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, troubleshooting, runbook
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Gremlin Troubleshooting

## n8n Won't Start / Permission Error

```bash
# On docker4
chown -R 1000:1000 /DockerVol/n8n
docker service update --force gremlin_n8n
```

## Workflow Fails Silently on Ollama Call

Model not pulled. Ollama returns model-not-found, but n8n may not surface it clearly.

```bash
docker exec $(docker ps -qf name=gremlin_ollama) ollama list
# If a model is missing:
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull llama3.2:3b
docker exec $(docker ps -qf name=gremlin_ollama) ollama pull qwen2.5-coder:7b
```

## Forgejo Webhook Not Reaching n8n

Add to Forgejo `app.ini`:

```ini
[webhook]
ALLOWED_HOST_LIST = *
```

Restart Forgejo. Required when `OFFLINE_MODE = true`.

## Caddy Routes to Wrong Container IP

Ensure all Gremlin services include this in their labels:

```yaml
caddy_ingress_network: netgrimoire
```

Never use `{{upstreams PORT}}` — it breaks during `docker stack config` preprocessing. Use `caddy.reverse_proxy: servicename:PORT` instead.
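Put together, a label block satisfying both rules might look like this (hostname and port are illustrative; adapt them to the service being routed):

```yaml
deploy:
  labels:
    caddy: ai.netgrimoire.com
    caddy.reverse_proxy: open-webui:8080   # explicit service:port, no {{upstreams}}
    caddy_ingress_network: netgrimoire
```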
## Audit Workflow Times Out

Check that `N8N_RUNNERS_TASK_TIMEOUT` is set to `3600` in the n8n environment. The default timeout is too short for 67-file audit runs.

## n8n Code Node Can't Access Env Vars

Set `N8N_BLOCK_ENV_ACCESS_IN_NODE=false` in the n8n environment.

## Open WebUI Can't Connect to Qdrant

Verify both services are on the `netgrimoire` overlay and pinned to `docker4`. Qdrant's gRPC port is 6334; REST is 6333.

## Audit Reports Not Committing to Forgejo

Check that the write token is set in n8n credentials. The read and write tokens are separate — confirm the workflow is using the write token for commit operations (POST for new files, PUT + SHA for updates).
503
Netgrimoire/Gremlin-Grimoire/Stack/Agent-Docs.md
Normal file

@ -0,0 +1,503 @@
---
title: Ollama with agent
description: The smart home reference
published: true
date: 2026-04-02T21:11:09.564Z
tags:
editor: markdown
dateCreated: 2026-02-18T22:14:41.533Z
---

# AI Automation Stack - Ollama + n8n + Open WebUI

## Overview

This stack provides a complete self-hosted AI automation solution for homelab infrastructure management, documentation generation, and intelligent monitoring. The system consists of four core components that work together to provide AI-powered workflows and knowledge management.

## Architecture

```
┌─────────────────────────────────────────────────┐
│              AI Automation Stack                │
│                                                 │
│  Open WebUI ────────┐                           │
│  (Chat Interface)   │                           │
│      │              │                           │
│      ▼              ▼                           │
│   Ollama ◄───── Qdrant                          │
│  (LLM Runtime)  (Vector DB)                     │
│      ▲                                          │
│      │                                          │
│     n8n                                         │
│  (Workflow Engine)                              │
│      │                                          │
│      ▼                                          │
│  Forgejo │ Wiki.js │ Monitoring                 │
└─────────────────────────────────────────────────┘
```
## Components

### Ollama
- **Purpose**: Local LLM runtime engine
- **Port**: 11434
- **Resource Usage**: 4-6GB RAM (depending on model)
- **Recommended Models**:
  - `qwen2.5-coder:7b` - Code analysis and documentation
  - `llama3.2:3b` - General queries and chat
  - `phi3:mini` - Lightweight alternative

### Open WebUI
- **Purpose**: User-friendly chat interface with built-in RAG (Retrieval Augmented Generation)
- **Port**: 3000
- **Features**:
  - Document ingestion from Wiki.js
  - Conversational interface for querying documentation
  - RAG pipeline for context-aware responses
  - Multi-model support
- **Access**: `http://your-server-ip:3000`

### Qdrant
- **Purpose**: Vector database for semantic search and RAG
- **Ports**: 6333 (HTTP), 6334 (gRPC)
- **Resource Usage**: ~1GB RAM
- **Function**: Stores embeddings of your documentation, code, and markdown files

### n8n
- **Purpose**: Workflow automation and orchestration
- **Port**: 5678
- **Default Credentials**:
  - Username: `admin`
  - Password: `change-this-password` (⚠️ **Change this immediately**)
- **Access**: `http://your-server-ip:5678`

## Installation

### Prerequisites
- Docker and Docker Compose installed
- 16GB RAM minimum (8GB available for the stack)
- 50GB disk space for models and data
### Deployment Steps

1. **Create directory structure**:
   ```bash
   # Brace expansion needs a comma to expand, so spell the path out in full
   mkdir -p ~/ai-stack/n8n/workflows
   cd ~/ai-stack
   ```

2. **Download the compose file**:
   ```bash
   # Place the ai-stack-compose.yml in this directory
   wget [your-internal-url]/ai-stack-compose.yml
   ```

3. **Configure environment variables**:
   ```bash
   # Edit the compose file and change:
   #   - WEBUI_SECRET_KEY
   #   - N8N_BASIC_AUTH_PASSWORD
   #   - WEBHOOK_URL (use your server's IP)
   #   - GENERIC_TIMEZONE
   nano ai-stack-compose.yml
   ```

4. **Start the stack**:
   ```bash
   docker-compose -f ai-stack-compose.yml up -d
   ```

5. **Pull Ollama models**:
   ```bash
   docker exec -it ollama ollama pull qwen2.5-coder:7b
   docker exec -it ollama ollama pull llama3.2:3b
   ```

6. **Verify services**:
   ```bash
   docker-compose -f ai-stack-compose.yml ps
   ```
## Configuration

### Open WebUI Setup

1. Navigate to `http://your-server-ip:3000`
2. Create your admin account (first user becomes admin)
3. Go to **Settings → Connections** and verify the Ollama connection
4. Configure Qdrant:
   - Host: `qdrant`
   - Port: `6333`

### Setting Up RAG for Wiki.js

1. In Open WebUI, go to **Workspace → Knowledge**
2. Create a new collection: "Homelab Documentation"
3. Add sources:
   - **URL Crawl**: Enter your Wiki.js base URL
   - **File Upload**: Upload markdown files from repositories
4. Process and index the documents

### n8n Initial Configuration

1. Navigate to `http://your-server-ip:5678`
2. Log in with the credentials from the docker-compose file
3. Import starter workflows from the `/n8n/workflows/` directory

## Use Cases

### 1. Automated Documentation Generation

**Workflow**: Forgejo webhook → n8n → Ollama → Wiki.js

When code is pushed to Forgejo:
1. n8n receives the webhook from Forgejo
2. Extracts changed files and repo context
3. Sends them to Ollama with the prompt: "Generate documentation for this code"
4. Posts the generated docs to Wiki.js via API

**Example n8n Workflow**:
```
Webhook Trigger
  → HTTP Request (Forgejo API - get file contents)
  → Ollama LLM Node (generate docs)
  → HTTP Request (Wiki.js API - create/update page)
  → Send notification (completion)
```

### 2. Docker-Compose Standardization

**Workflow**: Repository scan → compliance check → issue creation

1. n8n runs on a schedule (daily/weekly)
2. Queries the Forgejo API for all repositories
3. Scans for `docker-compose.yml` files
4. Compares them against template standards stored in Qdrant
5. Generates a compliance report with Ollama
6. Creates Forgejo issues for non-compliant repos
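The compliance check in step 4 can be illustrated locally in plain shell (the real comparison runs inside n8n against templates stored in Qdrant; the `deploy:` requirement here is just an example rule, not the actual template standard):

```shell
# Flag compose files that lack a `deploy:` block (illustrative rule only).
compose_compliant() {
    grep -q '^[[:space:]]*deploy:' "$1"
}

printf 'services:\n  app:\n    image: nginx\n' > /tmp/demo-compose.yml
compose_compliant /tmp/demo-compose.yml || echo "non-compliant: missing deploy block"
```

The same shape generalizes: each rule is a predicate over the file, and any failing predicate feeds the Ollama-generated report in step 5.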
### 3. Intelligent Alert Processing

**Workflow**: Monitoring alert → AI analysis → smart routing

1. Beszel/Uptime Kuma sends a webhook to n8n
2. n8n queries historical data and context
3. Ollama analyzes:
   - Is this expected? (scheduled backup, known maintenance)
   - Severity level
   - Recommended action
4. Routes appropriately:
   - Critical: Immediate notification (Telegram/email)
   - Warning: Log and monitor
   - Info: Suppress (expected behavior)

### 4. Email Monitoring & Triage

**Workflow**: IMAP polling → AI classification → action routing

1. n8n polls the email inbox every 5 minutes
2. Filters for keywords: "alert", "critical", "down", "failed"
3. Ollama classifies urgency and determines if the message is actionable
4. Routes based on classification:
   - Urgent: Forward to you immediately
   - Informational: Daily digest
   - Spam: Archive
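The keyword pre-filter in step 2 is cheap to sketch in plain shell before anything reaches the LLM (the real filter runs as an n8n node; keyword list taken from the step above):

```shell
# Cheap pre-filter: only messages matching the alert keywords go on to Ollama.
is_actionable() {
    printf '%s' "$1" | grep -qiE 'alert|critical|down|failed'
}

is_actionable "Backup FAILED on docker4" && echo "send to triage"
is_actionable "Weekly newsletter"        || echo "archive"
```

Filtering first keeps the 5-minute polling loop from burning CPU-inference time on newsletters; only the survivors get the full classification prompt.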
## Common Workflows

### Example: Repository Documentation Generator

```javascript
// n8n workflow nodes:

1. Schedule Trigger (daily at 2 AM)
   ↓
2. HTTP Request - Forgejo API
   URL: http://forgejo:3000/api/v1/repos/search
   Method: GET
   ↓
3. Loop Over Items (each repo)
   ↓
4. HTTP Request - Get repo files
   URL: {{$node["Forgejo API"].json["clone_url"]}}/contents
   ↓
5. Filter - Find docker-compose.yml and README.md
   ↓
6. Ollama Node
   Model: qwen2.5-coder:7b
   Prompt: "Analyze this docker-compose file and generate comprehensive
            documentation including: purpose, services, ports, volumes,
            environment variables, and setup instructions."
   ↓
7. HTTP Request - Wiki.js API
   URL: http://wikijs:3000/graphql
   Method: POST
   Body: {mutation: createPage(...)}
   ↓
8. Send Notification
   Service: Telegram/Email
   Message: "Documentation updated for {{repo_name}}"
```

### Example: Alert Intelligence Workflow

```javascript
// n8n workflow nodes:

1. Webhook Trigger
   Path: /webhook/monitoring-alert
   ↓
2. Function Node - Parse Alert Data
   JavaScript: Extract service, metric, value, timestamp
   ↓
3. HTTP Request - Query Historical Data
   URL: http://beszel:8090/api/metrics/history
   ↓
4. Ollama Node
   Model: llama3.2:3b
   Context: Your knowledge base in Qdrant
   Prompt: "Alert: {{alert_message}}
            Historical context: {{historical_data}}
            Is this expected behavior?
            What's the severity?
            What action should be taken?"
   ↓
5. Switch Node - Route by Severity
   Conditions:
   - Critical: Route to immediate notification
   - Warning: Route to monitoring channel
   - Info: Route to log only
   ↓
6a. Send Telegram (Critical path)
6b. Post to Slack (Warning path)
6c. Write to Log (Info path)
```
## Maintenance

### Model Management

```bash
# List installed models
docker exec -it ollama ollama list

# Update a model
docker exec -it ollama ollama pull qwen2.5-coder:7b

# Remove unused models
docker exec -it ollama ollama rm old-model:tag
```

### Backup Important Data

```bash
# Backup Qdrant vector database
docker-compose -f ai-stack-compose.yml stop qdrant
tar -czf qdrant-backup-$(date +%Y%m%d).tar.gz ./qdrant_data/
docker-compose -f ai-stack-compose.yml start qdrant

# Backup n8n workflows (automatic to ./n8n/workflows)
tar -czf n8n-backup-$(date +%Y%m%d).tar.gz ./n8n_data/

# Backup Open WebUI data
tar -czf openwebui-backup-$(date +%Y%m%d).tar.gz ./open_webui_data/
```

### Log Monitoring

```bash
# View all stack logs
docker-compose -f ai-stack-compose.yml logs -f

# View specific service
docker logs -f ollama
docker logs -f n8n
docker logs -f open-webui
```

### Resource Monitoring

```bash
# Check resource usage
docker stats

# Expected usage:
#   - ollama:     4-6GB RAM (with model loaded)
#   - open-webui: ~500MB RAM
#   - qdrant:     ~1GB RAM
#   - n8n:        ~200MB RAM
```

## Troubleshooting

### Ollama Not Responding

```bash
# Check if Ollama is running
docker logs ollama

# Restart Ollama
docker restart ollama

# Test Ollama API
curl http://localhost:11434/api/tags
```

### Open WebUI Can't Connect to Ollama

1. Check network connectivity:
   ```bash
   docker exec -it open-webui ping ollama
   ```

2. Verify the Ollama URL in Open WebUI settings
3. Restart both containers:
   ```bash
   docker restart ollama open-webui
   ```

### n8n Workflows Failing

1. Check n8n logs:
   ```bash
   docker logs n8n
   ```

2. Verify webhook URLs are accessible
3. Test the Ollama connection from n8n:
   - Create a test workflow
   - Add an Ollama node
   - Run an execution

### Qdrant Connection Issues

```bash
# Check Qdrant health
curl http://localhost:6333/health

# View Qdrant logs
docker logs qdrant

# Restart if needed
docker restart qdrant
```
## Performance Optimization
|
||||
|
||||
### Model Selection by Use Case
|
||||
|
||||
- **Quick queries, chat**: `llama3.2:3b` or `phi3:mini` (fastest)
|
||||
- **Code analysis**: `qwen2.5-coder:7b` or `deepseek-coder:6.7b`
|
||||
- **Complex reasoning**: `mistral:7b` or `llama3.1:8b`
|
||||
|
||||
### n8n Workflow Optimization
|
||||
|
||||
- Use **Wait** nodes to batch operations
|
||||
- Enable **Execute Once** for loops to reduce memory
|
||||
- Store large data in temporary files instead of node output
|
||||
- Use **Split In Batches** for processing large datasets
|
||||
|
||||
### Qdrant Performance
|
||||
|
||||
- Default settings are optimized for homelab use
|
||||
- Increase `collection_shards` if indexing >100,000 documents
|
||||
- Enable quantization for large collections
## Security Considerations

### Change Default Credentials

```bash
# Generate a secure password
openssl rand -base64 32

# Update in docker-compose.yml:
# - WEBUI_SECRET_KEY
# - N8N_BASIC_AUTH_PASSWORD
```

### Network Isolation

Consider using a reverse proxy (Traefik, Nginx Proxy Manager) with authentication:

- Limit external access to Open WebUI only
- Keep n8n, Ollama, and Qdrant on the internal network
- Use a VPN for remote access

### API Security

- Use strong API tokens for the Wiki.js and Forgejo integrations
- Rotate credentials periodically
- Audit n8n workflow permissions

## Integration Points

### Connecting to Existing Services

**Uptime Kuma**:
- Configure webhook alerts → n8n webhook URL
- Path: `http://your-server-ip:5678/webhook/uptime-kuma`

**Beszel**:
- Use the Shoutrrr webhook format
- URL: `http://your-server-ip:5678/webhook/beszel`

**Forgejo**:
- Repository webhooks for push events
- URL: `http://your-server-ip:5678/webhook/forgejo-push`
- Enable in repo settings → Webhooks

**Wiki.js**:
- GraphQL API endpoint: `http://wikijs:3000/graphql`
- Create an API key in the Wiki.js admin panel
- Store it in n8n credentials
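As a sketch of what a Wiki.js call built in an n8n Code node might look like (the `pages.list` query shape is an assumption about the Wiki.js 2.x GraphQL schema, and `WIKIJS_API_KEY` is a hypothetical credential name; verify both against your instance):

```javascript
// Build a GraphQL request object for Wiki.js.
// Assumed schema: pages.list returning id/path/title (check your Wiki.js version).
const WIKI_URL = "http://wikijs:3000/graphql";      // endpoint from this doc
const API_KEY = process.env.WIKIJS_API_KEY || "";   // keep the real key in n8n credentials

function buildPageListRequest() {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${API_KEY}`,
    },
    body: JSON.stringify({
      query: "{ pages { list { id path title } } }",
    }),
  };
}
```

In an n8n Code node the same request object can be passed to `this.helpers.httpRequest()`; standalone, `fetch(WIKI_URL, buildPageListRequest())` does the equivalent in Node 18+.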
## Advanced Features

### Creating Custom n8n Nodes

For frequently used Ollama prompts, create custom nodes:

1. Go to n8n → Settings → Community Nodes
2. Install `n8n-nodes-ollama-advanced` if available
3. Or create Function nodes with reusable code

### Training Custom Models

While Ollama doesn't support fine-tuning directly, you can:

1. Use RAG with your specific documentation
2. Create detailed system prompts in n8n
3. Store organization-specific context in Qdrant

### Multi-Agent Workflows

Chain multiple Ollama calls for complex tasks:

```
Planning Agent → Execution Agent → Review Agent → Output
```

Example: code refactoring

1. Planning: analyze the code and create a refactoring plan
2. Execution: generate the refactored code
3. Review: check for errors and improvements
4. Output: create a pull request with the changes
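The chain above can be sketched in JavaScript, assuming Ollama's standard `/api/generate` endpoint with `stream: false` (the prompt wording and the injectable `post` parameter are illustrative, not part of any workflow in this stack):

```javascript
// Chain three Ollama calls: plan -> execute -> review.
// With stream: false, /api/generate returns one JSON object with a `response` field.
const OLLAMA_URL = "http://ollama:11434/api/generate";

async function httpPost(url, body) {
  const r = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return r.json();
}

async function callOllama(prompt, model, post = httpPost) {
  const res = await post(OLLAMA_URL, { model, prompt, stream: false });
  return res.response;
}

async function refactorPipeline(code, post = httpPost) {
  // Each stage's output feeds the next stage's prompt.
  const plan = await callOllama(`Create a refactoring plan for:\n${code}`, "mistral:7b", post);
  const draft = await callOllama(`Apply this plan:\n${plan}\nto:\n${code}`, "qwen2.5-coder:7b", post);
  const review = await callOllama(`Review this refactor for errors:\n${draft}`, "qwen2.5-coder:7b", post);
  return { plan, draft, review };
}
```

The `post` argument exists so the chain can be exercised with a stub instead of a live Ollama server.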
## Resources

- **Ollama Documentation**: https://ollama.ai/docs
- **Open WebUI Docs**: https://docs.openwebui.com
- **n8n Documentation**: https://docs.n8n.io
- **Qdrant Docs**: https://qdrant.tech/documentation

## Support

For issues or questions:

1. Check container logs first
2. Review this documentation
3. Search the n8n community forums
4. Check the Ollama Discord/GitHub issues

---

**Last Updated**: {{current_date}}
**Maintained By**: Homelab Admin
**Status**: Production
**File:** `Netgrimoire/Gremlin-Grimoire/Stack/Build-Config.md` (new file, 383 lines; diff suppressed because one or more lines are too long)

**File:** `Netgrimoire/Gremlin-Grimoire/Stack/User-Guide.md` (new file, 194 lines; diff suppressed because one or more lines are too long)

**File:** `Netgrimoire/Gremlin-Grimoire/Workflows/Forgejo-Audit.md` (new file, 105 lines)
---
title: Forgejo Audit Workflow
description: Weekly automated YAML compliance audit via n8n + Ollama
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, n8n, audit, forgejo
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Forgejo Audit Workflow

**Status:** ✅ Live and confirmed working

Runs every Monday at 06:00. Walks all compose YAML files in `services/swarm/` and `services/swarm/stack/*/`, audits each one against the Swarm template standard using `qwen2.5-coder:7b`, commits full reports to Forgejo, and sends a summary to ntfy.

---

## What It Audits

Each file is checked for:

- Homepage labels on all services
- Uptime Kuma labels on all services
- Caddy labels on exposed services
- `node.platform.arch` exclusion constraints (ARM default)
- Volume paths following the `/DockerVol/` or `/data/nfs/znas/Docker/` convention
- No forbidden fields (`version:`, `container_name:`, `restart:`, `depends_on:`)
- `endpoint_mode: dnsrr` not used
- `diun.enable: "true"` present
- Network references the `netgrimoire` external overlay

---

## Scope

~67 files total across `swarm/` (flat single-service YAMLs) and `swarm/stack/*/` (grouped stacks).
---

## Outputs

| Output | Where | Content |
|--------|-------|---------|
| ntfy notification | `gremlin-audits` topic | Short FAIL summary per file |
| Forgejo commit | `Netgrimoire/Audits/AUDIT-<name>-<date>.md` | Full audit report (POST new / PUT+SHA update) |

---

## n8n Architecture

```
Schedule Trigger (Mon 06:00)
→ Forgejo API: list all files in swarm/ and swarm/stack/*/
→ Loop Over Items (splitInBatches, batch=1)
  → Code node: fetch file content via Forgejo API
  → Code node: build Ollama prompt
  → Code node: POST to Ollama (qwen2.5-coder:7b)
  → Code node: parse result, build report markdown
  → Code node: commit report to Forgejo (POST or PUT+SHA)
  → Code node: send ntfy summary if FAIL
→ Loop feedback connection drives iteration
```

---

## Critical Patterns

All Forgejo and Ollama API calls use `this.helpers.httpRequest()` in Code nodes — **not** HTTP Request nodes. HTTP Request nodes hit body expression limits on large prompts.

Code nodes in "Run Once for Each Item" mode must return `{ json: ... }`, not `[{ json: ... }]`.

Loop Over Items (splitInBatches, batch=1) plus a feedback connection from the last node back to the loop drives iteration over multiple files.
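A standalone sketch of the report-building step and that single-item return shape (the field names and report layout are illustrative; in the live workflow the surrounding HTTP calls go through `this.helpers.httpRequest()`):

```javascript
// Shape of a "Run Once for Each Item" Code node step in this workflow:
// do the work, then return ONE object of the form { json: ... } — not an array.
function buildReport(fileName, verdict, details) {
  const status = /FAIL/i.test(verdict) ? "FAIL" : "PASS";
  const date = new Date().toISOString().slice(0, 10);
  const markdown = [
    `# Audit: ${fileName}`,
    `**Date:** ${date}`,
    `**Result:** ${status}`,
    "",
    details,
  ].join("\n");
  return { json: { fileName, status, markdown } }; // single object, not [{ json: ... }]
}
```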
---

## Critical Environment Variables

| Variable | Value | Why |
|----------|-------|-----|
| `N8N_BLOCK_ENV_ACCESS_IN_NODE` | `false` | Allows env var access inside Code nodes |
| `N8N_RUNNERS_TASK_TIMEOUT` | `3600` | Prevents timeout on 67-file audit runs |

---

## Forgejo API Tokens

| Token | Scope |
|-------|-------|
| Read token | Fetch file content from `traveler/services` |
| Write token | Commit audit reports to `traveler/Netgrimoire` |

Tokens are stored in n8n credentials, not in compose env vars.

---

## Forgejo Webhook Gotcha

If Forgejo webhooks fail to reach n8n, add this to Forgejo's `app.ini`:

```ini
[webhook]
ALLOWED_HOST_LIST = *
```

Required when `OFFLINE_MODE = true`. Restart Forgejo after the edit.
**File:** `Netgrimoire/Gremlin-Grimoire/Workflows/Kuma-Triage.md` (new file, 63 lines)
---
title: Kuma Alert Triage Workflow
description: Uptime Kuma webhook → Ollama analysis → ntfy alert
published: true
date: 2026-04-12T00:00:00.000Z
tags: gremlin, n8n, kuma, alerts
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Kuma Alert Triage Workflow

**Status:** ✅ Live and confirmed working

Triggered by an Uptime Kuma webhook on service DOWN or RECOVERED events. DOWN events are analyzed by `llama3.2:3b` before alerting. RECOVERED events skip AI and send a simple notification.

---

## Webhook URL

```
https://n8n.netgrimoire.com/webhook/gremlin-kuma-alert
```

Configure in Uptime Kuma: Settings → Notifications → Webhook → apply to all monitors.

---

## Flow

```
Kuma Webhook
├── DOWN path:
│   → Parse payload (service name, URL, error)
│   → Ollama (llama3.2:3b): triage prompt
│   → ntfy gremlin-alerts (urgent priority) with AI analysis
│
└── RECOVERED path:
    → ntfy gremlin-alerts (normal priority, no AI call)
```

---

## Why Two Paths

AI triage is only useful for DOWN events — there's nothing to analyze on a recovery. Skipping Ollama on RECOVERED keeps notification latency near-instant for good news.
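The branch logic can be sketched as a plain function (the payload shape, with `heartbeat.status` 0 for down and 1 for up, is an assumption based on Uptime Kuma's webhook notification format; verify against your Kuma version):

```javascript
// Route a Kuma webhook payload: DOWN -> AI triage first, RECOVERED -> plain ntfy.
// `triage` is the injected model call so the routing logic stays testable.
function routeKumaEvent(payload, triage) {
  const hb = payload.heartbeat || {};
  const mon = payload.monitor || {};
  if (hb.status === 0) {
    // DOWN: ask the model for a probable cause before notifying
    const analysis = triage(`Service ${mon.name} (${mon.url}) is DOWN: ${hb.msg}`);
    return { topic: "gremlin-alerts", priority: "urgent", body: analysis };
  }
  // RECOVERED: skip the AI call entirely so the good news lands near-instantly
  return { topic: "gremlin-alerts", priority: "default", body: `${mon.name} recovered` };
}
```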
---

## ntfy Output Format

DOWN alert includes:

- Service name and URL
- Kuma error message
- Ollama's triage assessment (probable cause, suggested first step)

RECOVERED alert is a simple one-liner.

---

## Parked: Doc Generation Workflows

Two additional doc generation workflows were built but are currently inactive. CPU-only `llama3.2:3b` output barely exceeds reformatting the source compose file — not useful enough to commit. Will be revisited when GPU support is added to the Gremlin stack.
**File:** `Netgrimoire/Keystone-Grimoire/Docker/Caddy.md` (new file, 522 lines)
---
title: Caddy Reverse Proxy
description: Current and future config
published: true
date: 2026-02-25T01:50:20.558Z
tags:
editor: markdown
dateCreated: 2026-02-23T22:09:16.106Z
---

# Caddy Reverse Proxy

**Host:** znas (Docker Swarm node)
**Internal IP:** 192.168.5.10
**Data Path:** `/export/Docker/caddy/`
**Networks:** `netgrimoire` (service network), `vpn`
**Ports:** 80 (mapped to host 8900), 443

---

## Overview

Caddy serves as the primary reverse proxy for all public and internal web services. It uses the `caddy-docker-proxy` pattern, which allows services to register themselves with Caddy by adding Docker labels to their compose files — no manual Caddyfile edits required per service.

Configuration is **hybrid**: some services are defined entirely via Docker labels, others are defined statically in the Caddyfile, and most use both (labels for routing, Caddyfile for shared snippets). The `caddy-docker-proxy` container merges both sources at runtime.

---

## Current State

### Image

```yaml
image: lucaslorentz/caddy-docker-proxy:ci-alpine
```

This image provides the Docker Proxy module only. It has no CrowdSec, GeoIP, or rate limiting built in.

### Docker Compose (`/export/Docker/caddy/docker-compose.yml`)
```yaml
configs:
  caddy-basic-content:
    file: ./Caddyfile
    labels:
      caddy:

services:
  caddy:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    ports:
      - 8900:80
      - 443:443
    environment:
      - CADDY_INGRESS_NETWORKS=netgrimoire
    networks:
      - netgrimoire
      - vpn
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /export/Docker/caddy/Caddyfile:/etc/caddy/Caddyfile
      - /export/Docker/caddy:/data
      #- /export/Docker/caddy/logs:/var/log/caddy # Placeholder for CrowdSec log mount
    deploy:
      placement:
        constraints:
          - node.hostname == znas

networks:
  netgrimoire:
    external: true
  vpn:
    external: true
```
### Caddyfile (`/export/Docker/caddy/Caddyfile`)

The Caddyfile defines shared authentication snippets and static site blocks. These snippets are available to all services — including label-defined ones — via `import`.
```caddyfile
# ─────────────────────────────────────────────────────────────────────────────
# AUTH SNIPPETS
# ─────────────────────────────────────────────────────────────────────────────

(authentik) {
    route /outpost.goauthentik.io/* {
        reverse_proxy http://authentik:9000
    }

    forward_auth http://authentik:9000 {
        uri /outpost.goauthentik.io/auth/caddy
        header_up X-Forwarded-URI {http.request.uri}
        copy_headers X-Authentik-Username X-Authentik-Groups X-Authentik-Email \
            X-Authentik-Name X-Authentik-Uid X-Authentik-Jwt \
            X-Authentik-Meta-Jwks X-Authentik-Meta-Outpost X-Authentik-Meta-Provider \
            X-Authentik-Meta-App X-Authentik-Meta-Version
    }
}

(authelia) {
    forward_auth http://authelia:9091 {
        uri /api/verify?rd=https://login.wasted-bandwidth.net/
        copy_headers Remote-User Remote-Groups Remote-Email Remote-Name
    }
}

# ─────────────────────────────────────────────────────────────────────────────
# MAIL SNIPPETS
# ─────────────────────────────────────────────────────────────────────────────

(email-proxy) {
    redir https://mail.netgrimoire.com/sogo 301
}

(mailcow-proxy) {
    reverse_proxy nginx-mailcow:80
}

# ─────────────────────────────────────────────────────────────────────────────
# STATIC SITE BLOCKS — NETGRIMOIRE.COM
# ─────────────────────────────────────────────────────────────────────────────

cloud.netgrimoire.com {
    reverse_proxy http://nextcloud-aio-apache:11000
}

log.netgrimoire.com {
    reverse_proxy http://graylog:9000
}

win.netgrimoire.com {
    reverse_proxy http://192.168.5.10:8006
}

docker.netgrimoire.com {
    reverse_proxy http://portainer:9000
}

immich.netgrimoire.com {
    reverse_proxy http://192.168.5.10:2283
}

npm.netgrimoire.com {
    reverse_proxy http://librenms:8000
}

#jellyfin.netgrimoire.com {
#    reverse_proxy http://jellyfin:8096
#}

# ─────────────────────────────────────────────────────────────────────────────
# AUTHENTICATED — NETGRIMOIRE.COM
# ─────────────────────────────────────────────────────────────────────────────

dozzle.netgrimoire.com {
    import authentik
    reverse_proxy http://192.168.4.72:8043
}

dns.netgrimoire.com {
    import authentik
    reverse_proxy http://192.168.5.7:5380
}

webtop.netgrimoire.com {
    import authentik
    reverse_proxy http://webtop:3000
}

jackett.netgrimoire.com {
    import authentik
    reverse_proxy http://gluetun:9117
}

transmission.netgrimoire.com {
    import authentik
    reverse_proxy http://gluetun:9091
}

scrutiny.netgrimoire.com {
    import authentik
    reverse_proxy http://192.168.5.10:8081
}

# ─────────────────────────────────────────────────────────────────────────────
# AUTHENTICATED — WASTED-BANDWIDTH.NET (Authelia)
# ─────────────────────────────────────────────────────────────────────────────

stash.wasted-bandwidth.net {
    import authelia
    reverse_proxy http://192.168.5.10:9999
}

namer.wasted-bandwidth.net {
    import authelia
    reverse_proxy http://192.168.5.10:6980
}

# ─────────────────────────────────────────────────────────────────────────────
# PUBLIC — PNCHARRIS.COM / WASTED-BANDWIDTH.NET
# ─────────────────────────────────────────────────────────────────────────────

fish.pncharris.com {
    reverse_proxy http://web
}

www.wasted-bandwidth.net {
    reverse_proxy http://web
}

# ─────────────────────────────────────────────────────────────────────────────
# MAILCOW — MULTI-DOMAIN
# ─────────────────────────────────────────────────────────────────────────────

mail.netgrimoire.com, autodiscover.netgrimoire.com, autoconfig.netgrimoire.com, \
mail.wasted-bandwidth.net, autodiscover.wasted-bandwidth.net, autoconfig.wasted-bandwidth.net, \
mail.gnarlypandaproductions.com, autodiscover.gnarlypandaproductions.com, autoconfig.gnarlypandaproductions.com, \
mail.pncfishandmore.com, autodiscover.pncfishandmore.com, autoconfig.pncfishandmore.com, \
mail.pncharrisenterprises.com, autodiscover.pncharrisenterprises.com, autoconfig.pncharrisenterprises.com, \
mail.pncharris.com, autodiscover.pncharris.com, autoconfig.pncharris.com, \
mail.florosafd.org, autodiscover.florosafd.org, autoconfig.florosafd.org {
    import mailcow-proxy
}
```
### Docker Label Pattern (label-defined services)

Services not in the Caddyfile are registered via labels on their own containers. The snippets defined in the Caddyfile are available to them via `caddy.import`:

```yaml
labels:
  - caddy=homepage.netgrimoire.com
  - caddy.import=authentik
  - caddy.reverse_proxy={{upstreams 3000}}
```

For services that need no auth:

```yaml
labels:
  - caddy=myservice.netgrimoire.com
  - caddy.reverse_proxy={{upstreams 8080}}
```

---

## Authentication Layers

Two identity proxies are in use, each serving different domains/use cases:

| Provider | Domain Pattern | Snippet |
|----------|----------------|---------|
| Authentik | `*.netgrimoire.com` internal tools | `import authentik` |
| Authelia | `*.wasted-bandwidth.net` | `import authelia` |

Services without an auth import are either public (e.g. `fish.pncharris.com`) or carry their own authentication (e.g. Nextcloud, Graylog, Portainer).

---
## Current Security Posture

CrowdSec protection exists only at the **OPNsense firewall level** — IP reputation blocking before traffic reaches Caddy. CrowdSec does not currently inspect HTTP traffic at the application layer. This means:

- Known-bad IPs are blocked at the perimeter
- Application-layer attacks (SQLi in URLs, malicious paths, bad user agents, brute force on specific endpoints) are not blocked at the Caddy level
- Services behind Authentik/Authelia have an additional protection layer; unauthenticated public services do not

---
## Future State: CrowdSec + GeoIP + Rate Limiting

### Target Image

```yaml
image: ghcr.io/serfriz/caddy-crowdsec-geoip-ratelimit-security-dockerproxy:latest
```

This is a drop-in replacement for `lucaslorentz/caddy-docker-proxy`. All existing Docker labels and Caddyfile site blocks continue to work unchanged. The image is automatically rebuilt monthly and when Caddy releases updates — no custom image maintenance required.

**Included modules:**

- `caddy-docker-proxy` — same label-based config as current
- `caddy-crowdsec-bouncer` — inline HTTP blocking based on CrowdSec decisions
- `caddy-geoip` — GeoIP filtering at the application layer
- `caddy-ratelimit` — per-endpoint rate limiting
- `caddy-security` — additional auth/security middleware

### Updated Compose
```yaml
configs:
  caddy-basic-content:
    file: ./Caddyfile
    labels:
      caddy:

services:
  caddy:
    image: ghcr.io/serfriz/caddy-crowdsec-geoip-ratelimit-security-dockerproxy:latest
    ports:
      - 8900:80
      - 443:443
    environment:
      - CADDY_INGRESS_NETWORKS=netgrimoire
      - CADDY_DOCKER_EVENT_THROTTLE_INTERVAL=2000 # Prevents non-deterministic reload with CrowdSec module
      - CROWDSEC_API_KEY=BYSLg/wKOa7wlHYzChJpBVJA06Ukc7G6fKJCvBwjyZg
    networks:
      - netgrimoire
      - vpn
      - crowdsec_net
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /export/Docker/caddy/Caddyfile:/etc/caddy/Caddyfile
      - /export/Docker/caddy:/data
      - caddy-logs:/var/log/caddy
    deploy:
      placement:
        constraints:
          - node.hostname == znas

  crowdsec:
    image: crowdsecurity/crowdsec
    restart: unless-stopped
    environment:
      COLLECTIONS: "crowdsecurity/caddy crowdsecurity/http-cve crowdsecurity/whitelist-good-actors"
      BOUNCER_KEY_CADDY: BYSLg/wKOa7wlHYzChJpBVJA06Ukc7G6fKJCvBwjyZg # Pre-registers the Caddy bouncer automatically
    volumes:
      - crowdsec-db:/var/lib/crowdsec/data
      - ./crowdsec/acquis.yaml:/etc/crowdsec/acquis.yaml
      - caddy-logs:/var/log/caddy:ro
    networks:
      - crowdsec_net
    deploy:
      placement:
        constraints:
          - node.hostname == znas

volumes:
  caddy-logs:
  crowdsec-db:

networks:
  netgrimoire:
    external: true
  vpn:
    external: true
  crowdsec_net:
    driver: overlay # Swarm overlay network
```
### CrowdSec Log Acquisition (`./crowdsec/acquis.yaml`)

```yaml
filenames:
  - /var/log/caddy/access.log
labels:
  type: caddy
```

### Environment File (`.env`)

```env
CROWDSEC_API_KEY=<generate-with-cscli-or-set-before-first-boot>
```

The `BOUNCER_KEY_CADDY` env var in the CrowdSec container pre-registers the bouncer key at startup. Set the same value in `.env` as `CROWDSEC_API_KEY` and both sides will be in sync on first boot — no need to run `cscli bouncers add` manually.

### Updated Caddyfile Additions

Add a global block at the top of the Caddyfile and a new `crowdsec` snippet. All other existing content remains unchanged.
```caddyfile
# ─────────────────────────────────────────────────────────────────────────────
# GLOBAL BLOCK — add this at the very top before any snippets
# ─────────────────────────────────────────────────────────────────────────────

{
    crowdsec {
        api_url http://crowdsec:8080
        api_key {$CROWDSEC_API_KEY}
    }
    log {
        output file /var/log/caddy/access.log {
            roll_size 50mb
            roll_keep 5
        }
        format json
    }
}

# ─────────────────────────────────────────────────────────────────────────────
# CROWDSEC SNIPPET — add alongside existing auth snippets
# ─────────────────────────────────────────────────────────────────────────────

(crowdsec) {
    route {
        crowdsec
    }
}
```
### Applying CrowdSec to Existing Services

Once the snippet exists, add `import crowdsec` to site blocks and container labels. This is a **gradual rollout** — services without it remain fully functional, just without Caddy-level CrowdSec inspection (they still have OPNsense perimeter protection).

**In the Caddyfile:**

```caddyfile
# Before
cloud.netgrimoire.com {
    reverse_proxy http://nextcloud-aio-apache:11000
}

# After
cloud.netgrimoire.com {
    import crowdsec
    reverse_proxy http://nextcloud-aio-apache:11000
}

# With auth
dozzle.netgrimoire.com {
    import crowdsec
    import authentik
    reverse_proxy http://192.168.4.72:8043
}
```

**In Docker labels:**

```yaml
labels:
  - caddy=homepage.netgrimoire.com
  - caddy.import=crowdsec
  - caddy.import_1=authentik
  - caddy.reverse_proxy={{upstreams 3000}}
```
### CrowdSec Rollout Priority

Roll out `import crowdsec` in this order, based on risk exposure:

**High priority — do first (public, no auth):**

- `cloud.netgrimoire.com` (Nextcloud)
- `immich.netgrimoire.com`
- `docker.netgrimoire.com` (Portainer)
- `fish.pncharris.com`
- `www.wasted-bandwidth.net`

**Medium priority — high value behind auth:**

- `log.netgrimoire.com` (Graylog)
- `win.netgrimoire.com` (Proxmox)
- All of `dozzle`, `dns`, `webtop`, `jackett`, `transmission`, `scrutiny`

**Lower priority — already protected by Authelia/Authentik:**

- `stash.wasted-bandwidth.net`
- `namer.wasted-bandwidth.net`
- All label-defined services behind auth

**Skip:**

- Mailcow block — handled by nginx-mailcow, different threat model

### Behavior if CrowdSec Container Goes Down

The bouncer is designed to **fail open** by default. If `crowdsec` is unreachable, Caddy continues serving traffic normally — enforcement is temporarily suspended but the site stays up. This is the safe default for a homelab. To change this behavior, set `enable_hard_fails true` in the global crowdsec block (this will cause 500 errors if CrowdSec is down — not recommended for a homelab).

---
## Bootstrap Steps

When ready to migrate to the new image:

**Step 1 — Add the CrowdSec global block and snippet to the Caddyfile** before changing the image. This ensures the Caddyfile is valid for the new image on startup.

**Step 2 — Create `./crowdsec/acquis.yaml`** with the content above.

**Step 3 — Create `.env`** with a strong random value for `CROWDSEC_API_KEY`:

```bash
openssl rand -hex 32
```

**Step 4 — Update the image and add the CrowdSec service to the compose file**, then redeploy:

```bash
docker stack deploy -c docker-compose.yml caddy
```

**Step 5 — Verify CrowdSec is reading Caddy logs:**

```bash
docker exec <crowdsec_container> cscli metrics
```

Look for the `Acquisition Metrics` table showing hits from `/var/log/caddy/access.log`.

**Step 6 — Test a ban manually:**

```bash
docker exec <crowdsec_container> cscli decisions add --ip 1.2.3.4 --duration 5m
# From the banned IP (or after briefly banning your own client IP instead),
# verify that requests to Caddy now get a 403
curl -I https://yoursite.com
docker exec <crowdsec_container> cscli decisions delete --ip 1.2.3.4
```

**Step 7 — Gradually add `import crowdsec`** to site blocks and labels per the priority order above.

---
## File Layout

```
/export/Docker/caddy/
├── Caddyfile             # Shared snippets and static site blocks
├── docker-compose.yml    # Caddy + CrowdSec services
├── .env                  # CROWDSEC_API_KEY (future)
├── data/                 # Caddy data volume (TLS certs, etc.)
├── logs/                 # caddy-logs volume mount point (future)
└── crowdsec/
    └── acquis.yaml       # Tells CrowdSec where to read Caddy logs (future)
```
---

## Known Issues / Notes

- Port 80 is mapped to host port 8900 — this is intentional for Swarm. OPNsense NAT handles the external 80→8900 translation.
- The `CADDY_DOCKER_EVENT_THROTTLE_INTERVAL=2000` setting is **required** with the CrowdSec module to prevent non-deterministic domain matching behavior during container label reloads (see [issue #61](https://github.com/hslatman/caddy-crowdsec-bouncer/issues/61)).
- Jellyfin is commented out in the Caddyfile — likely served via a different path or disabled temporarily.
- The `web` upstream referenced by `fish.pncharris.com` and `www.wasted-bandwidth.net` resolves to a container named `web` on the `netgrimoire` network.
- The Authelia redirect URL is `https://login.wasted-bandwidth.net/` — update if this changes.
- The serfriz image is rebuilt on the **1st of each month** for module updates, and on every new Caddy release. Force a module update by recreating the container: `docker service update --force caddy_caddy`.
**File:** `Netgrimoire/Keystone-Grimoire/Docker/Swarm-Template.md` (new file, 144 lines)
---
title: Docker Swarm Template Standard
description: Canonical YAML template and label rules for all Netgrimoire swarm services
published: true
date: 2026-04-12T00:00:00.000Z
tags: keystone, docker, swarm
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Docker Swarm Template Standard

All Swarm YAML files in `services/swarm/` and `services/swarm/stack/` must follow this standard. The Gremlin audit workflow checks compliance weekly.

---

## Canonical Template
```yaml
# Deploy: docker stack deploy -c <service>.yaml <service>
services:
  <servicename>:
    image: <image>:latest
    environment:
      TZ: America/Chicago
    volumes:
      - /DockerVol/<servicename>:/config
      # - /data/nfs/znas/Docker/<servicename>:/data
    networks:
      - netgrimoire
    deploy:
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
        window: 120s
      placement:
        constraints:
          - node.hostname == znas
          - node.platform.arch != aarch64
          - node.platform.arch != arm
      labels:
        # Caddy
        caddy: <servicename>.netgrimoire.com
        caddy.reverse_proxy: <servicename>:<PORT>
        caddy.import: crowdsec
        caddy.import_1: authentik

        # Uptime Kuma
        kuma.<servicename>.http.name: <Service Name>
        kuma.<servicename>.http.url: https://<servicename>.netgrimoire.com

        # Homepage
        homepage.group: <Group>
        homepage.name: <Service Name>
        homepage.icon: <service>.png
        homepage.href: https://<servicename>.netgrimoire.com
        homepage.description: <Description>

        # DIUN
        diun.enable: "true"

networks:
  netgrimoire:
    external: true
```
---

## Forbidden Fields

Never use these at the service level:

| Field | Reason |
|-------|--------|
| `version:` | Deprecated in Compose v2+ |
| `container_name:` | Incompatible with Swarm replicas |
| `restart:` | Use `deploy.restart_policy` instead |
| `depends_on:` | Not supported in Swarm mode |
| `endpoint_mode: dnsrr` | Breaks internal DNS — always use VIP |
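For example, a Compose-style `restart:` maps onto the template's `deploy.restart_policy` block (the service name is a placeholder):

```yaml
# Before (Compose-style; fails the audit)
services:
  myservice:
    restart: unless-stopped

# After (Swarm template standard)
services:
  myservice:
    deploy:
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
        window: 120s
```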
---

## Volume Path Rules

| Path | When to Use |
|------|-------------|
| `/DockerVol/<service>` | Config, SQLite DBs, small app state. **Only valid with a `node.hostname` placement constraint.** |
| `/data/nfs/znas/Docker/<service>` | Bulk data, media, or any service without a hostname constraint |

---

## Placement Constraints

**Default (all services):**

```yaml
constraints:
  - node.hostname == znas
  - node.platform.arch != aarch64
  - node.platform.arch != arm
```

The ARM exclusion prevents accidental scheduling on Pi vault/worker nodes. Override it only if the service is ARM-specific.

For services pinned to docker4 (Gremlin stack):

```yaml
constraints:
  - node.hostname == docker4
  - node.platform.arch != aarch64
  - node.platform.arch != arm
```

---

## Caddy Label Rules

```yaml
caddy: servicename.netgrimoire.com       # no https:// prefix
caddy.reverse_proxy: servicename:PORT    # container name:port, NOT {{upstreams PORT}}
caddy.import: crowdsec                   # always both
caddy.import_1: authentik                # always both, no exceptions
```

Never use `{{upstreams PORT}}` — it breaks during `docker stack config` preprocessing.

**Wasted-bandwidth services** use the `wasted-bandwidth.net` domain and `caddy.import_1: authelia` instead of authentik.

---
## Deploy Workflow

```bash
# From the services repo root
git add . && git commit -m "Add/update <service>" && git push

# On znas (or docker4 for Gremlin services)
cd ~/services && git pull
cd swarm/stack/<StackName>
set -a && source .env && set +a
docker stack config --compose-file <service>.yaml > resolved.yml
docker stack deploy --compose-file resolved.yml <service>
rm resolved.yml
docker stack services <service>
```

59
Netgrimoire/Keystone-Grimoire/Hosts/Host-Inventory.md
Normal file
@@ -0,0 +1,59 @@
---
title: Host Inventory
description: All Netgrimoire nodes — roles, IPs, services, hardware
published: true
date: 2026-04-12T00:00:00.000Z
tags: keystone, hosts
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Host Inventory

## Swarm Cluster

| Host | Hostname | IP | Role | Runtime |
|------|----------|----|------|---------|
| znas | znas | 192.168.5.10 | NAS + Primary Swarm manager | Swarm manager + Compose |
| docker2 | — | — | VPN gateway | Compose only |
| docker3 | — | — | LibreNMS | Compose only |
| docker4 | hermes | 192.168.5.16 | Mail server + AI worker | Compose + Swarm worker |
| docker5 | — | 192.168.5.18 | Media host | Compose only |
| Pi nodes | various | various | Swarm workers + vault nodes | Swarm workers |

## Other Infrastructure

| Device | IP | Purpose |
|--------|----|---------|
| OPNsense firewall | 192.168.3.4 | Firewall, dual-WAN, NAT, WireGuard |
| Internal DNS | 192.168.5.7 | Technitium DNS |
| ISPConfig | 192.168.4.11 | Web/DNS hosting control panel |

## WAN

| Interface | IP | Status | Purpose |
|-----------|----|--------|---------|
| ATT (`igc1`) | 107.133.34.145/28 | Primary | 5 static IPs allocated |
| Cox | — | Retiring | Legacy WAN |

**ATT Static IP Assignments:**

| IP | Assigned To |
|----|-------------|
| .145 | Admin / default |
| .146 | Web services |
| .147 | Jellyfin |
| .148 | Mail (ATT_Mail — pending) |
| .149 | WireGuard / Spare |

## Pinned Services by Host

**znas** — Caddy, Forgejo, Wiki.js, Homepage, Uptime Kuma, AutoKuma, ntfy, Portainer, Authentik, LLDAP, Kopia, Vault, Nextcloud AIO, Immich, Joplin, n8n (Gremlin), all arr services, all media services

**docker4 (hermes)** — MailCow (Compose), Ollama, Open WebUI, Qdrant (Swarm, pinned docker4), Roundcube

**docker5** — Jellyfin, Jellyfinx (Compose)

**docker2** — Gluetun, Jackett, Transmission (Compose)

**docker3** — LibreNMS (Compose)

401
Netgrimoire/Keystone-Grimoire/Mail/Domain-Setup.md
Normal file
@@ -0,0 +1,401 @@
---
title: Sample Domain Setup
description: Graymutt@nucking-futz.com
published: true
date: 2026-03-16T00:34:08.387Z
tags:
editor: markdown
dateCreated: 2026-02-25T22:02:27.719Z
---

# Mail Setup — nucking-futz.com

## Part 0 — OPNsense: Configure ATT_Mail Secondary IP

Before configuring DNS or Mailcow, the secondary AT&T static IP must be set up in OPNsense as a virtual IP on the WAN interface, with NAT rules ensuring that only raw SMTP/IMAP traffic (ports 25, 465, 587, 993, 143) uses this address. Webmail, the Mailcow admin UI, and all other traffic continue to use the primary WAN IP (107.133.34.145).

| Address | Purpose |
|---------|---------|
| 107.133.34.145 | Primary WAN — web, admin, everything else |
| 107.133.34.146 | ATT_Mail — SMTP/IMAP inbound and outbound only |

### Step 0.1 — Add Virtual IP

1. Go to **Interfaces → Virtual IPs → Settings**
2. Click **+ Add**
3. Set the following:

| Field | Value |
|-------|-------|
| Mode | IP Alias |
| Interface | WAN (igc1) |
| Network / Address | `107.133.34.146 / 28` |
| Description | `ATT_Mail` |

4. Click **Save**, then **Apply changes**

> The /28 subnet mask matches the AT&T block (107.133.34.144/28). All 5 static IPs in the block share this mask.

### Step 0.2 — Outbound NAT for SMTP Traffic

This ensures Mailcow's outbound SMTP connections leave through the ATT_Mail IP rather than the primary WAN IP. OPNsense must be in **Hybrid** or **Manual** outbound NAT mode.

1. Go to **Firewall → NAT → Outbound**
2. Confirm mode is set to **Hybrid Outbound NAT** (or Manual — either works)
3. Click **Add** to create a new rule

**Rule for outbound SMTP (port 587 relay to MXRoute):**

| Field | Value |
|-------|-------|
| Interface | WAN |
| TCP/IP Version | IPv4 |
| Protocol | TCP |
| Source | `192.168.5.16 / 32` (Mailcow host) |
| Source Port | any |
| Destination | any |
| Destination Port | 587 |
| Translation / Target | `107.133.34.146` (ATT_Mail) |
| Description | `Mailcow outbound relay via ATT_Mail` |

4. Repeat for port **25** (direct outbound SMTP, if used) and port **465** (SMTPS)
5. Click **Save** and **Apply changes**

### Step 0.3 — Inbound NAT (Port Forwards) for Mail Ports

Route inbound connections on mail ports to Mailcow using the ATT_Mail IP as the external address.

1. Go to **Firewall → NAT → Port Forward**
2. Create rules for each mail port:

| External IP | Port(s) | Forward to | Description |
|-------------|---------|------------|-------------|
| 107.133.34.146 | 25 | 192.168.5.16:25 | SMTP inbound |
| 107.133.34.146 | 465 | 192.168.5.16:465 | SMTPS inbound |
| 107.133.34.146 | 587 | 192.168.5.16:587 | Submission inbound |
| 107.133.34.146 | 993 | 192.168.5.16:993 | IMAPS |
| 107.133.34.146 | 143 | 192.168.5.16:143 | IMAP (if needed) |

> **Do not** add port forwards for 80, 443, or 3443 (Mailcow admin/webmail ports) on this IP. Those remain on the primary WAN IP via Caddy.

3. Click **Save** and **Apply changes**

### Step 0.4 — Firewall Rules

Ensure the WAN firewall rules permit inbound traffic on the mail ports to the ATT_Mail IP. If you have a default deny-all WAN rule (recommended), add explicit pass rules:

1. Go to **Firewall → Rules → WAN**
2. Add pass rules for each port in the table above with destination `107.133.34.146`

### Step 0.5 — Verify

```bash
# From outside your network, confirm the mail IP is live
telnet 107.133.34.146 25
# Should see: 220 hermes.netgrimoire.com ESMTP

# Confirm primary WAN IP does NOT respond on port 25
telnet 107.133.34.145 25
# Should time out or be refused

# Check that Mailcow outbound connections leave from the ATT_Mail IP
# Send a test to check-auth@verifier.port25.com and inspect the Return-Path
# or check the Received: header — the sending IP should be 107.133.34.146
```

> ⚠ If the verify step shows port 25 still responding on 107.133.34.145, check that no leftover port forward rules exist on the primary WAN IP for mail ports.

---

## Overview

This guide covers complete mail setup for `nucking-futz.com` using MXRoute as the inbound gateway and Mailcow as the mailbox host. MXRoute receives all inbound mail from the internet (solving residential-IP filtering issues with banks and financial institutions) and forwards it to Mailcow for storage and retrieval. Mailcow hands outbound mail to the MXRoute SMTP relay.

**Architecture:**

```
Inbound:  Internet → MXRoute (commercial IP) → Mailcow (192.168.5.16)
Outbound: Mailcow → MXRoute SMTP relay → Internet
```

**Why two domains in Mailcow:**
MXRoute forwarders require a valid destination email address. You cannot forward `graymutt@nucking-futz.com` back to `graymutt@nucking-futz.com` — that loops. The solution is to have Mailcow own a subdomain (`mail.nucking-futz.com`) with its own MX record pointing directly to your server. MXRoute forwards to `graymutt@mail.nucking-futz.com`, Mailcow delivers locally, and an alias domain maps `nucking-futz.com` back so users only ever see and use `graymutt@nucking-futz.com`.

---

## Prerequisites

- MXRoute account active with DirectAdmin access
- Mailcow running at 192.168.5.16
- DNS management access for nucking-futz.com
- Your MXRoute server hostname from your MXRoute welcome email (e.g. `arrow.mxrouting.net`)

---

## Step 1 — DNS Records

Create all DNS records before configuring either service. Keep TTL at 300 during setup — raise it to 3600 once everything is confirmed working.

![mail_nf_com_1.png](/mail/mail_nf_com_1.png)

![mail_nf_com_2.png](/mail/mail_nf_com_2.png)

![mail_nf_com_3.png](/mail/mail_nf_com_3.png)

### Required DNS Records

| Type | Host | Value | Notes |
|------|------|-------|-------|
| A | `mail` | `YOUR_ATT_MAIL_IP` | Points to Mailcow — MXRoute forwards to this server |
| MX | `@` | `heracles.mxrouting.net` (priority 10) | Check MXRoute welcome email for exact hostname |
| MX | `@` | `heracles-relay.mxrouting.net` (priority 20) | Secondary MXRoute server from welcome email |
| MX | `mail` | `mail.nucking-futz.com` (priority 10) | Mailcow handles this subdomain directly |
| CNAME | `imap` | `mail.nucking-futz.com` | Client autoconfiguration |
| CNAME | `smtp` | `mail.nucking-futz.com` | Client autoconfiguration |
| CNAME | `webmail` | `mail.nucking-futz.com` | Roundcube access |
| CNAME | `autodiscover` | `mail.nucking-futz.com` | Outlook autodiscover |
| CNAME | `autoconfig` | `mail.nucking-futz.com` | Thunderbird autoconfig |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` | SPF — authorizes both Mailcow direct and MXRoute relay |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` | SPF for subdomain — Mailcow sends directly from here |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com` | DMARC enforcement |

> DKIM TXT records (two selectors) are added in Steps 2 and 3 after generating keys in Mailcow and MXRoute.
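
The same records, rendered as zone-file lines for pasting into a raw zone editor (hostnames and the `YOUR_ATT_MAIL_IP` placeholder are the table's examples — substitute your actual MXRoute hostnames and ATT_Mail IP):

```dns
mail     IN A     YOUR_ATT_MAIL_IP
@        IN MX 10 heracles.mxrouting.net.
@        IN MX 20 heracles-relay.mxrouting.net.
mail     IN MX 10 mail.nucking-futz.com.
webmail  IN CNAME mail.nucking-futz.com.
@        IN TXT   "v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all"
mail     IN TXT   "v=spf1 ip4:YOUR_ATT_MAIL_IP -all"
_dmarc   IN TXT   "v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com"
```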

---

## Step 2 — Mailcow Configuration

### 2.1 Add the Subdomain as Primary Domain

Mailcow owns `mail.nucking-futz.com` as its active mail domain. Mailboxes live internally on this subdomain.

1. Log into the Mailcow admin UI → **Mail Setup → Domains**
2. Click **Add domain**
3. Set **Domain:** `mail.nucking-futz.com`
4. Leave all other settings as default
5. Click **Add domain**

### 2.2 Add the Alias Domain

This makes Mailcow accept mail addressed to `@nucking-futz.com` and deliver it to the matching `@mail.nucking-futz.com` mailbox. Users send and receive as `@nucking-futz.com` — the subdomain is invisible to them.

1. Go to **Mail Setup → Alias Domains**
2. Click **Add alias domain**
3. Set **Alias Domain:** `nucking-futz.com`
4. Set **Target Domain:** `mail.nucking-futz.com`
5. Click **Add**

### 2.3 Create Mailbox

1. Go to **Mail Setup → Mailboxes**
2. Click **Add mailbox**
3. Set **Username:** `graymutt`
4. Set **Domain:** `mail.nucking-futz.com`
5. Set a strong password
6. Set quota as needed
7. Click **Add**

The mailbox is internally `graymutt@mail.nucking-futz.com`. The alias domain from Step 2.2 means Mailcow also accepts and delivers mail for `graymutt@nucking-futz.com` to this same mailbox.
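
The resulting address resolution, sketched:

```
graymutt@nucking-futz.com          ← the only address users ever see or use
        │  alias domain: nucking-futz.com → mail.nucking-futz.com
        ▼
graymutt@mail.nucking-futz.com     ← actual mailbox stored in Mailcow
```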

### 2.4 Generate DKIM Key

1. Go to **Configuration → Configuration & Diagnostics → Configuration**
2. Click the **ARC/DKIM Keys** tab
3. Select domain `mail.nucking-futz.com`
4. Set **Selector:** `mailcow`
5. Set **Key length:** 2048
6. Click **Generate**
7. Copy the full TXT record value — needed for DNS

### 2.5 Add Mailcow DKIM DNS Record

| Type | Host | Value |
|------|------|-------|
| TXT | `mailcow._domainkey.mail` | *(full key string from Mailcow — begins with `v=DKIM1;`)* |

### 2.6 Add MXRoute to Trusted Networks

Prevents Mailcow from applying spam scoring to forwarded mail arriving from MXRoute's IPs.

1. Go to **Configuration → Configuration & Diagnostics → Configuration**
2. Click the **Extra Postfix configuration** tab
3. Add to `extra.cf`:

```
# Trust MXRoute forwarding IPs
mynetworks = 127.0.0.1/8 [::1]/128 192.168.5.0/24 69.167.160.0/19 198.54.120.0/22
```

> Verify current MXRoute IP ranges in your MXRoute account documentation — these may change.

4. Click **Save**
5. Click **Restart affected containers**

### 2.7 Configure Outbound Relay

Routes outbound mail through MXRoute for best deliverability.

1. Go to **Configuration → Routing → Sender-Dependent Transports**
2. Click **Add transport**
3. Set **Domain:** `nucking-futz.com`
4. Set **Relay host:** `[smtp.mxroute.com]:587` (confirm the SMTP hostname from your MXRoute welcome email)
5. Set **Username:** your MXRoute relay username
6. Set **Password:** your MXRoute relay password
7. Click **Add**
8. Repeat for domain `mail.nucking-futz.com` using the same relay credentials

---

## Step 3 — MXRoute Configuration

### 3.1 Add Domain in DirectAdmin

1. Log into MXRoute DirectAdmin
2. Go to **Account Manager → Domain Setup**
3. Add domain: `nucking-futz.com`
4. Complete the domain wizard

### 3.2 Create Forwarder

MXRoute does not support domain-level remote MX routing — forwarders must be created per address. The destination must be on a domain whose MX resolves to Mailcow, not back to MXRoute.

1. Go to **Forwarders** in the MXRoute control panel
2. Click **Create New Forwarder**
3. Set **Forwarder Name:** `graymutt` (the `@nucking-futz.com` part is shown automatically)
4. Set **Destination Type:** `Forward to Email(s)`
5. Set **Recipients:** `graymutt@mail.nucking-futz.com`
6. Click **Create Forwarder**

> Every new mailbox requires a matching forwarder entry. The pattern is always `user@nucking-futz.com` → `user@mail.nucking-futz.com`. See the Adding a New Mailbox section below.

### 3.3 Get MXRoute DKIM Key

1. Go to **Email Manager → DKIM Keys** for `nucking-futz.com`
2. Generate or view the DKIM key — note the selector name assigned (often `x`)
3. Copy the full TXT record value

### 3.4 Add MXRoute DKIM DNS Record

| Type | Host | Value |
|------|------|-------|
| TXT | `x._domainkey` *(replace `x` with MXRoute's actual selector)* | *(full key string from MXRoute DirectAdmin)* |

---

## Step 4 — Verify DNS

Once DNS has propagated, verify all records:

```bash
# MX for main domain — should show MXRoute servers
dig MX nucking-futz.com +short

# MX for subdomain — should show mail.nucking-futz.com
dig MX mail.nucking-futz.com +short

# A record — should show your ATT IP
dig A mail.nucking-futz.com +short

# SPF
dig TXT nucking-futz.com +short
dig TXT mail.nucking-futz.com +short

# DMARC
dig TXT _dmarc.nucking-futz.com +short

# DKIM — Mailcow
dig TXT mailcow._domainkey.mail.nucking-futz.com +short

# DKIM — MXRoute (replace x with your selector)
dig TXT x._domainkey.nucking-futz.com +short
```

Run a full check at [https://mxtoolbox.com](https://mxtoolbox.com) → Email Health for `nucking-futz.com`.

---

## Step 5 — Test Mail Flow

### Inbound Test

Send a test email to `graymutt@nucking-futz.com` from an external Gmail or Outlook account. Verify:

- Mail arrives in the Mailcow mailbox
- Headers show the MXRoute → Mailcow forwarding path (two `Received:` hops)
- No spam flagging

In Roundcube, open the test message → **More → View Source** and check the `Received:` chain.
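
That header check can also be done from a shell against a saved message source — a minimal sketch, with illustrative hostnames standing in for the real hops:

```shell
# Paste the message source (View Source → copy) into a file, then count the hops.
# The newest Received: header is on top — Mailcow's hop first, MXRoute's beneath it.
cat > /tmp/msg-headers.txt <<'EOF'
Received: from heracles.mxrouting.net by mail.nucking-futz.com ...
Received: from mail-sor-f41.google.com by heracles.mxrouting.net ...
EOF
grep -c '^Received:' /tmp/msg-headers.txt   # prints 2 — one hop per relay
```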

### Outbound Test

Send from `graymutt@nucking-futz.com` to an external Gmail address. Run a message through [https://mail-tester.com](https://mail-tester.com) for a full delivery score.

### DKIM/SPF/DMARC Test

Send a test to `check-auth@verifier.port25.com` — you will receive an automated reply confirming pass/fail for SPF, DKIM, and DMARC.

### Bank/Financial Test

Send from a bank address to `graymutt@nucking-futz.com` and confirm delivery. This is the primary goal — banks see MXRoute's commercial IPs in the MX record, not your residential AT&T IP.

---

## Email Client Settings

| Setting | Value |
|---------|-------|
| Email address | `graymutt@nucking-futz.com` |
| IMAP server | `mail.nucking-futz.com` |
| IMAP port | `993` (SSL/TLS) |
| SMTP server | `mail.nucking-futz.com` |
| SMTP port | `465` (SSL/TLS) |
| Username | `graymutt@nucking-futz.com` |
| Password | *(mailbox password set in Step 2.3)* |

> Users log in and send as `graymutt@nucking-futz.com`. Mailcow resolves this to the internal `mail.nucking-futz.com` mailbox transparently via the alias domain.

---

## Adding a New Mailbox

Every new address on `nucking-futz.com` requires entries in both Mailcow and MXRoute.

**In Mailcow:**
1. Mail Setup → Mailboxes → Add mailbox
2. Username: `newuser`, Domain: `mail.nucking-futz.com`

**In MXRoute control panel:**
1. Forwarders → Create New Forwarder
2. Forwarder Name: `newuser`, Destination Type: `Forward to Email(s)`, Recipients: `newuser@mail.nucking-futz.com`

---

## Credentials Reference

| Service | Account | Password |
|---------|---------|----------|
| Mailcow mailbox | `graymutt@mail.nucking-futz.com` | *(set during mailbox creation)* |
| MXRoute relay | *(from MXRoute welcome email)* | *(from MXRoute welcome email)* |
| MXRoute DirectAdmin | *(from MXRoute welcome email)* | *(from MXRoute welcome email)* |

---

## Known Gotchas

**Forwarder destination must not loop.** Never set the MXRoute forwarder destination to an address on the same domain that has MXRoute as its MX. `graymutt@nucking-futz.com` → `graymutt@nucking-futz.com` will loop. Always forward to `@mail.nucking-futz.com`, which has its own MX resolving directly to Mailcow.

**Two DKIM selectors required.** `mailcow._domainkey.mail.nucking-futz.com` covers mail Mailcow sends directly from the subdomain. `x._domainkey.nucking-futz.com` (MXRoute selector) covers outbound mail relayed through MXRoute. Both must exist for DMARC to pass on all paths.

**New mailboxes need matching MXRoute forwarders.** MXRoute has no catch-all forwarding to remote servers. Every address that needs to receive mail must have an explicit forwarder in DirectAdmin. Add the MXRoute forwarder step to your mailbox creation checklist.

**Alias domain vs. alias mailbox.** The alias domain in Step 2.2 maps the entire `nucking-futz.com` domain to `mail.nucking-futz.com`. Do not also create individual alias mailboxes for the same addresses — this creates duplicate delivery and may cause unexpected behavior.

**SPF differs between the two domains.** The main domain's SPF includes `include:mxroute.com` because the MXRoute relay sends outbound from there. The subdomain SPF (`mail.nucking-futz.com`) only needs your ATT IP — Mailcow sends directly from that domain without going through MXRoute. Two different records for two different send paths.

---

## Related Documentation

- [MailCow Configuration](./mailcow)
- [MXRoute Outbound Relay Setup](./mxroute-outbound-relay)
- [OPNsense Firewall](./opnsense-firewall) — static IP allocation for ATT_Mail

391
Netgrimoire/Keystone-Grimoire/Mail/Hardening.md
Normal file
@@ -0,0 +1,391 @@
---
title: MailCow Hardening
description: Securing Mailcow
published: true
date: 2026-02-23T21:56:32.211Z
tags:
editor: markdown
dateCreated: 2026-02-23T21:56:22.997Z
---

# MailCow Security Hardening

**Service:** MailCow Dockerized
**Host:** 192.168.5.16 (MailCow_Ngnx alias)
**Relay:** MXRoute (outbound only)
**Last Reviewed:** February 2026

---

## Overview

Running MailCow with MXRoute as an outbound relay creates a specific threat model that's different from either a fully self-hosted or fully managed setup. Your server receives inbound directly (MX points to your IP), stores all mailboxes locally, and hands outbound to MXRoute. This means you carry the risk surface of both — inbound SMTP exposure plus the credential and reputation exposure of a relay relationship.

The security areas that matter most for this setup:

| Area | Risk | Priority |
|---|---|---|
| DNS authentication (SPF/DKIM/DMARC) | Spoofing, deliverability failure, relay abuse | 🔴 Critical |
| MTA-STS + TLS-RPT | SMTP downgrade attacks on inbound | 🔴 Critical |
| MXRoute relay credential security | Relay hijacking, spam abuse on your reputation | 🔴 Critical |
| Mailcow admin hardening | Account takeover, open relay creation | 🔴 Critical |
| Postfix TLS hardening | Weak cipher negotiation | 🟡 High |
| Nginx header hardening | XSS, clickjacking on webmail | 🟡 High |
| Rspamd tuning | Inbound spam, outbound policy enforcement | 🟡 High |
| DMARC reporting | Visibility into spoofing and misdelivery | 🟡 High |
| ClamAV / attachment scanning | Malware distribution via your domain | 🟢 Medium |
| Rate limiting | Compromised account spam runs | 🟢 Medium |

---

## DNS Authentication

This is the foundation. If any of these records are misconfigured, your mail either doesn't deliver or your domain gets spoofed. With MXRoute in the mix, the SPF record requires special attention.

### SPF — Include Both Sources

Your SPF must authorize **both** your own IP (for any direct sends) and MXRoute's sending infrastructure:

```dns
@ IN TXT "v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com ~all"
```

Replace `YOUR_ATT_MAIL_IP` with the static IP you've dedicated to mail (the ATT_Mail virtual IP). The `include:mxroute.com` covers MXRoute's sending servers.

> ⚠ Do not use `-all` (hard fail) until you have confirmed all your sending sources are covered. Use `~all` (softfail) initially, then tighten after verifying DMARC reports show no legitimate sources failing.

> ⚠ SPF has a **10 DNS lookup limit**. Each `include:` costs lookups. If you add more includes (e.g. transactional services), check your SPF lookup count at [mxtoolbox.com/spf](https://mxtoolbox.com/spf.aspx).
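
As a rough local sanity check, the lookup-consuming mechanisms (`include:`, `a`, `mx`, `ptr`, `exists:`, `redirect=`) can be counted straight from the record string — a minimal sketch, using the record above as input (it does not follow nested includes, so treat it as a lower bound):

```shell
SPF='v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com ~all'
# Strip the optional qualifier (+ - ~ ?) and count mechanisms that cost a DNS lookup
echo "$SPF" | tr ' ' '\n' | sed 's/^[+~?-]//' \
  | grep -cE '^(include:|a($|:|/)|mx($|:|/)|ptr($|:)|exists:|redirect=)'
# prints 1 — well under the limit of 10
```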
|
||||
|
||||
### DKIM — Two Selectors for Two Signers
|
||||
|
||||
Because MXRoute re-signs outbound mail with their own DKIM key, you need a DKIM record for both signers:
|
||||
|
||||
| Selector | Signer | Where to get the key |
|
||||
|---|---|---|
|
||||
| `mailcow._domainkey` | MailCow (inbound, internal sends) | MailCow UI → Configuration → ARC/DKIM Keys |
|
||||
| `mxroute._domainkey` (or `x._domainkey`) | MXRoute (outbound relay) | MXRoute control panel |
|
||||
|
||||
Add both as TXT records. Having both means DMARC passes regardless of which path the mail took.
|
||||
|
||||
> ✓ MailCow lets you choose the DKIM selector name. Use `mailcow` as the selector to avoid confusion with the MXRoute selector.
|
||||
|
||||
### DMARC — Start Monitoring, Then Enforce
|
||||
|
||||
DMARC ties SPF and DKIM together and tells receiving servers what to do with failures. Start in monitoring mode, review reports for 2–4 weeks, then advance to enforcement.
|
||||
|
||||
**Phase 1 — Monitor (add immediately):**
|
||||
```dns
|
||||
_dmarc IN TXT "v=DMARC1; p=none; rua=mailto:dmarc-reports@yourdomain.com; ruf=mailto:dmarc-failures@yourdomain.com; fo=1"
|
||||
```
|
||||
|
||||
**Phase 2 — Quarantine (after reviewing reports, no legitimate failures):**
|
||||
```dns
|
||||
_dmarc IN TXT "v=DMARC1; p=quarantine; pct=100; rua=mailto:dmarc-reports@yourdomain.com; fo=1"
|
||||
```
|
||||
|
||||
**Phase 3 — Reject (final enforcement):**
|
||||
```dns
|
||||
_dmarc IN TXT "v=DMARC1; p=reject; pct=100; rua=mailto:dmarc-reports@yourdomain.com; fo=1"
|
||||
```
|
||||
|
||||
> ✓ `fo=1` requests forensic reports on any authentication failure — more detail for debugging.
|
||||
|
||||
**DMARC Report Processing:** Raw DMARC reports are XML and not human-readable. Use one of these free tools to process them:
|
||||
- [Postmark DMARC](https://dmarc.postmarkapp.com/) — free, email-based weekly digest
|
||||
- [dmarcian.com](https://dmarcian.com) — free tier, dashboard view
|
||||
- Self-hosted: [Parsedmarc](https://github.com/domainaware/parsedmarc) → send to Graylog/Grafana
|
||||
|
||||
---
|
||||
|
||||
## MTA-STS (MailCow September 2025+)
|
||||
|
||||
MTA-STS forces other mail servers to use TLS when delivering to you, preventing downgrade attacks that try to force plaintext SMTP. The September 2025 MailCow update added the `postfix-tlspol-mailcow` container which enforces MTA-STS on **outbound** connections too.
|
||||
|
||||
### What You Need
|
||||
|
||||
**1. DNS records** — three records for each domain:
|
||||
|
||||
```dns
|
||||
# For your mail server's hostname domain (e.g. netgrimoire.com)
|
||||
mta-sts IN CNAME mail.netgrimoire.com.
|
||||
_mta-sts IN TXT "v=STSv1; id=20260223"
|
||||
_smtp._tls IN TXT "v=TLSRPTv1; rua=mailto:tls-reports@netgrimoire.com"
|
||||
```
|
||||
|
||||
The `id` value in `_mta-sts` is a version string — update it (e.g. to today's date) whenever you change your MTA-STS policy.
|
||||
|
||||
**2. Policy file** — served by MailCow's nginx at `https://mta-sts.yourdomain.com/.well-known/mta-sts.txt`:
|
||||
|
||||
```bash
|
||||
# On your MailCow host:
|
||||
mkdir -p /opt/mailcow-dockerized/data/web/.well-known/
|
||||
cat > /opt/mailcow-dockerized/data/web/.well-known/mta-sts.txt << 'EOF'
|
||||
version: STSv1
|
||||
mode: enforce
|
||||
max_age: 86400
|
||||
mx: mail.netgrimoire.com
|
||||
EOF
|
||||
```
|
||||
|
||||
Start with `mode: testing` for the first week, then switch to `mode: enforce`.
|
||||
|
||||
**3. For additional domains** — add CNAMEs pointing to your primary domain's records:
|
||||
|
||||
```dns
|
||||
# For each additional mail domain you host on MailCow:
|
||||
mta-sts.otherdomain.com IN CNAME mail.netgrimoire.com.
|
||||
_mta-sts.otherdomain.com IN CNAME _mta-sts.netgrimoire.com.
|
||||
_smtp._tls.otherdomain.com IN CNAME _smtp._tls.netgrimoire.com.
|
||||
```
|
||||
|
||||
> ✓ TLS-RPT (`_smtp._tls` TXT record) sends you reports about TLS failures when other servers connect to you. Pipe these to Graylog or Postmark for visibility.
|
||||
|
||||
---
|
||||
|
||||
## MXRoute Relay Security
|
||||
|
||||
This is the most overlooked area. Your MXRoute credentials can send mail as your domain — if they're compromised, someone else is spamming from your reputation.
|
||||
|
||||
### Credential Hardening
|
||||
|
||||
- Use a **unique, strong password** for your MXRoute account — not shared with anything else
|
||||
- Store the MXRoute SMTP credentials in MailCow's relay configuration only, not in any config file or environment variable that gets committed to git
|
||||
- If MXRoute supports API tokens or app passwords, use those instead of your main account password
|
||||
|
||||
### Relay Configuration in MailCow
|
||||
|
||||
In MailCow UI: **Configuration → Routing → Sender-Dependent Transports**
|
||||
|
||||
Verify the relay is configured to authenticate via TLS (port 587 with STARTTLS or port 465 with SSL). Do not relay over port 25 without authentication.
|
||||
|
||||
```
|
||||
# What the relay entry should look like in Postfix terms:
|
||||
# relayhost = [smtp.mxroute.com]:587
|
||||
# smtp_sasl_auth_enable = yes
|
||||
# smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
|
||||
# smtp_tls_security_level = encrypt ← ensures TLS is required, not optional
|
||||
```
|
||||
|
||||
> ⚠ Set `smtp_tls_security_level = encrypt` (not `may`) so the connection to MXRoute is always encrypted. If the TLS negotiation fails, Postfix should reject rather than fall back to plaintext.
|
||||
|
||||
### Rate Limiting (Prevent Relay Abuse if Account Compromised)
|
||||
|
||||
Add rate limits in MailCow UI: **Configuration → Mail Setup → Domains → [your domain] → Rate Limit**
|
||||
|
||||
| Setting | Recommended Value | Notes |
|
||||
|---|---|---|
|
||||
| Outbound messages/hour | 500 | Adjust for your actual sending volume |
|
||||
| Outbound messages/day | 2000 | A sudden spike above this = red flag |
|
||||
|
||||
This doesn't stop abuse but limits blast radius if a mailbox is compromised and starts spamming through MXRoute.
|
||||
|
||||
---
|
||||
|
||||
## MailCow Admin Hardening
|
||||
|
||||
### Two-Factor Authentication
|
||||
|
||||
Enable 2FA on the admin account and all mailbox accounts that have access to the admin panel.
|
||||
|
||||
MailCow UI: **Edit mailbox → Two-Factor Authentication → TOTP**
|
||||
|
||||
> ⚠ There was a session fixation vulnerability in the MailCow web panel (GHSA-23c8-4wwr-g3c6, January 2025) and a critical SSTI vulnerability (GHSA-8p7g-6cjj-wr9m, July 2025). Both require staying current on updates. Enable auto-updates or check the MailCow blog monthly.
|
||||
|
||||
### Restrict Admin UI to Internal Network
|
||||
|
||||
The MailCow admin panel should not be reachable from the public internet. Access should require being on your internal network or connected via WireGuard.
|
||||
|
||||
In OPNsense, add a firewall rule blocking external access to port 443 on 192.168.5.16 except from your static admin IP or WireGuard peers.
|
||||
|
||||
Alternatively, configure MailCow's nginx to restrict the admin path by IP:
|
||||
|
||||
```nginx
|
||||
# In data/conf/nginx/includes/site-defaults.conf
|
||||
# Add inside the server block for the admin panel:
|
||||
location /admin {
|
||||
allow 192.168.3.0/24;
|
||||
allow 192.168.5.0/24;
|
||||
allow 192.168.32.0/24; # WireGuard peers
|
||||
deny all;
|
||||
}
|
||||
```
|
||||
|
||||
### API Key Rotation
|
||||
|
||||
If you use the MailCow API (for automation or Netgrimoire tooling), generate a dedicated read-only key where possible, and rotate keys annually or after any suspected compromise.
|
||||
|
||||
---
|
||||
|
||||
## Postfix TLS Hardening

Add to `/opt/mailcow-dockerized/data/conf/postfix/extra.cf`:

```ini
# Enforce TLS 1.2+ and strong ciphers
tls_high_cipherlist = ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
tls_preempt_cipherlist = yes

# Inbound SMTP (smtpd) — receiving from other mail servers
smtpd_tls_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
smtpd_tls_ciphers = high
smtpd_tls_mandatory_ciphers = high

# Outbound SMTP (smtp) — delivery to MXRoute and direct sends
smtp_tls_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
smtp_tls_ciphers = high
smtp_tls_mandatory_ciphers = high

# Require encryption on the MXRoute relay connection
smtp_tls_security_level = encrypt
```

After editing, restart Postfix:

```bash
cd /opt/mailcow-dockerized
docker compose restart postfix-mailcow
```
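After the restart, it's worth confirming the overrides are actually active in the running config. A small sketch: the check function takes `postconf` output as a string (so it can be tested offline); the `docker exec` line in the comment assumes the default container name from `COMPOSE_PROJECT_NAME=mailcow`.

```shell
#!/bin/sh
# Sanity check: confirm the extra.cf TLS overrides appear in the running
# Postfix configuration. Takes postconf output as an argument so the logic
# can be exercised without a live container.
check_tls_overrides() {
    conf="$1"
    for key in smtpd_tls_protocols smtp_tls_protocols smtp_tls_security_level; do
        printf '%s\n' "$conf" | grep -q "^$key" || { echo "MISSING: $key"; return 1; }
    done
    echo "TLS overrides present"
}

# On the host (container name assumed from the default compose project name):
# check_tls_overrides "$(docker exec mailcow-postfix-mailcow-1 postconf -n)"
```

If any key prints as MISSING, the override file was not picked up; re-check the path and restart the container again.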

---

## Nginx Header Hardening

Add to `/opt/mailcow-dockerized/data/conf/nginx/includes/site-defaults.conf`:

```nginx
# Strong SSL ciphers only
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;
ssl_conf_command Options PrioritizeChaCha;

# HSTS — include subdomains if all your services use HTTPS
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";

# Disable X-XSS-Protection (deprecated, CSP replaces it)
add_header X-XSS-Protection "0";

# Deny unused browser permissions
add_header Permissions-Policy "accelerometer=(), ambient-light-sensor=(), autoplay=(), battery=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=()";

# Content Security Policy — if NOT using Gravatar with SOGo
add_header Content-Security-Policy "default-src 'none'; connect-src 'self' https://api.github.com; font-src 'self' https://fonts.gstatic.com; img-src 'self' data:; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; frame-ancestors 'none'; upgrade-insecure-requests; block-all-mixed-content; base-uri 'none'";

# Cross-origin isolation headers
add_header Cross-Origin-Resource-Policy same-origin;
add_header Cross-Origin-Opener-Policy same-origin;
add_header Cross-Origin-Embedder-Policy require-corp;

# Disable gzip to prevent BREACH attack
# Change gzip on; → gzip off; in the main nginx conf
```

> ⚠ The December 2025 MailCow update already removed the deprecated `X-XSS-Protection` header from defaults. If you're current, you may already have this. Check before duplicating.

After editing, restart nginx:

```bash
docker compose restart nginx-mailcow
```

---
## Rspamd Tuning

Rspamd is MailCow's spam filter. The defaults are reasonable, but a few adjustments improve both inbound protection and outbound policy enforcement.

### Key Settings to Review

Navigate to **MailCow UI → Configuration → Rspamd UI** (or directly at `https://mail.yourdomain.com/rspamd/`).

**Actions → Score Thresholds:**

| Action | Default | Recommended |
|---|---|---|
| Greylist | 4 | 3 |
| Add header | 6 | 5 |
| Reject | 15 | 12 |

Lowering the reject threshold from 15 to 12 rejects more aggressive spam outright while remaining high enough to avoid false positives.

**Modules to enable/verify:**

| Module | Purpose |
|---|---|
| DKIM verification | Verify incoming DKIM signatures |
| SPF | Verify incoming SPF |
| DMARC | Enforce DMARC on inbound |
| MX Check | Verify sending domain has a valid MX |
| RBL (Realtime Blacklists) | Check sending IPs against blocklists |
| Greylisting | Temporarily rejects new senders (forces retry) |

### Add CrowdSec as an Rspamd Feed

If you also have the CrowdSec bouncer running on the MailCow host (or can reach it), you can feed CrowdSec decisions into Rspamd to reject mail from banned IPs. This is advanced but powerful — see the [CrowdSec Bouncer for Rspamd](https://hub.crowdsec.net) hub entry.
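If you prefer configuration as code, the recommended thresholds map onto Rspamd's standard `actions.conf` keys. This is a sketch only: Mailcow normally persists UI threshold changes itself, and the path below is Rspamd's generic `local.d` convention, not something Mailcow-specific — confirm your Mailcow version honors file overrides before relying on it.

```
# local.d/actions.conf — Rspamd action thresholds (hypothetical override path)
greylist = 3;
add_header = 5;
reject = 12;
```

Whichever way you set them, verify the effective values afterward in the Rspamd UI under Configuration.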

---

## Deliverability Verification

Run these checks after making any DNS or config changes:

| Tool | What It Checks | URL |
|---|---|---|
| MXToolbox | SPF, DKIM, DMARC, MX, PTR, blacklists | mxtoolbox.com |
| mail-tester.com | Send a test email, get a 1–10 score | mail-tester.com |
| Port25 verifier | Send to check-auth@verifier.port25.com | Email-based |
| DKIM validator | Validates DKIM signature | dkimvalidator.com |
| Google Postmaster Tools | Gmail reputation monitoring (requires setup) | postmaster.google.com |
| Microsoft SNDS | Outlook/Hotmail reputation | sendersupport.olc.protection.outlook.com |

> ✓ Aim for 9–10/10 on mail-tester.com. Anything below 8 indicates a misconfiguration that will hurt deliverability.
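Before publishing record changes, a quick offline shape check catches the most common mistakes (missing `-all`, missing policy tag). These helpers validate the record string only; the external tools above remain the authority on what is actually live in DNS.

```shell
#!/bin/sh
# Offline sanity checks on SPF / DMARC record strings before publishing.
# Shape-only validation — does not query DNS.
spf_ok() {
    # Must start with v=spf1 and end with a hard or soft fail qualifier.
    case "$1" in
        "v=spf1 "*" -all"|"v=spf1 "*" ~all") return 0 ;;
        *) return 1 ;;
    esac
}

dmarc_ok() {
    # Must start with v=DMARC1 and carry a p= policy tag.
    case "$1" in
        "v=DMARC1;"*"p="*) return 0 ;;
        *) return 1 ;;
    esac
}

# spf_ok "v=spf1 ip4:203.0.113.7 include:mxroute.com -all" && echo "SPF shape OK"
```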

---

## Keeping MailCow Updated

MailCow has had several critical security vulnerabilities in 2025 (session fixation, SSTI, password reset poisoning). Staying current is non-negotiable.

```bash
cd /opt/mailcow-dockerized

# Pull latest images
docker compose pull

# Apply update
./update.sh

# Then recreate containers on the refreshed images:
docker compose up -d
```

> ✓ Subscribe to the [MailCow blog](https://mailcow.email/posts/) or watch the [GitHub releases](https://github.com/mailcow/mailcow-dockerized/releases) for security advisories. The update cadence is roughly monthly.

Set up a cron job or Monit check to alert you when MailCow is more than 30 days behind the latest release.
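The staleness check can be sketched as a small script. This is a minimal version assuming GNU `date` (standard on the Linux hosts Mailcow targets) and assuming you record the date of each `update.sh` run yourself — Mailcow does not store that for you, so the bookkeeping file is hypothetical.

```shell
#!/bin/sh
# Alert when the last applied MailCow update is more than 30 days old.
# Expects an ISO date (YYYY-MM-DD) recorded at the last update.sh run.

days_since() {
    # GNU date assumed for -d parsing
    echo $(( ( $(date +%s) - $(date -d "$1" +%s) ) / 86400 ))
}

check_staleness() {
    if [ "$(days_since "$1")" -gt 30 ]; then
        echo "mailcow update is stale"
    else
        echo "mailcow update is current"
    fi
}

# Example cron wiring (path hypothetical):
# 0 9 * * 1  /usr/local/bin/mailcow-stale-check.sh "$(cat /opt/mailcow-dockerized/.last-update)"
```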

---

## Checklist Summary

| Item | Status |
|---|---|
| SPF includes both own IP and mxroute.com | ☐ |
| Two DKIM selectors (mailcow + mxroute) | ☐ |
| DMARC in monitoring mode, advancing to reject | ☐ |
| DMARC reports being processed (Postmark/dmarcian) | ☐ |
| MTA-STS policy published and enforced | ☐ |
| TLS-RPT record in DNS | ☐ |
| MXRoute relay connection uses TLS/encrypt level | ☐ |
| Admin UI restricted to internal network | ☐ |
| 2FA on admin and all privileged accounts | ☐ |
| Postfix TLS 1.2+ enforced via extra.cf | ☐ |
| Nginx security headers added | ☐ |
| Rate limits set on outbound per-domain | ☐ |
| MailCow updated to latest (monthly check) | ☐ |
| Rspamd thresholds reviewed | ☐ |
| PTR/rDNS record matches mail hostname | ☐ |

---
## Related Documentation

- [OPNsense Firewall](./opnsense-firewall) — dedicated ATT_Mail virtual IP, port NAT
- [CrowdSec](./crowdsec) — IP reputation blocking at firewall level
- [Graylog](./graylog) — DMARC report and TLS-RPT ingestion target
- [Caddy Reverse Proxy](./caddy-reverse-proxy) — if MailCow webmail is proxied through Caddy

---

> **File:** `Netgrimoire/Keystone-Grimoire/Mail/Install.md` (490 lines)

---
title: Mailcow Dockerized Install and Config
description:
published: true
date: 2026-02-25T21:05:48.256Z
tags:
editor: markdown
dateCreated: 2026-02-25T21:05:38.864Z
---

# MailCow — Installation & Configuration

**Host:** docker4 (192.168.5.16)
**Hostname:** hermes.netgrimoire.com
**Admin URL:** https://mail.netgrimoire.com
**Version:** 2025-10a (update 2026-01 available as of documentation date)
**Installed:** /opt/mailcow-dockerized
**Timezone:** America/Chicago
**Architecture:** x86_64
**CPU:** 16 cores
**RAM:** 30.63 GB
**Disk:** /dev/nvme0n1p2 — 442G / 502G used (93% — monitor this)

---

## Overview

Mailcow runs as a Docker stack on docker4, attached to the `netgrimoire` overlay network. All containers use `restart: unless-stopped` via a compose override. Outbound mail routes through MXRoute via sender-dependent transports. Inbound mail arrives from MXRoute, which acts as the public-facing inbound gateway (solving residential AT&T IP filtering issues with banks).

See [MXRoute Master Configuration](./mxroute-master) for full inbound/outbound/DNS detail per domain.

---
## Installation Paths

| Path | Purpose |
|------|---------|
| `/opt/mailcow-dockerized/` | Mailcow root |
| `/opt/mailcow-dockerized/mailcow.conf` | Primary configuration file |
| `/opt/mailcow-dockerized/docker-compose.yml` | Base compose (do not edit) |
| `/opt/mailcow-dockerized/docker-compose.override.yml` | Local overrides — network and restart policy |
| `/opt/mailcow-dockerized/data/conf/postfix/extra.cf` | Persistent Postfix overrides |
| `/opt/mailcow-dockerized/data/conf/postfix/main.cf` | Postfix base config (managed by Mailcow) |
| `/opt/mailcow-dockerized/data/conf/rspamd/` | Rspamd configuration |
| `/opt/mailcow-dockerized/data/assets/ssl/` | TLS certificates |

---
## mailcow.conf — Key Settings

```ini
MAILCOW_HOSTNAME=hermes.netgrimoire.com
MAILCOW_PASS_SCHEME=BLF-CRYPT

# Database
DBNAME=mailcow
DBUSER=mailcow
DBPASS=mg7Z8W9UsPlOh0S6vF7TmmPb6n1s
DBROOT=JdymsZFFACHkDcOdziQ53QruCTG2

# Redis
REDISPASS=6AduWQsmBYGMKfOi1CNEGQfTE3RH

# Ports — HTTPS runs on 3443, proxied through Caddy
HTTP_PORT=80
HTTP_BIND=
HTTPS_PORT=3443
HTTPS_BIND=
HTTP_REDIRECT=n

# Mail ports (standard)
SMTP_PORT=25
SMTPS_PORT=465
SUBMISSION_PORT=587
IMAP_PORT=143
IMAPS_PORT=993
POP_PORT=110
POPS_PORT=995
SIEVE_PORT=4190

# Internal ports (localhost only)
DOVEADM_PORT=127.0.0.1:19991
SQL_PORT=127.0.0.1:13306
REDIS_PORT=127.0.0.1:7654

# TLS cert coverage
ADDITIONAL_SAN=smtp.*,imap.*
AUTODISCOVER_SAN=y

# ACME / Let's Encrypt
SKIP_LETS_ENCRYPT=n
SKIP_IP_CHECK=y
SKIP_HTTP_VERIFICATION=y

# Services — all enabled
SKIP_CLAMD=n
SKIP_OLEFY=n
SKIP_SOGO=n
SKIP_FTS=n

# FTS (Flatcurve/Xapian)
FTS_HEAP=128
FTS_PROCS=1

# Watchdog
USE_WATCHDOG=y
WATCHDOG_NOTIFY_START=y
WATCHDOG_NOTIFY_BAN=n
WATCHDOG_EXTERNAL_CHECKS=n

# Networking
IPV4_NETWORK=172.22.1
IPV6_NETWORK=fd4d:6169:6c63:6f77::/64
ENABLE_IPV6=false

# Misc
MAILDIR_GC_TIME=7200
MAILDIR_SUB=Maildir
SOGO_EXPIRE_SESSION=480
SOGO_URL_ENCRYPTION_KEY=ojmPfhnM4MYMsA2f
ACL_ANYONE=disallow
ALLOW_ADMIN_EMAIL_LOGIN=n
DOCKER_COMPOSE_VERSION=native
COMPOSE_PROJECT_NAME=mailcow
LOG_LINES=9999
```

---
## docker-compose.override.yml

All services are attached to the external `netgrimoire` overlay network and set to `restart: unless-stopped`. The override does not change any image versions or environment variables — it only adds network membership and restart policy.

```yaml
services:
  unbound-mailcow:
    networks:
      netgrimoire:
    restart: unless-stopped

  mysql-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  redis-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  clamd-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  rspamd-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  php-fpm-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  sogo-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  dovecot-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  postfix-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  postfix-tlspol-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  memcached-mailcow:
    restart: unless-stopped

  nginx-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  acme-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  watchdog-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  dockerapi-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  olefy-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

  ofelia-mailcow:
    networks:
      - netgrimoire
    restart: unless-stopped

networks:
  netgrimoire:
    external: true
    driver: overlay
```

---
## Container Image Versions

From `docker-compose.yml` (base file — version 2025-10a):

| Service | Image |
|---------|-------|
| unbound-mailcow | ghcr.io/mailcow/unbound:1.24 |
| mysql-mailcow | mariadb:10.11 |
| redis-mailcow | redis:7.4.6-alpine |
| clamd-mailcow | ghcr.io/mailcow/clamd:1.71 |
| rspamd-mailcow | ghcr.io/mailcow/rspamd:2.4 |
| php-fpm-mailcow | ghcr.io/mailcow/phpfpm:1.94 |
| sogo-mailcow | ghcr.io/mailcow/sogo:1.136 |
| dovecot-mailcow | ghcr.io/mailcow/dovecot:2.35 |
| postfix-mailcow | ghcr.io/mailcow/postfix:1.81 |
| postfix-tlspol-mailcow | ghcr.io/mailcow/postfix-tlspol:1.0 |
| memcached-mailcow | memcached:alpine |
| nginx-mailcow | ghcr.io/mailcow/nginx:1.05 |
| acme-mailcow | ghcr.io/mailcow/acme:1.94 |
| netfilter-mailcow | ghcr.io/mailcow/netfilter:1.63 |
| watchdog-mailcow | ghcr.io/mailcow/watchdog:2.09 |
| dockerapi-mailcow | ghcr.io/mailcow/dockerapi:2.11 |
| olefy-mailcow | ghcr.io/mailcow/olefy:1.15 |
| ofelia-mailcow | mcuadros/ofelia:latest |

---
## Postfix Configuration

### extra.cf

```
myhostname = hermes.netgrimoire.com
```

> The MXRoute trusted network entries should also be here. Current extra.cf only contains myhostname — confirm mynetworks is set correctly, or add the MXRoute IP ranges if not already present via the UI.

### Key Postfix Settings (from running config)

```
mynetworks = 127.0.0.0/8 172.22.1.0/24 10.0.1.0/24 [::1]/128 [fd4d:6169:6c63:6f77::]/64 [fe80::]/64
# 100 MB message size limit
message_size_limit = 104857600
# Unlimited mailbox size
mailbox_size_limit = 0
bounce_queue_lifetime = 1d
maximal_queue_lifetime = 5d
delay_warning_time = 4h
postscreen_dnsbl_threshold = 6
postscreen_dnsbl_action = enforce
postscreen_greet_action = enforce
smtpd_relay_restrictions = permit_mynetworks, permit_sasl_authenticated, defer_unauth_destination
disable_vrfy_command = yes
broken_sasl_auth_clients = yes
```

---
## Domains

10 domains configured. All active.

| Domain | Mailboxes | Sender-Dependent Transport | Created |
|--------|-----------|---------------------------|---------|
| bamalady.com | 0 / 10 | *(not confirmed)* | — |
| bill740.com | 1 / 10 | *(not confirmed)* | — |
| florosafd.org | 4 / 10 | ID 4: heracles.mxrouting.net:587 (relay@florosafd.org) | 2025-11-21 |
| gnarlypandaproductions.com | 2 / 10 | ID 5: heracles.mxrouting.net:587 (relay@gnarlypandaproductions.com) | 2025-11-21 |
| netgrimoire.com | 2 / 10 | ID 2: heracles.mxrouting.net:587 (relay@netgrimoire.com) | 2025-11-21 |
| nucking-futz.net | 0 / 10 | *(not confirmed)* | — |
| pncfishandmore.com | 4 / 10 | ID 6: heracles.mxrouting.net:587 (relay@pncfishandmore.com) | — |
| pncharris.com | 4 / 10 | ID 3: heracles.mxrouting.net:587 (passer@pncharris.com) | 2025-11-21 |
| pncharrisenterprises.com | 2 / 10 | *(not confirmed from screenshots)* | — |
| wasted-bandwidth.net | 1 / 10 | ID 1: heracles.mxrouting.net:587 (relay@wasted-bandwidth.net) | — |

> MXRoute relay hostname is `heracles.mxrouting.net:587` — note this differs from the generic `smtp.mxroute.com` placeholder used in setup docs. Always use `heracles.mxrouting.net:587` for this account.

---
## Mailboxes

19 active mailboxes across all domains:

| Mailbox | Messages | Domain |
|---------|----------|--------|
| bill@bill740.com | 1 | bill740.com |
| chieflee@florosafd.org | 2124 | florosafd.org |
| cindy@pncfishandmore.com | 1109 | pncfishandmore.com |
| cindy@pncharris.com | 33797 | pncharris.com |
| cindy@pncharrisenterprises.com | 819 | pncharrisenterprises.com |
| dads_attic@pncharris.com | 0 | pncharris.com |
| jim.harris@florosafd.org | 8 | florosafd.org |
| kyle@gnarlypandaproductions.com | 486 | gnarlypandaproductions.com |
| kyle@pncfishandmore.com | 110 | pncfishandmore.com |
| kyle@pncharris.com | 31182 | pncharris.com |
| phil@florosafd.org | 5 | florosafd.org |
| phil@gnarlypandaproductions.com | 5 | gnarlypandaproductions.com |
| phil@netgrimoire.com | 1 | netgrimoire.com |
| phil@pncfishandmore.com | 10 | pncfishandmore.com |
| phil@pncharris.com | 3210 | pncharris.com |
| phil@pncharrisenterprises.com | 1 | pncharrisenterprises.com |
| times@florosafd.org | 191 | florosafd.org |
| traveler@netgrimoire.com | 3 | netgrimoire.com |
| traveler@wasted-bandwidth.net | 138 | wasted-bandwidth.net |

---
## Aliases

| ID | Alias | Target Domain | Internal |
|----|-------|---------------|---------|
| 7 | cindy@bamalady.com | bamalady.com | No |

---
## Sender-Dependent Transports

All outbound relay routes through `heracles.mxrouting.net:587`. This is your MXRoute server hostname — use this exact value when adding new transports.

| ID | Host | Username | Password |
|----|------|----------|----------|
| 1 | heracles.mxrouting.net:587 | relay@wasted-bandwidth.net | dZ4yLYznVvgSJtqWZJFA |
| 2 | heracles.mxrouting.net:587 | relay@netgrimoire.com | TVGCnJp9SxRbWU8EhkMw |
| 3 | heracles.mxrouting.net:587 | passer@pncharris.com | bBJtPhrGkHvvhxhukkae |
| 4 | heracles.mxrouting.net:587 | relay@florosafd.org | 2Fe8XMyaeh6Z5dvdHYdq |
| 5 | heracles.mxrouting.net:587 | relay@gnarlypandaproductions.com | vG5ZsUQhRWD2UyzLPsqA |
| 6 | heracles.mxrouting.net:587 | relay@pncfishandmore.com | *(confirm from MXRoute panel)* |

---
## DKIM Keys

Two DKIM selectors are configured per domain — one for Mailcow (selector: `dkim`, published at `dkim._domainkey`) and one added separately for MXRoute outbound signing.

### pncharris.com

```
v=DKIM1;k=rsa;t=s;s=email;p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAqhgQV7r+KKQwJceWenZ3FNq8AsllgW6cIm/0jpsLT62vF1yy0nh2MdhjYgQAX2MK9HHYzNZcCB3+OPpqBbXeNbSDckxB/dC+z/vboMHrJmYonfaSYshZjSR80V/a2Yoq+hiXQ9eBcuOggENtMm4XvEsl/vOWLBMfasqe+X11gzQBeRv1tTaXJB0C4i7tAcfi0O/AxH8QFTr2099+k2iepn8J15ukk1zu4zemBJj4Z3uFTNnBP8YpgKbYoUDyMVIKIxGjANVBBypcrMKavpQ4F1JLhgGFhWAsAuFRwZsnOaftZyMuzAZxM37DTd/bF2WanmK3Xe75SN5uOnEXjuzW/wIDAQAB
```

### netgrimoire.com

```
v=DKIM1;k=rsa;t=s;s=email;p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAoJ9YKqV9+6gOcVKI+UJ0TRcMmergxU8HLO+mwTMfqOhblsEcDPO60c8ya24iIXg51AA2k5Xcbb0bLScaaIi0P/TRzP/bonAZkPS1Y8Fx1se9dikTsA9Lazhou6DvoFkkV/IPH1ZNg68Cd9teAD5tvoY18OSneJJsocXwFo57c+XccUaTxjpV7eReuT4da7iNHMmUmZNfKenxVMKD740zrDJAeAsXtEb/71CochHYSm+qAvuG9/WPixJbMsJLF/iVhV3Byp0LCrB+CwGTwnsiUcd7QpuD6rRs/7zzdGBtoN22m/j390GimFstYvB61I20h8sHWGAG66dLko6Sgvs47wIDAQAB
```

### gnarlypandaproductions.com

```
v=DKIM1;k=rsa;t=s;s=email;p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
```

*(scroll cut off in screenshot — retrieve full key from Mailcow UI → Edit domain → bottom of page)*

> All other domain DKIM keys should be retrieved from the Mailcow domain edit page and recorded here for disaster recovery completeness.

---
## Network Configuration

Mailcow containers join the `netgrimoire` external overlay network, allowing communication with other Docker Swarm services (Caddy reverse proxy, etc.) without exposing ports directly to the host network.

**Internal Docker network:** `172.22.1.0/24`

Key container IPs within the mailcow-network:

- unbound: 172.22.1.254
- redis: 172.22.1.249
- sogo: 172.22.1.248
- dovecot: 172.22.1.250
- postfix: 172.22.1.253

**IPv6:** disabled (`ENABLE_IPV6=false`)

---
## Caddy Reverse Proxy

Mailcow's nginx listens on HTTPS port 3443 internally. Caddy proxies external requests to it. Mailcow handles its own TLS for direct mail client connections (IMAP 993, SMTP 465/587).

The admin UI at `mail.netgrimoire.com` is proxied through Caddy on the `netgrimoire` overlay network.

---
## Updating Mailcow

```bash
cd /opt/mailcow-dockerized

# Pull latest
git fetch origin
git checkout origin/master

# Update containers
docker compose pull
./update.sh
```

Note that `update.sh` performs its own repository and image pulls, so the manual steps above are belt-and-braces rather than strictly required.

> As of documentation date, version 2026-01 is available. Current running version is 2025-10a. Update when convenient — check the [MailCow changelog](https://github.com/mailcow/mailcow-dockerized/releases) for breaking changes first.

A monthly update check is recommended. MailCow had multiple security vulnerabilities in 2025 — staying current is important.

---
## Common Operations

### Restart all containers

```bash
cd /opt/mailcow-dockerized
docker compose restart
```

### Restart single container (e.g. after extra.cf change)

```bash
docker compose restart postfix-mailcow
```

### View logs

```bash
# Postfix
docker compose logs postfix-mailcow -f

# Dovecot
docker compose logs dovecot-mailcow -f

# All containers
docker compose logs -f
```

### Check queue

```bash
docker exec mailcow-postfix-mailcow-1 postqueue -p
```

### Flush queue

```bash
docker exec mailcow-postfix-mailcow-1 postqueue -f
```

### Check container health

```bash
docker compose ps
```
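For monitoring, the `postqueue -p` listing can be reduced to a single number. A small sketch: Postfix ends a non-empty listing with a summary line of the form `-- 5 Kbytes in 3 Requests.`, and prints `Mail queue is empty` otherwise, so a one-field awk extraction suffices.

```shell
#!/bin/sh
# Summarize `postqueue -p` output from stdin: print the number of queued
# requests (0 when the queue is empty).
queue_count() {
    awk '/^-- / { print $5; found = 1 }
         END    { if (!found) print 0 }'
}

# Usage on the host:
# docker exec mailcow-postfix-mailcow-1 postqueue -p | queue_count
```

A persistently non-zero count usually means the MXRoute relay is rejecting or deferring — check the postfix logs next.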

---

## Known Gotchas

**Disk usage is at 93%.** The nvme0n1p2 volume has 442G used of 502G. This needs attention — vmail storage grows over time, and garbage collection runs hourly but only removes items older than 7200 minutes (5 days). Monitor this and consider quota enforcement per mailbox if growth continues.

**extra.cf is minimal.** The MXRoute trusted network IPs should be confirmed in the running Postfix config. The `mynetworks` value from `postconf` shows `10.0.1.0/24` is already trusted — confirm whether MXRoute IP ranges `69.167.160.0/19` and `198.54.120.0/22` are included. If not, add them to extra.cf and restart postfix.

**MXRoute relay hostname.** The actual relay hostname for this account is `heracles.mxrouting.net:587` — not the generic `smtp.mxroute.com` placeholder. All 6 transports use `heracles.mxrouting.net:587`. Use this exact hostname for any new transport entries.

**pncharris.com uses passer@ not relay@.** Transport ID 3 for pncharris.com authenticates as `passer@pncharris.com`, not `relay@pncharris.com`. This is intentional — the relay@ account exists, but passer@ is the current active relay credential.

**HTTPS on port 3443.** Mailcow's web UI is not on the standard 443 — it binds to 3443 and Caddy handles the public-facing 443 proxy. Direct access to the UI requires going through Caddy or using the internal port.

**nucking-futz.net vs nucking-futz.com.** The domains list shows `nucking-futz.net` but the intended new domain is `nucking-futz.com`. Verify which is actually configured and correct if needed.

**bamalady.com and bill740.com** have no transport assigned in the screenshots. Confirm whether these domains need MXRoute relay configured.

---
## Related Documentation

- [MXRoute Master Configuration](./mxroute-master) — per-domain DNS, inbound forwarding, outbound relay credentials
- [Mail Setup — nucking-futz.com](./mail-setup-nucking-futz) — new domain setup guide
- [MailCow Security Hardening](./mailcow-security-hardening)
- [Caddy Reverse Proxy](./caddy-reverse-proxy) — proxies mail.netgrimoire.com to port 3443
- [OPNsense Firewall](./opnsense-firewall) — ATT_Mail static IP, port forwarding rules

---

> **File:** `Netgrimoire/Keystone-Grimoire/Mail/MXRoute-Integration.md` (430 lines)

---
title: Integrating MXRoute with MailCow
description:
published: true
date: 2026-02-25T21:04:37.135Z
tags:
editor: markdown
dateCreated: 2026-02-25T19:22:31.514Z
---

# MXRoute — Master Configuration Reference

## Overview

MXRoute serves two roles in Netgrimoire mail infrastructure:

- **Inbound gateway** — MX records for all domains point to MXRoute's commercial IPs, solving residential AT&T IP filtering by banks and financial institutions. MXRoute receives mail and forwards to Mailcow via per-address forwarders.
- **Outbound relay** — Mailcow sends all outbound mail through MXRoute via sender-dependent transports for improved deliverability.

**Mail flow:**

```
Inbound:   Internet → MXRoute (commercial IP) → Mailcow (192.168.5.16)
Outbound:  Mailcow (192.168.5.16) → MXRoute SMTP relay → Internet
```

**Mailcow host:** 192.168.5.16
**MXRoute control panel:** confirm server hostname from MXRoute welcome email (e.g. `arrow.mxrouting.net`)
**MXRoute SMTP relay:** confirm from welcome email (e.g. `smtp.mxroute.com:587`)

---
## Architecture — Why Two Domains Per Hosted Domain

MXRoute forwarders require a valid destination email address. Forwarding `user@domain.com` back to `user@domain.com` creates a mail loop because MXRoute would look up the MX for `domain.com` and find itself. The solution is a `mail.domain.com` subdomain with its own MX record pointing directly to Mailcow. MXRoute forwards to `user@mail.domain.com`, Mailcow accepts and delivers, and an alias domain maps `@domain.com` back so users only ever see `@domain.com`.

```
domain.com       MX → MXRoute (public-facing, receives from internet)
mail.domain.com  MX → 192.168.5.16 (internal, MXRoute forwards here)
```

---
## MXRoute Control Panel

**Login:** confirm URL from MXRoute welcome email
**Interface:** MXRoute 4.0 (new UI — not old DirectAdmin)

### Creating a Forwarder

1. Go to **Forwarders**
2. Click **Create New Forwarder**
3. Set **Forwarder Name:** `username` (domain shown automatically)
4. Set **Destination Type:** `Forward to Email(s)`
5. Set **Recipients:** `username@mail.domain.com`
6. Click **Create Forwarder**

> The Recipients field accepts multiple addresses, comma- or newline-separated.

---
## Mailcow Configuration

### Adding a New Domain (One-Time Per Domain)

1. **Mail Setup → Domains → Add domain**
   - Domain: `mail.domain.com` (the subdomain Mailcow owns)
   - Leave relay settings as default

2. **Mail Setup → Alias Domains → Add alias domain**
   - Alias Domain: `domain.com`
   - Target Domain: `mail.domain.com`
   - This makes Mailcow accept and deliver mail for `@domain.com` to `@mail.domain.com` mailboxes

3. **Configuration → ARC/DKIM Keys**
   - Select domain `mail.domain.com`
   - Selector: `mailcow`
   - Key length: 2048
   - Generate and copy the TXT record for DNS

4. **Configuration → Extra Postfix configuration → extra.cf**

```
# Trust MXRoute forwarding IPs — prevents SPF scoring on forwarded mail
mynetworks = 127.0.0.0/8 [::1]/128 192.168.5.0/24 69.167.160.0/19 198.54.120.0/22
```

Restart affected containers after saving.
### Adding a New Mailbox

1. **Mail Setup → Mailboxes → Add mailbox**
   - Username: `user`
   - Domain: `mail.domain.com`

2. **MXRoute control panel → Forwarders → Create New Forwarder**
   - Forwarder: `user@domain.com`
   - Destination: `user@mail.domain.com`
### Outbound Relay — Sender-Dependent Transports

One transport entry per domain. **Configuration → Routing → Sender-Dependent Transports**

| Domain | Relay Host | Username | Password |
|--------|-----------|----------|----------|
| pncharris.com | `[smtp.mxroute.com]:587` | relay@pncharris.com | H@rv3yD)G123 |
| wasted-bandwidth.net | `[smtp.mxroute.com]:587` | relay@wasted-bandwidth.net | dZ4yLYznVvgSJtqWZJFA |
| netgrimoire.com | `[smtp.mxroute.com]:587` | relay@netgrimoire.com | TVGCnJp9SxRbWU8EhkMw |
| florosafd.org | `[smtp.mxroute.com]:587` | relay@florosafd.org | 2Fe8XMyaeh6Z5dvdHYdq |
| gnarlypandaproductions.com | `[smtp.mxroute.com]:587` | relay@gnarlypandaproductions.com | vG5ZsUQhRWD2UyzLPsqA |

> Confirm SMTP relay hostname from MXRoute welcome email — substitute actual hostname for `smtp.mxroute.com` if different.
### Email Client Settings (All Domains)

| Setting | Value |
|---------|-------|
| IMAP server | `mail.domain.com` |
| IMAP port | `993` (SSL/TLS) |
| SMTP server | `mail.domain.com` |
| SMTP port | `465` (SSL/TLS) |
| Username | `user@domain.com` |

> Users log in with `@domain.com`. Mailcow resolves to the internal `@mail.domain.com` mailbox via alias domain — transparent to the user.

---
## DNS Reference — All Domains

### DNS Pattern (Apply to Every Domain)

Two sets of MX records are required — one for the public domain (pointing to MXRoute) and one for the mail subdomain (pointing directly to Mailcow).

| Type | Host | Value | Notes |
|------|------|-------|-------|
| A | `mail` | `YOUR_ATT_MAIL_IP` | Mailcow server — MXRoute forwards here |
| MX | `@` | MXRoute primary (priority 10) | From MXRoute welcome email |
| MX | `@` | MXRoute secondary (priority 20) | From MXRoute welcome email |
| MX | `mail` | `mail.domain.com` (priority 10) | Mailcow handles the subdomain directly |
| CNAME | `imap` | `mail.domain.com` | Client autoconfiguration |
| CNAME | `smtp` | `mail.domain.com` | Client autoconfiguration |
| CNAME | `webmail` | `mail.domain.com` | Roundcube access |
| CNAME | `autodiscover` | `mail.domain.com` | Outlook autodiscover |
| CNAME | `autoconfig` | `mail.domain.com` | Thunderbird autoconfig |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` | SPF — both Mailcow direct and MXRoute relay |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` | SPF for the subdomain — Mailcow direct only |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com` | DMARC enforcement |
| TXT | `mailcow._domainkey.mail` | *(generated in Mailcow ARC/DKIM Keys)* | Mailcow DKIM selector |
| TXT | `x._domainkey` | *(from MXRoute control panel)* | MXRoute DKIM selector — confirm the actual selector name |

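Because the pattern is identical for every domain, the record set can be emitted mechanically. The sketch below prints it in zone-file form for one domain; `MXROUTE_PRIMARY`/`MXROUTE_SECONDARY` and the `MAIL_IP` default are placeholders that must be replaced with the values from the MXRoute welcome email.

```shell
# Print the standard per-domain DNS record set (zone-file style).
domain="${1:-example.com}"
mail_ip="${MAIL_IP:-YOUR_ATT_MAIL_IP}"

records=$(cat <<EOF
mail.$domain.          A      $mail_ip
$domain.               MX 10  MXROUTE_PRIMARY.
$domain.               MX 20  MXROUTE_SECONDARY.
mail.$domain.          MX 10  mail.$domain.
imap.$domain.          CNAME  mail.$domain.
smtp.$domain.          CNAME  mail.$domain.
webmail.$domain.       CNAME  mail.$domain.
autodiscover.$domain.  CNAME  mail.$domain.
autoconfig.$domain.    CNAME  mail.$domain.
$domain.               TXT    "v=spf1 ip4:$mail_ip include:mxroute.com -all"
mail.$domain.          TXT    "v=spf1 ip4:$mail_ip -all"
_dmarc.$domain.        TXT    "v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com"
EOF
)
printf '%s\n' "$records"
```

Run it once per domain and paste the output into the registrar's advanced DNS editor, adding the two DKIM TXT records by hand once the keys exist.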
---

### pncharris.com

| Type | Host | Value |
|------|------|-------|
| A | `mail` | YOUR_ATT_MAIL_IP |
| MX | `@` | MXRoute primary (priority 10) |
| MX | `@` | MXRoute secondary (priority 20) |
| MX | `mail` | `mail.pncharris.com` (priority 10) |
| CNAME | `imap` | `mail.pncharris.com` |
| CNAME | `smtp` | `mail.pncharris.com` |
| CNAME | `webmail` | `mail.pncharris.com` |
| CNAME | `autodiscover` | `mail.pncharris.com` |
| CNAME | `autoconfig` | `mail.pncharris.com` |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com` |
| TXT | `mailcow._domainkey.mail` | *(from Mailcow ARC/DKIM Keys for mail.pncharris.com)* |
| TXT | `x._domainkey` | *(from MXRoute control panel)* |

**Mailcow domains:** `mail.pncharris.com` (primary), `pncharris.com` (alias domain → mail.pncharris.com)

**Relay credentials:**

| Account | Password | Notes |
|---------|----------|-------|
| relay@pncharris.com | H@rv3yD)G123 | Current relay account |
| forwarder@pncharris.com | *(see password history below)* | Legacy account |
| passer@pncharris.com | bBJtPhrGkHvvhxhukkae | Current |
| kylr pncharris | -,68,incTeR | |
| G4@rlyf1ng3r | *(Feb 14)* | |

**passer@pncharris.com password history** (most recent last):

- !5!,_\*zDyLEhhR4
- sh7dXWnTPqbkDGsTcwtn
- MY3V8p69b2HYksygxhXX
- RS6U2GU6rcYe3THKKgYx
- yzqNysrd73yzWptVEZ5H (current)

---

### wasted-bandwidth.net

| Type | Host | Value |
|------|------|-------|
| A | `mail` | YOUR_ATT_MAIL_IP |
| MX | `@` | MXRoute primary (priority 10) |
| MX | `@` | MXRoute secondary (priority 20) |
| MX | `mail` | `mail.wasted-bandwidth.net` (priority 10) |
| CNAME | `imap` | `mail.wasted-bandwidth.net` |
| CNAME | `smtp` | `mail.wasted-bandwidth.net` |
| CNAME | `webmail` | `mail.wasted-bandwidth.net` |
| CNAME | `autodiscover` | `mail.wasted-bandwidth.net` |
| CNAME | `autoconfig` | `mail.wasted-bandwidth.net` |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com` |
| TXT | `mailcow._domainkey.mail` | *(from Mailcow ARC/DKIM Keys for mail.wasted-bandwidth.net)* |
| TXT | `x._domainkey` | *(from MXRoute control panel)* |

**Mailcow domains:** `mail.wasted-bandwidth.net` (primary), `wasted-bandwidth.net` (alias domain)

**Relay credentials:**

| Account | Password |
|---------|----------|
| relay@wasted-bandwidth.net | dZ4yLYznVvgSJtqWZJFA |

---

### netgrimoire.com

| Type | Host | Value |
|------|------|-------|
| A | `mail` | YOUR_ATT_MAIL_IP |
| MX | `@` | MXRoute primary (priority 10) |
| MX | `@` | MXRoute secondary (priority 20) |
| MX | `mail` | `mail.netgrimoire.com` (priority 10) |
| CNAME | `imap` | `mail.netgrimoire.com` |
| CNAME | `smtp` | `mail.netgrimoire.com` |
| CNAME | `webmail` | `mail.netgrimoire.com` |
| CNAME | `autodiscover` | `mail.netgrimoire.com` |
| CNAME | `autoconfig` | `mail.netgrimoire.com` |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com` |
| TXT | `mailcow._domainkey.mail` | *(from Mailcow ARC/DKIM Keys for mail.netgrimoire.com)* |
| TXT | `x._domainkey` | *(from MXRoute control panel)* |

**Mailcow domains:** `mail.netgrimoire.com` (primary), `netgrimoire.com` (alias domain)

**Relay credentials:**

| Account | Password |
|---------|----------|
| relay@netgrimoire.com | TVGCnJp9SxRbWU8EhkMw |

---

### florosafd.org

| Type | Host | Value |
|------|------|-------|
| A | `mail` | YOUR_ATT_MAIL_IP |
| MX | `@` | MXRoute primary (priority 10) |
| MX | `@` | MXRoute secondary (priority 20) |
| MX | `mail` | `mail.florosafd.org` (priority 10) |
| CNAME | `imap` | `mail.florosafd.org` |
| CNAME | `smtp` | `mail.florosafd.org` |
| CNAME | `webmail` | `mail.florosafd.org` |
| CNAME | `autodiscover` | `mail.florosafd.org` |
| CNAME | `autoconfig` | `mail.florosafd.org` |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com` |
| TXT | `mailcow._domainkey.mail` | *(from Mailcow ARC/DKIM Keys for mail.florosafd.org)* |
| TXT | `x._domainkey` | *(from MXRoute control panel)* |

**Mailcow domains:** `mail.florosafd.org` (primary), `florosafd.org` (alias domain)

**Relay credentials:**

| Account | Password |
|---------|----------|
| relay@florosafd.org | 2Fe8XMyaeh6Z5dvdHYdq |

---

### gnarlypandaproductions.com

| Type | Host | Value |
|------|------|-------|
| A | `mail` | YOUR_ATT_MAIL_IP |
| MX | `@` | MXRoute primary (priority 10) |
| MX | `@` | MXRoute secondary (priority 20) |
| MX | `mail` | `mail.gnarlypandaproductions.com` (priority 10) |
| CNAME | `imap` | `mail.gnarlypandaproductions.com` |
| CNAME | `smtp` | `mail.gnarlypandaproductions.com` |
| CNAME | `webmail` | `mail.gnarlypandaproductions.com` |
| CNAME | `roundcube` | `roundcube.netgrimoire.com` |
| CNAME | `autodiscover` | `mail.gnarlypandaproductions.com` |
| CNAME | `autoconfig` | `mail.gnarlypandaproductions.com` |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@gnarlypandaproductions.com` |
| TXT | `mailcow._domainkey.mail` | *(from Mailcow ARC/DKIM Keys for mail.gnarlypandaproductions.com)* |
| TXT | `default._domainkey` | `v=DKIM1; t=s; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA3D3vyPoBHB4eMSMq8HygVWHzYbketRX4yjk9wV4bdaar0/c89dK230FMOW6zVXEsY1sXKFk1kBxerHVw0wY8qnQyooHgINEQcEXrtB/x93Sl/cqBQXk+PHOIOymQwgni8WCUhCSnvunxXK8qX5f9J56qzd0/wpY2WSEHho+XrnQjc+c7HMvkcC3+nKJe59ZNgvQW/Y9B/L6zFDjAp+QOUYp9wwX4L+j1T4fQSygYxAJZ0aIoR8FsbOuXc38pht99HyUnYwH08HoK7xv3DL2BrVo3KVZ7xMe2S4YMxd1HkJz2evbV/ziNsJcKW/le3fFS7mza09yJXDLDcLOKLXbYUQIDAQAB` |
| TXT | `x._domainkey` | *(from MXRoute control panel — confirm actual selector)* |

**Mailcow domains:** `mail.gnarlypandaproductions.com` (primary), `gnarlypandaproductions.com` (alias domain)

**Relay credentials:**

| Account | Password |
|---------|----------|
| relay@gnarlypandaproductions.com | vG5ZsUQhRWD2UyzLPsqA |

---

### nucking-futz.com

New domain — see [Mail Setup — nucking-futz.com](./mail-setup-nucking-futz) for the full setup guide.

| Type | Host | Value |
|------|------|-------|
| A | `mail` | YOUR_ATT_MAIL_IP |
| MX | `@` | MXRoute primary (priority 10) |
| MX | `@` | MXRoute secondary (priority 20) |
| MX | `mail` | `mail.nucking-futz.com` (priority 10) |
| CNAME | `imap` | `mail.nucking-futz.com` |
| CNAME | `smtp` | `mail.nucking-futz.com` |
| CNAME | `webmail` | `mail.nucking-futz.com` |
| CNAME | `autodiscover` | `mail.nucking-futz.com` |
| CNAME | `autoconfig` | `mail.nucking-futz.com` |
| TXT | `@` | `v=spf1 ip4:YOUR_ATT_MAIL_IP include:mxroute.com -all` |
| TXT | `mail` | `v=spf1 ip4:YOUR_ATT_MAIL_IP -all` |
| TXT | `_dmarc` | `v=DMARC1; p=reject; rua=mailto:admin@netgrimoire.com` |
| TXT | `mailcow._domainkey.mail` | *(from Mailcow ARC/DKIM Keys for mail.nucking-futz.com)* |
| TXT | `x._domainkey` | *(from MXRoute control panel)* |

**Mailcow domains:** `mail.nucking-futz.com` (primary), `nucking-futz.com` (alias domain)

**Relay credentials:**

| Account | Password |
|---------|----------|
| relay@nucking-futz.com | *(set during MXRoute domain creation)* |

---

## Adding a New Domain — Checklist

Use this checklist every time a new domain is added to the stack.

**DNS (at registrar):**

- [ ] A record: `mail.newdomain.com` → YOUR_ATT_MAIL_IP
- [ ] MX records: `@` → MXRoute servers
- [ ] MX record: `mail` → `mail.newdomain.com`
- [ ] CNAME records: imap, smtp, webmail, autodiscover, autoconfig
- [ ] SPF TXT: `@` — includes both the ATT IP and `include:mxroute.com`
- [ ] SPF TXT: `mail` — ATT IP only
- [ ] DMARC TXT: `_dmarc`
- [ ] DKIM TXT: `mailcow._domainkey.mail` — after generating in Mailcow
- [ ] DKIM TXT: `x._domainkey` — after retrieving from MXRoute

**Mailcow:**

- [ ] Add domain: `mail.newdomain.com`
- [ ] Add alias domain: `newdomain.com` → `mail.newdomain.com`
- [ ] Generate DKIM key (selector: `mailcow`) for `mail.newdomain.com`
- [ ] Add sender-dependent transport for `newdomain.com`
- [ ] Add sender-dependent transport for `mail.newdomain.com`
- [ ] Create mailboxes as `user@mail.newdomain.com`

**MXRoute:**

- [ ] Add domain in the control panel
- [ ] Create a forwarder for each mailbox: `user@newdomain.com` → `user@mail.newdomain.com`
- [ ] Retrieve the DKIM key for DNS

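Once the registrar records have propagated, every DNS item in the checklist can be verified with `dig`. The sketch below only prints the commands (it runs no queries itself), so the list can be pasted into a terminal; `newdomain.com` is a placeholder.

```shell
# Print one dig command per DNS checklist item.
d="${1:-newdomain.com}"
checks=""
for q in "A mail.$d" "MX $d" "MX mail.$d" "TXT $d" "TXT mail.$d" \
         "TXT _dmarc.$d" "TXT mailcow._domainkey.mail.$d" "TXT x._domainkey.$d"; do
  checks="$checks
dig $q +short"
done
printf '%s\n' "$checks"
```

Every command should return a non-empty answer before the domain is considered live; an empty result pinpoints exactly which checklist item was missed.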
---

## Troubleshooting

### Mail not delivering inbound (not reaching Mailcow)

- Confirm the MX records for `@` point to MXRoute servers: `dig MX domain.com +short`
- Confirm the MX record for the `mail` subdomain points to Mailcow: `dig MX mail.domain.com +short`
- Verify an MXRoute forwarder exists for the address in the control panel
- Check Mailcow logs: **Logs → Postfix** — look for the delivery attempt and any rejection reason
- Verify the MXRoute IP ranges are in Mailcow's `extra.cf` trusted networks

### Mail not delivering inbound (banks / financial institutions)

- This is the residential AT&T IP problem — confirm the MX records point to MXRoute, not directly to your IP
- Run `dig MX domain.com +short` — it should show MXRoute servers, not your IP
- If MX still points to your ATT IP, update DNS and wait for propagation

### Outbound mail rejected or going to spam

- Verify a sender-dependent transport is configured for the domain in Mailcow
- Check that the relay credentials in the transport entry are current
- Run an SPF check: `dig TXT domain.com +short` — confirm `include:mxroute.com` is present
- Send a test to check-auth@verifier.port25.com for a full SPF/DKIM/DMARC report
- Run a message through https://mail-tester.com for a deliverability score

### DKIM verification failing

- Confirm both selectors are published in DNS:
  - `dig TXT mailcow._domainkey.mail.domain.com +short`
  - `dig TXT x._domainkey.domain.com +short` (substitute the actual MXRoute selector)
- Allow up to 48 hours for DNS propagation after adding records
- Verify the selector names match exactly what Mailcow and MXRoute are using to sign

### DMARC failures

- SPF and DKIM must both pass and align with the From: domain
- Check the DMARC reports sent to `admin@netgrimoire.com` — use [Postmark DMARC](https://dmarc.postmarkapp.com/) or [dmarcian.com](https://dmarcian.com) to parse the raw XML reports
- Common cause: outbound mail going through MXRoute but `include:mxroute.com` missing from SPF

### Forwarded mail getting spam-scored

- Confirm the MXRoute IP ranges are in Mailcow's `extra.cf` mynetworks
- Check that the Mailcow trusted networks were saved and the containers restarted
- Verify SRS is working: in Roundcube open a forwarded message → More → View Source → `Return-Path` should begin with `SRS0=`

### New mailbox not receiving mail

- Two steps are required — confirm both were done:
  1. Mailbox created in Mailcow as `user@mail.domain.com`
  2. Forwarder created in MXRoute as `user@domain.com` → `user@mail.domain.com`
- If the MXRoute forwarder is missing, inbound mail silently goes nowhere

---

## Related Documentation

- [MailCow Configuration](./mailcow)
- [MailCow Security Hardening](./mailcow-security-hardening)
- [Mail Setup — nucking-futz.com](./mail-setup-nucking-futz)
- [OPNsense Firewall](./opnsense-firewall) — ATT_Mail static IP allocation

Netgrimoire/Keystone-Grimoire/Mail/MailCow-Overview.md

---
title: MailCow Overview
description: Self-hosted mail stack — architecture, domains, and key decisions
published: true
date: 2026-04-12T00:00:00.000Z
tags: keystone, mail, mailcow
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# MailCow Overview

MailCow runs on `docker4` (hermes, 192.168.5.16) via Docker Compose — not Swarm. It manages mail for all 8 domains.

---

## Architecture

| Component | Role |
|-----------|------|
| MailCow stack | Postfix, Dovecot, Rspamd, ClamAV, SOGo, Roundcube, nginx-mailcow |
| MXRoute | Inbound filtering + outbound relay for all domains |
| nginx-mailcow | Only MailCow container connected to the `netgrimoire` overlay |

**Critical:** Only `nginx-mailcow` is attached to the `netgrimoire` overlay network. All other MailCow containers stay on the internal `mailcow-network` bridge. Connecting other containers to the overlay causes Redis and PHP-FPM to resolve to wrong IPs, breaking the entire stack.

---

## Domains

`netgrimoire.com` · `pncharris.com` · `wasted-bandwidth.net` · `nucking-futz.com` · `florosafd.org` · `gnarlypandaproductions.com` · `pncfishandmore.com` · `pncharrisenterprises.com`

---

## Mail Flow

**Inbound:** MXRoute filters → forwards to MailCow → Dovecot delivers

**Outbound:** Postfix → MXRoute relay → recipient

**SRS rewriting:** MXRoute rewrites the envelope sender on forwarded mail. All domains using MXRoute inbound forwarding **must** have catch-all aliases configured in MailCow, or `reject_unlisted_sender` will reject the rewritten addresses.

---

## DKIM

Two selectors are required:

| Selector | Purpose |
|----------|---------|
| `mailcow` | Direct sends from MailCow |
| `mxroute` | MXRoute relay path |

---

## Key Limits (must match across all three)

Attachment size limits must be set identically in Postfix, Rspamd, and ClamAV. Changing only Postfix is insufficient — Rspamd and ClamAV reject large messages before Postfix processes them.

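As a sketch of where the three limits live, using a ~100 MiB example (the exact file paths vary by Mailcow version — treat these as illustrative and confirm against the current Mailcow docs):

```
# Postfix — data/conf/postfix/extra.cf
message_size_limit = 104857600

# Rspamd — options override
max_message = 100MiB;

# ClamAV — data/conf/clamav/clamd.conf
MaxFileSize 100M
StreamMaxLength 100M
```

After changing any of the three, restart the affected containers so all layers enforce the same ceiling.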
---

## Roundcube SSL

Internal connections to Dovecot use self-signed certs. In `config.inc.php`:

```php
$config['imap_conn_options'] = ['ssl' => ['verify_peer' => false, 'verify_peer_name' => false]];
```

---

## Related Docs

- [MXRoute Integration](/Keystone-Grimoire/Mail/MXRoute-Integration)
- [Domain Setup](/Keystone-Grimoire/Mail/Domain-Setup)
- [MailCow Hardening](/Keystone-Grimoire/Mail/Hardening)
- [MailCow Backup](/Vault-Grimoire/Backups/MailCow-Backup)

---

## Pending

- [ ] Dedicated ATT_Mail static IP for outbound mail (OPNsense outbound NAT rule)
- [ ] Second DKIM selector (`mxroute`) validation
- [ ] MTA-STS validation (supported since the Sep 2025 update)

Netgrimoire/Keystone-Grimoire/Network/Port-Assignments.md

---
title: Port Assignments
description:
published: true
date: 2026-02-20T04:21:52.996Z
tags:
editor: markdown
dateCreated: 2026-01-27T03:42:58.945Z
---

# Physical Paths

| Device | IP | Room | Home Infra | DLink | TPLink | Closet | Inter Rack | Rack | Ubiquiti |
|--------|----|------|------------|-------|--------|--------|------------|------|----------|
| Dlink | 5.2 | Office | | 1 | | | | | 1 |
| ZNAS | 5.10 | | | 2 | | | | | |
| Docker3 | | | | 3 | | | | | |
| Docker5 | | | | 4 | | | | | |
| DockerPi1 | | | | 5 | | | | | |
| DNS | 5.7 | | | 6 | | | | | |
| Docker4 | | | | | | | W:7 | 19 | 4 |
| Docker2 | | Office | | | | | W:5 | 17 | 11 |
| Time Machine | | | | | | | W:6 | 18 | 12 |
| Deco Satt | | Room 1 | 1 | | | | | | 15 |
| Deco AP | | Office(E) | 10-24 | | | 24 | W:9 | 21 | 20 |
| TP Link | | | | | 1 | 22 | W:10 | 22 | 23 |
| OPNsense | 3.4 | | | | | 23 | W:11 | 23 | 24 |
| OPNsense-Cox | | | | | | | | | |
| | | | | | | | | | |
| | | Room 2 | 2 | | | | | 2 | |
| | | Room 3 | 3 | | | | | 3 | |
| | | Living(E) | 4 | | | | | 4 | |
| | | Living(W) | 5 | | | | | 5 | |
| | | Family | 6 | | | | | 6 | |
| | | Pantry | 7 | | | | | 7 | |
| | | Room 4 | 8 | | | | | 8 | |
| | | Gym | 9 | | | | | 9 | |
| | | Office(S) | 11 | | | | | 11 | |
| | | Office(W) | 12 | | | | | 12 | |
| | | Office(W) | 13 | | | | | 13 | |
| | | Office(W) | 14 | | | | | 14 | |
| | | Office(W) | 15 | | | | | 15 | |
| | | Office(W) | 16 | | | | | 16 | |
| | | Office(N) | 17 | | | | | 17 | |
| | | Office(N) | 18 | | | | | 18 | |
| | | Office(N) | 19 | | | | | 19 | |
| | | Office(N) | 20 | | | | | 20 | |

Note: In room names, N, E, S, W are compass directions. In the Inter Rack column, W = wall and H = hallway.

Netgrimoire/Keystone-Grimoire/Network/Topology.md

---
title: Network Topology
description: Netgrimoire network layout — VLANs, subnets, routing
published: true
date: 2026-04-12T00:00:00.000Z
tags: keystone, network
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Network Topology

## Subnets

| Subnet | Purpose |
|--------|---------|
| 192.168.3.0/24 | OPNsense / firewall management |
| 192.168.4.0/24 | ISPConfig / web hosting |
| 192.168.5.0/24 | Primary LAN — all Docker hosts |
| 192.168.8.0/24 | Pocket Grimoire (GL.iNet Beryl AX) |
| 192.168.32.0/24 | WireGuard VPN peers |

## WireGuard Peers

| Peer | IP | Device |
|------|----|--------|
| Obie | 192.168.32.2 | — |
| pncfishandmore | 192.168.32.3 | — |
| GLNet | 192.168.32.4 | GL.iNet router |
| PortaPotty | 192.168.32.5 | Pocket Grimoire laptop |
| GLNet | 192.168.32.6 | Second GL.iNet |

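Each row in the peer table corresponds to a `[Peer]` stanza in the server's WireGuard config; a sketch for `PortaPotty` (the key is a placeholder, and the /32 mask pins the peer to its single VPN address):

```
[Peer]
# PortaPotty — Pocket Grimoire laptop
PublicKey = <PortaPotty-public-key>
AllowedIPs = 192.168.32.5/32
```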
## DNS

Internal DNS runs on Technitium at `192.168.5.7` (`dns.netgrimoire.com`), behind Authentik.

All `*.netgrimoire.com` and `*.wasted-bandwidth.net` internal hostnames resolve via Technitium. Public DNS is managed via ISPConfig and the domain registrars.

## Docker Overlay Network

All Swarm services share the `netgrimoire` external overlay network (VIP mode). This is the only overlay network in use.

```
Name:   netgrimoire
Driver: overlay
Mode:   VIP (always — dnsrr is banned)
```

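A network like this is created once on a Swarm manager node; a sketch of the command (`--attachable` is an assumption here — it lets plain Compose containers such as the MailCow stack join the overlay):

```shell
docker network create --driver overlay --attachable netgrimoire
```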
See [Docker Swarm Template](/Keystone-Grimoire/Docker/Swarm-Template) for attachment rules.

Netgrimoire/Keystone-Grimoire/Overview.md

---
title: Keystone Grimoire
description: Architecture — the dwarven runesmith's blueprints
published: true
date: 2026-04-12T00:00:00.000Z
tags: keystone, architecture
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Keystone Grimoire

![keystone-banner.png](/images/banners/keystone-banner.png)

The Keystone Grimoire holds the architectural blueprints of Netgrimoire — how everything is wired together, how traffic flows, and why decisions were made. Remove the keystone and the arch falls. This is the arch.

---

## Sections

| Section | Contents |
|---------|----------|
| [Hosts](/Keystone-Grimoire/Hosts/Host-Inventory) | Node inventory, roles, IPs, pinned services, hardware |
| [Network](/Keystone-Grimoire/Network/Topology) | Topology, VLANs, DNS, WireGuard, OpenVPN, port assignments |
| [Docker](/Keystone-Grimoire/Docker/Swarm-Template) | Swarm template standard, overlay network, label rules, volume paths |
| [Mail](/Keystone-Grimoire/Mail/MailCow-Overview) | MailCow, MXRoute, DKIM, SRS, domain setup, hardening |

---

## Key Principles

- **Caddy is the single entry point** for all web traffic. Every public service goes through Caddy. No exceptions.
- **Docker labels drive routing** — services register themselves with Caddy via `deploy.labels`. Static Caddyfile entries are used only for Compose stacks where label pickup is unreliable.
- **Never mix label and static routing for the same hostname** — caddy-docker-proxy merges them into a broken upstream pool.
- **Always VIP endpoint mode** — `endpoint_mode: dnsrr` is banned. It breaks internal DNS resolution.
- **ARM nodes are excluded by default** — all Swarm services carry `node.platform.arch != aarch64` and `node.platform.arch != arm` constraints unless explicitly ARM-specific.

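A service stanza that follows these principles might look like the sketch below. The hostname, port, and service details are illustrative, not the actual template; the labels follow caddy-docker-proxy conventions:

```yaml
deploy:
  endpoint_mode: vip          # never dnsrr
  placement:
    constraints:
      - node.platform.arch != aarch64
      - node.platform.arch != arm
  labels:
    caddy: app.netgrimoire.com
    caddy.reverse_proxy: "{{upstreams 8080}}"
networks:
  - netgrimoire
```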
Netgrimoire/Pocket-Grimoire/Hardware/Inventory.md

---
title: Hardware Inventory
description: Pocket Grimoire hardware — laptop, router, storage, power
published: true
date: 2026-04-12T00:00:00.000Z
tags: pocket, hardware
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Hardware Inventory

## Core Compute

- Laptop (Docker host)
- ZFS pool `pocket-green` at `/srv/greenpg/`
- Docker Engine (not Swarm)

## Networking

- GL.iNet Beryl AX (GL-MT3000)
- LAN: `192.168.8.0/24`
- WireGuard peer: `PortaPotty` (192.168.32.5)
- Short CAT5/6 cable (router ↔ laptop)

## Storage

| Drive | Mount | Encrypted | Contents |
|-------|-------|-----------|----------|
| SSD Vault | ZFS pool | Yes | Git mirrors, wiki backup, Kopia repo, SSH keys, system configs |
| SSD Green | ZFS pool | Yes | Personal media, Stash data, VeraCrypt containers — personal trips only |

## Media Players

- 2x Onn 4K streaming boxes with power
- FireTV Stick with power

## Power

- Anker Prime 200W 6-Port GaN desktop charger
- Short USB-C cables (router)
- Short USB-A to USB-C (laptop power backup)
- 2x short USB-3 cables (SSDs)
- Longer USB-C to USB-C (laptop primary power)
- Longer USB-C to USB-C (phone/tablet)

Netgrimoire/Pocket-Grimoire/Hardware/ONN-Media-Streamer.md

---
title: Stream Box
description: Configure ONN Media Box
published: true
date: 2026-02-20T04:50:44.701Z
tags:
editor: markdown
dateCreated: 2026-02-20T04:50:34.384Z
---

# Onn 4K Streaming Box Setup Guide

**Complete configuration guide for Onn 4K streaming boxes used with Pocket Grimoire**

---

## Overview

This guide covers the complete setup of your Onn 4K streaming boxes for use with Pocket Grimoire, including:

- Initial device setup
- WiFi configuration (portapotty network)
- Required app installations (Jellyfin, StashApp, Netflix, YouTube TV)
- Connection to Pocket Grimoire services
- Troubleshooting common issues

**Network Configuration:**

- **WiFi SSID:** `portapotty` (GL.iNet Beryl AX travel router)
- **Connection:** All devices connect wirelessly to portapotty
- **Exception:** The Raspberry Pi connects to the router via CAT5 ethernet

---

## Hardware Information

### Onn 4K Streaming Box Specifications

- **Model:** Onn 4K Streaming Box (Walmart exclusive)
- **OS:** Android TV (Google TV interface)
- **CPU:** Amlogic S905Y4 quad-core
- **RAM:** 2GB
- **Storage:** 8GB internal
- **Video:** 4K HDR, Dolby Vision, Dolby Atmos
- **WiFi:** 802.11ac (WiFi 5) dual-band
- **Bluetooth:** 5.0
- **Ports:** HDMI 2.1, Micro-USB (power)
- **Remote:** Voice remote with Google Assistant

### What's in the Box

- Onn 4K streaming box
- Voice remote with batteries
- USB power adapter
- HDMI cable (short)
- Quick start guide

---

## Initial Setup

### First Power-On

1. **Connect to TV:**
   - Plug the HDMI cable into the Onn box
   - Connect the other end to the hotel TV's HDMI port
   - Plug Micro-USB power into the Onn box
   - Connect the USB power adapter to the wall or the Anker Prime

2. **Power On:**
   - The TV should auto-detect the HDMI input
   - If not, use the TV remote to select the correct HDMI input
   - The Onn box LED will light up (solid white when ready)
   - Wait for the Google TV home screen

3. **Select Language:**
   - Use the remote to select a language (English)
   - Click OK

4. **Accessibility Options:**
   - Skip unless needed (click "Skip")

### WiFi Configuration

**Critical: connect to the portapotty network**

1. **WiFi Setup Screen:**
   - A list of available networks will appear
   - Scroll to find `portapotty`
   - Select `portapotty`
   - Click "Connect"

2. **Enter Password:**
   - Enter the WiFi password for the portapotty network
   - Use the on-screen keyboard
   - Click "Connect"
   - Wait for the connection (should take 5-10 seconds)
   - A "Connected" message will appear

3. **Verify Connection:**
   - Should show "portapotty" with signal strength
   - Should show "Connected" status

**Troubleshooting WiFi:**

- If portapotty doesn't appear: ensure the Beryl AX router is powered on
- If the password fails: double-check the portapotty WiFi password
- If the connection drops: move closer to the router
- Signal strength: should be "Excellent" or "Good" in a hotel room

### Google Account Setup

**Option A: Sign in with a Google Account**

1. Select "Sign in"
2. Use your phone to scan the QR code or enter the code
3. Follow the prompts on the phone
4. The account will sync to the Onn box

**Option B: Set up without a Google Account (Limited)**

1. Select "Skip"
2. Click "Skip" again to confirm
3. Some features will be limited (Play Store, purchases)
4. **Recommendation:** Use Option A for full functionality

**For Pocket Grimoire:**

- A Google account is needed for the Play Store (to install apps)
- StashApp requires sideloading (see the separate section)

### Complete Initial Setup

1. **Google Services:**
   - Accept the terms (or skip)
   - Location services: your choice
   - Device name: name it (e.g., "Onn Box 1", "Onn Box 2")

2. **Voice Match:**
   - Set up "Hey Google" voice commands (optional)
   - Can skip and set up later

3. **Apps to Install:**
   - Google will suggest popular apps
   - Skip for now (specific apps are installed later)
   - Click "Next" or "Skip"

4. **Complete:**
   - Should arrive at the Google TV home screen
   - The remote should control the interface
   - Ready to install apps

---


## App Installations

### 1. Jellyfin for Android TV

**Install from Google Play Store:**

1. **Open Play Store:**
   - Press Home button on remote
   - Navigate to "Apps" tab at top
   - Select "Play Store"

2. **Search for Jellyfin:**
   - Click search icon (magnifying glass)
   - Type "Jellyfin" using on-screen keyboard
   - Select "Jellyfin for Android TV" from results
   - **Developer:** Jellyfin
   - **Note:** Choose "Jellyfin for Android TV", not regular Jellyfin

3. **Install:**
   - Click "Install"
   - Wait for download and installation (~30 seconds)
   - Click "Open" when complete

4. **Configure Jellyfin:**
   - Click "Connect to Server"
   - **Method 1 - Manual Entry:**
     - Click "Add server manually"
     - Host: `pocket-grimoire.local` or `10.0.0.10` (Pi's IP)
     - Port: `8096`
     - Click "Connect"
   - **Method 2 - Auto-Discovery (if available):**
     - Wait for Jellyfin to discover Pocket Grimoire
     - Select "Pocket Grimoire" from list
     - Click "Connect"

5. **Login:**
   - Enter username and password
   - Or select "Quick Connect" if configured
   - Click "Sign In"

6. **Verify:**
   - Should see Jellyfin home screen
   - Libraries (Movies, TV Shows) should appear
   - Test playing a video (should be direct play, no buffering)

**Jellyfin Settings (Optional but Recommended):**
- Settings → Playback
- Video quality: Maximum
- Allow direct play: ON
- Allow direct stream: ON
- Allow video transcoding: OFF (should be disabled on server already)
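
Before configuring the TV app, it can help to confirm the server answers from any machine on portapotty. A minimal sketch — `/System/Ping` is Jellyfin's public ping route, but treat the exact path as an assumption for your server version:

```shell
# check_jellyfin HOST -- prints the URL and returns 0 if Jellyfin answers.
# Host defaults to the Pi's IP used throughout this guide.
check_jellyfin() {
  local host="${1:-10.0.0.10}"
  local url="http://${host}:8096/System/Ping"
  if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
    echo "reachable: $url"
  else
    echo "unreachable: $url" >&2
    return 1
  fi
}
# Usage: check_jellyfin pocket-grimoire.local
```

If this reports unreachable, fix the server before touching the Onn box — the app cannot do better than the network does.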

### 2. StashApp for Android TV

**Installation: Requires Sideloading (GitHub Release)**

StashApp is not available in the Play Store; it must be installed manually from an APK file.

#### Prerequisites

- USB drive (for APK transfer)
- Computer with internet access
- OR Android phone with file transfer capability

#### Method 1: USB Drive Installation (Recommended)

**On Your Computer:**

1. **Download StashApp APK:**
   - Open browser: https://github.com/damontecres/StashAppAndroidTV/releases
   - Find latest release (e.g., v1.x.x)
   - Download file: `stashapp-tv-release-vX.X.X.apk`
   - Save to USB drive

2. **Prepare USB Drive:**
   - Format as FAT32 or exFAT (if not already)
   - Copy APK to root of USB drive
   - Safely eject USB drive

**On Onn Box:**

3. **Enable Unknown Sources:**
   - Press Home button
   - Navigate to Settings (gear icon)
   - Select "Device Preferences"
   - Select "Security & Restrictions"
   - Enable "Unknown Sources"
   - Confirm warning (accept risk)

4. **Install File Manager (if needed):**
   - Open Play Store
   - Search "File Commander" or "X-plore File Manager"
   - Install one of these apps
   - Open the file manager app

5. **Connect USB Drive:**
   - Plug USB drive into Onn box USB port
   - **Note:** Onn box only has Micro-USB (power), so you need:
     - USB OTG adapter (Micro-USB to USB-A female)
     - OR transfer APK via network/Bluetooth

**Alternative: Network Transfer**

Since the Onn box doesn't have easy USB access:

1. **Use Send Files to TV App:**
   - On Onn box: Install "Send Files to TV" from Play Store
   - On phone/computer: Install companion app
   - Transfer APK wirelessly
   - Open with package installer

2. **Or Use Cloud Storage:**
   - Upload APK to Google Drive
   - On Onn box: Install Google Drive app
   - Download APK from Drive
   - Open with package installer

#### Method 2: Direct Download on Onn Box (Easiest)

**On Onn Box:**

1. **Install Downloader App:**
   - Open Play Store
   - Search "Downloader" (by AFTVnews)
   - Install and open

2. **Download StashApp APK:**
   - In Downloader app, click URL field
   - Enter: `https://github.com/damontecres/StashAppAndroidTV/releases`
   - Navigate to latest release
   - Click APK download link
   - Save APK

3. **Install APK:**
   - Downloader will prompt to install after download
   - Click "Install"
   - Click "Done" when complete
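
If Developer options and network debugging are enabled on the box (not covered above — an assumption about your setup), the APK can also be pushed from a computer with `adb`. A hedged sketch; the IP and filename below are placeholders:

```shell
# sideload_apk BOX_IP APK_FILE -- push an APK to the box over the network.
# Requires adb (Android platform-tools) on the computer and network
# debugging enabled on the Onn box.
sideload_apk() {
  local box_ip="$1" apk="$2"
  adb connect "${box_ip}:5555" &&
    adb install -r "$apk" &&   # -r replaces an existing install, keeping data
    adb disconnect "${box_ip}:5555"
}
# Usage: sideload_apk 192.168.8.101 stashapp-tv-release-vX.X.X.apk
```

This is also the least painful way to install updates later, since the APK never has to touch the box's own storage first.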

**Configure StashApp:**

1. **Open StashApp:**
   - Find in Apps list (may be under "See all apps")
   - Or search "Stash" in search bar

2. **Connect to Server:**
   - Enter server URL: `http://pocket-grimoire.local:9999`
   - Or use IP: `http://10.0.0.10:9999`
   - Enter API key (if required)
   - Click "Connect"

3. **Test Connection:**
   - Should load Stash interface
   - Browse library
   - Test playing a preview
   - Verify scene markers work

**StashApp Settings:**
- Video quality: Original (for direct play)
- Hardware acceleration: ON
- Cache previews: ON (if storage available)
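
As with Jellyfin, the server side can be checked from a shell before blaming the TV app. Stash serves a GraphQL endpoint at `/graphql`; the `version` query below matches recent Stash schemas but should be treated as an assumption for your release:

```shell
# stash_version HOST -- ask the Stash server for its version via GraphQL.
# A successful JSON response means the app-side "Connect" should work too.
stash_version() {
  local host="${1:-10.0.0.10}"
  curl -fsS --max-time 5 \
    -H "Content-Type: application/json" \
    -d '{"query":"{ version { version } }"}' \
    "http://${host}:9999/graphql"
}
# Usage: stash_version pocket-grimoire.local
```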

### 3. Netflix

**Install from Google Play Store:**

1. **Open Play Store:**
   - Press Home button
   - Navigate to "Apps"
   - Select "Play Store"

2. **Search Netflix:**
   - Search bar → type "Netflix"
   - Select "Netflix" (official app)
   - Click "Install"
   - Wait for installation

3. **Open Netflix:**
   - Click "Open" after installation
   - Or find in Apps list

4. **Sign In:**
   - Enter Netflix email and password
   - Or scan QR code with phone
   - Select profile

5. **Test:**
   - Browse content
   - Play a video to verify streaming works
   - Check video quality (should be HD/4K)

**Netflix Settings:**
- Profile: Select your profile
- Video quality: High (auto)
- Subtitles/audio: Configure as preferred

### 4. YouTube TV

**Install from Google Play Store:**

1. **Open Play Store:**
   - Navigate to Play Store
   - Search "YouTube TV"

2. **Install:**
   - Select "YouTube TV" (official app)
   - Click "Install"
   - Wait for installation

3. **Sign In:**
   - Open YouTube TV
   - Sign in with Google account (YouTube TV subscription)
   - Or use TV code activation:
     - Visit tv.youtube.com/start on computer/phone
     - Enter code shown on TV
     - Sign in and authorize

4. **Test:**
   - Browse live TV channels
   - Test DVR recordings
   - Verify streaming quality

**YouTube TV Settings:**
- Live guide: Configure preferences
- DVR: Verify recordings accessible
- Picture quality: Auto or 4K (if available)

---

## Network Configuration Details

### portapotty WiFi Network (GL.iNet Beryl AX)

**Network Details:**
- **SSID:** `portapotty`
- **Frequency:** 2.4GHz + 5GHz (dual-band)
- **Security:** WPA2/WPA3
- **DHCP:** Enabled (automatic IP assignment)
- **Subnet:** 192.168.8.0/24 (default GL.iNet)
- **Router IP:** 192.168.8.1 (Beryl AX admin panel)
- **DNS:** Handled by Beryl AX (AdGuard Home)

**Devices on portapotty Network:**
- Raspberry Pi 4: Ethernet (CAT5) → 10.0.0.10 (static, or check DHCP)
- Onn Box 1: WiFi → 192.168.8.x (DHCP assigned)
- Onn Box 2: WiFi → 192.168.8.x (DHCP assigned)
- Laptop: WiFi → 192.168.8.x (DHCP assigned)
- Phone/tablet: WiFi → 192.168.8.x (DHCP assigned)

### Pocket Grimoire Service Addresses

**When connected to portapotty network:**

```
Jellyfin:     http://pocket-grimoire.local:8096
          or  http://10.0.0.10:8096

Stash:        http://pocket-grimoire.local:9999
          or  http://10.0.0.10:9999

Wiki.js:      http://pocket-grimoire.local:3000
          or  http://10.0.0.10:3000

File Browser: http://pocket-grimoire.local:8080
          or  http://10.0.0.10:8080

Router Admin: http://192.168.8.1
```

**If `.local` names don't resolve:**
- Use IP addresses directly (10.0.0.10)
- Check Beryl AX DNS settings
- Restart Onn box
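
The `.local`-or-IP fallback above can be automated for scripts and bookmarks on a Linux client. A small helper — hypothetical, not part of the guide's tooling — that prefers the mDNS name and falls back to the raw IP:

```shell
# mdns_url NAME IP PORT -- print the .local URL if the name resolves,
# otherwise the raw-IP URL (the fallback this section describes).
mdns_url() {
  local name="$1" ip="$2" port="$3"
  if getent hosts "$name" >/dev/null 2>&1; then
    echo "http://${name}:${port}"
  else
    echo "http://${ip}:${port}"
  fi
}
# Usage: mdns_url pocket-grimoire.local 10.0.0.10 8096
```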

---

## Configuration Checklist

### Pre-Deployment (At Home)

**Before traveling, complete these tasks:**

- [ ] Both Onn boxes powered on and tested
- [ ] Both connected to test WiFi network
- [ ] Google accounts signed in on both boxes
- [ ] All 4 apps installed on both boxes:
  - [ ] Jellyfin for Android TV
  - [ ] StashApp for Android TV (sideloaded)
  - [ ] Netflix
  - [ ] YouTube TV
- [ ] Jellyfin configured and tested (play test video)
- [ ] StashApp configured and tested (browse library)
- [ ] Netflix signed in (test streaming)
- [ ] YouTube TV signed in (test live TV)
- [ ] Both remotes have fresh batteries
- [ ] Both boxes labeled (Box 1, Box 2) or distinguishable

### Hotel Deployment

**Setup sequence at hotel:**

1. **Setup Beryl AX Router:**
   - Power on Beryl AX
   - Connect to hotel WiFi (via Beryl AX admin or phone app)
   - Verify internet connection
   - portapotty WiFi should be active

2. **Setup Pocket Grimoire:**
   - Power on Raspberry Pi
   - Connect via CAT5 to Beryl AX
   - Wait 2-3 minutes for boot
   - SSH in and unlock ZFS (if needed)
   - Verify Docker containers running

3. **Setup Onn Box 1:**
   - Connect to TV HDMI port
   - Power on
   - Wait for boot (30 seconds)
   - Should auto-connect to portapotty
   - If not: Settings → Network → portapotty → Connect
   - Test Jellyfin (should connect automatically)
   - Test StashApp (should connect automatically)

4. **Setup Onn Box 2 (if using):**
   - Connect to second TV or different HDMI port
   - Repeat setup steps above
   - Verify connection to portapotty

5. **Verify All Services:**
   - Open Jellyfin → Browse library → Play test video
   - Open StashApp → Browse library → Test preview
   - Open Netflix → Test streaming
   - Open YouTube TV → Test live channel

**Total setup time: 10-15 minutes**
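
Step 2 above ("SSH in and unlock ZFS") can be collapsed into one command from a laptop on portapotty. A sketch — the login name, the need for `zfs load-key`, and passwordless sudo are assumptions about your host:

```shell
# grimoire_up [HOST] -- unlock encrypted datasets (if any) and list
# running containers in one pass after the Pi boots.
grimoire_up() {
  local host="${1:-10.0.0.10}"
  ssh "user@${host}" '
    sudo zfs load-key -a 2>/dev/null || true   # no-op if nothing is locked
    sudo zfs mount -a
    docker ps --format "table {{.Names}}\t{{.Status}}"
  '
}
# Usage: grimoire_up            # or: grimoire_up pocket-grimoire.local
```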

---

## Troubleshooting

### WiFi Connection Issues

**Onn box won't connect to portapotty:**

1. **Verify Router is Online:**
   - Check Beryl AX power LED (should be solid)
   - Check Beryl AX WiFi LED (should be blinking/solid)
   - Use phone to verify portapotty network is visible

2. **Forget and Reconnect:**
   - Settings → Network & Internet
   - Select portapotty
   - Click "Forget network"
   - Scan again
   - Reconnect with password

3. **Check Router Settings:**
   - Access Beryl AX admin: http://192.168.8.1
   - Verify WiFi is enabled
   - Check if DHCP is active
   - Verify no MAC filtering enabled

4. **Restart Devices:**
   - Power cycle Onn box (unplug, wait 10 seconds, plug back in)
   - Restart Beryl AX router
   - Try connecting again

**Weak WiFi Signal:**

- Move Beryl AX closer to TV/Onn box
- Reduce obstacles between router and box
- Use 2.4GHz band instead of 5GHz (better range, slower speed)
- Check for interference (hotel WiFi channels)

### Jellyfin Connection Issues

**Can't connect to Jellyfin server:**

1. **Verify Server is Running:**
   - SSH into Pocket Grimoire
   - Run: `docker ps | grep jellyfin`
   - Should show `pocketgrimoire_jellyfin` running

2. **Check Network Connectivity:**
   - On Onn box, open browser app
   - Navigate to: `http://pocket-grimoire.local:8096`
   - Or try IP: `http://10.0.0.10:8096`
   - Should load Jellyfin web interface

3. **Reconnect Jellyfin App:**
   - Open Jellyfin app
   - Settings → Server
   - Delete existing server
   - Add server manually:
     - Host: `pocket-grimoire.local` or `10.0.0.10`
     - Port: `8096`
   - Connect and login

4. **Check Firewall:**
   - SSH into Pi
   - Verify port 8096 is open: `sudo netstat -tlnp | grep 8096`
   - Should show jellyfin listening
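
Steps 1 and 4 can be combined into a quick listener sweep on the Pi, covering every port this guide uses (`netstat` is often absent on modern images; `ss` is the drop-in replacement):

```shell
# check_port PORT -- "listening" if something on this host has the TCP
# port open, "closed" otherwise.
check_port() {
  if ss -tln | awk '{print $4}' | grep -q ":$1\$"; then
    echo listening
  else
    echo closed
  fi
}
# Sweep the ports documented in this guide (run on the Pi itself):
for p in 8096 9999 3000 8080; do
  echo "port $p: $(check_port "$p")"
done
```

A "closed" line points at a container that is not up, which narrows the problem before touching any app settings.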

**Jellyfin Playback Issues:**

**Video won't play:**
- Check media is H.264/AAC (see encoding guide)
- Verify network bandwidth (should be strong WiFi)
- Try different video file
- Check Jellyfin logs: `docker logs pocketgrimoire_jellyfin`

**Video buffers/stutters:**
- Check WiFi signal strength (move router closer)
- Verify direct play (check playback info, should NOT say "transcoding")
- If transcoding occurs: Media is not properly encoded
- Check network activity: `ssh user@pocket-grimoire.local` then `iftop`

**Subtitles don't work:**
- Ensure subtitles are SRT format (not PGS/VobSub)
- External .srt files work best
- Embedded SRT in MKV also works

### StashApp Connection Issues

**Can't connect to Stash server:**

1. **Verify Stash is Running:**
   - SSH into Pocket Grimoire
   - Run: `docker ps | grep stash`
   - Should show `pocketgrimoire_stash` running

2. **Test Server Connection:**
   - Open browser on Onn box
   - Navigate to: `http://pocket-grimoire.local:9999`
   - Or try: `http://10.0.0.10:9999`
   - Should load Stash web interface

3. **Reconfigure StashApp:**
   - Open StashApp
   - Settings → Server
   - Remove existing server
   - Add server:
     - URL: `http://pocket-grimoire.local:9999`
     - Or: `http://10.0.0.10:9999`
   - Enter API key (if required)
   - Connect

4. **Check API Key (if StashApp requires one):**
   - SSH into Pi: `cat /srv/vaultpg/stash/config/config.yml | grep api_key`
   - Or access Stash web UI → Settings → Security → API Key
   - Copy key into StashApp

**StashApp Crashes or Freezes:**
- Clear app cache: Settings → Apps → StashApp → Clear cache
- Restart Onn box
- Reinstall StashApp (download latest APK)
- Check Stash server logs: `docker logs pocketgrimoire_stash`

**Previews won't play:**
- Verify previews synced from Netgrimoire
- Check: `ssh user@pocket-grimoire.local`
- Run: `ls /srv/vaultpg/stash/generated/` (should show preview files)
- If empty: Sync hasn't completed, or previews not generated on Netgrimoire

### Netflix/YouTube TV Issues

**Netflix won't sign in:**
- Verify Netflix subscription is active
- Try signing in on phone/computer first
- Use "Sign in with code" option (visit netflix.com/tv8 on another device)
- Check internet connection (portapotty → hotel WiFi)

**YouTube TV won't play:**
- Verify YouTube TV subscription is active
- Check location restrictions (some content blocked outside home area)
- Try signing out and back in
- Verify internet connection speed

**Streaming quality poor:**
- Check WiFi signal strength
- Verify hotel internet speed (not throttled)
- Switch to lower quality in app settings temporarily
- Move router closer to TV

### General Onn Box Issues

**Box won't turn on:**
- Check power adapter is plugged in
- Check Micro-USB cable is secure
- Try different power source
- LED should light up (white when on)

**Remote not working:**
- Check batteries (replace if needed)
- Re-pair remote: Hold Back + Home for 5 seconds
- Check for obstructions between remote and box
- Try using Google Home app as remote backup

**Box is slow/laggy:**
- Clear cache: Settings → Storage → Cached data → Clear
- Uninstall unused apps
- Restart box: Settings → Device Preferences → About → Restart
- Factory reset (last resort)

**Apps keep crashing:**
- Clear app cache and data
- Uninstall and reinstall app
- Check for OS updates: Settings → Device Preferences → About → System update
- Factory reset if persistent

**No sound:**
- Check TV volume (not muted)
- Check HDMI connection (reseat cable)
- Settings → Display & Sound → Audio output → Test
- Try different HDMI port on TV
- Check if audio is set to "Auto" or "Stereo"

### DNS Resolution Issues

**`.local` addresses don't work (pocket-grimoire.local fails):**

1. **Use IP Address Instead:**
   - Replace `pocket-grimoire.local` with `10.0.0.10`
   - Example: `http://10.0.0.10:8096` for Jellyfin

2. **Check Pi's IP Address:**
   - SSH into Pi: `ip addr show eth0`
   - Look for inet address (e.g., 192.168.8.50)
   - Use this IP in apps instead of .local

3. **Check Beryl AX DNS:**
   - Access http://192.168.8.1
   - Check DNS settings
   - Verify AdGuard Home is running
   - Ensure mDNS/Bonjour reflection is enabled (if option available)

4. **Add Static DNS Entry:**
   - In Beryl AX admin panel
   - Add static DNS entry: pocket-grimoire → 10.0.0.10

---

## Advanced Configuration

### Setting Static IP for Raspberry Pi

**On Beryl AX router:**

1. Access admin panel: http://192.168.8.1
2. Navigate to Network → DHCP Server
3. Find Raspberry Pi in client list
4. Assign static IP: 10.0.0.10
5. Save and apply

**Or on Raspberry Pi directly:**

```bash
# Edit network config
sudo nano /etc/dhcpcd.conf

# Add at end:
interface eth0
static ip_address=10.0.0.10/24
static routers=192.168.8.1
static domain_name_servers=192.168.8.1
```

**Note:** `10.0.0.10` is outside the Beryl AX's default `192.168.8.0/24` subnet, so a host configured this way cannot reach the `192.168.8.1` gateway directly. If the Pi sits on the default LAN, reserve an address in the `192.168.8.x` range instead (and substitute it wherever this guide says `10.0.0.10`); keep `10.0.0.10` only if the router is configured with a dedicated `10.0.0.0/24` segment for the Pi.

### Optimizing Video Playback

**Jellyfin Video Settings (on Onn box):**
- Settings → Playback
- Max streaming bitrate: Maximum (Auto)
- Video quality: Maximum
- Allow video playback that may require conversion: OFF
- Skip intro: ON (if desired)

**StashApp Video Settings:**
- Settings → Playback
- Video quality: Original
- Hardware acceleration: ON
- Buffer size: Large

### Remote Control Tips

**Voice Commands:**
- "Hey Google, open Jellyfin"
- "Hey Google, play [movie name] on Jellyfin"
- "Hey Google, pause"
- "Hey Google, turn off TV"

**Useful Remote Shortcuts:**
- Home button (twice): Recent apps
- Back button (hold): Return to home
- Play/Pause: Works in most video apps
- Voice button: Google Assistant

---

## App Locations

**After installation, find apps here:**

**Home Screen:**
- Netflix, YouTube TV usually appear automatically

**Apps Tab:**
- All installed apps listed alphabetically
- Jellyfin, StashApp will be here

**Quick Access:**
- Long-press Home → Add to Favorites
- Apps appear on home screen for quick access

---

## Maintenance

### Weekly (While Using)
- Check for app updates (Play Store → Updates)
- Clear cache if apps feel slow
- Verify WiFi connection strength

### Before Each Trip
- Test all apps at home
- Update apps if updates available
- Check remote batteries
- Verify all logins still active

### After Each Trip
- Check for OS updates
- Review installed apps (remove if unused)
- Clear cache to free storage

---

## Factory Reset (If Needed)

**When to factory reset:**
- Box is extremely slow
- Apps constantly crash
- Persistent connection issues
- Selling/giving away box

**How to factory reset:**

1. **Via Settings:**
   - Settings → Device Preferences
   - About → Factory Reset
   - Confirm reset
   - Wait for reboot (3-5 minutes)

2. **Via Recovery Mode:**
   - Power off box
   - Hold reset button (if present)
   - Power on while holding
   - Navigate with remote to "Factory Reset"
   - Confirm

**After reset:**
- Complete initial setup again (see beginning of guide)
- Reinstall all apps
- Reconfigure WiFi and services

---

## Quick Reference Card

**Essential Information:**

```
WiFi Network:  portapotty
Router Admin:  http://192.168.8.1

Pocket Grimoire Services:
- Jellyfin: http://pocket-grimoire.local:8096
- Stash:    http://pocket-grimoire.local:9999
- Wiki:     http://pocket-grimoire.local:3000

If .local fails, use IP: http://10.0.0.10:[PORT]

Apps Required:
✓ Jellyfin for Android TV (Play Store)
✓ StashApp for Android TV (Sideload APK)
✓ Netflix (Play Store)
✓ YouTube TV (Play Store)

Troubleshooting:
1. Restart Onn box
2. Check portapotty WiFi connection
3. Verify Pocket Grimoire is running (SSH check)
4. Use IP addresses instead of .local names
```

---

## Appendix: StashApp APK Sources

**Official GitHub Repository:**
- https://github.com/damontecres/StashAppAndroidTV
- Releases: https://github.com/damontecres/StashAppAndroidTV/releases
- Latest version: Check releases page

**Verification:**
- Download only from official GitHub releases
- Verify file integrity (check file size, release notes)
- Watch for malware warnings (false positives are common with sideloaded APKs)

**Update Process:**
- Check GitHub for new releases periodically
- Download new APK
- Install over existing app (data preserved)
- Or uninstall and reinstall clean

---

*This guide was created for Onn 4K streaming box configuration with Pocket Grimoire. Keep it updated as apps and configurations change.*

64
Netgrimoire/Pocket-Grimoire/Overview.md
Normal file
---
title: Pocket Grimoire
description: Portable travel lab — offline-first, encrypted, self-contained
published: true
date: 2026-04-12T00:00:00.000Z
tags: pocket, portable, travel
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Pocket Grimoire



Pocket Grimoire is a portable, encrypted, offline-first companion to Netgrimoire. It travels. It runs without internet. It tunnels home via WireGuard when connectivity is available. And it doubles as one of the two Vault Grimoire offsite nodes — every time it leaves the house, it takes an encrypted copy of the data with it.

---

## Hardware at a Glance

- **Laptop** — Docker host, ZFS pool `pocket-green` at `/srv/greenpg/`
- **GL.iNet Beryl AX (GL-MT3000)** — travel router, LAN `192.168.8.0/24`, WireGuard peer `PortaPotty`
- **2x Onn 4K streaming boxes** — hotel/TV playback
- **Anker 200W GaN charging station** — one plug for everything
- **SSDs** — Vault (always connected) + Green (personal trips only)

---

## Software Stack

| Service | Purpose | Mode |
|---------|---------|------|
| Jellyfin | Media playback | Read/write |
| Stash (PocketStash, port 9998) | Adult media | Read-only travel mode |
| Wiki.js | Documentation mirror | Pull-only |
| Filebrowser | File access | Read/write |

---

## WireGuard Home Tunnel

WireGuard peer `PortaPotty` (192.168.32.5) connects back to OPNsense on Netgrimoire when internet is available. All management traffic and sync operations use the tunnel.
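
For reference, the `PortaPotty` peer side of that tunnel might look like the sketch below. The tunnel address comes from the text; every key, the endpoint hostname, the port, and the allowed range are placeholders, not the real config:

```ini
[Interface]
# 192.168.32.5 is the peer address given above; the key is a placeholder
Address = 192.168.32.5/32
PrivateKey = <portapotty-private-key>

[Peer]
# OPNsense endpoint on Netgrimoire -- hostname/port are illustrative
PublicKey = <opnsense-public-key>
Endpoint = <netgrimoire-wan-hostname>:51820
AllowedIPs = 192.168.32.0/24
PersistentKeepalive = 25
```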

---

## As a Vault Node

Pocket Grimoire receives a `syncoid` push from `znas` before each trip:

```bash
syncoid znas:vault/Green/Pocket pocket:/srv/greenpg/Green
```

This makes it an offsite encrypted backup node whenever it leaves home. See [Vault Architecture](/Vault-Grimoire/Offsite/Vault-Architecture).

---

## Sections

| | |
|---|---|
| [Hardware](/Pocket-Grimoire/Hardware/Inventory) | Full hardware list, power kit, storage layout |
| [Software](/Pocket-Grimoire/Software/Stack) | Services, Docker config, ZFS pool |
| [Sync & Deployment](/Pocket-Grimoire/Sync/Pre-Travel-Sync) | Pre-travel checklist, syncoid, deployment guide |

39
Netgrimoire/Pocket-Grimoire/Software/Stack.md
Normal file
---
title: Software Stack
description: Services running on Pocket Grimoire
published: true
date: 2026-04-12T00:00:00.000Z
tags: pocket, software, docker
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Pocket Grimoire Software Stack

## Services

| Service | Port | Purpose | Mode |
|---------|------|---------|------|
| Jellyfin | 8096 | Media playback | Read/write |
| PocketStash | 9998 | Adult media (Stash) | Read-only travel mode |
| Wiki.js | 3000 | Documentation mirror | Pull-only (no writes) |
| Filebrowser | 8080 | File management | Read/write |
| Beszel agent | — | Reports back to znas monitoring | Active when tunneled |

## ZFS Pool

Pool name: `pocket-green`
Mount point: `/srv/greenpg/`

Dataset layout mirrors the Vault Grimoire structure for Green/Pocket data.

## Docker

Docker Engine (standalone, not Swarm). Compose-only. No overlay networks.

## Host Services

- Linux (Ubuntu Server)
- OpenZFS
- systemd timers (sync, health checks)
- Cockpit (management)
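
The "systemd timers (sync, health checks)" line could be realized roughly as below — unit names, the schedule, and the exact command are illustrative, not the actual files on the host:

```ini
# /etc/systemd/system/pocket-health.service (illustrative)
[Unit]
Description=Pocket Grimoire pool health check

[Service]
Type=oneshot
ExecStart=/usr/sbin/zpool status -x pocket-green

# /etc/systemd/system/pocket-health.timer (illustrative)
[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

`Persistent=true` makes the timer fire on next boot if a scheduled run was missed while the laptop was off — useful for a machine that travels.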

1927
Netgrimoire/Pocket-Grimoire/Software/Stash-Integration.md
Normal file
File diff suppressed because it is too large
3703
Netgrimoire/Pocket-Grimoire/Sync/Deployment-Guide.md
Normal file
File diff suppressed because it is too large
50
Netgrimoire/Pocket-Grimoire/Sync/Pre-Travel-Sync.md
Normal file
---
title: Pre-Travel Sync & Checklist
description: Everything to do before Pocket Grimoire leaves the house
published: true
date: 2026-04-12T00:00:00.000Z
tags: pocket, sync, travel, runbook
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Pre-Travel Sync & Checklist

## Sync Data from znas

```bash
# Push Green/Pocket dataset to Pocket Grimoire
syncoid znas:vault/Green/Pocket pocket:/srv/greenpg/Green

# Verify pool health after sync
ssh pocket "zpool status pocket-green"
```

## Pre-Travel Checklist

- [ ] Run syncoid push — verify completion, no errors
- [ ] Confirm ZFS pool healthy (`zpool status pocket-green`)
- [ ] Confirm WireGuard peer `PortaPotty` connects to OPNsense
- [ ] Confirm Jellyfin library scan complete
- [ ] Confirm PocketStash metadata synced (check last scan date in UI)
- [ ] Confirm Wiki.js content is current (last pull timestamp)
- [ ] Charge Anker station fully
- [ ] Pack SSDs — Vault always, Green for personal trips only

## While Traveling

- PocketStash runs read-only — no writes, no new imports
- Wiki.js is pull-only — no page edits (edits won't sync back cleanly)
- WireGuard tunnel home via `PortaPotty` peer when internet available
- Beszel agent reports back to znas when tunneled

## On Return

```bash
# Sync any Jellyfin watch state or metadata changes back if needed
# No automated reverse sync — manual review before writing back
```

## Deployment Guide

See the [Deployment Guide](/Pocket-Grimoire/Sync/Deployment-Guide) for the full from-scratch build procedure.

125
Netgrimoire/Shadow-Grimoire/Arr/Bazarr.md
Normal file
---
title: bazarr Stack
description: Bazarr Stack for NetGrimoire
published: true
date: 2026-04-04T01:35:32.755Z
tags: docker,swarm,bazarr,netgrimoire
editor: markdown
dateCreated: 2026-04-04T01:35:32.755Z
---

# bazarr

## Overview
The bazarr stack is a Docker Swarm configuration for the Bazarr service in NetGrimoire. It provides subtitle search and management for the media library and connects to other services through deploy labels and environment variables.

---

## Architecture

- **Host:** docker4
- **Network:** netgrimoire
- **Exposed via:** bazarr.netgrimoire.com
- **Homepage group:** Jolly Roger

---

## Build & Configuration

### Prerequisites
To deploy this stack, ensure that Docker Swarm is installed and configured.

### Volume Setup
```bash
mkdir -p /DockerVol/bazarr/config
chown -R user:group /DockerVol/bazarr/config
```

### Environment Variables
```bash
PUID=1964
PGID=1964
TZ=America/Chicago

# Deploy labels (consumed by Caddy and Uptime Kuma), not shell variables:
Caddy: authentik
Caddy.reverse_proxy: {{upstreams 6767}}
Kuma.bazarr.http.name=Bazarr
Kuma.bazarr.http.url=http://bazarr:6767
```

### Deploy
```bash
cd services/swarm/stack/bazarr
set -a && source .env && set +a
docker stack config --compose-file bazarr-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml bazarr
rm resolved.yml
docker stack services bazarr
```

### First Run
After deployment, run `./deploy.sh` to initialize the configuration.

---

## User Guide

### Accessing bazarr

- **Bazarr:** http://bazarr.netgrimoire.com
- **Caddy reverse proxy:** Internal only

### Primary Use Cases
Use Bazarr for subtitle search in NetGrimoire.

### NetGrimoire Integrations
This service connects to Uptime Kuma and Caddy through deploy labels and environment variables.

---

## Operations

### Monitoring
```bash
docker stack services bazarr
docker service logs -f bazarr
```

### Backups
- `/DockerVol/bazarr/config` is critical for configuration data.
- `/DockerVol/bazarr/data` is reconstructable.
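
Since only the config volume is critical, a dated tarball before each upgrade is cheap insurance. A sketch — the destination directory is a placeholder, not a documented path:

```shell
# backup_bazarr [DEST_DIR] -- tar up the critical config volume with a
# date-stamped name; the data volume is skipped because it is
# reconstructable.
backup_bazarr() {
  local dest="${1:-/srv/backups}"
  mkdir -p "$dest"
  tar -czf "${dest}/bazarr-config-$(date +%F).tar.gz" -C /DockerVol/bazarr config
}
# Usage: backup_bazarr /mnt/nas/backups
```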
|
||||
|
||||
### Restore
|
||||
```bash
|
||||
./deploy.sh
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Failures
|
||||
| Symptom | Cause | Fix |
|
||||
|---------|-------|-----|
|
||||
1. Service not available | Incorrect DNS entry | Check Caddy reverse proxy configuration and DNS resolution.
|
||||
2. Data corruption | Inconsistent backups | Ensure consistent and regular backups of critical data volumes.
|
||||
3. Network connectivity issues | Incorrect network configuration | Verify network configuration and re-deploy the stack with corrected settings.

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-03 | e5ba5297 | Initial deployment documentation. |
| 2026-04-03 | 74b54de4 | Minor configuration updates. |
| 2026-04-03 | 4f400b3f | Security patches and bug fixes. |
| 2026-04-03 | 8df1f14f | Performance improvements. |
| 2026-04-03 | 99cffc2b | Minor documentation updates. |

---

## Notes

- Generated by Gremlin on 2026-04-04T01:35:32.755Z
- Source: swarm/bazarr.yaml
- Review User Guide and Changelog sections

119
Netgrimoire/Shadow-Grimoire/Arr/Radarr.md
Normal file

@ -0,0 +1,119 @@
# radarr

## Overview

The Radarr stack is a Docker Swarm configuration for Radarr, the movie library manager. It provides a centralized hub for managing a large movie collection, with automated metadata fetching and quality filtering.

---

## Architecture

- **Host:** docker4
- **Network:** netgrimoire
- **Exposed via:** `caddy.radarr.netgrimoire.com`, `radarr:7878`
- **Homepage group:** Jolly Roger

---

## Build & Configuration

### Prerequisites

No specific prerequisites are required for this stack.

### Volume Setup

```bash
# /DockerVol/Radarr is mounted into the container as /config
mkdir -p /DockerVol/Radarr
chown -R radarr:radarr /DockerVol/Radarr
```

### Environment Variables

```bash
# generate secrets with: openssl rand -hex 32
TZ=America/Chicago
PGID="1964"
PUID="1964"
CADDY_HTTPS_KEY=$(openssl rand -hex 32)

# Uptime Kuma autodiscovery labels (Swarm service labels in the stack file,
# not shell variables):
#   kuma.radarr.http.name=Radarr
#   kuma.radarr.http.url=https://radarr.netgrimoire.com
```

### Deploy

```bash
cd services/swarm/stack/radarr
set -a && source .env && set +a
docker stack config --compose-file radarr-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml radarr
rm resolved.yml
docker stack services radarr
```

### First Run

After a successful deployment, run the helper script to finish initialization:

```bash
./deploy.sh
```

---

## User Guide

### Accessing radarr

| Service | URL | Purpose |
|---------|-----|---------|
| Radarr | https://radarr.netgrimoire.com | Movie management web UI |

### Primary Use Cases

To use Radarr in NetGrimoire:

1. Log in to the Radarr interface at `https://radarr.netgrimoire.com`.
2. Configure your library by adding movies and setting quality filters.
3. Caddy handles reverse proxying and HTTPS in front of the service.

### NetGrimoire Integrations

Radarr integrates with Uptime Kuma for monitoring (via the `kuma.radarr.http.*` labels) and appears on the Homepage dashboard under the Jolly Roger group.

---

## Operations

### Monitoring

Uptime Kuma monitors Radarr via the `kuma.radarr.http.*` labels.

```bash
docker stack services radarr
docker service logs -f radarr   # pick the exact name from `docker stack services`
```

### Backups

Back up `/DockerVol/Radarr` regularly. Radarr also writes its own database backups to `/DockerVol/Radarr/data/backup/`; these can live in the same backup set.
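
A small sketch for checking what Radarr has written to its backup folder (path as stated above; the fallback message keeps the snippet safe to run on a machine without the volume):

```shell
# List Radarr's built-in backup archives, with a fallback when the path
# does not exist (e.g. on a workstation rather than the Swarm host).
BACKUP_PATH="/DockerVol/Radarr/data/backup"
msg=$(ls -lh "$BACKUP_PATH" 2>/dev/null || echo "no backups found at $BACKUP_PATH")
echo "$msg"
```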

### Restore

```bash
cd services/swarm/stack/radarr
./deploy.sh
```

---

## Common Failures

| Failure Mode | Symptoms | Cause | Fix |
|--------------|----------|-------|-----|
| Caddy not listening | No incoming requests. | Caddy not started | Force-restart the Caddy service with `docker service update --force <caddy-service>` (note: `docker stack services radarr` only lists services, it does not restart them) |
| Radarr service not running | No visible interface in NetGrimoire Dashboard. | Radarr service not deployed correctly | Re-run the deploy script and restart the radarr service |

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-07 | 77c13325 | Initial documentation for swarm configuration |
| 2026-02-19 | 7482d3e5 | Added Caddy HTTPS key to environment variables |
| 2026-02-01 | 48701f5b | Updated Docker Swarm file with new Radarr image version |
| 2026-01-10 | 1a374911 | Improved Radarr configuration and setup |

---

## Notes

- Generated by Gremlin on 2026-04-07T19:34:53.606Z
- Source: swarm/radarr.yaml
- Review User Guide and Changelog sections

127
Netgrimoire/Shadow-Grimoire/Arr/Sonarr.md
Normal file

@ -0,0 +1,127 @@
# sonarr

## Overview

This stack provides a Docker Swarm configuration for Sonarr, the TV series library manager. The stack includes Caddy as a reverse proxy, Uptime Kuma monitoring labels, and serves Sonarr's web interface.

---

## Architecture

- **Host:** docker4
- **Network:** netgrimoire
- **Exposed via:** sonarr.netgrimoire.com
- **Homepage group:** Jolly Roger

---

## Build & Configuration

### Prerequisites

No specific prerequisites are required.

### Volume Setup

```bash
# /DockerVol/Sonarr is mounted into the container as /config
mkdir -p /DockerVol/Sonarr
chown -R sonarr:sonarr /DockerVol/Sonarr
```

### Environment Variables

```bash
# generate secrets with: openssl rand -hex 32
TZ=America/Chicago
PUID=1964
PGID=1964
# NOTE: random hex is only a placeholder; if Caddy is not auto-provisioning
# TLS, point CADDY_CERT/CADDY_KEY at real certificate and key material.
CADDY_CERT=$(openssl rand -hex 32)
CADDY_KEY=$(openssl rand -hex 32)
```

### Deploy

```bash
cd services/swarm/stack/sonarr
set -a && source .env && set +a
docker stack config --compose-file sonarr-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml sonarr
rm resolved.yml
docker stack services sonarr
```

### First Run

No specific post-deploy steps are required.

---

## User Guide

### Accessing sonarr

| Service | URL | Purpose |
|---------|-----|---------|
| Sonarr | https://sonarr.netgrimoire.com | Web UI (via Caddy reverse proxy) |

### Primary Use Cases

Access Sonarr's web interface to manage your TV library and download clients.

### NetGrimoire Integrations

This stack connects to other services through labels and environment variables:

- Homepage group: Jolly Roger

---

## Operations

### Monitoring

Uptime Kuma monitors Sonarr via stack labels: `kuma.sonarr.http.name: Sonarr`, `kuma.sonarr.http.url: https://sonarr.netgrimoire.com`.

```bash
docker stack services sonarr
```
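
The monitors above are declared as Swarm service labels. A hypothetical fragment of `sonarr-stack.yml` (exact placement depends on how the stack file wires Uptime Kuma autodiscovery):

```yaml
services:
  sonarr:
    deploy:
      labels:
        kuma.sonarr.http.name: "Sonarr"
        kuma.sonarr.http.url: "https://sonarr.netgrimoire.com"
```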

### Backups

Back up the critical volumes regularly. For reconstructing a full install:

- `/DockerVol/Sonarr` (mounted into the container as `/config`) is the critical target.

### Restore

```bash
cd services/swarm/stack/sonarr
./deploy.sh
```

---

## Common Failures

1. **Failed to connect**: Insufficient Caddy reverse proxy configuration.
   - Check the `CADDY_CERT` and `CADDY_KEY` environment variables for correct formatting.
   - Update the Caddy configuration if necessary.

2. **Uptime Kuma failed to connect**: Incorrect HTTP URL or port.
   - Ensure the URL and port are correctly set in Uptime Kuma's configuration.
   - Restart the service with `docker service update --force <service>` (there is no `docker stack restart` command).

3. **Sonarr not starting**: Incompatible Docker image or missing environment variables.
   - Check the Sonarr Docker image version for compatibility.
   - Verify all required environment variables are present and correct.

4. **Caddy reverse proxy not working**: Incorrect Caddy configuration.
   - Review the Caddy labels in `sonarr-stack.yml` for errors.
   - Restart the service with `docker service update --force <service>`.

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-07 | fb75c66d | Initial documentation creation. |

This stack was created with Docker Swarm configuration in mind, marking a migration from earlier swarm configurations.

---

## Notes

- Generated by Gremlin on 2026-04-07T19:37:34.802Z
- Source: swarm/sonarr.yaml
- Review User Guide and Changelog sections

98
Netgrimoire/Shadow-Grimoire/Downloaders/SABnzbd.md
Normal file

@ -0,0 +1,98 @@
# sabnzbd

## Overview

The sabnzbd stack is a Docker Swarm configuration for the SABnzbd Usenet downloader, providing a centralized and secure way to manage and retrieve Usenet content in NetGrimoire.

---

## Architecture

- **Host:** docker4
- **Network:** netgrimoire
- **Exposed via:** sabnzbd.netgrimoire.com, 8082:8080
- **Homepage group:** Jolly Roger

---

## Build & Configuration

### Prerequisites

No specific prerequisites are required for this stack.

### Volume Setup

```bash
mkdir -p /DockerVol/sabnzbd
chown -R docker4:docker4 /DockerVol/sabnzbd
```

### Environment Variables

```bash
# generate secrets with: openssl rand -hex 32
# Values below mirror the other NetGrimoire stacks (assumed; confirm in .env):
TZ=America/Chicago
PUID=1964
PGID=1964
```

### Deploy

```bash
cd services/swarm/stack/sabnzbd
set -a && source .env && set +a
docker stack config --compose-file sabnzbd-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml sabnzbd
rm resolved.yml
docker stack services sabnzbd
```

### First Run

After deployment, ensure the Caddy reverse proxy is configured correctly for the newly deployed service.

---

## User Guide

### Accessing sabnzbd

| Service | URL | Purpose |
|---------|-----|---------|
| SABnzbd | https://sabnzbd.netgrimoire.com | Usenet downloader |

### Primary Use Cases

To use the sabnzbd service in NetGrimoire, open [https://sabnzbd.netgrimoire.com](https://sabnzbd.netgrimoire.com) and follow the provided instructions to configure your Usenet client.

### NetGrimoire Integrations

The sabnzbd service is configured through the PGID, PUID, and TZ environment variables, which set file ownership (user/group IDs) and the timezone inside the container.

---

## Operations

### Monitoring

Monitor the sabnzbd service using Uptime Kuma.

```bash
docker stack services sabnzbd
docker service logs -f sabnzbd   # pick the exact name from `docker stack services`
```

### Backups

Critical: back up `/DockerVol/sabnzbd` regularly; it is required for data recovery after failure or loss.

### Restore

```bash
cd services/swarm/stack/sabnzbd
./deploy.sh
```

---

## Common Failures

| Symptom | Cause | Fix |
|---------|-------|-----|
| Service not accessible | Incorrect Caddy reverse proxy configuration | Check and correct Caddy labels, restart service |
| Data corruption | Insufficient backups | Regularly back up the /DockerVol/sabnzbd directory |
| Network connectivity issues | Outdated Docker Swarm stack | Update to latest version with latest dependencies |

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-07 | a3d7972b | Initial documentation for the sabnzbd Stack. |
| 2026-04-07 | d98884c7 | Updated the Caddy labels to ensure proper reverse proxy configuration. |
| 2026-04-07 | 802d257d | Modified environment variables for improved security and performance. |

---

## Notes

- Generated by Gremlin on 2026-04-07T20:51:44.986Z
- Review the User Guide and Changelog sections for accuracy and completeness

83
Netgrimoire/Shadow-Grimoire/Overview.md
Normal file

@ -0,0 +1,83 @@
---
title: Shadow Grimoire
description: Acquisition stack — the goblin hacker sails the high seas
published: true
date: 2026-04-12T00:00:00.000Z
tags: shadow, acquisition, arr
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Shadow Grimoire

![shadow-grimoire.png](/images/grimoires/shadow-grimoire.png)

The Shadow Grimoire is the acquisition and media management infrastructure. Usenet + torrents, protected behind `*.wasted-bandwidth.net` and Authelia. Homepage tab: **Wasted-Bandwidth**.

The goblin hacker doesn't ask permission.

---

## Services by Group

### Jolly Roger — Indexers

| Service | URL | Purpose |
|---------|-----|---------|
| NZBHydra | `hydra.netgrimoire.com` | Usenet indexer aggregator (altHUB, NZBGeek, Drunken Slug, Usenet Crawler, DogNZB) |
| Jackett | `jackett.netgrimoire.com` | Torrent indexer — runs inside Gluetun VPN on docker2 |

### Downloaders

| Service | URL | Purpose | Host |
|---------|-----|---------|------|
| SABnzbd | — | Usenet downloader | znas / Swarm |
| NZBGet | — | Usenet downloader | znas / Swarm |
| Transmission | — | BitTorrent client | docker2 (via Gluetun VPN) |
| Gluetun | — | VPN gateway — PIA VPN | docker2 / Compose |

Jackett and Transmission share `network_mode: container:gluetun` — all their traffic routes through the PIA VPN.
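
The shared-network pattern looks roughly like this in the docker2 Compose file (a hypothetical sketch; the image tags and the `gluetun` container name are assumptions):

```yaml
services:
  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun
    cap_add:
      - NET_ADMIN
  jackett:
    image: lscr.io/linuxserver/jackett
    network_mode: "container:gluetun"   # all Jackett traffic exits via the VPN
  transmission:
    image: lscr.io/linuxserver/transmission
    network_mode: "container:gluetun"   # same for torrent traffic
```

With this mode, the attached containers have no network namespace of their own, so any ports to publish must be published on the gluetun container.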

### Arr Stack — Media Management

| Service | URL | Purpose |
|---------|-----|---------|
| Sonarr | — | TV show acquisition |
| Radarr | — | Movie acquisition |
| Bazarr | `bazarr.netgrimoire.com` | Subtitle management |
| Readarr | — | Book acquisition |
| Lidarr | — | Music acquisition |
| Beets | `beets.netgrimoire.com` | Music library tagging |
| Mylar | — | Comic acquisition (📋 planned — see `archive/arr.yaml`) |

### Config Management

| Service | URL | Purpose |
|---------|-----|---------|
| Recyclarr | — | Sonarr/Radarr quality profile sync |
| Profilarr | `profilarr.netgrimoire.com` | Quality profile management |
| Configarr | `configarr.netgrimoire.com` | Arr config management |

### Media Search & Discovery

| Service | URL | Purpose |
|---------|-----|---------|
| JellySeerr | `requests.netgrimoire.com` | Media request management |
| TinyMediaManager | `tmm.netgrimoire.com` | Media metadata manager |
| Pinchflat | `pinchflat.netgrimoire.com` | YouTube channel downloader |
| Tunarr | — | IPTV channel creation (ErsatzTV replacement) |

---

## Network Notes

Jackett and Transmission run on docker2 via Docker Compose, not Swarm. They use `network_mode: container:gluetun` to route through the PIA VPN. Caddy reaches Jackett via the `netgrimoire` overlay using an internal hostname.

---

## Pending

- [ ] Prowlarr — low priority (NZBHydra covers current needs)
- [ ] Mylar — comic downloader, needs setup (reference `archive/arr.yaml`)
- [ ] Soularr — Soulseek integration for Lidarr
- [ ] MeTube — YouTube downloader for Tunarr filler workflow

841
Netgrimoire/Vault-Grimoire/Backups/Immich-Backup.md
Normal file

@ -0,0 +1,841 @@
---
title: Immich Backup and Restore
description: Immich backup with Kopia
published: true
date: 2026-02-20T04:11:52.181Z
tags:
editor: markdown
dateCreated: 2026-02-14T03:14:32.594Z
---

# Immich Backup and Recovery Guide

## Overview

This document provides comprehensive backup and recovery procedures for the Immich photo server. Since Immich's data is stored on standard filesystems (not ZFS or BTRFS), snapshots are not available, so we rely on Immich's native backup approach combined with Kopia for offsite storage in vaults.

## Quick Reference

### Common Backup Commands

```bash
# Run a manual backup (all components)
/opt/scripts/backup-immich.sh

# Backup just the database
docker exec -t immich_postgres pg_dump --clean --if-exists \
  --dbname=immich --username=postgres | gzip > "/opt/immich-backups/dump.sql.gz"

# List Kopia snapshots (tags are key:value pairs, matching the backup script)
kopia snapshot list --tags immich:tier1-backup

# View backup logs
tail -f /var/log/immich-backup.log
```

### Common Restore Commands

```bash
# Restore database from backup
gunzip < /opt/immich-backups/immich-YYYYMMDD_HHMMSS/dump.sql.gz | \
  docker exec -i immich_postgres psql --username=postgres --dbname=immich

# Restore from Kopia to new server
kopia snapshot list --tags immich:tier1-backup
kopia restore <snapshot-id> /opt/immich-backups/

# Check container status after restore
docker compose ps
docker compose logs -f
```

## Critical Components to Backup

### 1. Docker Compose File
- **Location**: `/opt/immich/docker-compose.yml` (or your installation path)
- **Purpose**: Defines all containers, networks, and volumes
- **Importance**: Critical for recreating the exact container configuration

### 2. Configuration Files
- **Primary Config**: `/opt/immich/.env`
- **Purpose**: Database credentials, upload locations, timezone settings
- **Importance**: Required for proper service initialization

### 3. Database
- **PostgreSQL Data**: Contains all metadata, user accounts, albums, sharing settings, face recognition data, timeline information
- **Container**: `immich_postgres`
- **Database Name**: `immich` (default)
- **User**: `postgres` (default)
- **Backup Method**: `pg_dump` (official Immich recommendation)

### 4. Photo/Video Library
- **Upload Storage**: All original photos and videos uploaded by users
- **Location**: `/srv/immich/library` (per your .env UPLOAD_LOCATION)
- **Size**: Typically the largest component
- **Critical**: This is your actual data - photos cannot be recreated

### 5. Additional Important Data
- **Model Cache**: Docker volume `immich_model-cache` (machine learning models, can be re-downloaded)
- **External Paths**: `/export/photos` and `/srv/NextCloud-AIO` (mounted as read-only in your setup)

## Backup Strategy

### Two-Tier Backup Approach

We use a **two-tier approach** combining Immich's native backup method with Kopia for offsite storage:

1. **Tier 1 (Local)**: Immich database dump + library backup creates consistent, component-level backups
2. **Tier 2 (Offsite)**: Kopia snapshots the local backups and syncs to vaults

#### Why This Approach?

- **Best of both worlds**: Native database dump ensures Immich-specific consistency, Kopia provides deduplication and offsite protection
- **Component-level restore**: Can restore individual components (just database, just library, etc.)
- **Disaster recovery**: Full system restore from Kopia backups on new server
- **Efficient storage**: Kopia's deduplication reduces storage needs for offsite copies

#### Backup Frequency
- **Daily**: Immich backup runs at 2 AM
- **Daily**: Kopia snapshot of backups runs at 3 AM
- **Retention (Local)**: 7 days of Immich backups (managed by script)
- **Retention (Kopia/Offsite)**: 30 daily, 12 weekly, 12 monthly
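
The local 7-day retention is a simple `find -mtime` sweep (the backup script uses exactly this). A self-contained demo on a throwaway directory (GNU `touch -d` assumed):

```shell
# Demo of the 7-day local retention rule on a temp directory.
BACKUP_DIR="$(mktemp -d)"
mkdir -p "$BACKUP_DIR/immich-20260101_020000" "$BACKUP_DIR/immich-20260419_020000"
touch -d '10 days ago' "$BACKUP_DIR/immich-20260101_020000"   # make one dir "old"

KEEP_DAYS=7
find "$BACKUP_DIR" -maxdepth 1 -type d -name "immich-*" -mtime +$KEEP_DAYS -exec rm -rf {} +

ls "$BACKUP_DIR"   # only the recent backup directory remains
```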

### Immich Native Backup Method

Immich's official backup approach uses `pg_dump` for the database:
- Uses `pg_dump` with `--clean --if-exists` flags for consistent database dumps
- Hot backup without stopping PostgreSQL
- Produces compressed `.sql.gz` files
- Database remains available during backup

For the photo/video library, we use a **hybrid approach**:
- **Database**: Backed up locally as `dump.sql.gz` for fast component-level restore
- **Library**: Backed up directly by Kopia (no tar) for optimal deduplication and incremental backups

**Why not tar the library?**
- Kopia deduplicates at the file level - adding 1 photo shouldn't require backing up the entire library again
- Individual file access for selective restore
- Better compression and faster incremental backups
- Lower risk - a corrupted tar loses everything, a corrupted file only affects that file

**Key Features:**
- No downtime required
- Consistent point-in-time snapshot
- Standard PostgreSQL format (portable across systems)
- Efficient incremental backups of photo library
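
One subtlety in the dump pipeline: `pg_dump ... | gzip` reports gzip's exit status, so a failed dump can look successful. The backup script guards against this with bash's `PIPESTATUS`; a minimal demo, using `false` to stand in for a failing `pg_dump`:

```shell
# `false | gzip` exits 0 overall because gzip succeeds;
# PIPESTATUS[0] still exposes the left-hand command's failure.
out="$(mktemp)"
false | gzip > "$out"
rc=${PIPESTATUS[0]}
echo "producer exit code: $rc"   # 1, even though the pipeline "succeeded"
```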

## Setting Up Immich Backups

### Prerequisite

Make sure you are connected to the Kopia repository:

```bash
sudo kopia repository connect server \
  --url=https://192.168.5.10:51516 \
  --override-username=admin \
  --server-cert-fingerprint=696a4999f594b5273a174fd7cab677d8dd1628f9b9d27e557daa87103ee064b2
```

#### Step 1: Configure Backup Location

Set the backup destination:

```bash
# Create the backup directory
mkdir -p /opt/immich-backups
chown -R root:root /opt/immich-backups
chmod 755 /opt/immich-backups
```

#### Step 2: Manual Backup Commands

```bash
cd /opt/immich

# Backup database using Immich's recommended method
docker exec -t immich_postgres pg_dump \
  --clean \
  --if-exists \
  --dbname=immich \
  --username=postgres \
  | gzip > "/opt/immich-backups/dump.sql.gz"

# Backup configuration files
cp docker-compose.yml /opt/immich-backups/
cp .env /opt/immich-backups/

# Backup library with Kopia (no tar - better deduplication)
kopia snapshot create /srv/immich/library \
  --tags immich:library \
  --description "Immich library manual backup"
```

**What gets created:**
- Local backup directory: `/opt/immich-backups/immich-YYYYMMDD_HHMMSS/` (matching the script's `date +%Y%m%d_%H%M%S` stamp)
- Contains: `dump.sql.gz` (database), config files
- Kopia snapshots:
  - `/opt/immich-backups` (database + config)
  - `/srv/immich/library` (photos/videos, no tar)
  - `/opt/immich` (installation directory)
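
A quick sketch of the directory naming, reproducing the script's timestamp format:

```shell
# Reproduce the backup-directory naming used by backup-immich.sh.
BACKUP_DATE=$(date +%Y%m%d_%H%M%S)
DEMO_DIR="$(mktemp -d)"
mkdir -p "$DEMO_DIR/immich-$BACKUP_DATE"
ls "$DEMO_DIR"   # e.g. immich-20260214_031432
```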

#### Step 3: Automated Backup Script

Create `/opt/scripts/backup-immich.sh`:

```bash
#!/bin/bash

# Immich Automated Backup Script
# This creates Immich backups, then snapshots them with Kopia for offsite storage

set -e

BACKUP_DATE=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/immich-backup.log"
IMMICH_DIR="/opt/immich"
BACKUP_DIR="/opt/immich-backups"
KEEP_DAYS=7

# Database credentials from .env
DB_USERNAME="postgres"
DB_DATABASE_NAME="immich"
POSTGRES_CONTAINER="immich_postgres"

echo "[${BACKUP_DATE}] ========================================" | tee -a "$LOG_FILE"
echo "[${BACKUP_DATE}] Starting Immich backup process" | tee -a "$LOG_FILE"

# Step 1: Run Immich database backup using official method
echo "[${BACKUP_DATE}] Running Immich database backup..." | tee -a "$LOG_FILE"

cd "$IMMICH_DIR"

# Create backup directory with timestamp
mkdir -p "${BACKUP_DIR}/immich-${BACKUP_DATE}"

# Backup database using Immich's recommended method
docker exec -t ${POSTGRES_CONTAINER} pg_dump \
    --clean \
    --if-exists \
    --dbname=${DB_DATABASE_NAME} \
    --username=${DB_USERNAME} \
    | gzip > "${BACKUP_DIR}/immich-${BACKUP_DATE}/dump.sql.gz"

BACKUP_EXIT=${PIPESTATUS[0]}

if [ $BACKUP_EXIT -ne 0 ]; then
    echo "[${BACKUP_DATE}] ERROR: Immich database backup failed with exit code ${BACKUP_EXIT}" | tee -a "$LOG_FILE"
    exit 1
fi

echo "[${BACKUP_DATE}] Immich database backup completed successfully" | tee -a "$LOG_FILE"

# Step 2: Verify library location exists (Kopia will backup directly, no tar needed)
echo "[${BACKUP_DATE}] Verifying library location..." | tee -a "$LOG_FILE"

# Get the upload location from docker-compose volumes
UPLOAD_LOCATION="/srv/immich/library"

if [ -d "${UPLOAD_LOCATION}" ]; then
    LIBRARY_SIZE=$(du -sh ${UPLOAD_LOCATION} | cut -f1)
    echo "[${BACKUP_DATE}] Library location verified: ${UPLOAD_LOCATION} (${LIBRARY_SIZE})" | tee -a "$LOG_FILE"
    echo "[${BACKUP_DATE}] Kopia will backup library files directly (no tar, better deduplication)" | tee -a "$LOG_FILE"
else
    echo "[${BACKUP_DATE}] WARNING: Upload location not found at ${UPLOAD_LOCATION}" | tee -a "$LOG_FILE"
fi

# Step 3: Backup configuration files
echo "[${BACKUP_DATE}] Backing up configuration files..." | tee -a "$LOG_FILE"

cp "${IMMICH_DIR}/docker-compose.yml" "${BACKUP_DIR}/immich-${BACKUP_DATE}/"
cp "${IMMICH_DIR}/.env" "${BACKUP_DIR}/immich-${BACKUP_DATE}/"

echo "[${BACKUP_DATE}] Configuration backup completed" | tee -a "$LOG_FILE"

# Step 4: Clean up old backups
echo "[${BACKUP_DATE}] Cleaning up backups older than ${KEEP_DAYS} days..." | tee -a "$LOG_FILE"

find "${BACKUP_DIR}" -maxdepth 1 -type d -name "immich-*" -mtime +${KEEP_DAYS} -exec rm -rf {} \; 2>&1 | tee -a "$LOG_FILE"

echo "[${BACKUP_DATE}] Local backup cleanup completed" | tee -a "$LOG_FILE"

# Step 5: Create Kopia snapshot of backup directory
echo "[${BACKUP_DATE}] Creating Kopia snapshot..." | tee -a "$LOG_FILE"

kopia snapshot create "${BACKUP_DIR}" \
    --tags immich:tier1-backup \
    --description "Immich backup ${BACKUP_DATE}" \
    2>&1 | tee -a "$LOG_FILE"

KOPIA_EXIT=${PIPESTATUS[0]}

if [ $KOPIA_EXIT -ne 0 ]; then
    echo "[${BACKUP_DATE}] WARNING: Kopia snapshot failed with exit code ${KOPIA_EXIT}" | tee -a "$LOG_FILE"
    echo "[${BACKUP_DATE}] Local Immich backup exists but offsite copy may be incomplete" | tee -a "$LOG_FILE"
    exit 2
fi

echo "[${BACKUP_DATE}] Kopia snapshot completed successfully" | tee -a "$LOG_FILE"

# Step 6: Backup the library directly with Kopia (better deduplication than tar)
echo "[${BACKUP_DATE}] Creating Kopia snapshot of library..." | tee -a "$LOG_FILE"

if [ -d "${UPLOAD_LOCATION}" ]; then
    kopia snapshot create "${UPLOAD_LOCATION}" \
        --tags immich:library \
        --description "Immich library ${BACKUP_DATE}" \
        2>&1 | tee -a "$LOG_FILE"

    KOPIA_LIB_EXIT=${PIPESTATUS[0]}

    if [ $KOPIA_LIB_EXIT -ne 0 ]; then
        echo "[${BACKUP_DATE}] WARNING: Kopia library snapshot failed" | tee -a "$LOG_FILE"
    else
        echo "[${BACKUP_DATE}] Library snapshot completed successfully" | tee -a "$LOG_FILE"
    fi
fi

# Step 7: Also backup the Immich installation directory (configs, compose files)
#echo "[${BACKUP_DATE}] Backing up Immich installation directory..." | tee -a "$LOG_FILE"

#kopia snapshot create "${IMMICH_DIR}" \
#    --tags immich:config \
#    --description "Immich config ${BACKUP_DATE}" \
#    2>&1 | tee -a "$LOG_FILE"

echo "[${BACKUP_DATE}] Backup process completed successfully" | tee -a "$LOG_FILE"
echo "[${BACKUP_DATE}] ========================================" | tee -a "$LOG_FILE"

# Optional: Send notification on completion
# Add your notification method here (email, webhook, etc.)
```

Make it executable:

```bash
chmod +x /opt/scripts/backup-immich.sh
```

Add to crontab (daily at 2 AM):

```bash
# Edit root's crontab
crontab -e

# Add this line:
0 2 * * * /opt/scripts/backup-immich.sh 2>&1 | logger -t immich-backup
```

### Offsite Backup to Vaults

After local Kopia snapshots are created, they sync to your offsite vaults automatically through Kopia's repository configuration.

## Recovery Procedures

### Understanding Two Recovery Methods

We have **two restore methods** depending on the scenario:

1. **Local Restore** (Preferred): For component-level or same-server recovery
2. **Kopia Full Restore**: For complete disaster recovery to a new server

### Method 1: Local Restore (Recommended)

Use this method when:
- Restoring on the same/similar server
- Restoring specific components (just database, just library, etc.)
- Recovering from local Immich backups

#### Full System Restore

```bash
cd /opt/immich

# Stop Immich
docker compose down

# List available backups
ls -lh /opt/immich-backups/

# Choose a database backup
BACKUP_PATH="/opt/immich-backups/immich-YYYYMMDD_HHMMSS"

# Start only the database so psql has something to connect to
docker compose up -d database
sleep 10

# Restore database
gunzip < ${BACKUP_PATH}/dump.sql.gz | \
  docker compose exec -T database psql --username=postgres --dbname=immich

# Restore library from Kopia
kopia snapshot list --tags immich:library
kopia restore <library-snapshot-id> /srv/immich/library

# Fix permissions
chown -R 1000:1000 /srv/immich/library

# Restore configuration (review changes first)
cp ${BACKUP_PATH}/.env .env.restored
cp ${BACKUP_PATH}/docker-compose.yml docker-compose.yml.restored

# Start Immich
docker compose up -d

# Monitor logs
docker compose logs -f
```

#### Example: Restore Only Database

```bash
cd /opt/immich

# Stop Immich
docker compose down

# Start only database
docker compose up -d database
sleep 10

# Restore database from backup
BACKUP_PATH="/opt/immich-backups/immich-YYYYMMDD_HHMMSS"
gunzip < ${BACKUP_PATH}/dump.sql.gz | \
  docker compose exec -T database psql --username=postgres --dbname=immich

# Start all services
docker compose down
docker compose up -d

# Verify
docker compose logs -f
```

#### Example: Restore Only Library

```bash
cd /opt/immich

# Stop Immich
docker compose down

# Restore library from Kopia
kopia snapshot list --tags immich:library
kopia restore <library-snapshot-id> /srv/immich/library

# Fix permissions
chown -R 1000:1000 /srv/immich/library

# Start Immich
docker compose up -d
```
|
||||
|
||||
### Method 2: Complete Server Rebuild (Kopia Restore)

Use this when recovering to a completely new server or when local backups are unavailable.

#### Step 1: Prepare New Server

```bash
# Update system
apt update && apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com | sh
systemctl enable docker
systemctl start docker

# Install Docker Compose
apt install docker-compose-plugin -y

# Install Kopia
curl -s https://kopia.io/signing-key | sudo gpg --dearmor -o /usr/share/keyrings/kopia-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/kopia-keyring.gpg] https://packages.kopia.io/apt/ stable main" | sudo tee /etc/apt/sources.list.d/kopia.list
apt update
apt install kopia -y

# Create directory structure
mkdir -p /opt/immich
mkdir -p /opt/immich-backups
mkdir -p /srv/immich/library
mkdir -p /srv/immich/postgres
```
#### Step 2: Restore Kopia Repository

```bash
# Connect to your offsite vault
kopia repository connect server \
  --url=https://192.168.5.10:51516 \
  --override-username=admin \
  --server-cert-fingerprint=696a4999f594b5273a174fd7cab677d8dd1628f9b9d27e557daa87103ee064b2

# List available snapshots
kopia snapshot list --tags immich
```
#### Step 3: Restore Configuration

```bash
# Find and restore the config snapshot
kopia snapshot list --tags config

# Restore to the Immich directory
kopia restore <snapshot-id> /opt/immich/

# Verify critical files
ls -la /opt/immich/.env
ls -la /opt/immich/docker-compose.yml
```
#### Step 4: Restore Immich Backups Directory

```bash
# Restore the entire backup directory from Kopia
kopia snapshot list --tags tier1-backup

# Restore the most recent backup
kopia restore <snapshot-id> /opt/immich-backups/

# Verify backups were restored
ls -la /opt/immich-backups/
```
#### Step 5: Restore Database and Library

```bash
cd /opt/immich

# Find the most recent backup
LATEST_BACKUP=$(ls -td /opt/immich-backups/immich-* | head -1)
echo "Restoring from: $LATEST_BACKUP"

# Start database container
docker compose up -d database
sleep 30

# Restore database
gunzip < ${LATEST_BACKUP}/dump.sql.gz | \
  docker compose exec -T database psql --username=postgres --dbname=immich

# Restore library from Kopia
kopia snapshot list --tags library
kopia restore <library-snapshot-id> /srv/immich/library

# Fix permissions
chown -R 1000:1000 /srv/immich/library
```
#### Step 6: Start and Verify Immich

```bash
cd /opt/immich

# Pull latest images (or use versions from backup if preferred)
docker compose pull

# Start all services
docker compose up -d

# Monitor logs
docker compose logs -f
```
#### Step 7: Post-Restore Verification

```bash
# Check container status
docker compose ps

# Test web interface
curl -I http://localhost:2283

# Verify database
docker compose exec database psql -U postgres -d immich -c "SELECT COUNT(*) FROM users;"

# Check library storage
ls -lah /srv/immich/library/
```
### Scenario 2: Restore Individual User's Photos

To restore a single user's library without affecting others:

**Option A: Using Kopia Mount (Recommended)**

```bash
# Mount the Kopia snapshot
kopia snapshot list --tags library
mkdir -p /mnt/kopia-library
kopia mount <library-snapshot-id> /mnt/kopia-library &

# Find the user's directory (using user ID from database)
# User libraries are typically in: library/{user-uuid}/
USER_UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

# Copy user's data back
rsync -av /mnt/kopia-library/${USER_UUID}/ \
  /srv/immich/library/${USER_UUID}/

# Fix permissions
chown -R 1000:1000 /srv/immich/library/${USER_UUID}/

# Unmount (kopia mounts are removed with the regular umount command)
umount /mnt/kopia-library

# Restart Immich to recognize changes
cd /opt/immich
docker compose restart immich-server
```
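To find the UUID used above, one option is to list users straight from the database. The `users` table is the same one queried during post-restore verification; verify the column names against your Immich version:

```bash
cd /opt/immich

# List user UUIDs and their e-mail addresses
docker compose exec database \
  psql -U postgres -d immich -c "SELECT id, email FROM users;"
```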
**Option B: Selective Kopia Restore**

```bash
cd /opt/immich
docker compose down

# Restore just the specific user's directory
kopia snapshot list --tags library
USER_UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

# Restore only that sub-path of the snapshot (kopia accepts <snapshot-id>/<subpath>)
kopia restore <library-snapshot-id>/${USER_UUID} /srv/immich/library/${USER_UUID}

# Fix permissions
chown -R 1000:1000 /srv/immich/library/${USER_UUID}/

# Start Immich
docker compose up -d
```
### Scenario 3: Database Recovery Only

If only the database is corrupted but library data is intact:

```bash
cd /opt/immich

# Stop Immich
docker compose down

# Start only database
docker compose up -d database
sleep 30

# Restore from most recent backup
LATEST_BACKUP=$(ls -td /opt/immich-backups/immich-* | head -1)
gunzip < ${LATEST_BACKUP}/dump.sql.gz | \
  docker compose exec -T database psql --username=postgres --dbname=immich

# Start all services
docker compose down
docker compose up -d

# Verify
docker compose logs -f
```
### Scenario 4: Configuration Recovery Only

If you only need to restore configuration files:

```bash
cd /opt/immich

# Find the most recent backup
LATEST_BACKUP=$(ls -td /opt/immich-backups/immich-* | head -1)

# Stop Immich
docker compose down

# Backup current config (just in case)
cp .env .env.pre-restore
cp docker-compose.yml docker-compose.yml.pre-restore

# Restore config from backup
cp ${LATEST_BACKUP}/.env ./
cp ${LATEST_BACKUP}/docker-compose.yml ./

# Restart
docker compose up -d
```
## Verification and Testing

### Regular Backup Verification

Perform monthly restore tests to ensure backups are valid:

```bash
# Test restore to temporary location
mkdir -p /tmp/backup-test
kopia snapshot list --tags immich
kopia restore <snapshot-id> /tmp/backup-test/

# Verify files exist and are readable
ls -lah /tmp/backup-test/
gunzip < /tmp/backup-test/immich-*/dump.sql.gz | head -100

# Cleanup
rm -rf /tmp/backup-test/
```
### Backup Monitoring Script

Create `/opt/scripts/check-immich-backup.sh`:

```bash
#!/bin/bash

# Check last backup age
LAST_BACKUP=$(ls -td /opt/immich-backups/immich-* 2>/dev/null | head -1)

if [ -z "$LAST_BACKUP" ]; then
    echo "WARNING: No Immich backups found"
    exit 1
fi

BACKUP_DATE=$(basename "$LAST_BACKUP" | sed 's/immich-//')
BACKUP_EPOCH=$(date -d "${BACKUP_DATE:0:8} ${BACKUP_DATE:9:2}:${BACKUP_DATE:11:2}:${BACKUP_DATE:13:2}" +%s 2>/dev/null)

if [ -z "$BACKUP_EPOCH" ]; then
    echo "WARNING: Cannot parse backup date"
    exit 1
fi

NOW=$(date +%s)
AGE_HOURS=$(( ($NOW - $BACKUP_EPOCH) / 3600 ))

if [ $AGE_HOURS -gt 26 ]; then
    echo "WARNING: Last Immich backup is $AGE_HOURS hours old"
    # Send alert (email, Slack, etc.)
    exit 1
else
    echo "OK: Last backup $AGE_HOURS hours ago"
fi

# Check Kopia snapshots
KOPIA_LAST=$(kopia snapshot list --tags immich --json 2>/dev/null | jq -r '.[0].startTime' 2>/dev/null)

if [ -n "$KOPIA_LAST" ]; then
    echo "Last Kopia snapshot: $KOPIA_LAST"
else
    echo "WARNING: Cannot verify Kopia snapshots"
fi
```
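Make the script executable and schedule it, following the same cron convention used elsewhere in this guide (the hourly cadence is a suggestion; adjust to taste):

```bash
chmod +x /opt/scripts/check-immich-backup.sh

# Edit root's crontab
crontab -e

# Add this line to run the check hourly and log via syslog:
0 * * * * /opt/scripts/check-immich-backup.sh 2>&1 | logger -t immich-backup-check
```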
## Disaster Recovery Checklist

When disaster strikes, follow this checklist:

- [ ] Confirm scope of failure (server, storage, specific component)
- [ ] Gather server information (hostname, IP, DNS records)
- [ ] Access offsite backup vault
- [ ] Provision new server (if needed)
- [ ] Install Docker and dependencies
- [ ] Connect to Kopia repository
- [ ] Restore configurations first
- [ ] Restore database
- [ ] Restore library data
- [ ] Start services and verify
- [ ] Test photo viewing and uploads
- [ ] Verify user accounts and albums
- [ ] Update DNS records if needed
- [ ] Document any issues encountered
- [ ] Update recovery procedures based on experience
## Important Notes

1. **External Mounts**: Your setup has `/export/photos` and `/srv/NextCloud-AIO` mounted as external read-only sources. These are not backed up by this script - ensure they have their own backup strategy.

2. **Database Password**: The default database password in your .env is `postgres`. Change this to a secure random password for production use.
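A quick way to generate a strong replacement password, assuming `openssl` is installed (the `DB_PASSWORD` variable name is Immich's default; confirm against your `.env`):

```bash
# 24 random bytes, base64-encoded, with shell-unfriendly characters stripped
NEW_PW=$(openssl rand -base64 24 | tr -d '/+=')
echo "DB_PASSWORD=${NEW_PW}"
```

Remember that editing `.env` alone is not enough: the password must also be changed inside PostgreSQL itself.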
3. **Permissions**: Library files should be owned by UID 1000:1000 for Immich to access them properly:
   ```bash
   chown -R 1000:1000 /srv/immich/library
   ```

4. **Testing**: Always test recovery procedures in a lab environment before trusting them in production.

5. **Documentation**: Keep this guide and server details in a separate location (printed copy, password manager, etc.).

6. **Retention Policy**: Review Kopia retention settings periodically to balance storage costs with recovery needs.
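Retention settings can be audited and adjusted with Kopia's policy commands. The path below is this guide's backup directory; point the command at whichever sources you snapshot, and treat the numbers as examples:

```bash
# Show the effective retention policy for the backup directory
kopia policy show /opt/immich-backups

# Example: keep 30 daily, 12 weekly, and 12 monthly snapshots
kopia policy set /opt/immich-backups \
  --keep-daily 30 --keep-weekly 12 --keep-monthly 12
```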
## Backup Architecture Notes

### Why Two Backup Layers?

**Immich Native Backups** (Tier 1):
- ✅ Uses official Immich backup method (`pg_dump`)
- ✅ Fast, component-aware backups
- ✅ Selective restore (can restore just database or just library)
- ✅ Standard PostgreSQL format (portable)
- ❌ No deduplication (full copies each time)
- ❌ Limited to local storage initially

**Kopia Snapshots** (Tier 2):
- ✅ Deduplication and compression
- ✅ Efficient offsite replication to vaults
- ✅ Point-in-time recovery across multiple versions
- ✅ Disaster recovery to completely new infrastructure
- ❌ Less component-aware (treats as files)
- ❌ Slower for granular component restore

### Storage Efficiency

Using this two-tier approach:
- **Local**: Database backups (~7 days retention, relatively small)
- **Kopia**: Database backups + library (efficient deduplication)

**Why library goes directly to Kopia without tar:**

Example with 500GB library, adding 10GB photos/month:

**With tar approach:**
- Month 1: Backup 500GB tar
- Month 2: Add 10GB photos → Entire 510GB tar changes → Backup 510GB
- Month 3: Add 10GB photos → Entire 520GB tar changes → Backup 520GB
- **Total storage needed**: 500 + 510 + 520 = 1,530GB

**Without tar (Kopia direct):**
- Month 1: Backup 500GB
- Month 2: Add 10GB photos → Kopia only backs up the 10GB new files
- Month 3: Add 10GB photos → Kopia only backs up the 10GB new files
- **Total storage needed**: 500 + 10 + 10 = 520GB

**Savings**: ~66% reduction in storage and backup time!
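The arithmetic can be checked in a few lines of shell; the figures are the hypothetical 500GB/10GB example above, not measurements:

```bash
# Hypothetical figures: 500GB initial library, +10GB of photos per month
initial=500
monthly=10

# tar approach: each monthly run re-uploads the whole (growing) archive
tar_total=$(( initial + (initial + monthly) + (initial + 2 * monthly) ))

# Kopia direct: one full pass, then only the new files each month
kopia_total=$(( initial + monthly + monthly ))

echo "tar total:   ${tar_total}GB"
echo "kopia total: ${kopia_total}GB"
echo "savings:     $(( (tar_total - kopia_total) * 100 / tar_total ))%"
```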
This is why we:
- Keep database dumps local (small, fast component restore)
- Let Kopia handle library directly (efficient, incremental, deduplicated)
### Compression and Deduplication

**Database backups** use `gzip` compression:
- Typically 80-90% compression ratio for SQL dumps
- Small enough to keep local copies

**Library backups** use Kopia's built-in compression and deduplication:
- Photos (JPEG/HEIC): Already compressed, Kopia skips re-compression
- Videos: Already compressed, minimal additional compression
- RAW files: Some compression possible
- **Deduplication**: If you upload the same photo twice, Kopia stores it once
- **Block-level dedup**: Even modified photos share unchanged blocks

This is far more efficient than tar + gzip, which would:
- Compress already-compressed photos (wasted CPU, minimal benefit)
- Store entire archive even if only 1 file changed
- Prevent deduplication across backups
## Additional Resources

- [Immich Official Backup Documentation](https://immich.app/docs/administration/backup-and-restore)
- [Kopia Documentation](https://kopia.io/docs/)
- [Docker Volume Backup Best Practices](https://docs.docker.com/storage/volumes/#back-up-restore-or-migrate-data-volumes)
- [PostgreSQL pg_dump Documentation](https://www.postgresql.org/docs/current/app-pgdump.html)

## Revision History

| Date | Version | Changes |
|------|---------|---------|
| 2026-02-13 | 1.0 | Initial documentation - two-tier backup strategy using Immich's native backup method |

---

**Last Updated**: February 13, 2026
**Maintained By**: System Administrator
**Review Schedule**: Quarterly
Netgrimoire/Vault-Grimoire/Backups/MailCow-Backup.md
---
title: Mailcow Backup and Restore Strategy
description: Mailcow backup
published: true
date: 2026-02-20T04:15:25.924Z
tags:
editor: markdown
dateCreated: 2026-02-11T01:20:59.127Z
---

# Mailcow Backup and Recovery Guide

## Overview

This document provides comprehensive backup and recovery procedures for the Mailcow email server. Since Mailcow is **not running on ZFS or BTRFS**, filesystem snapshots are not available, so we rely on Mailcow's native backup script combined with Kopia for offsite storage in vaults.
## Quick Reference

### Common Backup Commands

```bash
# Run a manual backup (all components)
cd /opt/mailcow-dockerized
MAILCOW_BACKUP_LOCATION=/opt/mailcow-backups \
  ./helper-scripts/backup_and_restore.sh backup all --delete-days 7

# Backup with multithreading (faster)
THREADS=4 MAILCOW_BACKUP_LOCATION=/opt/mailcow-backups \
  ./helper-scripts/backup_and_restore.sh backup all --delete-days 7

# List Kopia snapshots
kopia snapshot list --tags mailcow

# View backup logs
tail -f /var/log/mailcow-backup.log
```

### Common Restore Commands

```bash
# Restore using mailcow native script (interactive)
cd /opt/mailcow-dockerized
./helper-scripts/backup_and_restore.sh restore

# Restore from Kopia to new server
kopia snapshot list --tags tier1-backup
kopia restore <snapshot-id> /opt/mailcow-backups/

# Check container status after restore
docker compose ps
docker compose logs -f
```
## Critical Components to Backup

### 1. Docker Compose File
- **Location**: `/opt/mailcow-dockerized/docker-compose.yml` (or your installation path)
- **Purpose**: Defines all containers, networks, and volumes
- **Importance**: Critical for recreating the exact container configuration

### 2. Configuration Files
- **Primary Config**: `/opt/mailcow-dockerized/mailcow.conf`
- **Additional Configs**:
  - `/opt/mailcow-dockerized/data/conf/` (all subdirectories)
  - Custom SSL certificates if not using Let's Encrypt
  - Any override files (e.g., `docker-compose.override.yml`)

### 3. Database
- **MySQL/MariaDB Data**: Contains all mailbox configurations, users, domains, aliases, settings
- **Docker Volume**: `mailcowdockerized_mysql-vol`
- **Container Path**: `/var/lib/mysql`

### 4. Email Data
- **Maildir Storage**: All actual email messages
- **Docker Volume**: `mailcowdockerized_vmail-vol`
- **Container Path**: `/var/vmail`
- **Size**: Typically the largest component

### 5. Additional Important Data
- **Redis Data**: `mailcowdockerized_redis-vol` (cache and sessions)
- **Rspamd Data**: `mailcowdockerized_rspamd-vol` (spam learning)
- **Crypt Data**: `mailcowdockerized_crypt-vol` (if using mailbox encryption)
- **Postfix Queue**: `mailcowdockerized_postfix-vol` (queued/deferred mail)
## Backup Strategy

### Two-Tier Backup Approach

We use a **two-tier approach** combining Mailcow's native backup script with Kopia for offsite storage:

1. **Tier 1 (Local)**: Mailcow's `backup_and_restore.sh` script creates consistent, component-level backups
2. **Tier 2 (Offsite)**: Kopia snapshots the local backups and syncs to vaults

#### Why This Approach?

- **Best of both worlds**: Native script ensures mailcow-specific consistency, Kopia provides deduplication and offsite protection
- **Component-level restore**: Can restore individual components (just vmail, just mysql, etc.) using mailcow script
- **Disaster recovery**: Full system restore from Kopia backups on new server
- **Efficient storage**: Kopia's deduplication reduces storage needs for offsite copies

#### Backup Frequency
- **Daily**: Mailcow native backup runs at 2 AM
- **Daily**: Kopia snapshot of backups runs at 3 AM
- **Retention (Local)**: 7 days of mailcow backups (managed by script)
- **Retention (Kopia/Offsite)**: 30 daily, 12 weekly, 12 monthly

### Mailcow Native Backup Script

Mailcow includes `/opt/mailcow-dockerized/helper-scripts/backup_and_restore.sh` which handles:
- **vmail**: Email data (mailboxes)
- **mysql**: Database (using mariabackup for consistency)
- **redis**: Redis database
- **rspamd**: Spam filter learning data
- **crypt**: Encryption data
- **postfix**: Mail queue

**Key Features:**
- Uses `mariabackup` (hot backup without stopping MySQL)
- Supports multithreading for faster backups
- Architecture-aware (handles x86/ARM differences)
- Built-in cleanup with `--delete-days` parameter
- Creates compressed archives (.tar.zst or .tar.gz)
### Setting Up Mailcow Backups

#### Prerequisite

Make sure the server is already connected to the Kopia repository:

```bash
sudo kopia repository connect server \
  --url=https://192.168.5.10:51516 \
  --override-username=admin \
  --server-cert-fingerprint=696a4999f594b5273a174fd7cab677d8dd1628f9b9d27e557daa87103ee064b2
```

#### Step 1: Configure Backup Location

Set the backup destination via environment variable or in mailcow.conf:

```bash
# Option 1: Set environment variable (preferred for automation)
export MAILCOW_BACKUP_LOCATION="/opt/mailcow-backups"

# Option 2: Add to cron job directly (shown in automated script below)
```

Create the backup directory:
```bash
mkdir -p /opt/mailcow-backups
chown -R root:root /opt/mailcow-backups
chmod 777 /opt/mailcow-backups
```
#### Step 2: Manual Backup Commands

```bash
cd /opt/mailcow-dockerized

# Backup all components, delete backups older than 7 days
MAILCOW_BACKUP_LOCATION=/opt/mailcow-backups \
  ./helper-scripts/backup_and_restore.sh backup all --delete-days 7

# Backup with multithreading (faster for large mailboxes)
THREADS=4 MAILCOW_BACKUP_LOCATION=/opt/mailcow-backups \
  ./helper-scripts/backup_and_restore.sh backup all --delete-days 7

# Backup specific components only
MAILCOW_BACKUP_LOCATION=/opt/mailcow-backups \
  ./helper-scripts/backup_and_restore.sh backup vmail mysql --delete-days 7
```

**What gets created:**
- Backup directory: `/opt/mailcow-backups/mailcow-YYYY-MM-DD-HH-MM-SS/`
- Contains: `.tar.zst` compressed archives for each component
- Plus: `mailcow.conf` copy for restore reference
#### Step 3: Automated Backup Script

Create `/opt/scripts/backup-mailcow.sh`:

```bash
#!/bin/bash

# Mailcow Automated Backup Script
# This creates mailcow native backups, then snapshots them with Kopia for offsite storage

set -e

BACKUP_DATE=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/mailcow-backup.log"
MAILCOW_DIR="/opt/mailcow-dockerized"
BACKUP_DIR="/opt/mailcow-backups"
THREADS=4   # Adjust based on your CPU cores
KEEP_DAYS=7 # Keep local mailcow backups for 7 days

echo "[${BACKUP_DATE}] ========================================" | tee -a "$LOG_FILE"
echo "[${BACKUP_DATE}] Starting Mailcow backup process" | tee -a "$LOG_FILE"

# Step 1: Run mailcow's native backup script
echo "[${BACKUP_DATE}] Running mailcow native backup..." | tee -a "$LOG_FILE"

cd "$MAILCOW_DIR"

# Run the backup with multithreading
THREADS=${THREADS} MAILCOW_BACKUP_LOCATION=${BACKUP_DIR} \
  ./helper-scripts/backup_and_restore.sh backup all --delete-days ${KEEP_DAYS} \
  2>&1 | tee -a "$LOG_FILE"

BACKUP_EXIT=${PIPESTATUS[0]}

if [ $BACKUP_EXIT -ne 0 ]; then
    echo "[${BACKUP_DATE}] ERROR: Mailcow backup failed with exit code ${BACKUP_EXIT}" | tee -a "$LOG_FILE"
    exit 1
fi

echo "[${BACKUP_DATE}] Mailcow native backup completed successfully" | tee -a "$LOG_FILE"

# Step 2: Create Kopia snapshot of backup directory
echo "[${BACKUP_DATE}] Creating Kopia snapshot..." | tee -a "$LOG_FILE"

kopia snapshot create "${BACKUP_DIR}" \
  --tags mailcow:tier1-backup \
  --description "Mailcow backup ${BACKUP_DATE}" \
  2>&1 | tee -a "$LOG_FILE"

KOPIA_EXIT=${PIPESTATUS[0]}

if [ $KOPIA_EXIT -ne 0 ]; then
    echo "[${BACKUP_DATE}] WARNING: Kopia snapshot failed with exit code ${KOPIA_EXIT}" | tee -a "$LOG_FILE"
    echo "[${BACKUP_DATE}] Local mailcow backup exists but offsite copy may be incomplete" | tee -a "$LOG_FILE"
    exit 2
fi

echo "[${BACKUP_DATE}] Kopia snapshot completed successfully" | tee -a "$LOG_FILE"

# Step 3: Also backup the mailcow installation directory (configs, compose files)
echo "[${BACKUP_DATE}] Backing up mailcow installation directory..." | tee -a "$LOG_FILE"

kopia snapshot create "${MAILCOW_DIR}" \
  --tags mailcow,config,docker-compose \
  --description "Mailcow config ${BACKUP_DATE}" \
  2>&1 | tee -a "$LOG_FILE"

echo "[${BACKUP_DATE}] Backup process completed successfully" | tee -a "$LOG_FILE"
echo "[${BACKUP_DATE}] ========================================" | tee -a "$LOG_FILE"

# Optional: Send notification on completion
# Add your notification method here (email, webhook, etc.)
```

Make it executable:
```bash
chmod +x /opt/scripts/backup-mailcow.sh
```

Add to crontab (daily at 2 AM):
```bash
# Edit root's crontab
crontab -e

# Add this line:
0 2 * * * /opt/scripts/backup-mailcow.sh 2>&1 | logger -t mailcow-backup
```
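Since the script appends to `/var/log/mailcow-backup.log` indefinitely, a small logrotate rule keeps it bounded. The path and rotation counts below are suggestions:

```bash
cat > /etc/logrotate.d/mailcow-backup <<'EOF'
/var/log/mailcow-backup.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
}
EOF
```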
### Offsite Backup to Vaults

After local Kopia snapshots are created, sync to your offsite vaults:

```bash
# Option 1: Kopia repository sync (if using multiple Kopia repos)
kopia repository sync-to filesystem --path /mnt/vault/mailcow-backup

# Option 2: Rsync to vault
rsync -avz --delete /backup/kopia-repo/ /mnt/vault/mailcow-backup/

# Option 3: Rclone to remote vault
rclone sync /backup/kopia-repo/ vault:mailcow-backup/
```
## Recovery Procedures

### Understanding Two Recovery Methods

We have **two restore methods** depending on the scenario:

1. **Mailcow Native Restore** (Preferred): For component-level or same-server recovery
2. **Kopia Full Restore**: For complete disaster recovery to a new server

### Method 1: Mailcow Native Restore (Recommended)

Use this method when:
- Restoring on the same/similar server
- Restoring specific components (just email, just database, etc.)
- Recovering from local mailcow backups

#### Step 1: List Available Backups

```bash
cd /opt/mailcow-dockerized

# Run the restore script
./helper-scripts/backup_and_restore.sh restore
```

The script will prompt:
```
Backup location (absolute path, starting with /): /opt/mailcow-backups
```

#### Step 2: Select Backup

The script displays available backups:
```
Found project name mailcowdockerized
[ 1 ] - /opt/mailcow-backups/mailcow-2026-02-09-02-00-14/
[ 2 ] - /opt/mailcow-backups/mailcow-2026-02-10-02-00-08/
```

Enter the number of the backup to restore.

#### Step 3: Select Components

Choose what to restore:
```
[ 0 ] - all
[ 1 ] - Crypt data
[ 2 ] - Rspamd data
[ 3 ] - Mail directory (/var/vmail)
[ 4 ] - Redis DB
[ 5 ] - Postfix data
[ 6 ] - SQL DB
```
**Important**: The script will:
- Stop mailcow containers automatically
- Restore selected components
- Handle permissions correctly
- Restart containers when done

#### Example: Restore Only Email Data

```bash
cd /opt/mailcow-dockerized
./helper-scripts/backup_and_restore.sh restore

# When prompted:
# - Backup location: /opt/mailcow-backups
# - Select backup: 2 (most recent)
# - Select component: 3 (Mail directory)
```

#### Example: Restore Database Only

```bash
cd /opt/mailcow-dockerized
./helper-scripts/backup_and_restore.sh restore

# When prompted:
# - Backup location: /opt/mailcow-backups
# - Select backup: 2 (most recent)
# - Select component: 6 (SQL DB)
```

**Note**: For database restore, the script will modify `mailcow.conf` with the database credentials from the backup. Review the changes after restore.
### Method 2: Complete Server Rebuild (Kopia Restore)

Use this when recovering to a completely new server or when local backups are unavailable.

#### Step 1: Prepare New Server

```bash
# Update system
apt update && apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com | sh
systemctl enable docker
systemctl start docker

# Install Docker Compose
apt install docker-compose-plugin -y

# Install Kopia (apt-key is deprecated; use a keyring as in the Immich guide)
curl -s https://kopia.io/signing-key | sudo gpg --dearmor -o /usr/share/keyrings/kopia-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/kopia-keyring.gpg] https://packages.kopia.io/apt/ stable main" | sudo tee /etc/apt/sources.list.d/kopia.list
apt update
apt install kopia -y

# Create directory structure
mkdir -p /opt/mailcow-dockerized
mkdir -p /opt/mailcow-backups/database
```
#### Step 2: Restore Kopia Repository

```bash
# Connect to your offsite vault
# If vault is mounted:
kopia repository connect filesystem --path /mnt/vault/mailcow-backup

# If vault is remote:
kopia repository connect s3 --bucket=your-bucket --access-key=xxx --secret-access-key=xxx

# List available snapshots
kopia snapshot list --tags mailcow
```

#### Step 3: Restore Configuration

```bash
# Find and restore the config snapshot
kopia snapshot list --tags config

# Restore to the Mailcow directory
kopia restore <snapshot-id> /opt/mailcow-dockerized/

# Verify critical files
ls -la /opt/mailcow-dockerized/mailcow.conf
ls -la /opt/mailcow-dockerized/docker-compose.yml
```
#### Step 4: Restore Mailcow Backups Directory

```bash
# Restore the entire backup directory from Kopia
kopia snapshot list --tags tier1-backup

# Restore the most recent backup
kopia restore <snapshot-id> /opt/mailcow-backups/

# Verify backups were restored
ls -la /opt/mailcow-backups/
```
#### Step 5: Run Mailcow Native Restore
|
||||
|
||||
Now use mailcow's built-in restore script:
|
||||
|
||||
```bash
|
||||
cd /opt/mailcow-dockerized
|
||||
|
||||
# Run the restore script
|
||||
./helper-scripts/backup_and_restore.sh restore
|
||||
|
||||
# When prompted:
|
||||
# - Backup location: /opt/mailcow-backups
|
||||
# - Select the most recent backup
|
||||
# - Select [ 0 ] - all (to restore everything)
|
||||
```
|
||||
|
||||
The script will:
|
||||
1. Stop all mailcow containers
|
||||
2. Restore all components (vmail, mysql, redis, rspamd, postfix, crypt)
|
||||
3. Update mailcow.conf with restored database credentials
|
||||
4. Restart all containers
|
||||
|
||||
**Alternative: Manual Restore** (if you prefer more control)
|
||||
|
||||
```bash
|
||||
cd /opt/mailcow-dockerized
|
||||
|
||||
# Start containers to create volumes
|
||||
docker compose up -d --no-start
|
||||
docker compose down
|
||||
|
||||
# Find the most recent backup directory
|
||||
LATEST_BACKUP=$(ls -td /opt/mailcow-backups/mailcow-* | head -1)
|
||||
echo "Restoring from: $LATEST_BACKUP"
|
||||
|
||||
# Extract each component manually
|
||||
cd "$LATEST_BACKUP"
|
||||
|
||||
# Restore vmail (email data)
# Note: extract into the mounted volume (-C /backup); if your archive stores
# absolute paths (mailcow's script can), mount the volume at the archived path
# and add -P to tar instead.
docker run --rm \
  -v mailcowdockerized_vmail-vol:/backup \
  -v "$PWD":/restore \
  debian:bookworm-slim \
  tar --use-compress-program='zstd -d' -xvf /restore/backup_vmail.tar.zst -C /backup

# Restore MySQL
docker run --rm \
  -v mailcowdockerized_mysql-vol:/backup \
  -v "$PWD":/restore \
  mariadb:10.11 \
  tar --use-compress-program='zstd -d' -xvf /restore/backup_mysql.tar.zst -C /backup

# Restore Redis
docker run --rm \
  -v mailcowdockerized_redis-vol:/backup \
  -v "$PWD":/restore \
  debian:bookworm-slim \
  tar --use-compress-program='zstd -d' -xvf /restore/backup_redis.tar.zst -C /backup

# Restore other components similarly (rspamd, postfix, crypt)
# ...

# Copy mailcow.conf from backup
cp "$LATEST_BACKUP/mailcow.conf" /opt/mailcow-dockerized/mailcow.conf
```

#### Step 6: Start and Verify Mailcow

```bash
cd /opt/mailcow-dockerized

# Pull latest images (or use versions from backup if preferred)
docker compose pull

# Start all services
docker compose up -d

# Monitor logs
docker compose logs -f
```

#### Step 7: Post-Restore Verification

```bash
# Check container status
docker compose ps

# Test web interface
curl -I https://mail.yourdomain.com

# Check mail log
docker compose logs -f postfix-mailcow

# Verify database
docker compose exec mysql-mailcow mysql -u root -p$(grep DBROOT mailcow.conf | cut -d'=' -f2) -e "SHOW DATABASES;"

# Check email storage
docker compose exec dovecot-mailcow ls -lah /var/vmail/
```

### Scenario 2: Restore Individual Mailbox

To restore a single user's mailbox without affecting others:

#### Option A: Using Mailcow Backups (If Available)

```bash
cd /opt/mailcow-dockerized

# Temporarily mount the backup
BACKUP_DIR="/opt/mailcow-backups/mailcow-YYYY-MM-DD-HH-MM-SS"

# Extract just the vmail archive to a temporary location
mkdir -p /tmp/vmail-restore
cd "$BACKUP_DIR"
tar --use-compress-program='zstd -d' -xvf backup_vmail.tar.zst -C /tmp/vmail-restore

# Find the user's mailbox
# Structure: /tmp/vmail-restore/var/vmail/domain.com/user/
ls -la /tmp/vmail-restore/var/vmail/yourdomain.com/

# Copy specific mailbox
rsync -av /tmp/vmail-restore/var/vmail/yourdomain.com/user@domain.com/ \
  /var/lib/docker/volumes/mailcowdockerized_vmail-vol/_data/yourdomain.com/user@domain.com/

# Fix permissions
docker run --rm \
  -v mailcowdockerized_vmail-vol:/vmail \
  debian:bookworm-slim \
  chown -R 5000:5000 /vmail/yourdomain.com/user@domain.com/

# Cleanup
rm -rf /tmp/vmail-restore

# Restart Dovecot to recognize changes
docker compose restart dovecot-mailcow
```

#### Option B: Using Kopia Snapshot (If Local Backups Unavailable)

```bash
# Mount the vmail snapshot temporarily
mkdir -p /mnt/restore
kopia mount <vmail-snapshot-id> /mnt/restore

# Find the user's mailbox
# Structure: /mnt/restore/domain.com/user/
ls -la /mnt/restore/yourdomain.com/

# Copy specific mailbox
rsync -av /mnt/restore/yourdomain.com/user@domain.com/ \
  /var/lib/docker/volumes/mailcowdockerized_vmail-vol/_data/yourdomain.com/user@domain.com/

# Fix permissions
chown -R 5000:5000 /var/lib/docker/volumes/mailcowdockerized_vmail-vol/_data/yourdomain.com/user@domain.com/

# Unmount (kopia has no unmount subcommand; use the system umount)
umount /mnt/restore

# Restart Dovecot to recognize changes
docker compose restart dovecot-mailcow
```

### Scenario 3: Database Recovery Only

If only the database is corrupted but email data is intact:

#### Option A: Using Mailcow Native Restore (Recommended)

```bash
cd /opt/mailcow-dockerized

# Run the restore script
./helper-scripts/backup_and_restore.sh restore

# When prompted:
# - Backup location: /opt/mailcow-backups
# - Select the most recent backup
# - Select [ 6 ] - SQL DB (database only)
```

The script will:

1. Stop mailcow
2. Restore the MySQL database from the mariabackup archive
3. Update mailcow.conf with the restored database credentials
4. Restart mailcow

#### Option B: Manual Database Restore from Kopia

If local backups are unavailable:

```bash
cd /opt/mailcow-dockerized

# Stop Mailcow
docker compose down

# Start only MySQL
docker compose up -d mysql-mailcow

# Wait for MySQL
sleep 30

# Restore from Kopia database dump
kopia snapshot list --tags database
kopia restore <snapshot-id> /tmp/db-restore/

# Import the dump
LATEST_DUMP=$(ls -t /tmp/db-restore/mailcow_*.sql | head -1)
docker compose exec -T mysql-mailcow mysql -u root -p$(grep DBROOT mailcow.conf | cut -d'=' -f2) < "$LATEST_DUMP"

# Start all services
docker compose down
docker compose up -d

# Verify
docker compose logs -f
```

### Scenario 4: Configuration Recovery Only

If you only need to restore configuration files:

#### Option A: From Mailcow Backup

```bash
# Find the most recent backup
LATEST_BACKUP=$(ls -td /opt/mailcow-backups/mailcow-* | head -1)

# Stop Mailcow
cd /opt/mailcow-dockerized
docker compose down

# Backup current config (just in case)
cp mailcow.conf mailcow.conf.pre-restore
cp docker-compose.yml docker-compose.yml.pre-restore

# Restore mailcow.conf from backup
cp "$LATEST_BACKUP/mailcow.conf" ./mailcow.conf

# If you also need other config files from data/conf/,
# you would need to extract them from the backup archives

# Restart
docker compose up -d
```

#### Option B: From Kopia Snapshot

```bash
# Restore config snapshot to temporary location
kopia restore <config-snapshot-id> /tmp/mailcow-restore/

# Stop Mailcow
cd /opt/mailcow-dockerized
docker compose down

# Backup current config (just in case)
cp mailcow.conf mailcow.conf.pre-restore
cp docker-compose.yml docker-compose.yml.pre-restore

# Restore specific files
cp /tmp/mailcow-restore/mailcow.conf ./
cp /tmp/mailcow-restore/docker-compose.yml ./
cp -r /tmp/mailcow-restore/data/conf/* ./data/conf/

# Restart
docker compose up -d
```

## Verification and Testing

### Regular Backup Verification

Perform monthly restore tests to ensure backups are valid:

```bash
# Test restore to temporary location
mkdir -p /tmp/backup-test
kopia snapshot list --tags mailcow
kopia restore <snapshot-id> /tmp/backup-test/

# Verify files exist and are readable
ls -lah /tmp/backup-test/
cat /tmp/backup-test/mailcow.conf

# Cleanup
rm -rf /tmp/backup-test/
```

### Backup Monitoring Script

Create `/opt/scripts/check-mailcow-backup.sh`:

```bash
#!/bin/bash

# Check last backup age: pick the newest snapshot by startTime rather than
# relying on the ordering of `kopia snapshot list --json` output
LAST_BACKUP=$(kopia snapshot list --tags mailcow --json | jq -r 'max_by(.startTime).startTime')
LAST_BACKUP_EPOCH=$(date -d "$LAST_BACKUP" +%s)
NOW=$(date +%s)
AGE_HOURS=$(( (NOW - LAST_BACKUP_EPOCH) / 3600 ))

if [ "$AGE_HOURS" -gt 26 ]; then
  echo "WARNING: Last Mailcow backup is $AGE_HOURS hours old"
  # Send alert (email, Slack, etc.)
  exit 1
else
  echo "OK: Last backup $AGE_HOURS hours ago"
fi
```

## Disaster Recovery Checklist

When disaster strikes, follow this checklist:

- [ ] Confirm scope of failure (server, storage, specific component)
- [ ] Gather server information (hostname, IP, DNS records)
- [ ] Access offsite backup vault
- [ ] Provision new server (if needed)
- [ ] Install Docker and dependencies
- [ ] Connect to Kopia repository
- [ ] Restore configurations first
- [ ] Restore database
- [ ] Restore email data
- [ ] Start services and verify
- [ ] Test email sending/receiving
- [ ] Verify webmail access
- [ ] Check DNS records and update if needed
- [ ] Document any issues encountered
- [ ] Update recovery procedures based on experience
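
For the DNS items in the checklist, the key mail records can be re-checked with standard tooling once the new server is up; a sketch with placeholder domains (substitute your own):

```bash
# Placeholders: replace yourdomain.com / mail.yourdomain.com with your own
dig +short MX yourdomain.com           # should list mail.yourdomain.com
dig +short A mail.yourdomain.com       # should match the (possibly new) server IP
dig +short TXT yourdomain.com          # SPF record
dig +short TXT _dmarc.yourdomain.com   # DMARC policy
```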

## Important Notes

1. **DNS**: Keep DNS records documented separately; recovery includes updating DNS records if the server IP changes.

2. **SSL Certificates**: Let's Encrypt certificates are in the backup but may need renewal. Mailcow will handle this automatically.

3. **Permissions**: Docker volumes have specific UID/GID requirements:
- vmail: `5000:5000`
- mysql: `999:999`

4. **Testing**: Always test recovery procedures in a lab environment before trusting them in production.

5. **Documentation**: Keep this guide and server details in a separate location (printed copy, password manager, etc.).

6. **Retention Policy**: Review Kopia retention settings periodically to balance storage costs with recovery needs.
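
If a manual restore leaves volume contents owned by the wrong UID (see the permissions note above), ownership can be reset from a throwaway container; a sketch assuming the default `mailcowdockerized` volume prefix used elsewhere in this guide:

```bash
# Reset vmail ownership to the dovecot vmail user (UID/GID 5000)
docker run --rm \
  -v mailcowdockerized_vmail-vol:/vmail \
  debian:bookworm-slim \
  chown -R 5000:5000 /vmail

# Same pattern for the MySQL volume (UID/GID 999)
docker run --rm \
  -v mailcowdockerized_mysql-vol:/var/lib/mysql \
  debian:bookworm-slim \
  chown -R 999:999 /var/lib/mysql
```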

## Backup Architecture Notes

### Why Two Backup Layers?

**Mailcow Native Backups** (Tier 1):

- ✅ Component-aware (knows about mailcow's structure)
- ✅ Uses mariabackup for consistent MySQL hot backups
- ✅ Fast, selective restore (can restore just one component)
- ✅ Architecture-aware (handles x86/ARM differences)
- ❌ No deduplication (full copies each time)
- ❌ Limited to local storage initially

**Kopia Snapshots** (Tier 2):

- ✅ Deduplication and compression
- ✅ Efficient offsite replication to vaults
- ✅ Point-in-time recovery across multiple versions
- ✅ Disaster recovery to completely new infrastructure
- ❌ Less component-aware (treats backups as files)
- ❌ Slower for granular component restore

### Storage Efficiency

Using this two-tier approach:

- **Local**: Mailcow creates ~7 days of native backups (may be large, but short retention)
- **Offsite**: Kopia deduplicates these backups for long-term vault storage (much smaller)

Example storage calculation (10GB mailbox):

- Local: 7 days × 10GB = ~70GB (before compression)
- Kopia (offsite): First backup ~10GB, subsequent backups only store changes (might be <1GB/day after dedup)
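
To see what the two tiers actually cost on disk, compare the raw local backups with the deduplicated repository; a sketch (paths from this guide; note that `kopia content stats` reports repository-wide totals, so snapshots from other services are included):

```bash
# Raw, undeduplicated size of the local tier
du -sh /opt/mailcow-backups/

# Per-backup breakdown
du -sh /opt/mailcow-backups/mailcow-*/

# What Kopia actually stores after dedup/compression (whole repository)
kopia content stats
```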

### Compression Formats

Mailcow's script creates `.tar.zst` (Zstandard) or `.tar.gz` (gzip) files:

- **Zstandard** (modern): Better compression ratio, faster (recommended)
- **Gzip** (legacy): Wider compatibility with older systems

Verify your backup compression:

```bash
ls -lh /opt/mailcow-backups/mailcow-*/
# Look for .tar.zst (preferred) or .tar.gz
```

### Cross-Architecture Considerations

**Important for ARM/x86 Migration**:

Mailcow's backup script is architecture-aware. When restoring:

- **Rspamd data** cannot be restored across different architectures (x86 ↔ ARM)
- **All other components** (vmail, mysql, redis, postfix, crypt) are architecture-independent

If migrating between architectures:

```bash
# Restore everything EXCEPT rspamd
# Select components individually: vmail, mysql, redis, postfix, crypt
# Skip rspamd - it will rebuild its learning database over time
```

### Testing Your Backups

**Monthly Test Protocol**:

1. **Verify local backups exist**:

   ```bash
   ls -lh /opt/mailcow-backups/
   # Should see recent dated directories
   ```

2. **Verify Kopia snapshots**:

   ```bash
   kopia snapshot list --tags mailcow
   # Should see recent snapshots
   ```

3. **Test restore in lab** (recommended quarterly):
   - Spin up a test VM
   - Restore from Kopia
   - Run mailcow native restore
   - Verify email delivery and webmail access

## Additional Resources

- [Mailcow Official Backup Documentation](https://docs.mailcow.email/backup_restore/b_n_r-backup/)
- [Kopia Documentation](https://kopia.io/docs/)
- [Docker Volume Backup Best Practices](https://docs.docker.com/storage/volumes/#back-up-restore-or-migrate-data-volumes)

## Revision History

| Date | Version | Changes |
|------|---------|---------|
| 2026-02-10 | 1.1 | Integrated mailcow native backup_and_restore.sh script as primary backup method |
| 2026-02-10 | 1.0 | Initial documentation |

---

**Last Updated**: February 10, 2026
**Maintained By**: System Administrator
**Review Schedule**: Quarterly

1151
Netgrimoire/Vault-Grimoire/Backups/Nextcloud-Backup.md
Normal file
1151
Netgrimoire/Vault-Grimoire/Backups/Nextcloud-Backup.md
Normal file
File diff suppressed because it is too large
Load diff
19
Netgrimoire/Vault-Grimoire/Backups/Services-Backup.md
Normal file
19
Netgrimoire/Vault-Grimoire/Backups/Services-Backup.md
Normal file
|
|
@ -0,0 +1,19 @@
|
|||
---
title: Services Backup
description:
published: true
date: 2026-02-20T04:08:15.923Z
tags:
editor: markdown
dateCreated: 2026-02-05T21:28:23.152Z
---

- [Mailcow](/backup-mailcow)
- [Immich](/immich_backup)
- [Nextcloud](/nextcloud_backup)
- kopia
- forgejo
- bitwarden
- wiki
- journalv

567
Netgrimoire/Vault-Grimoire/Backups/Wiki-Backup.md
Normal file
567
Netgrimoire/Vault-Grimoire/Backups/Wiki-Backup.md
Normal file
|
|
@ -0,0 +1,567 @@
|
|||
---
title: Wikijs Backup
description: Backup Wikijs
published: true
date: 2026-02-23T04:35:32.870Z
tags:
editor: markdown
dateCreated: 2026-02-23T04:35:24.121Z
---

# Wiki.js Backup & Recovery

**Service:** Wiki.js (Netgrimoire)
**Stack:** Docker Compose — Wiki.js + PostgreSQL
**Backup Targets:** PostgreSQL database dump, Git content repository, Docker Compose config
**Backup Destinations:** Local vault path → Kopia → offsite vaults

---

## Overview

Wiki.js data lives in two separate places that must be backed up independently:

**PostgreSQL database** — stores page metadata, navigation, user accounts, permissions, page history, assets, and all configuration. This is the critical component for a portable restore. Without it, a new instance has no knowledge of your wiki structure.

**Git content repository** — stores the actual page content in markdown files, synced from Forgejo. This is already mirrored on the VAULT SSD at `/vault/repos/wiki/`. It is inherently redundant as long as Forgejo is healthy, but is included in backups for completeness and offline portability.

**Docker Compose config** — the `docker-compose.yml` and `.env` files needed to recreate the stack.

---

## What Gets Backed Up

| Component | Location | Method | Critical? |
|---|---|---|---|
| PostgreSQL database | Docker volume | `pg_dump` → SQL file | Yes — primary restore target |
| Git content repo | `/vault/repos/wiki/` | Already on VAULT SSD | Yes — page content |
| Docker Compose files | `/opt/stacks/wikijs/` | rsync copy | Yes — stack config |
| Wiki.js data volume | Docker volume | Optional rsync | No — DB + Git covers this |

---

## Backup Strategy

### Tier 1 — Daily Dump to Vault Path

A script runs daily via systemd timer. It produces a portable `pg_dump` SQL file written to `/vault/backups/wiki/`. These local dumps are retained for 14 days.

**Key choices:**

- `--format=plain` — plain SQL, portable to any PostgreSQL version and any host
- `--no-owner` — strips role ownership, so the dump restores cleanly on a new instance with a different postgres user (critical for Pocket Grimoire restores)
- `--no-acl` — strips GRANT/REVOKE statements for the same reason
- No application downtime required — PostgreSQL handles consistent dumps natively
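
Taken together, these flags give a one-off manual dump like the following (container, user, and database names match the backup script in Step 2; adjust to your stack):

```bash
# One-off manual dump using the same portability flags as the daily script
docker exec wikijs_db pg_dump -U wikijs --format=plain --no-owner --no-acl wikijs \
  | gzip > /vault/backups/wiki/wikijs-db-manual.sql.gz
```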
|
||||
|
||||
### Tier 2 — Kopia Snapshot to Offsite Vaults
|
||||
|
||||
After the daily dump completes, Kopia snapshots the entire `/vault/backups/wiki/` directory and replicates to your offsite vaults. Kopia deduplication means only changed blocks are transferred after the first run.
|
||||
|
||||
---
|
||||
|
||||
## Setup
|
||||
|
||||
### Step 0 — Confirm Kopia Repository Exists
|
||||
|
||||
If Kopia is not yet initialized on this host, initialize it first. If you already initialized Kopia for Mailcow or another service, skip this step — all services share the same Kopia repository.
|
||||
|
||||
```bash
|
||||
# Check if repository already exists
|
||||
kopia repository status
|
||||
|
||||
# If not initialized, create it against your vault path
|
||||
kopia repository create filesystem --path=/vault/kopia
|
||||
|
||||
# Connect on subsequent logins if disconnected
|
||||
kopia repository connect filesystem --path=/vault/kopia
|
||||
```
|
||||
|
||||
### Step 1 — Create Backup Directories
|
||||
|
||||
```bash
|
||||
sudo mkdir -p /vault/backups/wiki
|
||||
sudo chown $(whoami):$(whoami) /vault/backups/wiki
|
||||
```
|
||||
|
||||
### Step 2 — Create the Backup Script
|
||||
|
||||
```bash
|
||||
sudo nano /usr/local/sbin/wikijs-backup.sh
|
||||
```
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
# wikijs-backup.sh — Daily Wiki.js backup: pg_dump + git repo + config
|
||||
# Writes to /vault/backups/wiki/, then snapshots with Kopia
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
# ── Configuration ─────────────────────────────────────────────────────────────
|
||||
BACKUP_DIR="/vault/backups/wiki"
|
||||
DATE=$(date +%Y%m%d_%H%M%S)
|
||||
CONTAINER_DB="wikijs_db" # Adjust to your actual container name
|
||||
PG_USER="wikijs"
|
||||
PG_DB="wikijs"
|
||||
WIKI_STACK_DIR="/opt/stacks/wikijs" # Location of docker-compose.yml and .env
|
||||
GIT_REPO_DIR="/vault/repos/wiki" # Git content mirror (already on vault SSD)
|
||||
RETAIN_DAYS=14 # Local dump retention
|
||||
|
||||
LOG="/var/log/wikijs-backup.log"
|
||||
touch "$LOG"
|
||||
|
||||
log() { echo "$(date -Is) $*" | tee -a "$LOG"; }
|
||||
|
||||
# ── Step 1: PostgreSQL dump ────────────────────────────────────────────────────
|
||||
log "Starting Wiki.js PostgreSQL dump..."
|
||||
|
||||
docker exec "$CONTAINER_DB" pg_dump \
|
||||
-U "$PG_USER" \
|
||||
"$PG_DB" \
|
||||
--format=plain \
|
||||
--no-owner \
|
||||
--no-acl \
|
||||
> "${BACKUP_DIR}/wikijs-db-${DATE}.sql"
|
||||
|
||||
gzip "${BACKUP_DIR}/wikijs-db-${DATE}.sql"
|
||||
|
||||
log "PostgreSQL dump complete: wikijs-db-${DATE}.sql.gz"
|
||||
|
||||
# ── Step 2: Docker Compose config backup ──────────────────────────────────────
|
||||
log "Backing up Docker Compose config..."
|
||||
|
||||
CONFIG_BACKUP="${BACKUP_DIR}/wikijs-config-${DATE}.tar.gz"
|
||||
|
||||
tar -czf "$CONFIG_BACKUP" \
|
||||
-C "$(dirname "$WIKI_STACK_DIR")" \
|
||||
"$(basename "$WIKI_STACK_DIR")"
|
||||
|
||||
log "Config backup complete: wikijs-config-${DATE}.tar.gz"
|
||||
|
||||
# ── Step 3: Git repo snapshot (content mirror) ────────────────────────────────
|
||||
# The git repo lives on the VAULT SSD and is already versioned.
|
||||
# We record the current HEAD commit for reference.
|
||||
|
||||
if [ -d "${GIT_REPO_DIR}/.git" ]; then
|
||||
GIT_HEAD=$(git -C "$GIT_REPO_DIR" rev-parse HEAD 2>/dev/null || echo "unknown")
|
||||
echo "Git HEAD at backup time: ${GIT_HEAD}" \
|
||||
> "${BACKUP_DIR}/wikijs-git-ref-${DATE}.txt"
|
||||
log "Git content repo HEAD: ${GIT_HEAD}"
|
||||
else
|
||||
log "WARNING: Git repo not found at ${GIT_REPO_DIR} — skipping git ref"
|
||||
fi
|
||||
|
||||
# ── Step 4: Cleanup old local dumps ───────────────────────────────────────────
|
||||
log "Cleaning up dumps older than ${RETAIN_DAYS} days..."
|
||||
|
||||
find "$BACKUP_DIR" -name "wikijs-db-*.sql.gz" -mtime +"$RETAIN_DAYS" -delete
|
||||
find "$BACKUP_DIR" -name "wikijs-config-*.tar.gz" -mtime +"$RETAIN_DAYS" -delete
|
||||
find "$BACKUP_DIR" -name "wikijs-git-ref-*.txt" -mtime +"$RETAIN_DAYS" -delete
|
||||
|
||||
# ── Step 5: Kopia snapshot ────────────────────────────────────────────────────
|
||||
log "Running Kopia snapshot of /vault/backups/wiki/..."
|
||||
|
||||
kopia snapshot create "$BACKUP_DIR" \
|
||||
--tags "service:wikijs,host:$(hostname -s)"
|
||||
|
||||
log "Kopia snapshot complete."
|
||||
|
||||
# ── Done ──────────────────────────────────────────────────────────────────────
|
||||
log "Wiki.js backup finished successfully."
|
||||
```
|
||||
|
||||
```bash
|
||||
sudo chmod +x /usr/local/sbin/wikijs-backup.sh
|
||||
```
|
||||
|
||||
### Step 3 — Create systemd Service and Timer
|
||||
|
||||
```bash
|
||||
sudo nano /etc/systemd/system/wikijs-backup.service
|
||||
```
|
||||
|
||||
```ini
|
||||
[Unit]
|
||||
Description=Wiki.js daily backup (pg_dump + config + Kopia snapshot)
|
||||
After=docker.service
|
||||
|
||||
[Service]
|
||||
Type=oneshot
|
||||
ExecStart=/usr/local/sbin/wikijs-backup.sh
|
||||
```
|
||||
|
||||
```bash
|
||||
sudo nano /etc/systemd/system/wikijs-backup.timer
|
||||
```
|
||||
|
||||
```ini
|
||||
[Unit]
|
||||
Description=Run Wiki.js backup daily at 02:00
|
||||
|
||||
[Timer]
|
||||
OnCalendar=*-*-* 02:00:00
|
||||
Persistent=true
|
||||
|
||||
[Install]
|
||||
WantedBy=timers.target
|
||||
```
|
||||
|
||||
```bash
|
||||
sudo systemctl daemon-reload
|
||||
sudo systemctl enable wikijs-backup.timer
|
||||
sudo systemctl start wikijs-backup.timer
|
||||
|
||||
# Verify
|
||||
systemctl list-timers | grep wikijs
|
||||
```
|
||||
|
||||
### Step 4 — Configure Kopia Retention Policy
|
||||
|
||||
```bash
|
||||
# Set retention policy for wiki backups
|
||||
kopia policy set /vault/backups/wiki \
|
||||
--keep-daily 14 \
|
||||
--keep-weekly 8 \
|
||||
--keep-monthly 12 \
|
||||
--compression zstd
|
||||
|
||||
# Verify policy
|
||||
kopia policy show /vault/backups/wiki
|
||||
```
|
||||
|
||||
### Step 5 — Test the Backup
|
||||
|
||||
```bash
|
||||
# Run manually first time
|
||||
sudo /usr/local/sbin/wikijs-backup.sh
|
||||
|
||||
# Verify output
|
||||
ls -lh /vault/backups/wiki/
|
||||
# Should show: wikijs-db-YYYYMMDD_HHMMSS.sql.gz
|
||||
# wikijs-config-YYYYMMDD_HHMMSS.tar.gz
|
||||
# wikijs-git-ref-YYYYMMDD_HHMMSS.txt
|
||||
|
||||
# Verify Kopia snapshot was created
|
||||
kopia snapshot list /vault/backups/wiki
|
||||
|
||||
# Check backup log
|
||||
tail -n 30 /var/log/wikijs-backup.log
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verifying Backups
|
||||
|
||||
### Check dump is readable
|
||||
|
||||
```bash
|
||||
# Inspect the SQL dump without extracting
|
||||
zcat /vault/backups/wiki/wikijs-db-YYYYMMDD_HHMMSS.sql.gz | head -50
|
||||
|
||||
# Should show PostgreSQL header, version info, and CREATE TABLE statements
|
||||
```
|
||||
|
||||
### Verify Kopia snapshots
|
||||
|
||||
```bash
|
||||
# List recent snapshots
|
||||
kopia snapshot list /vault/backups/wiki
|
||||
|
||||
# Show snapshot details
|
||||
kopia snapshot list /vault/backups/wiki --all
|
||||
|
||||
# Verify snapshot integrity
|
||||
kopia snapshot verify
|
||||
```
|
||||
|
||||
### Test restore to a temporary database (non-destructive)
|
||||
|
||||
```bash
|
||||
# Start a temporary Postgres container
|
||||
docker run --rm -d \
|
||||
--name wikijs-restore-test \
|
||||
-e POSTGRES_USER=wikijs \
|
||||
-e POSTGRES_PASSWORD=testpassword \
|
||||
-e POSTGRES_DB=wikijs_test \
|
||||
postgres:16-alpine
|
||||
|
||||
# Wait for Postgres to be ready
|
||||
sleep 5
|
||||
|
||||
# Restore dump into test container
|
||||
zcat /vault/backups/wiki/wikijs-db-YYYYMMDD_HHMMSS.sql.gz | \
|
||||
docker exec -i wikijs-restore-test psql -U wikijs -d wikijs_test
|
||||
|
||||
# Verify tables exist
|
||||
docker exec wikijs-restore-test psql -U wikijs -d wikijs_test -c "\dt"
|
||||
|
||||
# Expected output: List of tables (pages, users, pageHistory, assets, etc.)
|
||||
|
||||
# Cleanup test container
|
||||
docker stop wikijs-restore-test
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Recovery Procedures
|
||||
|
||||
### Scenario A — Restore to a New Wiki.js Instance (Any Host)
|
||||
|
||||
This covers full disaster recovery to a fresh server, including Pocket Grimoire.
|
||||
|
||||
**Requirements on the destination host:**
|
||||
- Docker and Docker Compose installed
|
||||
- A `docker-compose.yml` and `.env` ready (from backup or Pocket Grimoire stack)
|
||||
- Sufficient disk space
|
||||
|
||||
**Step 1: Locate the backup**
|
||||
|
||||
```bash
|
||||
# On Netgrimoire, find the dump to restore
|
||||
ls -lh /vault/backups/wiki/
|
||||
|
||||
# Or restore from Kopia
|
||||
kopia snapshot list /vault/backups/wiki
|
||||
kopia restore SNAPSHOT_ID /tmp/wiki-restore/
|
||||
ls /tmp/wiki-restore/
|
||||
```
|
||||
|
||||
**Step 2: Copy dump to the destination host**
|
||||
|
||||
```bash
|
||||
# From Netgrimoire, copy to the destination server
|
||||
scp /vault/backups/wiki/wikijs-db-YYYYMMDD_HHMMSS.sql.gz \
|
||||
user@destination-host:/tmp/
|
||||
|
||||
# Or to Pocket Grimoire
|
||||
scp /vault/backups/wiki/wikijs-db-YYYYMMDD_HHMMSS.sql.gz \
|
||||
user@pocket-grimoire.local:/tmp/
|
||||
```
|
||||
|
||||
**Step 3: Start the database container only**
|
||||
|
||||
On the destination host, start just the database — do not start Wiki.js yet:
|
||||
|
||||
```bash
|
||||
cd /srv/pocket-grimoire/stacks/wikijs # Adjust path as needed
|
||||
|
||||
# Start only the database container
|
||||
docker compose up -d db
|
||||
|
||||
# Wait for healthy status
|
||||
docker compose ps
|
||||
# db should show: healthy
|
||||
```
|
||||
|
||||
**Step 4: Restore the dump**
|
||||
|
||||
```bash
|
||||
# Restore the dump into the running database container
|
||||
zcat /tmp/wikijs-db-YYYYMMDD_HHMMSS.sql.gz | \
|
||||
docker exec -i pocketgrimoire_db psql \
|
||||
-U wikijs \
|
||||
-d wikijs
|
||||
|
||||
# Verify tables restored
|
||||
docker exec pocketgrimoire_db psql -U wikijs -d wikijs -c "\dt"
|
||||
```
|
||||
|
||||
**Step 5: Start Wiki.js**
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
|
||||
# Watch startup logs
|
||||
docker logs -f pocketgrimoire_wikijs
|
||||
# Wait for: "HTTP Server started successfully"
|
||||
```
|
||||
|
||||
**Step 6: Verify**
|
||||
|
||||
Open `http://pocket-grimoire.local:3000` and confirm:
|
||||
- Pages load correctly
|
||||
- Navigation structure is intact
|
||||
- User accounts are present (if you had multiple users)
|
||||
|
||||
**Step 7: Re-sync Git content (if needed)**
|
||||
|
||||
The database knows the page structure, but if the Git content repo isn't present on the new host, import it:
|
||||
|
||||
```bash
|
||||
# In Wiki.js admin panel:
|
||||
# Administration → Storage → Git
|
||||
# Click "Force Sync" or "Import Content"
|
||||
|
||||
# Or copy the repo from VAULT SSD
|
||||
rsync -avP /vault/repos/wiki/ /srv/pocket-grimoire/repos/wiki/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Scenario B — Restore on Existing Netgrimoire Instance
|
||||
|
||||
Use this when the Wiki.js database is corrupted but the host is otherwise healthy.
|
||||
|
||||
**Step 1: Stop Wiki.js (leave database running)**
|
||||
|
||||
```bash
|
||||
cd /opt/stacks/wikijs
|
||||
docker compose stop wikijs
|
||||
```
|
||||
|
||||
**Step 2: Drop and recreate the database**
|
||||
|
||||
```bash
|
||||
docker exec -it wikijs_db psql -U postgres -c "DROP DATABASE wikijs;"
|
||||
docker exec -it wikijs_db psql -U postgres -c "CREATE DATABASE wikijs OWNER wikijs;"
|
||||
```
|
||||
|
||||
**Step 3: Restore**
|
||||
|
||||
```bash
|
||||
zcat /vault/backups/wiki/wikijs-db-YYYYMMDD_HHMMSS.sql.gz | \
|
||||
docker exec -i wikijs_db psql -U wikijs -d wikijs
|
||||
```
|
||||
|
||||
**Step 4: Restart Wiki.js**
|
||||
|
||||
```bash
|
||||
docker compose start wikijs
|
||||
docker logs -f wikijs
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Scenario C — Restore Config Only
|
||||
|
||||
If the stack config was lost but the database volume is intact:
|
||||
|
||||
```bash
|
||||
# Extract config from backup
|
||||
tar -xzf /vault/backups/wiki/wikijs-config-YYYYMMDD_HHMMSS.tar.gz \
|
||||
-C /opt/stacks/
|
||||
|
||||
# Verify
|
||||
ls /opt/stacks/wikijs/
|
||||
# Should show: docker-compose.yml .env
|
||||
|
||||
# Restart stack
|
||||
cd /opt/stacks/wikijs
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Restore from Kopia (Offsite)
|
||||
|
||||
When local vault files are unavailable, restore the backup directory from Kopia first:
|
||||
|
||||
```bash
|
||||
# List available snapshots
|
||||
kopia snapshot list /vault/backups/wiki
|
||||
|
||||
# Restore snapshot to temp directory
|
||||
kopia restore SNAPSHOT_ID /tmp/wiki-restore/
|
||||
|
||||
# Then proceed with the appropriate scenario above
|
||||
# using files from /tmp/wiki-restore/ instead of /vault/backups/wiki/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pocket Grimoire Specifics
|
||||
|
||||
When restoring to Pocket Grimoire, note the following differences from a full Netgrimoire instance:
|
||||
|
||||
**Container names** differ — use `pocketgrimoire_db` instead of `wikijs_db`.
|
||||
|
||||
**Stack path** is `/srv/pocket-grimoire/stacks/wikijs/` instead of `/opt/stacks/wikijs/`.
|
||||
|
||||
**The database is already initialized** when Pocket Grimoire is first set up. Restoring a Netgrimoire dump overwrites it entirely, which is the intended behavior — Pocket Grimoire becomes a mirror of Netgrimoire's wiki state.
|
||||
|
||||
**Git content repo** is located at `/srv/pocket-grimoire/repos/wiki/` and is populated via the sync script (`pocketgrimoire-sync.sh`). A database restore alone is sufficient if the Git repo is already in place.
|
||||
|
||||
**Recommended restore workflow for Pocket Grimoire:**
|
||||
|
||||
```bash
|
||||
# 1. Copy dump from VAULT SSD (already available on Pocket Grimoire)
|
||||
ls /srv/vaultpg/backups/wiki/
|
||||
|
||||
# 2. Start db container only
|
||||
cd /srv/pocket-grimoire/stacks/wikijs && docker compose up -d db
|
||||
|
||||
# 3. Restore
|
||||
zcat /srv/vaultpg/backups/wiki/wikijs-db-LATEST.sql.gz | \
|
||||
docker exec -i pocketgrimoire_db psql -U wikijs -d wikijs
|
||||
|
||||
# 4. Start full stack
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
Because the VAULT SSD is always connected to Pocket Grimoire, no file transfer is needed — the dumps are already there.

---

## Monitoring & Alerts

Add the following to your existing ntfy/monitoring setup to alert on backup failures. Wrap the backup script call in an error trap:

```bash
# Add to wikijs-backup.sh after set -euo pipefail:

NTFY_URL="https://ntfy.YOUR_DOMAIN/wikijs-backup"

on_error() {
  curl -fsS -X POST "$NTFY_URL" \
    -H "Title: Wiki.js backup FAILED ($(hostname -s))" \
    -H "Priority: high" \
    -H "Tags: rotating_light" \
    -d "Backup failed at $(date -Is). Check /var/log/wikijs-backup.log"
}
trap on_error ERR
```

### Check backup age manually

```bash
# Find most recent dump
ls -lt /vault/backups/wiki/wikijs-db-*.sql.gz | head -3

# Check Kopia last snapshot time
kopia snapshot list /vault/backups/wiki | tail -5
```

---

## Quick Reference

```bash
# Run backup manually
sudo /usr/local/sbin/wikijs-backup.sh

# Watch backup log
tail -f /var/log/wikijs-backup.log

# Check timer status
systemctl status wikijs-backup.timer

# List local dumps
ls -lh /vault/backups/wiki/

# List Kopia snapshots
kopia snapshot list /vault/backups/wiki

# Restore dump (generic)
zcat /vault/backups/wiki/wikijs-db-YYYYMMDD_HHMMSS.sql.gz | \
  docker exec -i CONTAINER_NAME psql -U wikijs -d wikijs

# Test dump is readable
zcat /vault/backups/wiki/wikijs-db-YYYYMMDD_HHMMSS.sql.gz | head -50
```

---

## Revision History

| Version | Date | Notes |
|---|---|---|
| 1.0 | 2026-02-22 | Initial release — pg_dump + Kopia + Pocket Grimoire restore procedures |

940
Netgrimoire/Vault-Grimoire/Kopia/Kopia-Overview.md
Normal file

---
title: Setting Up Kopia
description:
published: true
date: 2026-02-20T04:27:59.823Z
tags:
editor: markdown
dateCreated: 2026-01-23T22:14:17.009Z
---

# Kopia Backup System Documentation

## Overview

This system implements a two-tier backup strategy using **two separate Kopia Server instances**:

1. **Primary Repository** (`/srv/vault/kopia_repository`) - Full backups of all clients, served on port 51515
2. **Vault Repository** (`/srv/vault/backup`) - Targeted critical data backups, served on port 51516, replicated offsite via ZFS send/receive

The Vault repository sits on its own ZFS dataset to enable clean replication to offsite Pi systems. Running two separate Kopia servers allows independent management of each repository while maintaining the same HTTPS-based client connection model for both.

---

## Architecture

```
Clients (docker2, cindy's desktop, etc.)
    ↓
    ├─→ Primary Backup → Kopia Server Primary (port 51515)
    │       → /srv/vault/kopia_repository (all data)
    │
    └─→ Vault Backup → Kopia Server Vault (port 51516)
            → /srv/vault/backup (critical data only)
                    ↓
            ZFS Send/Receive
                    ↓
            ┌───────┴───────┐
            ↓               ↓
        Pi Vault 1     Pi Vault 2
        (offsite)      (offsite)
```

---

## Initial Setup on ZNAS

### Prerequisites

- Docker installed on ZNAS
- ZFS pool available

### 1. Create ZFS Datasets

```bash
# Primary repository dataset (if not already created)
zfs create -o mountpoint=/srv/vault zpool/vault
zfs create zpool/vault/kopia_repository

# Vault repository dataset (for offsite replication)
zfs create zpool/vault/backup
```

### 2. Install Kopia Servers (Docker)

We run **two separate Kopia Server containers** - one for primary backups, one for vault backups.

```bash
# Primary repository server (port 51515)
docker run -d \
  --name kopia-server-primary \
  --restart unless-stopped \
  -p 51515:51515 \
  -v /srv/vault/kopia_repository:/app/repository \
  -v /srv/vault/config-primary:/app/config \
  -v /srv/vault/logs-primary:/app/logs \
  kopia/kopia:latest server start \
  --address=0.0.0.0:51515 \
  --tls-generate-cert

# Vault repository server (port 51516)
docker run -d \
  --name kopia-server-vault \
  --restart unless-stopped \
  -p 51516:51516 \
  -v /srv/vault/backup:/app/repository \
  -v /srv/vault/config-vault:/app/config \
  -v /srv/vault/logs-vault:/app/logs \
  kopia/kopia:latest server start \
  --address=0.0.0.0:51516 \
  --tls-generate-cert
```

**Get the certificate fingerprints:**

```bash
# Primary server fingerprint
docker exec kopia-server-primary kopia server status

# Vault server fingerprint
docker exec kopia-server-vault kopia server status
```

**Note:** Record both certificate fingerprints - you'll need them for client connections.

- **Primary server cert SHA256:** `696a4999f594b5273a174fd7cab677d8dd1628f9b9d27e557daa87103ee064b2`
- **Vault server cert SHA256:** *(get from command above)*

### 3. Create Kopia Repositories

Each server manages its own repository. These are created during first server start, but you can initialize them manually if needed.

```bash
# Primary repository (usually created via GUI on first use)
docker exec -it kopia-server-primary kopia repository create filesystem \
  --path=/app/repository \
  --description="Primary backup repository"

# Vault repository
docker exec -it kopia-server-vault kopia repository create filesystem \
  --path=/app/repository \
  --description="Vault backup repository for offsite replication"
```

**Note:** If you created the primary repository via the Kopia UI, you don't need to run the first command.

### 4. Create User Accounts

Create users on each server separately.

**Primary repository users:**

```bash
# Enter primary server container
docker exec -it kopia-server-primary /bin/sh

# Create users
kopia server users add admin@docker2
kopia server users add cindy@DESKTOP-QLSVD8P
# Password for cindy: LucyDog123

# Exit container
exit
```

**Vault repository users:**

```bash
# Enter vault server container
docker exec -it kopia-server-vault /bin/sh

# Create users
kopia server users add admin@docker2-vault
kopia server users add cindy@DESKTOP-QLSVD8P-vault
# Use the same passwords or different ones, depending on security requirements

# Exit container
exit
```

---

## Client Configuration

### Linux Client (docker2)

#### Primary Backup Setup

1. **Install Kopia**

```bash
# Download and install the kopia .deb package
wget https://github.com/kopia/kopia/releases/download/v0.XX.X/kopia_0.XX.X_amd64.deb
sudo dpkg -i kopia_0.XX.X_amd64.deb
```

2. **Remove old repository (if exists)**

```bash
sudo kopia repository disconnect || true
sudo rm -rf /root/.config/kopia
```

3. **Connect to primary repository**

```bash
sudo kopia repository connect server \
  --url=https://192.168.5.10:51515 \
  --override-username=admin@docker2 \
  --server-cert-fingerprint=696a4999f594b5273a174fd7cab677d8dd1628f9b9d27e557daa87103ee064b2
```

4. **Create initial snapshot**

```bash
sudo kopia snapshot create /DockerVol/
```

5. **Set up cron job for primary backups**

```bash
sudo crontab -e

# Add this line (runs every 3 hours)
0 */3 * * * /usr/bin/kopia snapshot create /DockerVol >> /var/log/kopia-primary-cron.log 2>&1
```

#### Vault Backup Setup (Critical Data)

1. **Create secondary kopia config directory**

```bash
sudo mkdir -p /root/.config/kopia-vault
```

2. **Connect to vault repository**

```bash
sudo kopia --config-file=/root/.config/kopia-vault/repository.config \
  repository connect server \
  --url=https://192.168.5.10:51516 \
  --override-username=admin@docker2-vault \
  --server-cert-fingerprint=<VAULT_SERVER_CERT_FINGERPRINT>
```

**Note:** Replace `<VAULT_SERVER_CERT_FINGERPRINT>` with the actual fingerprint from the vault server (see setup section).

3. **Create vault backup script**

```bash
sudo nano /usr/local/bin/kopia-vault-backup.sh
```

Add this content:

```bash
#!/bin/bash
# Kopia Vault Backup Script
# Backs up critical data to vault repository for offsite replication

KOPIA_CONFIG="/root/.config/kopia-vault/repository.config"
LOG_FILE="/var/log/kopia-vault-cron.log"

# Add your critical directories here
VAULT_DIRS=(
  "/DockerVol/critical-app1"
  "/DockerVol/critical-app2"
  "/home/admin/documents"
)

echo "=== Vault backup started at $(date) ===" >> "$LOG_FILE"

for dir in "${VAULT_DIRS[@]}"; do
  if [ -d "$dir" ]; then
    echo "Backing up: $dir" >> "$LOG_FILE"
    /usr/bin/kopia --config-file="$KOPIA_CONFIG" snapshot create "$dir" >> "$LOG_FILE" 2>&1
  else
    echo "Directory not found: $dir" >> "$LOG_FILE"
  fi
done

echo "=== Vault backup completed at $(date) ===" >> "$LOG_FILE"
echo "" >> "$LOG_FILE"
```

4. **Make script executable**

```bash
sudo chmod +x /usr/local/bin/kopia-vault-backup.sh
```

5. **Set up cron job for vault backups**

```bash
sudo crontab -e

# Add this line (runs daily at 3 AM)
0 3 * * * /usr/local/bin/kopia-vault-backup.sh
```

---

### Windows Client (Cindy's Desktop)

#### Primary Backup Setup

1. **Install Kopia**

```powershell
# Using winget
winget install kopia
```

2. **Connect to primary repository**

```powershell
kopia repository connect server `
  --url=https://192.168.5.10:51515 `
  --override-username=cindy@DESKTOP-QLSVD8P `
  --server-cert-fingerprint=696a4999f594b5273a174fd7cab677d8dd1628f9b9d27e557daa87103ee064b2
```

3. **Create initial snapshot**

```powershell
kopia snapshot create C:\Users\cindy
```

4. **Set exclusion policy**

```powershell
kopia policy set `
  --global `
  --add-ignore "**\AppData\Local\Temp\**" `
  --add-ignore "**\AppData\Local\Packages\**"
```

5. **Create primary backup script**

```powershell
# Create scripts folder
New-Item -ItemType Directory -Force -Path C:\Scripts

# Create backup script
New-Item -ItemType File -Path C:\Scripts\kopia-primary-nightly.ps1
```

Add this content to `C:\Scripts\kopia-primary-nightly.ps1`:

```powershell
# Kopia Primary Backup Script

# Repository password
$env:KOPIA_PASSWORD = "LucyDog123"

# Run backup with logging
kopia snapshot create C:\Users\cindy `
  --progress `
  | Tee-Object -FilePath C:\Logs\kopia-primary.log -Append

# Log completion
Add-Content -Path C:\Logs\kopia-primary.log -Value "Backup completed at $(Get-Date)"
Add-Content -Path C:\Logs\kopia-primary.log -Value "---"
```

6. **Secure the script**
- Right-click `C:\Scripts\kopia-primary-nightly.ps1` → Properties → Security
- Ensure only Cindy's user account has read access

7. **Create scheduled task for primary backup**
- Press `Win + R` → type `taskschd.msc`
- Click "Create Task" (not "Basic Task")

**General tab:**
- Name: `Kopia Primary Nightly Backup`
- ✔ Run whether user is logged on or not
- ✔ Run with highest privileges
- Configure for: Windows 10/11

**Triggers tab:**
- New → Daily at 2:00 AM
- ✔ Enabled

**Actions tab:**
- Program: `powershell.exe`
- Arguments: `-ExecutionPolicy Bypass -File C:\Scripts\kopia-primary-nightly.ps1`
- Start in: `C:\Scripts`

**Conditions tab:**
- ✔ Wake the computer to run this task
- ✔ Start only if on AC power (recommended for laptops)

**Settings tab:**
- ✔ Allow task to be run on demand
- ✔ Run task as soon as possible after scheduled start is missed
- ❌ Stop the task if it runs longer than...

**Note:** Windows sign-in normally uses a PIN, but Task Scheduler cannot store a PIN. When prompted for credentials while saving the task, enter the Microsoft account password (Harvey123=), not the PIN.

#### Vault Backup Setup (Critical Data)

1. **Create vault config directory**

```powershell
New-Item -ItemType Directory -Force -Path C:\Users\cindy\.config\kopia-vault
```

2. **Connect to vault repository**

```powershell
kopia --config-file="C:\Users\cindy\.config\kopia-vault\repository.config" `
  repository connect server `
  --url=https://192.168.5.10:51516 `
  --override-username=cindy@DESKTOP-QLSVD8P-vault `
  --server-cert-fingerprint=<VAULT_SERVER_CERT_FINGERPRINT>
```

**Note:** Replace `<VAULT_SERVER_CERT_FINGERPRINT>` with the actual fingerprint from the vault server.

3. **Create vault backup script**

```powershell
New-Item -ItemType File -Path C:\Scripts\kopia-vault-nightly.ps1
```

Add this content to `C:\Scripts\kopia-vault-nightly.ps1`:

```powershell
# Kopia Vault Backup Script
# Backs up critical data to vault repository for offsite replication

$env:KOPIA_PASSWORD = "LucyDog123"
$KOPIA_CONFIG = "C:\Users\cindy\.config\kopia-vault\repository.config"

# Define critical directories to back up
$VaultDirs = @(
    "C:\Users\cindy\Documents",
    "C:\Users\cindy\Pictures",
    "C:\Users\cindy\Desktop\Important"
)

# Log header
Add-Content -Path C:\Logs\kopia-vault.log -Value "=== Vault backup started at $(Get-Date) ==="

# Backup each directory
foreach ($dir in $VaultDirs) {
    if (Test-Path $dir) {
        Add-Content -Path C:\Logs\kopia-vault.log -Value "Backing up: $dir"
        kopia --config-file="$KOPIA_CONFIG" snapshot create $dir `
          | Tee-Object -FilePath C:\Logs\kopia-vault.log -Append
    } else {
        Add-Content -Path C:\Logs\kopia-vault.log -Value "Directory not found: $dir"
    }
}

# Log completion
Add-Content -Path C:\Logs\kopia-vault.log -Value "=== Vault backup completed at $(Get-Date) ==="
Add-Content -Path C:\Logs\kopia-vault.log -Value ""
```

4. **Create log directory**

```powershell
New-Item -ItemType Directory -Force -Path C:\Logs
```

5. **Create scheduled task for vault backup**
- Press `Win + R` → type `taskschd.msc`
- Click "Create Task"

**General tab:**
- Name: `Kopia Vault Nightly Backup`
- ✔ Run whether user is logged on or not
- ✔ Run with highest privileges

**Triggers tab:**
- New → Daily at 3:00 AM (after primary backup)
- ✔ Enabled

**Actions tab:**
- Program: `powershell.exe`
- Arguments: `-ExecutionPolicy Bypass -File C:\Scripts\kopia-vault-nightly.ps1`
- Start in: `C:\Scripts`

**Conditions/Settings:** Same as primary backup task

---

## ZFS Replication to Offsite Pi Vaults

### Setup on ZNAS (Source)

1. **Create snapshot script**

```bash
sudo nano /usr/local/bin/vault-snapshot.sh
```

Add this content:

```bash
#!/bin/bash
# Create ZFS snapshot of vault dataset for replication

DATASET="zpool/vault/backup"
SNAPSHOT_NAME="vault-$(date +%Y%m%d-%H%M%S)"

# Create snapshot
zfs snapshot "${DATASET}@${SNAPSHOT_NAME}"

# Keep only last 7 days of snapshots on source
zfs list -t snapshot -o name -s creation | grep "^${DATASET}@vault-" | head -n -7 | xargs -r -n 1 zfs destroy

echo "Created snapshot: ${DATASET}@${SNAPSHOT_NAME}"
```

2. **Make executable**

```bash
sudo chmod +x /usr/local/bin/vault-snapshot.sh
```

3. **Schedule snapshot creation**

```bash
sudo crontab -e

# Add this line (create snapshot daily at 4 AM, after vault backups complete)
0 4 * * * /usr/local/bin/vault-snapshot.sh >> /var/log/vault-snapshot.log 2>&1
```

4. **Create replication script**

```bash
sudo nano /usr/local/bin/vault-replicate.sh
```

Add this content:

```bash
#!/bin/bash
# Replicate vault dataset to offsite Pi systems

DATASET="zpool/vault/backup"
PI1_HOST="pi-vault-1.local"          # Update with actual hostname/IP
PI2_HOST="pi-vault-2.local"          # Update with actual hostname/IP
PI_USER="admin"
REMOTE_DATASET="tank/vault-backup"   # Update with actual dataset on Pi

# Get the latest snapshot
LATEST_SNAP=$(zfs list -t snapshot -o name -s creation | grep "^${DATASET}@vault-" | tail -n 1)

if [ -z "$LATEST_SNAP" ]; then
    echo "No snapshots found for replication"
    exit 1
fi

echo "Replicating snapshot: $LATEST_SNAP"

# Function to replicate to a target
replicate_to_target() {
    local TARGET_HOST=$1
    echo "=== Replicating to $TARGET_HOST ==="

    # Get the last snapshot on remote (if any)
    LAST_REMOTE=$(ssh ${PI_USER}@${TARGET_HOST} "zfs list -t snapshot -o name -s creation 2>/dev/null | grep '^${REMOTE_DATASET}@vault-' | tail -n 1" || echo "")

    if [ -z "$LAST_REMOTE" ]; then
        # Initial replication (full send)
        echo "Performing initial full replication to $TARGET_HOST"
        zfs send -c $LATEST_SNAP | ssh ${PI_USER}@${TARGET_HOST} "zfs receive -F ${REMOTE_DATASET}"
    else
        # Incremental replication
        echo "Performing incremental replication to $TARGET_HOST"
        LAST_SNAP_NAME=$(echo $LAST_REMOTE | cut -d'@' -f2)
        zfs send -c -i ${DATASET}@${LAST_SNAP_NAME} $LATEST_SNAP | ssh ${PI_USER}@${TARGET_HOST} "zfs receive -F ${REMOTE_DATASET}"
    fi

    # Clean up old snapshots on remote (keep last 30 days)
    ssh ${PI_USER}@${TARGET_HOST} "zfs list -t snapshot -o name -s creation | grep '^${REMOTE_DATASET}@vault-' | head -n -30 | xargs -r -n 1 zfs destroy"

    echo "Replication to $TARGET_HOST completed"
}

# Replicate to both Pi systems
replicate_to_target $PI1_HOST
replicate_to_target $PI2_HOST

echo "All replications completed at $(date)"
```

5. **Make executable**

```bash
sudo chmod +x /usr/local/bin/vault-replicate.sh
```

6. **Set up SSH keys for passwordless replication**

```bash
# Generate SSH key if needed
ssh-keygen -t ed25519 -C "znas-replication"

# Copy to both Pi systems
ssh-copy-id admin@pi-vault-1.local
ssh-copy-id admin@pi-vault-2.local
```

7. **Schedule replication**

```bash
sudo crontab -e

# Add this line (replicate daily at 5 AM, after snapshot creation)
0 5 * * * /usr/local/bin/vault-replicate.sh >> /var/log/vault-replicate.log 2>&1
```
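
The `head -n -N` retention idiom used in both scripts above (keep the newest N snapshots, destroy the rest) can be sanity-checked on sample data; this relies on GNU `head`, as on the Linux hosts in this doc:

```bash
# Nine fake snapshot names, oldest first, as `zfs list -s creation` emits them
snaps=$(printf 'zpool/vault/backup@vault-2026020%d-040000\n' 1 2 3 4 5 6 7 8 9)

# Everything except the newest 7 is selected for destruction
to_destroy=$(printf '%s\n' "$snaps" | head -n -7)
printf 'would destroy:\n%s\n' "$to_destroy"
```

Only the two oldest names (the 20260201 and 20260202 snapshots) are printed, matching what `xargs zfs destroy` would receive.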

### Setup on Pi Vault Systems (Targets)

Repeat these steps on both Pi Vault 1 and Pi Vault 2:

1. **Create ZFS pool on SSD** (if not already done)

```bash
# Assuming SSD is /dev/sda
sudo zpool create tank /dev/sda
```

2. **Create dataset for receiving backups**

```bash
sudo zfs create tank/vault-backup
```

3. **Set appropriate permissions**

```bash
# Allow the replication user to receive snapshots
sudo zfs allow admin receive,create,mount,destroy tank/vault-backup
```

4. **Verify replication** (after first run)

```bash
zfs list -t snapshot | grep vault-
```

---

## Maintenance and Monitoring

### Regular Health Checks

**On Clients:**

```bash
# Linux
sudo kopia snapshot list
sudo kopia snapshot verify --file-parallelism=8
sudo kopia repository status

# Windows (PowerShell)
kopia snapshot list
kopia snapshot verify --file-parallelism=8
kopia repository status
```

**On ZNAS:**

```bash
# Check ZFS health
zpool status

# Check both Kopia servers are running
docker ps | grep kopia

# Check vault snapshots
zfs list -t snapshot | grep "vault/backup"

# Check replication logs
tail -f /var/log/vault-replicate.log

# View server statuses
docker exec kopia-server-primary kopia server status
docker exec kopia-server-vault kopia server status
```

**On Pi Vaults:**

```bash
# Check received snapshots
zfs list -t snapshot | grep vault-backup

# Check available space
zfs list tank/vault-backup
```

### Monthly Maintenance Tasks

1. **Verify vault backups are replicating**

```bash
# On ZNAS
cat /var/log/vault-replicate.log | grep "completed"

# On Pi systems
zfs list -t snapshot -o name,creation | grep vault-backup | tail
```

2. **Test restore from vault repository**

```bash
# Connect to vault repo and verify a random snapshot
kopia --config-file=/path/to/vault/config repository connect server --url=...
kopia snapshot list
kopia snapshot verify --file-parallelism=8
```

3. **Check disk space on all systems**

4. **Review backup logs for errors**
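
The disk-space check can be scripted as a small sweep (a sketch; the remote hostname and datasets follow this doc's examples, and the ZFS/remote steps skip gracefully on hosts where they don't apply):

```bash
# Local root filesystem (works on any host)
local_report=$(df -h / | tail -n +2)
echo "local root: $local_report"

# ZFS usage on ZNAS; skipped if zfs is not present here
zfs list -o name,used,avail zpool/vault 2>/dev/null || echo "zfs not available on this host"

# Offsite Pi usage; skipped if the Pi is unreachable from here
ssh -o ConnectTimeout=5 admin@pi-vault-1.local \
  "zfs list -o name,used,avail tank/vault-backup" 2>/dev/null || echo "pi-vault-1 not reachable from here"
```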

### Backup Policy Recommendations

**Primary Repository:**
- Retention: 7 daily, 4 weekly, 6 monthly
- Compression: enabled
- All data from clients

**Vault Repository:**
- Retention: 14 daily, 8 weekly, 12 monthly, 3 yearly
- Compression: enabled
- Only critical data for offsite protection

**ZFS Snapshots:**
- Keep 7 days on ZNAS (source)
- Keep 30 days on Pi vaults (targets)
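
These recommendations can be encoded with Kopia's standard retention flags; a sketch, run once against each server (the vault invocation reuses the alternate config file from the client setup sections):

```bash
# Primary repository: 7 daily, 4 weekly, 6 monthly, compression on
kopia policy set --global \
  --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
  --compression zstd

# Vault repository: 14 daily, 8 weekly, 12 monthly, 3 yearly
kopia --config-file=/root/.config/kopia-vault/repository.config \
  policy set --global \
  --keep-daily 14 --keep-weekly 8 --keep-monthly 12 --keep-annual 3 \
  --compression zstd
```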

---

## Disaster Recovery Procedures

### Scenario 1: Restore from Primary Repository

```bash
# Linux
sudo kopia snapshot list
sudo kopia snapshot restore <snapshot-id> /restore/location

# Windows
kopia snapshot list
kopia snapshot restore <snapshot-id> C:\restore\location
```

### Scenario 2: Restore from Vault Repository (Offsite)

If ZNAS is unavailable, restore directly from Pi vault:

1. **On Pi vault:**

```bash
# Clone the latest snapshot into a writable dataset
LATEST=$(zfs list -t snapshot -o name | grep vault-backup | tail -n 1)
zfs clone $LATEST tank/vault-backup-restore
```

2. **Access Kopia repository directly:**

```bash
kopia repository connect filesystem --path=/tank/vault-backup-restore
kopia snapshot list
kopia snapshot restore <snapshot-id> /restore/location
```

3. **Clean up after restore:**

```bash
zfs destroy tank/vault-backup-restore
```

### Scenario 3: Complete System Rebuild

1. Rebuild ZNAS and restore vault dataset from Pi
2. Reinstall Kopia server in Docker
3. Point server to restored vault repository
4. Reconnect clients to primary and vault repositories
5. Resume scheduled backups
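
Step 1 of the rebuild can be sketched as a reverse ZFS send from one of the Pi vaults back to the rebuilt ZNAS (hostnames and dataset names follow this doc's examples; run on ZNAS after recreating the `zpool/vault` hierarchy):

```bash
# Find the newest replicated snapshot on the Pi
LATEST=$(ssh admin@pi-vault-1.local \
  "zfs list -t snapshot -o name -s creation | grep '^tank/vault-backup@vault-' | tail -n 1")

# Stream it back into the vault dataset on ZNAS
ssh admin@pi-vault-1.local "zfs send -c $LATEST" | zfs receive -F zpool/vault/backup
```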

---

## Troubleshooting

### Client can't connect to repository

```bash
# Check both servers are running
docker ps | grep kopia

# Should see both kopia-server-primary and kopia-server-vault

# Check firewall
sudo ufw status | grep 51515
sudo ufw status | grep 51516

# Verify certificate fingerprints
docker exec kopia-server-primary kopia server status
docker exec kopia-server-vault kopia server status

# Check server logs
docker logs kopia-server-primary
docker logs kopia-server-vault
```

### Vault replication failing

```bash
# Check SSH connectivity
ssh admin@pi-vault-1.local "echo Connected"

# Check ZFS pool health
zpool status

# Check remote dataset exists
ssh admin@pi-vault-1.local "zfs list tank/vault-backup"

# Dry-run test send (replace @latest with an actual snapshot name)
zfs send -n -v zpool/vault/backup@latest | ssh admin@pi-vault-1.local "cat > /dev/null"
```

### Windows scheduled task not running

- Check Task Scheduler → Task History
- Verify PIN/password authentication (use password Harvey123= for the task credential)
- Check that the computer is awake at the scheduled time
- Review power settings (prevent sleep, wake for tasks)
- Check log files: `C:\Logs\kopia-primary.log` and `C:\Logs\kopia-vault.log`

### Snapshot cleanup not working

```bash
# Manually clean old snapshots
zfs list -t snapshot -o name,used,creation | grep vault-backup

# Remove specific snapshot
zfs destroy zpool/vault/backup@vault-YYYYMMDD-HHMMSS
```

---

## Security Notes

1. **Passwords in scripts:** The current implementation stores passwords in plaintext in scripts. For production, consider:
   - Windows Credential Manager
   - Linux keyring or encrypted credential storage
   - Environment variables set at system level

2. **SSH keys:** Replication uses SSH keys. Keep private keys secure and use passphrase protection where possible.

3. **Network security:** Kopia server uses HTTPS with certificate validation. Ensure the certificate fingerprint is verified on first connection.

4. **Physical security:** Offsite Pi vaults should be stored in secure locations with different risk profiles (fire, flood, theft).
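
For point 1, a minimal step up from plaintext-in-script is a root-only password file read at runtime (a sketch; the file path is an example and must be created once with mode 600):

```bash
# Load the Kopia repository password from a restricted file instead of
# hard-coding it in the backup scripts.
load_kopia_password() {
  pass_file="${1:-/root/.kopia-vault-pass}"
  [ -r "$pass_file" ] || { echo "cannot read $pass_file" >&2; return 1; }
  KOPIA_PASSWORD=$(cat "$pass_file")
  export KOPIA_PASSWORD
}

# In kopia-vault-backup.sh, replace the plaintext assignment with:
#   load_kopia_password /root/.kopia-vault-pass || exit 1
```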

---

## Quick Reference Commands

### Kopia Client Commands

```bash
# List snapshots
kopia snapshot list

# Create snapshot
kopia snapshot create /path/to/backup

# Verify integrity
kopia snapshot verify --file-parallelism=8

# Check repository status
kopia repository status

# View policies
kopia policy list

# Mount snapshot (Linux)
kopia mount <snapshot-id> /mnt/snapshot

# Use alternate config (for vault repository)
kopia --config-file=/path/to/vault/repository.config snapshot list
```

### ZFS Commands

```bash
# List snapshots
zfs list -t snapshot

# Create manual snapshot
zfs snapshot zpool/vault/backup@manual-$(date +%Y%m%d)

# Send full snapshot
zfs send zpool/vault/backup@snapshot | ssh user@host zfs receive tank/backup

# Send incremental
zfs send -i @old zpool/vault/backup@new | ssh user@host zfs receive tank/backup

# Check pool health (errors, scrub/resilver progress)
zpool status -v

# Check dataset size
zfs list -o space zpool/vault/backup
```

---

## Appendix: System Specifications

**ZNAS:**
- ZFS fileserver
- Docker running **two** Kopia servers:
  - **kopia-server-primary** on port 51515
  - **kopia-server-vault** on port 51516
- IP: 192.168.5.10
- Datasets:
  - `/srv/vault/kopia_repository` (zpool/vault/kopia_repository) - Primary repository
  - `/srv/vault/backup` (zpool/vault/backup) - Vault repository (replicated)

**Clients:**
- **docker2** (Linux) - Backs up /DockerVol/
  - Primary: Every 3 hours → port 51515
  - Vault: Daily at 3 AM (critical directories only) → port 51516
- **DESKTOP-QLSVD8P** (Windows - Cindy's desktop) - Backs up C:\Users\cindy
  - Primary: Daily at 2 AM → port 51515
  - Vault: Daily at 3 AM (Documents, Pictures, Important files) → port 51516
  - Kopia password: LucyDog123
  - Task Scheduler credential: Harvey123=

**Offsite Vaults:**
- **Pi Vault 1** - Raspberry Pi with SSD (tank/vault-backup)
- **Pi Vault 2** - Raspberry Pi with SSD (tank/vault-backup)

**Server Certificates:**
- Primary server SHA256: `696a4999f594b5273a174fd7cab677d8dd1628f9b9d27e557daa87103ee064b2`
- Vault server SHA256: *(get from `docker exec kopia-server-vault kopia server status`)*

---

## Workflow Summary

### Daily Backup Flow

- **2:00 AM** - Cindy's desktop primary backup runs
- **3:00 AM** - docker2 vault backup runs
- **3:00 AM** - Cindy's desktop vault backup runs
- **4:00 AM** - ZNAS creates ZFS snapshot of vault dataset
- **5:00 AM** - ZNAS replicates vault snapshot to both Pi systems
- **Every 3 hours** - docker2 primary backup runs

### What Gets Backed Up Where

**Primary Repository (Full Backups):**
- docker2: /DockerVol/ (all Docker volumes)
- Cindy: C:\Users\cindy (entire user profile, minus temp files)

**Vault Repository (Critical Data for Offsite):**
- docker2: Selected critical Docker volumes
- Cindy: Documents, Pictures, Important desktop files

**Offsite (Via ZFS Send):**
- Entire vault repository (all clients' critical data)
- Replicated to 2 separate Pi systems

---

## Future Enhancements

Consider adding:
- Email notifications on backup failures
- Monitoring dashboard (Grafana/Prometheus)
- Backup validation automation
- Additional retention policies per client
- Encrypted credentials storage
- Remote monitoring of Pi vault systems
- Automated restore testing
- Bandwidth throttling for replication
- Multiple ZFS snapshot retention policies

---

## Change Log

- **2025-02-11** - Initial comprehensive documentation created
  - Added two-tier backup strategy (primary + vault)
  - Added ZFS replication procedures for offsite backup
  - Added Pi vault setup instructions
  - Added disaster recovery procedures
  - Consolidated all client configurations
  - Added workflow diagrams and timing

---

## Support and Feedback

For issues or improvements to this documentation, contact the system administrator.

**Useful Resources:**
- Kopia Documentation: https://kopia.io/docs/
- ZFS Administration Guide: https://openzfs.github.io/openzfs-docs/
- Kopia GitHub: https://github.com/kopia/kopia
113 Netgrimoire/Vault-Grimoire/Kopia/Kopia-Service.md Normal file
# kopia

## Overview

The kopia stack is the Docker Swarm configuration for the Kopia backup service in NetGrimoire. It provides snapshot backups and deduplication capabilities.

---
## Architecture

- **Host:** docker4
- **Network:** netgrimoire
- **Exposed via:** kopia.netgrimoire.com, port 51515 (via Caddy reverse proxy)
- **Homepage group:** Backup

---
## Build & Configuration

### Prerequisites

None specified.

### Volume Setup

```bash
mkdir -p /DockerVol/kopia/config
mkdir -p /DockerVol/kopia/cache
mkdir -p /DockerVol/kopia/cert
```
### Environment Variables

```bash
# generate secrets with: openssl rand -hex 32
PUID=1964
PGID=1964
KOPIA_PASSWORD=F@lcon13
KOPIA_SERVER_USERNAME=admin
KOPIA_SERVER_PASSWORD=F@lcon13
TZ=America/Chicago
```
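The comment above points at `openssl rand`; a minimal sketch for generating independent secrets for the `.env` file rather than reusing one shared password (variable names match the keys above):

```shell
# Generate two independent 64-hex-char secrets for the .env file.
KOPIA_PASSWORD="$(openssl rand -hex 32)"
KOPIA_SERVER_PASSWORD="$(openssl rand -hex 32)"

printf 'KOPIA_PASSWORD=%s\n' "$KOPIA_PASSWORD"
printf 'KOPIA_SERVER_PASSWORD=%s\n' "$KOPIA_SERVER_PASSWORD"
```

Note that `KOPIA_PASSWORD` protects the repository itself — changing it later requires `kopia repository change-password`, so generate it once and store it safely.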
### Deploy

```bash
cd services/swarm/stack/kopia
set -a && source .env && set +a
docker stack config --compose-file kopia-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml kopia
rm resolved.yml
docker stack services kopia
```
### First Run

After deployment, confirm the service is running (`docker stack services kopia`), log in to the web UI at https://kopia.netgrimoire.com, and verify that snapshots are being created for each configured source.
---

## User Guide

### Accessing kopia

- **kopia**: https://kopia.netgrimoire.com (via Caddy reverse proxy)
### Primary Use Cases

To use Kopia in NetGrimoire, create a new snapshot source in the web UI and configure its policy (schedule, retention) as desired.

### NetGrimoire Integrations

This service integrates with Uptime Kuma for monitoring, and with other services through environment variables and labels.

---

## Operations
### Monitoring

```bash
docker stack services kopia
docker service logs -f kopia
```
### Backups

Critical state lives in `/DockerVol/kopia/config` (repository connection and server configuration) and should be backed up. `/DockerVol/kopia/cache` is reconstructable — Kopia rebuilds it as needed, so it does not require backup.
### Restore

To restore the Kopia service itself, restore `/DockerVol/kopia/config` from backup, then redeploy the stack:

```bash
./deploy.sh
```

Backed-up client data is restored through the Kopia web UI or with `kopia snapshot restore`.

---
## Common Failures

| Symptom | Cause | Fix |
|---------|-------|-----|
| Backups not being created | Insufficient storage or network issues | Check storage and network conditions. |
| Service not starting | Incorrect environment variables or Docker configuration | Review the `.env` file and `kopia-stack.yml`. |

---
## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-07 | d3206f11 | Initial documentation for kopia stack. |
| 2026-02-11 | aa13ac64 | Minor adjustments to environment variables and volume setup. |
| 2026-01-30 | 15f5f655 | Initial commit with basic configuration and service setup. |

---
## Notes

- Generated by Gremlin on 2026-04-07T19:20:00.179Z
- Source: swarm/kopia.yaml
- Review User Guide and Changelog sections
44 Netgrimoire/Vault-Grimoire/Offsite/Vault-Architecture.md Normal file
---
title: Offsite Vault Architecture
description: Two Pi vault nodes — ZFS raw send, syncoid, Pocket Grimoire
published: true
date: 2026-04-12T00:00:00.000Z
tags: vault, offsite, zfs, kopia
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Offsite Vault Architecture

## Overview

Two offsite nodes receive ZFS replication from `znas`:

| Node | Location | Role |
|------|----------|------|
| Vault Pi (dedicated) | Offsite / home shelf | Kopia offsite server, ZFS vault pool |
| Pocket Grimoire | Travel / portable | Portable vault + media, also a vault node |
## Replication Method

ZFS raw send via `syncoid` with the `-w` flag (raw/encrypted mode):

```bash
# Dedicated vault Pi
syncoid -w znas:vault/data vault-pi:vault/data

# Pocket Grimoire pre-travel
syncoid znas:vault/Green/Pocket pocket:/srv/greenpg/Green
```

The `-w` flag sends encrypted ZFS streams. The receiving node stores data in its encrypted form — no decryption keys are needed on the vault nodes. Keys stay exclusively on `znas`.
## Kopia Offsite Server

The vault container (`vault.yaml`) runs a Kopia server on port 51516 that serves as the remote endpoint for the dedicated Pi vault. Accessible at `vault.netgrimoire.com`.

## Pocket Grimoire as Vault Node

Pocket Grimoire's ZFS pool (`pocket-green` at `/srv/greenpg/`) receives a `syncoid` push from `znas` before each trip. This makes Pocket Grimoire an offsite backup node whenever it leaves the house.

See [Pocket Grimoire Sync](/Pocket-Grimoire/Sync/Pre-Travel-Sync) for the pre-travel checklist.
60 Netgrimoire/Vault-Grimoire/Overview.md Normal file
---
title: Vault Grimoire
description: Storage and backup — the dragon guards the data hoard
published: true
date: 2026-04-12T00:00:00.000Z
tags: vault, storage, backup
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Vault Grimoire

![vault-grimoire.png](/assets/grimoire/vault-grimoire.png)

The Vault Grimoire covers all storage and backup infrastructure. Data starts at `znas`, is deduplicated and encrypted by Kopia, and replicates offsite to two Pi vault nodes — one dedicated vault Pi and one inside Pocket Grimoire.
---

## Sections

| Section | Contents |
|---------|----------|
| [ZFS](/Vault-Grimoire/ZFS/Storage-Layout) | ZFS pools, datasets, NFS exports, commands reference |
| [Kopia](/Vault-Grimoire/Kopia/Kopia-Overview) | Backup repos, retention, restore, two-repo architecture |
| [Backups](/Vault-Grimoire/Backups/Services-Backup) | Per-service backup runbooks (Immich, MailCow, Nextcloud, Wiki, services) |
| [Offsite](/Vault-Grimoire/Offsite/Vault-Architecture) | Pi vault nodes, ZFS raw send, syncoid workflow |
---

## Offsite Vault Architecture

```
znas (primary)
└── ZFS pool → Kopia dedup → encrypted repo
    ├── syncoid -w → Pi Vault (dedicated offsite)
    └── syncoid → Pocket Grimoire (portable vault node)
```

Both offsite nodes receive ZFS raw send with the `-w` flag. Encryption keys stay on `znas`. The vault nodes store encrypted data only — no keys needed there.
---

## Two-Repo Architecture

Kopia uses two separate containers on different ports:

| Container | Repo | URL | Purpose |
|-----------|------|-----|---------|
| kopia | Primary vault | `kopia.netgrimoire.com` | Main backup, dedup, retention |
| vault | Offsite server | `vault.netgrimoire.com` (port 51516) | Replication target for Pi vaults |

A Kopia server instance serves exactly one repository, so two repositories require two containers — they cannot share a server.
---

## Key Rules

- ZFS encryption cannot be done in-place. Migration requires `rsync` to a new encrypted dataset, then ZFS raw send with `-w` to vaults (no key exposure on vault side).
- ZFS must fully mount before NFS starts on znas. Systemd override required: `After=zfs-import.target zfs-mount.service`.
- Loopback NFS mount needs `x-systemd.after=nfs-server.service` in fstab.
393 Netgrimoire/Vault-Grimoire/ZFS/NFS-Exports.md Normal file
---
title: ZFS-NFS-Exports
description: Exporting NFS shares from ZFS datasets
published: true
date: 2026-02-23T21:58:20.626Z
tags:
editor: markdown
dateCreated: 2026-02-01T20:45:40.210Z
---

# NFS Configuration

## Overview

ZNAS exports storage via NFSv4. All exports are ZFS datasets mounted directly to `/export/*` — no bind mounts. NFS is configured to wait for ZFS at boot via a systemd override.

ZNAS also mounts its own NFS exports back to itself at `/data/nfs/znas`. This is intentional: Docker Swarm containers scheduled to ZNAS need to access NAS storage at the same paths as containers running on other swarm members. The loopback mount provides a consistent NFS-backed path regardless of which node a container lands on.

All other clients are Linux systems using autofs.

---
## Server Configuration

### ZFS Mountpoints

ZFS datasets mount directly to `/export/*`. No bind mounts are used.

```
vault                    → /export
vault/Common             → /export/Common
vault/Data               → /export/Data
vault/Data/media_books   → /export/Data/media/books
vault/Data/media_comics  → /export/Data/media/comics
vault/Docker             → /export/Docker
vault/Green              → /export/Green
vault/Green/Pocket       → /export/Green/Pocket
vault/Photos             → /export/Photos
```

Verify at any time:

```bash
mount | grep export
```
### /etc/exports

```
# NFSv4 - pseudo filesystem root
/export                   *(ro,fsid=0,no_root_squash,no_subtree_check,crossmnt)

# Shares beneath the NFSv4 root
/export/Common            *(fsid=4,rw,no_subtree_check,insecure)
/export/Data              *(fsid=5,rw,no_subtree_check,insecure,crossmnt)
/export/Data/media/books  *(fsid=51,rw,no_subtree_check,insecure,nohide)
/export/Data/media/comics *(fsid=52,rw,no_subtree_check,insecure,nohide)
/export/Docker            *(fsid=29,rw,no_root_squash,sync,no_subtree_check,insecure)
/export/Green             *(fsid=30,rw,no_root_squash,no_subtree_check,insecure)
/export/photos            *(fsid=31,rw,no_root_squash,no_subtree_check,insecure)
```

**Key options:**

- `fsid=0` on `/export` — required for NFSv4 pseudo-root. Clients enumerate all exports from here.
- `crossmnt` — allows NFS to cross ZFS dataset boundaries when traversing the tree.
- `nohide` — required on `media/books` and `media/comics` because they are separate ZFS datasets mounted beneath the `vault/Data` export path. Without it clients see empty directories.
- `no_root_squash` — Docker and Green exports allow root writes. Required for container volume mounts.
- `insecure` — permits connections from unprivileged ports (>1024). Required for some Linux NFS clients and all macOS clients.
- `sync` on Docker — forces synchronous writes for container volume safety.
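Every export must carry a unique `fsid`. As a sanity check when editing this file, a small illustrative parser (hypothetical helper, not part of the ZNAS tooling) can verify uniqueness:

```python
# Illustrative /etc/exports sanity check: parse "path *(opts)" lines
# and verify fsid values are unique. Hypothetical helper, not ZNAS tooling.
import re

def parse_exports(text):
    """Return (path, [options]) for each non-comment export line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = re.match(r"(\S+)\s+\*\(([^)]*)\)", line)
        if m:
            entries.append((m.group(1), m.group(2).split(",")))
    return entries

def fsids(entries):
    """Map export path -> fsid for entries that declare one."""
    out = {}
    for path, opts in entries:
        for o in opts:
            if o.startswith("fsid="):
                out[path] = int(o.split("=")[1])
    return out

EXPORTS = """\
/export                   *(ro,fsid=0,no_root_squash,no_subtree_check,crossmnt)
/export/Common            *(fsid=4,rw,no_subtree_check,insecure)
/export/Data              *(fsid=5,rw,no_subtree_check,insecure,crossmnt)
"""

ids = fsids(parse_exports(EXPORTS))
assert len(set(ids.values())) == len(ids)  # fsids must be unique
print(ids)
```

The same check extends to the full export list by pasting the real file contents into `EXPORTS`.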
### systemd Boot Order Override

NFS is configured to wait for ZFS to fully mount before starting.

`/etc/systemd/system/nfs-server.service.d/override.conf`:

```ini
[Unit]
After=zfs-import.target zfs-mount.service local-fs.target
Requires=zfs-import.target zfs-mount.service
```

Apply after any changes:

```bash
sudo systemctl daemon-reload
sudo systemctl restart nfs-server
```
### Autofs Disabled on Server

Autofs is disabled on ZNAS itself. It must only run on NFS clients. Running autofs on the server creates recursive mount loops.

```bash
sudo systemctl stop autofs
sudo systemctl disable autofs
```

---
## Loopback Mount (Docker Swarm)

ZNAS mounts its own NFS exports back to itself at `/data/nfs/znas`. This ensures containers scheduled to ZNAS by Docker Swarm access storage at the same NFS-backed paths as containers running on any other swarm member — consistent regardless of which node a service lands on.

Swarm container volume mounts reference paths under `/data/nfs/znas/` rather than `/export/` directly.

### The Timing Problem

Getting this mount to survive reboots reliably was non-trivial. The loopback has a chicken-and-egg dependency chain:

1. ZFS must import and mount pools before the NFS server can export anything
2. The NFS server must be fully started before the loopback mount can succeed
3. The loopback mount must be established before Docker Swarm containers start

A plain `_netdev` fstab entry is not sufficient — `_netdev` only guarantees the network is up, not that the NFS server is ready. The mount would race against NFS startup and fail silently or hang.
### Solution — fstab with x-systemd.after

The loopback is established via `/etc/fstab` using the `x-systemd.after` option to explicitly declare the dependency on `nfs-server.service`:

```
localhost:/ /data/nfs/znas nfs4 defaults,_netdev,x-systemd.after=nfs-server.service 0 0
```

`x-systemd.after=nfs-server.service` causes systemd-fstab-generator to automatically create a mount unit (`data-nfs-znas.mount`) with `After=nfs-server.service` in its `[Unit]` block. This guarantees the full dependency chain:

```
zfs-import.target
  → zfs-mount.service
    → nfs-server.service       (via nfs-server override.conf)
      → data-nfs-znas.mount    (via x-systemd.after in fstab)
        → remote-fs.target
          → Docker Swarm containers
```
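The ordering systemd derives from these `After=` edges can be illustrated with a tiny sketch. Unit names are taken from this page (`docker.service` stands in for the Swarm containers); the real scheduling is done by systemd, not this code:

```python
# Illustrative only: reproduce the boot ordering systemd derives from
# the After= edges described above. Real ordering is computed by systemd.
from graphlib import TopologicalSorter

# unit -> set of units it must start after
after = {
    "zfs-mount.service":   {"zfs-import.target"},
    "nfs-server.service":  {"zfs-mount.service"},    # override.conf
    "data-nfs-znas.mount": {"nfs-server.service"},   # x-systemd.after in fstab
    "remote-fs.target":    {"data-nfs-znas.mount"},
    "docker.service":      {"remote-fs.target"},
}

order = list(TopologicalSorter(after).static_order())
print(order)
```

Removing any single edge (say, the `x-systemd.after` one) leaves the two units unordered — which is exactly the race described above.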
The generated unit (created automatically at runtime by systemd-fstab-generator — not a file on disk):

```ini
# /run/systemd/generator/data-nfs-znas.mount
[Unit]
Documentation=man:fstab(5) man:systemd-fstab-generator(8)
SourcePath=/etc/fstab
After=nfs-server.service
Before=remote-fs.target

[Mount]
What=localhost:/
Where=/data/nfs/znas
Type=nfs4
Options=defaults,_netdev,x-systemd.after=nfs-server.service
```

**Do not create a hand-written systemd mount unit for this.** systemd-fstab-generator handles it automatically from the fstab entry. A manual unit would conflict.
### Verify Loopback is Active

```bash
mount | grep data/nfs/znas
# Should show: localhost:/ on /data/nfs/znas type nfs4 (...)

systemctl status data-nfs-znas.mount
# Should show: active (mounted)
```

---
## Client Configuration

All non-Swarm clients are Linux systems using autofs.

### Autofs Configuration

`/etc/auto.master` (relevant entry):

```
/data/nfs /etc/auto.nfs
```

`/etc/auto.nfs`:

```
znas -fstype=nfs4 192.168.5.10:/
```

This mounts the full NFSv4 tree from ZNAS at `/data/nfs/znas` on demand — the same path used by the loopback mount on ZNAS itself. All swarm nodes (including ZNAS) access NAS storage via `/data/nfs/znas/`.

**Note:** Autofs must be enabled on clients and disabled on the NFS server. Running autofs on the server creates recursive mount loops.
### Adding a New Client

```bash
# Install autofs if not present
sudo apt install autofs

# Add to /etc/auto.master if not already present
echo "/data/nfs /etc/auto.nfs" | sudo tee -a /etc/auto.master

# Create or update /etc/auto.nfs
echo "znas -fstype=nfs4 192.168.5.10:/" | sudo tee -a /etc/auto.nfs

# Reload autofs
sudo systemctl reload autofs

# Trigger mount by accessing the path
ls /data/nfs/znas/
```
### Manual Mount (testing only)

```bash
# Verify exports are visible from client
showmount -e 192.168.5.10

# Test manual mount
sudo mkdir -p /mnt/znas
sudo mount -t nfs4 192.168.5.10:/ /mnt/znas

# Verify tree is accessible
ls /mnt/znas/Data/media/books/

# Unmount after testing
sudo umount /mnt/znas
```
---

## Adding New Datasets

When creating a new ZFS dataset that needs to be NFS-accessible:

```bash
# Create with the correct mountpoint from the start
sudo zfs create -o mountpoint=/export/Data/new_folder vault/Data/new_folder
```

The dataset is automatically visible via NFS thanks to `crossmnt` on the parent export — no changes to `/etc/exports` are needed unless the new dataset requires different access controls.

If different permissions are required, add an explicit entry to `/etc/exports` and reload:

```bash
sudo exportfs -ra
```

---
## Current Export List

Verified via `showmount -e 127.0.0.1`:

```
/export/photos            *
/export/Green             *
/export/Docker            *
/export/Data/media/comics *
/export/Data/media/books  *
/export/Data              *
/export/Common            *
/export                   *
```

---
## Known Gotchas

**Loopback mount races NFS at boot** — This was the hardest problem to solve. A plain `_netdev` fstab entry only guarantees the network interface is up, not that the NFS server is ready to accept connections. The loopback mount would attempt before NFS finished starting and fail silently or hang. The fix is `x-systemd.after=nfs-server.service` in the fstab options, which causes systemd-fstab-generator to emit an `After=nfs-server.service` dependency in the generated mount unit. The full required boot chain is: `zfs-import.target` → `zfs-mount.service` → `nfs-server.service` → `data-nfs-znas.mount`. Each link must be explicit.

**Do not hand-write a systemd mount unit for the loopback** — systemd-fstab-generator creates `data-nfs-znas.mount` automatically from the fstab entry at runtime (in `/run/systemd/generator/`, not `/etc/systemd/system/`). Creating a manual unit in `/etc/systemd/system/` will conflict with the generated one.

**Autofs must be disabled on the server** — Running autofs on ZNAS itself creates a recursive mount loop. Autofs belongs on clients only. If autofs is accidentally re-enabled on ZNAS it will fight with the fstab loopback mount.

**NFSv4 pseudo-root is required** — The `/export` entry with `fsid=0` is mandatory for NFSv4 clients. Without it clients cannot enumerate the export tree. Do not remove it even though it looks redundant.

**`nohide` on sub-datasets** — `vault/Data/media_books` and `vault/Data/media_comics` are separate ZFS datasets mounted beneath the `vault/Data` export path. NFS does not cross filesystem boundaries by default. Without `nohide` clients see empty directories at those paths even though the data is present.

**Do not use bind mounts for ZFS datasets** — Configure ZFS mountpoints directly to `/export/*`. Bind mounts in fstab for ZFS datasets cause ordering problems and are unnecessary.

**Always set mountpoints when creating new datasets** — If a dataset is created without an explicit mountpoint it will inherit the parent's path and may not be visible or exportable correctly. Set `mountpoint=` at creation time.

---
## Troubleshooting

### Datasets not visible via NFS

```bash
# Verify dataset is mounted
zfs list | grep dataset_name

# Check NFS can read it
sudo -u nobody ls -la /export/path/to/dataset/

# Reload exports
sudo exportfs -ra
sudo systemctl restart nfs-server
```
### Client shows empty directories

```bash
# Clear NFS cache and remount
sudo umount -f /mnt/znas
sudo mount -t nfs4 192.168.5.10:/ /mnt/znas

# Test without caching to isolate the problem
sudo mount -t nfs4 -o noac,lookupcache=none 192.168.5.10:/ /mnt/znas
```
### After reboot, exports are empty

```bash
# Confirm ZFS mounted before NFS started
systemctl status zfs-mount.service
systemctl status nfs-server.service

# Confirm override is in place
systemctl cat nfs-server.service | grep -A5 "\[Unit\]"
```
### Loopback mount not working for Swarm containers

```bash
# Check mount unit status
systemctl status data-nfs-znas.mount

# Verify full dependency chain is satisfied
systemctl status zfs-mount.service
systemctl status nfs-server.service
systemctl status data-nfs-znas.mount

# Verify loopback is mounted
mount | grep data/nfs/znas

# If missing, mount manually to test
sudo mount -t nfs4 127.0.0.1:/ /data/nfs/znas

# Check container can see the path
docker run --rm -v /data/nfs/znas/Data:/data alpine ls /data
```

If the unit fails at boot, confirm the fstab entry includes `x-systemd.after=nfs-server.service` — without this the mount races against NFS startup and loses. A plain `_netdev` entry is not sufficient.

---
## Configuration Files Reference

### /etc/exports

```
/export                   *(ro,fsid=0,no_root_squash,no_subtree_check,crossmnt)
/export/Common            *(fsid=4,rw,no_subtree_check,insecure)
/export/Data              *(fsid=5,rw,no_subtree_check,insecure,crossmnt)
/export/Data/media/books  *(fsid=51,rw,no_subtree_check,insecure,nohide)
/export/Data/media/comics *(fsid=52,rw,no_subtree_check,insecure,nohide)
/export/Docker            *(fsid=29,rw,no_root_squash,sync,no_subtree_check,insecure)
/export/Green             *(fsid=30,rw,no_root_squash,no_subtree_check,insecure)
/export/photos            *(fsid=31,rw,no_root_squash,no_subtree_check,insecure)
```

### /etc/systemd/system/nfs-server.service.d/override.conf

```ini
[Unit]
After=zfs-import.target zfs-mount.service local-fs.target
Requires=zfs-import.target zfs-mount.service
```
### /etc/fstab (ZNAS system mounts only)

ZFS datasets are not listed here — ZFS handles its own mounting. Only system partitions appear:

```
# / - btrfs on nvme0n1p2
/dev/disk/by-uuid/40c60952-0340-4a78-81f9-5b2193da26c6  /      btrfs  defaults  0 1
# /boot - ext4 on nvme0n1p3
/dev/disk/by-uuid/4abb4efa-0b2b-4e4a-bcaf-78227db4628f  /boot  ext4   defaults  0 1
# swap
/dev/disk/by-uuid/d07437a0-3d0e-417a-a88e-438c603c2237  none   swap   sw        0 0
# /srv - btrfs on nvme0n1p5
/dev/disk/by-uuid/c66e81ff-436e-4d6f-980b-6f4875ea7c8e  /srv   btrfs  defaults  0 1
```
---

## Command Reference

- Show active exports: `sudo exportfs -v`
- Reload exports: `sudo exportfs -ra`
- Show available exports (from any host): `showmount -e 192.168.5.10`
- Restart NFS: `sudo systemctl restart nfs-server`
- Check NFS status: `systemctl status nfs-server`
- Verify ZFS mounts: `mount | grep export`
- Verify loopback: `mount | grep data/nfs`
239 Netgrimoire/Vault-Grimoire/ZFS/Storage-Layout.md Normal file
---
title: Netgrimoire Storage
description: Where is it at
published: true
date: 2026-02-23T18:38:27.621Z
tags:
editor: markdown
dateCreated: 2026-01-22T21:10:37.035Z
---

# NAS Storage Layout

## Overview

ZNAS is the primary NAS for Netgrimoire. It runs Ubuntu with OpenZFS and serves as the source of truth for all storage, including datasets that replicate out to the Pocket Grimoire portable system.

The system mounts everything under `/export/` for NFS sharing, with select datasets mounted under `/srv/` for local service consumption (Immich, NextCloud-AIO, Kopia, backup).

## ZFS Pools

- `vault` — primary NAS storage, RAIDZ1×2, 8 drives
- `greenpg` — Pocket Grimoire GREEN SSD (Kanguru UltraLock), docked for sync when present

## Zpool Architecture
```
  pool: vault
 state: ONLINE
  scan: scrub repaired 0B in 2 days 10:24:08 with 0 errors on Tue Feb 10 10:48:10 2026

config:
        NAME                                    STATE     READ WRITE CKSUM
        vault                                   ONLINE       0     0     0
          raidz1-0                              ONLINE       0     0     0
            ata-ST24000DM001-3Y7103_ZXA06K45    ONLINE       0     0     0
            ata-ST24000DM001-3Y7103_ZXA08CVY    ONLINE       0     0     0
            ata-ST24000DM001-3Y7103_ZXA0FP10    ONLINE       0     0     0
          raidz1-1                              ONLINE       0     0     0
            ata-ST16000NE000-2RW103_ZL2Q3275    ONLINE       0     0     0
            ata-ST16000NM001G-2KK103_ZL26R5XW   ONLINE       0     0     0
            ata-ST16000NT001-3LV101_ZRS0KVQW    ONLINE       0     0     0
            ata-WDC_WD140EDFZ-11A0VA0_9MG81N0J  ONLINE       0     0     0
            ata-WDC_WD140EDFZ-11A0VA0_Y5J35Z6C  ONLINE       0     0     0

errors: No known data errors
```

`raidz1-0` is 3× Seagate 24TB (~48TB usable). `raidz1-1` is 3× Seagate 16TB + 2× WD 14TB (~56TB usable — the 14TB drives are the limiting factor per stripe, leaving ~2TB/drive unused on the 16TB drives). Total pool: ~94T as reported by ZFS (55.3T used plus 39.0T currently available).
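The per-vdev estimates above follow from raidz1 arithmetic: one drive's worth of parity per vdev, with every member limited to the smallest drive's size. A quick sketch:

```python
# raidz1 usable capacity per vdev: one drive's worth of parity,
# and every member contributes only the smallest drive's size.
def raidz1_usable(drive_sizes_tb):
    return (len(drive_sizes_tb) - 1) * min(drive_sizes_tb)

vdev0 = raidz1_usable([24, 24, 24])          # 3x 24TB
vdev1 = raidz1_usable([16, 16, 16, 14, 14])  # 3x 16TB + 2x 14TB
print(vdev0, vdev1, vdev0 + vdev1)  # 48 56 104
```

The 104 is in decimal TB; ZFS reports capacity in binary units, which is why the pool shows roughly 94T (55.3T used + 39.0T available).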
```
  pool: greenpg
 state: ONLINE

config:
        NAME                                       STATE     READ WRITE CKSUM
        greenpg                                    ONLINE       0     0     0
          scsi-1Kanguru_UltraLock_DB090722NC10001  ONLINE       0     0     0

errors: No known data errors
```

`greenpg` is a portable pool. Export it before physically moving it to Pocket Grimoire.
## ZFS Datasets

| Dataset | Mountpoint | Used | Avail | Refer | Quota | Compression | Purpose |
|---------|-----------|------|-------|-------|-------|-------------|---------|
| `vault` | `/export` | 55.3T | 39.0T | 771G | none | 1.00x | Pool root / NFSv4 pseudo-root |
| `vault/Common` | `/export/Common` | 214G | 39.0T | 214G | none | 1.06x | General shared storage |
| `vault/Data` | `/export/Data` | 38.4T | 39.0T | 36.4T | none | 1.00x | Primary data — 36.4T lives directly in dataset root |
| `vault/Data/media_books` | `/export/Data/media/books` | 925G | 39.0T | 925G | none | 1.03x | Book library |
| `vault/Data/media_comics` | `/export/Data/media/comics` | 1.15T | 39.0T | 1.15T | none | 1.00x | Comic library |
| `vault/Green` | `/export/Green` | 14.7T | 5.31T | 9.66T | 20T | 1.00x | Personal media — 9.66T direct, 5.02T in Pocket child |
| `vault/Green/Pocket` | `/export/Green/Pocket` | 5.02T | 2.48T | 5.02T | 7.5T | 1.00x | Pocket Grimoire replication source |
| `vault/Kopia` | `/srv/vault/kopia_repository` | 349G | 39.0T | 349G | none | 1.02x | Kopia backup repository |
| `vault/NextCloud-AIO` | `/srv/NextCloud-AIO` | 341G | 39.0T | 341G | none | 1.01x | NextCloud data |
| `vault/Photos` | `/export/Photos` | 135K | 39.0T | 135K | none | 1.00x | Photos (sparse — see notes) |
| `vault/backup` | `/srv/vault/backup` | 442G | 582G | 442G | 1T | 1.00x | Local system backups |
| `vault/docker` | `/export/Docker` | 22.2G | 39.0T | 22.2G | none | 1.13x | Docker volumes |
| `vault/immich` | `/srv/immich` | 117G | 39.0T | 117G | none | 1.03x | Immich photo service data |
| `greenpg` | `/greenpg` | 2.94T | 4.20T | 96K | — | 1.00x | GREEN SSD pool root (portable) |
| `greenpg/Pocket` | `/greenpg/Pocket` | 2.94T | 4.20T | 2.94T | — | 1.00x | Personal media + Stash data |

**Notes on specific datasets:**

`vault/Data` — 36.4T lives directly in the dataset root at `/export/Data/`. `media_books` and `media_comics` are the only child datasets and account for ~2T combined. The remaining ~36T is general data stored directly under the parent.

`vault/Green` — 9.66T lives directly in `/export/Green/` with the remaining 5.02T in the `Pocket` child dataset. The 20T quota caps total Green growth. `vault/Green/Pocket` has its own 7.5T sub-quota.

`vault/Photos` — nearly empty (135K). Photos are primarily managed through Immich at `vault/immich`. This dataset may be vestigial or reserved for future use.

`vault/backup` — has a hard 1T quota. Unlike other vault datasets which draw from the full 39T pool availability, this dataset is capped. Current usage is 442G with 582G remaining.

Compression ratios are near 1.00x across most datasets because content is already compressed (media files, binary data). `vault/docker` (1.13x) and `vault/Common` (1.06x) see modest gains from compressible config and text data.
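For quota'd datasets, ZFS reports `avail = quota - used` (ignoring reservations), so the Used and Avail columns in the table should sum to the quota. A hypothetical consistency check on the figures above:

```python
# For a dataset with a quota, ZFS reports avail = quota - used
# (ignoring reservations), so used + avail should land on the quota.
quotas = {
    # dataset: (used_T, avail_T, quota_T) from the table above
    "vault/Green":        (14.70, 5.310, 20.0),
    "vault/Green/Pocket": (5.020, 2.480, 7.5),
    "vault/backup":       (0.442, 0.582, 1.0),
}
for name, (used, avail, quota) in quotas.items():
    assert abs((used + avail) - quota) < 0.1, name
print("quota arithmetic checks out")
```

The small residuals come from ZFS rounding its size reporting to three significant figures.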
## NFS Exports

All exports use NFSv4 with `/export` as the pseudo-filesystem root (`fsid=0`).

| Export | fsid | Options | Notes |
|--------|------|---------|-------|
| `/export` | 0 | `ro, no_root_squash, no_subtree_check, crossmnt` | NFSv4 pseudo-root — required for v4 clients |
| `/export/Common` | 4 | `rw, no_subtree_check, insecure` | General access |
| `/export/Data` | 5 | `rw, no_subtree_check, insecure, crossmnt` | Data root |
| `/export/Data/media/books` | 51 | `rw, no_subtree_check, insecure, nohide` | Separate ZFS dataset — needs `nohide` |
| `/export/Data/media/comics` | 52 | `rw, no_subtree_check, insecure, nohide` | Separate ZFS dataset — needs `nohide` |
| `/export/Docker` | 29 | `rw, no_root_squash, sync, no_subtree_check, insecure` | Container volumes |
| `/export/Green` | 30 | `rw, no_root_squash, no_subtree_check, insecure` | Personal media + Pocket Grimoire source |
| `/export/photos` | 31 | `rw, no_root_squash, no_subtree_check, insecure` | Photos |

Current `/etc/exports`:

```
/export                   *(ro,fsid=0,no_root_squash,no_subtree_check,crossmnt)
/export/Common            *(fsid=4,rw,no_subtree_check,insecure)
/export/Data              *(fsid=5,rw,no_subtree_check,insecure,crossmnt)
/export/Data/media/books  *(fsid=51,rw,no_subtree_check,insecure,nohide)
/export/Data/media/comics *(fsid=52,rw,no_subtree_check,insecure,nohide)
/export/Docker            *(fsid=29,rw,no_root_squash,sync,no_subtree_check,insecure)
/export/Green             *(fsid=30,rw,no_root_squash,no_subtree_check,insecure)
/export/photos            *(fsid=31,rw,no_root_squash,no_subtree_check,insecure)
```

There is also an active loopback NFSv4 mount on the system itself:

```
localhost:/ → /data/nfs/znas (NFSv4.2, rsize/wsize=1M)
```
## SMB Shares

*(To be documented.)*

## Standard Paths

- `/export/` — NFS root (vault pool root)
- `/export/Data/` — primary data
- `/export/Data/media/books/` — book library
- `/export/Data/media/comics/` — comic library
- `/export/Green/` — personal media
- `/export/Green/Pocket/` — Pocket Grimoire replication source
- `/export/Docker/` — container volumes
- `/export/Photos/` — photos
- `/srv/immich/` — Immich service data
- `/srv/NextCloud-AIO/` — NextCloud data
- `/srv/vault/kopia_repository/` — Kopia backup repo
- `/srv/vault/backup/` — local system backups
- `/greenpg/Pocket/` — GREEN SSD when docked for sync
## Permissions & UID/GID Model

*(To be documented — dockhand UID 1964, container access rules.)*

## Services Using Local Mounts

These datasets are consumed directly by services on ZNAS and are not NFS-exported:

| Service | Dataset | Mountpoint |
|---------|---------|------------|
| Immich | `vault/immich` | `/srv/immich` |
| NextCloud-AIO | `vault/NextCloud-AIO` | `/srv/NextCloud-AIO` |
| Kopia | `vault/Kopia` | `/srv/vault/kopia_repository` |
| Local backup | `vault/backup` | `/srv/vault/backup` |
## Pocket Grimoire Integration
|
||||
|
||||
`vault/Green/Pocket` is the replication source for the Pocket Grimoire GREEN SSD (`greenpg`). It contains personal media and Stash application data (database, previews, blobs). See the Pocket Grimoire deployment guide for full procedures.
|
||||
|
||||
**Fast resync when GREEN SSD is physically docked on ZNAS:**
|
||||
|
||||
```bash
|
||||
# Check pool name (retains whatever name it had when last exported)
|
||||
zpool list | grep greenpg
|
||||
|
||||
# Import if needed
|
||||
sudo zpool import greenpg
|
||||
sudo zfs load-key greenpg
|
||||
sudo zfs mount -a
|
||||
|
||||
# Sync
|
||||
sudo syncoid vault/Green/Pocket greenpg/Pocket
|
||||
|
||||
# Export before physically disconnecting — always do this
|
||||
sudo zfs unmount greenpg/Pocket
|
||||
sudo zfs unmount greenpg
|
||||
sudo zpool export greenpg
|
||||
```
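Before pulling the drive, it is worth confirming the export actually took. A small sketch (assumes the `greenpg` pool name from above):

```bash
# Verify greenpg is no longer imported before physically disconnecting.
if zpool list -H -o name 2>/dev/null | grep -qx greenpg; then
  echo "greenpg still imported: do NOT disconnect"
else
  echo "greenpg exported: safe to disconnect"
fi
```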

**Network sync** runs automatically on Pocket Grimoire via a 6-hour syncoid systemd timer when connected over the network.
## Backup & Snapshot Strategy

**Snapshots:**

```bash
# Manual pre-change snapshot
zfs snapshot vault/Docker@before-upgrade

# List all snapshots
zfs list -t snapshot

# List snapshots for a specific dataset
zfs list -t snapshot -r vault/Green
```

**Kopia:** Repository at `vault/Kopia` → `/srv/vault/kopia_repository`. *(Document snapshot policy and sources.)*

**Replication:** `vault/Green/Pocket` → `greenpg/Pocket` via syncoid. See Pocket Grimoire Integration above.
## Known Gotchas

**NFSv4 pseudo-root** — The `/export` entry with `fsid=0` is required for NFSv4 clients to enumerate subdirectories. Do not remove it even if it appears redundant.

**`nohide` on sub-datasets** — `vault/Data/media_books` and `vault/Data/media_comics` are separate ZFS datasets mounted beneath the `vault/Data` export path. NFS does not cross filesystem boundaries by default, so without `nohide` clients see empty directories at those paths.

**`vault/backup` quota** — This dataset has a hard 1T quota and does not share the general pool availability. Current headroom is ~582G. Monitor before large backup operations.

**`vault/Green` quota** — Capped at 20T total with a 7.5T sub-quota on `vault/Green/Pocket`. The GREEN SSD itself is ~7TB, so the sub-quota is the effective ceiling for the Pocket sync.

**raidz1-1 mixed drive sizes** — The three 16TB drives in raidz1-1 have ~2TB/drive going unused because RAIDZ1 stripes are limited by the smallest drive in the VDEV (the 14TB WDs). This capacity is permanently unavailable unless the VDEV is rebuilt.

**Kanguru UltraLock hardware encryption** — The GREEN SSD has hardware-level PIN protection in addition to ZFS encryption. The drive must be hardware-unlocked before `zpool import` will see it.

**Always export `greenpg` before disconnecting** — Export flushes writes and marks the pool clean. Pulling the drive without exporting risks a dirty import on next use.

**`vault/Data` root usage** — 36.4T lives directly in `/export/Data/` rather than in child datasets. This is normal for this setup, but it means `zfs list` on the parent alone shows the full usage without a breakdown.
## Command Reference

- Health: `zpool status`
- Space available to pool: `zpool list`
- Space available to datasets: `zfs list`
- Dataset configuration: `zfs get -r compression,dedup,recordsize,atime,quota,reservation vault`
- Create a snapshot: `zfs snapshot vault/Docker@before-upgrade`
- List snapshots: `zfs list -t snapshot`
- Reload NFS exports: `sudo exportfs -ra`
- Show active NFS exports: `sudo exportfs -v`
- Run a scrub: `sudo zpool scrub vault`
- Sync GREEN SSD: `sudo syncoid vault/Green/Pocket greenpg/Pocket`
`Netgrimoire/Vault-Grimoire/ZFS/ZFS-Commands.md`
---
title: ZFS Common Commands
description: ZFS Commands
published: true
date: 2026-02-20T04:26:23.798Z
tags: zfs commands
editor: markdown
dateCreated: 2026-01-31T15:23:07.585Z
---
# ZFS Essential Commands Cheat Sheet

---

## Pool Health & Status

```bash
zpool status
zpool status -v
zpool list
```

## Dataset Space & Usage

```bash
zfs list
zfs list -r vault
zfs list -o name,used,avail,refer,logicalused,compressratio
zfs list -r -o name,used,avail,refer,quota,reservation vault
```

## Dataset Properties & Settings

```bash
zfs get all vault/dataset
zfs get -r compression,dedup,recordsize,atime,quota,reservation vault
zfs get -r compression,dedup,recordsize,encryption,keylocation,keyformat,snapdir vault
zfs get -s local -r all vault
zfs get quota,refquota,reservation,refreservation -r vault
```

## Mount Encrypted Dataset

```bash
zfs load-key vault/Green/Pocket
zfs mount vault/Green/Pocket
```

## Pool I/O & Performance Monitoring

```bash
zpool iostat -v 1
arcstat 1
cat /proc/spl/kstat/zfs/arcstats
```

## Scrubs & Data Integrity

```bash
zpool scrub vault
zpool scrub -s vault
zpool status
```

## Snapshots

```bash
zfs snapshot vault/dataset@snapname
zfs list -t snapshot
zfs rollback vault/dataset@snapname
zfs clone vault/dataset@snapname vault/dataset-clone
```
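Ad-hoc names like `@before-upgrade` collide on repeat use; a date suffix keeps rollback targets unambiguous. A minimal helper sketch (the function name and format are illustrative, not part of the cheat sheet above):

```bash
# Build a timestamped snapshot name: dataset@label-YYYYMMDD
snapname() {
  printf '%s@%s-%s\n' "$1" "$2" "$(date +%Y%m%d)"
}

# Example: zfs snapshot "$(snapname vault/Docker before-upgrade)"
```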
## Replication (Send / Receive)

```bash
zfs send vault/dataset@snap1 | zfs receive backup/dataset
zfs send -i snap1 vault/dataset@snap2 | zfs receive backup/dataset
zfs send -nv vault/dataset@snap1
```

## Dataset Tuning (Live-Safe Changes)

```bash
zfs set compression=lz4 vault/dataset
zfs set recordsize=1M vault/dataset
zfs set atime=off vault/dataset
zfs set dedup=on vault/dataset
```

## Encryption Management

```bash
zfs get encryption,keylocation,keystatus vault/dataset
zfs unload-key vault/dataset
zfs load-key vault/dataset
```

## Disk Preparation & Cleanup

```bash
wipefs /dev/sdX
wipefs -a /dev/sdX
zpool labelclear -f /dev/sdX
sgdisk --zap-all /dev/sdX
lsblk -f /dev/sdX
```

## Pool Expansion (Add VDEV)

```bash
zpool add vault raidz2 \
  /dev/disk/by-id/disk1 \
  /dev/disk/by-id/disk2 \
  /dev/disk/by-id/disk3 \
  /dev/disk/by-id/disk4 \
  /dev/disk/by-id/disk5
```

## Pool Import / Recovery

```bash
zpool import
zpool import vault
zpool import -f vault
zpool import -o readonly=on vault
```

## Locks, Holds & History

```bash
zfs holds -r vault
zpool history
zfs diff vault/dataset@snap1 vault/dataset@snap2
```

## Deduplication & Compression Stats

```bash
zpool list -v
zdb -DD vault
```

## Inventory / Documentation Dumps

```bash
zpool status > zpool-status.txt
zfs list -r > zfs-layout.txt
zfs get -r all vault > zfs-settings.txt
```

## Top 10 Must-Know Commands

```bash
zpool status
zpool list
zpool iostat -v 1
zpool scrub vault
zfs list
zfs get all vault/dataset
zfs snapshot vault/dataset@snap
zfs rollback vault/dataset@snap
zfs send | zfs receive
arcstat 1
```
`Netgrimoire/Ward-Grimoire/Access/Auth-Overview.md`
---
title: Authentication Overview
description: SSO, LDAP, and access control in Netgrimoire
published: true
date: 2026-04-12T00:00:00.000Z
tags: ward, auth, sso
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Authentication Overview

## SSO Providers

| Provider | Scope | URL |
|----------|-------|-----|
| Authentik | `*.netgrimoire.com` | Protected via `caddy.import_1: authentik` label |
| Authelia | `*.wasted-bandwidth.net` | Green Grimoire + Shadow Grimoire services |

Both providers use LLDAP as their LDAP backend.

## LLDAP

Lightweight LDAP directory at `ldap.netgrimoire.com`. Postgres backend. Provides the user directory for both Authentik and Authelia.

See [LDAP Client Setup](/Ward-Grimoire/Access/LDAP-Client-Setup) for configuring hosts to authenticate via LLDAP.

## Vaultwarden

Password manager at `pass.netgrimoire.com`. Protected by Authentik.

## WireGuard

5 VPN peers on 192.168.32.0/24. Managed in OPNsense. See [Host Inventory](/Keystone-Grimoire/Hosts/Host-Inventory) for peer assignments.

## YubiKey (Planned)

- PIV SSH authentication on all hosts — highest-impact pending integration
- Challenge-response for LUKS / Kopia key derivation on znas
`Netgrimoire/Ward-Grimoire/Access/LDAP-Client-Setup.md`
---
title: LDAP Client Setup
description:
published: true
date: 2026-02-20T04:33:31.862Z
tags:
editor: markdown
dateCreated: 2026-01-21T13:21:40.588Z
---

# ✅ LLDAP + SSSD Node Join Checklist (FINAL)
**Assumptions**

- LLDAP server: docker4
- LDAP URI: `ldap://docker4:3890`
- Base DN: `dc=netgrimoire,dc=com`
- Users/groups use lowercase attributes (`uidnumber`, `gidnumber`, `homedirectory`, `unixshell`, `uniquemember`)
- No TLS (lab only)
- Docker group GID = 1964 in LDAP
- This node is Ubuntu/Debian-based

## 0️⃣ Safety first (do this every time)

- Open two SSH sessions to the node
- Confirm you can `sudo`
- Do not edit `nsswitch.conf` until SSSD is confirmed working

## 1️⃣ Install required packages

```bash
sudo apt update
sudo apt install -y sssd sssd-ldap sssd-tools libpam-sss libnss-sss libsss-sudo ldap-utils oddjob oddjob-mkhomedir
```

Ensure legacy LDAP NSS is NOT installed:

```bash
sudo apt purge -y libnss-ldap libpam-ldap nslcd libnss-ldapd libpam-ldapd || true
sudo apt autoremove -y
```

## 2️⃣ Verify LDAP connectivity (must pass)

```bash
getent hosts docker4
nc -vz docker4 3890
ldapwhoami -x -H ldap://docker4:3890 \
  -D 'uid=admin,ou=people,dc=netgrimoire,dc=com' -w 'F@lcon13'
```

❌ If any fail → stop and fix networking/DNS/firewall.

## 3️⃣ Create `/etc/sssd/sssd.conf` (single file, no includes)

```bash
sudo vi /etc/sssd/sssd.conf
```

Paste exactly:

```ini
[sssd]
services = nss, pam, ssh
config_file_version = 2
domains = netgrimoire.com

[nss]
filter_users = root
filter_groups = root

[pam]
offline_failed_login_attempts = 3
offline_failed_login_delay = 5

[ssh]

[domain/netgrimoire.com]
id_provider = ldap
auth_provider = ldap
chpass_provider = ldap
access_provider = permit

enumerate = false
cache_credentials = true

ldap_uri = ldap://docker4:3890
ldap_schema = rfc2307bis
ldap_search_base = dc=netgrimoire,dc=com

ldap_auth_disable_tls_never_use_in_production = true
ldap_id_use_start_tls = false
ldap_tls_reqcert = never

ldap_default_bind_dn = uid=admin,ou=people,dc=netgrimoire,dc=com
ldap_default_authtok = F@lcon13

# USERS (lowercase attributes)
ldap_user_search_base = ou=people,dc=netgrimoire,dc=com
ldap_user_object_class = posixAccount
ldap_user_name = uid
ldap_user_gecos = cn
ldap_user_uid_number = uidnumber
ldap_user_gid_number = gidnumber
ldap_user_home_directory = homedirectory
ldap_user_shell = unixshell

# GROUPS (lowercase attributes)
ldap_group_search_base = ou=groups,dc=netgrimoire,dc=com
ldap_group_object_class = groupOfUniqueNames
ldap_group_name = cn
ldap_group_gid_number = gidnumber
ldap_group_member = uniquemember
```

## 4️⃣ Fix permissions (SSSD will NOT start without this)

```bash
sudo chown root:root /etc/sssd/sssd.conf
sudo chmod 600 /etc/sssd/sssd.conf
sudo chmod 700 /etc/sssd
```

Validate:

```bash
sudo sssctl config-check
```

## 5️⃣ Start SSSD cleanly

```bash
sudo systemctl enable sssd
sudo systemctl stop sssd
sudo rm -f /var/lib/sss/db/* /var/lib/sss/mc/*
sudo systemctl start sssd
```

Verify:

```bash
sudo systemctl status sssd --no-pager -l
sudo sssctl domain-status netgrimoire.com
```

Expected:

```
Online status: Online
LDAP: docker4
```

## 6️⃣ Enable NSS lookups via SSSD (LDAP-first)

Edit `/etc/nsswitch.conf`:

```
passwd: sss files systemd
group: sss files systemd
shadow: sss files
```

Test:

```bash
getent passwd graymutt
getent group docker
id graymutt
```

## 7️⃣ 🔑 RE-INITIALIZE PAM (THIS IS THE STEP YOU REMEMBERED)

This step is mandatory on Debian/Ubuntu.

```bash
sudo pam-auth-update
```

In the menu, ENABLE:

- ✅ Unix authentication
- ✅ SSSD
- ✅ Create home directory on login

DISABLE:

- ❌ LDAP Authentication (legacy)
- ❌ Kerberos (unless you explicitly use it)

Press OK.

## 8️⃣ Verify PAM wiring

```bash
grep pam_sss.so /etc/pam.d/common-*
grep pam_mkhomedir /etc/pam.d/common-session
```

You should see:

```
session required pam_mkhomedir.so skel=/etc/skel umask=0022
```

## 9️⃣ Final login test (definitive)

```bash
ssh graymutt@localhost
```

Expected:

- Login succeeds
- `/home/graymutt` is auto-created
- Correct LDAP groups present

## 🔟 (Optional but recommended) Remove local docker group

If the node has a local docker group (gid 998):

```bash
sudo groupdel docker
```

Verify:

```bash
getent group docker
```

Expected:

```
docker:x:1964:graymutt,dockhand
```
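The expected-output check above can be scripted. A small sketch that asserts a group resolves to the intended GID via NSS, so it exercises the `sss` path configured in `nsswitch.conf` (the function name is illustrative):

```bash
# Assert a group resolves to the expected GID, e.g. the LDAP docker group (1964).
assert_gid() {
  group="$1"; want="$2"
  line=$(getent group "$group") || { echo "group $group not found"; return 1; }
  gid=$(printf '%s' "$line" | cut -d: -f3)
  if [ "$gid" = "$want" ]; then
    echo "$group gid OK ($gid)"
  else
    echo "$group gid is $gid, expected $want"
    return 1
  fi
}

# Example: assert_gid docker 1964
```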

## 🧪 Fast troubleshooting commands

```bash
sudo sssctl domain-status netgrimoire.com
sudo tail -n 200 /var/log/sssd/sssd_netgrimoire.com.log
sudo systemctl status sssd --no-pager -l
```
`Netgrimoire/Ward-Grimoire/Firewall/Blocklists.md`
---
title: Opnsense - Additional Blocklists
description: Blocklists
published: true
date: 2026-02-23T21:54:13.019Z
tags:
editor: markdown
dateCreated: 2026-02-23T21:46:39.562Z
---

# OPNsense Additional Blocklists

**Service:** Firewall Aliases — URL Table blocklists
**Host:** OPNsense firewall
**Applies To:** WAN and ATT interfaces
**Update Frequency:** Daily (automatic)

---

## Overview

Your firewall already uses Spamhaus DROP and EDROP as IP blocklists. These three additional lists fill specific gaps that Spamhaus does not cover:

| List | What It Blocks | Why It's Needed |
|---|---|---|
| Feodo Tracker | Botnet command & control IPs | Stops malware on your network phoning home |
| Abuse.ch SSLBL | IPs with malicious SSL certificates | Catches malware that uses HTTPS to hide C2 traffic |
| Emerging Threats | Confirmed active attack IPs | Broad coverage of IPs currently conducting scans and exploits |

These work at the **firewall alias level** — the same mechanism as your existing Spamhaus lists. Traffic from/to these IPs is blocked before it reaches any service.

> ✓ These lists are also used by Suricata internally. Adding them as firewall aliases provides a second, independent enforcement point at the packet filter level — meaning blocks happen even if Suricata is restarted or temporarily inactive.

---

## Current Blocklist State

From your configuration, these lists are already present and working:

| Alias | List | Status |
|---|---|---|
| SpamHaus_Drop | Spamhaus DROP | ⚠ Alias active, **rule disabled** |
| Spamhaus_edrop | Spamhaus EDROP | ⚠ Alias active, **rule disabled** |
| crowdsec_blacklists | CrowdSec IPv4 | ✓ Active |
| crowdsec6_blacklists | CrowdSec IPv6 | ✓ Active |

> ⚠ **First priority:** Before adding new blocklists, re-enable the existing Spamhaus block rules. See the Re-enable Existing Rules section at the bottom of this document.

---
## Step 1 — Add Feodo Tracker Alias

Navigate to **Firewall → Aliases → Add**

| Field | Value |
|---|---|
| Name | `Feodo_Tracker` |
| Type | `URL Table (IPs)` |
| Description | `Abuse.ch Feodo Tracker — Botnet C2 IPs` |
| URL | `https://feodotracker.abuse.ch/downloads/ipblocklist.txt` |
| Refresh Frequency | `1` day |
| Enabled | ✓ |

Click **Save**, then **Apply Changes**.

**Verify the list loaded:**
Go to **Firewall → Diagnostics → Aliases**, select `Feodo_Tracker` — you should see a list of IP addresses populated.

---

## Step 2 — Add Abuse.ch SSLBL Alias

Navigate to **Firewall → Aliases → Add**

| Field | Value |
|---|---|
| Name | `AbuseCH_SSLBL` |
| Type | `URL Table (IPs)` |
| Description | `Abuse.ch SSL Blacklist — Malicious SSL certificate IPs` |
| URL | `https://sslbl.abuse.ch/blacklist/sslipblacklist.txt` |
| Refresh Frequency | `1` day |
| Enabled | ✓ |

Click **Save**, then **Apply Changes**.

> ✓ The SSL Blacklist specifically targets IPs that have been observed using SSL/TLS certificates associated with malware botnets. It catches C2 traffic that would otherwise be hidden inside HTTPS.

---

## Step 3 — Add Emerging Threats Alias

Navigate to **Firewall → Aliases → Add**

| Field | Value |
|---|---|
| Name | `ET_Block_IPs` |
| Type | `URL Table (IPs)` |
| Description | `Emerging Threats — Active attack and scanning IPs` |
| URL | `https://rules.emergingthreats.net/fwrules/emerging-Block-IPs.txt` |
| Refresh Frequency | `1` day |
| Enabled | ✓ |

Click **Save**, then **Apply Changes**.

---
## Step 4 — Create Firewall Block Rules

One block rule per alias, applied to both WAN and ATT interfaces. Add these rules **above** your existing PASS rules on each interface.

Navigate to **Firewall → Rules → WAN**

### Rule 1 — Block Feodo Tracker (WAN)

Click **Add** (add to top of ruleset):

| Field | Value |
|---|---|
| Action | Block |
| Interface | WAN |
| Direction | in |
| Protocol | any |
| Source | `Feodo_Tracker` (single host or alias) |
| Destination | any |
| Description | `Block Feodo Tracker botnet C2` |
| Log | ✓ Enable logging |

Click **Save**.

### Rule 2 — Block Abuse.ch SSLBL (WAN)

| Field | Value |
|---|---|
| Action | Block |
| Interface | WAN |
| Direction | in |
| Protocol | any |
| Source | `AbuseCH_SSLBL` |
| Destination | any |
| Description | `Block Abuse.ch SSL Blacklist` |
| Log | ✓ Enable logging |

Click **Save**.

### Rule 3 — Block Emerging Threats (WAN)

| Field | Value |
|---|---|
| Action | Block |
| Interface | WAN |
| Direction | in |
| Protocol | any |
| Source | `ET_Block_IPs` |
| Destination | any |
| Description | `Block Emerging Threats IPs` |
| Log | ✓ Enable logging |

Click **Save**.

Click **Apply Changes** on the WAN rules page.

### Repeat for ATT Interface

Navigate to **Firewall → Rules → ATT** and add the same three rules with `Interface: ATT`. This ensures blocking applies to both WANs during the transition period, and only ATT after WAN is retired.

---

## Step 5 — Also Block Outbound (Optional but Recommended)

Adding outbound blocks catches the case where an internal device is already compromised and attempting to contact C2 infrastructure. Apply to the LAN interface, direction **out**:

Navigate to **Firewall → Rules → LAN**, add rules with:
- Direction: `out`
- Source: `any`
- Destination: the respective alias (`Feodo_Tracker`, `AbuseCH_SSLBL`, `ET_Block_IPs`)
- Action: `Block`

This means even if malware bypasses inbound filtering, outbound connections to known C2 IPs are still blocked.

---

## Re-enable Existing Spamhaus Rules

While you are in the firewall rules, re-enable the three currently disabled rules:

Navigate to **Firewall → Rules → WAN**

Find these three rules (they appear greyed out):
1. `Block DROP` — source: SpamHaus_Drop
2. `Block EDROP` — source: Spamhaus_edrop
3. GeoIP country block — source: Blocked_Countries

Click the **enable toggle** (grey circle icon) on each rule to enable them. Click **Apply Changes**.

> ✓ These aliases are already populated and refreshing automatically. The only reason they were not blocking is that the rules were disabled. Enabling them requires no other changes.

---

## Verifying Blocklists Are Working

### Check Alias Contents

**Firewall → Diagnostics → Aliases** — select each alias to see the current list of blocked IPs and confirm they are populated.
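Alias contents can also be spot-checked offline. A rough sketch that validates a downloaded URL-table file before trusting it: every non-comment line should parse as an IPv4 address or CIDR (the function name is illustrative; the abuse.ch lists use `#` comments, and IPv6 entries would need a wider pattern):

```bash
# Rough validation of a downloaded IP blocklist file.
check_blocklist() {
  f="$1"
  entries=$(grep -Ev '^(#|$)' "$f" | tr -d '\r')   # drop comments, blanks, CRLF
  total=$(printf '%s\n' "$entries" | grep -c .)
  bad=$(printf '%s\n' "$entries" | grep -Evc '^[0-9]{1,3}(\.[0-9]{1,3}){3}(/[0-9]{1,2})?$')
  if [ "$total" -gt 0 ] && [ "$bad" -eq 0 ]; then
    echo "blocklist OK: $total entries"
  else
    echo "blocklist suspect: $total entries, $bad malformed"
    return 1
  fi
}
```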

### Check Firewall Logs

**Firewall → Log Files → Live View** — filter by the rule description (e.g., `Feodo Tracker`) to see blocks in real time.

### Check Update Schedule

Aliases refresh on the schedule set during creation. To force an immediate refresh:
**Firewall → Diagnostics → Aliases → select alias → Flush + Force Update**

---

## Complete Blocklist Summary

After implementing all of the above, your firewall enforces the following IP blocklists:

| Alias | List | Covers | Update |
|---|---|---|---|
| SpamHaus_Drop | Spamhaus DROP | Hijacked/compromised netblocks | Daily |
| Spamhaus_edrop | Spamhaus EDROP | Extended DROP — bogon routes | Daily |
| Feodo_Tracker | Feodo Tracker | Botnet C2 IPs | Daily |
| AbuseCH_SSLBL | Abuse.ch SSLBL | Malicious SSL certificate IPs | Daily |
| ET_Block_IPs | Emerging Threats | Active scanners & attack IPs | Daily |
| crowdsec_blacklists | CrowdSec | Community-reported bad IPs (IPv4) | Real-time |
| crowdsec6_blacklists | CrowdSec | Community-reported bad IPs (IPv6) | Real-time |
| Blocked_Countries | MaxMind GeoIP | 70 blocked countries | Weekly |

Combined with Suricata (content inspection) and CrowdSec (IP reputation), this gives you a comprehensive multi-layer perimeter.

---

## Related Documentation

- [OPNsense Firewall](./opnsense-firewall) — parent firewall documentation, full alias list
- [Suricata IDS/IPS](./suricata-ids-ips) — content inspection layer, also uses these feed sources
- [CrowdSec](./crowdsec) — real-time IP reputation blocking
`Netgrimoire/Ward-Grimoire/Firewall/OPNsense-Git-Backup.md`
---
title: OpnSense - GIT Integration
description: Git Integration
published: true
date: 2026-02-23T21:53:24.522Z
tags:
editor: markdown
dateCreated: 2026-02-23T21:48:01.779Z
---

# OPNsense Git Backup (os-git-backup)

**Service:** os-git-backup
**Plugin:** os-git-backup
**Host:** OPNsense firewall
**Remote:** Forgejo on Netgrimoire
**Trigger:** Automatic on every config change

---

## Overview

Every change made to OPNsense — adding a firewall rule, updating an alias, changing a VPN config — modifies the underlying XML configuration file. By default there is no history of these changes. If a misconfiguration causes an outage, or if you need to audit what changed after a security incident, you have no record to work from.

os-git-backup solves this by committing the OPNsense configuration to a Git repository automatically every time a change is saved. Each commit records exactly what changed, when, and (if configured) which user made the change.

**Benefits:**
- Full audit trail of every configuration change
- One-command rollback to any previous state
- Offsite backup of firewall config via the Forgejo → Kopia chain
- Diff view to understand exactly what a change did

---

## Prerequisite: Create Forgejo Repository

Before installing the plugin, create a dedicated repository in Forgejo to receive the OPNsense config backups.

1. Log into your Forgejo instance on Netgrimoire
2. Create a new repository: `opnsense-config`
3. Set visibility to **Private** — firewall configs contain sensitive network topology
4. Do not initialize with a README (the plugin will push the first commit)
5. Note the SSH clone URL: `git@git.netgrimoire.com:youruser/opnsense-config.git`

---

## Installation

### Step 1 — Install the Plugin

1. Go to **System → Firmware → Plugins**
2. Search for `os-git-backup`
3. Click the **+** install button
4. Wait for installation to complete
5. Navigate to **System → Configuration → Backups** — a **Git** tab will appear

---

## Configuration

### Step 2 — Generate SSH Deploy Key

The OPNsense firewall needs an SSH key to authenticate to Forgejo without a password.

Navigate to **System → Configuration → Backups → Git**

1. Click **Generate SSH Key**
2. Copy the displayed **public key** — you will add this to Forgejo next

### Step 3 — Add Deploy Key to Forgejo

1. In Forgejo, go to your `opnsense-config` repository
2. Navigate to **Settings → Deploy Keys**
3. Click **Add Deploy Key**
4. Title: `OPNsense Firewall`
5. Key: paste the public key from Step 2
6. Enable **Allow Write Access** — the firewall needs to push commits
7. Click **Add Key**

### Step 4 — Configure the Plugin

Navigate to **System → Configuration → Backups → Git**

| Setting | Value | Notes |
|---|---|---|
| Enabled | ✓ | |
| URL | `git@git.netgrimoire.com:youruser/opnsense-config.git` | SSH URL from your Forgejo repo |
| Branch | `main` | |
| Name | `OPNsense Firewall` | Author name shown in commits |
| Email | `opnsense@netgrimoire.com` | Author email shown in commits |
| SSH Private Key | (auto-populated from Step 2) | |
| Backup Interval | On change | Commits every time config is saved |

Click **Save**.

### Step 5 — Test the Connection

Click **Backup Now** to trigger a manual backup. Then check your Forgejo repository — you should see an initial commit containing the OPNsense configuration XML.

If the push fails, check:
1. The deploy key has write access in Forgejo
2. The SSH URL is correct (use SSH, not HTTPS)
3. Forgejo is reachable from the firewall — test from the OPNsense shell:

```bash
ssh -T git@git.netgrimoire.com
# Expected: Hi youruser! You've successfully authenticated...
```

---

## What Gets Backed Up

The plugin commits the OPNsense configuration file:

`/conf/config.xml`

This single file contains **everything** — interfaces, firewall rules, NAT, VPN configs, aliases, users, certificates, DHCP, DNS settings, and all plugin configurations. A restore from this file fully recreates the firewall state.

> ⚠ The config.xml contains **hashed passwords**, **VPN private keys**, and **API credentials**. The Forgejo repository must remain private. Ensure your Forgejo instance is not publicly accessible or that this repository is explicitly private.

---

## Using the Backup

### Viewing History

In Forgejo, navigate to the `opnsense-config` repository. Each commit represents one configuration save, with:
- Timestamp of the change
- Diff showing exactly what XML changed
- Author (OPNsense Firewall)

### Rolling Back a Change

If a configuration change causes problems:

**Option 1 — Restore via OPNsense UI:**
1. In Forgejo, find the commit you want to restore
2. Download the `config.xml` from that commit
3. In OPNsense: **System → Configuration → Backups → Restore**
4. Upload the config.xml and restore

**Option 2 — Restore via shell (if UI is unreachable):**
```bash
# SSH into OPNsense
ssh root@192.168.3.4

# The git repo is cloned locally — find it
find /conf -name ".git" -type d

# Check out the previous config
cd /conf/backup  # or wherever the repo is cloned
git log --oneline -10
git checkout <commit-hash> -- config.xml

# Apply the restored config
/usr/local/sbin/opnsense-importer config.xml
```
|
||||
|
||||
### Diffing Changes
|
||||
|
||||
To see exactly what a specific change did:
|
||||
|
||||
```bash
|
||||
# In Forgejo: click any commit → view the diff
|
||||
# Alternatively, from the OPNsense shell:
|
||||
cd <git repo path>
|
||||
git diff HEAD~1 HEAD -- config.xml
|
||||
```
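Raw XML diffs can be noisy. A small script can reduce two config revisions to just the leaf values that changed — a sketch only, assuming unique sibling tags (which a real config.xml does not always guarantee):

```python
import xml.etree.ElementTree as ET

def changed_leaves(old_xml: str, new_xml: str) -> dict:
    """Compare two OPNsense-style config documents and report leaf
    elements whose text changed, keyed by element path.
    Paths assume unique sibling tags — fine for a sketch."""
    def leaves(xml_text):
        out = {}
        def walk(el, path):
            children = list(el)
            if not children:
                out[path] = (el.text or "").strip()
            for c in children:
                walk(c, f"{path}/{c.tag}")
        root = ET.fromstring(xml_text)
        walk(root, root.tag)
        return out
    old_leaves, new_leaves = leaves(old_xml), leaves(new_xml)
    return {p: (old_leaves.get(p), new_leaves.get(p))
            for p in old_leaves.keys() | new_leaves.keys()
            if old_leaves.get(p) != new_leaves.get(p)}

old = "<opnsense><system><hostname>OPNsense</hostname></system></opnsense>"
new = "<opnsense><system><hostname>fw01</hostname></system></opnsense>"
print(changed_leaves(old, new))
# → {'opnsense/system/hostname': ('OPNsense', 'fw01')}
```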
---

## Integration with Kopia Backups

Since the git repository lives in Forgejo on Netgrimoire, it is automatically included in the Netgrimoire Kopia backup chain — no additional configuration needed. The OPNsense config history is backed up offsite along with everything else.

---

## Related Documentation

- [OPNsense Firewall](./opnsense-firewall) — parent firewall documentation
- [Forgejo](./forgejo) — Git repository host on Netgrimoire
- [Kopia Backups](./kopia) — offsite backup chain
`Netgrimoire/Ward-Grimoire/Firewall/OPNsense.md`
---
title: OPNsense
description: Grimoire Firewall Configuration
published: true
date: 2026-02-23T21:31:26.008Z
tags:
editor: markdown
dateCreated: 2026-02-23T21:31:15.244Z
---

# OPNsense Firewall

**Host:** OPNsense.localdomain
**Timezone:** America/Chicago
**Documented:** February 23, 2026
**Status:** Active — AT&T migration in progress

---

## Overview

The network perimeter is protected by an OPNsense firewall running on dedicated hardware with four physical Intel i226-V NICs (igc0–igc3). The firewall operates in a dual-WAN configuration during the transition from the legacy ISP to AT&T fiber, with AT&T becoming the permanent primary WAN. CrowdSec threat intelligence, GeoIP blocking, and Spamhaus DROP/EDROP lists provide layered perimeter security.
---

## Hardware & System

| Parameter | Value |
|---|---|
| Hostname | OPNsense |
| Domain | localdomain |
| Timezone | America/Chicago |
| Language | en_US |
| NAT Outbound Mode | Hybrid |
| System DNS | 8.8.8.8 (Google) — see DNS notes |
| DNS Allow Override | Enabled |
| SSH | Enabled (port 22) |
| Console Menu | Disabled (hardened) |

> ⚠ **DNS Note:** The system upstream DNS is set to 8.8.8.8. If dnscrypt-proxy or Unbound is configured, this should be updated to point to localhost or the internal DNS resolver (192.168.5.7). Review before enabling encrypted DNS.

---

## Network Interfaces

| Interface | Label | Physical NIC | IP Address | Role |
|---|---|---|---|---|
| wan | WAN | igc0 | 24.249.193.114/28 | Legacy primary WAN — being retired |
| opt1 | ATT | igc1 | 107.133.34.145/28 | New primary WAN — AT&T fiber |
| lan | LAN | igc3 | 192.168.3.4/29 | Internal LAN management segment |
| opt3 | OPT3 | igc2 | DHCP | Unassigned — spare interface |
| opt2 / wg1 | WG1 | wg1 (virtual) | WireGuard tunnel | WireGuard VPN interface |
| openvpn | OpenVPN | virtual | Tunnel only | OpenVPN (server + client configured) |
| lo0 | Loopback | lo0 | 127.0.0.1/8 | System loopback |

> ⚠ **OPT3 (igc2)** is on DHCP and currently unassigned. Disable this interface or assign it a role to reduce unnecessary attack surface.
---

## Gateways & Routing

### Active Gateways

| Gateway Name | Interface | IP | Role |
|---|---|---|---|
| WAN_DefRoute | wan (igc0) | 24.249.193.114 | Legacy default route — being retired |
| ATT | opt1 (igc1) | 107.133.34.145 | AT&T — becoming primary |
| LAN_GWv4 | lan (igc3) | 192.168.3.4 | LAN gateway |

### NAT Outbound Rules

Outbound NAT runs in **Hybrid** mode — automatic rules supplemented by the manual overrides below.

| Interface | Source | NAT Target | Purpose |
|---|---|---|---|
| opt1 (ATT) | ATT_Out_1 group | opt1ip | Dad's Laptop + 192.168.5.128/25 out ATT |
| wan | MailCow_Ngnx (192.168.5.16) | 24.249.193.115 | Mail server — dedicated WAN IP |
| wan | PNCHarris_Internal | wanip | Internal subnets egress |
| wan | WireGuard (opt2) | — | WireGuard outbound NAT |

> ✓ The mail server already has a dedicated outbound IP (24.249.193.115) on WAN. This pattern should be replicated on ATT using a dedicated virtual IP from the static block.
---

## Firewall Aliases

### Host Aliases

| Alias | IP Address | Used For |
|---|---|---|
| caddy | 192.168.5.10 | Caddy reverse proxy |
| MailCow_Ngnx | 192.168.5.16 | MailCow nginx container |
| JellyFin_Host | 192.168.5.18 | Jellyfin media server |
| ISPConfig_Host | 192.168.4.11 | ISPConfig control panel |
| Dads_Laptop | 192.168.5.176 | Routed out ATT interface |

### Network Aliases

| Alias | Value | Used For |
|---|---|---|
| PNCHarris_Internal | 192.168.5.0/25, 192.168.3.0/24 | Primary internal subnets |
| Subnet_5_128_Mask_25 | 192.168.5.128/25 | Upper half of 192.168.5.x |
| ATT_Out_1 | Dads_Laptop + Subnet_5_128_Mask_25 | Traffic routed out ATT interface |
| Family_Subnet | (empty) | Defined but unpopulated |

### Port Aliases

| Alias | Ports | Used For |
|---|---|---|
| Web_Services | 80, 443 | HTTP/HTTPS |
| MailCow | 25, 110, 143, 465, 587, 993, 995, 4190 | Full MailCow mail protocol suite |
| ISPConfig | 25, 53, 143, 465, 587, 993, 995, 8080 | ISPConfig mail + DNS + admin |
| JellyFin_Port | 8096, 7096 | Jellyfin HTTP + HTTPS |
| Plex_Port_2 | (empty) | Defined but unpopulated |

### Security & Threat Intelligence Aliases

| Alias | Type | Source | Status |
|---|---|---|---|
| SpamHaus_Drop | URL Table | https://www.spamhaus.org/drop/drop.txt | ⚠ Rule DISABLED |
| Spamhaus_edrop | URL Table | https://www.spamhaus.org/drop/edrop.txt | ⚠ Rule DISABLED |
| Blocked_Countries | GeoIP | 70 countries — see GeoIP section | ⚠ Rule DISABLED |
| crowdsec_blacklists | External | CrowdSec IPv4 decisions | ✓ Active |
| crowdsec6_blacklists | External | CrowdSec IPv6 decisions | ✓ Active |
| crowdsec_blocklists | External | CrowdSec IPv4 (duplicate) | ✓ Active |
| crowdsec6_blocklists | External | CrowdSec IPv6 decisions (duplicate) | ✓ Active |

> ⚠ **Critical:** Spamhaus DROP, Spamhaus EDROP, and GeoIP country blocking are all defined and populated but their firewall rules are **disabled**. These are not currently being enforced. Re-enable these rules as an immediate priority.

> ⚠ There are duplicate CrowdSec alias pairs (`crowdsec_blacklists` and `crowdsec_blocklists` both handle IPv4). Review and consolidate to avoid confusion.
---

## Firewall Rules

### WAN Rules

| Action | Protocol | Source | Destination | Port(s) | Enabled | Description |
|---|---|---|---|---|---|---|
| BLOCK | Any | SpamHaus_Drop | Any | Any | ❌ No | Block Spamhaus DROP list |
| BLOCK | Any | Spamhaus_edrop | Any | Any | ❌ No | Block Spamhaus EDROP list |
| BLOCK | Any | Blocked_Countries | Any | Any | ❌ No | GeoIP country block |
| PASS | TCP | Any | MailCow_Ngnx | MailCow ports | ✓ Yes | Inbound mail |
| PASS | TCP | Any | JellyFin_Host | 8096, 7096 | ✓ Yes | Jellyfin access |
| PASS | UDP | Any | WAN IP | 51820 | ✓ Yes | WireGuard VPN ingress |
| PASS | TCP | Any | MailCow_Ngnx | 80, 443 | ✓ Yes | MailCow webmail |
| PASS | TCP | Any | caddy (192.168.5.10) | 80, 443 | ✓ Yes | Caddy reverse proxy |

> ⚠ All three block rules at the top of the WAN ruleset are disabled. The firewall is currently not enforcing Spamhaus or GeoIP blocking despite the aliases being populated.

### LAN Rules

| Action | Protocol | Source | Destination | Description |
|---|---|---|---|---|
| PASS | Any | ATT_Out_1 group | Any | Dad's Laptop + upper subnet out ATT |
| PASS | Any | LAN subnet | Any | Default allow LAN to any |
| PASS | Any | PNCHarris_Internal | Any | Internal subnets to any |
| PASS | Any | LAN subnet | Any | Default allow LAN IPv6 to any |
| PASS | TCP | PNCHarris_Internal | ISPConfig_Host:ISPConfig | LAN → ISPConfig redirect |
| PASS | TCP | PNCHarris_Internal | ISPConfig_Host:80/443 | LAN → ISPConfig web redirect |
| PASS | TCP | PNCHarris_Internal | caddy:80/443 | LAN → Caddy redirect |
| PASS | TCP | PNCHarris_Internal | MailCow_Ngnx:MailCow | LAN → MailCow redirect |

### WireGuard Interface Rules

| Action | Protocol | Source | Destination | Description |
|---|---|---|---|---|
| PASS | Any | Any | Any | Allow all from WireGuard peers — unrestricted |

> ⚠ The WireGuard interface allows all traffic from all peers with no restrictions. Consider scoping rules per peer as needs are better understood — some remote sites may only need access to specific services.
---

## NAT Port Forwards

### WAN Inbound

| Protocol | Public Port(s) | Internal Target | Internal Port(s) | Service |
|---|---|---|---|---|
| TCP | MailCow ports | 192.168.5.16 (MailCow_Ngnx) | MailCow ports | Mail (SMTP/IMAP/POP3/Sieve) |
| TCP | 80, 443 | 192.168.5.16 (MailCow_Ngnx) | 80, 443 | MailCow webmail |
| TCP | 8096, 7096 | 192.168.5.18 (JellyFin_Host) | 8096, 7096 | Jellyfin |
| TCP | 80, 443 | 192.168.5.10 (caddy) | 80, 443 | Caddy (all web services) |

### LAN Hairpin (Internal Redirect)

| Protocol | Port(s) | Internal Target | Description |
|---|---|---|---|
| TCP | MailCow ports | 192.168.5.16 | Internal mail access |
| TCP | 80, 443 | 192.168.5.10 (caddy) | Internal web via Caddy |
| TCP | ISPConfig ports | 192.168.4.11 | Internal ISPConfig access |
| TCP | 80, 443 | 192.168.4.11 | Internal ISPConfig web |
---

## VPN

### WireGuard

**Server: pncharris**

| Parameter | Value |
|---|---|
| Tunnel Address | 192.168.32.1/24 |
| Listen Port | 51820 (UDP) |
| DNS for Peers | 192.168.5.7 (internal DNS) |
| Interface | wg1 (OPT2) |
| Status | Enabled |

**Peers**

| Peer | Tunnel IP | Status | Notes |
|---|---|---|---|
| Obie | 192.168.32.2/32 | ✓ Enabled | |
| pncfishandmore | 192.168.32.3/32 | ✓ Enabled | Business location |
| GLNet (1) | 192.168.32.4/32 | ✓ Enabled | GL.iNet travel router |
| PortaPotty | 192.168.32.5/32 | ✓ Enabled | Remote site |
| GLNet (2) | 192.168.32.6/32 | ✓ Enabled | Second GL.iNet device |

> ✓ WireGuard peers use the internal DNS server (192.168.5.7) — internal hostnames resolve correctly over VPN.

### OpenVPN

An OpenVPN server and client are configured but details were not populated in the backup. Verify status in **VPN → OpenVPN** in the OPNsense UI.
---

## Security Features

### CrowdSec

CrowdSec is installed and fully operational at the firewall level.

| Parameter | Value |
|---|---|
| Agent | Enabled |
| Local API (LAPI) | Enabled — 127.0.0.1:8080 |
| Firewall Bouncer | Enabled |
| Rules | Enabled with logging |
| Firewall Bouncer Verbose | Disabled |
| Manual LAPI Config | Disabled (auto) |

CrowdSec decisions are fed into two alias pairs used in firewall rules:

- `crowdsec_blacklists` / `crowdsec6_blacklists` — IPv4 and IPv6 block lists
- `crowdsec_blocklists` / `crowdsec6_blocklists` — duplicate set (consolidate)
### GeoIP Blocking

GeoIP uses the MaxMind GeoLite2 database with a configured license key. **The blocking rule is currently disabled** — the alias is populated but not enforced.

**70 countries are blocked across four regions:**

| Region | Countries |
|---|---|
| Africa (50) | AO, BF, BI, BJ, BW, CD, CF, CG, CI, CM, DJ, DZ, EG, EH, ER, ET, GA, GH, GM, GN, GQ, GW, KE, LR, LS, LY, MA, ML, MR, MW, MZ, NA, NE, NG, RW, SD, SL, SN, SO, SS, ST, SZ, TD, TG, TN, TZ, UG, ZA, ZM, ZW |
| Middle East / Asia (12) | AF, BN, BT, CN, IQ, IR, KG, KP, KW, PH, QA, SA |
| Eastern Europe (4) | BG, RS, RU, RO |
| Latin America (4) | BR, EC, GT, HN |

### Spamhaus Blocklists

Both lists are configured as URL table aliases that auto-refresh, but **both blocking rules are currently disabled.**

| List | URL | Update |
|---|---|---|
| Spamhaus DROP | https://www.spamhaus.org/drop/drop.txt | Auto (URL table) |
| Spamhaus EDROP | https://www.spamhaus.org/drop/edrop.txt | Auto (URL table) |
---

## Internal Network Layout

### Known Subnets

| Subnet | Alias | Purpose |
|---|---|---|
| 192.168.3.0/24 | PNCHarris_Internal | LAN management segment |
| 192.168.5.0/25 | PNCHarris_Internal | Primary server subnet |
| 192.168.5.128/25 | Subnet_5_128_Mask_25 | Secondary server subnet / ATT routing |
| 192.168.32.0/24 | — | WireGuard tunnel network |

### Key Internal Hosts

| Hostname / Alias | IP | Role |
|---|---|---|
| caddy | 192.168.5.10 | Caddy reverse proxy (all web services) |
| MailCow_Ngnx | 192.168.5.16 | MailCow nginx container |
| JellyFin_Host | 192.168.5.18 | Jellyfin media server |
| ISPConfig_Host | 192.168.4.11 | ISPConfig control panel |
| Dads_Laptop | 192.168.5.176 | Routed via ATT interface |
| Internal DNS | 192.168.5.7 | DNS server (served to WireGuard peers) |

### DHCP

DHCP on the LAN interface (192.168.3.0/24) is currently **disabled**. No KEA or ISC DHCP ranges are active on the firewall. Devices likely use static IPs or a separate DHCP server downstream.
---

## Installed Plugins & Services

The following OPNsense components are present in the configuration:

| Plugin / Service | Status |
|---|---|
| WireGuard | ✓ Active — 1 server, 5 peers |
| CrowdSec | ✓ Active — agent + bouncer + LAPI |
| OpenVPN | Configured — verify in UI |
| IPsec / Swanctl | Present — verify in UI |
| Unbound Plus | Present — verify DNS configuration |
| Kea DHCP | Present — not active on LAN |
| DHCP Relay | Present |
| Netflow | Present |
| IDS/IPS (Suricata) | ❌ Not configured — see hardening plan |
| Proxy | Present — not actively used |
| Traffic Shaper | Present |
| Monit | Present |
| SNMP | Present |
| Syslog | Not configured — see hardening plan |
| Git Backup | Not installed — see hardening plan |
---

## AT&T Migration & Static IP Plan

### Current AT&T Interface

**Interface:** opt1 (igc1)
**Current IP:** 107.133.34.145/28
**Block:** /28 — up to 14 usable addresses, 5 static IPs allocated for use

### Recommended Static IP Allocation

| IP Slot | Dedicated To | Justification |
|---|---|---|
| IP 1 | **Mail (MailCow)** | Dedicated mail IP protects sender reputation. Never share with web services. Only ports 25/465/587/993/995/4190 NAT to 192.168.5.16. |
| IP 2 | **Web / Caddy** | All reverse-proxied services via Caddy. Keeps web and mail reputation independent. Replace current WAN NAT for ports 80/443 → 192.168.5.10. |
| IP 3 | **WireGuard VPN** | Dedicated IP for UDP/51820 only. Cleaner peer configs, stable endpoint, easy to firewall tightly — that IP accepts nothing else. |
| IP 4 | **Spare / Jellyfin** | Hold in reserve. Best candidate: dedicated Jellyfin IP (currently on WAN with ports 8096/7096). Media servers benefit from a clean IP separate from your main web presence. |
| IP 5 | **Admin / Out-of-band** | A locked-down IP for emergency remote OPNsense access. Firewall tightly — accept only from WireGuard peers or specific trusted source IPs. Never advertise publicly. |
### Implementation Steps

**Step 1 — Add Virtual IPs**

In OPNsense: **Firewall → Virtual IPs → Add**

For each additional static IP (IPs 1–5 excluding the interface IP):

- Type: `IP Alias`
- Interface: `ATT (opt1)`
- Address: `<static IP>/28`
- Description: e.g. `ATT_Mail`, `ATT_Web`, `ATT_WireGuard`

**Step 2 — Create NAT Rules Per Virtual IP**

In **Firewall → NAT → Port Forward**, create new rules on the ATT interface using the virtual IPs as the destination. Example for mail:

```
Interface: ATT (opt1)
Protocol: TCP
Destination: ATT_Mail virtual IP
Destination Port: MailCow alias
Redirect Target: 192.168.5.16 (MailCow_Ngnx)
Redirect Port: MailCow alias
```

Repeat for web (→ caddy 192.168.5.10) and WireGuard (UDP/51820).

**Step 3 — Update Outbound NAT**

Add manual outbound NAT rules so that each internal service exits through its dedicated virtual IP:

```
Interface: ATT (opt1)
Source: 192.168.5.16 (MailCow_Ngnx)
Target: ATT_Mail virtual IP

Interface: ATT (opt1)
Source: 192.168.5.10 (caddy)
Target: ATT_Web virtual IP
```

**Step 4 — Migrate WireGuard Endpoint**

Update peer configs to point to the ATT_WireGuard virtual IP on port 51820. Move the WAN WireGuard rule to the ATT interface. Update DNS records if you have a hostname for the WireGuard endpoint.
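On each remote peer the change is a single `Endpoint` line in the `[Peer]` section. A sketch of a wg-quick-style client config for one of the GL.iNet peers — the keys, the ATT IP, and the AllowedIPs subnets are placeholders/assumptions, not values from the backup:

```
[Interface]
PrivateKey = <peer private key>
Address = 192.168.32.4/32
DNS = 192.168.5.7

[Peer]
PublicKey = <pncharris server public key>
AllowedIPs = 192.168.32.0/24, 192.168.5.0/24, 192.168.3.0/24
Endpoint = <ATT_WireGuard IP>:51820
PersistentKeepalive = 25
```

Only the `Endpoint` value needs to change for the migration; everything else stays as each peer already has it.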
**Step 5 — Update Firewall Block Rules**

Re-enable the Spamhaus and GeoIP block rules and apply them to the ATT interface — the same rules that currently sit disabled on WAN.

**Step 6 — DNS Updates**

Update all public DNS records to point to the new ATT static IPs:

- `mail.*` domains → ATT_Mail IP
- `*.netgrimoire.com`, `*.wasted-bandwidth.net`, etc. → ATT_Web IP
- WireGuard endpoint hostname → ATT_WireGuard IP
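In zone-file terms the updates look roughly like the following — the hostnames beyond `mail.` and the wildcard are illustrative, and the IPs stay as placeholders for the allocated ATT statics. Lowering TTLs ahead of the cutover shortens the window where clients hit the old WAN IPs:

```
mail.netgrimoire.com.   300  IN  A  <ATT_Mail IP>
netgrimoire.com.        300  IN  A  <ATT_Web IP>
*.netgrimoire.com.      300  IN  A  <ATT_Web IP>
```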
**Step 7 — Retire WAN (igc0)**

Once all services are verified on ATT, disable WAN NAT rules, remove port forward rules on WAN, and eventually disable the interface.

---

## Hardening Plan

The following items are recommended improvements, ordered by priority.

### Priority 1 — Re-enable Disabled Security Rules (Immediate)

All three security block rules on the WAN interface are currently disabled. These should be re-enabled immediately, as they represent threat intelligence you have already configured but are not using.

1. Navigate to **Firewall → Rules → WAN**
2. Find the rules: `Block DROP`, `Block EDROP`, and the GeoIP block rule
3. Click the enable toggle on each rule
4. Click **Apply Changes**

Repeat on the ATT interface once migrated.
### Priority 2 — Suricata IDS/IPS

Suricata is built into OPNsense but not yet configured. This is the most significant security gap — without it, there is no deep packet inspection or content-based threat detection.

**Setup steps:**

1. Go to **Services → Intrusion Detection → Administration**
2. Enable IDS/IPS, set interface to **ATT** (and WAN while active)
3. Set mode to **IPS** (inline blocking, not just alerting)
4. Under **Download**, enable the following rulesets:
   - `ET Open` — Proofpoint Emerging Threats (free, comprehensive)
   - `Abuse.ch SSL Blacklist` — malicious SSL certificate detection
   - `Feodo Tracker` — botnet C2 blocking
5. Under **Policies**, set default action to `drop` for high-severity rules
6. Click **Download & Update Rules**, then **Apply**

> ✓ Suricata complements CrowdSec well. CrowdSec handles IP reputation; Suricata handles traffic content inspection. They do not overlap.

### Priority 3 — Additional Blocklists

Add these URL table aliases to supplement Spamhaus DROP/EDROP:

| List | URL | Purpose |
|---|---|---|
| Feodo Tracker | https://feodotracker.abuse.ch/downloads/ipblocklist.txt | Botnet C2 IPs |
| Abuse.ch SSLBL | https://sslbl.abuse.ch/blacklist/sslipblacklist.txt | Malicious SSL IPs |
| Emerging Threats | https://rules.emergingthreats.net/fwrules/emerging-Block-IPs.txt | ET block list |

For each: **Firewall → Aliases → Add**, type `URL Table`, set refresh to 1 day. Then add a WAN block rule using each alias as the source.
### Priority 4 — dnscrypt-proxy (Encrypted DNS)

Encrypts DNS queries leaving the firewall and adds DNS-level malware/tracking blocklists.

1. Go to **System → Firmware → Plugins**, install `os-dnscrypt-proxy`
2. Navigate to **Services → DNSCrypt-Proxy**
3. Enable, set listen port to `5353`
4. Select resolvers: `cloudflare`, `quad9-dnscrypt-ip4-nofilter-pri` (or similar)
5. Enable DNSSEC validation
6. Update **System → Settings → General** — set DNS server to `127.0.0.1:5353`
7. Disable `DNS Allow Override` so the ISP cannot push DNS changes

### Priority 5 — os-git-backup

Automatically commits every OPNsense config change to a Git repository. Invaluable for auditing changes after an incident and for rapid recovery.

1. Go to **System → Firmware → Plugins**, install `os-git-backup`
2. Navigate to **System → Configuration → Git Backup**
3. Configure a Forgejo repository on Netgrimoire as the remote
4. Set an SSH key for authentication
5. Enable automatic backup on config change

### Priority 6 — Syslog to Graylog

Syslog is not currently configured. Sending firewall logs to Graylog (already running at `http://graylog:9000`) enables centralized log analysis and alerting.

1. Go to **System → Settings → Logging → Remote**
2. Add a syslog destination: `graylog:514` (UDP) or use a GELF input on Graylog
3. Enable logging for: Firewall, DHCP, VPN, Authentication, CrowdSec
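If you opt for a GELF input rather than plain syslog, the payload Graylog expects is simple JSON (GELF 1.1): `version`, `host`, and `short_message` are required, and custom fields carry an underscore prefix. A minimal sketch of building such a payload — the host and field names are illustrative:

```python
import json

def gelf_message(host: str, short_message: str, level: int = 6, **extra) -> bytes:
    """Build a GELF 1.1 payload (uncompressed) for a Graylog UDP input.
    Custom fields must be prefixed with an underscore."""
    msg = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "level": level,  # syslog severity: 6 = informational
    }
    msg.update({f"_{k}": v for k, v in extra.items()})
    return json.dumps(msg).encode()

payload = gelf_message("OPNsense", "filterlog: block in on igc0", level=4,
                       application_name="filterlog")
print(json.loads(payload)["_application_name"])
# → filterlog
```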
---

## Known Issues & Action Items

| Item | Priority | Notes |
|---|---|---|
| Spamhaus DROP rule disabled | 🔴 High | Re-enable in Firewall → Rules → WAN |
| Spamhaus EDROP rule disabled | 🔴 High | Re-enable in Firewall → Rules → WAN |
| GeoIP block rule disabled | 🔴 High | Re-enable in Firewall → Rules → WAN |
| Suricata not configured | 🔴 High | Most significant security gap — configure with ET Open rules |
| Duplicate CrowdSec aliases | 🟡 Medium | crowdsec_blacklists and crowdsec_blocklists both do IPv4 — consolidate |
| WireGuard rule too permissive | 🟡 Medium | Allow-all from peers — scope per peer when needs are known |
| OPT3 interface unassigned | 🟡 Medium | Disable or assign a role |
| System DNS points to Google | 🟡 Medium | Should point to internal resolver or localhost after dnscrypt-proxy setup |
| No syslog configured | 🟡 Medium | Forward to Graylog for centralized logging |
| os-git-backup not installed | 🟡 Medium | Install for config change auditing |
| OpenVPN config unpopulated | 🟢 Low | Verify status — backup shows server+client but no details |
| ATT migration incomplete | 🟢 Low | In progress — see migration plan above |
| Family_Subnet alias empty | 🟢 Low | Populate or remove |
| Plex_Port_2 alias empty | 🟢 Low | Populate or remove |
| DHCP disabled on LAN | 🟢 Info | Intentional if using static IPs — verify |
---

## Related Documentation

- [Caddy Reverse Proxy](./caddy-reverse-proxy) — services exposed through the firewall
- [MailCow Mail Server](./mailcow) — mail server behind the firewall, dedicated WAN IP
- [WireGuard VPN](./wireguard) — peer configuration and access
- [Graylog](./graylog) — target for firewall syslog
- [CrowdSec](./crowdsec) — threat intelligence integration
`Netgrimoire/Ward-Grimoire/Firewall/Suricata-IDS.md`
---
title: OPNsense IDS/IPS
description: IDS
published: true
date: 2026-02-23T21:51:49.920Z
tags:
editor: markdown
dateCreated: 2026-02-23T21:49:16.861Z
---

# Suricata IDS/IPS

**Service:** Suricata Intrusion Detection & Prevention System
**Host:** OPNsense firewall
**Interfaces:** ATT (opt1) — add WAN (igc0) while still active
**Mode:** IPS (inline blocking)
**Rulesets:** ET Open, Feodo Tracker, Abuse.ch SSL

---

## Overview

Suricata is OPNsense's built-in deep packet inspection engine. Unlike CrowdSec (which blocks based on IP reputation) and GeoIP (which blocks by country), Suricata inspects the **content** of traffic — detecting exploit patterns, malware C2 communication, vulnerability scans, and known CVE exploitation attempts in real time.

These layers complement one another and do not overlap:

| Layer | Tool | What It Stops |
|---|---|---|
| IP reputation | CrowdSec | Known bad IPs from community threat intel |
| Geography | GeoIP | Traffic from blocked countries |
| Content inspection | Suricata | Malicious payloads, exploit patterns, C2 traffic |

Suricata uses **Netmap** for high-performance inline packet processing with minimal CPU overhead.

> ⚠ **Before enabling IPS mode:** Disable hardware offloading on your interfaces or Netmap will not function correctly. This is done in **Interfaces → Settings**.
---

## Prerequisite: Disable Hardware Offloading

1. Go to **Interfaces → Settings**
2. Disable the following options:
   - Hardware CRC
   - Hardware TSO
   - Hardware LRO
   - VLAN Hardware Filtering
3. Click **Save**
4. Reboot the firewall

> ✓ This is a one-time change. It has no meaningful impact on performance for home/small business use and is required for Suricata IPS mode to function.
---

## Installation

Suricata is built into OPNsense — no plugin install required. Navigate directly to:

**Services → Intrusion Detection → Administration**

---

## Configuration

### Step 1 — General Settings

Navigate to **Services → Intrusion Detection → Administration**

| Setting | Value | Notes |
|---|---|---|
| Enabled | ✓ | Turns on the IDS/IPS engine |
| IPS Mode | ✓ | Enables inline blocking (not just alerting) |
| Promiscuous Mode | Leave default | Only needed for mirrored traffic setups |
| Default Packet Size | Leave default | Auto-detected |
| Interfaces | ATT, WAN | Add both while dual-WAN is active; remove WAN after migration |
| Home Networks | 192.168.3.0/24, 192.168.5.0/24, 192.168.32.0/24 | Your internal subnets — critical for rule accuracy |
| Log Level | Info | |
| Log Retention | 7 days | Adjust based on disk space |

> ⚠ **Home Networks is critical.** Suricata rules use `$HOME_NET` and `$EXTERNAL_NET` to determine direction. If your internal subnets are not listed here, many rules will fail to trigger correctly or will produce false positives.

Click **Apply** after setting these values.
### Step 2 — Download Rulesets

Navigate to **Services → Intrusion Detection → Download**

Enable the following rulesets:

| Ruleset | Provider | Priority | Notes |
|---|---|---|---|
| ET Open | Proofpoint Emerging Threats | 🔴 Essential | Comprehensive free ruleset — 40,000+ rules covering exploits, malware, scanning, C2 |
| Abuse.ch SSL Blacklist | Abuse.ch | 🔴 Essential | Blocks connections to malicious SSL certificates used by malware |
| Feodo Tracker Botnet | Abuse.ch | 🔴 Essential | Blocks botnet C2 IP communication |
| OSIF | OPNsense | 🟡 Recommended | OPNsense internal feed |
| PT Research | Positive Technologies | 🟡 Recommended | Additional threat intelligence |

To enable each ruleset:

1. Find it in the list
2. Toggle the **Enabled** switch
3. Click **Download & Update Rules** at the top of the page

> ✓ ET Open is the most important ruleset. It is maintained by Proofpoint, updated daily, and covers the vast majority of common attack patterns you will encounter.
### Step 3 — Configure Policies

Policies control what Suricata does when a rule matches — alert only, or drop the packet.

Navigate to **Services → Intrusion Detection → Policy**

**Recommended policy setup:**

Add the following policies in order:

**Policy 1 — Drop high-severity ET threats**

| Field | Value |
|---|---|
| Description | Drop ET High Severity |
| Priority | 1 |
| Rulesets | ET Open |
| Action | Drop |
| Severity | ≥ High |

**Policy 2 — Alert on medium-severity (tuning period)**

| Field | Value |
|---|---|
| Description | Alert ET Medium |
| Priority | 2 |
| Rulesets | ET Open |
| Action | Alert |
| Severity | Medium |

**Policy 3 — Drop all Feodo/Abuse.ch matches**

| Field | Value |
|---|---|
| Description | Drop Botnet C2 and SSL Blacklist |
| Priority | 1 |
| Rulesets | Feodo Tracker, Abuse.ch SSL |
| Action | Drop |
| Severity | Any |

> ✓ Start with medium-severity rules in **alert** mode for the first 1–2 weeks. Review alerts in the log for false positives before switching to drop. High-severity rules and the abuse.ch lists are safe to drop immediately.
### Step 4 — Apply and Verify

1. Click **Apply** on the Administration tab
2. Navigate to **Services → Intrusion Detection → Alerts**
3. Wait a few minutes — alerts should begin populating
4. Check **Services → Intrusion Detection → Stats** to confirm traffic is being processed
|
||||
|
||||
---
|
||||
|
||||
## Tuning & False Positives
|
||||
|
||||
After running in alert mode for a week, review the Alerts tab. Common false positives from home lab environments include:
|
||||
|
||||
- **Nextcloud sync traffic** — may trigger file transfer rules
|
||||
- **Torrents/P2P** — will trigger multiple ET rules by design
|
||||
- **Internal port scanning tools** — Nmap from internal hosts triggers scan rules
|
||||
|
||||
To suppress a false positive rule without disabling it entirely:
|
||||
|
||||
1. Note the rule SID from the alert
|
||||
2. Go to **Services → Intrusion Detection → Rules**
|
||||
3. Search for the SID
|
||||
4. Change the rule action to **Alert** (instead of Drop) for that specific rule
|
||||
|
||||
Alternatively, add a suppression in **Services → Intrusion Detection → Suppressions**:
|
||||
- Enter the SID
|
||||
- Set the direction (source or destination)
|
||||
- Enter the IP to suppress for that rule
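
Under the hood, a suppression entry corresponds to Suricata's `suppress` keyword from its threshold configuration. As a sketch (the SID `2010935` and the IP are placeholders, not values from this setup):

```
# Suppress SID 2010935 only for traffic sourced from one internal host;
# the rule still fires for every other source address
suppress gen_id 1, sig_id 2010935, track by_src, ip 192.168.1.50
```

This is narrower than flipping the rule to Alert, because the rule keeps its Drop action for all other hosts.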

---

## Monitoring

### Alert Dashboard

**Services → Intrusion Detection → Alerts** — real-time view of matched rules.

Useful filters:
- Filter by `severity: high` to see the most critical events
- Filter by `action: drop` to see what is being actively blocked
- Filter by source IP to investigate a specific host

### Graylog Integration

Forward Suricata alerts to Graylog for centralized analysis:

1. Suricata logs to `/var/log/suricata/eve.json` in EVE JSON format
2. In Graylog, add a **Beats input** or **Syslog UDP input**
3. In OPNsense **System → Settings → Logging → Remote**, add Graylog as a syslog target
4. Create a Graylog stream filtering on `application_name: suricata`
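
For reference, a single alert record in `eve.json` looks roughly like the following (all field values here are illustrative, not taken from this setup). The Graylog stream can also match on extracted fields such as `alert.severity` or `alert.signature_id`:

```json
{
  "timestamp": "2026-02-23T21:00:00.000000+0000",
  "event_type": "alert",
  "src_ip": "203.0.113.10",
  "dest_ip": "192.168.3.50",
  "proto": "TCP",
  "alert": {
    "signature_id": 2010935,
    "signature": "EXAMPLE rule description",
    "category": "Potentially Bad Traffic",
    "severity": 2
  }
}
```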

---

## Key Files & Paths

| Path | Purpose |
|---|---|
| `/var/log/suricata/eve.json` | EVE JSON alert log — used by Graylog |
| `/var/log/suricata/stats.log` | Performance statistics |
| `/usr/local/etc/suricata/suricata.yaml` | Main config (managed by OPNsense UI) |
| `/usr/local/share/suricata/rules/` | Downloaded rulesets |

---

## Related Documentation

- [OPNsense Firewall](./opnsense-firewall) — parent firewall documentation
- [CrowdSec](./crowdsec) — complementary IP reputation layer
- [Additional Blocklists](./opnsense-blocklists) — Feodo, Abuse.ch, ET IP blocklists at firewall level
- [Graylog](./graylog) — centralized log target for Suricata alerts
159
Netgrimoire/Ward-Grimoire/Firewall/Zenarmor.md
Normal file

---
title: OpnSense - App Protection
description: App Inspection
published: true
date: 2026-02-23T21:52:43.630Z
tags:
editor: markdown
dateCreated: 2026-02-23T21:50:37.324Z
---

# Zenarmor (NGFW)

**Service:** Zenarmor Next-Generation Firewall
**Plugin:** os-sunnyvalley
**Tier:** Free Edition
**Host:** OPNsense firewall

---

## Overview

Zenarmor adds application-layer awareness and web filtering to OPNsense that the base firewall does not provide. Where Suricata inspects packet content for known threat signatures, Zenarmor identifies **what application or service** is generating traffic and can block or allow based on that — regardless of port.

| Feature | Free Tier | Paid Tier |
|---|---|---|
| Layer-7 app identification | ✓ | ✓ |
| Web category filtering | Default policy only | Custom policies |
| Malware/phishing blocking | ✓ | ✓ |
| Real-time network analytics | ✓ | ✓ |
| Device tracking & alerts | ✗ | ✓ |
| Multiple policies | ✗ | ✓ |
| TLS inspection | ✗ | ✓ |

The free tier is useful primarily for **visibility** (seeing what applications are running on your network) and **basic threat blocking** (malware, phishing, PUP domains). The analytics dashboard alone makes it worthwhile.

> ✓ Zenarmor and Suricata can run simultaneously. They operate at different layers and do not conflict. Zenarmor handles application identity; Suricata handles content signatures.

> ⚠ **MongoDB deprecation note:** As of September 2025, MongoDB is being deprecated as the Zenarmor database backend. Use **SQLite** when prompted during setup — it is the supported path going forward.

---

## Installation

### Step 1 — Install the Plugin

1. Go to **System → Firmware → Plugins**
2. Search for `os-sunnyvalley`
3. Click the **+** install button
4. Wait for installation to complete
5. **Refresh the browser** — a new **Zenarmor** menu item will appear in the sidebar

### Step 2 — Initial Setup Wizard

Navigate to **Zenarmor → Dashboard** — this launches the setup wizard on first run.

**Deployment Mode:** Select **Routed Mode (L3)** for standard OPNsense setups. This is correct for your configuration.

**Database:** Select **SQLite** — do not select MongoDB (deprecated September 2025).

**Interface:** Select **ATT (opt1)** as the primary interface. Add **WAN (igc0)** while dual-WAN is still active.

> ⚠ Zenarmor should be applied to the **LAN-facing side** of the firewall for internal traffic inspection, or the **WAN-facing side** for inbound threat blocking. For your setup, applying it to both ATT and LAN gives the most coverage.

**Cloud Connectivity:** Leave enabled — Zenarmor uses cloud-based category lookups for web filtering. If you want fully offline operation, this can be disabled, but web filtering accuracy degrades significantly.

Click **Complete** to finish the wizard.

---

## Configuration

### Step 3 — Security Policy

Navigate to **Zenarmor → Security**

Enable the following threat categories in the default policy:

| Category | Action | Notes |
|---|---|---|
| Malware | Block | Domains known to serve malware |
| Phishing | Block | Credential-harvesting sites |
| Botnet | Block | C2 communication |
| PUP/Adware | Block | Potentially unwanted programs |
| SPAM Sources | Block | Known spam infrastructure |
| Parked Domains | Block | Often used for malicious redirects |

Leave the following as **Alert** initially (review before blocking):
- Anonymizers / Proxies — may block legitimate VPN services
- Peer-to-peer — may affect legitimate use cases

### Step 4 — Application Control

Navigate to **Zenarmor → Policies → Application Control**

The free tier allows one default policy. Useful applications to consider blocking or monitoring:

| Application Category | Recommendation | Reason |
|---|---|---|
| Cryptocurrency mining | Block | Resource theft if unauthorized |
| Remote access tools (unknown) | Alert | Unexpected remote tools are a red flag |
| Tor | Alert | Monitor — may be legitimate or evasion |
| Anonymous proxies | Block | Bypass attempts |

### Step 5 — Web Filtering

Navigate to **Zenarmor → Policies → Web Controls**

In the free tier, the default policy controls all web filtering. Recommended categories to block:

| Category | Action |
|---|---|
| Malware sites | Block |
| Phishing | Block |
| Hacking / exploit sites | Block |
| Illegal content | Block |

Enable **Safe Search enforcement** if desired — it forces Google, Bing, and YouTube into safe-search mode network-wide.

---

## Dashboard & Analytics

Navigate to **Zenarmor → Dashboard**

The dashboard provides real-time visibility into:
- **Top talkers** — which internal hosts generate the most traffic
- **Top applications** — what services are being used
- **Blocked threats** — real-time feed of blocked requests
- **Bandwidth usage** — per-host and per-application

This is the primary value of the free tier — even without advanced policy control, the visibility into what is running on your network is significant.

Navigate to **Zenarmor → Reports** for historical analysis and trend data.

---

## Performance Notes

Zenarmor uses deep packet inspection, which adds some CPU overhead. On modern hardware (anything with i226-V NICs) this is negligible at home lab traffic volumes. Monitor CPU usage in **Zenarmor → Dashboard → System** after enabling.

If performance degrades, you can limit Zenarmor to specific interfaces rather than all interfaces.

---

## Known Limitations (Free Tier)

- Only one web filtering policy — all devices get the same rules
- No per-device or per-group policies
- No TLS/SSL inspection — encrypted traffic is identified by SNI only
- No device inventory or unknown-device alerts
- Web category database is cloud-dependent

---

## Related Documentation

- [OPNsense Firewall](./opnsense-firewall) — parent firewall documentation
- [Suricata IDS/IPS](./suricata-ids-ips) — complementary content inspection layer
- [CrowdSec](./crowdsec) — IP reputation layer
31
Netgrimoire/Ward-Grimoire/Notifications/Alert-Routing.md
Normal file

---
title: Alert Routing
description: How security alerts flow through Netgrimoire
published: true
date: 2026-04-12T00:00:00.000Z
tags: ward, alerts, ntfy
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Alert Routing

All Netgrimoire alerts route through self-hosted ntfy at `ntfy.netgrimoire.com`.

## ntfy Topics

| Topic | Source | Purpose |
|-------|--------|---------|
| `netgrimoire-diun` | DIUN | Docker image update notifications |
| `netgrimoire-media` | Sonarr, Radarr, SABnzbd | Download and media events |
| `netgrimoire-backup` | Kopia | Backup completion and errors |
| `gremlin-alerts` | n8n Kuma triage workflow | AI-analyzed service DOWN alerts |
| `gremlin-audits` | n8n Forgejo audit workflow | Weekly YAML audit summaries |

## Alert Sources

**OPNsense → ntfy:** CrowdSec HTTP plugin (`/usr/local/etc/crowdsec/notifications/ntfy.yaml`) + Monit script (`/usr/local/bin/ntfy-alert.sh`). See [OPNsense Alerts](/Ward-Grimoire/Notifications/OPNsense-Alerts).

**Uptime Kuma → Gremlin → ntfy:** A Kuma webhook fires on DOWN/RECOVERED → n8n triage workflow → Ollama analysis (DOWN path only) → ntfy `gremlin-alerts`. See [Gremlin Kuma Triage](/Gremlin-Grimoire/Workflows/Kuma-Triage).

**DIUN → ntfy:** Docker image update watcher. Schedule: every 6 hours. The ntfy priority must be an integer (1–5), not the string `"default"`.
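
The integer-priority requirement applies in DIUN's ntfy notifier config. A sketch of the relevant `diun.yml` fragment (key names should be checked against the DIUN documentation for your version):

```yaml
notif:
  ntfy:
    endpoint: https://ntfy.netgrimoire.com
    topic: netgrimoire-diun
    priority: 3   # integer 1-5; the string "default" is rejected
```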
463
Netgrimoire/Ward-Grimoire/Notifications/OPNsense-Alerts.md
Normal file

---
title: OpnSense - NTFY Integration
description: Security Notifications
published: true
date: 2026-02-23T22:00:46.462Z
tags:
editor: markdown
dateCreated: 2026-02-23T22:00:37.268Z
---

# OPNsense ntfy Alerts

**Service:** ntfy push notifications from OPNsense
**Host:** OPNsense firewall
**ntfy Server:** Your self-hosted ntfy instance on Netgrimoire
**Methods:** CrowdSec HTTP plugin · Monit custom script · Suricata EVE watcher

---

## Overview

OPNsense does not have a built-in ntfy notification channel, but there are three distinct integration points that together provide complete coverage:

| Method | What It Alerts On | Priority |
|---|---|---|
| **CrowdSec HTTP plugin** | Every IP ban decision CrowdSec makes | 🔴 Best for threat-intel alerts |
| **Monit + curl script** | System health, service failures, Suricata EVE matches, login failures | 🔴 Best for operational alerts |
| **Suricata EVE watcher** | Suricata high-severity IDS hits (via Monit watching eve.json) | 🟡 Covered via Monit |

All three use your self-hosted ntfy instance. None require external services.

---

## Prerequisites

Before starting, confirm:
- ntfy is running and reachable at `https://ntfy.netgrimoire.com` (or your internal URL)
- An ntfy topic is created, e.g. `opnsense-alerts`
- If ntfy has auth enabled, have a token ready
- SSH access to OPNsense as root

---

## Method 1 — CrowdSec HTTP Notification Plugin

This is the cleanest integration for security alerts. CrowdSec has a built-in HTTP notification plugin. Every time it makes a ban decision — whether from community intel, a Suricata match passed through CrowdSec, or a brute-force detection — it POSTs to ntfy.

### Step 1 — Create the HTTP notification config

SSH into OPNsense and create the ntfy config file:

```bash
ssh root@192.168.3.4
```

```bash
cat > /usr/local/etc/crowdsec/notifications/ntfy.yaml << 'EOF'
# ntfy notification plugin for CrowdSec
# CrowdSec uses its built-in HTTP plugin pointed at ntfy
type: http
name: ntfy_default

log_level: info

# ntfy accepts a plain POST body as the notification message
# format is a Go template rendered over the list of alerts
format: |
  {{range .}}
  🚨 CrowdSec Decision
  Scenario: {{.Scenario}}
  Attacker IP: {{.Source.IP}}
  Country: {{.Source.Cn}}
  Action: {{.Decisions | len}} x {{(index .Decisions 0).Type}}
  Duration: {{(index .Decisions 0).Duration}}
  {{end}}

url: https://ntfy.netgrimoire.com/opnsense-alerts

method: POST

headers:
  Title: "CrowdSec Ban — OPNsense"
  Priority: "high"
  Tags: "rotating_light,shield"
  # Uncomment and set a token if ntfy auth is enabled:
  # Authorization: "Bearer YOUR_NTFY_TOKEN"

# skip_tls_verify: false
EOF
```

> ⚠ Replace `https://ntfy.netgrimoire.com/opnsense-alerts` with your actual ntfy URL and topic. If ntfy is internal-only and OPNsense can reach it by hostname, the internal URL works fine.

### Step 2 — Register the plugin in profiles.yaml

Edit the CrowdSec profiles file to dispatch decisions to the ntfy plugin:

```bash
vi /usr/local/etc/crowdsec/profiles.yaml
```

Find the `notifications:` section of the default profile and add `ntfy_default`:

```yaml
name: default_ip_remediation
filters:
  - Alert.Remediation == true && Alert.GetScope() == "Ip"
decisions:
  - type: ban
    duration: 4h
notifications:
  - ntfy_default   # ← add this line
on_success: break
```

> ✓ The `ntfy_default` name must exactly match the `name:` field in the yaml file you created above.

### Step 3 — Set correct file ownership

CrowdSec rejects a notification plugin if its configuration file has loose ownership or permissions. Ensure the file is owned by root (group `wheel` on FreeBSD, matching the command below) and readable only by root:

```bash
chown root:wheel /usr/local/etc/crowdsec/notifications/ntfy.yaml
chmod 600 /usr/local/etc/crowdsec/notifications/ntfy.yaml
```

### Step 4 — Restart CrowdSec and test

```bash
# Restart via the OPNsense service manager (do NOT use systemctl/service directly)
# Go to: Services → CrowdSec → Settings → Apply
# Or from the shell:
pluginctl -s crowdsec restart
```

Test by sending a manual notification:

```bash
cscli notifications test ntfy_default
```

You should receive a test push on your device within a few seconds.

Then trigger a real decision to verify the full pipeline:

```bash
# Ban a test IP for 2 minutes (replace 1.2.3.4; avoid the IP you are connected from)
cscli decisions add -t ban -d 2m -i 1.2.3.4
# Watch for the ntfy notification, then remove the test ban:
cscli decisions delete -i 1.2.3.4
```

---

## Method 2 — Monit + curl Script

Monit is OPNsense's built-in service monitor. It can watch processes, files, system resources, and log patterns — and call a custom shell script when a condition is met. The script fires a curl POST to ntfy.

This covers things CrowdSec doesn't — service failures, high CPU, gateway-down events, SSH login failures, disk usage, and Suricata EVE alerts.

### Step 2.1 — Create the ntfy alert script

```bash
cat > /usr/local/bin/ntfy-alert.sh << 'EOF'
#!/bin/sh
# ntfy-alert.sh — called by Monit to send ntfy push notifications
# POSIX sh: bash is not installed on OPNsense by default
# Monit provides variables: $MONIT_HOST, $MONIT_SERVICE,
# $MONIT_DESCRIPTION, $MONIT_EVENT

NTFY_URL="https://ntfy.netgrimoire.com/opnsense-alerts"
# NTFY_TOKEN="Bearer YOUR_NTFY_TOKEN"   # uncomment if ntfy auth is enabled

TITLE="${MONIT_HOST}: ${MONIT_SERVICE}"
MESSAGE="${MONIT_EVENT} — ${MONIT_DESCRIPTION}"

# Map Monit event types to ntfy priorities
case "$MONIT_EVENT" in
  *"does not exist"*|*"failed"*|*"error"*)
    PRIORITY="urgent"
    TAGS="rotating_light,red_circle"
    ;;
  *"changed"*|*"match"*)
    PRIORITY="high"
    TAGS="warning,yellow_circle"
    ;;
  *"recovered"*|*"succeeded"*)
    PRIORITY="default"
    TAGS="white_check_mark,green_circle"
    ;;
  *)
    PRIORITY="default"
    TAGS="bell"
    ;;
esac

curl -s \
  -H "Title: ${TITLE}" \
  -H "Priority: ${PRIORITY}" \
  -H "Tags: ${TAGS}" \
  -d "${MESSAGE}" \
  "${NTFY_URL}"

# Uncomment for auth:
# curl -s \
#   -H "Authorization: ${NTFY_TOKEN}" \
#   -H "Title: ${TITLE}" \
#   -H "Priority: ${PRIORITY}" \
#   -H "Tags: ${TAGS}" \
#   -d "${MESSAGE}" \
#   "${NTFY_URL}"
EOF

chmod +x /usr/local/bin/ntfy-alert.sh
```

### Step 2.2 — Enable Monit

Navigate to **Services → Monit → Settings → General Settings**

| Setting | Value |
|---|---|
| Enabled | ✓ |
| Polling Interval | 30 seconds |
| Start Delay | 120 seconds |
| Mail Server | Leave blank (using the script instead) |

Click **Save**.
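
The UI settings above correspond to a monitrc fragment along these lines (a sketch of what OPNsense generates from those fields; the actual generated file may differ):

```
set daemon 30 with start delay 120
```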

### Step 2.3 — Add Service Tests

Navigate to **Services → Monit → Service Tests Settings** and add the following tests:

**Test 1 — Custom Alert via Script**

| Field | Value |
|---|---|
| Name | `ntfy_alert` |
| Condition | `failed` |
| Action | Execute |
| Path | `/usr/local/bin/ntfy-alert.sh` |

This is the reusable action that all other tests will invoke.

**Test 2 — Suricata EVE High Alert**

| Field | Value |
|---|---|
| Name | `SuricataHighAlert` |
| Condition | `content = "\"severity\":1"` |
| Action | Execute → `/usr/local/bin/ntfy-alert.sh` |

This watches for severity 1 (highest) alerts written to the Suricata EVE JSON log.
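
The `content` condition is a plain pattern match against lines appended to the file. You can sanity-check the pattern locally against a sample EVE line before wiring it into Monit (the path here is a scratch file, not the real log):

```shell
# Write one sample EVE alert line, then confirm the pattern matches it
printf '%s\n' '{"event_type":"alert","alert":{"severity":1,"signature":"EXAMPLE"}}' > /tmp/eve-sample.json
grep -c '"severity":1' /tmp/eve-sample.json
```

A count of `1` confirms that the escaped form `\"severity\":1` entered in the UI will match real severity-1 records.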

**Test 3 — Suricata Process Down**

| Field | Value |
|---|---|
| Name | `SuricataRunning` |
| Condition | `failed` |
| Action | Execute → `/usr/local/bin/ntfy-alert.sh` |

**Test 4 — CrowdSec Process Down**

| Field | Value |
|---|---|
| Name | `CrowdSecRunning` |
| Condition | `failed` |
| Action | Execute → `/usr/local/bin/ntfy-alert.sh` |

**Test 5 — SSH Login Failure**

| Field | Value |
|---|---|
| Name | `SSHFailedLogin` |
| Condition | `content = "Failed password"` |
| Action | Execute → `/usr/local/bin/ntfy-alert.sh` |

**Test 6 — OPNsense Web UI Login Failure**

| Field | Value |
|---|---|
| Name | `WebUILoginFail` |
| Condition | `content = "webgui"` |
| Action | Execute → `/usr/local/bin/ntfy-alert.sh` |

### Step 2.4 — Add Service Monitors

Navigate to **Services → Monit → Service Settings** and add:

**Monitor 1 — Suricata EVE Log (high alerts)**

| Field | Value |
|---|---|
| Name | `SuricataEVE` |
| Type | File |
| Path | `/var/log/suricata/eve.json` |
| Tests | `SuricataHighAlert` |

**Monitor 2 — Suricata Process**

| Field | Value |
|---|---|
| Name | `Suricata` |
| Type | Process |
| PID File | `/var/run/suricata.pid` |
| Tests | `SuricataRunning` |
| Restart Method | `/usr/local/etc/rc.d/suricata restart` |

**Monitor 3 — CrowdSec Process**

| Field | Value |
|---|---|
| Name | `CrowdSec` |
| Type | Process |
| Match | `crowdsec` |
| Tests | `CrowdSecRunning` |

**Monitor 4 — SSH Auth Log**

| Field | Value |
|---|---|
| Name | `SSHAuth` |
| Type | File |
| Path | `/var/log/auth.log` |
| Tests | `SSHFailedLogin` |

**Monitor 5 — System Resources (optional)**

| Field | Value |
|---|---|
| Name | `System` |
| Type | System |
| Tests | `ntfy_alert` (on resource threshold exceeded) |

Click **Apply** after adding all services.
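
For reference, the file and process monitors above correspond to monitrc stanzas roughly like the following (a sketch — OPNsense generates the real file from the UI settings, and its exact output may differ):

```
check file SuricataEVE with path /var/log/suricata/eve.json
    if content = "\"severity\":1" then exec "/usr/local/bin/ntfy-alert.sh"

check process Suricata with pidfile /var/run/suricata.pid
    if does not exist then exec "/usr/local/bin/ntfy-alert.sh"

check process CrowdSec matching "crowdsec"
    if does not exist then exec "/usr/local/bin/ntfy-alert.sh"
```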

### Step 2.5 — Test Monit alerts

```bash
# Manually invoke the script to test ntfy connectivity
MONIT_HOST="OPNsense" \
MONIT_SERVICE="Test" \
MONIT_EVENT="Test alert" \
MONIT_DESCRIPTION="Testing ntfy integration from Monit" \
/usr/local/bin/ntfy-alert.sh
```

You should receive a push notification immediately.

---

## Alert Topics & Priority Mapping

Consider using separate ntfy topics to filter notifications by type on your device:

| Topic | Used For | Suggested ntfy Priority |
|---|---|---|
| `opnsense-alerts` | CrowdSec bans, Suricata high hits | high / urgent |
| `opnsense-health` | Monit service failures, process restarts | high |
| `opnsense-info` | Service recoveries, status changes | default / low |

To use separate topics, change `NTFY_URL` in the Monit script and the `url:` in the CrowdSec config accordingly.

---

## ntfy Priority Reference

ntfy supports five priority levels that map to different notification behaviors on Android/iOS:

| ntfy Priority | Numeric | Behavior |
|---|---|---|
| `min` | 1 | No notification, no sound |
| `low` | 2 | Notification, no sound |
| `default` | 3 | Notification with sound |
| `high` | 4 | Notification with sound, bypasses DND |
| `urgent` | 5 | Phone rings through DND, repeated |

For firewall alerts: use `urgent` for process failures and `high` for IDS/ban events. Reserve `urgent` sparingly to avoid alert fatigue.

---

## Keeping Config Persistent Across Upgrades

OPNsense upgrades can overwrite files in certain paths. The safest locations for persistent custom files:

| File | Location | Persistent? |
|---|---|---|
| ntfy-alert.sh | `/usr/local/bin/ntfy-alert.sh` | ✓ Yes — not touched by upgrades |
| CrowdSec ntfy.yaml | `/usr/local/etc/crowdsec/notifications/ntfy.yaml` | ✓ Yes — plugin config directory |
| CrowdSec profiles.yaml | `/usr/local/etc/crowdsec/profiles.yaml` | ⚠ Re-check after CrowdSec updates |

After any OPNsense or CrowdSec update, verify:

```bash
# Check the CrowdSec notification config is still present
ls -la /usr/local/etc/crowdsec/notifications/

# Test CrowdSec → ntfy still works
cscli notifications test ntfy_default

# Check the Monit script is still executable
ls -la /usr/local/bin/ntfy-alert.sh
```

---

## Troubleshooting

**No notification received from the CrowdSec test:**

```bash
# Check CrowdSec logs for plugin errors
tail -50 /var/log/crowdsec.log | grep -i ntfy
tail -50 /var/log/crowdsec.log | grep -i notification

# Verify the ntfy URL is reachable from OPNsense
curl -v -d "test" https://ntfy.netgrimoire.com/opnsense-alerts

# Check profiles.yaml has ntfy_default in the notifications section
grep -A5 "notifications:" /usr/local/etc/crowdsec/profiles.yaml
```

**No notification received from Monit:**

```bash
# Run the script manually with test variables
MONIT_HOST="test" MONIT_SERVICE="test" \
MONIT_EVENT="test" MONIT_DESCRIPTION="test message" \
/usr/local/bin/ntfy-alert.sh

# Check Monit is running
ps aux | grep monit

# Check Monit logs
tail -50 /var/log/monit.log
```

**CrowdSec plugin ownership error:**

```bash
# Fix ownership if CrowdSec refuses to load the plugin
chown root:wheel /usr/local/etc/crowdsec/notifications/ntfy.yaml
ls -la /usr/local/etc/crowdsec/notifications/
```

**ntfy auth failing:**

```bash
# Test with the token manually
curl -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Title: Test" \
  -d "Auth test" \
  https://ntfy.netgrimoire.com/opnsense-alerts
```

---

## Related Documentation

- [OPNsense Firewall](./opnsense-firewall) — parent firewall documentation
- [CrowdSec](./crowdsec) — threat intelligence engine sending these alerts
- [Suricata IDS/IPS](./suricata-ids-ips) — source of EVE alerts watched by Monit
- [ntfy](./ntfy) — self-hosted notification server on Netgrimoire
122
Netgrimoire/Ward-Grimoire/Notifications/ntfy.md
Normal file

# ntfy

## Overview

The ntfy stack is a Docker Swarm-based service that provides push notifications for NetGrimoire. It consists of the ntfy container itself (running the ntfy binary) behind the shared Caddy reverse proxy, with Uptime Kuma monitoring the endpoint.

---

## Architecture

| Service | Image | Port | Role |
|---------|-------|------|------|
| ntfy | binwiederhier/ntfy | 81:80 | Push notifications |
| Caddy (reverse proxy) | — | internal only | Serves `ntfy.netgrimoire.com` |

Homepage group: Services

---

## Build & Configuration

### Prerequisites

No specific prerequisites are required for this stack.

### Volume Setup

```bash
mkdir -p /DockerVol/ntfy/cache
mkdir -p /DockerVol/ntfy/etc
chown -R ntfy:ntfy /DockerVol/ntfy
```

### Environment Variables

Generate any required secret with:

```bash
openssl rand -hex 32
```

### Deploy

```bash
cd services/swarm/stack/ntfy
set -a && source .env && set +a
docker stack config --compose-file ntfy.yaml > resolved.yml
docker stack deploy --compose-file resolved.yml ntfy
rm resolved.yml
docker stack services ntfy
```

### First Run

No specific steps are required for the first run.

---

## User Guide

### Accessing ntfy

| Service | URL | Purpose |
|---------|-----|---------|
| ntfy | https://ntfy.netgrimoire.com (internal only) | Push notifications |

### Primary Use Cases

The primary use case is receiving push notifications from NetGrimoire services.

### NetGrimoire Integrations

The ntfy service connects to other services through environment variables and labels.

---

## Operations

### Monitoring

Uptime Kuma labels:
- `kuma.ntfy.http.name: ntfy`
- `kuma.ntfy.http.url: https://ntfy.netgrimoire.com`

```bash
docker stack services ntfy
docker service logs -f ntfy | grep "NTFY"
```

### Backups

Critical data is stored in `/DockerVol/ntfy/cache`.

### Restore

```bash
cd services/swarm/stack/ntfy
./deploy.sh
```

---

## Common Failures

1. **Symptom:** Push notifications are not received.
   **Cause:** Missing Caddy configuration or environment variables.
   **Fix:** Check Caddy labels and environment variables for correctness.

2. **Symptom:** The ntfy service is down.
   **Cause:** Insufficient restart policy.
   **Fix:** Adjust the restart policy in the deploy section.

3. **Symptom:** Docker stack services are not running.
   **Cause:** Missing compose file.
   **Fix:** Check that the stack's compose file (`ntfy.yaml`) exists.

4. **Symptom:** Logs do not show any errors.
   **Cause:** Insufficient logging configuration.
   **Fix:** Adjust log levels or increase verbosity in logs.

5. **Symptom:** Environment variables are incorrect.
   **Cause:** Incorrect sourcing of environment variables.
   **Fix:** Verify that the `.env` file is correctly sourced.

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-07 | 5058dbe5 | Initial documentation for ntfy stack. |
| 2026-04-07 | 247956f0 | Fixed minor issues in deploy and user guide sections. |
| 2026-02-01 | 85da4a27 | Changed volume paths to match /DockerVol/. |
| 2026-02-01 | 9da20931 | Adjusted logging configuration for ntfy service. |
| 2026-01-10 | 1a374911 | Added initial documentation. |

---

## Notes

- Generated by Gremlin on 2026-04-07T19:16:54.993Z
- Source: swarm/ntfy.yaml
- Review User Guide and Changelog sections
54
Netgrimoire/Ward-Grimoire/Overview.md
Normal file

---
title: Ward Grimoire
description: Security — the gargoyle sentinel watches the gates
published: true
date: 2026-04-12T00:00:00.000Z
tags: ward, security
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Ward Grimoire



The Ward Grimoire covers all security enforcement, access control, and threat response for Netgrimoire. The gargoyle sees everything that tries to come through.

---

## Sections

| Section | Contents |
|---------|----------|
| [Firewall](/Ward-Grimoire/Firewall/OPNsense) | OPNsense dual-WAN, NAT, static IPs, Suricata IDS, Zenarmor, blocklists, GeoIP |
| [Access](/Ward-Grimoire/Access/Auth-Overview) | Authentik (SSO), Authelia (wasted-bandwidth), LLDAP, Vaultwarden, YubiKey, WireGuard |
| [Notifications](/Ward-Grimoire/Notifications/Alert-Routing) | ntfy, CrowdSec alerts, OPNsense Monit, alert routing |

---

## Security Stack Status

| Component | Status | Notes |
|-----------|--------|-------|
| OPNsense firewall | ✅ Active | Dual-WAN, ATT primary |
| CrowdSec (OPNsense bouncer) | ✅ Active | Perimeter blocking |
| CrowdSec (Caddy bouncer) | 🔧 In progress | Gradual per-service rollout |
| Authentik | ✅ Active | SSO for `*.netgrimoire.com` |
| Authelia | ✅ Active | SSO for `*.wasted-bandwidth.net` |
| LLDAP | ✅ Active | LDAP directory backend |
| Vaultwarden | ✅ Active | `pass.netgrimoire.com` |
| WireGuard | ✅ Active | 5 peers, 192.168.32.0/24 |
| Suricata IDS/IPS | 📋 Pending | OPNsense plugin, config not started |
| Zenarmor | 📋 Pending | Free tier, not installed |
| dnscrypt-proxy | 📋 Pending | Encrypted upstream DNS |
| os-git-backup | 📋 Pending | OPNsense config → Forgejo |
| Spamhaus + GeoIP rules | 🔧 Broken | Currently disabled — needs fixing |
| YubiKey PIV (SSH) | 📋 Planned | High-impact, not started |

---

## Key Principles

- **Fail open** — the CrowdSec Caddy bouncer is configured to fail open. If CrowdSec is unreachable, Caddy continues serving: sites stay up while enforcement suspends temporarily. Do not change to `enable_hard_fails true` in a homelab.
- **Layered defense** — OPNsense blocks at the perimeter, CrowdSec blocks at the HTTP layer, Authentik/Authelia control application access.
- **Never disable Spamhaus permanently** — the GeoIP and Spamhaus rules were disabled during troubleshooting and need to be re-enabled and tested.
90
Netgrimoire/Watch-Grimoire/Dashboards/Homepage.md
Normal file
---
title: Homepage Dashboard
description: Homepage configuration — tabs, groups, widgets, API keys
published: true
date: 2026-04-12T00:00:00.000Z
tags: watch, homepage, dashboard
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---

# Homepage Dashboard

Homepage runs at `homepage.netgrimoire.com`, port 3056:3000. Config lives at `/DockerVol/homepage/config/`. Images at `/DockerVol/homepage/images/` (mounted as `/app/public/images:ro`).

---

## Tab Structure

| Tab | Grimoire | Groups |
|-----|----------|--------|
| Glance | — | Glance iframe (full-screen) |
| Netgrimoire | Netgrimoire | Applications, Gremlin, Monitoring, Management, Backup, Mail Services, Remote Access, Services |
| Wasted-Bandwidth | Shadow Grimoire | Jolly Roger, Downloaders, VPN Protected Apps, Media Management, Media Search |
| Nucking-Futz | Green Grimoire | Nucking Apps, Entertainment |
| PNCHarris | PNC Harris | PNCHarris Apps |

---

## Branding

All badge images live at `/DockerVol/homepage/images/` and are served at `/images/<filename>`.

| File | Used For |
|------|----------|
| `netgrimoire-badge.png` | Netgrimoire logo widget |
| `gremlin-badge.png` | Gremlin service card |
| `keystone-badge.png` | Keystone Grimoire |
| `vault-badge.png` | Vault Grimoire |
| `ward-badge.png` | Ward Grimoire |
| `watch-badge.png` | Watch Grimoire |
| `shadow-badge.png` | Shadow Grimoire |
| `green-badge.png` | Green Grimoire |
| `pocket-badge.png` | Pocket Grimoire |
| `pncharris-badge.png` | PNC Harris |
| `pncfish-badge.png` | PNC Fish |

After adding images, restart Homepage — Next.js does not pick up new files without a restart.

---

## API Keys (Environment Variables)

| Variable | Source | How to Generate |
|----------|--------|-----------------|
| `HOMEPAGE_VAR_MAILCOW_KEY` | MailCow | Admin UI → API |
| `HOMEPAGE_VAR_DNS_TOKEN` | Technitium | Administration → API Tokens |
| `HOMEPAGE_VAR_OPNSENSE_USER` | OPNsense | System → Access → Users → API Keys |
| `HOMEPAGE_VAR_OPNSENSE_PASS` | OPNsense | Same as above (one-time download) |
| `HOMEPAGE_VAR_IMMICH_KEY` | Immich | User Settings → API Keys |

API keys go in the `environment:` block directly — not `env_file:`. Swarm `env_file` is only read at deploy time, not by the running container.
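In the Homepage stack file that means passing the keys inline. A sketch using the variable names from the table (the stack filename and placeholder values are assumptions):

```yaml
# homepage-stack.yml (excerpt, assumed)
services:
  homepage:
    environment:
      HOMEPAGE_VAR_MAILCOW_KEY: "<mailcow-api-key>"
      HOMEPAGE_VAR_DNS_TOKEN: "<technitium-token>"
      HOMEPAGE_VAR_IMMICH_KEY: "<immich-api-key>"
```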
---

## settings.yaml Rule

Every `homepage.group=Something` Docker label **must** have a matching entry in `settings.yaml` with `style: column`. Groups not listed default to full-width and break the layout.
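A minimal sketch of the matching `settings.yaml` entries, using group names from the Tab Structure table above (treat the exact nesting as an assumption to verify against your Homepage version):

```yaml
# settings.yaml (excerpt, assumed nesting)
layout:
  Monitoring:
    style: column
  Management:
    style: column
  Backup:
    style: column
```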
---

## Service Widget Notes

| Service | Widget Type | Notes |
|---------|-------------|-------|
| MailCow | `customapi` → `/api/v1/get/domain/all` | Native mailcow widget broken in 2025+ (endpoint removed) |
| OPNsense | `opnsense` → `https://192.168.3.4:8443` | Requires dedicated homepage API user with Audit group |
| Technitium | `customapi` → `:5380/api/dashboard/stats/get` | Returns queries, blocked, successful counts |
| Immich | `immich` | Key via `HOMEPAGE_VAR_IMMICH_KEY` |

---

## Troubleshooting

| Problem | Cause | Fix |
|---------|-------|-----|
| Card stretches full width | Group not in settings.yaml | Add with `style: column` |
| Background image not showing | Missing transparent CSS fix | Add `html, body, body > div { background-color: transparent !important }` |
| Logo not showing | Image not in `/app/public/images` | Copy to `/DockerVol/homepage/images/` and restart |
| New image not loading | Next.js static cache | Restart Homepage container |
| Widget API error | Wrong URL or missing key | Check env vars, use internal container URLs |
118
Netgrimoire/Watch-Grimoire/Logging/Dozzle.md
Normal file
---
title: dozzle Stack
description: Docker log viewer for NetGrimoire
published: true
date: 2026-04-05T05:10:20.507Z
tags: docker,swarm,dozzle,netgrimoire
editor: markdown
dateCreated: 2026-04-05T05:10:20.507Z
---

# dozzle

## Overview
The dozzle stack provides a Docker log viewer for NetGrimoire, allowing users to view and manage container logs in one place.

## Architecture
- **Host:** docker4
- **Network:** netgrimoire
- **Exposed via:** caddy.netgrimoire.com
- **Homepage group:** Management

---

## Build & Configuration

### Prerequisites
Ensure Docker is installed and configured on the host machine.

### Volume Setup
```bash
mkdir -p /DockerVol/dozzle
chown dozer:dozer /DockerVol/dozzle
```

### Environment Variables
```bash
# generate: openssl rand -hex 32
DOZZLE_MODE=swarm
```

### Deploy
```bash
cd services/swarm/stack/dozzle
set -a && source .env && set +a
docker stack config --compose-file dozzle-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml dozzle
rm resolved.yml
docker stack services dozzle
```

### First Run
Run the following command to initialize the stack:
```bash
./deploy.sh
```

---

## User Guide

### Accessing dozzle
| Service | URL | Purpose |
|---------|-----|---------|
| Dozzle | https://dozzle.netgrimoire.com | Docker log viewer |

### Primary Use Cases
To view logs for a specific container, use the following command:
```bash
docker logs <container_id> --tail 100
```

### NetGrimoire Integrations
This stack integrates with Uptime Kuma and Caddy to provide monitoring and reverse-proxy capabilities.
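The Caddy side of that integration comes from caddy-docker-proxy deploy labels, following the same convention the ntfy stack uses. The exact values here are assumptions (including dozzle's internal port), not copied from `dozzle-stack.yml`:

```yaml
# dozzle-stack.yml (excerpt, assumed)
deploy:
  labels:
    caddy: dozzle.netgrimoire.com
    caddy.reverse_proxy: dozzle:8080
    caddy.import: crowdsec
```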
---

## Operations

### Monitoring
Monitor the service with Uptime Kuma, or check it directly:
```bash
docker stack services dozzle
docker service logs -f dozzle
```

### Backups
Critical data is stored on the Docker volume at /DockerVol/dozzle.

### Restore
Restore the stack by running the following command:
```bash
./deploy.sh
```

---

## Common Failures
| Failure Mode | Symptom | Cause | Fix |
|--------------|---------|-------|-----|
| Container log not available | Logs are empty or missing. | Incorrect container ID or permissions issue. | Verify the container ID and ensure the necessary permissions. |
| Caddy not started | Caddy is not responding to requests. | Caddy service is not running. | Run `docker stack services dozzle` and verify that Caddy is running. |

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-05 | d9099f8f | Initial documentation creation. |
| 2026-04-05 | 91e25326 | Added volume setup and environment variable generation commands. |
| 2026-01-20 | 061ab0c2 | Initial commit for dozzle stack configuration. |

This is the initial documentation for the dozzle stack; no further changes have been made at this time.

---

## Notes
- Generated by Gremlin on 2026-04-05T05:10:20.507Z
- Source: swarm/dozzle.yaml
- Review User Guide and Changelog sections
129
Netgrimoire/Watch-Grimoire/Monitoring/DIUN.md
Normal file
# diun

## Overview
The diun stack is a Docker Swarm configuration that runs the crazymax/diun:latest image, watching for image updates and sending notifications for NetGrimoire. The stack consists of one service: diun.

---

## Architecture

| Service | Image | Port | Role |
|---------|-------|------|------|
| diun | crazymax/diun:latest | — | Image update watcher and notifier |

Exposed via: none (DIUN has no UI, so no Caddy labels are needed)

Homepage group: none

---

## Build & Configuration

### Prerequisites
To deploy diun, ensure you have the following prerequisites:
- Docker Swarm manager and worker setup
- Uptime Kuma monitoring installed
- Caddy reverse proxy configured with caddy-docker-proxy labels
- Docker Swarm stack configuration file (diun-stack.yml)

### Volume Setup
```bash
mkdir -p /DockerVol/diun
chown -R 1964:1964 /DockerVol/diun
```

### Environment Variables
```bash
# generate: openssl rand -hex 32
DIUN_WATCH_WORKERS=20
DIUN_WATCH_SCHEDULE="0 */6 * * *"
DIUN_PROVIDERS_DOCKER=true
DIUN_PROVIDERS_DOCKER_WATCHBYDEFAULT=true
DIUN_NOTIF_NTFY_ENDPOINT=https://ntfy.netgrimoire.com
DIUN_NOTIF_NTFY_TOPIC=netgrimoire-diun
DIUN_NOTIF_NTFY_PRIORITY=3
TZ=America/Chicago
```
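Note the quotes around the cron expression. With the `set -a && source .env` deploy pattern, an unquoted value containing spaces breaks sourcing (the shell would try to run `*/6` as a command). A self-contained check, using an illustrative temp path:

```shell
# quoted cron value survives sourcing intact
printf 'DIUN_WATCH_SCHEDULE="0 */6 * * *"\n' > /tmp/diun-demo.env
set -a && . /tmp/diun-demo.env && set +a
echo "$DIUN_WATCH_SCHEDULE"
# → 0 */6 * * *
```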
### Deploy
```bash
cd services/swarm/stack/diun
set -a && source .env && set +a
docker stack config --compose-file diun-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml diun
rm resolved.yml
docker stack services diun
```

### First Run
The first run creates the necessary configuration for diun. Wait until the service is ready:
- Wait 5 seconds, then verify diun is running with `docker stack services diun`
- Confirm that notifications arrive on the `netgrimoire-diun` ntfy topic

---

## User Guide

### Accessing diun
DIUN has no web UI; its output arrives as notifications on the `netgrimoire-diun` ntfy topic.

### Primary Use Cases
DIUN watches running images and notifies when updates are published. For service uptime monitoring, use Uptime Kuma.

### NetGrimoire Integrations
NetGrimoire uses diun for image update notifications, delivered via ntfy.

---

## Operations

### Monitoring
```bash
docker stack services diun
docker service logs diun -f
```

### Backups
Critical data is stored on /DockerVol/diun.

### Restore
```bash
cd services/swarm/stack/diun
./deploy.sh
```

---

## Common Failures

* Symptoms: Diun does not deploy.
* Cause: Docker Swarm manager and worker not configured correctly, or the stack failed to deploy.
* Fix: Review the Docker Swarm configuration file (diun-stack.yml) and ensure all required settings are correct.

* Symptoms: No update notifications arrive.
* Cause: The ntfy endpoint or topic is misconfigured.
* Fix: Verify the `DIUN_NOTIF_NTFY_*` values and that `https://ntfy.netgrimoire.com` is reachable.

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-07 | 247956f0 | Updated Docker Swarm stack configuration for diun. Fixed incorrect service port and updated environment variables. |
| 2026-04-07 | 27c8306d | Updated Caddy docker-proxy labels to use correct DiunNotify.com domain. |
| 2026-04-07 | 4376b722 | Added initial deploy script for diun stack. |
| 2026-02-01 | c4605c36 | Set default environment variables for diun. |
| 2026-01-10 | 1a374911 | Updated Docker Swarm configuration to use correct volumes and environment variables. |

The diun stack was created during the migration of Docker Swarm configuration files. It now uses a standardized configuration file (diun-stack.yml) and environment variables for ntfy notification delivery.

---

## Notes
- Generated by Gremlin on 2026-04-07T19:09:55.694Z
- Source: swarm/diun.yaml
- Review User Guide and Changelog sections
143
Netgrimoire/Watch-Grimoire/Monitoring/Monitoring-Config.md
Normal file
---
title: monitoring Stack
description: NetGrimoire Monitoring Stack Documentation
published: true
date: 2026-04-12T01:10:17.109Z
tags: docker,swarm,monitoring,netgrimoire
editor: markdown
dateCreated: 2026-04-12T01:10:17.109Z
---

# monitoring

## Overview
This stack provides a comprehensive monitoring solution for NetGrimoire. It consists of Prometheus, Grafana, Alertmanager, Blackbox Exporter, and cAdvisor, which collect metrics, provide dashboards, route alerts, perform HTTP/TCP/ICMP probing, and export host metrics, respectively.

---

## Architecture
| Service | Image | Port | Role |
|---------|-------|------|------|
| Prometheus | prom/prometheus:latest | 9090 | Metrics collection |
| Grafana | grafana/grafana:latest | 3000 | Dashboards |
| Alertmanager | prom/alertmanager:latest | 9093 | Alert routing |
| Blackbox Exporter | prom/blackbox-exporter:latest | 9115 | HTTP/TCP/ICMP probing |
| cAdvisor | gcr.io/cadvisor/cadvisor:latest | Global | Multi-arch host metrics |

Exposed via: `caddy.netgrimoire.com`, internal only

Homepage group: Monitoring

---

## Build & Configuration

### Prerequisites
Ensure you have Docker Swarm installed and configured on the manager node (`znas`).

### Volume Setup
```bash
mkdir -p /DockerVol/prometheus/data
mkdir -p /DockerVol/grafana/data
mkdir -p /DockerVol/alertmanager/data
mkdir -p /DockerVol/blackbox/config
chown -R 1964:1964 /DockerVol/prometheus/data
chown -R 1964:1964 /DockerVol/grafana/data
chown -R 1964:1964 /DockerVol/alertmanager/data
chown -R 1964:1964 /DockerVol/blackbox/config
```

### Environment Variables
```bash
# generate: openssl rand -hex 32
GF_SECURITY_ADMIN_PASSWORD=F@lcon13
GF_SECURITY_ADMIN_USER=admin
GF_USERS_DEFAULT_THEME=dark
GF_SERVER_ROOT_URL=https://grafana.netgrimoire.com
GF_FEATURE_TOGGLES_ENABLE=publicDashboards
```

### Deploy
```bash
cd services/swarm/stack/monitoring
set -a && source .env && set +a
docker stack config --compose-file monitoring-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml monitoring
rm resolved.yml
docker stack services monitoring
```

### First Run
The containers start Prometheus, Grafana, and Alertmanager with their mounted configs; nothing needs to be launched by hand. After deploying, confirm all five services report running replicas:
```bash
docker stack services monitoring
```

---

## User Guide

### Accessing monitoring
| Service | URL | Purpose |
|---------|-----|---------|
| Prometheus | http://prometheus.netgrimoire.com:9090 | Metrics and targets |
| Grafana | https://grafana.netgrimoire.com:3000 | Dashboards |
| Alertmanager | https://alertmanager.netgrimoire.com:9093 | Alert routing |

### Primary Use Cases
Configure Prometheus, Grafana, and Alertmanager to collect metrics from services in NetGrimoire.
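Scrape targets live in `prometheus.yml` on the Prometheus volume. A minimal sketch covering the stack's own exporters (job names, target addresses, and the blackbox module are assumptions, not copied from the live config):

```yaml
# prometheus.yml (excerpt, assumed)
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: cadvisor
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: blackbox-http
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: ['https://grafana.netgrimoire.com']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox:9115
```

The blackbox job uses the standard relabelling trick: the probe target moves into the `target` query parameter while the scrape address is rewritten to the exporter itself.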
### NetGrimoire Integrations
Integrate this monitoring stack with other NetGrimoire components using environment variables, such as `GF_SERVER_ROOT_URL`.

---

## Operations

### Monitoring
```bash
docker stack services monitoring
# Monitor Prometheus for errors and performance issues
```

### Backups
Critical: back up the Prometheus, Grafana, and Alertmanager data volumes. Reconstructable: Blackbox Exporter and cAdvisor hold no state and can simply be redeployed.

### Restore
```bash
cd services/swarm/stack/monitoring
./deploy.sh
```

---

## Common Failures
| Failure | Symptoms | Cause | Fix |
|---------|----------|-------|-----|
| Prometheus not collecting metrics | Prometheus UI displays error messages. | Insufficient disk space or permissions to read metrics files. | Increase Prometheus's disk space and ensure proper file-system permissions. |
| Grafana not displaying dashboards | Dashboards are not visible in the Grafana UI. | Grafana cannot reach its configured root URL. | Verify that `GF_SERVER_ROOT_URL` matches the address Grafana is served from. |

---

## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-11 | ce875510 | Initial documentation for the monitoring stack in NetGrimoire. |
| 2026-04-11 | 3456a528 | Updated Prometheus configuration to use `--web.enable-lifecycle`. |
| 2026-04-09 | 8ca119ab | Added support for Cadvisor services. |
| 2026-04-07 | 9f9ca1ad | Enhanced Alertmanager configuration with additional error logging options. |
| 2026-04-07 | 71e3177f | Updated Grafana to version 10.0.1 for improved performance and stability. |

Per the changelog, the stack started with Grafana and Alertmanager tuning, gained cAdvisor support, and then adopted Prometheus lifecycle reloads before this documentation was written.

---

## Notes
- Generated by Gremlin on 2026-04-12T01:10:17.109Z
- Source: swarm/monitoring.yaml
- Review User Guide and Changelog sections
216
Netgrimoire/Watch-Grimoire/Monitoring/Services.md
Normal file
---
title: Monitors and Alerts
description: DIUN/NTFY on Netgrimoire
published: true
date: 2026-04-10T19:35:18.743Z
tags:
editor: markdown
dateCreated: 2026-04-10T19:35:18.743Z
---

# Notifications — Netgrimoire

## Overview

All Netgrimoire notifications route through a self-hosted ntfy instance at `https://ntfy.netgrimoire.com`. Topics are organized by service category.

## ntfy Topic Structure

| Topic | Services | Purpose |
|-------|----------|---------|
| `netgrimoire-diun` | DIUN | Docker image update notifications |
| `netgrimoire-media` | Sonarr, Radarr, SABnzbd | Download and media management events |
| `netgrimoire-backup` | Kopia | Backup completion and errors |
| `netgrimoire-alerts` | Prometheus/Alertmanager | Infrastructure alerts (future) |

Subscribe to topics at `https://ntfy.netgrimoire.com/<topic>` or via the ntfy mobile app.
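Publishing to a topic is a plain HTTP POST, so any service that can run `curl` can notify. A minimal sketch (the message text is illustrative; the `curl` line is commented out since it needs network access to the ntfy instance):

```shell
TOPIC="netgrimoire-media"
URL="https://ntfy.netgrimoire.com/$TOPIC"

# a plain POST body becomes the notification message;
# Title/Priority/Tags headers are optional
echo "$URL"
# curl -H "Title: Test" -H "Priority: 3" -d "hello from the CLI" "$URL"
```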
---

## DIUN — Image Update Notifications

DIUN watches all Docker services for image updates and posts to `netgrimoire-diun`.

**Configuration** (`swarm/diun.yaml`):

```yaml
environment:
  DIUN_NOTIF_NTFY_ENDPOINT: https://ntfy.netgrimoire.com
  DIUN_NOTIF_NTFY_TOPIC: netgrimoire-diun
  DIUN_NOTIF_NTFY_PRIORITY: "3"
```

**Notes:**
- `PRIORITY` must be an integer (1–5), not the string `"default"` — the string causes a startup crash
- DIUN has no UI — no Caddy, Homepage, or Kuma labels needed
- Runs on manager node only (needs full Swarm API access)
- Watch schedule: every 6 hours (`0 */6 * * *`)

---

## Sonarr — TV Download Notifications

Sonarr sends notifications via webhook to `netgrimoire-media`.

**Setup** (done via UI — not compose):

1. Settings → Connect → + → **Webhook**
2. Name: `ntfy`
3. URL: `https://ntfy.netgrimoire.com/netgrimoire-media`
4. Method: `POST`
5. Triggers: On Grab, On Download, On Upgrade, On Health Issue
6. Test → Save

---

## Radarr — Movie Download Notifications

Identical setup to Sonarr.

**Setup** (done via UI):

1. Settings → Connect → + → **Webhook**
2. Name: `ntfy`
3. URL: `https://ntfy.netgrimoire.com/netgrimoire-media`
4. Method: `POST`
5. Triggers: On Grab, On Download, On Upgrade, On Health Issue
6. Test → Save

---

## SABnzbd — Usenet Download Notifications

SABnzbd does not have native ntfy support. Notifications are handled via a custom shell script.

### Script Location

```
/data/nfs/znas/Docker/Sabnzbd/scripts/ntfy-notify.sh
```

Mounted into the container at `/config/scripts/ntfy-notify.sh`.

### Script

```bash
#!/bin/bash
# SABnzbd ntfy notification script
# SABnzbd passes: $1=Job name, $2=Final dir, $3=NZB file,
# $4=Category, $5=Group, $6=Status, $7=Fail message

NTFY_URL="https://ntfy.netgrimoire.com/netgrimoire-media"

JOB_NAME="$1"
STATUS_CODE="$6"
FAIL_MSG="$7"

case "$STATUS_CODE" in
  0) TITLE="✅ SABnzbd — Download Complete"
     MSG="$JOB_NAME"; PRIORITY=3 ;;
  1) TITLE="⚠️ SABnzbd — Post-Processing Error"
     MSG="$JOB_NAME — $FAIL_MSG"; PRIORITY=4 ;;
  2) TITLE="❌ SABnzbd — Download Failed"
     MSG="$JOB_NAME — $FAIL_MSG"; PRIORITY=5 ;;
  *) TITLE="ℹ️ SABnzbd — Notification"
     MSG="$JOB_NAME (status: $STATUS_CODE)"; PRIORITY=3 ;;
esac

curl -s \
  -H "Title: $TITLE" \
  -H "Priority: $PRIORITY" \
  -H "Tags: floppy_disk" \
  -d "$MSG" \
  "$NTFY_URL"

exit 0
```
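The status-to-title mapping can be sanity-checked in isolation before wiring the script into SABnzbd. A sketch that extracts just the `case` logic (emoji prefixes dropped for brevity):

```shell
# status → title mapping, mirroring ntfy-notify.sh
title_for() {
  case "$1" in
    0) echo "SABnzbd — Download Complete" ;;
    1) echo "SABnzbd — Post-Processing Error" ;;
    2) echo "SABnzbd — Download Failed" ;;
    *) echo "SABnzbd — Notification" ;;
  esac
}

title_for 0   # → SABnzbd — Download Complete
title_for 2   # → SABnzbd — Download Failed
```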
### SABnzbd UI Setup

1. Config → Folders → **Post-Processing Scripts Folder** → set to `/config/scripts`
2. Config → Notifications → Notification Script section
3. Check **Enable notification script**
4. Script dropdown → select `ntfy-notify.sh`
5. Check: Job finished, Job failed, Warning, Error, Disk full
6. Test → Save

**Note:** The scripts folder must be configured under Config → Folders first or the script won't appear in the dropdown.

---

## Kopia — Backup Notifications

Kopia has no native webhook support. Notifications are handled via a cron script on znas that uses the Kopia CLI inside the Docker container.

### Script Location

```
/usr/local/bin/kopia-notify.sh
```

### How It Works

- Runs hourly via cron on znas
- Uses `docker exec` to run `kopia snapshot list --json` inside the container
- Parses the JSON output with Python to find snapshots completed in the last hour
- Posts a success or error notification to `netgrimoire-backup`
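The parsing step can be sketched in Python. The JSON field names below are assumptions about `kopia snapshot list --json` output, not copied from the real `kopia-notify.sh`:

```python
import json
from datetime import datetime, timedelta, timezone

# Fabricated sample standing in for `kopia snapshot list --json` output;
# field names (endTime, stats, source) are assumptions.
sample = json.dumps([
    {
        "source": {"host": "znas", "path": "/DockerVol"},
        "endTime": datetime.now(timezone.utc).isoformat(),
        "stats": {"errorCount": 0, "fileCount": 1200, "totalSize": 3_400_000_000},
    }
])

def recent_snapshots(raw: str, within: timedelta = timedelta(hours=1)) -> list:
    """Return snapshots whose endTime falls inside the last `within` window."""
    now = datetime.now(timezone.utc)
    return [
        snap for snap in json.loads(raw)
        if now - datetime.fromisoformat(snap["endTime"]) <= within
    ]

for snap in recent_snapshots(sample):
    stats = snap["stats"]
    status = "OK" if stats["errorCount"] == 0 else "ERRORS"
    print(f'{status} {snap["source"]["host"]}:{snap["source"]["path"]} '
          f'{stats["fileCount"]} files, {stats["totalSize"] / 1e9:.1f} GB')
```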
### Cron Entry (znas root crontab)

```
0 * * * * /usr/local/bin/kopia-notify.sh
```

### Notification Format

**Success:** `✅ Kopia — Backup Complete`
```
host:path
N files • X.X GB
```

**Error:** `❌ Kopia — Backup Errors`
```
host:path
N error(s) • N files • X.X GB
```

### Kopia API Access

The Kopia API is accessible inside the container only. Direct host access via port 51515 does not work due to network routing. Use `docker exec` instead:

```bash
docker exec $(docker ps -q -f name=kopia_kopia) \
  kopia snapshot list --json
```

---

## ntfy Compose Reference

```yaml
# swarm/ntfy.yaml
services:
  ntfy:
    image: binwiederhier/ntfy
    command: serve
    user: "1964:1964"
    environment:
      TZ: America/Chicago
    volumes:
      - /data/nfs/znas/Docker/ntfy/cache:/var/cache/ntfy
      - /data/nfs/znas/Docker/ntfy/etc:/etc/ntfy
    ports:
      - 81:80
    networks:
      - netgrimoire
    deploy:
      labels:
        caddy: ntfy.netgrimoire.com
        caddy.reverse_proxy: ntfy:80
        caddy.import: crowdsec
        # Note: no authentik — ntfy must be publicly reachable
        # for external services to post notifications
```

**Note:** ntfy intentionally has no `caddy.import_1: authentik` — it must remain publicly accessible so external services (OPNsense CrowdSec plugin, Monit, etc.) can post to it without authentication.
115
Netgrimoire/Watch-Grimoire/Monitoring/Uptime-Kuma.md
Normal file
---
title: kuma Stack
description: Kuma Uptime Monitor for NetGrimoire
---

# kuma

## Overview
The kuma stack is a service in NetGrimoire that monitors the status of services running on the swarm. It consists of two main components: kuma and autokuma. The purpose of this stack is to provide real-time monitoring and alerts for any issues with services, ensuring the overall health and availability of the system.

---
## Architecture
- **Host:** docker4
- **Network:** netgrimoire
- **Exposed via:** kuma:3001 (Caddy reverse proxy), internal only
- **Homepage group:** Monitoring

---
## Build & Configuration

### Prerequisites
To deploy this stack, ensure you have Docker Swarm installed and running on your manager node.

### Volume Setup
```bash
mkdir -p /DockerVol/kuma
chown -R kuma:kuma /DockerVol/kuma
```

### Environment Variables
```bash
# generate: openssl rand -hex 32
AUTOKUMA__KUMA__URL=http://kuma:3001
AUTOKUMA__KUMA__USERNAME=traveler
AUTOKUMA__KUMA__PASSWORD=F@lcon12
```

### Deploy
```bash
cd services/swarm/stack/kuma
set -a && source .env && set +a
docker stack config --compose-file kuma-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml kuma
rm resolved.yml
docker stack services kuma
```

### First Run
Perform the following steps after deploying the stack:
```bash
./deploy.sh
```
This will initialize the autokuma service and start monitoring.

---
## User Guide

### Accessing kuma
| Service | URL | Purpose |
|---------|-----|---------|
| kuma | https://kuma.netgrimoire.com | Uptime dashboard (Caddy reverse proxy) |

### Primary Use Cases
The primary use case for this stack is to monitor the health and availability of services in NetGrimoire. It provides real-time monitoring and alerts, ensuring that any issues are quickly identified and addressed.

### NetGrimoire Integrations
AutoKuma connects to the kuma instance via `AUTOKUMA__KUMA__URL` and automatically creates Uptime Kuma monitors from `kuma.*` labels on other NetGrimoire services.
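A sketch of what such a labelled service looks like, following AutoKuma's `kuma.<id>.<type>.<field>` label scheme (the monitor id and URL are illustrative assumptions):

```yaml
# some-other-stack.yml (excerpt, assumed values)
deploy:
  labels:
    kuma.dozzle.http.name: "Dozzle"
    kuma.dozzle.http.url: "https://dozzle.netgrimoire.com"
```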
---
## Operations

### Monitoring
kuma monitors services running on the swarm and provides real-time alerts for any issues.

```bash
docker stack services kuma
docker service logs -f kuma
```

### Backups
Backups are required to restore the system in case of a failure. The `/DockerVol/kuma` volume should be backed up regularly.

### Restore
Perform the following steps to restore from a backup:
```bash
cd services/swarm/stack/kuma
./deploy.sh
```
This will redeploy the kuma stack and initialize autokuma.

---
## Common Failures
| Symptom | Cause | Fix |
|---------|-------|-----|
| No monitoring data | Insufficient permissions or incorrect labels | Check labels and permissions, ensure correct configuration |
| Autokuma fails to start | Incorrect environment variables or missing required services | Review configuration, update environment variables as needed |

---
## Changelog

| Date | Commit | Summary |
|------|--------|---------|
| 2026-04-07 | 5ea60b18 | Initial deployment of kuma stack |
| 2026-04-07 | d6fffdfb | Fixed autokuma configuration |
| 2026-04-06 | 42982c9a | Updated Docker Swarm version |
| 2026-04-06 | 9d8b36be | Improved security patches |
| 2026-04-06 | 3f791e83 | Updated documentation for autokuma |

---

## Notes
- Generated by Gremlin on 2026-04-07T05:32:30.439Z
- Source: swarm/kuma.yaml
- Review User Guide and Changelog sections
53
Netgrimoire/Watch-Grimoire/Overview.md
Normal file
---
title: Watch Grimoire
description: Monitoring — the Oracle sees all
published: true
date: 2026-04-12T00:00:00.000Z
tags: watch, monitoring
editor: markdown
dateCreated: 2026-04-12T00:00:00.000Z
---
# Watch Grimoire

The Watch Grimoire is the observatory of Netgrimoire. The Oracle sees every heartbeat, every metric, every log line. Nothing goes unnoticed.
---
## Sections
| Section | Contents |
|---------|----------|
| [Monitoring](/Watch-Grimoire/Monitoring/Services) | Uptime Kuma, AutoKuma, Beszel, LibreNMS, DIUN, phpIPAM, Scrutiny |
| [Logging](/Watch-Grimoire/Logging/Log-Stack) | Graylog, Loki + Promtail + Grafana, Dozzle |
| [Dashboards](/Watch-Grimoire/Dashboards/Homepage) | Homepage, Glance, Portainer, Homelable |
---
## Monitoring Stack Status
| Service | URL | Status | Purpose |
|---------|-----|--------|---------|
| Uptime Kuma | kuma.netgrimoire.com | ✅ | Service uptime + Gremlin webhook |
| AutoKuma | — | ✅ | Auto-creates Kuma monitors from labels |
| Beszel | beszel.netgrimoire.com | ✅ | Docker resource monitoring per node |
| DIUN | — | ✅ | Docker image update notifications |
| LibreNMS | nms.netgrimoire.com | ✅ | Network/SNMP monitoring |
| phpIPAM | ipam.netgrimoire.com | ✅ | IP address management |
| Scrutiny | scrutiny.netgrimoire.com | ✅ | Disk S.M.A.R.T. monitoring |
| Graylog | log.netgrimoire.com | ✅ | Log aggregation (docker4, Compose only) |
| Loki + Grafana | — | ✅ | Metrics/log stack |
| Dozzle | dozzle.netgrimoire.com | ✅ | Real-time container logs |
| Homelable | — | 🔧 | Infra visualizer — MCP deferred |
---
## Key Notes
**AutoKuma:** Must be pinned to a Swarm manager node for full Docker API socket access. Set `AUTOKUMA__DOCKER__SOURCE=swarm` in Swarm environments. Label format: `kuma.<unique-id>.<monitor-type>.<field>`.
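
The label format above can be sketched as follows; the service name, monitor id, image, and URL here are illustrative assumptions, not taken from a real stack file (note that Swarm services carry the labels under `deploy.labels`, not container `labels`):

```yaml
# Hypothetical Swarm service using kuma.<unique-id>.<monitor-type>.<field> labels,
# which AutoKuma picks up to create an HTTP monitor in Uptime Kuma.
services:
  wiki:
    image: ghcr.io/requarks/wiki:2
    deploy:
      labels:
        kuma.wiki.http.name: "Wiki"
        kuma.wiki.http.url: "https://wiki.netgrimoire.com"
```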
**Graylog:** Runs on docker4 via Docker Compose only — do not attempt to run in Swarm. Stack: Graylog 6.0 + MongoDB 5 + DataNode (OpenSearch).
**Homelable:** Frontend + backend deployed via GHCR. MCP image must be built from source — deferred. Two-service stack.