docs(gremlin): create monitoring
parent e6a42190d2
commit fc68d883d6
1 changed files with 92 additions and 0 deletions
Netgrimoire/Services/monitoring/monitoring.md
@ -0,0 +1,92 @@
# monitoring

Overview
---------------

The monitoring stack collects, visualizes, and alerts on metrics in NetGrimoire: Prometheus collects metrics, Grafana serves dashboards, Alertmanager routes alerts, cAdvisor reports container metrics, and Node Exporter reports host metrics.

Architecture
-------------

| Service | Image | Exposed via | Role |
|---------|-------|-------------|------|
| Prometheus | `prom/prometheus:latest` | `prometheus.netgrimoire.com` | Metrics collection |
| Grafana | `grafana/grafana:latest` | `grafana.netgrimoire.com` | Dashboards |
| Alertmanager | `prom/alertmanager:latest` | `alertmanager.netgrimoire.com` | Alert routing |
| cAdvisor | `gcr.io/cadvisor/cadvisor:latest` | `cadvisor.netgrimoire.com` | Container metrics |
| Node Exporter | `prom/node-exporter:latest` | `node-exporter.netgrimoire.com` | Host metrics |

All five services appear in the Homepage group Monitoring.

Build & Configuration
---------------------

### Prerequisites

- Docker installed on docker4 with Swarm mode enabled

### Volume Setup

```bash
mkdir -p /DockerVol/prometheus/data
mkdir -p /DockerVol/grafana/data
```
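
If these directories are bind-mounted into the containers, they have to be writable by whatever user each container runs as. The following is a minimal sketch, assuming the stock images keep their default users (Grafana as UID 472, Prometheus as `nobody`, UID 65534) and that the stack file does not override them. `prepare_volumes` and `VOL_ROOT` are illustrative names, not part of the repository.

```bash
# Hypothetical helper: create the data directories and, when run as root,
# hand them to the UIDs the stock images are assumed to run as.
VOL_ROOT="${VOL_ROOT:-/DockerVol}"

prepare_volumes() {
    mkdir -p "$VOL_ROOT/prometheus/data" "$VOL_ROOT/grafana/data" || return 1
    # chown needs root, so it is skipped for unprivileged callers.
    if [ "$(id -u)" -eq 0 ]; then
        # Assumption: prom/prometheus runs as nobody (65534).
        chown -R 65534:65534 "$VOL_ROOT/prometheus/data"
        # Assumption: grafana/grafana runs as UID 472 with group 0.
        chown -R 472:0 "$VOL_ROOT/grafana/data"
    fi
}
```

Run `prepare_volumes` as root on docker4, and adjust the UIDs if the stack file sets a `user:` for either service.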

### Environment Variables

```bash
# generate: openssl rand -hex 32
GF_SECURITY_ADMIN_PASSWORD=<paste generated value>
```
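
The comment in the block above names the generator command. Below is a small sketch of wiring it into the `.env` file that the deploy step sources. `gen_secret` and `write_env` are hypothetical helper names, and the `/dev/urandom` fallback is an addition for hosts without `openssl`.

```bash
# Hypothetical helper: print a 64-character hex secret.
gen_secret() {
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex 32
    else
        # Fallback when openssl is missing: 32 random bytes rendered as hex.
        head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
        echo
    fi
}

# Hypothetical helper: write the Grafana admin password to an env file.
# The umask makes a newly created file readable only by its owner.
write_env() {
    local env_file="${1:-.env}"
    ( umask 077 && printf 'GF_SECURITY_ADMIN_PASSWORD=%s\n' "$(gen_secret)" > "$env_file" )
}
```

Calling `write_env .env` overwrites any existing file at that path. A hex-only value also avoids shell quoting problems when the file is sourced.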

### Deploy

```bash
cd services/swarm/stack/monitoring
# Export every variable defined in .env so the stack file can interpolate it
set -a && source .env && set +a
# Render the stack file with variables resolved, then deploy the rendered copy
docker stack config --compose-file monitoring-stack.yml > resolved.yml
docker stack deploy --compose-file resolved.yml monitoring
# The rendered copy may contain resolved secrets, so remove it
rm resolved.yml
# List the stack's services and their replica counts
docker stack services monitoring
```
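
The final `docker stack services` call only prints the replica counts. To turn that into a pass/fail check, one option is to compare the current and desired counts. The sketch below assumes the `NAME CURRENT/DESIRED` layout produced by `--format '{{.Name}} {{.Replicas}}'`, and `all_converged` is a hypothetical helper name.

```bash
# Hypothetical helper: read "NAME CURRENT/DESIRED" lines on stdin and
# succeed only if every listed service has all desired replicas running.
all_converged() {
    awk '
        {
            split($2, r, "/")
            if (r[1] != r[2]) { print "not ready: " $1; bad = 1 }
        }
        # Empty input (no services found) also counts as a failure.
        END { exit (bad || NR == 0) }
    '
}
```

`docker stack services monitoring --format '{{.Name}} {{.Replicas}}' | all_converged` exits non-zero while any service is still starting, and also when the stack has no services at all.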

### First Run

- After deploying, configure the network, Caddy, and Uptime Kuma for these services.

---

## User Guide

### Accessing Monitoring

| Service | URL | Purpose |
|---------|-----|---------|
| Prometheus | https://prometheus.netgrimoire.com | Metrics collection |
| Grafana | https://grafana.netgrimoire.com | Dashboards |
| Alertmanager | https://alertmanager.netgrimoire.com | Alert routing |
| cAdvisor | `cadvisor.netgrimoire.com` | Container metrics |
| Node Exporter | `node-exporter.netgrimoire.com` | Host metrics |
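
A quick way to confirm that the hostnames listed above respond is to probe each service's health endpoint. This is a sketch under assumptions: the paths are the upstream defaults (`/-/healthy` for Prometheus and Alertmanager, `/api/health` for Grafana, `/healthz` for cAdvisor, `/metrics` for Node Exporter), the `https` scheme for cAdvisor and Node Exporter is assumed, and so is unauthenticated access through the reverse proxy. `health_urls` and `check_all` are hypothetical helper names.

```bash
# Hypothetical helper: print "service url" pairs, one per line.
health_urls() {
    cat <<'EOF'
prometheus https://prometheus.netgrimoire.com/-/healthy
grafana https://grafana.netgrimoire.com/api/health
alertmanager https://alertmanager.netgrimoire.com/-/healthy
cadvisor https://cadvisor.netgrimoire.com/healthz
node-exporter https://node-exporter.netgrimoire.com/metrics
EOF
}

# Probe every endpoint; curl -f turns HTTP error statuses into a non-zero exit.
check_all() {
    local failed=0 name url
    while read -r name url; do
        if curl -fsS --max-time 5 -o /dev/null "$url"; then
            echo "ok   $name"
        else
            echo "FAIL $name"
            failed=1
        fi
    done <<EOF
$(health_urls)
EOF
    return "$failed"
}
```

Running `check_all` prints one line per service and returns non-zero if any probe fails, which makes it usable from a cron job or a post-deploy step.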

### Primary Use Cases

- Monitoring system performance and health.
- Configuring alerts for critical issues.
- Visualizing metrics in real time.

### NetGrimoire Integrations

- Connects to CrowdSec through the Caddy reverse proxy.
- Uptime Kuma monitors services and detects errors.