Monitoring Stack
The monitoring stack provides real-time visibility into cluster health and performance.
Namespace: monitoring
Chart: kube-prometheus-stack
1. Overview
Cluster health and performance metrics are aggregated via the Prometheus stack. Performance tuning is applied for ARM64 architecture consistency.
2. Components
- Prometheus: Time-series database for metric collection.
- Grafana: Dashboard visualization for infra metrics.
- Alertmanager: Notification routing based on defined alert rules.
3. Configuration and Deployment
Implementation procedures:
- Installation: Managed via ArgoCD using the standard Helm repository.
- Persistence: Longhorn volumes are utilized for historical data storage.
- Dashboards: Pre-configured dashboards for Node Exporter and K8s resources are imported.
4. Operational Maintenance
- Rule Management: Alerting rules are updated via GitOps.
- Metric Retention: Standard 15-day retention is enforced.
- Alerting: Critical infrastructure alerts are routed to Discord.