Monitoring Stack

The monitoring stack provides real-time visibility into cluster health and performance.

Namespace: monitoring
Chart: kube-prometheus-stack


1. Overview

The Prometheus stack collects and aggregates cluster health and performance metrics. The deployment is tuned to behave consistently on ARM64 nodes.


2. Components

  • Prometheus: Scrapes metrics and stores them in its time-series database.
  • Grafana: Dashboards for visualizing infrastructure metrics.
  • Alertmanager: Notification routing based on defined alert rules.
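
As a rough illustration of how the metrics stored in Prometheus can be read back, the sketch below builds an instant query against the Prometheus HTTP API (`/api/v1/query`) and parses the response. The Service address `prometheus-operated.monitoring.svc:9090` and the helper names are assumptions made for this example, not values taken from this deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: in-cluster address of the Prometheus Service; adjust to the real release.
PROMETHEUS_URL = "http://prometheus-operated.monitoring.svc:9090"


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload: dict) -> dict:
    """Map each series' `instance` label to its sample value."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    series = payload["data"]["result"]
    # Each sample is a [timestamp, "value-as-string"] pair.
    return {s["metric"].get("instance", ""): float(s["value"][1]) for s in series}


def query(promql: str, base_url: str = PROMETHEUS_URL) -> dict:
    """Run an instant query over HTTP (needs network access to Prometheus)."""
    with urlopen(build_query_url(base_url, promql), timeout=10) as response:
        return parse_instant_vector(json.load(response))


if __name__ == "__main__":
    # `up` is 1 for every target Prometheus scraped successfully, 0 otherwise.
    print(query("up"))
```

Running it from a pod inside the cluster, or through a port-forward with `base_url` pointed at localhost, should list each scrape target with its `up` value.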

3. Configuration and Deployment

The stack is deployed as follows:

  1. Installation: Managed via ArgoCD using the standard Helm repository.
  2. Persistence: Longhorn volumes store historical metric data.
  3. Dashboards: Pre-configured dashboards for Node Exporter and K8s resources are imported.
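
The installation and persistence steps above could be expressed as an ArgoCD Application along the following lines. This is a minimal sketch, not the manifest actually used here: the repository URL (the prometheus-community chart repository), the `argocd` namespace, the `default` project, the `longhorn` storage class name, and the `20Gi` volume size are all assumptions, and the chart version is left as a placeholder. The script prints JSON, which `kubectl` and ArgoCD accept in place of YAML.

```python
import json


def build_application(chart_version: str, storage_size: str = "20Gi") -> dict:
    """Build an ArgoCD Application for kube-prometheus-stack (illustrative values)."""
    helm_values = {
        "prometheus": {
            "prometheusSpec": {
                "retention": "15d",
                "storageSpec": {
                    "volumeClaimTemplate": {
                        "spec": {
                            # Assumption: Longhorn's default StorageClass name.
                            "storageClassName": "longhorn",
                            "accessModes": ["ReadWriteOnce"],
                            # Assumption: the size is a guess, not a measured need.
                            "resources": {"requests": {"storage": storage_size}},
                        }
                    }
                },
            }
        }
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: ArgoCD runs in the `argocd` namespace.
        "metadata": {"name": "kube-prometheus-stack", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                # Assumption: the upstream prometheus-community chart repository.
                "repoURL": "https://prometheus-community.github.io/helm-charts",
                "chart": "kube-prometheus-stack",
                "targetRevision": chart_version,
                # `helm.values` is a string field; a JSON document is also valid YAML.
                "helm": {"values": json.dumps(helm_values)},
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "monitoring",
            },
            "syncPolicy": {"syncOptions": ["CreateNamespace=true"]},
        },
    }


if __name__ == "__main__":
    # "<chart-version>" is a placeholder; pin the version actually deployed.
    print(json.dumps(build_application("<chart-version>"), indent=2))
```

Generating the manifest from code is only a convenience for this example; in a GitOps repository the same structure would normally be committed as a static file.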

4. Operational Maintenance

  1. Rule Management: Alerting rules are updated via GitOps.
  2. Metric Retention: Metrics are retained for 15 days, the Prometheus default.
  3. Alerting: Critical infrastructure alerts are routed to Discord.
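
To illustrate the alert routing described above, the sketch below builds an Alertmanager configuration that sends alerts labelled `severity="critical"` to a Discord receiver and drops everything else into an empty default receiver. The receiver names, the grouping labels, and the use of the `severity` label are assumptions for this example; `discord_configs` requires Alertmanager 0.25 or later, and the webhook URL is a placeholder.

```python
import json


def build_alertmanager_config(discord_webhook_url: str) -> dict:
    """Build an Alertmanager config that routes critical alerts to Discord."""
    return {
        "route": {
            # Default receiver with no integrations: unmatched alerts are not sent anywhere.
            "receiver": "null",
            "group_by": ["alertname", "namespace"],
            "routes": [
                {
                    "receiver": "discord-critical",
                    # Assumption: alert rules carry a `severity` label.
                    "matchers": ['severity="critical"'],
                }
            ],
        },
        "receivers": [
            {"name": "null"},
            {
                "name": "discord-critical",
                "discord_configs": [
                    {"webhook_url": discord_webhook_url, "send_resolved": True}
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder URL; the real webhook should come from a Secret, not from Git.
    config = build_alertmanager_config("https://discord.com/api/webhooks/<id>/<token>")
    print(json.dumps(config, indent=2))
```

With kube-prometheus-stack, a structure like this would typically be supplied under the chart's `alertmanager.config` value, so that changes to it follow the same GitOps path as the alerting rules.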