Grafana architecture
──────────────────────────────────────────────────────────

Data Source → something Grafana knows how to talk to
  e.g. Prometheus, Loki, InfluxDB, MySQL, Elasticsearch

Dashboard → a collection of panels
  → each panel runs a query against a data source
  → query result is rendered as a graph / gauge / table

Panel → one visualisation
  → has a PromQL query, a time range, and a display config
  → e.g. "show CPU usage for node {{ instance }} over last 1h"

How kube-prometheus-stack wires everything
──────────────────────────────────────────────────────────

1. Grafana starts with a Prometheus data source pre-configured,
   pointing to http://kube-prometheus-stack-prometheus:9090
2. Dashboard ConfigMaps are mounted into Grafana automatically
   via the sidecar — no manual import needed
3. You open http://grafana.lab.local → login → dashboards are there
Grafana itself stores no metrics — it only queries. This means you can point multiple Grafana instances at the same Prometheus, or swap out the data source entirely without rebuilding your dashboards. The dashboards are just JSON that describes which queries to run and how to render them.
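As an illustration, here is a heavily trimmed sketch of what that JSON looks like. A real exported dashboard carries many more fields (ids, grid positions, datasource references); the dashboard title below is made up, and the query is the CPU query used elsewhere in this phase:

```json
{
  "title": "Lab Nodes (example)",
  "panels": [
    {
      "title": "CPU usage %",
      "type": "timeseries",
      "targets": [
        { "expr": "100 - (avg by(instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)" }
      ]
    }
  ]
}
```

Because the dashboard is only a description of queries and rendering, exporting it from one Grafana and importing it into another "just works" as long as the data source exposes the same metrics.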
kube-prometheus-stack ships dashboards for every component it installs. They load
automatically — no import step needed. Navigate to
Dashboards → Browse in Grafana to see all of them.
Monitoring catches control-plane problems before they surface as failing
kubectl commands or failed Ansible tasks that
call the API server.
# ── Instant queries (single value at a point in time) ──────────────

# Is every scrape target up? (1 = up, 0 = down)
up

# Free memory per node in GB
node_memory_MemAvailable_bytes / 1024 / 1024 / 1024

# CPU usage % per node (averaged over last 5m)
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# How many pods are running per namespace?
count by(namespace) (kube_pod_status_phase{phase="Running"})

# ── Range queries (rate / increase over time) ───────────────────────

# HTTP requests per second (if your app exposes http_requests_total)
rate(http_requests_total[5m])

# Total restarts per pod in the last hour
increase(kube_pod_container_status_restarts_total[1h])

# Disk writes per second per node
rate(node_disk_written_bytes_total[5m])
Use the Explore view in Grafana (Explore → Select Prometheus)
to run ad-hoc PromQL queries without creating a dashboard panel. This is the fastest way
to investigate something — type a query, pick a time range, see the result.
- Always use rate() with counters like http_requests_total; never graph
  the raw counter value.
- sum by(pod) gives one line per pod. Without it you get a single
  aggregate across all pods.
- The labels you will filter on most are namespace, pod, container, and
  instance. Use {namespace="monitoring"} to filter to a specific
  namespace.
- In Explore, start typing node_ to see
  all node-exporter metrics, or kube_pod_ for pod-level metrics from
  kube-state-metrics.
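To make the first tip concrete, here is the arithmetic rate() performs: take the increase of the counter across the window and divide by the window length. The numbers below are invented for illustration, and the sketch assumes no counter reset inside the window (the real rate() also detects and compensates for resets):

```shell
# Sketch of what rate(http_requests_total[5m]) computes under the hood
first=1200    # counter sample at the start of the 5m window (made up)
last=1950     # counter sample at the end of the window (made up)
window=300    # window length in seconds

# increase across the window / window length = per-second rate
awk -v f="$first" -v l="$last" -v w="$window" \
    'BEGIN { printf "%.2f req/s\n", (l - f) / w }'
# → 2.50 req/s
```

This is also why you never graph the raw counter: its absolute value depends on how long the process has been up, and it drops to zero on every restart, while the per-second rate stays meaningful.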
# How the chart auto-wires Prometheus as a data source
# (rendered inside Grafana's provisioning ConfigMap)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://kube-prometheus-stack-prometheus:9090
    isDefault: true
    access: proxy   # Grafana server makes the request, not the browser
The chart injects a datasource.yaml into Grafana's provisioning directory
at startup — so Prometheus is already connected when you first log in. The sidecar
container watches for ConfigMaps labelled grafana_dashboard: "1" and
mounts them as dashboards automatically.
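For example, a ConfigMap shaped like this would be discovered and loaded by the sidecar (the ConfigMap name, file name, and dashboard body are illustrative placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-dashboard          # illustrative name
  namespace: monitoring
  labels:
    grafana_dashboard: "1"           # the label the sidecar watches for
data:
  my-dashboard.json: |
    { "title": "My Dashboard", "panels": [] }
```

You can list everything the sidecar will pick up with
kubectl get configmaps -n monitoring -l grafana_dashboard=1.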
To add your own dashboard, create a ConfigMap labelled
grafana_dashboard: "1" containing your dashboard JSON. The sidecar picks
it up without restarting Grafana.

URL: http://grafana.lab.local
· Username: admin · Password: grafana

To change the admin password, edit ansible/group_vars/all.yml →
grafana_admin_password
and re-run ansible-playbook phase-04.yml to apply.
# Check Grafana pod is healthy
kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana

# Check the Grafana service and ingress
kubectl get svc,ingress -n monitoring | grep grafana

# If the UI isn't loading, check Grafana logs
kubectl logs -n monitoring -l app.kubernetes.io/name=grafana -c grafana

# ── In the Grafana UI ───────────────────────────────────────────────
# 1. Go to http://grafana.lab.local
# 2. Login: admin / grafana
# 3. Connections → Data sources → Prometheus → Test (should show green)
# 4. Dashboards → Browse → open "Node Exporter / Nodes"
#    → should see CPU/RAM/disk graphs for all 3 nodes
# 5. Explore → run: up
#    → should show 1 for every healthy scrape target
Phase 04 is fully operational when: the Prometheus data source test passes in Grafana,
the Node Exporter dashboard shows live data for all 3 nodes, and
http://prometheus.lab.local/targets shows all targets as UP.