Grafana
Grafana is an open-source analytics and monitoring platform used in the Greenfield Cluster for visualizing metrics and creating dashboards.
Overview
Grafana in the Greenfield Cluster provides:
- Metrics Visualization: Customizable dashboards for cluster and application metrics
- Multiple Data Sources: Prometheus, Jaeger, and more
- Alerting: Visual alert rules and notifications
- Pre-built Dashboards: SLO and component monitoring
- User Management: Role-based access control
Architecture
Configuration
| Parameter | Default Value |
|---|---|
| Version | Latest |
| Default Login | admin / admin123 |
| Port | 3000 |
| CPU Request | 100m |
| Memory Request | 256Mi |
| Persistent Storage | Yes |
Usage
Accessing Grafana
# Port forward to Grafana
kubectl port-forward -n greenfield svc/grafana 3000:3000
# Open in browser
http://localhost:3000
# Default credentials (CHANGE IN PRODUCTION!)
Username: admin
Password: admin123
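To confirm that the port-forward works before logging in, you can query Grafana's health endpoint. The following is a minimal sketch, assuming the port-forward above is running and that `/api/health` returns a JSON body with a `database` field; the helper names `is_healthy` and `check_grafana` are illustrative and not part of the cluster tooling.

```python
import json
from urllib.request import urlopen

# Assumption: the port-forward from the steps above is running locally.
GRAFANA_URL = "http://localhost:3000"


def is_healthy(body: str) -> bool:
    """Return True if a health response body reports a working database."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("database") == "ok"


def check_grafana(base_url: str = GRAFANA_URL, timeout: float = 5.0) -> bool:
    """Fetch /api/health and report whether Grafana looks healthy."""
    with urlopen(f"{base_url}/api/health", timeout=timeout) as response:
        return is_healthy(response.read().decode("utf-8"))


# Usage (requires the port-forward to be active):
#   print("healthy" if check_grafana() else "unhealthy")
```

The parsing step is kept separate from the network call so it can be checked without a running Grafana instance.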
Data Sources
Grafana is pre-configured with:
- Prometheus: Metrics data source
- Jaeger: Distributed tracing
- Loki: Log aggregation (if enabled)
Pre-built Dashboards
The cluster includes dashboards for:
- Cluster Health SLOs: Overall cluster health metrics
- Application SLOs: Application-level SLO tracking
- Component Metrics: Individual component dashboards
- Resource Usage: CPU, memory, disk usage
- Network Traffic: Service mesh traffic patterns
Creating Dashboards
Basic Dashboard
- Click "+" → "Dashboard"
- Add a new panel
- Select the Prometheus data source
- Write a PromQL query, for example:
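# CPU usage per container, in cores (5-minute average)
rate(container_cpu_usage_seconds_total[5m])
# Memory usage per container, in bytes
container_memory_usage_bytes
# Request rate, in requests per second (5-minute average)
rate(http_requests_total[5m])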
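To build intuition for what `rate(...[5m])` returns, the sketch below computes a per-second rate from counter samples. It is a deliberate simplification: it only divides the increase between the first and last sample by the elapsed time, whereas the real PromQL `rate()` also handles counter resets and extrapolates to the window boundaries. The sample values and the name `simple_rate` are made up for illustration.

```python
from typing import List, Tuple

# (unix timestamp in seconds, counter value)
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample.

    Simplified: ignores counter resets and window-boundary extrapolation.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must increase")
    return (v1 - v0) / (t1 - t0)


# Hypothetical http_requests_total samples across a 5-minute window.
window = [(0.0, 1000.0), (150.0, 1300.0), (300.0, 1600.0)]
requests_per_second = simple_rate(window)  # 600 requests / 300 seconds
```

Because the result is a per-second value, a panel built on `rate(http_requests_total[5m])` shows requests per second, not the total request count.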
Example Dashboard JSON
{
  "dashboard": {
    "title": "My Dashboard",
    "panels": [
      {
        "title": "Request Rate",
        "type": "timeseries",
        "datasource": "Prometheus",
        "targets": [
          {
            "expr": "rate(http_requests_total[5m])"
          }
        ]
      }
    ]
  }
}
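A dashboard definition of this shape can also be generated programmatically. The following sketch builds an equivalent payload in Python and serializes it; it assumes the payload is destined for Grafana's dashboard HTTP API (`POST /api/dashboards/db`), which accepts an `overwrite` flag next to the `dashboard` object. The function name `build_dashboard_payload` and its parameters are hypothetical, and the panel type is passed in explicitly because the supported types depend on the Grafana version.

```python
import json
from typing import Any, Dict


def build_dashboard_payload(title: str, panel_title: str, expr: str,
                            panel_type: str,
                            overwrite: bool = False) -> Dict[str, Any]:
    """Build a single-panel dashboard payload with one PromQL target."""
    dashboard = {
        "title": title,
        "panels": [
            {
                "title": panel_title,
                "type": panel_type,
                # Assumption: a data source named "Prometheus" exists.
                "datasource": "Prometheus",
                "targets": [{"expr": expr}],
            }
        ],
    }
    return {"dashboard": dashboard, "overwrite": overwrite}


payload = build_dashboard_payload(
    title="My Dashboard",
    panel_title="Request Rate",
    expr="rate(http_requests_total[5m])",
    panel_type="timeseries",
)
body = json.dumps(payload, indent=2)  # JSON text ready to send or commit
```

Generating the JSON from code keeps panel definitions consistent when many dashboards share the same layout.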
Alerting
Creating Alerts
- Navigate to Alerting → Alert rules
- Create a new alert rule
- Define the query and thresholds
- Configure contact points for notifications
Example Alert
# Alert if API error rate > 5%
expr: |
  sum(rate(http_requests_total{status=~"5.."}[5m]))
  /
  sum(rate(http_requests_total[5m]))
  > 0.05
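The arithmetic behind this expression can be checked by hand. The sketch below mirrors it in Python: the rate of 5xx responses divided by the rate of all responses, compared against the 5% threshold. The function names and input numbers are illustrative. Treating "no traffic" as a ratio of zero is an assumption of this sketch, chosen so that the alert does not fire when there are no requests.

```python
def error_ratio(error_rate: float, total_rate: float) -> float:
    """Fraction of requests that failed; 0.0 when there is no traffic."""
    if total_rate <= 0:
        return 0.0  # assumption: no traffic means nothing to alert on
    return error_rate / total_rate


def should_fire(error_rate: float, total_rate: float,
                threshold: float = 0.05) -> bool:
    """True when the error ratio is strictly above the threshold."""
    return error_ratio(error_rate, total_rate) > threshold


# 6 failing requests/s out of 100 requests/s is a 6% error ratio.
fires = should_fire(error_rate=6.0, total_rate=100.0)
```

Note that the comparison is strict: a ratio of exactly 5% does not trigger the alert.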
Best Practices
- Organize Dashboards: Use folders to organize by service/team
- Use Variables: Make dashboards dynamic with template variables
- Set Refresh Rates: Match each dashboard's refresh interval to how often its data actually changes
- Avoid Alert Fatigue: Limit alerts to conditions that someone can act on
- Dashboard as Code: Export and version control dashboard JSON
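For the "Dashboard as Code" practice, it helps to normalize exported JSON before committing it so that diffs show only real changes. The following is a minimal sketch, assuming the export is a dashboard object whose top-level `id` and `version` fields vary between instances and saves; the field list and the name `normalize_dashboard` are assumptions, so adjust them to what your exports actually contain.

```python
import json

# Assumption: these top-level fields change per instance or on every save.
VOLATILE_FIELDS = ("id", "version")


def normalize_dashboard(raw: str) -> str:
    """Drop volatile fields and emit stable, sorted JSON for version control."""
    dashboard = json.loads(raw)
    for field in VOLATILE_FIELDS:
        dashboard.pop(field, None)
    return json.dumps(dashboard, indent=2, sort_keys=True) + "\n"


# Two exports of the same dashboard that differ only in volatile fields.
export_a = '{"id": 12, "version": 7, "title": "My Dashboard", "panels": []}'
export_b = '{"title": "My Dashboard", "panels": [], "version": 8, "id": 3}'
same_after_normalizing = (
    normalize_dashboard(export_a) == normalize_dashboard(export_b)
)
```

Sorting keys and using a fixed indent keeps the committed file byte-stable across repeated exports.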
Advanced Features
Variables
Create dynamic dashboards with variables:
# Namespace variable
Query: label_values(kube_pod_info, namespace)
# Pod variable
Query: label_values(kube_pod_info{namespace="$namespace"}, pod)
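The pod variable depends on the namespace variable: the selected namespace is substituted into the pod query before it runs. The sketch below imitates that substitution with Python's `string.Template`. It is only an approximation for illustration; Grafana's own interpolation also supports multi-value variables and formatting options, which are not modeled here. The value `greenfield` is taken from the namespace used elsewhere on this page.

```python
from string import Template
from typing import Mapping


def interpolate(query: str, variables: Mapping[str, str]) -> str:
    """Replace $name placeholders; unknown placeholders are left untouched."""
    return Template(query).safe_substitute(variables)


pod_query = 'label_values(kube_pod_info{namespace="$namespace"}, pod)'
resolved = interpolate(pod_query, {"namespace": "greenfield"})
```

With `greenfield` selected, the pod variable lists only the pods in that namespace.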
Annotations
Add event annotations to graphs:
- Deployments
- Alerts
- Incidents
- Releases
Monitoring
# Check Grafana status
kubectl get pods -n greenfield -l app=grafana
# View logs
kubectl logs -n greenfield deployment/grafana
# Check persistence
kubectl get pvc -n greenfield | grep grafana