Kubernetes Prometheus & Grafana
Kubernetes & Prometheus
By default, Prometheus pulls metrics from its hosts over HTTP, and the default endpoint for this is /metrics. Data exposed on the /metrics endpoint needs to be in the Prometheus text format. An exporter is a standalone tool that gathers data and exposes it on a /metrics endpoint. These exporters are also available as Docker images ( and can be used as sidecar containers ).
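To make the exporter idea concrete, here is a minimal sketch in Python using only the standard library. It is not how any real exporter is implemented; the metric name demo_jobs_processed_total, its value and the helper names are made up for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_metrics() -> str:
    """Gather data and return it in the Prometheus text format.

    The metric below is a hypothetical, hard-coded example.
    """
    return (
        "# HELP demo_jobs_processed_total Jobs processed since start.\n"
        "# TYPE demo_jobs_processed_total counter\n"
        "demo_jobs_processed_total 42\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served, everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = collect_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_exporter(port: int = 0) -> HTTPServer:
    """Start the exporter in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def scrape(url: str) -> str:
    """A scrape is a plain HTTP GET; proxies are disabled for this local demo."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.read().decode("utf-8")
```

Calling start_exporter() and then scrape() against http://127.0.0.1:<port>/metrics returns the three lines produced by collect_metrics().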
Prometheus Server
- Data Retrieval Worker ( pulls metrics data)
- Time Series Database ( stores time series metrics data )
- Api ( to access this stored data )
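As a rough illustration of the API part, the sketch below builds an instant query URL for the /api/v1/query endpoint and reads values out of a response. The base URL and the hand-written example response are assumptions for the demo; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_samples(response_body: str) -> dict:
    """Map each series' `instance` label to its sample value."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    samples = {}
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        samples[series["metric"].get("instance", "")] = float(value)
    return samples


# Hand-written example in the shape of a vector result (not real data).
EXAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "10.42.0.7:9113"},
             "value": [1600000000.0, "1"]},
        ],
    },
})
```

The Prometheus web UI and Grafana both read the stored data through this same API.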
View Data
- Prometheus Web Ui
- Grafana
Terms
Targets: anything monitored
Units: a subset of what's monitored
- cpu status
- memory usage
- disk usage
- exception count
- request count
Metrics:
Each monitored target exposes a /metrics endpoint that exposes its metrics in a certain format.
HELP: description of what the metric is
TYPE: the metric type, most commonly one of:
- Counter ( how often something happened )
- Gauge ( current value of something )
- Histogram ( How long something took / how big a request was)
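The following sketch shows roughly what these metric types look like on a /metrics endpoint. The function names and the demo metric names are hypothetical; only the HELP/TYPE comment lines and the _bucket, _sum and _count suffixes follow the Prometheus text format.

```python
def render_simple(name: str, help_text: str, metric_type: str, value) -> str:
    """Render a counter or gauge with its HELP and TYPE lines."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} {metric_type}\n"
        f"{name} {value}\n"
    )


def render_histogram(name: str, help_text: str, bounds, observations) -> str:
    """Render a histogram: cumulative buckets, then sum and count."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} histogram"]
    for bound in bounds:
        # Buckets are cumulative: each one counts every observation <= bound.
        count = sum(1 for value in observations if value <= bound)
        lines.append(f'{name}_bucket{{le="{bound}"}} {count}')
    lines.append(f'{name}_bucket{{le="+Inf"}} {len(observations)}')
    lines.append(f"{name}_sum {sum(observations)}")
    lines.append(f"{name}_count {len(observations)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_simple("demo_requests_total", "Requests served.", "counter", 7))
    print(render_simple("demo_memory_bytes", "Memory in use.", "gauge", 1048576))
    print(render_histogram("demo_request_seconds", "Request duration.",
                           [0.25, 0.5], [0.125, 0.25, 0.75]))
```

A counter only ever goes up, a gauge can go up and down, and a histogram is really a small family of counters, one per bucket.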
Prometheus Client libraries
Client libraries are available for various languages.
Official client libraries:
- Go
- Java or Scala
- Python
- Ruby
Unofficial third-party client libraries:
- Bash
- C
- C++
- Common Lisp
- Dart
- Elixir
- Erlang
- Haskell
- Lua for Nginx
- Lua for Tarantool
- .NET / C#
- Node.js
- Perl
- PHP
- R
- Rust
Prometheus's pull system
Prometheus decides when to pull metrics from its targets, which keeps the monitoring from becoming the bottleneck. Multiple Prometheus instances can pull metrics from the same targets ( good scalability ).
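The pull model can be sketched as a loop on the monitoring side: the server walks its target list and records whether each scrape worked, similar in spirit to the `up` series Prometheus keeps per target. The function below is an illustration, not Prometheus code; `fetch` is an injected, hypothetical callable so the sketch runs without a network.

```python
from typing import Callable, Dict, Iterable


def scrape_all(targets: Iterable[str], fetch: Callable[[str], str]) -> Dict[str, int]:
    """Pull /metrics from every target and record 1 (up) or 0 (down)."""
    up = {}
    for target in targets:
        try:
            fetch(f"http://{target}/metrics")
            up[target] = 1
        except OSError:
            # Connection errors and timeouts are OSError subclasses.
            up[target] = 0
    return up
```

Because the server drives the loop, a slow or dead target only costs one failed scrape; it cannot flood the monitoring system with data.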
Pushgateway
For "short-lived jobs": services that only live for a short time can push their metrics to the Pushgateway, and Prometheus then scrapes them from there.
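A push from a short-lived job is an HTTP request against the Pushgateway's /metrics/job/<job name> path with the metrics in the text format as the body. The sketch below only builds such a request; the gateway host name and the job name are assumptions, and the request is not sent.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url: str, job: str, metrics_text: str) -> urllib.request.Request:
    """Build (but do not send) a request that pushes metrics for one job."""
    if not metrics_text.endswith("\n"):
        metrics_text += "\n"  # the body has to end with a line feed
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=metrics_text.encode("utf-8"),
        method="PUT",  # PUT replaces all metrics previously pushed for this job
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


if __name__ == "__main__":
    request = build_push_request(
        "http://pushgateway.example.com:9091",  # hypothetical address
        "nightly_backup",                        # hypothetical job name
        "backup_last_success_timestamp_seconds 1600000000",
    )
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen() at the end of the job would perform the actual push.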
Alertmanager
Responsible for sending alerts via different channels ( Slack / email / SMS etc. )
Prometheus data storage
local time series database ( integrates with remote storage systems )
Setting up prometheus
First we need to create a cluster role, a service account and a cluster role binding:
prometheus-rbac-setup.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  - ingresses
  verbs: ["get", "list", "watch"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
Note: the file name prometheus.yml comes from Prometheus itself and uses the .yml extension rather than .yaml, which might be confusing here as we use .yaml across this blog.
This ConfigMap contains the information on what Prometheus scrapes. The most important part of the next section is:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
  action: keep
  regex: true
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: ([^:]+)(?::\d+)?;(\d+)
  replacement: $1:$2
  target_label: __address__
This means that in our pod definition ( the pod template of a kind: Deployment ) we need two annotations: prometheus.io/scrape: "true" to tell Prometheus to scrape this pod, and prometheus.io/port to tell it which port to scrape.
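To see what the two relabel rules do to a single target, here is a small Python re-enactment. It relies on two Prometheus behaviours: the values of source_labels are joined with ";" and the regex has to match the whole joined string. The function names and the sample address are made up for illustration.

```python
import re

# Same pattern as in the relabel rule above.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def keep_target(scrape_annotation: str) -> bool:
    """action: keep with regex `true`: drop the target unless the value is exactly 'true'."""
    return re.fullmatch("true", scrape_annotation) is not None


def relabel_address(address: str, port_annotation: str) -> str:
    """action: replace on __address__ with replacement `$1:$2`."""
    joined = f"{address};{port_annotation}"  # source labels joined with ';'
    match = ADDRESS_RE.fullmatch(joined)      # the regex is anchored on both ends
    if match is None:
        return address  # no match: the target label is left untouched
    return f"{match.group(1)}:{match.group(2)}"


if __name__ == "__main__":
    print(keep_target("true"))                        # pod annotated for scraping
    print(relabel_address("10.42.0.7:80", "9113"))    # port swapped for the annotated one
    print(relabel_address("10.42.0.7:80", ""))        # no port annotation: unchanged
```

So a pod discovered on port 80 with prometheus.io/port: "9113" ends up being scraped on 9113, and a pod without the scrape annotation is dropped entirely.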
prometheus-configmap.yaml
apiVersion: v1
data:
  prometheus.yml: |
    scrape_configs:
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-nodes'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
    - job_name: 'kubernetes-cadvisor'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
    - job_name: 'kubernetes-services'
      metrics_path: /probe
      params:
        module: [http_2xx]
      kubernetes_sd_configs:
      - role: service
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name
    - job_name: 'kubernetes-ingresses'
      metrics_path: /metrics
      basic_auth:
        username: "loeken"
        password: "topsecure"
      params:
        module: [http_2xx]
      kubernetes_sd_configs:
      - role: ingress
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_example_io_should_be_probed]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
        regex: (.+);(.+);(.+)
        replacement: ${1}://${2}${3}
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_ingress_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_ingress_name]
        target_label: kubernetes_name
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      # Scrape only pods that have the
      # "prometheus.io/scrape = true" annotation.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      #
      # Example relabel to customize metric path based on pod
      # "example.io/metric_path = <metric path>" annotation.
      # - source_labels: [__meta_kubernetes_pod_annotation_example_io_metric_path]
      #   action: replace
      #   target_label: __metrics_path__
      #   regex: (.+)
      #
      # Scrape only the single, desired port for the pod,
      # based on the pod "prometheus.io/port = <port>" annotation.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
kind: ConfigMap
metadata:
  name: prometheus-cm
  namespace: default
kubectl apply -f prometheus-configmap.yaml
Similar to the last node example, we deploy this the same way, with a service that is exposed via the ingress.
prometheus-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  labels:
    app: prometheus-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      # run as the service account created in prometheus-rbac-setup.yaml,
      # otherwise the cluster role does not apply to this pod
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus/"
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: prometheus-storage-volume
          mountPath: /prometheus
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-cm
      - name: prometheus-storage-volume
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
spec:
  selector:
    app: prometheus-server
  ports:
  - name: promui
    protocol: TCP
    port: 9090
    targetPort: 9090
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: prometheus-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: prometheus.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-service
          servicePort: 9090
Apply the deployment:
kubectl apply -f prometheus-deployment.yaml
Afterwards you can view it at http://prometheus.example.com/ if you point prometheus.example.com at any of the k3s nodes.
Grafana
Grafana is a really nice UI to create dashboards from the data Prometheus gathers.
grafana-datasource.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
data:
  prometheus.yaml: |-
    {
      "apiVersion": 1,
      "datasources": [
        {
          "access": "proxy",
          "editable": true,
          "name": "prometheus",
          "orgId": 1,
          "type": "prometheus",
          "url": "http://prometheus.example.com",
          "version": 1
        }
      ]
    }
grafana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      name: grafana
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:latest
        ports:
        - name: grafana
          containerPort: 3000
        resources:
          limits:
            memory: "1Gi"
            cpu: "1000m"
          requests:
            memory: "1Gi"
            cpu: "500m"
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: grafana-storage
        - mountPath: /etc/grafana/provisioning/datasources
          name: grafana-datasources
          readOnly: false
      volumes:
      - name: grafana-storage
        emptyDir: {}
      - name: grafana-datasources
        configMap:
          defaultMode: 420
          name: grafana-datasources
---
kind: Service
apiVersion: v1
metadata:
  name: grafana-service
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '3000'
spec:
  selector:
    app: grafana
  ports:
  - name: grafanaui
    protocol: TCP
    port: 3000
    targetPort: 3000
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: grafana-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: grafana.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: grafana-service
          servicePort: 3000
Afterwards you can view it at http://grafana.example.com/ if you point grafana.example.com at any of the k3s nodes.
Below is an example for a website that uses nginx to host a static site ( with nginx's /stub_status exposed ) and a second container that reads the /stub_status endpoint and converts it to the Prometheus format.
Note: something still seems buggy with the Cloudflare integration.
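What the second container does can be sketched in a few lines of Python: read the plain-text /stub_status page and re-emit its numbers as Prometheus metrics. This is only an illustration of the conversion, not the exporter's actual code, and the metric names are modeled on the ones the nginx exporter publishes, so treat them as illustrative.

```python
import re

# Matches the plain-text page served by nginx's stub_status module.
STUB_STATUS_RE = re.compile(
    r"Active connections:\s+(\d+)\s+"
    r"server accepts handled requests\s+"
    r"(\d+)\s+(\d+)\s+(\d+)\s+"
    r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)"
)


def stub_status_to_prometheus(text: str) -> str:
    """Convert a stub_status page into the Prometheus text format."""
    match = STUB_STATUS_RE.search(text)
    if match is None:
        raise ValueError("not a stub_status page")
    active, accepted, handled, requests, reading, writing, waiting = match.groups()
    metrics = [
        ("nginx_connections_active", "gauge", active),
        ("nginx_connections_accepted", "counter", accepted),
        ("nginx_connections_handled", "counter", handled),
        ("nginx_http_requests_total", "counter", requests),
        ("nginx_connections_reading", "gauge", reading),
        ("nginx_connections_writing", "gauge", writing),
        ("nginx_connections_waiting", "gauge", waiting),
    ]
    lines = []
    for name, metric_type, value in metrics:
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Sample page in the format nginx produces (numbers are arbitrary).
SAMPLE_PAGE = (
    "Active connections: 291 \n"
    "server accepts handled requests\n"
    " 16630948 16630948 31070465 \n"
    "Reading: 6 Writing: 179 Waiting: 106 \n"
)
```

In the deployment below this job is done by the adapter container, which reads http://localhost/stub_status and serves the result on port 9113.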
blog-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog
  labels:
    app: blog
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blog
  template:
    metadata:
      labels:
        app: blog
        monitoring: '1'
      annotations:
        # read by the pod annotation relabel rules in prometheus.yml
        prometheus.io/scrape: "true"
        prometheus.io/port: "9113"
    spec:
      containers:
      - name: blog
        image: gcr.io/example/blog
        imagePullPolicy: Always
        ports:
        - containerPort: 80
      - name: adapter
        image: nginx/nginx-prometheus-exporter:0.4.2
        args: ["-nginx.scrape-uri","http://localhost/stub_status"]
        ports:
        - containerPort: 9113
      imagePullSecrets:
      - name: gcr-json-key
      volumes:
      - name: dhparam-volume
        configMap:
          name: dhparam
---
apiVersion: v1
kind: Service
metadata:
  name: blog-service
  annotations:
    prometheus.io/port: "9113"
    prometheus.io/scrape: "true"
spec:
  selector:
    app: blog
  type: LoadBalancer
  ports:
  - protocol: TCP
    port: 20001
    targetPort: 80
    nodePort: 30001
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: blog-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    #external-dns.alpha.kubernetes.io/hostname: blog.internetz.me
    #external-dns.alpha.kubernetes.io/cloudflare-proxied: "false"
    #external-dns.alpha.kubernetes.io/ttl: "1"
    prometheus.io/port: "9113"
    prometheus.io/scrape: "true"
  labels:
    monitoring: '1'
spec:
  rules:
  - host: blog.internetz.me
    http:
      paths:
      - path: /
        backend:
          serviceName: blog-service
          # has to match the port of blog-service above
          servicePort: 20001
  tls:
  - hosts:
    - blog.internetz.me
    secretName: blog-internetz-me-tls
---
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: blog-internetz-me
  namespace: default
spec:
  secretName: blog-internetz-me-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  commonName: blog.internetz.me
  dnsNames:
  - blog.internetz.me