Learn monitoring: Prometheus, PromQL, Grafana, Loki, Tempo, Alertmanager, OpenTelemetry. SRE practices, SLI/SLO, and production Kubernetes monitoring. Taught in Russian.
The course includes 8 modules and 600+ exercises with instant feedback.
The course covers SRE practices (SLI/SLO/error budgets), Prometheus, PromQL, Grafana, the observability stack (Loki, Tempo, OpenTelemetry), alerting, and production Kubernetes monitoring.
Basic Docker and Kubernetes knowledge helps, but many concepts are explained from scratch. We recommend completing the Docker and Kubernetes courses first.
An SLO (Service Level Objective) defines a target reliability level (e.g., 99.9% uptime). The error budget is the amount of unreliability that target still allows: for a 99.9% availability SLO, that is about 43 minutes of downtime per 30-day month.
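The 43-minute figure follows from simple arithmetic, which a minimal sketch can make explicit. The function name `error_budget_minutes` and the 30-day window are assumptions chosen for illustration; they are not taken from the course material.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the allowed downtime, in minutes, for an availability SLO.

    Assumes the SLO is measured as uptime over a fixed window
    (30 days by default, an illustrative choice).
    """
    total_minutes = window_days * 24 * 60          # minutes in the window
    allowed_fraction = (100.0 - slo_percent) / 100.0  # share of time that may fail
    return total_minutes * allowed_fraction


if __name__ == "__main__":
    for slo in (99.0, 99.9, 99.99):
        print(f"{slo}% SLO: {error_budget_minutes(slo):.2f} minutes per 30 days")
```

For a 99.9% SLO this gives roughly 43.2 minutes over 30 days; a different window length (for example a 31-day month) changes the result proportionally.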