How it works
One self-correcting loop, from alert to fix
Reflexion is Agentic SRE: it reasons like a senior on-call engineer, but it checks its own work before it touches your cluster — and it's safe by default.
1. Observe
Reflexion ingests the live signal — Prometheus metrics, Loki logs, Alertmanager alerts, and Kubernetes state — and assembles a context snapshot of what's actually happening.
2. Propose (Actor)
The Actor reasons over the snapshot and your runbooks to propose a concrete remediation hypothesis — a scale, rollback, restart, config change, or HPA adjustment — with a pre-computed rollback.
3. Validate (Critic)
The Critic checks the hypothesis against SLO impact and policy. It catches logical flaws and policy violations before anything runs, and can require human approval for high-risk actions.
4. Execute
Only approved actions execute — via a GitOps pull request or kubectl — with the rollback stored atomically. Every decision is recorded immutably for audit and replay.
5. Learn
The outcome of each action becomes a training signal. The Avirka SRE models are fine-tuned on your incidents, so the loop gets sharper every time it runs.