Incident Workflow

Reliai is built around a single operational loop: Detect → Understand → Fix → Prove → Share.

Use this guide when an incident is open and you need to move through it systematically.


Detect

Reliai identifies regressions automatically using:

When a regression is detected, Reliai opens an incident and groups the signals into a single investigation unit.
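
To make "a single investigation unit" concrete, here is a minimal sketch of one way signals could be grouped. This guide does not describe Reliai's actual grouping rule, so everything here is an assumption for illustration: the `Signal` and `Incident` types, their field names, and the rule itself (same service, arriving within a time window of the previous signal).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Signal:
    """Hypothetical regression signal; field names are illustrative only."""
    metric: str
    service: str
    detected_at: datetime


@dataclass
class Incident:
    """Hypothetical investigation unit that holds related signals."""
    service: str
    signals: list[Signal] = field(default_factory=list)


def group_signals(signals: list[Signal], window: timedelta) -> list[Incident]:
    """Group signals per service. A signal joins the service's latest incident
    if it arrives within `window` of that incident's most recent signal;
    otherwise it opens a new incident. This rule is an assumption."""
    incidents: list[Incident] = []
    latest_by_service: dict[str, Incident] = {}
    for sig in sorted(signals, key=lambda s: s.detected_at):
        current = latest_by_service.get(sig.service)
        if current and sig.detected_at - current.signals[-1].detected_at <= window:
            current.signals.append(sig)
        else:
            current = Incident(service=sig.service, signals=[sig])
            latest_by_service[sig.service] = current
            incidents.append(current)
    return incidents
```

With a 15-minute window, two signals for the same service five minutes apart land in one incident, while a signal two hours later opens a new one.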


Understand

Use the incident command center to understand what happened.

Start with metrics — identify what changed and when.

Inspect traces — compare failing traces against baseline traces from before the regression.

Review root cause — the root cause panel is computed deterministically from:

The root cause is not AI-generated. It is computed from system signals.

You can optionally use the AI explanation to get a plain-language summary of the evidence. That summary is grounded in the same deterministic signals.
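
The trace comparison step above can be sketched as a simple, deterministic diff. The example below assumes traces are available as plain dictionaries with attributes such as `model_version`. That shape and the `attribute_shift` helper are hypothetical and are not part of Reliai's API. The function reports how much more (or less) often each attribute value appears in failing traces than in baseline traces.

```python
from collections import Counter


def attribute_shift(baseline: list[dict], failing: list[dict], key: str) -> dict[str, float]:
    """For each value of `key`, return its share among failing traces minus
    its share among baseline traces. A positive number means the value is
    over-represented in failing traces."""
    if not baseline or not failing:
        raise ValueError("both trace sets must be non-empty")
    base_counts = Counter(trace.get(key) for trace in baseline)
    fail_counts = Counter(trace.get(key) for trace in failing)
    values = set(base_counts) | set(fail_counts)
    return {
        str(value): fail_counts[value] / len(failing) - base_counts[value] / len(baseline)
        for value in values
    }


# Illustrative data only: baseline traces all ran on "v1",
# while most failing traces ran on "v2".
baseline_traces = [{"model_version": "v1"} for _ in range(4)]
failing_traces = [{"model_version": "v2"}] * 3 + [{"model_version": "v1"}]
shift = attribute_shift(baseline_traces, failing_traces, "model_version")
```

A large positive shift for one value (here `v2`) is a candidate explanation to check against the root cause panel. It is evidence to weigh, not a verdict.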


Fix

Apply a fix based on:

Common fixes include prompt changes, model version rollbacks, and guardrail policy updates.


Prove

After applying a fix, Reliai measures resolution impact:

This is the Fix Verified step. Do not close an incident until you have reviewed the resolution impact.
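
As a rough illustration of what reviewing resolution impact involves, the sketch below compares one metric before and after a fix. This guide does not specify how Reliai computes resolution impact, so the `resolution_impact` function, its inputs, and its output keys are assumptions.

```python
from statistics import mean


def resolution_impact(before: list[float], after: list[float]) -> dict[str, float]:
    """Compare the mean of a metric (for example, error rate) sampled before
    and after a fix. Returns the two means plus absolute and relative change."""
    if not before or not after:
        raise ValueError("need at least one sample before and after the fix")
    mean_before = mean(before)
    mean_after = mean(after)
    change = mean_after - mean_before
    # Relative change is undefined when the pre-fix mean is zero.
    relative = change / mean_before if mean_before else float("nan")
    return {
        "before": mean_before,
        "after": mean_after,
        "absolute_change": change,
        "relative_change": relative,
    }


# Illustrative error-rate samples, not real measurements.
impact = resolution_impact(before=[0.25, 0.75], after=[0.125, 0.125])
```

In this example the mean error rate drops from 0.5 to 0.125, a relative change of -0.75. A comparison like this only supports closing the incident if the post-fix samples cover enough traffic to be representative.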


Share

Export incident context using:

These are AI-assisted drafts. Review before sending.