Operate overview
Keep services healthy: reliability, delivery insight, incidents, observability, posture, and cost.
Shipping is half the job; keeping things healthy is the other half. The Operate area puts each signal that something needs attention next to the control for acting on it, so you are not stitching together five dashboards during an incident.
What Operate covers
- Reliability and SLOs. Define what “healthy” means for a service and track it against an error budget, so you know when to slow down and when you have room to ship.
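- Delivery insight (DORA). The four DORA delivery metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service), measured from real platform data, so you can see whether changes are getting faster and safer over time.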
- Incidents and runbooks. Declare, run, and learn from incidents, with runbooks that turn a known response into a repeatable one.
- Observability. Metrics, logs, and traces for your services, with per-service views so you start from the right place.
- Posture. An evidence-derived read on your DevSecOps practice across the pillars, so improvement is grounded in what the platform can see rather than a self-assessment.
- Cost. Where spend is going and where it can be trimmed, tied back to the services and teams that own it.
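To make the error-budget idea from the reliability item concrete, here is a minimal sketch of the usual arithmetic for a request-based SLO. It is an illustration, not the platform's own calculation: the `ErrorBudget` type, the `error_budget` function, and its parameters are hypothetical names, and the platform may use a different window or SLO type.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    allowed_failures: float    # failures the SLO tolerates in the window
    consumed_fraction: float   # share of the budget already spent (can exceed 1.0)
    remaining_fraction: float  # share still available, floored at 0.0


def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> ErrorBudget:
    """Compute error-budget usage for a request-based SLO (hypothetical helper).

    slo_target is the fraction of requests that must succeed, e.g. 0.999.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests < 0 or not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    # The budget is the share of requests the SLO allows to fail.
    allowed = (1.0 - slo_target) * total_requests
    # With no traffic there is no budget to spend, so report nothing consumed.
    consumed = failed_requests / allowed if allowed > 0 else 0.0
    return ErrorBudget(allowed, consumed, max(0.0, 1.0 - consumed))
```

For example, a 99.9% target over 1,000,000 requests tolerates about 1,000 failures, so 400 failed requests spend roughly 40% of the budget. A remaining fraction near zero is the signal to slow down; a large one means there is room to ship.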
How it connects
Operate reads from the same platform data as everything else, so a reliability dip, a cost spike, and a security finding all point back to a service with an owner. That is what turns a signal into an action instead of a notification.
The task-level guides for each Operate area are being expanded. In the meantime, each surface is available in the portal under Operations, and the platform overview describes the full set.