Series B SaaS: observability rollout across 4 product lines
Illustrative composite · no named client · metrics typical of the patterns we see in this kind of work
Unified production-monitoring system across 4 product lines and 60+ services, with reliability targets defined against what customers actually see. One platform replaces three previous vendors.
Problem
After a series of acquisitions, a Series B SaaS was running three different observability vendors across four product lines. Triage during a cross-product incident meant logging in to three tools and reconciling their inconsistent semantics.
Customer-facing SLOs existed for only one product. The other three relied on host-level CPU graphs.
Approach
Standardized on OpenTelemetry for metrics, traces, and logs. Migrated each product line on a three-week cycle so existing dashboards stayed live until parity was verified.
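One way to picture the "parity verified" gate is as a comparison of the same signal read from the legacy vendor and from the new pipeline over the same window. The sketch below is illustrative only: the function names, the 2% tolerance, and the sample values are assumptions, not details from the engagement.

```python
from math import isclose


def parity_report(legacy, candidate, rel_tol=0.02):
    """Compare two {timestamp: value} series covering the same window.

    Returns (missing, mismatched):
      missing    - timestamps present in legacy but absent from candidate
      mismatched - timestamps whose values differ by more than rel_tol
    The 2% default tolerance is an assumed value, not a recommendation.
    """
    missing = sorted(ts for ts in legacy if ts not in candidate)
    mismatched = sorted(
        ts
        for ts, value in legacy.items()
        if ts in candidate and not isclose(value, candidate[ts], rel_tol=rel_tol)
    )
    return missing, mismatched


def has_parity(legacy, candidate, rel_tol=0.02):
    """True only when no point is missing and none is out of tolerance."""
    missing, mismatched = parity_report(legacy, candidate, rel_tol)
    return not missing and not mismatched


# Hypothetical samples: requests per minute keyed by seconds into the window.
legacy_series = {0: 100.0, 60: 120.0, 120: 90.0}
new_series = {0: 101.0, 60: 120.0}  # the last point has not arrived yet

missing, mismatched = parity_report(legacy_series, new_series)
ready_to_cut_over = has_parity(legacy_series, new_series)
```

A legacy dashboard would stay live for as long as a check like this keeps reporting missing or mismatched points.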
Defined customer-facing SLOs per product together with each product team: checkout success, search latency, dashboard render time, and ingest staleness. Published them on a single status page.
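A customer-facing SLO of this kind usually reduces to a ratio of good events to total events, measured against a target. The following is a minimal sketch under that assumption. The `Slo` class, the 99.5% target, and the event counts are hypothetical and not taken from the engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    name: str
    target: float  # e.g. 0.995 means 99.5% of events must be good

    def __post_init__(self):
        # A target of exactly 1.0 would leave no error budget to measure.
        if not 0.0 < self.target < 1.0:
            raise ValueError("target must be strictly between 0 and 1")


def attainment(good, total):
    """Fraction of events that were good; an empty window counts as met."""
    return 1.0 if total == 0 else good / total


def error_budget_remaining(slo, good, total):
    """Fraction of the error budget still unspent (negative once overspent)."""
    if total == 0:
        return 1.0
    allowed_bad = (1.0 - slo.target) * total
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad


# Hypothetical window: 100,000 checkout attempts, 99,700 of them successful.
checkout = Slo(name="checkout success", target=0.995)
observed = attainment(99_700, 100_000)
budget_left = error_budget_remaining(checkout, 99_700, 100_000)
```

Latency-style objectives such as search latency or dashboard render time fit the same shape if a "good" event is defined as a request that finishes under an agreed threshold.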
Built a single Grafana stack on managed Prometheus + Loki. Decommissioned the three legacy vendors, saving a combined $42k/month.
Outcome
Tech
"We didn't realize how much triage time was vendor-switching tax until it was gone."
Related services
The engagement categories this case primarily covered.
Tell us what you're building.
Send a project brief and we'll reply within one business day, or book a 30-minute intro call directly.