Sergiu Petean explains how a regulated European insurer transformed a legacy DevOps organization into a cloud‑native platform engineering group, using dynamic reference architectures, stakeholder‑driven KPIs, and open‑source tooling to improve safety, performance and innovation sovereignty.
From Legacy to Sovereignty: Platform Engineering Lessons for Insurance

Why platform engineering matters for safety and performance
In heavily regulated domains such as European insurance, the cost of a single security breach or a compliance miss can dwarf any development budget. Moving from a monolithic DevOps setup to a platform engineering model gives teams three concrete safety/performance benefits:
- Isolation of risk – By exposing self‑service APIs and encapsulating shared services (IAM, observability, compliance checks) behind a well‑defined contract, the platform limits accidental configuration drift. The platform owns the guardrails; developers only consume them.
- Predictable delivery – Platform‑wide DORA metrics (lead time, change failure rate, MTTR, deployment frequency) become a shared baseline. When the platform enforces a minimum test coverage and automated security scanning, the variance in reliability across squads narrows.
- Cost transparency – A per‑change cost model (e.g., €49 → €13 per change) ties resource consumption directly to business outcomes, allowing finance leaders to audit spend without sacrificing safety.
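To make the shared baseline concrete, here is a minimal sketch of how the four DORA metrics could be derived from deployment records. The `Deployment` record and its field names are assumptions made for illustration; the talk does not describe the data model the insurer actually used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical deployment record; field names are illustrative."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Compute the four DORA metrics over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    # Lead time: commit to production, in hours.
    lead_times = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    # Time to restore: only failures that have already been resolved count.
    restore_times = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": sum(restore_times) / len(restore_times) if restore_times else 0.0,
    }
```

Using the median for lead time keeps one unusually slow change from distorting the baseline; a team could equally report a percentile.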
Dynamic reference architectures replace static blueprints
Petean described the failure of a single, static reference diagram. Instead, the organization built clustered reference architectures that vary by:
- Regulatory zone (GDPR, Solvency II, local data‑residency)
- Talent profile (in‑house Rust expertise vs. outsourced Go teams)
- Cloud topology (public‑cloud, private‑cloud, hybrid)
Each cluster is expressed as a set of reusable Helm charts and Terraform modules stored in a version‑controlled platform catalog. The catalog is consumed via a GitHub‑based CI pipeline that validates every change against a policy engine (OPA) before it reaches production. This approach keeps the architecture evolvable while guaranteeing that every new service inherits the same safety checks.
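The idea of a catalog keyed by the three dimensions, with a policy gate in front of it, can be sketched as follows. This is a plain Python stand-in, not OPA or Rego; the catalog entry, chart and module names, region strings and the three rules are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClusterKey:
    """The three dimensions along which reference architectures vary."""
    regulatory_zone: str   # e.g. "gdpr", "solvency-ii"
    talent_profile: str    # e.g. "in-house-rust", "outsourced-go"
    cloud_topology: str    # e.g. "public-cloud", "private-cloud", "hybrid"


@dataclass
class ReferenceArchitecture:
    name: str
    helm_charts: list[str] = field(default_factory=list)
    terraform_modules: list[str] = field(default_factory=list)
    data_region: str = ""
    security_scanning: bool = False


# Hypothetical catalog with a single entry; a real one would live in version control.
CATALOG = {
    ClusterKey("gdpr", "in-house-rust", "hybrid"): ReferenceArchitecture(
        name="gdpr-rust-hybrid",
        helm_charts=["api-gateway", "observability"],
        terraform_modules=["eu-network"],
        data_region="eu-central-1",
        security_scanning=True,
    ),
}


def policy_violations(arch: ReferenceArchitecture, key: ClusterKey) -> list[str]:
    """Illustrative rules standing in for the policy engine's checks."""
    violations = []
    if not arch.security_scanning:
        violations.append("security scanning must be enabled")
    if key.regulatory_zone == "gdpr" and not arch.data_region.startswith("eu-"):
        violations.append("GDPR workloads must stay in an EU region")
    if not arch.helm_charts and not arch.terraform_modules:
        violations.append("architecture declares no deployable modules")
    return violations


def resolve(key: ClusterKey) -> ReferenceArchitecture:
    """Return the architecture for a cluster, refusing any that fails policy."""
    arch = CATALOG.get(key)
    if arch is None:
        raise KeyError(f"no reference architecture for {key}")
    violations = policy_violations(arch, key)
    if violations:
        raise ValueError("; ".join(violations))
    return arch
```

Because `resolve` runs the policy check on every lookup, a catalog entry that drifts out of compliance is rejected when a team requests it, not discovered later in production.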
Stakeholder‑driven KPIs bridge the board and the codebase
Traditional DevOps KPIs focus on engineering output. Petean extended the set to include:
| Stakeholder | KPI | Safety/Performance impact |
|---|---|---|
| COO | Cost per change | Drives FinOps discipline, forces teams to eliminate waste |
| CFO | Compliance coverage ratio | Guarantees that 100 % of releases pass mandatory audits |
| CISO | Vulnerability‑to‑remediation time | Reduces exposure window for critical CVEs |
| CTO | Feature‑to‑market latency | Encourages rapid, yet safe, experimentation |
The platform surfaces these metrics on a unified dashboard (Grafana + Prometheus) and feeds them into the quarterly business review. Because the numbers are tied to concrete platform services, the board can ask “What would happen to our compliance ratio if we cut the security scanning budget?” and receive an immediate, data‑driven answer.
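Two of the KPIs in the table reduce to simple arithmetic over release and vulnerability records. The sketch below assumes a hypothetical record shape (dictionaries with `audits`, `found` and `fixed` keys); the actual dashboard queries are not described in the talk.

```python
from datetime import datetime
from statistics import mean


def compliance_coverage_ratio(releases: list[dict]) -> float:
    """Share of releases that ran at least one audit and passed all of them."""
    if not releases:
        return 0.0
    passed = sum(
        1 for r in releases if r["audits"] and all(r["audits"].values())
    )
    return passed / len(releases)


def mean_remediation_days(vulnerabilities: list[dict]) -> float:
    """Average days from detection to fix, over vulnerabilities already fixed."""
    durations = [
        (v["fixed"] - v["found"]).total_seconds() / 86400
        for v in vulnerabilities
        if v.get("fixed") is not None
    ]
    return mean(durations) if durations else 0.0
```

A release with no recorded audits counts as non-compliant here, which is the conservative reading of "coverage"; open vulnerabilities are excluded from the average and would need a separate ageing metric.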
Open‑source tooling as the backbone of innovation sovereignty
To avoid vendor lock‑in, the team built the stack largely from open‑source projects, many of them hosted by the CNCF:
- Kubernetes for workload orchestration
- Tekton for CI/CD pipelines (later migrated to GitHub Actions, which proved more stable)
- OPA for policy‑as‑code
- Prisma and Wiz for vulnerability scanning, wrapped in a custom vulnerability‑management service that de‑duplicates alerts and pushes actionable tickets to Opsgenie.
Because the tooling is open source, the organization retains the ability to relocate workloads from a hyperscaler to a private cloud with only a few configuration changes. This innovation sovereignty is a strategic asset when regulatory pressure demands data‑locality.
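The de‑duplication step of the vulnerability‑management service might look roughly like the sketch below. The `Finding` shape, the severity ranking and the "high or above is actionable" threshold are assumptions; the talk does not specify how the service decides what becomes a ticket, and no scanner or Opsgenie API is called here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical normalized scanner finding."""
    cve_id: str
    service: str
    severity: str   # "low", "medium", "high" or "critical"
    scanner: str    # which tool reported it


SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def deduplicate(findings: list[Finding]) -> list[Finding]:
    """Keep one finding per (CVE, service), preferring the highest severity."""
    best: dict[tuple[str, str], Finding] = {}
    for finding in findings:
        key = (finding.cve_id, finding.service)
        current = best.get(key)
        if current is None or (
            SEVERITY_RANK[finding.severity] > SEVERITY_RANK[current.severity]
        ):
            best[key] = finding
    return list(best.values())


def actionable(findings: list[Finding], minimum: str = "high") -> list[Finding]:
    """De-duplicate, then keep only findings at or above the ticket threshold."""
    threshold = SEVERITY_RANK[minimum]
    return [
        f for f in deduplicate(findings) if SEVERITY_RANK[f.severity] >= threshold
    ]
```

Keying on the CVE and the affected service means two scanners reporting the same issue produce one ticket, while the same CVE in two services still produces two.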
Reducing cognitive load with custom team topologies
Petean introduced a Distributed DevOps (DDO) model that rotates engineers through short‑term, cross‑functional squads (API, observability, security, storage). Each DDO assignment lasts three to six months, after which the engineer moves to a new domain. Benefits include:
- Focused expertise – Engineers avoid the “jack‑of‑all‑trades” trap and become deep specialists in a single subsystem.
- Knowledge diffusion – Rotations spread best practices across the organization, lowering the overall cognitive load.
- Clear ownership – The platform catalog records the responsible DDO for every service, simplifying incident triage.
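Because ownership changes with every rotation, incident triage needs the owner as of the incident date, not just the current one. A minimal sketch, assuming the catalog stores dated assignments (the `Assignment` record and squad names are hypothetical):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Assignment:
    """Hypothetical catalog entry: which DDO squad owns a service, and when."""
    service: str
    squad: str
    start: date   # first day of ownership (inclusive)
    end: date     # first day after ownership ends (exclusive)


def owner_on(assignments: list[Assignment], service: str, day: date) -> Optional[str]:
    """Return the squad responsible for a service on a given day, if any."""
    for assignment in assignments:
        if assignment.service == service and assignment.start <= day < assignment.end:
            return assignment.squad
    return None
```

A `None` result is itself useful: it flags a service that fell between rotations and has no recorded owner.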
Federated SRE as a service (OrgOps)
When the organization grew to 2 000 engineers, a single centralized SRE team became a bottleneck. The solution was a federated SRE model:
- A core OrgOps team defines standards, provides shared tooling, and runs the compliance‑as‑code pipeline.
- Individual squads adopt the standards and run their own “SRE‑as‑a‑service” instances, reporting back via shared Service Level Objectives (SLOs).
- A production manager role aggregates incident data, ensuring that only alerts with runbooks reach the on‑call rotation.
This federated structure preserves high reliability (SLO breach rate < 0.1 %) while keeping the cost of on‑call duty proportional to the value delivered.
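The runbook rule and the breach‑rate figure can both be expressed in a few lines. This sketch assumes a simple `Alert` record and measures the breach rate as breached evaluation windows over total windows; the talk does not say how the 0.1 % figure was computed, and the runbook URLs are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    """Hypothetical alert definition owned by a squad."""
    name: str
    squad: str
    runbook_url: Optional[str] = None


def route(alerts: list[Alert]) -> tuple[list[Alert], list[Alert]]:
    """Split alerts into those that page on-call and those returned to the squad.

    Only alerts with a runbook reach the on-call rotation; the rest go back
    to their owning squad until a runbook exists.
    """
    page = [a for a in alerts if a.runbook_url]
    backlog = [a for a in alerts if not a.runbook_url]
    return page, backlog


def slo_breach_rate(total_windows: int, breached_windows: int) -> float:
    """Fraction of SLO evaluation windows in which the objective was missed."""
    if total_windows <= 0:
        raise ValueError("total_windows must be positive")
    return breached_windows / total_windows
```

Under this definition, 5 breached windows out of 10 000 would sit below the 0.1 % threshold quoted above.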
Measuring impact: the extended DORA framework
Beyond the classic four DORA metrics, the team added two financial dimensions:
- Cost per deployment – calculated from cloud‑resource usage and CI minutes.
- Revenue impact per change – derived from A/B test results on new pricing models.
The extended dashboard showed a steady decline in cost per change (from €49 to €13) while the number of daily deployments rose from 3 to 12. This quantitative evidence convinced the CFO that the platform was a profit centre rather than a cost centre.
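The arithmetic behind these figures is straightforward. The sketch below assumes cost per change is cloud spend plus CI minutes at a flat rate, divided by the number of changes; the parameter names and the flat CI rate are illustrative, not the insurer's actual cost model.

```python
def cost_per_change(
    cloud_cost_eur: float,
    ci_minutes: float,
    ci_rate_eur_per_minute: float,
    changes: int,
) -> float:
    """Total delivery cost for a period divided by the changes shipped in it."""
    if changes <= 0:
        raise ValueError("changes must be positive")
    return (cloud_cost_eur + ci_minutes * ci_rate_eur_per_minute) / changes


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after` (0.5 means a 50 % drop)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before
```

Applied to the reported numbers, `relative_reduction(49, 13)` is 36/49, a drop of roughly 73 % in cost per change, alongside a fourfold rise in daily deployments.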
Key takeaways for Rust‑focused insurers
- Treat the platform as a safety contract – expose only verified, Rust‑compiled binaries through the platform’s API gateway; the compiler enforces memory safety before code ever reaches production.
- Automate compliance at the compiler level – use `cargo-audit` and custom lint rules to embed regulatory checks into the build pipeline.
- Expose platform KPIs as Rust metrics – implement `metrics` crates that push data to Prometheus, allowing the same safety‑first mindset to extend to observability.
- Invest in open‑source – the ability to move a Rust‑based microservice from AWS to an on‑prem Kubernetes cluster without rewriting code is the core of innovation sovereignty.
Closing thoughts
The journey from legacy DevOps to a sovereign, safety‑centric platform engineering organization is not a single project but a series of incremental experiments. By anchoring every decision in stakeholder‑driven KPIs, dynamic reference architectures, and open‑source tooling, an insurer can achieve both regulatory compliance and the performance needed to compete in the AI‑driven future.

