//the operating model

APEX_

Agentic Production Execution

A production-grade operating model for teams where humans design and verify while agents execute and iterate. Organizational scaffolding that makes agentic execution reliable, measurable, and improvable at scale.

3 phases · 3 areas · 9 domains · 10 principles · 5 metrics

//The APEX Cycle

The heartbeat of the framework. Three phases that repeat continuously — and a system that gets better every cycle.

01 HUMAN-FIRST

Strategic

~2–3x velocity

Humans design the system: specs, quality criteria, agent configuration, permissions. This is not overhead — it is the work. Agents can only execute within the boundaries humans draw.

02 AGENT-FIRST

Execution

~10–20x velocity

Agents execute against the specs. Agent-to-agent review loops resolve mechanical problems before a human ever sees the work. The single human role: verification.

03 LEARN

Reflection

~1–2x velocity

Evaluate → Reflect → Calibrate. Agents pre-compile the metrics; humans decide what to change. The output feeds back into Strategic — and the next cycle runs on a better system.

↺ calibration feeds back into Strategic — the cycle repeats on an evolved system

//1. Strategic — Three areas, nine domains

Every capability has a home. Every domain has a clear owner, clear boundaries, and clear artifacts. None falls through the cracks.

Platform

the foundation everything runs on
D1
Infrastructure
The harness decision, model strategy, compute, tool integrations. The most consequential choice in the framework.
D2
Operational Tooling
Dashboards, metrics pipelines, context generators. The bridge between measuring and acting.
D3
Security & Compliance
Least privilege, permission maps, audit trails. Invisible when working, catastrophic when neglected.
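
A permission map with least privilege and an audit trail could look roughly like the sketch below. It is an illustration under stated assumptions, not something APEX prescribes: the agent names, tool names, `AuditTrail` class, and `authorize` helper are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical permission map: each agent lists only the tools it needs.
PERMISSIONS: dict[str, frozenset[str]] = {
    "writer": frozenset({"read_brief", "write_draft"}),
    "reviewer": frozenset({"read_draft", "post_review"}),
}


@dataclass
class AuditTrail:
    """Append-only record of every authorization decision."""
    entries: list[tuple[str, str, bool]] = field(default_factory=list)

    def record(self, agent: str, tool: str, allowed: bool) -> None:
        self.entries.append((agent, tool, allowed))


def authorize(agent: str, tool: str, trail: AuditTrail) -> bool:
    """Deny by default: unknown agents and unlisted tools are refused."""
    allowed = tool in PERMISSIONS.get(agent, frozenset())
    trail.record(agent, tool, allowed)  # denials are logged too
    return allowed


trail = AuditTrail()
authorize("writer", "write_draft", trail)  # allowed: listed for the writer
authorize("writer", "post_review", trail)  # denied: only the reviewer may review
```

Logging denials alongside grants is a design choice in this sketch: a refused request is often the more useful audit signal.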

Spec

what humans specify & measure
D4
Business Context
The "why" and "for whom". Brand, personas, vision, constraints — everything agents need to produce domain-appropriate output.
D5
Spec Engineering
The translation layer between strategic thinking and executable work. The PRD is the single source of truth.
D6
QA Strategic
Humans define what "good" means at the system level. Measurement plans, evaluation criteria, definition of done.

Config

what humans configure for agents
D7
Agent Design
Identity files, skills, memory, behavior. The richer the identity, the less the agent infers — and inference is where drift happens.
D8
Orchestration Design
Routing rules, delegation chains, handoff protocols. Owns the traffic between agents — not what they do.
D9
QA Operational
Quality gates inside the execution loop. Designed by humans, enforced by a skeptical review agent — the generator never grades its own homework.
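
The routing rules in D8 and the "generator never grades its own homework" rule in D9 can be checked mechanically. The sketch below assumes a simple routing table keyed by task type; the task types, agent names, and helper functions are hypothetical and only illustrate the idea.

```python
# Hypothetical routing table: task type -> (generator agent, review agent).
ROUTES: dict[str, tuple[str, str]] = {
    "article": ("writer", "editorial_reviewer"),
    "ui_component": ("frontend", "qa_reviewer"),
}


def validate_routes(routes: dict[str, tuple[str, str]]) -> None:
    """Reject any route where an agent would review its own output."""
    for task_type, (generator, reviewer) in routes.items():
        if generator == reviewer:
            raise ValueError(
                f"{task_type}: {generator} cannot review its own work"
            )


def route(task_type: str) -> tuple[str, str]:
    """Return the (generator, reviewer) pair configured for a task type."""
    try:
        return ROUTES[task_type]
    except KeyError:
        raise LookupError(f"no route configured for {task_type!r}") from None


validate_routes(ROUTES)  # run once at configuration time, before any execution
```

Validating at configuration time keeps the check in the Strategic phase, so a bad route never reaches the execution loop.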

//2. Execution — The inner loop

Domain-agnostic. Humans see pre-validated work, not first drafts.

Spec (human) → Execute (agent) → Review (agent) → Iterate (agent) → Verify (human)
agents iterate in seconds — by the time work reaches a human, the mechanical problems are resolved
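
The inner loop can be sketched as a bounded execute/review cycle. This is a minimal illustration, not a reference implementation: the `Review` type, the callable signatures, and the `max_iterations` cap are assumptions added here so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    passed: bool
    feedback: str = ""


def inner_loop(
    spec: str,
    execute: Callable[[str, str], str],     # (spec, feedback) -> draft
    review: Callable[[str, str], Review],   # (spec, draft) -> Review
    max_iterations: int = 5,                # assumed cap, not from APEX
) -> tuple[str, int, bool]:
    """Run execute/review until the reviewer passes or the budget is spent.

    Returns (draft, iterations_used, passed). Either way the result goes to
    a human next: for verification if passed, for escalation if not.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback, draft = "", ""
    for iteration in range(1, max_iterations + 1):
        draft = execute(spec, feedback)
        result = review(spec, draft)
        if result.passed:
            return draft, iteration, True
        feedback = result.feedback  # the next attempt sees the critique
    return draft, max_iterations, False


# Toy agents standing in for real generator and reviewer calls.
def toy_execute(spec: str, feedback: str) -> str:
    return spec + (" with sources" if "sources" in feedback else "")


def toy_review(spec: str, draft: str) -> Review:
    if "sources" in draft:
        return Review(passed=True)
    return Review(passed=False, feedback="add sources")


draft, iterations, passed = inner_loop("summary", toy_execute, toy_review)
```

The iteration count returned here is the per-task input to the Iteration Depth metric.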

//3. Reflection — The metrics

Principles without measurement are just opinions. These five metrics are a recommended starting point for telling whether the system is improving or degrading. Each implementation will need to add its own bespoke metrics.

M1
First-Pass Acceptance
Share of deliverables accepted at human verification without another round. The clearest signal of spec quality.
M2
Iteration Depth
Average agent-to-agent iterations per task. Watch the trend — decreasing means sharper specs.
M3
Human Touch Rate
Share of tasks needing human intervention outside the designed verification points. Should decrease over time.
M4
Calibration Impact
The meta-metric: change in the other metrics cycle over cycle. Flat impact = ceremony without learning.
M5
Cycle Time
Spec-to-verified-delivery, end to end. Shrinking cycle time is the clearest signal the system is maturing.
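
One way to compute the five metrics from per-task records is sketched below. The `TaskRecord` fields, the use of hours for cycle time, and the reading of Calibration Impact as a plain per-metric difference between two cycles are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRecord:
    accepted_first_pass: bool    # accepted at human verification, no extra round
    agent_iterations: int        # agent-to-agent iterations before verification
    unplanned_human_touch: bool  # human stepped in outside a verification point
    cycle_hours: float           # spec to verified delivery


def cycle_metrics(tasks: list[TaskRecord]) -> dict[str, float]:
    """Compute M1, M2, M3 and M5 for one cycle."""
    if not tasks:
        raise ValueError("need at least one task record")
    n = len(tasks)
    return {
        "first_pass_acceptance": sum(t.accepted_first_pass for t in tasks) / n,
        "iteration_depth": float(mean(t.agent_iterations for t in tasks)),
        "human_touch_rate": sum(t.unplanned_human_touch for t in tasks) / n,
        "cycle_time_hours": float(mean(t.cycle_hours for t in tasks)),
    }


def calibration_impact(
    previous: dict[str, float], current: dict[str, float]
) -> dict[str, float]:
    """M4: change in each metric from the previous cycle to the current one."""
    return {name: current[name] - previous[name] for name in current}


last_cycle = [TaskRecord(False, 4, True, 10.0), TaskRecord(True, 2, False, 6.0)]
this_cycle = [TaskRecord(True, 2, False, 5.0), TaskRecord(True, 1, False, 3.0)]

impact = calibration_impact(cycle_metrics(last_cycle), cycle_metrics(this_cycle))
# impact["first_pass_acceptance"] is 0.5; the other three deltas are negative
```

A healthy cycle in this sketch shows a positive delta for first-pass acceptance and negative deltas for the other three. Deltas near zero across several cycles correspond to the "ceremony without learning" case.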

//One agnostic framework, many different implementations

APEX is instantiated per use case. Same areas, domains, and phases — configured differently for fundamentally different work.

Product development

weekly cycles · autonomous harness

An Architect agent decomposes the PRD; Frontend and Integrator agents build in parallel; a QA agent codifies verified work into tests. Developers verify intent, not CSS.

7 experts · 9 domains mapped

Content production

daily cycles · autonomous harness

Brief → Research → Writer → skeptical Review agent → iterate → Editorial Lead verifies. The brand voice lives in the writer's identity file; briefs go in Monday morning, verified articles ship by afternoon.

6 experts · editorial quality gates

Data & research

daily runs · DAG harness

A fixed, auditable pipeline: three analyst agents in parallel → Correlator → Report Writer → Compliance Checker → human sign-off. Determinism beats recoverability when compliance is on the line.

6 experts · statistical quality gates
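
The fixed pipeline described above can be expressed as a dependency graph and ordered with Python's standard `graphlib` module. The stage names are hypothetical labels for the agents in the description, and `execution_batches` is an illustrative helper, not part of any harness.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names; each stage maps to the stages it depends on.
PIPELINE: dict[str, set[str]] = {
    "analyst_a": set(),
    "analyst_b": set(),
    "analyst_c": set(),
    "correlator": {"analyst_a", "analyst_b", "analyst_c"},
    "report_writer": {"correlator"},
    "compliance_checker": {"report_writer"},
    "human_sign_off": {"compliance_checker"},
}


def execution_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group stages into batches; stages within a batch can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a reproducible order
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(PIPELINE)
# First batch: the three analysts; last batch: human_sign_off
```

Because the graph is declared up front and the batch order is reproducible, the same input always runs the same stages in the same order, which is the auditability property the description emphasizes.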

//The ten principles

01
Harness first. Your runtime choice sets all constraints. Decide it before you configure anything else.
02
Human in control of outcome. Not every step — the result. Design, verify, decide.
03
Quality in = quality out. Output is a direct function of specs, context, and criteria.
04
Agents review agents first. All work passes agent review before a human sees it.
05
Domain-mapped ownership. Quality gates map to expertise, not generic reviews.
06
Iterate often, iterate fast. Don't gate iterations behind human approval when agents can resolve them.
07
Least privilege. Every agent gets only the access it needs. No more.
08
Calibrate the system, not just the output. Repeated fixes mean the system needs to change.
09
Data-driven reflections. Agents report metrics. Humans decide on data, not gut feelings.
10
Think big, scale back. Design the whole system first. Remove what's premature. Keep the architecture.

Go deeper

The complete reference and the three full walkthroughs live in the insights.

Want APEX running in your organization? Let's talk_

© 2026 Herbert Cuba Garcia // built by markdown & AI