Getting Started
There are four ways to interact with Autonomy Maturity:
- CLI: Run the /readiness-report slash command to evaluate a repository’s maturity level
- Web Dashboard: View your organization’s maturity scores in the Autonomy Maturity dashboard
- API: Programmatically access maturity reports via the Maturity Reports API (see the sketch after this list)
- Remediation (Coming Soon): Automatically fix failing criteria directly from the CLI or dashboard
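The CLI and dashboard are interactive; the API suits automation. The request shape isn’t documented here, so the endpoint URL, auth scheme, and response fields in this Python sketch are all assumptions:

```python
import json
import urllib.request

# Everything below is illustrative: the real Maturity Reports API's URL,
# auth header, and response fields may differ.
API_URL = "https://api.example.com/v1/maturity-reports"  # hypothetical endpoint

def fetch_report(repo: str, token: str) -> dict:
    """Fetch a repository's maturity report (assumed request/response shape)."""
    req = urllib.request.Request(
        f"{API_URL}?repo={repo}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

report = fetch_report("my-org/my-repo", token="YOUR_TOKEN")
print(report.get("level"), report.get("score"))  # field names assumed
```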
What Autonomous Development Looks Like
What does it look like when an organization reaches high autonomy maturity? Here are concrete examples of workflows that become possible when the technical foundation is in place.
Code from Conversation
A developer describes what they need built, and the system executes through deployment. Input: “Refactor the authentication module to support OAuth2 with PKCE, maintaining backward compatibility.” The system:
- Generates idiomatic code following established patterns
- Validates against linters, type checkers, and test suites
- Handles the pull request and code review process
- Updates documentation and notifies stakeholders
- Deploys and monitors for issues
Design to Implementation
A designer shares a mockup, and the system implements it without handoffs. Input: Figma mockup of a new dashboard. The system:
- Interprets the visual specification and design system
- Implements UI components with proper styling and responsiveness
- Connects to existing APIs or creates new endpoints
- Adds loading states, error handling, and accessibility
- Generates tests for user interactions
- Deploys to staging for design review
Bug to Deployed Fix
A customer reports an issue, and the system diagnoses, fixes, and deploys autonomously. Input: Bug report through support system. The system:
- Triages based on error logs and impact
- Creates a ticket with reproduction steps and diagnostic context
- Identifies the root cause from code analysis
- Generates a fix and comprehensive tests
- Opens a PR and assigns a developer for review
- Notifies support when the fix is deployed
The 5 Maturity Levels
Repositories progress through five distinct levels, each representing a qualitative shift in how autonomous agents can operate within your codebase.
| Level | Name | Description | Example Criteria |
|---|---|---|---|
| 1 | Functional | Code runs, but requires manual setup and lacks automated validation. Basic tooling that every repository should have. | README, linter, type checker, unit tests |
| 2 | Documented | Basic documentation and process exist. Workflows are written down and some automation is in place. | AGENTS.md, devcontainer, pre-commit hooks, branch protection |
| 3 | Standardized | Clear processes are defined, documented, and enforced through automation. Development is standardized across the organization. | Integration tests, secret scanning, distributed tracing, metrics |
| 4 | Optimized | Fast feedback loops and data-driven improvement. Systems are designed for productivity and measured continuously. | Fast CI feedback, regular deployment frequency, flaky test detection |
| 5 | Autonomous | Systems are self-improving with sophisticated orchestration. Complex requirements decompose automatically into parallelized execution. | Self-improving systems |
How Scoring Works
Level Progression
To unlock a level, you must pass 80% of the criteria from the previous level. This creates a gated progression system (see the sketch after this list):
- All repositories start at Level 1
- Pass 80% of Level 1 criteria → Unlock Level 2
- Pass 80% of Level 2 criteria → Unlock Level 3
- And so on…
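The gate is simple to express in code. A minimal Python sketch, assuming per-level pass rates (0.0 to 1.0) have already been computed:

```python
# Minimal sketch of the gated progression rule: a level unlocks only when
# at least 80% of the previous level's criteria pass. Data shape is assumed.
PASS_THRESHOLD = 0.80
MAX_LEVEL = 5

def unlocked_level(pass_rates: dict[int, float]) -> int:
    """Return the highest unlocked level, given per-level pass rates."""
    level = 1  # all repositories start at Level 1
    while level < MAX_LEVEL and pass_rates.get(level, 0.0) >= PASS_THRESHOLD:
        level += 1
    return level

# Level 1 fully passes and Level 2 passes 85%, so Level 3 is unlocked;
# Level 3 itself is only at 50%, so Level 4 stays locked.
print(unlocked_level({1: 1.0, 2: 0.85, 3: 0.50}))  # -> 3
```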
Evaluation Scopes
Criteria are evaluated at two different scopes:
- Repository Scope: Evaluated once for the entire repository (e.g., CODEOWNERS file exists, branch protection enabled)
- Application Scope: Evaluated per application in monorepos (e.g., linter configured, unit tests exist for each app)
For application-scoped criteria, results are reported as a fraction, e.g., 3 / 4 (3 out of 4 apps pass).
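A minimal Python sketch of how the two scopes might report results; the data shapes are assumptions:

```python
# Sketch of the two evaluation scopes (data shapes assumed). Repository-scope
# criteria yield one result for the whole repo; application-scope criteria
# yield one result per app in a monorepo, reported as a fraction.
def repository_scope(passed: bool) -> str:
    return "pass" if passed else "fail"

def application_scope(results: dict[str, bool]) -> str:
    return f"{sum(results.values())} / {len(results)}"

print(repository_scope(True))  # e.g., CODEOWNERS file exists
print(application_scope(
    {"web": True, "api": True, "worker": True, "cli": False}
))  # -> 3 / 4 (3 out of 4 apps pass)
```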
The Technical Pillars
The Autonomy Maturity Model organizes criteria into nine technical pillars that form the foundation for autonomous operation.
Style & Validation
Linters, type checkers, and formatters catch obvious errors instantly. Agents avoid wasting cycles on syntax errors, style inconsistencies, and type mismatches.
- Example criteria: Linter configuration, type checker, code formatter, pre-commit hooks
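Pre-commit hooks are one way these checks run automatically. A minimal Python sketch of a hook script; the specific tools (ruff, mypy, black) are assumed choices, so substitute your own stack:

```python
#!/usr/bin/env python3
"""Minimal pre-commit hook sketch (e.g., saved as .git/hooks/pre-commit and
made executable). The tools listed are assumptions; swap in your own stack."""
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],     # linter (assumed choice)
    ["mypy", "."],              # type checker (assumed choice)
    ["black", "--check", "."],  # formatter (assumed choice)
]

for cmd in CHECKS:
    # A nonzero exit code from any check blocks the commit.
    if subprocess.run(cmd).returncode != 0:
        print(f"pre-commit check failed: {' '.join(cmd)}", file=sys.stderr)
        sys.exit(1)
```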
Build System
Clear, deterministic build commands let agents verify their changes compile and run before committing. No guessing about which commands to run or which flags to pass.
- Example criteria: Build command documented, dependencies pinned, VCS CLI tools
Testing
Fast unit and integration tests create tight feedback loops. Agents learn whether their changes work correctly by running tests, seeing failures, and iterating.
- Example criteria: Unit tests exist, integration tests exist, tests runnable locally
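As a baseline, the kind of unit test that runs locally in milliseconds; the function under test here is hypothetical:

```python
# Minimal sketch of a fast, locally runnable unit test (pytest-style).
# The slugify function is a hypothetical example, not part of the model.
def slugify(title: str) -> str:
    return title.strip().lower().replace(" ", "-")

def test_slugify_normalizes_spacing_and_case():
    assert slugify("  Hello World ") == "hello-world"
```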
Documentation
Explicit instructions house the tribal knowledge that “everyone just knows”: how to set up the environment, run tests, deploy changes, and debug issues. Agents need written instructions that are discoverable, accurate, and maintained.
- Example criteria: AGENTS.md, README, documentation freshness
Development Environment
Reproducible environments ensure consistency. When developers and agents work in identical environments, entire classes of problems disappear. No more “works on my machine.”
- Example criteria: Devcontainer, environment template, local services setup
Debugging & Observability
Structured logging, tracing, and metrics give agents runtime visibility into what code actually does. Good observability turns “it failed” into “it failed because X was null when calling Y after receiving Z.”
- Example criteria: Structured logging, distributed tracing, metrics collection
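A minimal sketch of structured logging using only Python’s standard library; the field names and the `context` convention are assumptions:

```python
# Minimal sketch of structured (JSON) logging with the standard library.
# Field names and the "context" convention are illustrative assumptions.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
        }
        # Merge structured context passed via the `extra` keyword.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("payments")
log.addHandler(handler)
log.setLevel(logging.INFO)

# "It failed because X was null when calling Y after receiving Z" becomes
# a queryable event rather than an opaque string.
log.error("charge failed", extra={"context": {"x": None, "call": "Y", "input": "Z"}})
```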
Security
Branch protection, secret scanning, and code owners prevent agents from introducing security issues or bypassing required reviews. Agents move fast; automated guardrails ensure they move fast safely.
- Example criteria: Branch protection, secret scanning, CODEOWNERS
Task Discovery
Infrastructure for agents to find and scope work autonomously. Well-structured issues and templates help agents understand what needs to be done.
- Example criteria: Issue templates, issue labeling system, PR templates
Product & Experimentation
Tools for measuring impact, running experiments, and understanding user behavior. Agents can see whether features are actually used and measure the impact of their changes.
- Example criteria: Product analytics instrumentation, experiment infrastructure
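A minimal sketch of instrumentation; the `track` client below is hypothetical, standing in for a vendor or in-house analytics SDK:

```python
# Minimal sketch of product analytics instrumentation. The track() function
# is a hypothetical stand-in for a real analytics client.
import time

def track(event: str, properties: dict) -> None:
    """Stand-in for an analytics client's event call (assumed interface)."""
    print({"event": event, "ts": time.time(), **properties})

# Emit an event when a feature is used, so agents (and humans) can see
# whether a change actually moved usage.
track("dashboard_viewed", {"user_id": "u_123", "variant": "new-layout"})
```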
