v0.60.0 · Now Available

One sentence.
Production-ready code.

ai-spec is an AI-driven development orchestrator — from a single requirement to fully reviewed, test-covered, spec-aligned code, in minutes.

Get Started → View on GitHub
9 AI Providers · 10 Pipeline Steps · 913+ Tests Passing · 25+ Core Modules
ai-spec pipeline demo

The Problem

Why existing AI tools
aren't enough

Every AI coding tool faces the same structural limitations. ai-spec is designed to address all of them.

🧠
No Project Memory
AI doesn't know your error codes, middleware setup, or i18n constraints. Every conversation starts from zero — like code from a new hire who's never seen your codebase.
🕳️
No Structured Middle Layer
Natural language jumps directly to code. No reviewable, versionable contract in between. Misunderstandings are discovered in code — at high cost.
💥
All-or-Nothing Generation
Generate a whole feature at once. One error fails everything. No checkpointing, no resume — failures restart from scratch.
🚪
Generate and Exit
Did tests pass? Any lint errors? Architecture violations? You have to check everything manually after the tool exits.
📉
Experience Lost Every Time
A security bug found in review? AI will repeat it next time. Team engineering knowledge can't systematically constrain AI behavior.
👻
Cross-Task Hallucinations
Task B hallucinates function names from Task A — even though both files are in the same PR. Without a shared cache, AI guesses instead of reads.

The Pipeline

Every step, orchestrated

A fully automated 10-step pipeline from idea to reviewed, scored, production-ready code.

[1/10] Context [2/10] Spec + Tasks [3/10] Refinement [3.4/10] Quality Gate [Gate] Approval [DSL] Contract [Git] Worktree [6/10] Codegen [7/10] Tests [8/10] Auto-fix [9/10] 3-pass Review [10/10] Harness Eval
[1/10] CONTEXT LOAD

Project-Aware from the Start

Scans routes, schemas, dependencies, middleware, and the project constitution. Every prompt is grounded in your actual codebase — not a generic template.

[2/10] SPEC + TASKS

Structured Spec with Task Decomposition

Generates a human-readable Markdown spec and decomposes it into ordered tasks: data → service → api → view → route → test. One AI call, complete output.
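The layer ordering described above can be sketched as a simple sort. This is a minimal illustration, not ai-spec's actual implementation; the `Task` shape and function names are assumptions.

```typescript
// Illustrative sketch: tasks are sorted so lower layers (data, service)
// are generated before the files that depend on them.
const LAYER_ORDER = ["data", "service", "api", "view", "route", "test"] as const;
type Layer = (typeof LAYER_ORDER)[number];

interface Task {
  file: string;
  layer: Layer;
}

function orderTasks(tasks: Task[]): Task[] {
  return [...tasks].sort(
    (a, b) => LAYER_ORDER.indexOf(a.layer) - LAYER_ORDER.indexOf(b.layer),
  );
}
```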

[3/10] REFINEMENT

Interactive Polish with Diff Preview

AI polishes the spec and shows a colored diff. You approve, reject, or request changes. Multiple rounds supported — no code is written until you say so.

[DSL] CONTRACT

Machine-Readable Dual Contract

Extracts a SpecDSL JSON — models, endpoints, behaviors — from the spec. Validated against 9 schema rules. The single source of truth for codegen, tests, and exports.
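The shape of such a contract might look like the sketch below. The field names are illustrative assumptions, and the validation function shows only one example of the kind of rule a schema check could enforce; the real SpecDSL schema and its 9 rules live inside ai-spec.

```typescript
// Illustrative SpecDSL shape (field names are assumptions, not ai-spec's schema).
interface SpecDSL {
  models: { name: string; fields: { name: string; type: string }[] }[];
  endpoints: { method: string; path: string; model?: string }[];
  behaviors: { id: string; description: string }[];
}

// Example of one plausible validation rule: every endpoint that
// references a model must reference a model that actually exists.
function danglingModelRefs(dsl: SpecDSL): string[] {
  const known = new Set(dsl.models.map((m) => m.name));
  return dsl.endpoints
    .filter((e) => e.model !== undefined && !known.has(e.model))
    .map((e) => `${e.method} ${e.path} -> unknown model "${e.model}"`);
}
```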

[6/10] CODEGEN

Task-Layered Generation with File Cache

Generates file-by-file in dependency order. Each completed file's exports are cached and injected into subsequent prompts — eliminating cross-task hallucinations.
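The caching idea can be sketched as follows, assuming a simple map from file path to exported names (the real cache format inside ai-spec may differ):

```typescript
// Illustrative sketch: after each file is generated, its exported names are
// recorded; the accumulated cache is rendered into later prompts so the
// model reads real identifiers instead of guessing them.
type ExportCache = Map<string, string[]>; // file path -> exported identifiers

function recordExports(cache: ExportCache, file: string, exported: string[]): void {
  cache.set(file, exported);
}

function buildPromptContext(cache: ExportCache): string {
  const lines: string[] = [];
  for (const [file, names] of cache) {
    lines.push(`${file}: exports ${names.join(", ")}`);
  }
  return lines.length
    ? `Already-generated files (use these exact names):\n${lines.join("\n")}`
    : "";
}
```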

[8/10] AUTO-FIX

Error Feedback Loop — Up to 3 Cycles

Runs npm test / lint / tsc, parses errors by file, and sends targeted AI fixes with DSL context. Dependency-sorted repair order maximizes cycle efficiency.
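The bounded repair loop can be sketched like this. It is a simplified model under assumed types; ai-spec's real loop also sorts files by dependency order and carries DSL context into each fix prompt.

```typescript
// Illustrative sketch: run checks, group errors by file, request targeted
// fixes, and stop after 3 cycles or as soon as the checks pass.
interface CheckResult {
  errorsByFile: Map<string, string[]>;
}

function autoFixLoop(
  runChecks: () => CheckResult,
  fixFile: (file: string, errors: string[]) => void,
  maxCycles = 3,
): number {
  for (let cycle = 1; cycle <= maxCycles; cycle++) {
    const result = runChecks();
    if (result.errorsByFile.size === 0) return cycle - 1; // cycles actually used
    for (const [file, errors] of result.errorsByFile) fixFile(file, errors);
  }
  return maxCycles;
}
```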

[9/10] 3-PASS REVIEW

Architecture + Implementation + Impact

Pass 1: architecture & spec compliance. Pass 2: implementation correctness & edge cases. Pass 3: blast radius, complexity score, breaking change risk.

[10/10] HARNESS EVAL

Automated Quality Score

Scores on 4 dimensions: compliance (30%) + DSL coverage (25%) + compile (20%) + review (25%). Linked to prompt hash — tracks quality over time with zero AI calls.
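The weighting above can be sketched as a pure function, assuming each dimension is normalized to 0–100 (the actual normalization rubric is internal to ai-spec):

```typescript
// Illustrative sketch of the 4-dimension harness weighting:
// compliance 30% + DSL coverage 25% + compile 20% + review 25%.
interface HarnessDimensions {
  compliance: number;   // spec compliance, 0-100
  dslCoverage: number;  // DSL coverage, 0-100
  compile: number;      // compile pass rate, 0-100
  review: number;       // review score, 0-100
}

function harnessScore(d: HarnessDimensions): number {
  const total =
    d.compliance * 0.3 +
    d.dslCoverage * 0.25 +
    d.compile * 0.2 +
    d.review * 0.25;
  return Math.round(total * 10) / 10; // one decimal place, out of 100
}
```

Because the formula is deterministic, the same inputs always yield the same score, which is what makes run-to-run trends comparable.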


Core Features

Everything you need to
ship with confidence

Every feature addresses a real pain point in AI-assisted development.

📜

Project Constitution System

Self-evolving knowledge base (§1–§9) that auto-injects into every prompt. Scans routes, middleware, schema, and conventions on init. Grows smarter with every review via §9 lesson accumulation.

ai-spec init
🎯

Dual-Layer Contract

Human-readable Markdown Spec for engineers to review and align on. Machine-readable SpecDSL JSON for tools to consume. Both versioned, both auditable. Codegen, tests, and exports all share one contract.

Spec + DSL
🔄

Dual Feedback Loops

DSL Gap Loop: detects sparse contracts before codegen and triggers targeted spec enrichment. Review→DSL Loop: structural review issues feed back into the contract — so the next run starts cleaner.

Self-correcting

VCR Record & Replay

Record real AI responses on first run. Replay them deterministically in subsequent runs — zero API calls, zero cost. Iterate on pipeline logic and UI without burning tokens.

ai-spec create --vcr-record
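The record/replay mechanism can be sketched as a cassette keyed by a prompt hash. This is a minimal model; ai-spec's on-disk cassette format and key derivation are assumptions here.

```typescript
import { createHash } from "node:crypto";

type Cassette = Map<string, string>; // prompt hash -> recorded response

// Illustrative sketch: record mode calls the real provider and stores the
// response; replay mode returns the stored response with zero API calls.
function callWithVcr(
  cassette: Cassette,
  mode: "record" | "replay",
  prompt: string,
  callModel: (p: string) => string, // the real provider call
): string {
  const key = createHash("sha256").update(prompt).digest("hex");
  if (mode === "replay") {
    const hit = cassette.get(key);
    if (hit === undefined) throw new Error("no recording for this prompt");
    return hit; // deterministic, zero cost
  }
  const response = callModel(prompt);
  cassette.set(key, response);
  return response;
}
```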
🛡️

Approval Gate

Human review happens at the right moment: after the spec is clear and the DSL is valid, but before any code is written. Abort means zero disk residue. Proceed means every step has a verified contract to follow.

[Gate] checkpoint
🔁

Fix-History Self-Learning

Every successful import fix is appended to a ledger. On the next codegen run, a "DO NOT REPEAT" section is automatically injected into prompts — preventing the same hallucination from ever reoccurring.

v0.54+ zero-cost learning
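The ledger-to-prompt step can be sketched as below. The entry shape and wording are illustrative assumptions; only the mechanism (accumulate fixes, render a "DO NOT REPEAT" section) comes from the description above.

```typescript
// Illustrative sketch: each successful import fix is appended as a rule,
// and the accumulated rules become a prompt section for the next run.
interface FixEntry {
  wrongImport: string; // what the model hallucinated
  rightImport: string; // what the fix replaced it with
}

function renderDoNotRepeat(ledger: FixEntry[]): string {
  if (ledger.length === 0) return "";
  const rules = ledger.map(
    (f) => `- Do not import "${f.wrongImport}"; use "${f.rightImport}" instead.`,
  );
  return ["DO NOT REPEAT:", ...rules].join("\n");
}
```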
↩️

Instant Rollback

Every run gets a unique RunId. Before any file is written, the original content is snapshotted. One command restores your entire repo to pre-run state — precise to the file, precise to the run.

ai-spec restore <runId>
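The snapshot-then-restore flow can be sketched in memory. ai-spec persists snapshots on disk keyed by RunId; the types and function names below are assumptions for illustration.

```typescript
// Illustrative sketch: before a file is first written in a run, its
// original content is snapshotted; restore replays those originals back.
type Snapshots = Map<string, Map<string, string>>; // runId -> (path -> original)

function snapshotBeforeWrite(
  snaps: Snapshots, runId: string, path: string, original: string,
): void {
  const run = snaps.get(runId) ?? new Map<string, string>();
  if (!run.has(path)) run.set(path, original); // keep the pre-run copy only
  snaps.set(runId, run);
}

function restoreRun(
  snaps: Snapshots, runId: string, writeFile: (p: string, c: string) => void,
): number {
  const run = snaps.get(runId);
  if (!run) return 0;
  for (const [path, content] of run) writeFile(path, content);
  return run.size; // number of files restored
}
```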
🌐

9 AI Providers

Gemini, Claude, OpenAI, DeepSeek, Qwen, GLM, MiniMax, Doubao, MiMo. Mix and match: use one model for spec generation, another for codegen. Per-run provider override supported.

--provider --codegen-provider

Full-stack in
one command

The only pipeline that wires your backend and frontend together — automatically.

🖥️ Backend — node-express
[W2] Spec + DSL generated
Models, endpoints, behaviors extracted
Code generated + reviewed
DSL contract ready for handoff →
DSL Contract: 5 endpoints · 3 models → injected into frontend pipeline
🖼️ Frontend — vue / react
[W4] Spec generated with backend contract
HTTP client calls pre-aligned to DSL
Code generated + reviewed
[W5] Cross-stack verifier: 0 phantoms ✔
✔ Cross-Stack Contract Verification (v0.50+)

After frontend generation, the cross-stack verifier scans every API call in the frontend code and checks it against the backend DSL. Phantom routes (hallucinated endpoints), method mismatches, and string-concatenated paths are all detected and reported before you push.
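The core of such a check can be sketched as a lookup of every frontend (method, path) pair against the backend contract. This is a simplified model; the real verifier also handles path parameters and string-concatenated paths.

```typescript
// Illustrative sketch: unknown paths are phantom routes; known paths
// called with the wrong verb are method mismatches.
interface Endpoint {
  method: string;
  path: string;
}

function verifyCalls(dsl: Endpoint[], calls: Endpoint[]): string[] {
  const issues: string[] = [];
  for (const call of calls) {
    const samePath = dsl.filter((e) => e.path === call.path);
    if (samePath.length === 0) {
      issues.push(`phantom route: ${call.method} ${call.path}`);
    } else if (!samePath.some((e) => e.method === call.method)) {
      issues.push(`method mismatch: ${call.method} ${call.path}`);
    }
  }
  return issues; // empty array = 0 phantoms
}
```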


DSL-Derived Artifacts

One contract,
many outputs

The SpecDSL isn't just for codegen — it powers your entire development workflow.

ai-spec export

OpenAPI 3.1.0 Export

DSL → production-ready YAML or JSON. Plug directly into Postman, Swagger UI, or any SDK generator.

openapi.yaml (3.1.0)
Paths, schemas, parameters, responses
--format json · --server <url>
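The DSL-to-OpenAPI direction can be sketched as building the `paths` object. This skeleton omits the schemas, parameters, and responses the real export emits, and the `DslEndpoint` shape is an assumption.

```typescript
// Illustrative sketch: each DSL endpoint becomes an operation entry under
// its path in the OpenAPI `paths` object.
interface DslEndpoint {
  method: string;
  path: string;
  summary?: string;
}

function toOpenApiPaths(
  endpoints: DslEndpoint[],
): Record<string, Record<string, object>> {
  const paths: Record<string, Record<string, object>> = {};
  for (const e of endpoints) {
    const entry = paths[e.path] ?? {};
    entry[e.method.toLowerCase()] = { summary: e.summary ?? "", responses: {} };
    paths[e.path] = entry;
  }
  return paths;
}
```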
ai-spec mock

Instant Mock Server

DSL → Express mock server + MSW handlers + Vite proxy config. Frontend development without waiting for the backend.

mock/server.js (Express)
mock/handlers.ts (MSW)
--serve --proxy --port 3001
ai-spec types

TypeScript Types

DSL → typed interfaces, request/response types, and API endpoint constants. Shared across frontend and backend.

export interface Model {}
export const API_ENDPOINTS
Request & Response types
ai-spec dashboard

Harness Dashboard

Generate a static HTML quality dashboard. Track harness scores, compliance rates, and review trends across all runs.

Static HTML, no server needed
Score trend charts
Per-run stage breakdown

What the pipeline
actually outputs

Every step is visible, every decision is auditable. No black box — you see exactly what's happening, what scored how, and what was fixed automatically.

Spec quality assessment with per-dimension scores
DSL extraction with validation summary
Per-file codegen with layer labels
Error auto-fix with cycle count
3-pass review with per-pass verdicts
Final harness score breakdown (4 dimensions)
ai-spec create "Add task management"
[1/10]  Loading project context...
        Constitution : ✔ found (§1–§9)
        Tech stack   : vue · vite · pinia

[2/10]  Generating spec with glm/glm-4.5...
        ✔ Spec generated  ✔ 8 tasks

[3.4/10] Spec quality assessment...
        Coverage     [██████████████████░░]  9/10
        Clarity      [████████████████░░░░]  8/10

[Gate]  Approval Gate — awaiting decision
        ✔ Approved — continuing...

[DSL]   Extracting structured contract...
        ✔ DSL valid — Models: 3  Endpoints: 7

[6/10]  Code generation (8 files)...
          service  · src/api/task.ts
          api      · src/stores/taskStore.ts
          view     · src/views/TaskList.vue
        ████████████████████  100%

[8/10]  ⚠ 3 errors — auto-fixing cycle 1...
        ✔ All errors resolved in 1 cycle

[9/10]  3-pass code review...
        Pass 1  ✔ Architecture aligned
        Pass 2  ✔ Implementation correct
        Score   [████████████████░░░░]  8.2/10

[10/10] Harness Self-Evaluation...
        Total   [██████████████████░░]  92/100
        ✔ 2 lessons → constitution §9
        RunId: 20260409-143022-a7f2

Observability

Quality you can
measure and track

ai-spec turns code generation quality into data — comparable, trackable, and improvable over time.

Harness Score Trend

Track quality across all runs. See if your pipeline is improving.

Run 1: 70 · Run 2: 74 · Run 3: 82 · Run 4: 88 · Run 5: 92

Per-Run Stage Logs

Every stage is timed and logged to .ai-spec-logs/<runId>.json.

context_load 312ms
spec_gen 18.4s
dsl_extract 6.1s
codegen 51.2s
error_feedback 14.3s
review 14.8s
total 105.1s

4-Dimension Scoring

The harness score is deterministic — no AI calls after generation completes.

Compliance 30%
DSL Coverage 25%
Compile Pass 20%
Review Score 25%

Instant Rollback

Don't like the result? One command restores all modified files to their pre-run state.

$ ai-spec restore 20260409-a7f2
↩ src/api/task.ts
↩ src/stores/taskStore.ts
↩ src/views/TaskList.vue
✔ 8 files restored

AI Providers

9 providers,
your choice

Use any combination of providers. Mix a reasoning model for spec generation with a fast model for codegen.

MiMo
mimo-v2-pro
Gemini
gemini-2.5-pro
Claude
claude-opus-4-6
OpenAI
o3 · gpt-4o
DeepSeek
deepseek-chat · r1
Qwen
qwen3-235b-a22b
GLM
glm-5 · glm-4.5-air
MiniMax
MiniMax-Text-2.7
Doubao
doubao-pro-256k
$ ai-spec create "Add login" --provider gemini --codegen-provider deepseek

Ready in 60 seconds

Install globally, set your API key, register a repo, and start shipping.

# Install globally
$ npm install -g ai-spec-dev

# Set your API key (any provider)
$ export GEMINI_API_KEY=your_key_here

# Register your repo + generate constitution
$ ai-spec init

# Start developing
$ ai-spec create "Add user authentication to my app"
View on npm → GitHub Repo