Superpowers is a set of skill plugins for Claude Code that encode software engineering best practices as executable constraints.
TLDR

- Countering AI laziness — AI always tends to minimize output while satisfying constraints. You can’t rely on self-discipline; you need enforced checkpoints. Like lint rules — not because programmers are bad, but because certain mistakes are too common and too costly.
- Countering AI overconfidence — AI’s claimed certainty always exceeds the actual evidence. “Should work” isn’t done — showing output that proves it’s true is. Evidence before assertions.
- Zero Trust — Superpowers isn’t writing a guide on “how AI should work” — it’s designing a system architecture that assumes AI will cut corners, lie, and deceive itself.
- Decentralized pipeline — each skill’s terminal state points to the next skill (or several)
- Digraph-described workflows — uses digraphs to describe processes, because LLMs are natively fluent with digraphs
- Rationalization table — enumerates common shortcuts the LLM might take and refutes each in advance, “forcing” the LLM onto the intended path
Design Philosophy: Three Layers of Understanding
Layer 1 (Surface): Enforced Workflows
Adding lint rules to AI — not because AI isn’t smart, but because certain mistakes are too common and too costly.
Layer 2 (Essence): Zero Trust Architecture Applied to AI
Superpowers isn’t writing a guide on “how AI should work” — it’s designing a system architecture that assumes AI will cut corners, lie, and deceive itself.
| Security Concept | Superpowers Equivalent |
|---|---|
| Never trust, always verify | verification-before-completion: don’t trust AI’s assertions, require exit codes + full output |
| Least privilege | Subagents only get the current task’s context, not global |
| Defense in depth | Iron Law → Red Flags → Rationalization Table → Gate Function, four layers of defense |
| Mandatory access control | HARD-GATE: not a suggestion, a block |
| Zero trust between peers | The spec reviewer’s prompt explicitly states: CRITICAL: Do Not Trust the Report — the previous agent’s self-report is not trustworthy |
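The "never trust, always verify" row can be made concrete. Below is a minimal sketch of a verification gate in the spirit of verification-before-completion; the real skill is a prompt, not code, and `verified_done` is a hypothetical helper that just models the rule "don't trust assertions, require exit codes + full output":

```python
import subprocess
import sys

def verified_done(check_cmd: list[str]) -> bool:
    """Only allow a 'done' claim when the check command actually passes."""
    result = subprocess.run(check_cmd, capture_output=True, text=True)
    print(result.stdout)            # full output, not a paraphrased summary
    print(result.stderr)
    return result.returncode == 0   # the exit code is the evidence, not the claim

# "Tests pass" must be backed by a real exit code, not a self-report:
print("done" if verified_done([sys.executable, "-c", "print('ok')"]) else "not done")
```

The point of the design is that the gate consumes machine-checkable evidence (an exit code) rather than the agent's own assertion.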
Evolutionary Parallel
Early network security: trust the internal network, defend the perimeter → insiders cause problems too → Zero Trust
Early AI agents: give good prompts, trust the output → AI cuts corners, hallucinates, is overconfident → Superpowers
Layer 3 (Meta): Using AI’s Weaknesses to Constrain AI
A paradox: Superpowers’ rules are themselves prompts. If AI ignores prompts to be lazy, why would it follow the “don’t be lazy” prompt?
The answer: AI’s laziness and compliance are not the same circuit.
During token generation, two forces compete:
Force A (laziness): argmin(output cost) → skip this step, give the answer directly
Force B (compliance): explicit instruction in prompt → must execute
Superpowers systematically increases Force B's weight, making non-compliant output less probable than compliant output.
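The competition between the two forces can be sketched as a toy scoring model (illustrative numbers only, not how an actual LLM computes logits): each continuation gets a score of minus its output cost plus an instruction-weight bonus if it complies, and raising the instruction weight flips which continuation wins.

```python
# Toy model of Force A (laziness) vs. Force B (compliance).
# Score = -output_cost + instruction_weight * complies; highest score "wins".
def pick(continuations: dict, instruction_weight: float) -> str:
    scores = {name: -cost + instruction_weight * complies
              for name, (cost, complies) in continuations.items()}
    return max(scores, key=scores.get)

options = {
    "skip_step":  (1.0, 0.0),  # cheap but non-compliant
    "full_check": (3.0, 1.0),  # expensive but compliant
}

print(pick(options, instruction_weight=1.0))  # 'skip_step' — laziness wins
print(pick(options, instruction_weight=5.0))  # 'full_check' — heavy emphasis wins
```

With a weak instruction the cheap continuation scores higher (-1 vs. -2); with Superpowers-style emphasis the compliant one does (-1 vs. +2).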
Specific mechanisms:
- `EXTREMELY-IMPORTANT` tags — empirically increase the weight of subsequent instructions (the mechanism is not fully understood, but the effect is measurable)
- Rationalization table — pre-blocks lazy paths in the probability space of token generation. When all common excuses are pre-refuted, the AI is pushed toward compliance because that is the highest-probability remaining path
- Iron Law naming — in training data, words like “non-negotiable” and “must” appear in the context of high-priority instructions; distributional semantics gives corresponding instructions higher weight
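A rationalization table is essentially an excuse-to-refutation lookup. The sketch below shows the shape of one; the excuses and refutations are paraphrased from this document, but the real Superpowers tables live in skill markdown, not code, and the fallback string is invented for illustration:

```python
# Each common shortcut is pre-refuted so the lazy path is blocked in advance.
RATIONALIZATIONS = {
    "the change is trivial, workflows are overkill":
        "Every project goes through this process. A todo list, a "
        "single-function utility — all of them.",
    "it should work, no need to run it":
        "'Should work' isn't done — show verification output first.",
    "the design phase would slow things down":
        "HARD-GATE: design must be approved before any code is written.",
}

def counter(excuse: str) -> str:
    """Refute a known lazy path; unknown excuses hit the generic Iron Law."""
    return RATIONALIZATIONS.get(excuse, "Non-negotiable: follow the workflow.")

print(counter("it should work, no need to run it"))
```

Enumerating the excuses up front is what makes compliance the highest-probability remaining path: every pre-listed shortcut arrives already answered.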
Key Insight
Superpowers doesn’t persuade AI at a logical level (AI has no capacity to “be persuaded”) — it manipulates AI’s token generation distribution at a statistical level. This is “using AI’s weaknesses to constrain AI” — AI’s high sensitivity to instruction markers is reverse-engineered into a quality assurance tool.
AI Laziness Patterns
The essence: argmin(output cost), stop as soon as constraints are met.
| Pattern | Behavior |
|---|---|
| Lower bound as target | “>=8 pages” produces exactly 8 — treats >= as == |
| Skip intermediate steps | “analyze the bug” jumps straight to the fix, compressing process into result |
| Template filling | Tests only cover happy path, docs copy function signatures |
| Narrowest interpretation of vague requirements | “optimize function” just renames a variable |
| Downgrade on complex problems | Uses memory instead of deep search when needed |
| Cover up errors | Test fails → modify the test, add @ts-ignore |
Anti-Greedy
“Narrowest interpretation of vague requirements” is essentially the inverse of a greedy algorithm: greedy = argmax(local gain), AI laziness = argmin(current output cost). Same non-backtracking single-pass decision, opposite objective function.
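The structural parallel can be shown in a few lines. The numbers and option names below are invented purely for illustration — both strategies make the same single-pass, non-backtracking choice over the same options, just with opposite objectives:

```python
# Same decision structure, opposite objective functions (toy numbers).
choices = {
    "rename_variable":      {"gain": 1, "cost": 1},
    "restructure_hot_loop": {"gain": 9, "cost": 7},
}

greedy = max(choices, key=lambda c: choices[c]["gain"])  # argmax(local gain)
lazy   = min(choices, key=lambda c: choices[c]["cost"])  # argmin(output cost)
print(greedy, lazy)  # restructure_hot_loop rename_variable
```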
Skills Classification
Two categories, separation of concerns:
Process skills (how to work) — universal, the skeleton of quality assurance:
| Skill | Required? | Purpose |
|---|---|---|
| using-superpowers | ✅ meta layer | Check for applicable skills before any task |
| brainstorming | ✅ | Requirements understanding → design, main entry point |
| writing-plans | ✅ | Convert designs into executable task lists |
| test-driven-development | ✅ | Follow TDD during implementation |
| systematic-debugging | ✅ (on bugs) | 4-phase root cause analysis, no guess-and-fix |
| verification-before-completion | ✅ | Must show verification output before claiming done |
| requesting-code-review | ✅ | Request review after completion |
| receiving-code-review | ✅ | Verify feedback is technically correct before implementing |
| finishing-a-development-branch | ✅ | Integration and wrap-up |
| dispatching-parallel-agents | ❌ optional | Parallel investigation of multiple independent issues, pure acceleration |
Domain skills (domain knowledge) — pluggable: frontend-design, claude-developer-platform, obsidian-markdown, etc.
Complete Flow
No central dispatcher — a decentralized pipeline where each skill’s terminal state is hardcoded to point to the next:
using-superpowers (meta layer)
↓
brainstorming (main entry, terminal state can only → writing-plans)
↓
writing-plans
↓
├── subagent-driven-development (current session, subagent auto two-phase review)
└── executing-plans (new session, manual review, 3 tasks per batch)
↓
finishing-a-development-branch
brainstorming has a HARD-GATE: design must be approved by the user before any code is written. Directly counters AI’s urge to skip the design phase.
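The decentralized pipeline and the HARD-GATE can be sketched as a tiny state machine. The skill names come from the flow above; the successor table and the `design_approved` flag are a simplified model, not the actual plugin mechanism:

```python
# No central dispatcher: each skill's terminal state hardcodes its successor(s).
NEXT = {
    "using-superpowers": ["brainstorming"],
    "brainstorming": ["writing-plans"],  # terminal state can only go here
    "writing-plans": ["subagent-driven-development", "executing-plans"],
    "subagent-driven-development": ["finishing-a-development-branch"],
    "executing-plans": ["finishing-a-development-branch"],
}

def advance(skill: str, *, design_approved: bool = False) -> list[str]:
    """Return the next skill(s); brainstorming blocks until the user approves."""
    if skill == "brainstorming" and not design_approved:
        raise RuntimeError("HARD-GATE: design not approved, no code allowed")
    return NEXT.get(skill, [])

print(advance("brainstorming", design_approved=True))  # ['writing-plans']
```

Because the gate raises rather than warns, skipping the design phase is structurally impossible in this model — a block, not a suggestion.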
Execution Layer: Three Options
The core distinction: who reviews, sequential or parallel.
| | subagent-driven-development | executing-plans | dispatching-parallel-agents |
|---|---|---|---|
| Review | Subagent auto two-phase | Manual checkpoint | None |
| Order | Sequential | Sequential | Parallel |
| Prerequisite | writing-plans | writing-plans | Multiple independent bugs/failures |
| Human involvement | Low | High | Low |
dispatching-parallel-agents doesn’t require a plan file — it’s downstream of systematic-debugging, triggered only when multiple independent root causes are found. It’s fundamentally about acceleration, not quality assurance.
Design Elegance
Digraph-described workflows — Uses Graphviz DOT instead of natural language. Eliminates ambiguity, makes decision branches explicit, and is more reliably understood by LLMs.
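To make this concrete: a workflow expressed in DOT has every decision branch as an explicit, machine-readable edge. The DOT source below is hypothetical (the state names are invented for illustration, not taken from an actual skill), embedded as a string so the branch structure can be extracted mechanically:

```python
# A hypothetical skill workflow in Graphviz DOT form.
WORKFLOW_DOT = """
digraph debugging {
    reproduce -> isolate;
    isolate -> root_cause_found [label="yes"];
    isolate -> gather_more_data [label="no"];
    gather_more_data -> isolate;
    root_cause_found -> fix;
}
"""

# Unlike prose, the branches are unambiguous: just list each node's out-edges.
edges = [line.split("->") for line in WORKFLOW_DOT.splitlines() if "->" in line]
branches: dict[str, list[str]] = {}
for src, dst in edges:
    target = dst.split("[")[0].strip().rstrip(";")
    branches.setdefault(src.strip(), []).append(target)

print(branches["isolate"])  # ['root_cause_found', 'gather_more_data']
```

Natural language would describe the yes/no fork in a sentence a model might gloss over; in DOT, the fork simply is two edges, and there is nothing to misread.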
Explicitly named failure modes — Each skill’s Red Flags section doesn’t list errors — it lists the things AI tells itself to justify skipping the skill. Preventing rationalization, not just prescribing behavior.
The 1% rule — using-superpowers mandates invocation if there’s even a 1% chance a skill applies. Deliberate overcorrection — using asymmetric constraints to offset AI’s systematic underestimation of skill necessity.
Self-referentiality — skill-creator lets the system extend itself using its own methodology.
Key Quotes
“Instructions say WHAT, not HOW. ‘Add X’ or ‘Fix Y’ doesn’t mean skip workflows.”
“Every project goes through this process. A todo list, a single-function utility — all of them.”