
Last episode explained why outsiders could understand the role system: all three signals were satisfied. This episode examines the same question from the opposite side: why couldn’t they understand the core?
The Meaning of Zero
Look at these numbers again:
- External PRs touching roles/*.md: 5
- External PRs touching harness core code: 0
- External PRs touching gate logic: 0
- External PRs touching review flow: 0
- External PRs touching the extension system: 0
Zero isn’t “not yet.” Zero is a signal.
It indicates a selection process: capable people looked at the core code, assessed the uncertainty, and chose not to touch it. This isn’t a lack of ability — anyone who can read flow-transition.mjs and understand its control flow has real technical skill. But understanding control flow and understanding design intent are separated by a gap that reading more code won’t close.
163 people starred the repo. 39 people forked it. Some of them may have opened bin/opc-harness.mjs, read flow-transition.mjs, looked at gate-protocol.md. Then — closed the file and went to write a role PR instead.
PR #12 is the only external PR that came close to the core. A very small fix: wrapping JSON.parse in flow-transition.mjs with a try-catch guard, preventing malformed input from causing an unhandled exception crash. One line of defensive code. Logical, safe, minimally invasive.
It was closed without merging.
Why? Not because the fix itself was wrong — the logic was perfectly sound. But in the context of the gate state machine, that one try-catch raised more questions: What should the catch block return? FAIL? Or a special ERROR state? How would this ERROR state be handled by downstream gate counters? If the parse failure means the upstream node’s output format is broken, should the system loop back upstream or terminate directly?
A one-line fix pulled the state machine’s entire set of design decisions into question. If you don’t understand the tradeoffs behind those decisions, your try-catch might introduce more problems than it solves.
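To make those questions concrete, here is a minimal sketch of such a guard. It is not the code from flow-transition.mjs: the `parseNodeOutput` helper and the state names `OK` and `ERROR` are assumptions made up for illustration. What it shows is that the catch block has to return something, and whatever it returns becomes a new case that downstream code must handle.

```javascript
// Hypothetical sketch, not the actual flow-transition.mjs code.
// The helper name and the "OK"/"ERROR" state names are assumptions.
function parseNodeOutput(raw) {
  try {
    return { state: "OK", value: JSON.parse(raw) };
  } catch (err) {
    // The guard itself is trivial. The design decision is what to return:
    // mapping this to FAIL would count against the loop cap, while a
    // separate ERROR state forces every downstream consumer to handle
    // a third case.
    return { state: "ERROR", reason: err.message };
  }
}

console.log(parseNodeOutput('{"verdict":"PASS"}').state); // OK
console.log(parseNodeOutput("{not json").state);          // ERROR
```

Either return value is defensible in isolation. Choosing between them requires knowing how the gate counters and loop-back edges treat each state, which is exactly the context an outside contributor lacks.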
Three Walls Around the Core
EP03 showed that when outsiders touch role files, all three signals (understandable, modifiable, safe to break) are satisfied. Core code presents the exact opposite — three walls standing between contributors and the code.
Wall 1: Tacit Knowledge
Open flow-transition.mjs and you see code. What you don’t see:
Why digraph instead of DAG? OPC’s pipeline allows loops (gate FAIL → back to build node), so a directed acyclic graph won’t work. But loops have caps (maxLoopsPerEdge=3), so it’s not an arbitrary graph either. This design decision isn’t written down anywhere — it’s in my head.
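A digraph with capped loops might look roughly like the sketch below. Only the `maxLoopsPerEdge = 3` value comes from the text; the `createLoopTracker` helper, the edge names, and the reading of the cap as "at most three traversals per edge" are assumptions for illustration.

```javascript
// Hypothetical sketch of a per-edge loop cap; not OPC's implementation.
const maxLoopsPerEdge = 3; // value taken from the text

function createLoopTracker(cap = maxLoopsPerEdge) {
  const counts = new Map(); // "from->to" -> traversals so far
  return {
    // Returns true if the edge may be taken again, false once the cap is hit.
    tryTraverse(from, to) {
      const key = `${from}->${to}`;
      const used = counts.get(key) ?? 0;
      if (used >= cap) return false;
      counts.set(key, used + 1);
      return true;
    },
  };
}

const tracker = createLoopTracker();
const results = [];
for (let i = 0; i < 4; i++) results.push(tracker.tryTraverse("gate", "build"));
console.log(results); // [ true, true, true, false ]
```

The counter is per edge rather than global, so a loop between gate and build does not consume the budget of any other loop in the graph.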
Why does synthesize use emoji counting instead of LLM judgment? S1E07’s conclusion: mechanical gates beat LLM gates because LLM gates are susceptible to tone anchoring. The context for this decision is scattered across S1E07’s narrative — not in code comments, not in an ADR (Architecture Decision Record).
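As a rough illustration of what "mechanical" means here, the sketch below derives a verdict purely from marker counts. The specific emoji, the `mechanicalGate` name, and the rule "any blocking marker means FAIL" are all assumptions; the source does not say which markers or thresholds synthesize actually uses.

```javascript
// Hypothetical sketch of a mechanical gate. The marker emoji and the
// threshold rule are assumptions, not OPC's actual synthesize logic.
function countMarker(text, marker) {
  return text.split(marker).length - 1;
}

function mechanicalGate(reviewText) {
  const blocking = countMarker(reviewText, "🔴");
  const warnings = countMarker(reviewText, "🟡");
  // The verdict depends only on counts, never on the reviewer's tone.
  return { blocking, warnings, verdict: blocking === 0 ? "PASS" : "FAIL" };
}

console.log(mechanicalGate("🟡 naming is inconsistent").verdict); // PASS
console.log(mechanicalGate("Great work overall! 🔴 missing auth check").verdict); // FAIL
```

The second call shows the point of the design: friendly phrasing around a blocking finding does not soften the result, because the gate never reads the phrasing.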
Why is maxLoopsPerEdge set to 3? Not calculated. It was a judgment call during early S1 experimentation. 2 was too few (sometimes first-round review feedback needs two iterations to fully address), 5 was too many (beyond 3 rounds suggests the problem isn’t in details but in direction). The rationale for this number isn’t in the code — it’s in a session that was compacted and cleaned up two months ago.
This is tacit knowledge. Code tells you “what is,” not “why not something else.” The real cost of understanding the core isn’t reading N lines of code — it’s reconstructing the context of N abandoned alternatives.
Cline’s 3,764-line Cline.ts (analyzed in S2E04) has the same problem. You can see it uses one giant switch-case to handle all tool calls. What you can’t see: Why not split it into multiple files? Was it intentional design or tech debt? What would break if you split it? Nobody knows — except the person who wrote it.
The trickiest aspect of tacit knowledge is that its holder often doesn’t realize it’s tacit. As the maintainer, flow-transition.mjs reads clearly to me — because I carry the full context in my head. Strip away that context, and the code is just syntax. This is the curse of knowledge: maintainers genuinely believe the code speaks for itself, because for them, it does.
Wall 2: Debug Environment
Modifying a role file requires running no code. Modifying core code requires:
- A complete local environment: Node.js, npm, correct dependency versions
- Real test execution: bash test/run-all.sh running 109 test files
- End-to-end verification: Not just unit tests passing; you need to run a complete pipeline (init → build → review → gate → transition) to confirm the change doesn’t break state transitions
- Claude Code API key: OPC’s review nodes call real LLMs. Testing requires API access
Point 4 is the killer. OPC isn’t a purely deterministic system — review node output depends on LLM responses. You changed gate logic and want to verify it correctly handles a FAIL judgment? You need a real review output, and that requires an API call.
Verification cost for role files is zero — no verification needed. Verification cost for core code is setting up a complete environment plus consuming API costs.
Wall 3: Rollback Confidence
You submit a PR for a role file, it gets merged, turns out there’s an issue. Rollback: git revert, delete the file, done. Impact scope: one review node loses one perspective.
You submit a PR for gate logic, it gets merged, turns out there’s an issue. Rollback: git revert — but wait. Before the revert, how many pipeline runs passed through the modified gate? Were those runs’ judgments correct? If a review that should have been FAIL was incorrectly judged as PASS and proceeded to the next node, reverting the gate logic doesn’t undo the incorrect judgment that already happened.
Role file rollback is stateless — just delete the file. Core code rollback may be stateful — judgments already made can’t be undone.
Issuing the “You Can Change This” Invitation
The problem is now clearly defined. The core has three walls: tacit knowledge, debug environment, rollback confidence. This isn’t the contributors’ problem — it’s the core maintainer’s (mine) problem.
If the core is to become a public asset, these three walls need to be addressed head-on:
Addressing tacit knowledge: Write Architecture Decision Records (ADRs). Not code comments — comments explain “what this line does”; ADRs explain “why we chose A over B.” One record per key design decision: why digraph over DAG, why emoji counting over LLM judgment, why maxLoopsPerEdge=3. Let outsiders reconstruct decision context, not just see decision outcomes.
Addressing debug environment: Provide a minimal local verification path. A Dockerized test environment where docker run opc-test runs all tests. Provide mock review outputs so gate logic can be tested without API keys. Lower the verification entry barrier.
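One way mock review outputs could work is to inject the reviewer as a function, so a test can pass a stub instead of a client that calls a real LLM. The sketch below assumes this structure; `runReviewStep`, `mockReviewer`, `simpleGate`, and the canned review texts are all hypothetical names and data, not part of OPC.

```javascript
// Hypothetical sketch: the reviewer is injected, so tests can use a stub
// instead of a real LLM client. All names here are assumptions.
async function runReviewStep(artifact, reviewer, gate) {
  const reviewText = await reviewer(artifact); // a real LLM call in production
  return gate(reviewText);
}

// Canned review outputs standing in for recorded LLM responses.
const mockReviews = {
  clean: "No blocking issues found.",
  broken: "BLOCKER: state transition skips the gate.",
};
const mockReviewer = (key) => async () => mockReviews[key];

// A stand-in gate; any deterministic function of the review text works.
const simpleGate = (text) => (text.includes("BLOCKER") ? "FAIL" : "PASS");

runReviewStep("diff", mockReviewer("broken"), simpleGate)
  .then((verdict) => console.log(verdict)); // FAIL
```

With this shape, a contributor can exercise both the PASS and FAIL paths of the gate logic offline, and the API key is only needed for the final end-to-end run.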
Addressing rollback confidence: Explicitly label change impact scope. An IMPACT.md beside each core module: what downstream behaviors does changing this file affect? What’s the worst failure mode? How do you verify the change is correct? Let contributors assess risk before submitting a PR.
Public Code ≠ Public Understanding
This episode’s core argument in one sentence:
Public code does not equal public understanding.
OPC’s code is on GitHub, readable by anyone. But the distance between “can read” and “understand enough to safely modify” is far greater than most maintainers realize.
Pushing code to GitHub is the starting point of open source, not the finish line. Code is just the visible tip of the iceberg. Below the waterline: why this line is written this way, why not that way, which historical attempts failed, which implicit assumptions still hold. This information determines whether a modification is safe or dangerous.
PR #12 wasn’t closed because the fix was wrong. It was closed because even though the fix was correct, the external contributor couldn’t confirm its behavior within the complete state machine — and the maintainer (me) hadn’t provided enough context for them to confirm it.
When people only change leaves and won’t touch bones, the problem isn’t that they’re timid. The problem is the bones have no labels — no map saying “you can touch here,” no guide saying “here’s how to verify after touching,” no assessment saying “here’s how much impact breaking this would have.”
From “My Tool” to “Our Tool”
From EP01 through EP04, the picture of trust transfer grows clearer:
- Infrastructure layer (EP01): Won’t run on Linux. Trust stalls at installation before it even starts.
- Pattern layer (EP03): Role format is self-explaining; outsiders understood and used it. Trust established at the leaf level.
- Contribution layer (EP02): Five role PRs submitted. Trust at the leaf level produced action.
- Core layer (EP04, this episode): Zero core PRs. Trust stopped at the core layer. Not because nobody showed up, but because the core never issued an invitation.
Between “my tool” and “our tool,” the gap isn’t just code quality — it’s knowledge transferability.
Next episode returns to the infrastructure layer: turning a tool that only runs on my machine into one anyone can run on any platform. Not just fixing one capitalization issue — systematically cleaning up all implicit environment assumptions.