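This is a reading page converted from the original source material. It keeps the main structure of the source file and adds traceable source notes and links.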
AGENTS.md — Effective harness for long-running coding agents (ready-to-run)
This file is the “shift handoff protocol” for long-running work (spanning many sessions / context windows).
Mental model: treat each session like a new engineer joining mid-project with no memory. Your job is to:
1) get up to speed fast, 2) make one verifiable increment, 3) leave durable artifacts so the next session can continue without guessing.
This template is designed to work well with Codex CLI (codex).
0) Non-negotiable rules
- No one-shotting. Each session completes exactly one feature (or one small bugfix) end-to-end.
- No “declare victory”. A feature only becomes passes: true after real verification.
- Leave the repo clean. End the session in a state that could be merged:
  - git status is clean (unless explicitly explained in progress notes)
  - changes are committed with a meaningful message
- Artifacts over memory. The next session must be able to resume by reading files + git history only.
1) Required harness artifacts (in repo)
Place these in the repository root (or document the exact paths here):
- init.sh - one command to (a) start the dev environment and (b) run a smoke test
  - should exit non-zero if anything is broken
- feature_list.json - a structured, end-to-end feature checklist
  - every item has passes: false/true
  - coding sessions should only modify passes (and may append new items if the list is missing scope)
- progress.md (or progress.log) - append-only shift log
  - must include: what you did, commands you ran, results, commit hashes, next steps
- Git history
  - each session ends with a commit; git is your rollback-able memory
Templates live at: /srv/project/harness-engineering/templates/*
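The templates themselves are not reproduced here. As a rough illustration of the init.sh contract described above (start the environment, run a smoke test, exit non-zero on any failure), a minimal sketch might look like the following. START_CMD, SMOKE_CMD, run_step, and init_main are hypothetical names chosen for this example, and the `true` defaults are placeholders to replace with your project's real commands.

```shell
#!/usr/bin/env bash
# init.sh (sketch): start the dev environment, then run a smoke test.
# Any failing step makes the script exit non-zero.
set -euo pipefail

# Hypothetical placeholders: replace with your project's real commands.
START_CMD="${START_CMD:-true}"
SMOKE_CMD="${SMOKE_CMD:-true}"

# Run one labelled step; report on stderr and return 1 if it fails.
run_step() {
  local label="$1"; shift
  echo "==> ${label}"
  if ! "$@"; then
    echo "FAILED: ${label}" >&2
    return 1
  fi
}

init_main() {
  run_step "start dev environment" bash -c "$START_CMD" || return 1
  run_step "smoke test" bash -c "$SMOKE_CMD" || return 1
  echo "init.sh: OK"
}

init_main
```

Keeping each step behind run_step means the failing stage is named in the output, which gives the next session something concrete to read when the baseline is broken.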
2) Session protocol (every coding session)
2.1 Get your bearings (target: < 5 minutes)
Run these in order:
1) Confirm where you are + repo health
pwd
ls
git status
2) Read the handoff log
test -f progress.md && sed -n '1,200p' progress.md || true
3) Read recent commits
git log --oneline -20
4) Inspect feature list and pick the highest-priority passes=false
test -f feature_list.json && head -n 160 feature_list.json || true
5) Start + smoke test (mandatory)
bash ./init.sh
If smoke test fails: stop. Fix the breakage first. Don’t start new work on top of a broken baseline.
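Step 4 above leaves the selection itself to the reader. One possible way to list the pending items without extra tooling is sketched below. It assumes a layout this document does not specify: feature_list.json is pretty-printed with one key per line, each item has a "description" key that appears before its "passes" key, and list order doubles as priority order. The function names pending_features and next_feature are hypothetical.

```shell
# Assumed (hypothetical) layout of feature_list.json:
# [
#   {
#     "description": "User can log in",
#     "passes": false
#   }
# ]

# Print the description of every item whose passes value is false.
pending_features() {
  local file="${1:-feature_list.json}"
  awk '
    /"description"[ \t]*:/ {
      desc = $0
      sub(/^[^:]*:[ \t]*"/, "", desc)   # drop the key and the opening quote
      sub(/",?[ \t]*$/, "", desc)       # drop the closing quote and comma
    }
    /"passes"[ \t]*:[ \t]*false/ { print desc }
  ' "$file"
}

# Print only the first pending item (list order is treated as priority).
next_feature() {
  pending_features "$@" | head -n 1
}
```

This is a line-oriented shortcut, not a JSON parser. If the file is minified or the keys are ordered differently, a real JSON tool is the safer choice.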
2.2 Implement exactly one feature (increment)
- Implement the chosen feature.
- Verify it end-to-end (not just “unit tests passed”).
- Only after verification, flip that item’s passes to true in feature_list.json.
Practical note (my take): if you cannot design a reliable verification step for a feature, it’s not ready to be marked passing. Add missing steps/tests/tools first.
2.3 Close the session (handoff + commit)
1) Append to progress.md:
- which feature you targeted (quote the description)
- what changed (key files)
- commands you ran (esp. test/smoke/e2e)
- results + any remaining issues
- next recommended feature to tackle
2) Commit everything
git add -A
git commit -m "feat: <short summary>"
3) Final check
git status
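The handoff entry in step 1 is easier to keep consistent if it is written by a small helper. The sketch below appends one entry with the five fields listed above. The entry layout, the PROGRESS_FILE variable, and the function name append_progress_entry are assumptions made for this example, not a format the protocol prescribes.

```shell
# Append one handoff entry to the progress log (append-only: uses >>).
# Arguments: feature description, changed files, commands run, results, next feature.
append_progress_entry() {
  local feature="$1" changed="$2" commands="$3" results="$4" next="$5"
  local log="${PROGRESS_FILE:-progress.md}"   # hypothetical override variable
  {
    printf '\n## %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    printf '%s\n' "- Feature: \"${feature}\""
    printf '%s\n' "- Changed: ${changed}"
    printf '%s\n' "- Commands: ${commands}"
    printf '%s\n' "- Results: ${results}"
    printf '%s\n' "- Next: ${next}"
  } >> "$log"
}
```

For example, `append_progress_entry "User can reset password" "src/auth.py" "bash ./init.sh" "smoke test passed" "User can log out"` would add one timestamped entry, after which the commit in step 2 picks up the updated log.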
3) Codex CLI: copy/paste launch commands
Replace <REPO_DIR> with your repository path (e.g. /srv/project/repos/myapp).
3.1 Initializer session (first run only)
Goal: create the harness artifacts and an initial commit. Do not implement product features here.
codex exec -m gpt-5.4 -C <REPO_DIR> --full-auto - <<'PROMPT'
You are the initializer agent. Your job is to set up a long-running harness for this repository.
Deliverables (repo root):
1) init.sh: installs/checks deps as appropriate, starts the dev environment, runs a minimal smoke test, exits non-zero on failure.
2) feature_list.json: a structured end-to-end feature checklist derived from README and the codebase; initialize all passes=false.
3) progress.md: write the first handoff entry: what you created, how to run init.sh, how to pick the next feature.
Process:
- Run init.sh once to verify the script works (if it requires manual pre-steps, document them precisely in progress.md).
- Create an initial git commit with message: "chore: initialize agent harness".
Constraints:
- Don’t attempt to implement product features.
- Prefer JSON structure for the feature list; later sessions should only flip passes.
PROMPT
If you prefer interactive (pair-programming style):
codex -m gpt-5.4 -C <REPO_DIR>
3.2 Coding session (every subsequent run)
codex exec -m gpt-5.4 -C <REPO_DIR> --full-auto - <<'PROMPT'
You are the coding agent (shift engineer). You must complete exactly ONE passes=false feature end-to-end.
Mandatory protocol:
1) git status; git log --oneline -20
2) read progress.md and feature_list.json
3) run: bash ./init.sh (smoke test). If it fails, fix it first.
4) choose the highest-priority passes=false feature; implement only that.
5) verify end-to-end; only then flip passes to true.
6) append progress.md with: what you did, commands, results, commit hash, next step.
7) git add -A && git commit -m "feat: <short summary>"
8) ensure git status is clean.
Hard rules:
- Do not mark passes=true without verification.
- Do not edit feature descriptions/steps except to append new items when scope is missing.
- Leave the repo in a merge-ready state.
PROMPT
3.3 Resume a prior interactive Codex session (optional)
codex resume --last
4) Why JSON for feature_list.json (practical reason)
My observation matches the blog’s: models are much more likely to “helpfully rewrite” Markdown than JSON. JSON’s rigidity helps enforce the rule:
- later sessions change only passes.
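The rule can also be checked mechanically rather than trusted. A minimal sketch, assuming feature_list.json is pretty-printed with one key per line, compares two versions of the file and fails if any changed line does not mention the passes key. The function name only_passes_changed and the /tmp path in the usage comment are hypothetical.

```shell
# Succeed (exit 0) only if every line that differs between the two files
# mentions the "passes" key. Assumes one key per line (pretty-printed JSON).
only_passes_changed() {
  local old="$1" new="$2"
  # diff prefixes removed lines with "<" and added lines with ">".
  if diff "$old" "$new" | grep -E '^[<>]' | grep -vq '"passes"'; then
    return 1   # some changed line touches something other than "passes"
  fi
  return 0
}

# Possible usage before committing:
#   git show HEAD:feature_list.json > /tmp/feature_list.prev.json
#   only_passes_changed /tmp/feature_list.prev.json feature_list.json \
#     || echo "feature_list.json changed beyond passes; review the diff"
```

Appending new items for missing scope is allowed by the protocol but will also trip this check, so a failure is best read as a prompt to review the diff, not as proof of a violation.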
5) Suggested definition of done (DoD) for a feature
A feature can be marked passes: true only if:
- smoke test via init.sh still passes
- the feature’s listed steps can be reproduced reliably
- you can explain in progress.md how you validated it
6) Optional: scaling the harness (future improvement)
If you notice repeated failure modes, consider splitting roles (even if it’s still one model):
- testing-focused session (improve init.sh + e2e)
- cleanup/refactor session (reduce tech debt)
- feature session (pure increments)
Keep the same artifact protocol so every role can hand off cleanly.
Sources and references
Source file: anthropic/AGENTS.md
Source directory: /srv/project/harness-engineering