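This is a reading page converted from source material. It preserves the main structure of the source file and adds traceable source notes and links.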
Program: Research Loop
Mission
Your goal is to improve a research target through repeated, bounded experimentation.
Primary metric:
-
Secondary constraints:
- compute budget
- memory budget
- implementation simplicity
- result stability
Scope
You may modify only the experiment surface defined by the human.
You must not modify:
- the evaluation harness
- protected data preparation logic
- dependency definitions
- system-level environment unless explicitly permitted
Setup
Before experimentation:
1. read the full in-scope context
2. identify the editable and non-editable surfaces
3. verify baseline artifacts exist
4. create or validate a structured results log
5. confirm how crashes and timeouts are handled
Baseline Rule
The first run must be the unmodified baseline. Never begin optimization without a recorded baseline.
Experiment Design Rule
Each run should test one main hypothesis. Do not mix multiple unrelated research ideas in one experiment unless the program explicitly calls for combination testing.
Loop
Repeat:
1. inspect current best-known state and recent history
2. choose the next experiment based on evidence, not randomness
3. implement the change
4. execute the run within budget
5. parse the result
6. log the run
7. decide keep / discard / retry
8. revert if not advancing
9. continue
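The steps above can be sketched as a small driver function. This is a minimal illustration under stated assumptions, not the program's actual implementation: the hook names (`propose`, `apply_change`, `execute`, `revert`, `log`) are hypothetical, a higher metric is assumed to be better, the retry branch of step 7 is omitted, and `max_runs` is added only so the sketch terminates.

```python
def research_loop(propose, apply_change, execute, revert, log,
                  baseline_metric, max_runs):
    """Run a bounded keep-or-revert search; all hooks are hypothetical."""
    best = baseline_metric          # the recorded baseline is the starting point
    history = []
    for _ in range(max_runs):
        idea = propose(best, history)      # steps 1-2: inspect state, choose next
        if idea is None:                   # nothing left to try
            break
        apply_change(idea)                 # step 3: implement the change
        metric = execute(idea)             # steps 4-5: run and parse; None on failure
        kept = metric is not None and metric > best
        entry = {"idea": idea, "metric": metric,
                 "status": "keep" if kept else "discard"}
        log(entry)                         # step 6: log every run, kept or not
        history.append(entry)
        if kept:
            best = metric                  # step 7: keep and advance the best state
        else:
            revert(idea)                   # step 8: revert if not advancing
    return best, history
```

Passing the hooks in as arguments keeps the loop itself free of any knowledge about the experiment surface, which matches the scope restriction described earlier.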
Search Heuristics
Prefer this order unless evidence suggests otherwise:
1. low-cost, high-signal changes
2. simplifications that may improve optimization
3. targeted architecture or method changes
4. combinations of previously promising ideas
5. radical changes only when local search saturates
Keep / Discard Policy
Keep if:
- the metric improves meaningfully
- the metric is comparable but the method becomes clearly simpler or more robust

Discard if:
- the metric worsens
- resource cost rises too much for the gain
- the idea adds unjustified complexity
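The policy leaves "meaningfully", "comparable", and "too much" undefined, so any implementation has to pick thresholds. The sketch below shows one possible encoding. The function name, the threshold values (`min_gain`, `tolerance`, `max_cost_ratio`), and the assumption that a higher metric is better are all illustrative choices, not values taken from the program.

```python
def decide(candidate_metric, best_metric, candidate_cost, best_cost,
           is_simpler=False, min_gain=0.005, tolerance=0.001,
           max_cost_ratio=1.5):
    """Return "keep" or "discard". Thresholds are placeholder assumptions."""
    gain = candidate_metric - best_metric
    cost_ratio = candidate_cost / best_cost

    # Meaningful improvement: keep unless the resource cost rose too much.
    if gain >= min_gain:
        return "keep" if cost_ratio <= max_cost_ratio else "discard"

    # Comparable metric: keep only if the method became clearly simpler.
    if gain >= -tolerance and is_simpler:
        return "keep"

    # Worse metric, or no gain to justify the added complexity.
    return "discard"
```

For example, under these placeholder thresholds `decide(0.92, 0.90, 100, 100)` keeps the run, while the same gain at twice the cost is discarded.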
Crash Policy
If a run crashes:
- inspect the immediate failure
- fix trivial mistakes once if appropriate
- otherwise log crash and move on
Do not spend too much budget rescuing a weak direction.
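The "fix once, then move on" rule can be expressed as a wrapper that allows exactly one repair attempt. This is a hedged sketch: `run`, `try_trivial_fix`, and `log` are hypothetical callables supplied by the caller, and what counts as a "trivial" mistake is left to `try_trivial_fix`.

```python
def run_with_one_fix(run, try_trivial_fix, log):
    """Run once; on a crash, allow at most one trivial fix before giving up.

    Returns the run's result, or None if the run was logged as a crash.
    """
    try:
        return run()
    except Exception as first_error:
        # try_trivial_fix returns True only if it actually repaired something.
        if not try_trivial_fix(first_error):
            log("crash", repr(first_error))
            return None

    try:
        return run()                      # the single permitted retry
    except Exception as second_error:
        log("crash", repr(second_error))  # no further rescue attempts
        return None
```

Capping the retry count in code, instead of leaving it to judgment on each crash, is one way to keep rescue effort from consuming the budget.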
Time Budget Policy
If a run exceeds the allowed budget:
- stop it
- log it as failure or timeout
- revert unless policy explicitly says to retry
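If runs are launched as child processes, the standard library's `subprocess.run` with a `timeout` is one way to enforce the budget. The sketch below assumes that setup; the function name and the status strings are illustrative. Note that the timeout stops the direct child process, so a run that spawns its own workers may need extra cleanup.

```python
import subprocess


def run_within_budget(cmd, budget_seconds):
    """Run cmd under a wall-clock budget.

    Returns (status, stdout), where status is "ok", "failure", or "timeout".
    """
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=budget_seconds,   # child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "failure"
    return status, proc.stdout
```

The caller is still responsible for logging the returned status and reverting the change, as the policy above requires.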
Logging Schema
Minimum fields:
- commit or state id
- primary metric
- resource usage
- status
- description
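One way to hold these minimum fields is an append-only JSON Lines file with one record per run. The program does not prescribe a storage format, so everything here is an assumption made for illustration: the `RunRecord` name, the field names, the example status values, and the helper functions.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import Optional


@dataclass
class RunRecord:
    """One entry in the results log; field names are illustrative."""
    state_id: str                     # commit hash or other state identifier
    primary_metric: Optional[float]   # None when the run produced no metric
    resource_usage: dict              # e.g. {"wall_seconds": 312.0}
    status: str                       # e.g. "keep", "discard", "crash", "timeout"
    description: str


def append_record(log_path: Path, record: RunRecord) -> None:
    """Append one record as a single JSON line."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")


def read_records(log_path: Path) -> list:
    """Load every record from the log, skipping blank lines."""
    with log_path.open(encoding="utf-8") as f:
        return [RunRecord(**json.loads(line)) for line in f if line.strip()]
```

Appending one line per run means crashed and timed-out runs leave a record too, so the history stays complete even when no metric was produced.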
Human Override
Do not pause between normal iterations. Pause only when:
- required prerequisites are missing
- the task exceeds scope
- human intent is ambiguous
- a safety or governance boundary is reached
Principle
Your role is not to appear creative. Your role is to run a disciplined search process that accumulates reliable evidence over time.
Sources and references
Source file: autoresearch/programs/research-loop-program.md
Source directory: /srv/project/harness-engineering