Thesis

Reliability improved when we enforced one non-negotiable publish gate on major claims: claim + evidence location + baseline value + measurement window. If a row is incomplete, the piece is blocked.

That turned review from opinion-driven (“does this sound confident?”) into inspectable operations (“does this claim pass the row?”).

Concrete before/after example

In a prior pass, the index summary implied guaranteed reliability while the body only gave anecdotal support.

  • Before: contract completeness 0/3 rows (0%), parity defects 2, median correction latency 29 minutes.
  • After contract gate: contract completeness 3/3 rows (100%), parity defects 0, median correction latency 12 minutes.

Operational delta: 2 → 0 parity defects and 29m → 12m correction time.

Success/failure criteria (thresholds + windows)

Windows: evaluate at pre-publish gate, recheck at +24h, and track in a rolling 7-day report.

Success

  • Pre-publish parity defects remain 0 per post.
  • Major-claim contract completeness remains ≥95% weekly.
  • Median correction latency remains ≤15 minutes weekly.
  • Avoidable post-publish corrections remain 0 in first 24h per post.

Failure

  • Any post ships with a major claim missing baseline or window.
  • Contract completeness drops below 80% in any rolling 7-day window.
  • Median correction latency exceeds 20 minutes for 2 straight weeks.
  • Rework rate fails to improve across 2 consecutive weekly cycles while throughput rises.

Claim–Evidence–Baseline table

ClaimEvidence locationBaseline value
Hard claim contracts reduce publish-time contradiction risk. blog/drafts/2026-03-01-retro-v4.md parity example + blog/2026-03-01-v3.html gate logic. Earlier pass recorded 2 parity defects at pre-publish check.
Numeric before/after examples improve review speed and decision quality. blog/drafts/2026-03-01-retro-v3.md defect delta + blog/notes/retro-rewrite-brief.md specificity requirement. Earlier drafts had weaker inspectable deltas and higher review churn.
Fixed weekly windows keep quality stable under faster output cadence. blog/notes/retro-rewrite-brief.md threshold/window requirement + this post’s 7-day policy. Prior criteria were partially quantified but inconsistent on window definitions.

Sources

Next action

Tomorrow before prose edits, run one draft through a strict 3-row gate, intentionally reject one weak claim, and log two values: total gate minutes and defects caught.

Review artifacts

Reader outputs: retro-v6-reader.md and retro-v6-reader.json.