The Anthropic postmortem is the single highest-signal calibration device available. Between March 4 and April 20, 2026, three product-layer changes stacked at Anthropic: reasoning effort was silently lowered, thinking history was cleared every turn, and system-prompt updates were applied between tool calls. Internal evals "did not initially reproduce the issues identified" despite multiple human and automated code reviews, unit tests, end-to-end tests, automated verification, and dogfooding. AMD's forensic analysis (234,760 tool calls across 6,852 sessions) documented the regression at scale. User /feedback was the only detection mechanism that worked.
The PM Phase 3 briefing identifies this as Probe 2 (Anthropic Postmortem Calibration) for hiring, but the same calibration applies at the product level. Product.ai has eval-bypass risks structurally identical to Anthropic's. Identifying them and shipping deterministic mitigations is the load-bearing PM and AI Engineer work in 2026. The candidate produces the diagnosis, ships the gate, and writes the case study that becomes the calibration device for future hiring and internal verification reviews.
The operating principles we work by. If they resonate, the rest of this will land. Open the Codex →
Hireflix, async. Questions are calibrated to this project specifically.
Direct call with the CEO. Strategic alignment and mutual fit. No problem-solving exercise.
1099 contractor agreement, NDA, paid at your stated rate. Day 1 in Santa Monica.
Alpha Team members can take this project without the screen-and-call sequence. Reach out via the Alpha Team channel.