Anthropic's New Benchmark Changes Everything—Most People Will Miss Why...
TLDR: AI progress is accelerating on a super-exponential curve, with Opus 4.5 showing about 5 hours of human-equivalent work at 50% and about 45.5 hours at 80%, and the pace roughly doubling every 4–4.5 months. By 2026 the game changes to defining, assigning, and managing work through AI agents (delegating tasks, coordinating multiple agents, and maintaining quality) rather than doing it all yourself. Deep domain expertise remains valuable, but the future of work centers on owning outcomes and building agent-driven workflows across professions.
AI progress is accelerating on a super-exponential curve, not merely an exponential one. Data points such as Opus 4.5 show human-equivalent work rising to roughly 5 hours at 50% and about 45.5 hours at 80%, a dramatic leap from earlier milestones. The pace appears to double roughly every 4 to 4.5 months, so gains can compound several times over within a single year. Waiting for a traditional ‘AI quarter’ may leave you behind; the wise move is to plan with a 6–12 month horizon. The key implication is that early action compounds as AI agents become more capable.
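To make the compounding concrete, here is a minimal sketch of the arithmetic. It assumes a constant doubling time, which is a plain exponential and therefore a simplification of the super-exponential claim, and it takes the "roughly 5 hours" figure as the starting point. The function name `projected_horizon` is illustrative, not something from the source.

```python
def projected_horizon(start_hours: float, months: float, doubling_months: float) -> float:
    """Project task length after `months`, assuming a fixed doubling time.

    Assumption: the doubling time stays constant. A super-exponential
    curve would have a shrinking doubling time, so this is a simplification.
    """
    return start_hours * 2 ** (months / doubling_months)


START_HOURS = 5.0  # "roughly 5 hours" figure quoted in the summary

# Compare the two ends of the quoted 4 to 4.5 month doubling range
# over the suggested 6 and 12 month planning horizons.
for doubling in (4.0, 4.5):
    for months in (6, 12):
        hours = projected_horizon(START_HOURS, months, doubling)
        print(f"doubling every {doubling} months, after {months} months: {hours:.1f} hours")
```

Under the 4-month assumption, twelve months is three doublings, which takes 5 hours to 40 hours, about one working week. That is consistent with the summary's framing of delegating a week of work.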
Identify tasks that would take you a week to complete. Start framing those tasks as agent-enabled workflows you can delegate to AI. In a super-exponential environment, the value of learning to delegate grows as quickly as the technology itself. Those who start this in January–March will have a head start when progress accelerates later in the year.
Establish a simple weekly workflow where one or more agents handle a defined set of tasks and you provide a human review at key milestones. Set clear success metrics such as accuracy, turnaround time, and the usefulness of outputs. Use versioning, audit trails, and regular retrospectives to improve agent behavior over time. This governance lets you benefit from fast progress while maintaining accountability for quality.
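One way to picture this governance loop is a small record of each agent result, checked against explicit thresholds and logged for later retrospectives. The sketch below is only an illustration: the class names, fields, and threshold values are all hypothetical and would need to be set for your own work.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentResult:
    task: str
    accuracy: float          # fraction of checks passed, 0.0 to 1.0
    turnaround_hours: float  # time from assignment to delivery
    usefulness: int          # human reviewer rating, 1 to 5


@dataclass
class WeeklyReview:
    # Hypothetical thresholds; tune them to your own quality bar.
    min_accuracy: float = 0.9
    max_turnaround_hours: float = 24.0
    min_usefulness: int = 3
    audit_trail: list = field(default_factory=list)

    def check(self, result: AgentResult) -> bool:
        """Return True if the result meets every metric, and log the decision."""
        passed = (
            result.accuracy >= self.min_accuracy
            and result.turnaround_hours <= self.max_turnaround_hours
            and result.usefulness >= self.min_usefulness
        )
        # The audit trail keeps every decision so retrospectives can revisit it.
        self.audit_trail.append({
            "task": result.task,
            "passed": passed,
            "checked_at": datetime.now(timezone.utc).isoformat(),
        })
        return passed


review = WeeklyReview()
accepted = review.check(AgentResult("draft status report", 0.95, 6.0, 4))
rejected = review.check(AgentResult("summarize contracts", 0.70, 2.0, 4))
print(accepted, rejected, len(review.audit_trail))
```

Results that fail the check are the ones to route to human review at the next milestone. The logged entries give the weekly retrospective something concrete to examine.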
Once you can cover a week of work with an agent, add a second and then a third to amplify output. In a power-law world, small increases in team capability yield outsized gains, so your productivity can grow faster than linearly. Design orchestration where tasks flow between agents and humans, rather than relying on a single agent. This scalability puts you ahead of peers who wait for a later ‘AI quarter’ to start.
Technical skills will spread across job families, and engineers will need business and customer fluency to architect agent-enabled systems. Learn to communicate goals, constraints, and acceptance criteria to agents and to non-technical teammates. Develop workflows that enable diverse contributors to participate and improve outcomes. The rise of longer-running agents will change how many professions work, but domain expertise will remain essential.
The future rewards those who own the work and steer it toward useful outcomes. By 2026, you’ll be asked to define, assign, and manage a week’s worth of agent-driven work, not just perform tasks yourself. Expect a surge of outputs and noise; you must judge which agent-produced results are meaningful and yield compounding value. Even in domains like law, deep domain knowledge remains valuable while agents handle repetitive tasks. Become an individual strategist who leads a team of agents to create a lasting competitive advantage.
PTR does not top out and TR has no upper bound, unlike benchmarks that saturate near 100%. This supports a super-exponential progress trajectory rather than a simple exponential one.
Opus 4.5 shows about 4 hours 45 minutes (nearly 5 hours) of human-equivalent work at 50%, and about 2728 minutes (roughly 45.5 hours) at 80%, a dramatic advance over earlier benchmarks.
The pace appears to double roughly every 4 to 4.5 months.
By 2026 you may be able to delegate a week’s worth of work to AI agents. Those who act in the early months (January–March) will have an easy advantage, and the big question will be whether you can delegate a week’s work and let go of much of what you do now.
AI will increasingly train AI itself and automate more, speeding up progress with no apparent upper limit.
The key skills are the ability to define, assign, and manage work through agents, a mix of technical and non-technical skills, and business and customer fluency. Domain expertise remains valuable as you lead agents to create value.
Work will be organized around outcomes and ownership; individuals become strategists who manage teams of agents, driving a compounding advantage across domains.
Yes—traditional job-family thinking should be abandoned in favor of outcome- and ownership-obsessed work, with a much higher volume of agent-produced output to judge for usefulness and quality.
Decades of experience remain valuable and some tasks will transform, but business understanding and domain expertise will still be critical; white-shoe law firms won’t be fully replaceable by non-lawyers.
Claude, Gemini, ChatGPT, and other model makers are driving progress, and similar exponential gains in agent-working time are expected from multiple players.
Whether AI progress is on an exponential or super-exponential curve; current evidence points to the latter.