The AI Goalpost Is Ownership

[Image: A cartoon developer and AI assistant move abstract goalposts along a winding path of code blocks.]
The useful question is not whether the AI goalpost moved. It is whether the new goalpost measures ownership instead of spectacle.

Publiczny Profil's essay, "It Still Can't Do My Job," is a deliberately sharp timeline of AI coding arguments from ChatGPT's 2022 launch through July 2026, followed by forecasts for 2027, 2028, 2030, and 2033. Its central claim is that many dismissals of AI were locally reasonable, then became stale as the tools improved.

My read is that the essay matters less as a prediction scoreboard and more as a reminder that software people keep arguing about the wrong unit of work. Code generation is no longer the interesting boundary by itself. The real boundary is ownership: who understands the change, who verifies it, who can roll it back, and who is accountable when it reaches production.

Answer Snapshot

Question	My read
What happened?	A July 2026 essay framed four years of AI-coding debate as a series of moving goalposts, from broken toy programs to production accountability.
Why it matters	The strongest version of the argument is not "AI can do every job." It is that old capability objections decay quickly, so teams need better tests than vibes.
Who benefits if this is handled well?	Developers, engineering leaders, security teams, and product people who need AI tools to produce software work that can be audited, rather than just larger diffs.
My thesis	The goalpost should move. It should move from "can the model code?" to "can the organization own what the model helped ship?"

The Essay Is A Mirror

The source is persuasive because it does not pretend the skeptics were always wrong. It starts with the December 2022 Stack Overflow moment, when the site temporarily banned ChatGPT-generated answers because moderators said the average correctness rate was too low and the volume was swamping volunteer curation. That was a real, practical objection, not just anti-AI sentiment.

The later entries show why the objection could not stay frozen. Google said in its October 2024 earnings remarks that more than a quarter of new code at Google was AI-generated, then reviewed and accepted by engineers. METR's July 2025 study found experienced open-source developers took 19% longer with early-2025 AI tools on their own repositories, while METR later warned that result was historical and likely did not reflect early-2026 tool impact. Google DeepMind also reported that an advanced Gemini Deep Think model reached official gold-medal standard at the 2025 International Mathematical Olympiad, solving five of six problems for 35 points.

Those facts do not settle the job question. They do make lazy certainty look bad. A model can be dangerous to Stack Overflow quality in 2022, useful enough for reviewed code generation at Google in 2024, disappointing in one mature-repo productivity study in 2025, and impressive on elite math reasoning in the same year. All of that can be true at once.

[Image: A cartoon developer sorts abstract AI-generated code pieces into different review trays.]
The practical work is not accepting or rejecting AI output in the abstract. It is sorting what belongs in the product from what needs revision or deletion.

The Part I Would Keep

The best line of thought in the essay is that objections need expiry dates. "It cannot write Snake" was a useful complaint only while it could not write Snake. "It cannot write production software" is useful only if it names a concrete production test: which repository, which risk class, which review process, which rollback path, and which owner.
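One way to make that concrete is to treat an objection as a record that is only testable once all five specifics are filled in. The sketch below is illustrative, not a method from the essay: the `ProductionClaim` class, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ProductionClaim:
    """An objection such as "it cannot write production software".

    Hypothetical structure: the five optional fields mirror the five
    specifics named in the paragraph above.
    """
    statement: str
    repository: Optional[str] = None
    risk_class: Optional[str] = None
    review_process: Optional[str] = None
    rollback_path: Optional[str] = None
    owner: Optional[str] = None


def missing_specifics(claim: ProductionClaim) -> List[str]:
    """Return the names of the specifics the claim has not pinned down."""
    return [
        f.name
        for f in fields(claim)
        if f.name != "statement" and not getattr(claim, f.name)
    ]


def is_testable(claim: ProductionClaim) -> bool:
    """A claim is testable only when every specific is named."""
    return not missing_specifics(claim)


# A broad objection with nothing pinned down.
vague = ProductionClaim(statement="It cannot write production software")

# The same objection narrowed to one (made-up) production setting.
concrete = ProductionClaim(
    statement="It cannot safely change retry logic in the billing service",
    repository="billing-service",            # hypothetical repository name
    risk_class="payments",                   # hypothetical risk label
    review_process="two human approvals",    # hypothetical process
    rollback_path="feature flag off",        # hypothetical rollback
    owner="payments on-call engineer",       # hypothetical owner
)

print(missing_specifics(vague))
print(is_testable(concrete))
```

The point of the design is that the vague claim is not wrong, only unfalsifiable: there is nothing to rerun next year to see whether it has expired.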

This is where I think a lot of AI debate goes mushy. Boosters overstate the distance from demo to durable system. Skeptics sometimes treat every failure mode as permanent. The more useful stance is narrower: describe the task class, measure the output, and decide whether the human work moved upstream, downstream, or actually disappeared.

The Public Split Is The Point

The Hacker News discussion around the essay is already a good map of the divide. Some readers took the timeline as useful evidence that people keep underrating AI progress. Others pushed back that hallucination has not gone away, that executive claims often run ahead of reality, and that replacing jobs is not the same as making better software. A few focused on whether the post itself had an AI-written texture, which is also part of the cultural fatigue around this topic.

I would not summarize that reaction as a consensus. It is more useful as a taxonomy of concerns. The optimistic side sees compounding capability. The skeptical side sees unresolved verification, economics, and accountability. The pragmatic side asks where AI actually removes work instead of moving it into review, governance, or cleanup.

[Image: A cartoon developer and AI assistant balance speed, verification, and ownership as abstract symbols on a scale.]
The tradeoff is not speed versus humans. It is speed versus the verification and ownership needed to make that speed useful.

METR Is The Caution Label

METR's 2025 result is important because it interrupts both easy narratives. The study tested 16 experienced open-source developers on 246 real tasks in repositories they knew well, and the surprising result was slower completion when AI tools were allowed. Sean Goedecke's analysis of the study is useful because it points out why that setting matters: experienced maintainers working in large, mature codebases may already be very fast on familiar problems, and high-quality library or compiler work leaves less room for casual generated patches.

But METR's February 2026 update is just as important. METR said the old result was out of date, that later experiment data had selection problems, and that developers were likely sped up more by AI in early 2026 than in early 2025. The measurement problem got harder because developers increasingly did not want to work without AI, selected different tasks, and used multiple agents concurrently.

That is exactly why the Publiczny Profil essay lands. The goalpost did move, but not in a clean triumphal line. The measurement target changed because the workflow changed. If agents are running in parallel while developers supervise, pause, review, and context-switch, stopwatch productivity is harder to read than it was in older coding-time studies.

Ownership Is The New Benchmark

GitLab's June 2026 AI accountability research gives this debate a more concrete enterprise shape. The company reported that 91% of surveyed organizations had two or more AI coding tools in active use and 78% said developers were writing and committing code faster. But it also reported that 85% agreed AI had shifted the bottleneck from writing code to reviewing and validating it, and 82% said AI-generated code risked creating a new form of technical debt their organization was not prepared to manage.

That is the version of the goalpost I care about. If code is generated faster but the organization cannot tell where it came from, what it was meant to do, or who owns it in production, the gain is incomplete. It may still be worth using. It may even be transformative. But the work has not vanished. It has changed shape.

[Image: A cartoon developer checks glowing AI-generated code blocks before they pass through production gates.]
The production test is not whether code appears quickly. It is whether provenance, review, tests, permissions, and rollback are visible before the code crosses the threshold.

The Part I Would Not Buy Yet

The essay's forecast half is entertaining, but I would not treat it as evidence. A polished one-shot open-world game in 2027, a quarter-long legacy-monolith refactor in 2028, an AI on-call rotation in 2030, and a zero-employee billion-dollar founder in 2033 are story beats. They may age well or badly. They should not be smuggled into the same evidentiary bucket as Stack Overflow policy, Google earnings remarks, METR data, or DeepMind's certified IMO result.

The risk in the moving-goalpost frame is that it can make all caution look like denial. Some caution is denial. Some is discipline. A production engineer who asks about provenance, security, rollback, tests, latency, cost, and accountability is not necessarily moving the goalpost away from AI. They may be moving it toward the part of the job that was always real.

My Takeaway

I like the essay because it refuses the comforting idea that today's AI boundary will hold still. I distrust it when it makes the future feel more inevitable than measured. The right response is not panic, and it is not a shrug. It is to keep updating the benchmark.

For software teams, that benchmark should now be ownership. Show me the task. Show me the diff. Show me the tests. Show me what the model saw. Show me what it changed, what it guessed, what it skipped, and who is prepared to carry the pager after merge. If the AI can keep passing that test on harder and messier work, then the goalpost should move again. Until then, "it still can't do my job" is both too broad and too comfortable.
