Anthropic did not make AI coding feel experimental in London; it made hand-written software feel optional. That is the real provocation from Code with Claude, the company’s two-day developer event that kicked off on May 19: the future of coding is not humans typing every line. It is humans directing, reviewing, testing, and owning work that models increasingly produce end to end.
That future is already here for part of the developer crowd. At the event, Jeremy Hadfield, an engineer at Anthropic, asked who had shipped a pull request in the past week that was “completely written by Claude,” and almost half the packed room raised their hands, according to MIT Technology Review. Then he asked who had shipped one “where they did not read the code at all.” After nervous laughter, most of the hands stayed up.
That is impressive. It is also a warning flare.
Anthropic’s Code with Claude Made AI Pair Programming Feel Inevitable
The assumption used to be simple: AI would help developers write code faster. The reality on display at Code with Claude was more radical. Claude Code is being presented less as a helper and more as a worker that can be assigned a task, revise its output, and hand back something mergeable.
This is not a distant demo reel. Anthropic’s event was built around developers already using the tool in production-like workflows. The room was full of people coding or prompting on laptops while talks were happening. That matters because developer behavior often changes before corporate language catches up.
The discomfort comes from the gap between capability and governance. If Claude can write the pull request, test parts of its own work, and correct mistakes before a human sees them, the job of the engineer shifts. It does not disappear. But the center of gravity moves away from typing and toward judgment.
That is the part software teams need to absorb quickly.
Claude’s Pull Request Moment Signals a Shift in Developer Labor
A pull request is not an autocomplete suggestion. It is a unit of professional software work. It contains decisions, assumptions, trade-offs, and sometimes hidden risk. So when nearly half a room says Claude wrote an entire pull request for them, the automation target has changed.
Before, the tool finished a line.
Now, the tool may complete a task.
That distinction is the story. The developer’s work moves toward specifying intent, checking diffs, designing boundaries, evaluating tests, and deciding whether the change belongs in the system. Anthropic is blunt about the direction. Hadfield said:
“Most software at Anthropic is now written by Claude. Claude has written most of the code in Claude Code.”
That quote should not be read as “engineers no longer matter.” It should be read as “the scarce skill is changing.” The more code an AI system can generate, the more valuable it becomes to know what good code should do, what it should not touch, and what failure looks like when the output is polished.
MLXIO readers tracking the wider AI race saw the same week’s collision of narratives in Google I/O Puts Gemini on Trial as Claude Grabs Devs. The timing was officially described as coincidence. The contrast was still hard to miss: big AI platforms are competing not just for users, but for developer habits.
The New Coding Stack Is Prompt, Review, Test, Repeat
Anthropic’s preferred workflow is not “ask once and merge.” At least, that is not the responsible version. The emerging stack looks more like this:
- Before: Engineer writes most of the code, then reviews and tests it.
- After: Engineer defines the objective, lets the model draft, inspects the result, tests it, iterates, and accepts responsibility for the merge.
The crucial word is responsibility. The model may produce the patch, but the team still owns the system.
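As a rough illustration of the "after" workflow, here is a minimal Python sketch of a draft, test, review loop. It is not Anthropic's implementation or any Claude Code API. The function name `draft_review_loop` and the three callables passed to it (`draft`, `run_tests`, `human_approves`) are hypothetical stand-ins for whatever model call, test runner, and review step a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One round of model output plus the test verdict for it."""
    patch: str
    tests_passed: bool


def draft_review_loop(
    objective: str,
    draft: Callable[[str, str], str],              # hypothetical: (objective, feedback) -> patch
    run_tests: Callable[[str], Tuple[bool, str]],  # hypothetical: patch -> (passed, feedback)
    human_approves: Callable[[str], bool],         # hypothetical: the reviewer's decision
    max_rounds: int = 3,
) -> Tuple[Optional[str], List[Attempt]]:
    """Return (accepted_patch, history); accepted_patch is None if nothing was merged."""
    feedback = ""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        patch = draft(objective, feedback)
        passed, feedback = run_tests(patch)
        history.append(Attempt(patch, passed))
        # Passing tests is necessary but not sufficient: a human still signs off.
        if passed and human_approves(patch):
            return patch, history
        if passed:
            feedback = "reviewer rejected the patch"
    return None, history
```

The design choice worth noticing is that the loop never returns a patch on test results alone. The human approval call sits on the only path to a merge, which is the sketch's way of encoding "accepts responsibility."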
Anthropic wants to push the automation boundary further. Boris Cherny, who heads Claude Code, said in the opening keynote:
“The default isn’t ‘I’m going to prompt Claude’—the default is now ‘I’m going to have Claude prompt itself.’”
That is a serious product philosophy. Claude is not merely being asked to generate code. It is being asked to run loops: test, adjust, test again. Ravi Trivedi, an engineer at Anthropic, put it more casually:
“The key principle is getting out of Claude’s way. We like to say: ‘Let it cook.’”
The phrase is catchy. The governance problem is not.
Anthropic also presented dreaming, a Claude Code feature announced two weeks before the event. In this system, Claude Code agents write notes to themselves about tasks. Later agents can use those notes to understand the same code base faster and learn from earlier errors. Dreaming then consolidates the notes, looking for patterns and recurring issues.
Analysis: that is where AI coding starts to resemble institutional memory. If it works, teams may get faster not just because code is generated quickly, but because model agents carry forward lessons across tasks. If it fails, teams may inherit bad assumptions at machine speed.
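To make the consolidation idea concrete, here is a toy Python sketch. It is purely an assumption about what "looking for patterns and recurring issues" could mean in the simplest case, not a description of how dreaming works internally. The `Note` type and the `consolidate` function are hypothetical names invented for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Note:
    """A hypothetical note an agent leaves behind after a task."""
    task: str
    issue: str


def consolidate(notes: List[Note], min_count: int = 2) -> List[Tuple[str, int]]:
    """Return issues that recur at least min_count times, most frequent first."""
    counts = Counter(note.issue for note in notes)
    return [(issue, n) for issue, n in counts.most_common() if n >= min_count]
```

Even this toy version shows the failure mode: if an early note records a wrong diagnosis and later agents repeat it, a frequency count will promote the error rather than catch it.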
The Risk Is Not Bad Code Alone, but False Confidence at Scale
The strongest critique of AI coding is not that models make mistakes. Humans do too. The sharper problem is that models can produce plausible mistakes quickly, confidently, and in volume.
The MIT Technology Review account points to three real anxieties already circulating outside the event:
- Review load: Some developers complain on Reddit and Hacker News that AI coding tools create more code to inspect.
- Deskilling: Others claim their coding abilities have weakened as they hand more work to AI.
- Security: Researchers have warned that AI tools can produce unsafe code that makes software more vulnerable to attacks.
That is enough to puncture the happy-path narrative. A team that ships AI-written code without reading it is not just using automation. It is changing its risk model.
Katelyn Lesse, Claude engineering lead, gave the correct answer when asked about security and maintenance concerns:
“All of the old software development best practices still apply. They’ve applied this entire time. I think there are a lot of people and teams that may have lost sight of them in this moment.”
She is right. The old rules still matter: review, testing, ownership, and escalation when the system behaves unexpectedly. The problem is that speed makes discipline harder. Lesse also said some technical managers at Anthropic are exhausted by keeping up with all the code their teams now produce.
That is the real bottleneck. Not generation. Judgment.
Developers Are Right to Fear Deskilling, Even if the Jobs Do Not Vanish Overnight
The counterargument deserves respect: developers at Code with Claude wanted in. There were “no signs of unease” at the event, according to the MIT Technology Review account. Companies including Spotify, Delivery Hero, Lovable, Base44, and Monday.com presented how-tos around reshaping software development with Claude Code.
That enthusiasm is not fake. If a tool can remove friction from routine coding work, developers will use it. They always have.
But the unease outside the room is not reactionary whining. If engineers stop reading code, they stop practicing one of the core habits that makes engineering safe. If they lean on generated output before building deep system intuition, they may become faster at shipping changes and weaker at understanding them.
Lesse framed Claude’s current coding ability this way:
“I think that right now Claude is probably as good as a midlevel engineer at writing code.”
That is both bullish and limiting. She added that expert engineers are still needed to design systems and troubleshoot harder problems. Angela Jiang, Claude product lead, made the longer ambition explicit:
“I think the absolute end state we’re trying to get to is Claude basically being able to build itself.”
Analysis: this is the tension every software team now faces. If Claude is treated as a midlevel engineer, then senior engineers must become better reviewers, architects, and debuggers. If Claude is treated as an unquestioned authority, teams will confuse output with understanding.
For readers following adjacent agent-style moves, our coverage of 900M Users, $100 Spark Bet: Gemini Mac Gets an Agent captures the same broad direction: AI systems are being pushed from chat boxes into workflows. Coding is simply where the stakes become easiest to measure.
Software Teams Should Set AI Coding Rules Before Claude Sets the Culture
The lesson from Code with Claude is not “ban AI coding.” That would be fantasy. The tool is too useful, and the adoption signal from the room was too clear.
The better response is stricter engineering culture.
Teams should decide now where AI-generated code must be disclosed, what level of review is required, which tests must pass, when security review is mandatory, and who owns the change after merge. The answer cannot be “Claude wrote it.” The repository does not care. Customers do not care. Attackers do not care.
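One way to make such rules enforceable is to write them down as a merge gate. The Python sketch below is a hypothetical example under stated assumptions: the `Change` fields and the `policy_violations` function are invented for illustration, and a real team would wire the equivalent checks into its own review tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Change:
    """Hypothetical metadata a team might record for each pull request."""
    ai_generated: bool
    disclosed: bool
    human_reviewed: bool
    tests_passed: bool
    security_sensitive: bool
    security_reviewed: bool
    owner: Optional[str]


def policy_violations(change: Change) -> List[str]:
    """List every rule the change breaks; an empty list means it may merge."""
    problems: List[str] = []
    if change.ai_generated and not change.disclosed:
        problems.append("AI-generated code not disclosed")
    if not change.human_reviewed:
        problems.append("no human review")
    if not change.tests_passed:
        problems.append("required tests not passing")
    if change.security_sensitive and not change.security_reviewed:
        problems.append("security review missing")
    if not change.owner:
        problems.append("no named owner")
    return problems
```

Under this sketch, a pull request that was written by a model, never read, and assigned to nobody fails on three counts even when its tests pass.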
Developers should learn these tools aggressively, but with professional suspicion intact. Ask Claude to draft. Ask it to test. Ask it to explain. Then verify. The human role is not to admire the output. It is to decide whether the output belongs.
Anthropic showed a future in which software moves faster because models do more of the typing and more of the iteration. That future is coming whether developers like it or not. The standards around it are still theirs to write, and they should write them before the pull requests arrive unread.
Why This Changes Everything
- AI coding tools are moving from autocomplete assistants to systems that can generate mergeable work end to end.
- Developer responsibility is shifting toward oversight, testing, and governance rather than writing every line manually.
- The MIT Technology Review account highlights a growing safety gap as some engineers ship AI-written code without reviewing it.










