OpenAI and Anthropic deploy AI that writes 12.5M lines in 7 hours, boosting code security
Photo by Alexandre Debiève on Unsplash
OpenAI and Anthropic each ran AI experiments that generated 12.5 million lines of code in seven hours, a feat a blog post says could dramatically improve code security.
Quick Summary
- OpenAI and Anthropic each ran AI experiments that generated 12.5 million lines of code in seven hours, a feat the source blog post says could dramatically improve code security
- Key company: Anthropic
- Also mentioned: OpenAI
OpenAI’s internal “no‑human‑code” trial, detailed in a blog post by Harsh, showed that three engineers could ship a full‑stack product in just five months by delegating every line to Codex‑powered agents. The team first had the AI draft the project’s instruction file, AGENTS.md, then let the same models generate the production codebase: over a million lines that now serve hundreds of internal users. The result, according to the original analysis, was a functional application whose entire code‑creation pipeline was automated, evidence that large‑scale software can be bootstrapped without a single human keystroke.
Anthropic’s parallel effort, also chronicled in a blog post, took a different angle: Nicholas Carlini assembled a swarm of 16 Claude agents to rewrite a C compiler from scratch in Rust. Over two weeks the swarm logged more than 2,000 coding sessions, churning out roughly 100,000 lines of code and incurring $20,000 in API fees. The compiler now builds Linux 6.9 across x86, ARM and RISC‑V, a milestone the article frames as a “massive project” achieved without traditional developers. Carlini’s stress test was explicitly designed to probe how AI can discover and patch vulnerabilities at scale, a claim echoed by OpenTools’ coverage of Claude’s “code security” capabilities.
Both experiments converged on a common metric: 12.5 million lines of code generated in a seven‑hour window, the figure the blog post cites as the combined output of the two runs. While the raw line count sounds impressive, the real breakthrough lies in the security implications. Claude’s “Code Security” module, highlighted by OpenTools, claims to sniff out vulnerabilities as it writes, offering a continuous, AI‑driven audit that could dramatically reduce the time developers spend on manual code reviews. OpenAI’s Codex, by contrast, demonstrated that a single, well‑orchestrated AI pipeline can produce production‑ready code without human intervention, suggesting a future where the bulk of routine coding is outsourced to autonomous agents.
Industry observers note that these experiments arrive just as the coding‑AI arms race heats up. VentureBeat reported the simultaneous launch of OpenAI’s GPT‑5.3‑Codex and Anthropic’s upgraded Claude, each touted as the most capable coding agent to date. The timing, according to the VentureBeat piece, underscores a strategic push by both firms to capture the enterprise market ahead of the Super Bowl advertising blitz. While the articles stop short of providing independent performance benchmarks, the sheer scale of the code generation—12.5 million lines in under a workday—signals that AI is moving from assistance to actual code production at a velocity previously reserved for large development teams.
The practical upshot for developers is twofold. First, AI agents can now act as both writer and reviewer, potentially slashing the window between vulnerability introduction and detection. Second, the cost dynamics are shifting: Carlini’s $20,000 API spend to produce a Linux‑compatible compiler is modest compared with the multi‑million‑dollar budgets of traditional compiler teams. If the security‑focused Claude models can maintain low false‑positive rates, enterprises may soon favor AI‑generated code pipelines for compliance‑heavy workloads. As the original analysis concludes, “the experiment that broke software engineering” may well become the new baseline for how code is built, audited, and secured.
This article was created using AI technology and reviewed by the SectorHQ editorial team for accuracy and quality.