OpenAI launches GPT‑5.4 with extreme reasoning mode and 1 million‑token context
While OpenAI’s earlier models capped context at 8K tokens, GPT‑5.4 handles a 1 million‑token window and adds an “extreme reasoning” mode that boosts compute for deeper, multi‑hour tasks, reports indicate.
Key Facts
- Key company: OpenAI
OpenAI’s GPT‑5.4 expands the model’s context window from the 8K tokens that defined GPT‑4’s public offering to a full 1 million tokens, a scale that matches the long‑context capabilities recently demonstrated by Google’s Gemini and Anthropic’s Claude models, according to The Information. The enlarged window allows a single prompt to contain roughly the length of a full research paper, a novel, or an extensive codebase without truncation, eliminating the need for manual chunking or external summarization pipelines. Internally, the model must now maintain a much larger attention computation, which OpenAI has reportedly optimized through a combination of sparse attention patterns and kernel‑based approximations to keep inference latency within acceptable bounds for enterprise workloads.
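To see what “eliminating manual chunking” means in practice, here is a minimal sketch of the kind of splitting helper developers typically wrote for small context windows, and that a 1 million‑token window makes unnecessary for most inputs. The 4‑characters‑per‑token heuristic and overlap size are illustrative assumptions, not OpenAI figures; production pipelines would count tokens exactly with a tokenizer.

```python
def chunk_text(text: str, max_tokens: int, chars_per_token: int = 4,
               overlap_tokens: int = 200) -> list[str]:
    """Split text into overlapping chunks that fit a model's context window.

    Uses a rough chars-per-token heuristic; a real pipeline would use an
    exact tokenizer. Overlap preserves context across chunk boundaries.
    """
    max_chars = max_tokens * chars_per_token
    overlap_chars = overlap_tokens * chars_per_token
    if len(text) <= max_chars:
        return [text]  # fits in one prompt, no chunking needed
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap_chars  # back up so chunks overlap
    return chunks
```

With an 8K‑token budget, a document of a few hundred thousand characters splits into several overlapping pieces that each need a separate call; with a 1 million‑token budget the same document comes back as a single chunk.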
The “extreme reasoning” mode introduced with GPT‑5.4 allocates additional compute to each request, effectively deepening the model’s inference-time processing. The Information notes that this mode is designed for “deeper thinking” and can sustain multi‑hour tasks, a stark contrast to the sub‑minute response times typical of earlier versions. By spending more compute re‑evaluating intermediate steps during inference, the mode reportedly reduces error propagation in long‑horizon reasoning chains, yielding lower error rates on complex, multi‑step problems. This capability is particularly relevant for autonomous agents and automation frameworks such as OpenAI’s Codex, where a single request may involve iterative code generation, testing, and debugging loops that span hours.
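The generate‑test‑debug loop described above can be sketched as a simple agent skeleton. This is an illustration, not OpenAI’s implementation: `generate` stands in for a model call (for example, one with an extended‑reasoning setting enabled), and `run_tests` stands in for executing the candidate code; both names are hypothetical.

```python
from typing import Callable, Optional

def solve_with_retries(generate: Callable[[str], str],
                       run_tests: Callable[[str], bool],
                       task: str, max_iters: int = 5) -> Optional[str]:
    """Illustrative generate-test-debug loop of the kind an agent framework
    might run on top of a long-horizon reasoning model.

    Each failed attempt is appended to the prompt so the next generation
    can correct it; returns the first candidate that passes, or None.
    """
    prompt = task
    for _ in range(max_iters):
        candidate = generate(prompt)
        if run_tests(candidate):
            return candidate
        # Feed the failure back so the next attempt can correct it.
        prompt = f"{task}\nPrevious attempt failed:\n{candidate}"
    return None
```

The longer a model can sustain coherent reasoning per call, the fewer of these outer‑loop iterations a framework needs, which is why multi‑hour task endurance matters for agents.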
Memory handling across multi‑step workflows also sees a marked improvement. The Information reports that GPT‑5.4’s “improved memory” lets the model retain contextual cues over extended interactions, reducing the need for explicit state‑passing between calls. In practice, this means that an agent can maintain a coherent plan across dozens of API calls without re‑injecting the entire history each time, cutting token overhead and improving throughput. The combination of a 1 million‑token window and persistent memory makes GPT‑5.4 well suited for scientific research tasks that require the ingestion of massive datasets—such as genomic sequences or climate model outputs—and for solving intricate problems that demand sustained logical chains, from theorem proving to multi‑disciplinary design optimization.
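The token savings from not re‑injecting history can be made concrete with a back‑of‑envelope calculation (the per‑step sizes below are illustrative, not reported figures): a stateless agent resends all prior turns on every call, so total tokens grow quadratically with the number of steps, while retained context keeps the total linear.

```python
def stateless_tokens(step_tokens: list[int]) -> int:
    """Total tokens sent when the full history is re-injected on every call."""
    total = 0
    history = 0
    for t in step_tokens:
        total += history + t   # all prior turns plus the new message
        history += t
    return total

def stateful_tokens(step_tokens: list[int]) -> int:
    """Total tokens sent when the model retains context between calls."""
    return sum(step_tokens)
```

For a 20‑step workflow of 1,000 tokens per step, the stateless pattern sends 210,000 tokens in total, while retained context sends only 20,000.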
OpenAI frames GPT‑5.4 as part of a shift toward monthly model updates, a cadence that mirrors the rapid iteration cycles seen in the broader AI industry. By delivering both a massive context window and an extreme reasoning mode in a single release, OpenAI aims to close the functional gap with competing long‑context models while differentiating itself through deeper compute allocation. The Information emphasizes that the model is “better at tasks” that involve extensive reasoning, suggesting that benchmark performance on established long‑context suites (e.g., LAMBADA, NarrativeQA) will likely see measurable gains, though no specific numbers have been disclosed. The practical upshot for developers is a single API endpoint capable of handling both massive inputs and prolonged, compute‑intensive operations without resorting to external orchestration.
Finally, the broader implications of GPT‑5.4’s architecture point toward a new class of AI‑driven applications. The extreme reasoning mode, coupled with a million‑token context, opens the door for end‑to‑end pipelines that previously required a mosaic of specialized tools—document retrieval, summarization, reasoning, and execution. OpenAI’s move to embed these capabilities directly into the model could streamline workflows in fields ranging from drug discovery, where researchers must parse terabytes of literature and experimental data, to legal analysis, where a single prompt might encompass an entire corpus of case law. As The Information notes, the model is “useful for scientific research & complex problems,” positioning GPT‑5.4 as a versatile engine for any domain where scale and depth of reasoning are paramount.
Sources
No primary source found (coverage-based)
- Reddit - OpenAI
This article was created using AI technology and reviewed by the SectorHQ editorial team for accuracy and quality.