
MiniMax M2.7 Launches on BlockRun, Introducing the First Self‑Evolving Reasoning AI Agents

Published by
SectorHQ Editorial


Where developers once spent months building scaffolding around static models, today MiniMax’s M2.7 rewrites its own code: reports indicate the first self‑evolving reasoning AI agents have launched on BlockRun, turning the harness into the product.

Key Facts

  • Key company: MiniMax
  • Also mentioned: BlockRun

MiniMax’s M2.7 is the first model that “deeply participates in its own evolution,” according to the company’s official launch note on BlockRun. The model is delivered as a single‑call API—no subscription, no API key, just a curl request to https://blockrun.ai/v1/chat/completions with the model identifier minimax/minimax-m2.7. Existing users of the prior M2.5 version are automatically redirected to M2.7, meaning that developers can adopt the new capabilities without any code changes (BlockRun, Mar 21).
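
For readers who want to try it, a minimal Python sketch of that single call follows. Only the URL and the model identifier come from the launch note; the request payload here is an assumption, following the common chat‑completions schema that the endpoint path suggests, and the prompt text is illustrative.

    # Minimal sketch of a single call to BlockRun's M2.7 endpoint.
    # Assumption: the body follows the familiar chat-completions schema;
    # only the URL and model identifier are confirmed by the launch note.
    import requests

    response = requests.post(
        "https://blockrun.ai/v1/chat/completions",
        json={
            "model": "minimax/minimax-m2.7",
            "messages": [
                {"role": "user", "content": "Outline a test plan for this module."}
            ],
        },
        timeout=60,  # no subscription or API key required, per the launch note
    )
    response.raise_for_status()
    print(response.json()["choices"][0]["message"]["content"])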

What distinguishes M2.7 from a conventional LLM is its recursive self‑improvement loop. MiniMax reports that the model runs more than 100 autonomous optimization cycles on its own agent harness, rewriting and refining the scaffolding that normally has to be engineered by humans. Those iterations produced a 30% performance gain on internal benchmarks, and the model now “handles 30-50% of research workflows autonomously,” the company’s technical brief states. In practice, the model exhibits 97% skill adherence across 40+ complex tasks, each exceeding 2,000 tokens, and it can generate and optimize its own agent code on the fly (MiniMax, Mar 21).
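
MiniMax has not published the mechanics of that loop, so the Python sketch below is conceptual only: it shows the evaluate‑rewrite‑keep shape such a cycle implies, with toy stubs standing in for the real benchmark runs and model calls. The function names (evaluate, propose_rewrite, self_evolve) are hypothetical.

    # Conceptual sketch of a self-improvement loop over an agent harness.
    # Not MiniMax's actual procedure: evaluate() and propose_rewrite() are
    # toy stand-ins for benchmark runs and model-driven code rewrites.
    import random

    def evaluate(harness_code: str) -> float:
        """Stand-in for scoring a harness version on a benchmark suite."""
        return random.random()  # a real loop would return a measured score

    def propose_rewrite(harness_code: str) -> str:
        """Stand-in for asking the model to rewrite its own scaffolding."""
        return harness_code + "\n# revised"

    def self_evolve(harness_code: str, cycles: int = 100) -> tuple[str, float]:
        # Hill-climb: keep a candidate rewrite only if it scores better.
        best_code, best_score = harness_code, evaluate(harness_code)
        for _ in range(cycles):  # the launch note cites 100+ autonomous cycles
            candidate = propose_rewrite(best_code)
            score = evaluate(candidate)
            if score > best_score:
                best_code, best_score = candidate, score
        return best_code, best_score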

The performance claims are backed by a suite of benchmark results that MiniMax published alongside the launch. On the SWE‑Pro software‑engineering benchmark, M2.7 achieved a 56.22% match rate, edging out the GPT‑5.3‑Codex baseline (55.6%). In the VIBE‑Pro end‑to‑end project‑delivery test it posted 57.0%, and on the multilingual SWE benchmark it reached 76.5%. For multi‑repo engineering tasks, the model scored 52.7% on the Multi‑SWE benchmark. In machine‑learning research, M2.7 averaged a 66.6% medal rate on MLE‑Bench Lite, which covers 22 Kaggle‑style competitions, trailing only Opus 4.6 (75.7%) and GPT‑5.4 (71.2%). Its best single run earned nine gold, five silver, and one bronze medal (MiniMax, Mar 21).

The launch arrives amid a broader wave of self‑evolving AI agents. Max Quimby notes that within the past six weeks, Andrej Karpathy open‑sourced “autoresearch,” DeepMind shipped “AlphaEvolve,” OpenAI unveiled “Symphony,” and Sam Altman told Stanford that “current AI models are already smart enough to help discover the next architecture” (Computeleap, Mar 21). Quimby argues that these independent signals confirm a shift from “harness engineering”—the manual construction of scaffolding around static models—to models that can improve their own harnesses. MiniMax’s claim that M2.7 “rewrote its scaffolding” aligns with that narrative, moving the developer’s role from building pipelines to supervising autonomous optimization cycles.

Analysts see the commercial implications as significant. Because M2.7 is offered on a pay‑per‑request basis without a subscription lock‑in, enterprises can experiment with self‑evolving agents at low upfront cost, potentially accelerating adoption in research‑intensive domains. The model’s ability to autonomously manage 30-50% of research workflows suggests a reduction in human‑in‑the‑loop effort, a metric that could translate directly into cost savings for large‑scale AI labs. However, the performance gains are measured on internal benchmarks; external validation will be needed to confirm whether the 30% improvement holds across diverse real‑world tasks.

If the early data hold up, M2.7 could redefine the economics of AI development. By collapsing the distinction between model and harness, MiniMax is effectively turning the “product” into the “process,” a shift that may force developers to rethink skill sets that have been centered on prompt engineering and pipeline construction. As Quimby warns, the era when “frontier AI research used to be done by meat computers” is ending, and the next wave of talent will likely be judged on how well they can supervise and direct self‑evolving agents rather than hand‑craft their code (Computeleap, Mar 21).

Sources

Primary source

No primary source found (coverage-based)

Other signals
  • Dev.to Machine Learning Tag

Reporting based on verified sources and public filings. SectorHQ editorial standards require multi-source attribution.
