Anthropic launches Claude 4.6 with built‑in memory, ending A/B test workflow

Published by
SectorHQ Editorial


$200 a month. That’s what subscribers pay for Claude Code, yet Anthropic has been silently A/B‑testing the model and degrading their workflows, Backnotprop reports.

Key Facts

  • Key company: Anthropic

Anthropic’s rollout of Claude 4.6 this week marks the company’s most ambitious upgrade to its flagship chatbot, adding a persistent “memory” layer that stores user‑specific context across sessions. According to a post on the AI‑productivity forum ai‑dong, the new memory system replaces the traditional “blank‑slate” approach by retaining preferences such as a developer’s PostgreSQL settings, async/await conventions, and documentation style. The feature relies on an “activation” mechanism that pulls in only the most relevant fragments, avoiding the token inflation that typically drives up inference costs. Anthropic also released a migration utility that ingests exported chat histories from OpenAI or Google, letting users rebuild their “AI soul” in minutes via a Settings → Personalization workflow. The post claims a 90% reduction in API costs for long‑context tasks because Claude now caches memory blocks rather than re‑prompting the entire history on each call.
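
Neither the forum post nor Anthropic has published the caching mechanism, but Anthropic’s existing prompt‑caching API hints at how it could work: mark the memory block as cacheable once, then reuse it across calls instead of re‑billing the full context. A minimal Python sketch, with the model ID and memory contents as placeholders rather than the actual feature:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical "memory block": per-user context the new feature would persist.
memory_block = (
    "User preferences: PostgreSQL 16, async/await style throughout, "
    "Google-style docstrings, plans written as numbered steps."
)

response = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder; a 4.6 model ID is not public
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": memory_block,
            # Cacheable block: later calls that reuse this exact prefix are
            # billed at the cheaper cache-read rate instead of full price.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[
        {"role": "user", "content": "Draft a migration plan for the orders table."}
    ],
)
print(response.content[0].text)
```

Cache reads on Anthropic’s API are billed at roughly a tenth of the normal input‑token rate, which is at least consistent with the post’s 90% figure for long‑context tasks.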

The memory upgrade has been highlighted in multiple tech outlets, including The Verge, which frames the change as a strategic move to lure “AI switchers” away from rival platforms. By offering a seamless import path and a cost‑effective persistence model, Anthropic hopes to capture professional developers who have grown weary of re‑training new assistants on each model iteration. The Verge’s coverage underscores that the memory feature is not merely a static prompt but an evolving storage layer that can be updated incrementally, positioning Claude as a more “personal” assistant in a market where customization is increasingly a differentiator.

However, the launch coincides with mounting user frustration over Anthropic’s opaque experimentation practices. Backnotprop, a user‑run watchdog blog, documented that subscribers to Claude Code—Anthropic’s $200‑per‑month developer tier—have been silently placed into a GrowthBook‑managed A/B test named tengu_pewter_ledger. The test manipulates the “plan mode” output by cycling through four variants (null, trim, cut, cap), each imposing progressively stricter limits on the length and structure of generated plans. Users assigned the most aggressive “cap” variant receive terse bullet‑point plans with no contextual prose, effectively stripping away the interactive back‑and‑forth that many engineers rely on. Backnotprop’s analysis of the Claude Code binary confirms the existence of this hidden test and notes that engineers frequently encounter regressions that are later attributed to being “in an A/B test” without any prior notice.
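
Backnotprop’s report does not reproduce the gating logic, so the sketch below is only an illustration of how a four‑variant flag could progressively throttle plan output. The bucketing, truncation thresholds, and function names are assumptions; the real experiment reportedly runs through GrowthBook inside the Claude Code binary:

```python
# Illustrative only: how variants null/trim/cut/cap might progressively
# restrict plan-mode output. The flag lookup is mocked here.

def get_variant(user_id: str) -> str:
    # Stand-in for a GrowthBook feature evaluation; the real bucketing
    # rules are not public, so hash-bucket users for illustration.
    buckets = ["null", "trim", "cut", "cap"]
    return buckets[hash(user_id) % len(buckets)]

def render_plan(steps: list[tuple[str, str]], variant: str) -> str:
    # Each step is (title, rationale); variants strip detail progressively.
    if variant == "null":  # control: full prose plan
        return "\n".join(f"{i}. {t}\n   {r}" for i, (t, r) in enumerate(steps, 1))
    if variant == "trim":  # shortened rationales
        return "\n".join(f"{i}. {t}\n   {r[:80]}" for i, (t, r) in enumerate(steps, 1))
    if variant == "cut":   # titles only, no rationale
        return "\n".join(f"{i}. {t}" for i, (t, _) in enumerate(steps, 1))
    # "cap": terse bullets with a hard step limit and no contextual prose
    return "\n".join(f"- {t}" for t, _ in steps[:5])

plan = [
    ("Add index on orders.user_id", "Speeds up the dashboard query by ..."),
    ("Backfill in batches", "Avoids locking the table during peak hours ..."),
]
print(render_plan(plan, get_variant("user-123")))
```

Under the “cap” branch, the two‑step plan above collapses to two bare bullets, which mirrors the stripped‑down output Backnotprop says the most aggressive cohort receives.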

Anthropic’s dual narrative—introducing a high‑profile memory feature while maintaining covert A/B testing—raises questions about its product‑governance philosophy. The company markets Claude as an AI safety‑focused platform, yet the Backnotprop report likens its experimentation culture to that of Meta, where silent user tests have long been a revenue driver. For enterprise customers paying premium fees, the lack of configurability and transparency could erode trust, especially when critical workflow functions are altered without consent. The tension between rapid feature rollout and user‑centric stability is a familiar dilemma in the AI SaaS space, but Anthropic’s approach may compel larger clients to demand clearer opt‑out mechanisms or contractual guarantees.

From a market perspective, Claude 4.6’s memory could be a decisive advantage if Anthropic can reconcile its engineering practices with customer expectations. The cost savings highlighted by ai‑dong—up to a 90% reduction in long‑context API expenses—address a key pain point for developers who run large code‑base analyses or multi‑step planning tasks. Meanwhile, the migration tool lowers the barrier for users entrenched in OpenAI or Google ecosystems, potentially expanding Claude’s addressable market. Yet the lingering A/B‑test controversy may temper adoption among risk‑averse enterprises, prompting competitors like OpenAI and Google to emphasize their own transparency policies. How Anthropic balances these forces will likely shape its positioning in the increasingly crowded generative‑AI market.

